Our New Model Helps AI Think Before it Acts


Today, we’re excited to share V-JEPA 2, our state-of-the-art world model, trained on video, that enables robots and other AI agents to understand the physical world and predict how it will respond to their actions. These capabilities are essential to building AI agents that can think before they act, and V-JEPA 2 represents meaningful progress toward our ultimate goal of developing advanced machine intelligence (AMI). 

As humans, we have the ability to predict how the physical world will evolve in response to our actions or the actions of others. For example, you know that if you toss a tennis ball into the air, gravity will pull it back down. When you walk through an unfamiliar crowded area, you’re making moves toward your destination while also trying not to bump into people or obstacles along the path. When playing hockey, you skate to where the puck is going, not where it currently is. We achieve this physical intuition by observing the world around us and developing an internal model of it, which we can use to predict the outcomes of hypothetical actions.

V-JEPA 2 helps AI agents mimic this intelligence, making them smarter about the physical world. The models we use to develop this kind of intelligence in machines are called world models, and they enable three essential capabilities: understanding, predicting and planning.
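To make the predict-then-plan idea concrete, here is a minimal toy sketch in Python. It is not V-JEPA 2 and does not reflect how the model is implemented. Everything in it is an assumption made for illustration: the one-dimensional state, the three-action set, the hand-written `predict` function standing in for a learned world model, and the helper names `plan` and `run`. The agent imagines every short action sequence, scores the predicted outcomes against a goal, and executes only the first action of the best sequence before planning again.

```python
from itertools import product

# Hypothetical action set for a 1-D toy world: step left, stay, step right.
ACTIONS = (-1, 0, 1)


def predict(state, action):
    """Toy stand-in for a learned world model: predict the next state."""
    return state + action


def plan(state, goal, horizon=3):
    """Imagine every action sequence of length `horizon` and return the
    first action of the sequence with the lowest predicted cost."""
    best_cost, best_first = None, 0
    for seq in product(ACTIONS, repeat=horizon):
        s, cost = state, 0
        for a in seq:
            s = predict(s, a)        # roll the model forward, no real action taken
            cost += abs(goal - s)    # penalize distance to the goal at every step
        if best_cost is None or cost < best_cost:
            best_cost, best_first = cost, seq[0]
    return best_first


def run(state, goal, max_steps=20):
    """Act in the toy world, replanning after every step."""
    trajectory = [state]
    for _ in range(max_steps):
        if state == goal:
            break
        # In this toy, the real world happens to match the model exactly.
        state = state + plan(state, goal)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    print(run(0, 5))
```

Replanning after each step is a deliberate choice in this sketch: if the real world ever diverged from the model's prediction, the agent would correct course on the next step instead of committing to a stale plan.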

Building on V-JEPA, our first video-trained model, which we released last year, V-JEPA 2 improves understanding and prediction, enabling robots to interact with unfamiliar objects and environments to complete a task.

We trained V-JEPA 2 using video, which helped the model learn important patterns in the physical world, including how people interact with objects, how objects move and how objects interact with other objects. When we deployed the model on robots in our labs, we found that the robots can use V-JEPA 2 to perform tasks like reaching, picking up an object and placing an object in a new location.

Today, in addition to releasing V-JEPA 2, we’re sharing three new benchmarks to help the research community evaluate how well their existing models learn and reason about the world using video. By sharing this work, we aim to give researchers and developers access to the best models and benchmarks to help accelerate research and progress – ultimately leading to better and more capable AI systems that will help enhance people’s lives.




