This AI Model Can Intuit How the Physical World Works


The original version of this story appeared in Quanta Magazine.

Here’s a test for infants: Show them a glass of water on a desk. Hide it behind a wooden board. Now move the board toward the glass. If the board keeps going past the glass, as if it weren’t there, are they surprised? Many 6-month-olds are, and by a year, almost all children have an intuitive notion of an object’s permanence, learned through observation. Now some artificial intelligence models do too.

Researchers have developed an AI system that learns about the world via videos and demonstrates a notion of “surprise” when presented with information that goes against the knowledge it has gleaned.

The model, created by Meta and called Video Joint Embedding Predictive Architecture (V-JEPA), does not make any assumptions about the physics of the world contained in the videos. Nonetheless, it can begin to make sense of how the world works.

“Their claims are, a priori, very plausible, and the results are super interesting,” said Micha Heilbron, a cognitive scientist at the University of Amsterdam who studies how brains and artificial systems make sense of the world.

Higher Abstractions

As the engineers who build self-driving cars know, it can be hard to get an AI system to reliably make sense of what it sees. Most systems designed to “understand” videos in order to either classify their content (“a person playing tennis,” for example) or identify the contours of an object—say, a car up ahead—work in what’s called “pixel space.” The model essentially treats every pixel in a video as equal in importance.

But these pixel-space models come with limitations. Imagine trying to make sense of a suburban street. If the scene has cars, traffic lights and trees, the model might focus too much on irrelevant details such as the motion of the leaves. It might miss the color of the traffic light, or the positions of nearby cars. “When you go to images or video, you don’t want to work in [pixel] space because there are too many details you don’t want to model,” said Randall Balestriero, a computer scientist at Brown University.
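To make the limitation concrete, here is a minimal Python sketch of a pixel-space error in which every pixel counts equally. It is an illustration only, not V-JEPA's code or any method described by the researchers: the toy scene, the pixel counts, the intensity values and the names `squared_error`, `make_frame`, `NUM_LEAF` and `NUM_LIGHT` are all assumptions invented for this example.

```python
def squared_error(frame_a, frame_b):
    """Sum of squared per-pixel differences; every pixel is weighted equally."""
    return sum((a - b) ** 2 for a, b in zip(frame_a, frame_b))


# Hypothetical toy scene: 900 "leaf" pixels followed by 4 "traffic light" pixels.
NUM_LEAF, NUM_LIGHT = 900, 4


def make_frame(leaf_values, light_value):
    """Build a flat list of pixel intensities: leaves first, then the light."""
    return list(leaf_values) + [light_value] * NUM_LIGHT


# Leaves flutter between frames: each leaf pixel shifts by +0.3 or -0.3.
leaves_before = [0.5] * NUM_LEAF
leaves_after = [0.5 + (0.3 if i % 2 == 0 else -0.3) for i in range(NUM_LEAF)]

# The light changes state (assumed encoding: 0.0 before, 1.0 after).
frame_before = make_frame(leaves_before, 0.0)
frame_after = make_frame(leaves_after, 1.0)

total = squared_error(frame_before, frame_after)
leaf_part = squared_error(frame_before[:NUM_LEAF], frame_after[:NUM_LEAF])
light_part = squared_error(frame_before[NUM_LEAF:], frame_after[NUM_LEAF:])

# Fraction of the total pixel-space error that comes from the fluttering leaves.
leaf_share = leaf_part / total
print(f"leaf share of pixel-space error: {leaf_share:.0%}")
```

In this toy setup, the fluttering leaves account for most of the error even though the change in the light is the detail that matters for driving. That is the kind of imbalance Balestriero describes.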


Yann LeCun, a computer scientist at New York University and the director of AI research at Meta, created JEPA, a predecessor to V-JEPA that works on still images, in 2022.

Photograph: École Polytechnique Université Paris-Saclay


