AI and machine learning algorithms are getting better at predicting actions in videos.
The best current algorithms can predict quite precisely where a baseball will go after it has been thrown, or how a road will look in the frames to come. In other words, they predict future frames of a video.
A new approach proposed by researchers at Google, the University of Michigan, and Adobe advances the state of the art with large-scale models that generate high-quality videos from just a few input frames.
“With this project we aim to obtain precise video forecasts by optimizing the capacity of a neural network,” the researchers wrote in the paper describing their work.
The team's model
The team's base model builds on a stochastic video generation architecture, with a recurrent component that predicts the frames that follow the ones it has already observed.
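The article does not include the model's code; purely as a rough illustration, a stochastic predictor of this kind combines a deterministic recurrent core with a sampled latent variable. Every name, dimension, and weight below is invented for this toy sketch and is not the researchers' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions: a "frame" is a flat 64-dim vector, the hidden state is
# 32-dim, and the stochastic latent z is 8-dim. Real models use
# convolutional encoders/decoders over image tensors.
F, H, Z = 64, 32, 8

# Randomly initialized weights stand in for learned parameters.
W_enc = rng.standard_normal((H, F)) * 0.1       # frame -> hidden
W_rec = rng.standard_normal((H, H + Z)) * 0.1   # [hidden, z] -> next hidden
W_dec = rng.standard_normal((F, H)) * 0.1       # hidden -> predicted frame

def predict_next(frame, z):
    """One stochastic prediction step: encode the current frame,
    mix in a sampled latent z, and decode the next frame."""
    h = np.tanh(W_enc @ frame)
    h_next = np.tanh(W_rec @ np.concatenate([h, z]))
    return W_dec @ h_next

frame = rng.standard_normal(F)
z = rng.standard_normal(Z)   # sampled from a unit-Gaussian prior in this toy
next_frame = predict_next(frame, z)
print(next_frame.shape)  # (64,)
```

Because z is sampled, repeated calls with different draws yield different but plausible continuations, which is what "stochastic" buys over a purely deterministic predictor.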
The team separately trained and tested different versions of the model on custom datasets covering three forecasting categories: object interactions, structured motion, and partial observability.
For the first task (object interactions), the researchers selected 256 clips from a corpus of videos showing a robot arm interacting with towels.
For the second (structured motion), they drew clips from Human 3.6M, a corpus of videos of people performing actions such as sitting in a chair.
For the third (partial observability), they used KITTI, an open-source dataset of driving footage collected from cameras mounted on car dashboards.
After training, the model generated predictions up to 25 frames into the future.
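Generating many frames from a few observed ones typically works autoregressively: each predicted frame is fed back as input for the next step. A minimal sketch of that rollout loop, with an invented stand-in for the trained one-step predictor:

```python
import numpy as np

rng = np.random.default_rng(1)
F = 64  # toy flat "frame" size; real frames are image tensors
W = rng.standard_normal((F, F)) * 0.05  # stand-in for trained weights

def predict_next(frame):
    # Hypothetical one-step predictor; a real model would be the
    # trained recurrent network described above.
    return np.tanh(W @ frame)

# A few observed "context" frames condition the rollout.
context = [rng.standard_normal(F) for _ in range(5)]

# Autoregressive rollout: feed each prediction back as input
# until 25 future frames have been generated.
frame = context[-1]
future = []
for _ in range(25):
    frame = predict_next(frame)
    future.append(frame)

print(len(future))  # 25
```

Small one-step errors compound over such a rollout, which is why long-horizon prediction (25 frames here) is a meaningful benchmark.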
The researchers report that human raters preferred its predictions 90.2%, 98.7%, and 99.3% of the time on the object interaction, structured motion, and partial observability tasks, respectively.
Qualitatively, the team notes that the model clearly depicted human arms and legs and made “very precise predictions that seemed realistic compared to the scenes depicted in the video.”

“We found that maximizing the capacity of such models improves the quality of video prediction,” the coauthors write. “We hope that our work encourages the field to push in similar directions in the future; for example, to see how far we can go.”
