We hadn't yet recovered from the advent of DALL-E 2, Midjourney and company when Meta announced Make-A-Video, a tool that generates short video clips from text descriptions. It's the next step for the world of AI-generated content.
It's the first time a text-to-video tool has come this close to a final launch. "Artificial intelligence research is pushing creative expression forward by providing people with tools to create new content quickly and easily," reads the press release.
Make-A-Video can bring imagination to life from a few words or lines of text, creating one-of-a-kind videos full of vivid colors, characters and settings. The system can also turn existing photographs or videos into similar new clips.
"It's much harder to generate videos than photos," says Meta CEO Mark Zuckerberg in a Facebook post. Who would have guessed. "In addition to correctly generating each pixel, the system must also predict how they will change over time. Make-A-Video solves this by adding a layer of unsupervised learning that lets the system understand motion in the physical world and apply it to traditional text-to-image generation."
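To make the idea concrete, here is a deliberately toy sketch (not Meta's actual code, and with made-up function names) of the two-stage design Zuckerberg describes: a text-to-image stage produces a first frame, and a separate temporal module, which could in principle be trained on unlabeled video alone, extrapolates how that frame evolves over time.

```python
import numpy as np

def text_to_image(prompt: str, size: int = 8) -> np.ndarray:
    """Stand-in for a text-to-image model: maps a prompt to one RGB frame.

    A real system would run a diffusion model; here we just derive a
    deterministic pseudo-random image from the prompt for illustration.
    """
    seed = sum(ord(c) for c in prompt)
    return np.random.default_rng(seed).random((size, size, 3))

def motion_model(frame: np.ndarray, t: int) -> np.ndarray:
    """Stand-in for the unsupervised temporal layer.

    A real temporal module learns plausible motion from raw video; this
    toy version just drifts the pixels horizontally over time.
    """
    return np.roll(frame, shift=t, axis=1)

def make_a_video(prompt: str, num_frames: int = 16) -> np.ndarray:
    """Compose the two stages: one generated frame, then predicted motion."""
    first = text_to_image(prompt)
    frames = [motion_model(first, t) for t in range(num_frames)]
    return np.stack(frames)  # shape: (num_frames, height, width, 3)

video = make_a_video("a dog wearing a superhero costume", num_frames=16)
print(video.shape)
```

The point of the split is exactly the one in the quote: the text-conditioned part only ever has to solve the (already well-studied) text-to-image problem, while the motion part can be learned from video without text labels at all.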
The Make-A-Video website showcases some example clips generated by the AI, such as "a dog wearing a superhero costume with a red cape flying in the sky" and "a teddy bear painting". It is yet another demonstration of the incredibly rapid progress of these systems: just two or three years ago, such things were practically science fiction.
Make-A-Video: wonders (and, of course, dangers)
As we increasingly rely on AI to generate art, it becomes ever more important for companies to adopt transparency policies around these algorithms. Reading the research paper behind Make-A-Video, it is clear that this artificial intelligence was "trained" on a subset of a dataset called LAION, which also includes some far-from-clean images. Such as? ISIS executions, non-consensual nudity, and so on. Meta assures that it has sifted through this data thoroughly, automatically discarding nudity and other harmful images.
We'll see. Meanwhile, the battle over ethics continues.
The introduction of text-to-video as a tool for artists and creators also complicates the (already thorny) question of the legitimacy of AI-generated art. In August, as you may know, a man named Jason Allen won an art contest with an image created by Midjourney, stirring up a hornet's nest of controversy.
Even companies that license images for commercial use (like Shutterstock or Getty Images) have closed the door on this content. Not for ethical reasons, in this case: purely legal ones. Who owns the images the algorithms are trained on? Is turning those images into something new a copyright violation or not? The law has not yet caught up.
Meanwhile the tsunami rolls on: these technologies are overwhelming the public at the same speed with which they learn to perfect themselves. Yesterday's Make-A-Video announcement came just one day after OpenAI's public release of DALL-E 2: the company removed the system's waiting list, allowing anyone to generate images from lines of text.
But even as the public gains access to more and more AI art-generation tools, some of the fundamental ethical questions about their use remain open: and they demand answers.