It's becoming extremely easy to alter a video (and I don't think that's entirely a good thing), and the latest developments in AI are truly impressive.
A collaboration between giants (Stanford and Princeton Universities plus the Max Planck Institute for Informatics and Adobe) makes it possible to alter the speech in a video simply by editing the text transcript, without creating the "dubbing" effect.
In other words, the person speaking in the video will literally say different words, with the lip movements changed to match.
To achieve this somewhat disturbing result, the algorithm “learns” the phonemes and how the subject in the video pronounces them, and creates an accurate 3D model of the subject's face, capable of replicating all the sounds and movements. At that point it is enough to edit the text of the speech, and the algorithm will replace the original sentence.
Currently the algorithm needs at least 40 minutes of footage to “train” before it can replicate a person in a video.
Here is a video demonstrating how the system works:
Huge ethical doubts
It is clear that this mechanism makes it possible for anyone to modify a speech (perhaps one by a political or public figure), insert elements of hatred or disinformation, and spread the result as original and authentic. This only increases concerns about the spread of deepfake-based systems.
On the other hand, there are some positive sides, namely the enormous savings in editing that come from not having to reshoot whole scenes because of small mistakes in pronunciation.
Beyond that, I am sure that other "anti-counterfeiting" methods will be developed for videos too: dynamic watermarks, or watermarks that make the work of artificial intelligence even more complex, in a contest between reality and manipulation that already seems destined to characterize the coming years.