Researchers at the University of Washington have developed a video editing tool that allows users to create videos in which people (including public figures) can realistically be made to appear to say things that they never said. These "lip-synced" videos splice together video and audio from different sources.
As a demonstration, the team posted an 8-minute clip in which the tool blends audio taken from remarks made by then-President Barack Obama on 18 June 2016 with video footage from a separate speech:
According to the university, the program works by converting audio into "basic mouth shapes," which are then superimposed onto existing video footage of that person speaking. Suwajanakorn said:
Realistic audio-to-video conversion has practical applications like improving video conferencing for meetings, as well as futuristic ones such as being able to hold a conversation with a historical figure in virtual reality by creating visuals just from audio. This is the kind of breakthrough that will help enable those next steps.
Suwajanakorn's team chose to use Obama excerpts for their demonstration because the large amount of available material featuring the former president allowed their tool to recognize his speech patterns. But co-author Steve Seitz said that the program's algorithms could be developed to the point that they could recognize a person's speech patterns based on one hour of footage:
In the future, video chat tools like Skype or Messenger will enable anyone to collect videos that could be used to train computer models.
Quite clearly, though, the technique could be used to deceive. People are already fooled by doctored photos, impostor accounts on social media, and other sorts of digital mimicry all the time. Imagine the confusion that might surround a convincing video of the president being made to say something he never actually said.
Suwajanakorn responded to the criticism by saying that users could discern video manipulation by using a database of available footage, or by having access to multiple videos of a person's remarks:
My thought is that people will not believe videos, just like how we do not believe photos once we’re aware that tools like Photoshop exist. This could be both good and bad, and we have to move on to a more reliable source of evidence.
Alexios Mantzarlis, the director of the International Fact-Checking Network at the Poynter Institute, told us that his group has taken no position on the editing tool but noted:
I of course understand the risk of someone seeing a video of a public figure saying something hateful they never said and that having real-life consequences.
Overall, however, I don't think fact-checkers can quixotically fight any new technology that has the potential of being misused. Otherwise, we would have been signing petitions against Photoshop. The answer, as always, is solid research and good education.
We contacted the University of Washington seeking further comment, but have yet to hear back.