Researchers at the University of Washington have developed a video editing tool that allows users to create videos in which people (including public figures) can realistically be made to appear to say things that they never said. These “lip synched” videos can splice together video and audio from different sources.

Lead researcher Supasorn Suwajanakorn and his team documented their work in a paper slated to be presented on 2 August 2017 at the SIGGRAPH Conference on Computer Graphics and Interactive Techniques.

As a demonstration, the team posted an 8-minute clip of the tool in action, blending audio taken from remarks made by then-President Barack Obama on 18 June 2016 with video footage from a separate speech.

According to the university, the program works by converting audio into “basic mouth shapes,” which are then superimposed onto existing video footage of that person speaking. Suwajanakorn said:

Realistic audio-to-video conversion has practical applications like improving video conferencing for meetings, as well as futuristic ones such as being able to hold a conversation with a historical figure in virtual reality by creating visuals just from audio. This is the kind of breakthrough that will help enable those next steps.

Suwajanakorn’s team chose to use Obama excerpts for their demonstration because the large amount of available material featuring the former president allowed their tool to recognize his speech patterns. But co-author Steve Seitz said that the program’s algorithms could be developed to the point that they could recognize a person’s speech patterns based on one hour of footage:

In the future, video chat tools like Skype or Messenger will enable anyone to collect videos that could be used to train computer models.

However, the researchers’ work has also provoked a negative reaction from several media outlets. Mashable described it as “scary,” while The Atlantic observed:

Quite clearly, though, the technique could be used to deceive. People are already fooled by doctored photos, impostor accounts on social media, and other sorts of digital mimicry all the time. Imagine the confusion that might surround a convincing video of the president being made to say something he never actually said.

Suwajanakorn responded to the criticism by saying that users could discern video manipulation by using a database of available footage, or by having access to multiple videos of a person’s remarks:

My thought is that people will not believe videos, just like how we do not believe photos once we’re aware that tools like Photoshop exist. This could be both good and bad, and we have to move on to a more reliable source of evidence.

Of course, despite widespread knowledge of photo-editing tools and a long history of photograph manipulation, people are still often fooled by duplicitous photographs.

Alexios Mantzarlis, the director of the International Fact-Checking Network at the Poynter Institute, told us that his group has taken no position on the editing tool but noted:

I of course understand the risk of someone seeing a video of a public figure saying something hateful they never said and that having real-life consequences.

Overall, however, I don’t think fact-checkers can quixotically fight any new technology that has the potential of being misused. Otherwise, we would have been signing petitions against Photoshop. The answer, as always, is solid research and good education.

We contacted the University of Washington seeking further comment, but have yet to hear back.

Added a comment from Alexios Mantzarlis, director of the International Fact-Checking Network.

Sources:

“Weekly Address: Standing with Orlando.” YouTube, posted by The Obama White House. 18 June 2016. https://youtu.be/nIxM8rL5GVE

LaFrance, Adrienne. “The Technology That Will Make It Impossible for You to Believe What You See.” The Atlantic. 11 July 2017.

Nuñez, Michael. “This Scary Video Tool Makes Fake News Look Legit.” Mashable. 17 July 2017.

Langston, Jennifer. “Lip-Syncing Obama: New Tools Turn Audio Clips Into Realistic Video.” University of Washington. 11 2017.

Suwajanakorn, Supasorn, Seitz, Steven M. et al. “Synthesizing Obama: Learning Lip Sync from Audio.” University of Washington.