Deepfake video: It takes AI to beat AI
By now, most of us have shared a few chuckles over AI-generated deepfake videos, like those in which the face of comedian and impressionist Bill Hader gradually takes on the likenesses of Tom Cruise, Seth Rogen, and Arnold Schwarzenegger as he imitates the celebrities. We’ve seen actor Ryan Reynolds' mug superimposed over Gene Wilder’s in the 1971 classic film "Willy Wonka & the Chocolate Factory." We’ve even marveled over businessman Elon Musk being turned into a baby.
It all can be quite humorous, but not everyone is laughing. Tech companies, researchers, and politicians alike are growing concerned that the increasing sophistication of the artificial intelligence and machine learning technology powering deepfakes will outpace our ability to discern between genuine and doctored imagery.
Sounding alarm bells
Consider what might happen if a bogus video surfaced just prior to next year’s elections showing a top candidate saying something offensive or sounding intoxicated, as a widely viewed deepfake involving House Speaker Nancy Pelosi did back in June. It’s entirely possible such deception could swing a close election one way or the other. Conversely, politicians legitimately recorded saying or doing something offensive might claim the recordings are “fake” rather than owning up to their behavior.
Beyond their potential effect on electoral processes, experts say deepfakes could also threaten national security. Imagine, for example, the consequences of NORAD receiving video purporting to show the leader of a hostile nation ordering a nuclear strike against the U.S. Even if the clip’s authenticity were initially in doubt, the military would have little choice but to take precautions, such as raising the nation’s defense readiness condition (DEFCON) level.
These types of scenarios are a major reason why lawmakers such as Rep. Adam B. Schiff (D-Calif.), who chairs the House Intelligence Committee, say regulating deepfake technology is “worthy of serious consideration,” and why about a dozen bills to limit deepfakes have been introduced in the U.S. Congress and state legislatures in the past year, according to a recent survey. They’re also partly behind the recent announcement by Facebook and Microsoft of the $10 million Deepfake Detection Challenge, which aims to produce “technology that everyone can use to better detect when AI has been used to alter a video in order to mislead the viewer.”
The general concern isn’t so much that deepfakes might occasionally stir controversy, muddle electoral processes, or create national security challenges. Rather, it’s the possibility they could do so frequently. While not the case just yet (96 percent of the estimated 14,678 deepfake videos in the wild are pornography-related, according to a recent Deeptrace study), rapid progress in AI technology and the ease with which deepfake videos can be created could make them a far more pervasive and destructive threat before long.
What’s more, if fake-video distribution reaches epidemic levels, it could irreparably distort how people consume information and form opinions on matters of public importance. As humans, we sometimes shun facts conflicting with our preexisting beliefs while embracing data points confirming them—even when that information is clearly false. For example, to this day, it’s thought many of the estimated 3 million viewers of the Pelosi video still assume it was an accurate representation of her comments, simply because they already disliked her when they viewed it. Psychologists call this phenomenon “confirmation bias,” and many worry about it spinning out of control. In fact, nearly two-thirds of Americans (63 percent) feel altered videos and images are creating a “great deal of confusion,” and roughly three-quarters (77 percent) think they should be restricted, according to a recent Pew Research Center survey.
“Deepfake technology is going to be one of the largest problems of the next decade, with the potential to disrupt our societal balance and push populations into a state of civil unrest,” predicts Avivah Litan, a vice president and analyst at Gartner. “It is a global and national security problem at the same time that should be treated with the urgency it deserves.”
The detection dilemma
Early efforts to get ahead of the deepfake problem have focused on producing technology that can determine whether a video is genuine or fabricated. Facebook, which has been criticized for failing to stop the spread of misinformation on its social site, thinks one approach is to create huge datasets of verified source video of known individuals. Deep learning algorithms would then examine the shapes of faces, the way individuals smile or blink, the way they cock their heads, and hundreds of other criteria to make probabilistic guesses about a video’s authenticity.
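The "weigh many signals, output a probability" idea can be sketched in a few lines. The features, weights, and threshold below are hypothetical placeholders for illustration only, not Facebook's actual model; real systems learn thousands of such signals from large labeled datasets.

```python
import math

# Hypothetical per-feature anomaly scores (0 = natural, 1 = highly anomalous)
# and illustrative weights. Real detectors learn these from deep models.
WEIGHTS = {
    "blink_rate": 2.5,     # unnaturally low blink frequency
    "face_boundary": 3.0,  # blending artifacts at the face/hair edge
    "head_pose": 1.5,      # inconsistent head orientation across frames
    "lip_sync": 2.0,       # audio/visual misalignment
}

def fake_probability(scores: dict) -> float:
    """Combine weighted anomaly scores into a probability via a logistic function."""
    z = sum(WEIGHTS[name] * scores.get(name, 0.0) for name in WEIGHTS)
    bias = -4.0  # little anomaly evidence -> low fake probability
    return 1.0 / (1.0 + math.exp(-(z + bias)))

suspect = {"blink_rate": 0.9, "face_boundary": 0.8, "head_pose": 0.3, "lip_sync": 0.7}
genuine = {"blink_rate": 0.1, "face_boundary": 0.1, "head_pose": 0.1, "lip_sync": 0.1}

print(f"suspect clip: {fake_probability(suspect):.2f}")  # high probability of fakery
print(f"genuine clip: {fake_probability(genuine):.2f}")  # low probability of fakery
```

The output is a probability, not a verdict, which is exactly the limitation authentication advocates point to later in this article.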
As observant human beings, most of us can still spot these things with the naked eye, at least for now. We watch a video and something feels off-kilter. The person featured doesn’t seem to be blinking much, if at all. Or the edges of the face blend unnaturally into the hairline. Or there’s an odd audio anomaly between spoken words. When we see and hear these kinds of things, we instinctively recognize something is wrong. But as the tools for making deepfakes continue to advance, our ability to distinguish them from authentic videos will wane. We’ll be outsmarted by smart technology.
Emerging detection techniques have proved to be more than 90 percent accurate in spotting several variations of deepfakes, such as face swaps. And both government agencies and venture capitalists have been investing in programs aimed at developing powerful detection capabilities.
But few observers believe detection technology alone will adequately address the deepfake dilemma.
Nasir Memon, a professor of computer science and engineering at the New York University (NYU) Tandon School of Engineering, notes that more than 500 hours of video are uploaded to YouTube every minute. Magnify that by all websites, and you’d have to peruse millions of video clips to determine authenticity.
“Checking all of them with a high degree of accuracy would be really hard, if not impossible,” Memon says. “Even if you have developed the most sophisticated detection technology, it would have limited effect. It would not be able to scale to the billions of media objects that are put out there every day. And then, if you have even a 0.0001 false positive rate, you could still end up with millions of false positives.”
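Memon's scaling point checks out with back-of-the-envelope arithmetic. The daily volume below is an illustrative assumption, not a measured figure:

```python
# Even a tiny false-positive rate generates a flood of wrongly flagged clips
# when applied at web scale. The daily volume is an illustrative assumption.
false_positive_rate = 0.0001             # 0.01% of genuine videos misflagged
videos_checked_per_day = 10_000_000_000  # assume tens of billions of media objects

false_positives = false_positive_rate * videos_checked_per_day
print(f"{false_positives:,.0f} genuine videos wrongly flagged per day")
# prints: 1,000,000 genuine videos wrongly flagged per day
```

Each of those million flags would be a real video wrongly labeled fake, which is its own kind of misinformation problem.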
A shift toward authentication
Industry professionals such as Shamir Allibhai, CEO of Amber Video in San Francisco, believe that because deepfakes are AI-enabled, using detection technology alone to overcome them would be a “losing battle.”
“Right now, you can recognize videos as deepfake with your own eyes, and detection software provides an added advantage, but that advantage won’t be there forever,” he says. “So, we are really big believers and advocates for also using authentication technology. Our company has products on both sides, detection and authentication. But we believe authentication will be the best way to go, especially when it concerns a video of criticality or of an evidentiary nature, such as footage from a police body camera or a security camera that might be used in court. Nobody wants to live in the world that detection technology lives in, which is all about probabilities. You want to be unequivocally certain that a video being entered as evidence in court is authentic. That it hasn’t been altered since its recording.”
To ensure a video hasn’t been altered, some form of certification or verification at the point of capture would be necessary. One of the more common methods under discussion involves watermarks digitally inserted into certain color frequencies. Theoretically, using AI, this would make it possible to create a digital bread-crumb trail for verifying a video’s authenticity throughout its lifecycle.
A healthy dose of blockchain
In fact, NYU Tandon School researchers recently showed off an experimental AI watermarking technique they say boosted the chances of detecting manipulated images from about 45 percent to more than 90 percent, without compromising image quality. Insertion of these digital fingerprints would happen in cameras, producing indelible records along with time stamps for forensic purposes.
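A simplified sketch of the embedding idea follows, using least-significant-bit insertion in one color channel as a stand-in for the frequency-domain techniques the researchers describe. The frame data and fingerprint format are hypothetical, chosen only to show how a timestamped fingerprint can ride invisibly inside pixel data:

```python
import hashlib

def embed_watermark(pixels, payload_bits):
    """Embed payload bits into the least significant bit of each blue value.
    A simplified spatial-domain stand-in: schemes like the NYU approach
    operate in the frequency domain so the mark survives recompression."""
    marked = [(r, g, (b & ~1) | bit)
              for (r, g, b), bit in zip(pixels, payload_bits)]
    return marked + pixels[len(payload_bits):]  # remaining pixels untouched

def extract_watermark(pixels, n_bits):
    """Recover the payload by reading back the low bit of each blue value."""
    return [b & 1 for (_, _, b) in pixels[:n_bits]]

# Payload: a timestamped fingerprint of the frame (hypothetical identifiers).
frame = [(120, 64, 200)] * 32
fingerprint = hashlib.sha256(b"frame-0001|2019-11-01T12:00:00Z").digest()
bits = [(fingerprint[i // 8] >> (i % 8)) & 1 for i in range(32)]

marked = embed_watermark(frame, bits)
assert extract_watermark(marked, 32) == bits  # payload survives the round trip
```

Flipping only the low bit changes each blue value by at most one step, which is why such marks are imperceptible; the in-camera, frequency-domain versions trade simplicity for robustness against editing and re-encoding.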
Of course, that raises the question of what you do with those watermarks once they’re generated. Many believe the answer is to write them to a blockchain, which Gartner defines as, “an expanding list of cryptographically signed, irrevocable transactional records shared by all participants in a network.” Placing watermarks in blockchains would effectively create a “shared single version of the truth,” providing undisputable and trustworthy audit trails for suspect videos.
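A minimal hash-chained ledger shows why such records are hard to dispute: each entry commits to the hash of the previous one, so tampering with any watermark record breaks every link after it. The class and record fields below are illustrative, not any vendor's actual schema.

```python
import hashlib
import json

def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

class WatermarkLedger:
    """Append-only chain of watermark records: each record commits to the
    previous record's hash, so altering any earlier entry invalidates
    everything after it."""

    def __init__(self):
        self.chain = [{"index": 0, "prev": "0" * 64, "watermark": "genesis", "ts": 0}]

    def _record_hash(self, record) -> str:
        # Canonical serialization so all participants compute the same hash.
        return sha256(json.dumps(record, sort_keys=True).encode())

    def append(self, watermark: str, ts: int):
        prev = self._record_hash(self.chain[-1])
        self.chain.append({"index": len(self.chain), "prev": prev,
                           "watermark": watermark, "ts": ts})

    def verify(self) -> bool:
        return all(self.chain[i]["prev"] == self._record_hash(self.chain[i - 1])
                   for i in range(1, len(self.chain)))

ledger = WatermarkLedger()
ledger.append(sha256(b"bodycam-clip-0417"), ts=1572566400)
ledger.append(sha256(b"bodycam-clip-0418"), ts=1572566460)
assert ledger.verify()

ledger.chain[1]["watermark"] = sha256(b"tampered")  # rewrite history...
assert not ledger.verify()                          # ...and the chain breaks
```

A real blockchain adds distributed consensus on top of this chaining, which is what turns the structure into the "shared single version of the truth" Gartner describes.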
Amber Video’s Allibhai says companies such as his, which are dabbling in the use of blockchain for authenticating videos, aren’t preoccupied with assessing the market size at this early stage. They just sense a need and are trying to address it with technological innovation.
“It’s hard to figure out what the profit potential is,” he says. “We just think it’s so important to aggressively move forward and build pools of video information, share them, and get them into the hands of companies and consumers.”
Sniffing out fake news
Experts say the capability to share video data in blockchains would not only be beneficial for the usual suspects who should care about it—social media sites, celebrities, politicians, and attorneys, for example—but for news media outlets as well, which have grown sensitive to accusations about propagating “fake news.” Indeed, The New York Times is experimenting with blockchain technology for recording and sharing metadata about media—news photos and videos in particular—published by news organizations. Part of the News Provenance Project, the pilot will lead to a proof of concept that could be used by other news organizations, the newspaper says.
For its part, Gartner believes this could be the future for the world of journalism, predicting that by 2023, as much as 30 percent of world news and video content will be authenticated as real by blockchain, countering deepfake technology.[1] Cryptographic links to validated and invalidated videos could be stored in the blockchain and made readily available to any authorized person needing to refer to them.
“A layered security approach requires identifying [exclusion] of ‘bad’ videos through advanced and AI detection models as well as [identifying ‘good’ videos], whose provenance can be tracked and traced through immutable blockchain ledgers,” says Gartner’s Litan. “[Identifying ‘good’ videos’] is always a more effective security measure than [excluding ‘bad’ videos]. But for practical reasons, we need a combination of both.”
The technology for detecting and authenticating deepfake videos will continue to develop, driven by the motivations of the private sector as well as the national security concerns of government. Today, however, there aren’t many viable solutions for slowing the rise of this content. So, for the time being, experts advise viewers to always assume that if something looks, sounds, and feels unreliable, it probably is.
“Deepfake videos are becoming a fact of life,” says NYU’s Memon. “People are just going to have to take a few deep breaths and not jump to conclusions with any video they see. They’ll have to look beyond simple sequences of frames before accepting something as true. We can no longer assume that seeing is believing. Trust is unfortunately eroding. For now, we’ll just have to learn to live with it.”
[1] "Gartner Top Strategic Predictions for 2020 and Beyond," Oct. 22, 2019
Deepfake video: Lessons for leaders
- Detecting high-quality deepfake video is beyond the skills of the average user.
- Even when identified as fake, these videos can have a big impact on public perception.
- The future may be in validating video via blockchain or some similar mechanism.
This article/content was written by the individual writer identified and does not necessarily reflect the view of Hewlett Packard Enterprise Company.