Deepfake videos have been in the news for some time. But governments and experts alike are concerned that deepfake audio could be another technology responsible for spreading fake news far and wide. Beyond disinformation, a computer’s ability to replicate voices could also be used to scam ordinary people and large companies. So is there a way to detect deepfake audio fraud and prepare yourself for this kind of scamming?
First of all, let’s look at how deepfake technology actually works.
“Deepfake is the use of machine learning to create audio/visual impersonations of real people,” explains Alexander Korff, executive director and general counsel at voice biometrics company ValidSoft. “It uses techniques based on so-called Generative Adversarial Network (GAN), which can generate new data from existing data sets.”
For example, he says, existing audio of a person speaking “can be used to generate new synthetic audio of that person, based on what the algorithm learnt from the real audio.”
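The adversarial idea Korff describes can be sketched in miniature. The toy below is purely illustrative and bears no relation to any production voice-cloning system: a "generator" (a simple linear model) learns to mimic a "real" data distribution (a 1-D Gaussian standing in for audio features) by trying to fool a "discriminator" (logistic regression), while the discriminator simultaneously learns to tell real from fake. All parameter names and hyperparameters are invented for the example.

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Generator: g(z) = a*z + b, turns random noise z into a "fake" sample.
a, b = 1.0, 0.0
# Discriminator: d(x) = sigmoid(w*x + c), estimates P(x is real).
w, c = 0.1, 0.0
lr, n = 0.01, 64  # learning rate, batch size

for step in range(2000):
    # Discriminator step: push d(real) toward 1 and d(fake) toward 0.
    grad_w = grad_c = 0.0
    for _ in range(n):
        xr = random.gauss(4.0, 1.0)          # "real" sample, mean 4
        xf = a * random.gauss(0.0, 1.0) + b  # "fake" sample from generator
        sr, sf = sigmoid(w * xr + c), sigmoid(w * xf + c)
        grad_w += (1 - sr) * xr - sf * xf    # ascend log d(real) + log(1 - d(fake))
        grad_c += (1 - sr) - sf
    w += lr * grad_w / n
    c += lr * grad_c / n

    # Generator step: push d(fake) toward 1, i.e. learn to fool d.
    grad_a = grad_b = 0.0
    for _ in range(n):
        z = random.gauss(0.0, 1.0)
        sf = sigmoid(w * (a * z + b) + c)
        grad_a += (1 - sf) * w * z           # ascend log d(fake)
        grad_b += (1 - sf) * w
    a += lr * grad_a / n
    b += lr * grad_b / n

# After training, the generator's output drifts toward the real distribution.
fake_mean = sum(a * random.gauss(0.0, 1.0) + b for _ in range(5000)) / 5000
print(f"generated mean ~= {fake_mean:.2f} (real mean is 4.0)")
```

The point of the sketch is the tug-of-war: neither network is given the real distribution directly, yet the generator's output converges toward it because the discriminator keeps raising the bar. A real voice-cloning model applies the same adversarial pressure to vastly higher-dimensional audio features learnt from recordings of the target speaker.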
“What you see is less important than what you hear,” Dr. Matthew Aylett, chief scientific officer at text-to-speech specialist CereProc, adds. “A deepfake video, where misinformation is the aim of criminals or other bad actors, isn’t harmful unless there is a message and it is believable. This isn’t visual in most cases, it’s auditory. Through weaponising this technology, you can manufacture conversations or statements that never happened.”
And according to one security firm, this technology isn’t a far-off threat. It’s happening right now. As the BBC reports, Symantec noted three cases of deepfake audio being used to “trick” senior financial controllers into transferring money. The audio replicated the voices of company chief executives.
Chief technology officer Dr. Hugh Thompson told the BBC that corporate videos, presentations, and media appearances made by senior company figures could be useful for an artificial intelligence algorithm such as this. But he noted that it would take hours of audio and thousands of pounds to create high-quality deepfakes.
Other experts disagree. In an Open Access Government article, Dr. Aylett stated that around half an hour of recording would be enough and that the computing power required is “more accessible than ever.” However, there are a number of challenges. Finding clean audio without background noise and creating a realistic conversation are two of them.
Spotting deepfake audio can be tricky. But it’s possible, says Dr. Aylett. “Ask yourself: ‘Does this person’s voice sound completely natural?’ If in doubt, ask them questions.” This may catch the technology out because, he says, it currently “struggles to converse as effectively as humans.” If you’re still unsure, he advises calling the number back “to see who or what answers.” And, whatever you do, “never share sensitive information,” Dr. Aylett warns.
For non-conversational audio, such as a statement, it’s a different story. “The current quality is high,” says Korff, “and extremely difficult for the human ear to discern a real voice from a deepfake.” Although individual people may have difficulty, software used by banks most likely won’t. “A potential victim of bank fraud therefore cannot protect themselves from deepfakes per se, but their bank, if using the appropriate voice biometric technology, can do exactly that.”
That said, deepfake cases are expected to rise — especially ones relating to fraud. “Anyone possessing the skills can spread fake news, influence staff by passing on fake commands from senior leadership, create false evidence to change the outcome of court cases, [and] even blackmail innocent people,” says Dr. Aylett.
And, by replicating a loved one’s voice, victims could “be duped or scammed into sharing their bank details or transferring money to a third party. This is especially concerning for the elderly,” he says.
Right now, educating yourself on what the technology is, how it can be used, and how to detect it is the best thing you can do.