IN early June 2019, a fake video of Mark Zuckerberg, the CEO of Facebook, began to circulate on the social media platform Instagram. In it, he appeared to thank a shadowy organisation called “Spectre” for helping him manipulate people into willingly divulging their personal information to him.
The video, which is still available to watch on YouTube, looks authentic, but was in fact generated by an algorithm that is able to mimic human voices and facial appearances. Worryingly, other algorithms exist that can do the task even more convincingly.
An international team of scientists working with Adobe (the company behind Photoshop, which is commonly used to create doctored images) have created software that can alter the words that people say in a video of them speaking.
Users of the software can edit the transcript of what is said to suit their own needs, producing doctored videos that are realistic enough to fool viewers 60 per cent of the time.
These test subjects were actively looking for fakes (they rated real videos as fake 20 per cent of the time), so the doctored videos should be even harder for a casual observer to detect.
One limiting factor is that the underlying algorithm requires around 40 minutes of video of someone speaking before it has gathered enough information to begin to change what they say.
It goes through the source video, chopping up the words that are spoken into “phonemes” and “visemes.” Phonemes are the different sounds that form the building blocks of spoken words, while visemes are the facial expressions that we make while we are pronouncing phonemes. The software scans the footage to build a library of phonemes and visemes that are linked together.
From this library it can reconstruct the sound of any possible word from the phonemes that make it up. Consider the word “when,” composed of three phonemes: “/wh/,” “/e/” and “/n/.” As long as these phonemes are in the library (for example, if the person was recorded saying “what” and “sent”), the person can be made to say “when.”
The algorithm combines these phonemes with the corresponding visual sequence of visemes, and video-editing techniques smoothly layer the newly created visemes over the speaker’s face to match the newly generated audio.
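To make the idea concrete, here is a minimal Python sketch of that library lookup. Everything in it is hypothetical (the phoneme inventory, the clip names and the assemble_word helper); the real software operates on raw audio and video segments rather than file names, and uses editing techniques to smooth the joins.

```python
# Hypothetical phoneme library built by scanning 40 minutes of source
# footage: each phoneme maps to an (audio clip, viseme clip) pair cut
# from moments where the speaker pronounced it.
library = {
    "/wh/": ("wh.wav", "wh_viseme.mp4"),  # e.g. harvested from "what"
    "/e/":  ("e.wav",  "e_viseme.mp4"),   # e.g. harvested from "sent"
    "/n/":  ("n.wav",  "n_viseme.mp4"),   # e.g. harvested from "sent"
}

def assemble_word(phonemes):
    """Return the audio and viseme clip sequences for a new word,
    provided every phoneme is already in the library."""
    missing = [p for p in phonemes if p not in library]
    if missing:
        raise KeyError(f"phonemes not yet in library: {missing}")
    audio_clips = [library[p][0] for p in phonemes]
    viseme_clips = [library[p][1] for p in phonemes]
    return audio_clips, viseme_clips

# "when" can be synthesised even though the speaker never said it:
audio, visemes = assemble_word(["/wh/", "/e/", "/n/"])
```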
It is easy to see how, in the wrong hands, this could become a tool for mass misinformation, yet the researchers who created the software are dismissive of its disruptive potential. They argue that we have been able to convincingly alter photographs for many years without catastrophic repercussions.
But the consequences of photo-manipulation pervading every aspect of our lives are slowly emerging. From its earliest beginnings, photography has never been free from retouching, but the ever-increasing technological power to do so has made any claim to represent reality look extremely tenuous.
Most notably, airbrushed photographs of models in magazines have been a focus of campaigning for many years because of the negative body image they promote.
Real-time photo-editing technology has brought the fight from the world of mass media into our own phone cameras. Photo-sharing apps like Snapchat use filters that enhance photos as they are taken.
While enhancements like cartoon dog ears are obviously artificial, beautifying effects such as smoother skin and bigger eyes can be absorbed unconsciously. A recent study found a positive association between a person’s investment in social media and how favourably they view cosmetic surgery.
As for this newest video-manipulation software, the requirement for 40 minutes of source material means that not just any video can be altered. But in future that requirement may shrink, or new techniques may be developed that assemble your phoneme and viseme library from many different videos. Or consider the fact that your laptop’s webcam and microphone, if hacked, could provide ample footage to fake you saying anything.
Methods already exist to detect when a video has been altered, and these can be used to distinguish fake videos from real ones. Proving that the doctoring was intentional, however, is not always possible.
In one famous case last November, the White House press secretary retweeted a video posted by an alt-right conspiracy theorist as evidence that CNN journalist Jim Acosta had lashed out at a White House intern during a press briefing.
Amateur analysis was enough to show that several key frames had been removed from the video, making Acosta’s arm appear to move more quickly than it actually did and transforming a brush-off into an aggressive chopping movement.
Yet the intention to deceive was impossible to prove. Video compression techniques also use frame deletion to decrease file size, meaning that the key frames could have been dropped unintentionally during repeated downloading and re-uploading of the video.
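A toy Python sketch (with invented numbers) shows why dropping frames changes apparent speed: if playback stays at the same frame rate, the same motion unfolds in less time.

```python
# A gesture captured as 12 frames at 30 frames per second.
frames = list(range(12))
fps = 30
print(f"original duration: {len(frames) / fps:.2f}s")  # 0.40s

# Delete every other frame, as heavy compression (or doctoring) might.
kept = frames[::2]
print(f"after frame deletion: {len(kept) / fps:.2f}s")  # 0.20s

# Played back at the same 30fps, the motion now takes half the time,
# so the arm appears to move twice as fast.
```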
The existence of an “authentic” recorded version of reality is not enough to end debate. Some of the most popular conspiracy theories revolve around events that were extensively documented on film in real time: the assassination of JFK, the Moon landings and the 9/11 attacks. Debates focus not just on possible technical manipulation at a frame-by-frame level, but on the overall interpretation of the sequence of images.
One doesn’t have to be a conspiracy theorist to endorse the message of the late communist art critic John Berger: visual images have always been used to promote and reinforce ways of thinking and living that uphold the status quo. In his book Ways of Seeing he writes: “The art of the past no longer exists as it once did, its authority is lost. In its place there is a language of images. What matters now is who uses that language for what purpose.”
In a world where photos and videos can be manipulated more subtly and pervasively than ever, we should examine carefully how images are used and in whose interests, regardless of their “authenticity.”
