Coming To A Surveillance State Near You: Lip-Reading Computers

from the I'm-sorry-Dave,-you-can't-say-that dept

One of the most famous — and important — scenes in Stanley Kubrick’s film “2001” is when the two astronauts sit in a space pod in order to avoid being overheard by the ship’s computer, HAL, which they believe may represent a threat to their lives. Although they have prudently turned off the pod’s communication system, what they don’t realize is that HAL is able to follow their conversation by lip-reading, and hence is alerted to their disconnection plans.

Although it is unlikely that the Turkish authorities were inspired by the film, the following incident, reported in a post on the growing censorship in the country, reminds us that the use of lip-reading for surveillance purposes is not science fiction:

Last week, at the funeral of a soldier in Osmaniye, south-eastern Turkey, mourners voiced anger at the government’s decision to commit troops to conflict with PKK forces in the south-east, leading to several arrests.

Veli Ağbaba, deputy president of the opposition Republican People’s Party (CHP), and his colleagues visited two suspects in prison, and have stated that they were arrested on charges of “insulting the president” after footage of the funeral was scrutinized by lip-reading experts.

Calling in lip-reading experts to check whether somebody was insulting the President of Turkey at a funeral might seem a one-off product of an increasingly-paranoid security apparatus. Moreover, using humans is a surveillance technique that doesn’t really scale — unlike metadata analysis, say — so you might hope this is unlikely to be a problem for most of us. But it turns out that we are very close to building real lip-reading HALs. Here’s a 2014 article from The Week:

A Jordanian scientist has created an automated lip-reading system that can decipher speech with an average success rate of 76 per cent. The findings, in conjunction with recent advances in the fields of computer vision, pattern recognition, and signal processing, suggest that computers will soon be able to read lips accurately enough to raise questions about privacy and security.

Moore’s Law and other advances in computing pretty much guarantee that 76 percent success rate will rise inexorably, until high-accuracy lip-reading becomes a standard feature for CCTV surveillance systems, especially as very high-resolution cameras fall in price and are deployed more widely. HAL would be proud.

Follow me @glynmoody on Twitter, and +glynmoody on Google+


Comments on “Coming To A Surveillance State Near You: Lip-Reading Computers”

Roger Strong (profile) says:

Just to be clear: It wouldn’t just be lip reading. It would be lip reading to text, with the text stored permanently in a database. Searchable.

*Commercial* searchable databases, not just police/government. Much like there are now commercial licence plate readers along the road collecting and marketing your travel data. And your grocery purchase data, if you paid with plastic.

It can be combined with other information to tie it to individuals.

Employers and potential employers would find it invaluable. For proper “screening”, of course. Gotta make those security cameras pay for themselves.

Anonymous Anonymous Coward says:

When there IS freedom of speech

Based upon the headline I was ready to suggest “stop talking to your computer”. Personally, I only scream at it now and then.

While lip reading computers are likely a reality, a 76% accuracy rate sounds low. In addition, this will probably not be an issue in countries that have First Amendment-like rules, such as the US…oh…wait…

The likely outcome will be that people start wearing masks, at the very least covering their mouths when in public, and then what have the governments won? A masked populace?

Anonymous Coward says:

Easy Peasy Lemon Squeezy

At least in the US. All you have to do is something like:

10 print “Subject 1: I am a terrorist.”
20 print “Subject 2: I am also a terrorist.”
30 print “Subject 1: I am going to blow up a building tomorrow.”
40 print “Subject 2: I will help you. I have bomb making instructions.”
50 print “Subject 1: Yes, tomorrow we will blow up a building.”
60 print “Subject 2: Afterwards, let us go rape little girls.”
70 print “Subject 1: A good idea. Little boys also.”

And any US prosecutor or law enforcement officer will swear it’s 100% accurate. It never makes a mistake.

To the FBI agents reading this, I’ll sell you this wonderful program for just $50,000,000.00. That way you can claim it’s got to be good if you spent that much on it.

Uriel-238 (profile) says:

Not that they need any more tools to justify murder and robbery.

So who determines what technologies are usable without a warrant by law enforcement agents?

We already know that detection dogs are misused. Many dogs are trained to signal not when they detect something but when their handlers command them to, giving the handler alleged probable cause to search. Detection dogs are still used this way despite having a false positive rate greater than 50% even at their best.

So lip-reading software is just going to be another means for law enforcement to justify probable cause to SWAT your home.

Given how the police are more interested in robbing or bullying the laity than in protecting or serving them, they shouldn’t be trusted with any further forensic technology until the DoJ is reformed.

Or they can continue what they’re doing and enjoy their deteriorating reputation as tax-fed uniformed thugs.

A. Lauridsen says:

Moore's law??

I’m not convinced that Moore’s law applies here. I do not doubt that the technology is feasible; lip reading is “just” another pattern recognition problem.

What I do doubt is that Moore’s law will in any way influence the efficiency of the pattern recognition problem.

The real question is, how reversible is the process of going from lip shape to phoneme production? Phoneme production is determined by the vocal cords, tongue position, strength of air flow, and lips. Any lip reader – human or otherwise – will only have partial information about the sounds being produced.

This limiting factor seems to me to be more important than raw computing power.
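
To make the partial-information point concrete, here is a minimal Python sketch. The letter-to-lip-shape table is a simplified, hypothetical grouping invented for illustration (letters stand in for phonemes, and real viseme inventories are larger and differ between studies). It only shows that sounds formed with the same lip shape, such as p, b, and m, collapse into one visual class, so several different words look identical to a lip reader.

```python
from collections import defaultdict

# Hypothetical, simplified lip-shape ("viseme") classes; letters stand in
# for phonemes. p/b/m share closed lips, f/v share lip-on-teeth, and the
# remaining groupings are toy assumptions for this sketch.
VISEME_OF = {
    "p": "BILABIAL", "b": "BILABIAL", "m": "BILABIAL",
    "f": "LABIODENTAL", "v": "LABIODENTAL",
    "t": "ALVEOLAR", "d": "ALVEOLAR", "n": "ALVEOLAR",
    "a": "OPEN",
}

def to_visemes(word):
    """Map a word to the sequence of lip shapes a camera would see."""
    return tuple(VISEME_OF[ch] for ch in word)

def group_by_lip_shape(words):
    """Group words that produce the same lip-shape sequence."""
    groups = defaultdict(list)
    for word in words:
        groups[to_visemes(word)].append(word)
    return dict(groups)

words = ["pat", "bat", "mat", "pad", "ban", "man", "fat", "vat"]
groups = group_by_lip_shape(words)
for visemes, candidates in groups.items():
    print(visemes, "->", candidates)
```

Under these assumed classes, eight distinct words reduce to just two lip-shape sequences, which is the kind of information loss that no amount of raw computing power can undo by itself.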

nasch (profile) says:

Re: Moore's law??

This limiting factor seems to me to be more important than raw computing power.

I don’t claim to have specific information about how lip reading software works. However, it seems likely that to be any good it would use context to resolve ambiguities. And with more processing power, the program can do a better and faster job of analyzing lip movements compared to the context of the other surrounding movements and the conversation it’s decoded so far.
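
A rough Python sketch of what using context to resolve ambiguities could look like. Everything here is an assumption for illustration: the candidate sets, the bigram counts, and the scoring rule are invented, and this is not a description of how any actual lip-reading system works. Each position in the sentence holds the words that look alike on the lips, and a toy word-pair model picks the most plausible combination.

```python
from itertools import product

# Hypothetical input: for each spoken word, the candidates that look
# identical on the lips. Only the second word is ambiguous here.
AMBIGUOUS = [["the"], ["pat", "bat", "mat"], ["flew"], ["away"]]

# Toy word-pair counts standing in for a language model; the numbers are
# invented for this sketch.
BIGRAM_COUNTS = {
    ("the", "bat"): 5, ("the", "mat"): 4, ("the", "pat"): 1,
    ("bat", "flew"): 6,
    ("flew", "away"): 8,
}

def score(sentence):
    """Multiply add-one-smoothed counts of each adjacent word pair."""
    total = 1
    for pair in zip(sentence, sentence[1:]):
        total *= BIGRAM_COUNTS.get(pair, 0) + 1
    return total

def decode(candidates):
    """Try every combination of candidates and keep the best-scoring one."""
    return max(product(*candidates), key=score)

best = decode(AMBIGUOUS)
print(" ".join(best))
```

Trying every combination grows exponentially with sentence length, so a practical decoder would more likely use dynamic programming or beam search. Either way, longer context means more computation, which is where extra processing power would help.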

TRX (profile) says:

Re: Moore's law??

That’s going to be some hefty processing. I’m mostly deaf and despite years of practice, I can’t lip-read enough to pick out more than occasional phonemes.

On the other manipulator, a working lip-reading algorithm would not only help the deaf, it would also be useful for communicating in noisy environments or when you don’t want to make any noise.

Coyne Tibbets (profile) says:

Coming soon? Hope springs eternal...

Moore’s Law and other advances in computing pretty much guarantee that 76 percent success rate will rise inexorably…

If the Jordanians have done this publicly, you can bet that NSA has done it, and better, in secret.

NSA builds this technology and uses it for years, snickering evilly behind its black cloth. Then someone comes up with a public version and everyone thinks it’s “new.” Hope springs eternal in the human breast, but in the case of the NSA, there is no hope: only surveillance.

TRX (profile) says:

Re: About time. . .

ventriloquists … terrorists

“Prime time. There I was in the TV studio, disguised as a paper-shredder, recordin’ the assassination of Blocky Yocks. Dunno why the Academy’d bothered—it was seen live coast t’ coast by half the population of the country at the time.”

“But, in its early dyin’ throes, the system lashed out at its tormentor. The week followin’ Blocky’s announcement, as I was crouched, sweatin’ inside a plastic bagfulla confetti an’ he was in the middle of his openin’ monologue, two CIA loaners an’ a paira outa-work installers busted into Studio B with silenced Ruger Mark IIs an’ emptied their clips into poor Blocky, endin’ his career forever.

Too bad the stupid jerks didn’t think t’ shoot his partner, the ventriloquist.”

– excerpted from “The Nagasaki Vector” by L. Neil Smith

Rekrul says:

I’m sure this technology won’t be misused…

[facing away from camera on July 1st]

Bob: How’s the pool party coming along?

Jim: Great. I’ve got the food and sent the invites.

Bob: Did you inflate all the pool toys?

[turns toward camera]

Jim: Not yet, I want to wait until the 4th to see how many people show up, then I can decide how many floats I’m going to blow up.

[FBI swoops in, charges Jim with planning an act of terrorism]
