Making CAPTCHAs Productive

from the now-there's-a-good-idea dept

About five years ago, Luis von Ahn was the PhD student who came up with the idea for CAPTCHAs, the little requests to “type this” before you could fill out a form or sign up for a service. These days, of course, such CAPTCHAs have become nearly ubiquitous. Since then, von Ahn has gone on to create other online systems that figured out ways to shift labor resources to users, such as the ESP Game, which is designed to make image search much more effective (and which Google eventually licensed). However, it seems that von Ahn has switched his attention back to CAPTCHAs after recognizing what a productivity drain they must be. The nice thing about the ESP Game is that the end result benefits image search. CAPTCHAs only help weed out spammers and scammers. However, John writes in to let us know that von Ahn’s latest work is about making CAPTCHAs useful. What he’s done is make it so that the text users have to type comes from books or other printed materials that are being scanned by Brewster Kahle’s Internet Archive project. That way, each time people are simply trying to enter a comment on a website, they’re also helping to turn a scanned word into text for the Internet Archive. Of course, if someone were really sneaky, they would just do the same sort of thing and hook it up to Amazon’s Mechanical Turk and keep all the earnings. Every time someone entered a comment on a site, it would earn you money. So, if anyone wants to do this, please reserve a cut for me.



Comments on “Making CAPTCHAs Productive”

9 Comments
scottbp (user link) says:

A nice idea but...

The whole idea of using CAPTCHAs to “shift labor resources to users” seems to me to be a crazy UI design decision.

Personally, when I am designing sites and applications I try to increase usability and decrease cognitive load for the user. This seems backward to me. It also seems counterproductive to put more barriers in front of users right when we are already asking them to search their memories for sign-in or registration information.

The ESP Game works because it is a game, and produces useful info as a by-product. This new CAPTCHA scheme takes something that we should already be doing on the application side (identifying robots) and then makes it even more complicated. I think we should be decreasing the use of tools like CAPTCHAs, not adding more complexity to the system.

Of course I think this idea is quite ingenious; I just don’t want to use it on any of the sites I design.

Ajax 4Hire (profile) says:

I had forgotten about the CAPTCHA,

thanks for the article reminding me of the name.

In the early 21st century these were used to help site owners distinguish between human and machine. The only problem was that graphics engines, facial recognition and increased computing Zs (archaic term for MHz/horsepower) allowed for simple character recognition that was as good as, and sometimes even better than, the human's.

Consider the problem of “recognizing” 1 (one) and l (ell),
upper-case letters entered as lower case, or zer0 and Oh.

CAPTCHA was transcended by similar techniques that required a Turing-style test to gestalt the GIF/JPG/MPG/264.

Next used were images/pictures with a question of “what is this?” (answer: flower), moving beyond text-based recognition to simple images.

But the CAPTCHA was a minor irritation; the image recognition was more frustrating, since multiple valid answers (like the zer0/Oh) caused more ire directed at the site.

There was a short-lived attempt to use near-current-event questions, similar to the World War II “Who won the World Series last year?” questions: a query that only a human, or someone on your side, would know.

CAPTCHA and image queries were followed by secondary email authentication; a user must provide an email address and respond to THAT. This also proved to be relatively easy to overcome, as machine-generated email and email filtering/recognition were advanced enough to parse the query and provide the appropriate response.

There were also some short-lived attempts to validate through the exchange of fractional currency (Microsoft, eBay/PayPal and Oracle all tried bank/credit card/currency-based checks, on the assumption that only a human was stupid enough to give up access to a currency exchange account).

By the early teens (2017 uwantwat.com is probably the best early example), sites became indistinguishable from human response in a Turing test. In fact, the best false positive test (machine passes as human) was summed up in the statement:
“to human is to err.”

Turing tests started using the statistical expectation of a slightly wrong answer. But again, the basic problem is that a machine is trying to authenticate a real human response. Given sufficient access to the machine, you can craft a complement machine to give the expected response.

Read your history books, it’s all in there.

JBB says:

Re: Knowing if the response is correct...

The system uses two words in the CAPTCHA. The first is a known word. The second is one the OCR didn’t recognize. If the first one is entered correctly, the system assumes you’re a human. It then records your answer for the second one and compares it with other people’s answers, and if enough of them agree, it accepts that as the reading of the unOCRable word.
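
The two-word scheme described above could be sketched roughly as follows. This is only an illustration of the idea, not the actual system: the class name `TwoWordCaptcha`, the `submit` method, the word IDs and the agreement threshold are all assumptions made up for the example.

```python
from collections import Counter, defaultdict


class TwoWordCaptcha:
    """Hypothetical sketch: a known control word plus an unrecognized word."""

    def __init__(self, threshold=3):
        # Number of matching answers needed before a reading is accepted
        # (an assumed value; the comment only says "enough agree").
        self.threshold = threshold
        self.votes = defaultdict(Counter)  # unknown word id -> answer tallies
        self.resolved = {}                 # unknown word id -> accepted text

    def submit(self, known_word, known_answer, unknown_id, unknown_answer):
        """Return True if the user passes the check on the known word."""
        if known_answer.strip().lower() != known_word.lower():
            # Control word is wrong: reject, and ignore the second answer.
            return False
        guess = unknown_answer.strip().lower()
        if guess and unknown_id not in self.resolved:
            self.votes[unknown_id][guess] += 1
            top, count = self.votes[unknown_id].most_common(1)[0]
            if count >= self.threshold:
                self.resolved[unknown_id] = top
        return True


captcha = TwoWordCaptcha(threshold=2)
captcha.submit("morning", "morning", "scan-17", "upon")
captcha.submit("morning", "mourning", "scan-17", "open")  # fails control word
captcha.submit("morning", "Morning", "scan-17", "upon")
print(captcha.resolved)  # {'scan-17': 'upon'}
```

Answers from users who miss the known word are thrown away, so a bot guessing at random cannot vote on the unrecognized word.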
