Spammers Solving Difficult AI Problems With An Underground X Prize

from the fascinating dept

Slashdot points us to an interview with Luis von Ahn (of whom we’re big fans), in which he talks about how spammers frustrated by various types of CAPTCHA tests have set up their own sort of “innovation prize,” offering somewhere in the range of $500,000 for software that can automatically pass CAPTCHA and reCAPTCHA reading tests (the things where you have to type in a series of letters to sign up for a service or post a comment). As von Ahn points out: “If [the spammers] are really able to write a programme to read distorted text, great — they have solved an AI problem.” It is, effectively, an “X Prize” for optical character recognition. Not that we like to encourage spammers, but it is rather fascinating how the underground business seems to mirror the above-ground innovation world.



Comments on “Spammers Solving Difficult AI Problems With An Underground X Prize”

Xanthir, FCD (profile) says:


For those of you who don’t know, reCAPTCHA is a very popular captcha program that uses images of words that couldn’t be read properly by the OCR software used by various book digitizing programs. The fact that the OCR software failed means that it’s unlikely any other similar software will succeed, which is what makes it such a great captcha.

The neat thing is that it forces you to decipher *two* words, one of which is already known and one which is not. If you get the known word right, you pass. Once enough people give the same answer for a particular unknown word, though, that information is passed back upstream to the book digitizers.
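As a rough illustration of the two-word scheme described above, here is a minimal Python sketch. It is a toy model only, not reCAPTCHA's actual implementation: the class name, the method names, and the agreement threshold of 3 are all assumptions made for the example, since the comment does not say how many matching answers count as "enough."

```python
from collections import Counter

# Hypothetical threshold: how many matching answers are needed before an
# unknown word counts as solved. The real value is not given in the thread.
AGREEMENT_THRESHOLD = 3


class TwoWordChallenge:
    """Toy model of a challenge pairing one known word with one unknown word."""

    def __init__(self, known_answer, threshold=AGREEMENT_THRESHOLD):
        self.known_answer = known_answer
        self.threshold = threshold
        self.votes = Counter()  # tallies guesses for the unknown word

    def submit(self, known_guess, unknown_guess):
        """Return True if the user passes; record the guess for the unknown word."""
        if known_guess.strip().lower() != self.known_answer.lower():
            return False  # failed the control word, so the other guess is ignored
        self.votes[unknown_guess.strip().lower()] += 1
        return True

    def resolved_word(self):
        """Return the agreed reading of the unknown word, or None if undecided."""
        if not self.votes:
            return None
        word, count = self.votes.most_common(1)[0]
        return word if count >= self.threshold else None


challenge = TwoWordChallenge(known_answer="morning")
challenge.submit("morning", "upon")
challenge.submit("mourning", "open")  # fails the control word, vote discarded
challenge.submit("morning", "upon")
challenge.submit("Morning", "upon")
print(challenge.resolved_word())  # prints "upon"
```

Only guesses from users who got the known word right are tallied, which is why passing the test and helping the digitizers are the same action in this model.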

If the spammers can solve this reliably, it means that they’ve made a great advancement in the field of OCR which can be passed back to the book digitizers for great benefits. And it still won’t defeat reCAPTCHA unless the new software is *perfect* – there will still be words that can’t be read by the new software.

Anonymous Coward says:

Re: Awesome

However, if you were to write a program, seed it with correct answers, and have it consistently send the wrong answers back to reCAPTCHA, you might be able to trick their servers into accepting wrong answers, and then build up your set of solved captchas simply by using a reasonably small set of seeds. … But then again, I'm not working on reCAPTCHA, so they may already have things in place to prevent this. Still, there are a ton of sites using it, and I feel like it is almost entirely automated, so this type of hack could potentially ruin their model.
