Bleeding Edge

by Mike Masnick

The Anti-Turing Test

from the captchas dept

A few months ago someone sent me the following, which I found to be very cool:

"... randomising letters in the middle of words [has] little or no effect on the ability of skilled readers to understand the text. This is easy to denmtrasote. In a pubiltacion of New Scnieitst you could ramdinose all the letetrs, keipeng the first two and last two the same, and reibadailty would hadrly be aftcfeed. My ansaylis did not come to much beucase the thoery at the time was for shape and senqeuce retigcionon. Saberi's work sugsegts we may have some pofrweul palrlael prsooscers at work. The resaon for this is suerly that idnetiyfing coentnt by paarllel prseocsing speeds up regnicoiton. We only need the first and last two letetrs to spot chganes in meniang."

I wish I had a real source for it, but all I get on a Google search is other sites posting the same quote. Anyway, I was just reminded of that when reading this NY Times article about the idea of "Captchas", which are tricks to make sure someone filling out a web-form or registration page is really a human, and not a bot. In other words, it's a sort of anti-Turing test. I would think that a system using plenty of misspelled words like the above paragraph could easily fool a computer while remaining understandable to humans, and could make a good captcha.
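For anyone who wants to try the effect, here is a minimal Python sketch of the scrambling the quote describes: keep the first two and last two letters of each word and shuffle whatever sits between them. The function names and the `keep` parameter are made up for illustration; nothing here comes from the letter's author.

```python
import random
import re

def scramble_word(word, keep=2, rng=random):
    """Shuffle the interior letters of a word, keeping `keep` letters fixed at each end."""
    if len(word) <= 2 * keep + 1:
        return word  # zero or one interior letter: nothing to shuffle
    middle = list(word[keep:-keep])
    rng.shuffle(middle)
    return word[:keep] + "".join(middle) + word[-keep:]

def scramble_text(text, keep=2, rng=random):
    """Scramble every alphabetic run in `text`, leaving punctuation and spacing alone."""
    return re.sub(r"[A-Za-z]+", lambda m: scramble_word(m.group(0), keep, rng), text)

# A seeded generator makes the shuffle repeatable between runs.
print(scramble_text("randomising letters in the middle of words", rng=random.Random(0)))
```

Passing `keep=1` gives the more common "first and last letter only" variant of the same trick.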

Reader Comments

  1. Bob Bechtel, Dec 10th, 2002 @ 5:16am

    Randomize vs. swap?

    How (if at all) does this finding interact with what is known of dyslexia?

  2. 2Lazy2Register, Dec 10th, 2002 @ 6:22am

    This was easy to read...

    ... but then again, I read Slashdot a lot where spelling correctly only confuses matters.

  3. Anonymous Coward, Dec 10th, 2002 @ 12:54pm

    I fail to see

    where this would be of any use in defeating a robot.

    If my robot is matching "Address" now, how hard would it be to change that to match /A[dres]{5,5}s/?

    Not very.
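As a rough illustration of the pattern in this comment, here it is in Python's `re` module (`{5,5}` is the same as `{5}`). The sketch also shows that the character class is looser than a true scramble: it accepts any five letters drawn from d, r, e, s. The helper `is_scramble_of` is a hypothetical stricter check added for comparison, not something the commenter proposed.

```python
import re

# The commenter's pattern: "A", any five letters drawn from d/r/e/s, then "s".
LOOSE = re.compile(r"A[dres]{5}s")

def is_scramble_of(candidate, target):
    """True if candidate keeps target's first and last letter and permutes the rest."""
    return (
        len(candidate) == len(target)
        and candidate[0] == target[0]
        and candidate[-1] == target[-1]
        and sorted(candidate[1:-1]) == sorted(target[1:-1])
    )

samples = ["Address", "Adderss", "Aserdds", "A" + "d" * 5 + "s", "Adress"]
for word in samples:
    # Columns: word, matches the loose regex, is a true scramble of "Address"
    print(word, bool(LOOSE.fullmatch(word)), is_scramble_of(word, "Address"))
```

The regex accepts the all-d string while the stricter check rejects it, and neither handles the dropped letter in "Adress".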

  4. Timmmay!, Dec 10th, 2002 @ 1:02pm

    Not a surprise

    Part of learning languages is learning the N-gram statistics (combinations of N letters, usually 2 or 3, that are used in the language). For example, "ea" is a lot more common in English than "ae". Your brain uses that to "repair" mixed up text. If you were not a fluent English speaker then you would have a great deal of difficulty doing this.

    A computer can easily compensate for this by using a dictionary and N-gram statistics to correct text.
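A small Python sketch of what the dictionary half of that might look like, with a bigram counter alongside it. It assumes only the interior letters were shuffled, and the tiny `WORDS` list stands in for a real dictionary; all names are illustrative.

```python
from collections import Counter, defaultdict

# Tiny illustrative word list; a real system would load a full dictionary.
WORDS = ["readability", "hardly", "affected", "research",
         "letters", "theory", "the", "would", "be"]

def signature(word):
    """First letter, last letter, and the sorted interior identify a scramble class."""
    w = word.lower()
    return (w[:1], w[-1:], "".join(sorted(w[1:-1])))

# Map each signature to the dictionary words that share it.
INDEX = defaultdict(list)
for w in WORDS:
    INDEX[signature(w)].append(w)

def unscramble(word):
    """Return dictionary words whose signature matches; fall back to the input."""
    return INDEX.get(signature(word), [word])

def bigram_counts(words):
    """Count adjacent letter pairs (2-grams) across a word list."""
    counts = Counter()
    for w in words:
        counts.update(w[i:i + 2] for i in range(len(w) - 1))
    return counts

print([unscramble(w)[0] for w in "reibadailty would hadrly be aftcfeed".split()])
print(bigram_counts(WORDS).most_common(3))
```

When several dictionary words share a signature, bigram or word frequencies would be one way to rank them; this sketch just returns all candidates.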

  5. Zak McKracken, Dec 10th, 2002 @ 8:33pm

    No Subject Given

    Isn't this just a factor of dealing with twits^H^H^H^H^H users who can't be bothered learning how to either correct their spelling or just plain type accurately?

    I know that I'll make the odd type-o that I don't pick up, but heck - its really not that hard to spend a little time proof reading?

  6. Karthik, Apr 10th, 2003 @ 12:15pm

    Re: Randomize vs. swap?

    Don't you mean Slysdexic? :-p

  7. Anonymous Coward, Apr 10th, 2003 @ 1:39pm

    Re: I fail to see

    Because your robot has to know which variation to use by context (something beyond the scope of regex). And your pattern is too strict: it also needs to accommodate missing letters (not mentioned here, but another important challenge) like adress or addrss. It also needs to understand extra letters (in combination with missing letters) like addresse and adresse.
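One standard way to score missing and extra letters is edit distance. The sketch below is a plain Levenshtein implementation in Python, offered as an illustration rather than as what the commenter had in mind; it counts insertions, deletions, and substitutions, so a pure transposition costs two edits.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or free if equal
            ))
        prev = curr
    return prev[-1]

for variant in ["adress", "addrss", "addresse", "adresse"]:
    print(variant, edit_distance(variant, "address"))
```

A matcher could accept any form field whose label is within a small distance of "address", though deciding which variation is meant from context is a separate problem.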

  8. Anonymous Coward, Apr 10th, 2003 @ 3:20pm

    Re: Randomize vs. swap?

    I'm a reformed dyslexic. That is, I was taught strategies for dealing with it when I was very young. I've long since internalized them and can function as well as a non-dyslexic. Usually that is; if I'm emotionally agitated I start having trouble with spatial relationships..."right", "left", "inside", "outside" are very much front brain concepts to me. Get me upset and I can forget until I calm down.

    That said, I can say I had no trouble whatsoever reading the mixed up paragraph. At least in my case, dyslexia has no effect on my ability to do such things.

  9. michael rosenbaum, Apr 10th, 2003 @ 9:10pm

    Re: Randomize vs. swap?

    most dyslexia occurs in english speaking countries, according to
    or more precisely

    english has so many spelling variations for individual phonemes, it makes sense we can understand this story. i wonder if readers of languages with fewer spelling variations can do this trick.

  10. Anonymous Coward, Apr 11th, 2003 @ 5:24am

    Re: No Subject Given

    That "its" was deliberate, I take it.

  11. James, Apr 11th, 2003 @ 6:30am

    Re: I fail to see

    Actually, that isn't so hard either. Perl has a fuzzy string matching function somewhere - I remember using it. Knowing that generally, the first and last two letters will be the same, I think you could translate most of the words back into English.
    Of course, whether this helps to spot the difference between a computer and a person, like the captchas, depends on how you use it. If it's simply a matter of repeating the muddled word in English, then it's easy. If it's interpreting a sentence like "It's Friday today, and this weekend I'm having a party. Would you expect me to be happy?", or "What colour is grass?" or "My foot itches. Should I scratch it, slap it, or paint it blue?", then it's a problem. Of course, that would be a problem anyway.
    What was my point again?
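The comment does not name the Perl function, so as a stand-in here is a sketch using Python's standard `difflib.get_close_matches`. It first narrows the vocabulary to words sharing the first two and last two letters, then picks the closest match. `VOCAB`, `guess_word`, and the 0.5 cutoff are assumptions chosen for illustration.

```python
import difflib

# Illustrative vocabulary; a real run would use a full word list.
VOCAB = ["readability", "publication", "scientist",
         "recognition", "processing", "parallel"]

def guess_word(scrambled, vocab=VOCAB, cutoff=0.5):
    """Pick the closest vocabulary word that shares the first two and last two letters."""
    s = scrambled.lower()
    candidates = [w for w in vocab if w[:2] == s[:2] and w[-2:] == s[-2:]]
    matches = difflib.get_close_matches(s, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else scrambled  # unknown words pass through

for word in ["pubiltacion", "Scnieitst", "xyzzy"]:
    print(word, "->", guess_word(word))
```

This only covers the "repeat the muddled word in English" case; it does nothing for the sentence-interpretation questions raised in the same comment.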

  12. Noam, Apr 11th, 2003 @ 8:43am

    Re: Not a surprise

    In cases like these, syntax and semantic context are probably at least as important as word-by-word analysis. Linguistic research has shown that people tend to anticipate later words or grammatical structures as they read earlier words in a passage, and that set of expectations, produced by such on-the-fly progressive analysis of a sentence, speeds up our processing time. Familiarity with a language, with the lexicon and with the syntactical conventions will be useful in all instances.

    A related scenario comes up when letters, instead of being transposed, are substituted for the wrong letters or symbols. This tends to happen when Americans in France, using a French keyboard, write to me in English. Because the locations of the keys are transposed (QWERTY is not used there), I end up getting things like: "Deqr Noq,; It zqs reqlly greqt tqlking to you..." (This is a mild example.) I found this type of substitution very easy to pick up in real time, partly on the basis of context and partly because many key word-initial or word-final letters were not changed.
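Unlike scrambling, this kind of substitution is a fixed mapping, so a machine can undo it with a lookup table. A minimal Python sketch follows, covering only the swaps visible in the quoted example (q/a, z/w, and the comma and semicolon keys); a complete AZERTY table would need more entries, and the names are illustrative.

```python
# Received character (typed on AZERTY with QWERTY habits) -> intended character.
# Only the keys visible in the quoted example are mapped here.
AZERTY_TO_QWERTY = str.maketrans({
    "q": "a", "a": "q", "z": "w", "w": "z",
    "Q": "A", "A": "Q", "Z": "W", "W": "Z",
    ",": "m", ";": ",",
})

def undo_layout_swap(text):
    """Translate each received character back to the one the typist meant."""
    return text.translate(AZERTY_TO_QWERTY)

print(undo_layout_swap("Deqr Noq,; It zqs reqlly greqt tqlking to you..."))
```

On the quoted sample this yields "Dear Noam, It was really great talking to you...".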

  13. Anonymous Coward, Apr 11th, 2003 @ 9:36am

    what a crock

    yes, there will always be tricks to weed the X's from the Y's and the X's and Y's will continue to change and new tricks will pop up. Calling this an "anti-turing test" is glorifying stupid hacked up tests to tell things apart. if you think intelligence can be tested with some tricks then i feel sorry for you.

  14. Jan, Jul 23rd, 2003 @ 7:42pm

    Quote source

    Your "reibadailty" quote is a letter to the New Scientist but I don't know what date. We have a copy of it on our staffroom wall -

    "You report that reversing 50-millisecond segments of recorded sound does not greatly affect listeners' ability to understand speech (In Brief, 1 May, p27).
    This reminds me of my PhD at Nottingham University (1976), which showed that randomising letters ..." etc.

    Hope this helps.

  15. huayangao, Dec 28th, 2007 @ 3:50pm

    Turing Test Two

    ... In Turing Test Two, two players A and B are again being questioned by a human interrogator C. Before A gives his answer (labeled as aa) to a question, he is also required to guess how the other player B will answer the same question, and this guess is labeled as ab. Similarly B will give her answer (labeled as bb) and her guess of A's answer, ba. The answers aa and ba will be grouped together as group a and similarly bb and ab will be grouped together as group b. The interrogator will be given first the answers as two separate groups and with only the group label (a and b) and without the individual labels (aa, ab, ba and bb). If C cannot tell correctly which of the aa and ba is from player A and which is from player B, B will get a score of one. If C cannot tell which of the bb and ab is from player B and which is from player A, A will get a score of one. All answers (with the individual labels) are then made available to all parties (A, B and C) and then the game continues. At the end of the game, the player who scored more is considered to have won the game and to be more "intelligent". ...
