No, A 'Supercomputer' Did NOT Pass The Turing Test For The First Time And Everyone Should Know Better

from the what-a-waste-of-time dept

So, this weekend's news in the tech world was flooded with a "story" about how a "chatbot" passed the Turing Test for "the first time," with lots of publications buying every point in the story and talking about what a big deal it was. Except, almost everything about the story is bogus and a bunch of gullible reporters ran with it, because that's what they do. First, here's the press release from the University of Reading, which should have set off all sorts of alarm bells for any reporter. Here are some quotes, almost all of which are misleading or bogus:
The 65 year-old iconic Turing Test was passed for the very first time by supercomputer Eugene Goostman during Turing Test 2014 held at the renowned Royal Society in London on Saturday.

'Eugene', a computer programme that simulates a 13 year old boy, was developed in Saint Petersburg, Russia. The development team includes Eugene's creator Vladimir Veselov, who was born in Russia and now lives in the United States, and Ukrainian born Eugene Demchenko who now lives in Russia.

[....] If a computer is mistaken for a human more than 30% of the time during a series of five minute keyboard conversations it passes the test. No computer has ever achieved this, until now. Eugene managed to convince 33% of the human judges that it was human.
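For what it's worth, the "pass" criterion described in that release is nothing exotic: it's just a vote fraction over a series of five-minute chats, compared against a 30% cutoff. Here's a minimal, purely illustrative sketch of that arithmetic in Python (the function name and the judge verdicts below are made up for illustration, not data from the event):

```python
# Illustrative sketch of the pass criterion described in the press release:
# a bot "passes" if more than 30% of judges mistake it for a human.
# The verdicts below are invented for illustration; they are NOT data from the event.

def passes_turing_threshold(verdicts, threshold=0.30):
    """Return (fraction_fooled, passed) for a list of per-conversation verdicts.

    Each verdict is True if that judge thought the chatbot was human.
    """
    fraction = sum(verdicts) / len(verdicts)
    return fraction, fraction > threshold

# Example: 30 five-minute conversations, 10 judges fooled -> 33%, which clears 30%.
verdicts = [True] * 10 + [False] * 20
fraction, passed = passes_turing_threshold(verdicts)
print(f"{fraction:.0%} of judges fooled -> {'pass' if passed else 'fail'}")
```

The made-up numbers above are only there to show how thin the line is between "failed" and "passed the Turing Test for the first time" under this kind of threshold.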
Okay, almost everything about the story is bogus. Let's dig in:
  1. It's not a "supercomputer," it's a chatbot. It's a script made to mimic human conversation. There is no intelligence, artificial or otherwise, involved. It's just a chatbot.
  2. Plenty of other chatbots have similarly claimed to have "passed" the Turing Test in the past (often with higher ratings). Here's a story from three years ago about another bot, Cleverbot, "passing" the Turing Test by convincing 59% of judges it was human (much higher than the 33% that Eugene Goostman claims).
  3. It "beat" the Turing test here by "gaming" the rules -- by telling people the computer was a 13-year-old boy from Ukraine in order to mentally explain away odd responses.
  4. The "rules" of the Turing test always seem to change. Hell, Turing's original test was quite different anyway.
  5. As Chris Dixon points out, you don't get to run a single test with judges that you picked and declare you accomplished something. That's just not how it's done. If someone claimed to have created nuclear fusion or cured cancer, you'd wait for some peer review and repeat tests under other circumstances before buying it, right?
  6. The whole concept of the Turing Test itself is kind of a joke. While it's fun to think about, creating a chatbot that can fool humans is not really the same thing as creating artificial intelligence. Many in the AI world look on the Turing Test as a needless distraction.
Oh, and the biggest red flag of all. The event was organized by Kevin Warwick at Reading University. If you've spent any time at all in the tech world, you should automatically have red flags raised around that name. Warwick is somewhat infamous for his ridiculous claims to the press, which gullible reporters repeat without question. He's been doing it for decades. All the way back in 2000, we were writing about all the ridiculous press he got for claiming to be the world's first "cyborg" after implanting a chip in his arm. There was even a -- since taken down -- Kevin Warwick Watch website that mocked and categorized all of his media appearances in which gullible reporters simply repeated all of his nutty claims. Warwick had gone quiet for a while, but back in 2010, we wrote about how his lab was getting bogus press for claiming to have "the first human infected with a computer virus." The Register has rightly referred to Warwick as both "Captain Cyborg" and a "media strumpet" and has been chronicling his escapades in exaggerating bogus stories about the intersection of humans and computers for many, many years.

Basically, any reporter should view extraordinary claims associated with Warwick with extreme caution. But that's not what happened at all. Instead, as is all too typical with Warwick claims, the press went nutty over it, including publications that should know better. The absolute worst headlines are the ones that claim this is a "supercomputer." Anyway, it's a lot of hubbub over nothing special that everyone seemed to buy into because of the easy headlines (which is exactly what Warwick always counts on). So, since we just spent all this time on a useless nothing, let's end it with the obligatory xkcd:
[xkcd: "Turing Test"]

Reader Comments

    Sam, 10 Jun 2014 @ 2:32am

    Poorly researched

    This article irritates me a little. There is a strong theme of hate for Professor Kevin Warwick along with poorly researched attempts at debunking.

    A quick google will tell you that the event was held in partnership with RoboLaw. The event was aimed at raising awareness of how easily a chatbot can convince a human that it is a human, and how dangerous that is for online security. For example, you're on your bank's website and a chat offering pops up asking if you need some help. You do, so you click on it. You have a lovely conversation and happily hand over details about your account because you're convinced there's a human on the other end. If that's actually a bot, it now has your details and can do with them as it pleases. This is scary, and people need to be made aware of it so they can prepare themselves and get better at identifying situations where it might be occurring.

    The event was not geared toward some magical overnight development of strong AI, which is what this author clearly seems to think it was claiming.

    Time to debunk the debunking.

    == article's attempt at debunking
    - reality check

    == "It's not a "supercomputer," it's a chatbot. It's a script made to mimic human conversation. There is no intelligence, artificial or not involved. It's just a chatbot."

    - I'm not sure how to answer this. Here, code is being compared to hardware. For all we know, the 'script' could be running on a supercomputer. And since when did being a supercomputer imply AI?!

    == "Plenty of other chatbots have similarly claimed to have "passed" the Turing test in the past (often with higher ratings). Here's a story from three years ago about another bot, Cleverbot, "passing" the Turing Test by convincing 59% of judges it was human (much higher than the 33% Eugene Goostman) claims."

    - Just the smallest amount of research will tell you that many other Turing tests restrict the conversation types that are allowed in the testing. This was the first passing of an UNRESTRICTED TURING TEST. This means that the judges were not told in any way that they had to talk about a certain topic. They were literally sat down and told to chat.

    == ""beat" the Turing test here by "gaming" the rules -- by telling people the computer was a 13-year-old boy from Ukraine in order to mentally explain away odd responses."

    - I'm not sure about the excessive use of quote marks in this debunking. Is the writer afraid to say these words, or do they feel the words carry more weight when possibly said by another party? Anyway, yes, it was clever that the developer used humans' willingness to forgive more errors when talking to younger and foreign people. This is just really clever psychology. Can we not just appreciate that? I see no way it is gaming the system; it's just an easier way to pass the test. Sometimes the simplest solutions are the most effective.

    == The "rules" of the Turing test always seem to change. Hell, Turing's original test was quite different anyway.

    - Welcome to science. Ideas and testing methodologies change over time.

    == "As Chris Dixon points out, you don't get to run a single test with judges that you picked and declare you accomplished something. That's just not how it's done. If someone claimed to have created nuclear fusion or cured cancer, you'd wait for some peer review and repeat tests under other circumstances before buying it, right?"

    - Many things wrong with this. There were a total of 350, yes THREE HUNDRED AND FIFTY, tests performed on the day of testing. The judges were picked from all age ranges, backgrounds, genders and nationalities to make the testing fairer. There were multiple academics from multiple universities there specifically to monitor the testing methods and ensure the results were gathered and interpreted correctly. This is peer review.

    - If this is not enough peer review for you, Dr Huma Shah will be publishing a paper on the event at some point in the future.

    == "The whole concept of the Turing Test itself is kind of a joke. While it's fun to think about, creating a chatbot that can fool humans is not really the same thing as creating artificial intelligence. Many in the AI world look on the Turing Test as a needless distraction."

    - This seems like mostly opinion, so I'm not sure how to debunk it. They are right that it is fun to think about. So why can't we think about it? Let's get talking about the possible effects of this kind of chat with regard to RoboLaw.

    This kind of poorly researched, emotive reporting on scientific subjects really gets my goat.
