Google Embracing Unintentional Crowdsourcing

from the sneaky-bastards dept

I'm always fascinated by businesses that are built around incentives to multiple parties, where all those incentives align even if not everyone participating is aware of it. One of the best in the business at coming up with such things is Luis von Ahn, who has done research for many years on creating systems online that get other people to do some kind of "work" for you. After the CAPTCHA concept, probably von Ahn's most famous concept is the ESP Game, which helps gather much more detailed info about what's in any image by having multiple people play a game to name what's in it. People get more points if they match up keywords faster, encouraging them to be as accurate as possible in defining the key characteristics of an image.
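For the curious, here is a rough sketch of how that kind of matching might work. It is not von Ahn's or Google's actual code: the `Guess` type, the `first_match` and `score` functions, the 60-second round and the 100-point maximum are all made up for illustration. The only parts taken from the description above are that players have to match keywords and that faster matches earn more points.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Guess:
    label: str      # keyword a player typed for the image
    seconds: float  # time since the round started


def _earliest(guesses: list[Guess]) -> dict[str, float]:
    """Map each normalized label to the earliest time the player typed it."""
    seen: dict[str, float] = {}
    for g in guesses:
        key = g.label.strip().lower()
        if key not in seen or g.seconds < seen[key]:
            seen[key] = g.seconds
    return seen


def first_match(player_a: list[Guess], player_b: list[Guess]) -> Optional[tuple[str, float]]:
    """Return (label, time) for the earliest label both players typed, or None."""
    a, b = _earliest(player_a), _earliest(player_b)
    # A label only counts as matched once the slower of the two players has typed it.
    matches = [(max(a[k], b[k]), k) for k in a.keys() & b.keys()]
    if not matches:
        return None
    seconds, label = min(matches)
    return label, seconds


def score(match_seconds: float, round_seconds: float = 60.0, max_points: int = 100) -> int:
    """Hypothetical scoring rule: the faster the match, the more points."""
    remaining = max(0.0, round_seconds - match_seconds)
    return round(max_points * remaining / round_seconds)
```

The useful property is that neither player can see what the other is typing, so the easiest way to match quickly is to type what is actually in the picture. The agreed label is the by-product the system is really after.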

Last year, von Ahn gave a talk at Google, after which he licensed the concept of the ESP Game to Google (though Google's version was too boring to get very much attention). However, it appears that the folks at Google did pick up a few additional lessons in how this concept works. Paul Kedrosky points us to the news that Google is admitting its GOOG-411 project has little to do with taking on the 411 telephone information service, and everything to do with building a better speech recognition system. You see, to build speech recognition, you need many different voices saying many different phonemes (the sounds that make up words) in a variety of accents/tones/pitches/etc. Rather than go out and ask people to speak, Google gets plenty of phonemes just by providing this service.

Cynics may call this exploitation or sharecropping, but it's nothing of the sort. It's giving something of value to get something of value -- even if not everyone is fully aware of what the exchange is really about. Too many people seem to think the idea of "crowdsourcing" is really just about getting the crowd to do work for you -- but that's not it at all. It's about setting up incentives so that everyone involved gets value in some form or another, making it a beneficial transaction for everyone.

Filed Under: 411 service, crowdsourcing, speech recognition
Companies: google

Reader Comments

  • shanoboy, 18 Dec 2007 @ 1:18pm

    I must have missed that one! What's Goog-411 and how can I use it? Is it available yet?

    This is an awesome idea!

  • tychism, 18 Dec 2007 @ 2:00pm

    1-800-GOOG-411 is essentially a free 411 service. Dial the number and try it out; it works great. I've been using it for probably the past 6 months.

    I have been under the impression that they were offering premium positioning for certain key phrases through the service, much like they do through AdWords.

    It asks "What city & state?" and "What business name or category?", at which point it lists off a number of related businesses. Pizza, for example, in my location always lists Papa John's first.

    • Gunnar, 18 Dec 2007 @ 3:51pm

      It probably just searches Google or Google Maps using your words. If you type my town + pizza into Google, Papa John's is the first thing to come up too.

  • Nicko, 18 Dec 2007 @ 4:00pm

    It's pretty common....

    It's pretty common to do this with speech rec; at my company we've been doing this for years. And I wouldn't be surprised if our engines in phones, games, GPS, etc. collect some level of stats for use in research.

    Look at any company running services, pay or free, and you'll find the same things happening on the back-end. Do you really think Walmart/Target/etc. toss away all the point-of-sale info, AT&T all the call data, Comcast the data from its converters? Even some McDonald's fountain machines collect usage data.

    And behind every data collection point there is someone else extracting other data. I used to work for a company that combined things like grocery value card transactions with insurance data and car sales, so they could figure out the optimum place to put gas stations (or where to send marketing data, or put up a billboard).

    It's a pretty common method of collecting data for researching future products. If you have a 'crowd' using a service, there is always going to be someone manipulating it for gain.

  • Thom, 18 Dec 2007 @ 4:05pm

    ESP Game

    I haven't checked out the ESP Game because the first page asks for a signup and I won't. I did check out Google's adaptation and found it so ridiculously useless that I'm shocked Google's stock price didn't drop at the sight of it.

    Let's get real here, Google: a huge number of images on the web don't need labelling or can't be given meaningful/useful labels. How many hundreds of thousands of meaningless pictures of spreadsheets or dialogs are there? How many meaningless pictures of factory floors are there? The answer to both, and similar questions, is way too many, and that's the kind of garbage I saw during most attempts.

    If you want to label images on the web and enlist the help of others, you need only do three things: 1) identify labelers by expertise and use that, 2) ask for their help, 3) aid the labeler in helping you.

    How do you do that? Guess who's looking for images of machinery, of art and paintings, of celebrities, etc. - people who are interested and, frequently, knowledgeable about those subjects. Ask them to help as they search. Let them specify a subject and contribute their knowledge.

    Heck, I've had many a boring day that I sat searching through Google images trying to locate something I wanted and would have gladly marked images as irrelevant or provided accurate labels in order to help others. I couldn't do that on 90+% of the images Google asked me to label, but I could do that on the majority of the images that come up in my searches.

  • Ben, 20 Dec 2007 @ 12:33am

    I believe that Tellme did/does this. You could call their number (800-555-tell, maybe?) and they would give you stock updates, sports scores, news, etc. I believe I read that they were just using it to tune the system that they were selling to airlines, banks, and any company that puts you on hold and doesn't let you push 0.

  • Can Duruk, 22 Dec 2007 @ 2:30am

    Luis' Experiment

    Hah, funny. I took a class from Luis last semester. He's an amazing professor; almost all his lectures are as fun as the speech he gave at Google.

    At one point in class, he made us call a phone number and then say 1 to 10. That's it.

    Two months later, I heard my friend's voice in an audio CAPTCHA in reCAPTCHA.

    Good stuff.
