Google Embracing Unintentional Crowdsourcing

from the sneaky-bastards dept

I'm always fascinated by businesses that are built around incentives to multiple parties, where all those incentives align even if not everyone participating is aware of it. One of the best in the business at coming up with such things is Luis von Ahn, who has done research for many years on creating systems online that get other people to do some kind of "work" for you. After the CAPTCHA concept, probably von Ahn's most famous concept is the ESP Game, which is a game that helps to get much more detailed info about what's in any image by having multiple people play a game to name what's in an image. People get more points if they match up keywords faster, encouraging them to be as accurate as possible in defining the key characteristics in an image.
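For the curious, here is a rough sketch of how that kind of matching game might work under the hood. It is only an illustration, not von Ahn's actual implementation: the `EspRound` class, the 150-second round length, and the linear scoring formula are all assumptions made up for this example. The one idea taken from the description above is that a keyword only counts once both players have typed it, and that matching sooner is worth more.

```python
from dataclasses import dataclass, field


@dataclass
class EspRound:
    """One round: two players guess labels for the same image independently."""
    image_id: str
    max_seconds: float = 150.0  # assumed round length, not from the real game
    guesses: dict = field(default_factory=lambda: {"a": {}, "b": {}})

    def guess(self, player: str, word: str, at_seconds: float):
        """Record a guess; return (label, points) if it matches the partner, else None."""
        word = word.strip().lower()
        other = "b" if player == "a" else "a"
        # Remember the first time this player typed this word.
        self.guesses[player].setdefault(word, at_seconds)
        if word in self.guesses[other]:
            # Faster agreement earns more points (hypothetical linear scoring).
            remaining = max(0.0, self.max_seconds - at_seconds)
            points = 10 + round(90 * remaining / self.max_seconds)
            return word, points
        return None


game = EspRound("img-001")
game.guess("a", "Dog", 5.0)      # no match yet
game.guess("b", "puppy", 8.0)    # still no match
print(game.guess("b", "dog", 15.0))
```

Because neither player can see the other's guesses, the easiest way to score is to type what is actually in the picture, which is why an agreed keyword makes a reasonable label for the image.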

Last year, von Ahn went and gave a talk at Google, after which he licensed the concept of the ESP Game to Google (though Google's version was too boring to get very much attention). However, it appears that the folks at Google did pick up a few additional lessons in how this concept works. Paul Kedrosky points us to the news that Google is admitting its GOOG-411 project has little to do with taking on the 411 telephone information service, and everything to do with building a better speech recognition system. You see, to build speech recognition, you need many different voices saying many different phonemes (the sounds that make up words) in a variety of accents/tones/pitches/etc. Rather than go out and ask people to speak, Google gets plenty of phonemes just by providing this service.

Cynics may call this exploitation or sharecropping, but it's nothing of the sort. It's giving something of value to get something of value -- even if not everyone is fully aware of what the exchange really is about. Too many people seem to think the idea of "crowdsourcing" is really just about getting the crowd to do work for you -- but that's not it at all. It's about setting up incentives so that everyone involved gets value in some form or another, making it a beneficial transaction to everyone.


Reader Comments

  1.  
    shanoboy, Dec 18th, 2007 @ 1:18pm

    GOOG-411?

    I must have missed that one! What's Goog-411 and how can I use it? Is it available yet?

    This is an awesome idea!

     


  2.  
    Chronno S. Trigger, Dec 18th, 2007 @ 1:42pm

    Re: GOOG-411?

    Click on the "building a better speech recognition system" link. It gives a small description.

     


  3.  
    tychism, Dec 18th, 2007 @ 2:00pm

    1800GOOG411 is essentially a free 411 service. Dial the number and try it out; it works great. I've been using it for probably the past 6 months.

    I have been under the impression that they were offering premium positioning for certain key phrases through the service much like they do through Adwords.

    "What city & state, What business name or category" at which point it lists off a number of related businesses. Pizza for example in my location always lists Papa John's first.

     


  4.  
    Gunnar, Dec 18th, 2007 @ 3:51pm

    Re:

    It probably just searches Google or Google Maps using your words. If you type my town + pizza into Google, Papa John's is the first thing to come up too.

     


  5.  
    Nicko, Dec 18th, 2007 @ 4:00pm

    It's pretty common....

    It's pretty common to do this with speech rec; at my company we've been doing this for years. And I wouldn't be surprised if our engines in phones, games, GPS, etc. collect some level of stats for use in research.

    Look at any company running services, pay or free, and you'll find the same things happening on the back-end. Do you really think Walmart/Target/etc. toss away all the point-of-sale info, AT&T all the call data, Comcast the data from its converters? Even some McDonald's fountain machines collect usage data.

    And behind every data collection point there is someone else extracting other data. I used to work for a company that combined things like grocery value card transactions with insurance data and car sales, so they could figure out the optimum place to put gas stations (or where to send marketing data, or put up a billboard).

    It's a pretty common method of collecting data for researching future products. If you have a 'crowd' using a service there is always going to be someone manipulating it for gain.

     


  6.  
    Thom, Dec 18th, 2007 @ 4:05pm

    ESP Game

    I haven't checked out the ESP game because the first page asks for a signup and I won't. I did check out Google's adaptation and found it so ridiculously useless that I'm shocked Google's stock price didn't drop at the sight of it.

    Let's get real here Google, a huge number of images on the web don't need labelling or can't be given meaningful/useful labels. How many hundreds of thousands of meaningless pictures of spreadsheets or dialogs are there? How many meaningless pictures of factory floors are there? The answer to both, and similar questions, is way too many and that's the kind of garbage I saw during most attempts.

    If you want to label images on the web and enlist the help of others you need only do three things: 1) identify labelers by expertise and use that 2) ask for their help 3) aid the labeler in helping you

    How do you do that? Guess who's looking for images of machinery, of art and paintings, of celebrities, etc. - people who are interested and, frequently, knowledgeable about those subjects. Ask them to help as they search. Let them specify a subject and contribute their knowledge.

    Heck, I've had many a boring day that I sat searching through Google images trying to locate something I wanted and would have gladly marked images as irrelevant or provided accurate labels in order to help others. I couldn't do that on 90+% of the images Google asked me to label, but I could do that on the majority of the images that come up in my searches.

     


  7.  
    Ben, Dec 20th, 2007 @ 12:33am

    Tellme

    I believe that Tellme did/does this. You could call their number (800-555-tell, maybe?) and they would give you stock updates, sports scores, news, etc. I believe I read that they were just using it to tune their system that they were selling to airlines, banks, any company that puts you on hold and doesn't let you push 0.

     


  8.  
    Can Duruk, Dec 22nd, 2007 @ 2:30am

    Luis' Experiment

    Hah, funny. I took a class from Luis last semester. He's an amazing professor; almost all his lectures are as fun as the speech he gave at Google.

    At one point in class, he made us call a phone number and then say 1 to 10. That's it.

    Two months later, I heard my friend's voice in an audio CAPTCHA in reCAPTCHA.

    Good stuff.

     


