(Mis)Uses of Technology

by Mike Masnick


Filed Under:
411 service, crowdsourcing, speech recognition

Companies:
google



Google Embracing Unintentional Crowdsourcing

from the sneaky-bastards dept

I'm always fascinated by businesses that are built around incentives for multiple parties, where all those incentives align even if not everyone participating is aware of it. One of the best in the business at coming up with such things is Luis von Ahn, who has done research for many years on creating online systems that get other people to do some kind of "work" for you. After the CAPTCHA, probably von Ahn's most famous concept is the ESP Game, which gathers much more detailed info about what's in any image by having multiple people play a game to name what's in it. People get more points if they match up keywords faster, encouraging them to be as accurate as possible in defining the key characteristics of an image.
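
To make the matching mechanic concrete, here's a rough sketch in Python. It's an illustration under stated assumptions, not von Ahn's actual implementation: the function names, the 150-second round length and the linear scoring formula are all made up for the example. The only parts taken from the description above are that a label counts once two players independently type the same word, and that faster matches earn more points.

```python
def first_agreed_label(guesses_a, guesses_b):
    """Return (label, time) for the first word both players have typed, or None.

    Each argument is a list of (seconds_elapsed, word) tuples for one player.
    """
    # Merge both players' guesses into a single timeline.
    events = sorted(
        [(t, "a", w.strip().lower()) for t, w in guesses_a]
        + [(t, "b", w.strip().lower()) for t, w in guesses_b]
    )
    seen = {"a": set(), "b": set()}
    for t, player, word in events:
        other = "b" if player == "a" else "a"
        seen[player].add(word)
        if word in seen[other]:
            # Both players have now entered this word: it becomes the label.
            return word, t
    return None


def score(match_time, round_length=150.0, max_points=100):
    """Hypothetical scoring rule: points shrink linearly as the clock runs down."""
    remaining = max(0.0, round_length - match_time)
    return round(max_points * remaining / round_length)


# Example round: the players agree on "puppy" five seconds in.
player_a = [(1.0, "dog"), (3.0, "grass"), (5.0, "puppy")]
player_b = [(2.0, "animal"), (4.0, "Puppy "), (6.0, "dog")]
label, when = first_agreed_label(player_a, player_b)
print(label, when, score(when))  # puppy 5.0 97
```

Neither player sees the other's guesses, so the quickest way to score is to type what is most obviously in the picture, which is exactly the label you'd want for the image.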

Last year, von Ahn gave a talk at Google, after which he licensed the concept of the ESP Game to Google (though Google's version was too boring to get much attention). However, it appears that the folks at Google did pick up a few additional lessons in how this concept works. Paul Kedrosky points us to the news that Google is admitting its GOOG-411 project has little to do with taking on the 411 telephone information service, and everything to do with building a better speech recognition system. You see, to build speech recognition, you need many different voices saying many different phonemes (the sounds that make up words) in a variety of accents/tones/pitches/etc. Rather than go out and ask people to speak, Google gets plenty of phonemes just by providing this service.

Cynics may call this exploitation or sharecropping, but it's nothing of the sort. It's giving something of value to get something of value -- even if not everyone is fully aware of what the exchange is really about. Too many people seem to think the idea of "crowdsourcing" is really just about getting the crowd to do work for you -- but that's not it at all. It's about setting up incentives so that everyone involved gets value in some form or another, making it a beneficial transaction for everyone.

Reader Comments

  1. Dec 18th, 2007 @ 1:18pm

    GOOG-411?

    by shanoboy

    I must have missed that one! What's Goog-411 and how can I use it? Is it available yet?

    This is an awesome idea!

  2. Dec 18th, 2007 @ 1:42pm

    Re: GOOG-411?

    by Chronno S. Trigger

    Click on the "building a better speech recognition system" link. It gives a small description.

  3. Dec 18th, 2007 @ 2:00pm

    by tychism

    1800GOOG411 is essentially a free 411 service. Dial the number and try it out; it works great. I've been using it for probably the past 6 months.

    I have been under the impression that they were offering premium positioning for certain key phrases through the service, much like they do through AdWords.

    It asks "What city & state? What business name or category?" and then lists off a number of related businesses. Pizza, for example, always lists Papa John's first in my location.

  4. Dec 18th, 2007 @ 3:51pm

    Re:

    by Gunnar

    It probably just searches Google or Google Maps using your words. If you type my town + pizza into Google, Papa John's is the first thing to come up too.

  5. Dec 18th, 2007 @ 4:00pm

    It's pretty common....

    by Nicko

    It's pretty common to do this with speech rec; at my company we've been doing this for years. And I wouldn't be surprised if our engines in phones, games, GPS, etc. collect some level of stats for use in research.

    Look at any company running services, paid or free, and you'll find the same things happening on the back end. Do you really think Walmart, Target, etc. toss away all their point-of-sale info, or AT&T all its call data, or Comcast the data from its converters? Even some McDonald's fountain machines collect usage data.

    And behind every data collection point there is someone else extracting other data. I used to work for a company that combined things like grocery value card transactions with insurance data and car sales, so they could figure out the optimum place to put gas stations (or where to send marketing data, or put up a billboard).

    It's a pretty common method of collecting data for researching future products. If you have a 'crowd' using a service, there is always going to be someone manipulating it for gain.

  6. Dec 18th, 2007 @ 4:05pm

    ESP Game

    by Thom

    I haven't checked out the ESP Game because the first page asks for a signup and I won't. I did check out Google's adaptation and found it so ridiculously useless that I'm shocked Google's stock price didn't drop at the sight of it.

    Let's get real here, Google: a huge number of images on the web don't need labelling or can't be given meaningful/useful labels. How many hundreds of thousands of meaningless pictures of spreadsheets or dialogs are there? How many meaningless pictures of factory floors are there? The answer to both, and similar questions, is way too many, and that's the kind of garbage I saw during most attempts.

    If you want to label images on the web and enlist the help of others you need only do three things: 1) identify labelers by expertise and use that 2) ask for their help 3) aid the labeler in helping you

    How do you do that? Guess who's looking for images of machinery, of art and paintings, of celebrities, etc. - people who are interested and, frequently, knowledgeable about those subjects. Ask them to help as they search. Let them specify a subject and contribute their knowledge.

    Heck, I've had many a boring day that I sat searching through Google images trying to locate something I wanted and would have gladly marked images as irrelevant or provided accurate labels in order to help others. I couldn't do that on 90+% of the images Google asked me to label, but I could do that on the majority of the images that come up in my searches.

  7. Dec 20th, 2007 @ 12:33am

    Tellme

    by Ben

    I believe that Tellme did/does this. You could call their number (800-555-tell, maybe?) and they would give you stock updates, sports scores, news, etc. I believe I read that they were just using it to tune the system they were selling to airlines, banks, and any company that puts you on hold and doesn't let you push 0.

  8. Dec 22nd, 2007 @ 2:30am

    Luis' Experiment

    Hah, funny. I took a class from Luis last semester. He's an amazing professor; almost all his lectures are as fun as the speech he gave at Google.

    At one point in class, he made us call a phone number and then say 1 to 10. That's it.

    Two months later, I heard my friend's voice in an audio CAPTCHA in reCAPTCHA.

    Good stuff.
