Could You Google Bomb Google Flu?

from the would-you-want-to? dept

Google got a lot of attention recently for the launch of Google Flu Trends, which looks at aggregate data on flu-related searches to see if it can act as something of an early warning system for where there are flu problems. It's an interesting use of the data, and it will be worth watching what else can be done with this sort of data over time. However, Ed Felten raises an interesting question: can Google Flu Trends be manipulated? The idea is that, right now, it may be accurate, but the very fact that people know Google is tracking this information could create incentives to game that info, in the same way people have tried gaming Google in other ways for years, using tricks such as Google bombing. While you might not think there would be that many reasons to manipulate Google Flu Trends, there could be reasons to do so. Google is being somewhat secretive about how Flu Trends is set up, so perhaps that makes it more difficult to manipulate, but it does point to an interesting issue in using data in this manner. As soon as you've set up a system to measure the data, and made that public, is the data still reliable?


Reader Comments

  •  
    eleete, Nov 20th, 2008 @ 6:17pm

    either side?

    Has either side got you on the right. Not the right but the right ?

     


  •  
    Lutomes (profile), Nov 20th, 2008 @ 6:40pm

    Outbreaks

    Well you can break the data, but the long tail test will be to see what it looks like in comparison to actual flu outbreaks. I mean to 'really' manipulate it you could start spreading the flu virus and then watch the google results while sitting in the comfort of your evil lair.

    Plus, if Google Trends were that reliable, then comparing the correlation of searches for "Zombies" and "Shotguns" with actual zombie outbreaks would reveal accurate patterns. But I know that can't be right, because it's currently showing that search traffic for "Zombies" in Australia has doubled in the last 12 months...

    Oh god wait, I can't ^^oia$hdFsjh;as&oIHdlkhl

     


  •  
    Esther, Nov 20th, 2008 @ 6:46pm

    The diarrhea runs brown in Beverly Hills

    I just typed this one into google search:

    google flu 90210 diarrhea like a river

    (90210 is the zip code for Beverly Hills, California. God help me - I need a life)

     


  •  
    Werner Heisenberg, Nov 20th, 2008 @ 6:49pm

    Certainly

    One could determine the magnitude of the outbreak or its location but not both with any precision.

     


  •  
    Anonymous Coward, Nov 20th, 2008 @ 8:33pm

    zombies???
    you have to destroy the heads.... where is my shovel???

     


  •  
    Shane C, Nov 20th, 2008 @ 10:43pm

    Particle/Wave Duality extends beyond quantum physics?

    Upon reading the summary, the first thing that came to mind was how similar data mining is to the Particle/Wave Duality paradox. In short (and I don't claim to be a quantum physicist here), particles can exhibit wave-like behavior, and waves can exhibit particle-like behavior. The determining factor is whether they are being directly watched at the time.

    Data mining (might) have a similar paradox. The information is there, and truthfully accurate right up to the point that we start looking at it. But, by simply looking at the data, we may be corrupting the future data. Keeping the fact that the data is being examined private, or making it public, doesn't change the fact that the data will be corrupted. It only changes to what extent the data will be corrupted.

    Interesting...

     


  •  
    Trevlac, Nov 21st, 2008 @ 7:05am

    "As soon as you've set up a system to measure the data, and made that public, is the data still reliable?" No. It's called the Heisenberg Uncertainty Principle. Normally, it deals with only the Second Law of Thermodynamics, but in this case it serves as a reminder that any non-protected public data instantly influences the outcome of that data. On the other hand, there are cases where this isn't the situation. With public encryption like RSA and AES, the algorithms are known to all, but the data can't be decrypted without the private key. Google Bombing is akin to brute forcing that key. We can only sit back and say the trite but astute line, "Only time will tell."

     


  •  
    Dr Al, Nov 21st, 2008 @ 7:23am

    Interesting concept. We will be using the Google data to launch newspaper ads reminding people that Tamiflu can help recovery if started early on. In our Urgent Care clinic we obviously see the flu coming, but often patients wait too long, so the Tamiflu is ineffective.

    If you download the Google historic data, you see a huge peak in 2003. Recall that this was the year of the vaccine shortage due to contamination, but not a huge year for flu cases. It just shows that other factors (such as media coverage) can affect results.

     


  •  
    Anonymous Coward, Nov 21st, 2008 @ 9:34am

    What gets measured, gets manipulated.

     


  •  
    G. Eysenbach, May 1st, 2009 @ 7:28am

    newsflash: flutrends not a google invention

    This kind of misinformation makes me angry. Please stop positioning Google Flutrends as a "Google invention" and regurgitating Google press releases.

    NEWSFLASH: The idea of looking at search data as an early warning system for flu outbreaks is not a Google invention, but was actually already proposed over 3 years ago (published 2006) by researchers from the Centre for Global eHealth Innovation and U of T.


    Eysenbach G. Infodemiology: tracking flu-related searches on the web for syndromic surveillance. AMIA Annu Symp Proc 2006:244-248
    http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=17238340

    Eysenbach G. Infodemiology and Infoveillance: Framework for an Emerging Set of Public Health Informatics Methods to Analyze Search, Communication and Publication Behavior on the Internet. J Med Internet Res 2009;11(1):e11
    http://www.jmir.org/2009/1/e11

    The Virus Chasers. CIHR news article (2007) about the infodemiology / infoveillance work at the Centre for Global eHealth Innovation in Toronto
    http://www.cihr-irsc.gc.ca/e/35061.html

     
