by Mike Masnick

Filed Under:
bing, copying, innovation, search

google, microsoft

Microsoft Highlights Why Google's 'Cheater' Accusations Ring Hollow

from the good-for-them dept

We had a long discussion recently about Google's response to discovering that Microsoft used clickstream data from users to help improve the relevance of its own search. Microsoft's Yusuf Mehdi has now written up a much more detailed response from Microsoft's point of view, in which he again clarifies that, contrary to Google's statements, Microsoft is not "copying" Google's search results, but merely using clickstream data as one of many (Microsoft says approximately 1,000) variables in improving search relevance. Microsoft does take one cheap shot, noting that, technically, the "honeypot" trick that Google used to uncover this certainly appears to be a form of "clickfraud." That is, it was a trick designed specifically to manipulate Bing's search results.

But the key point is made towards the end:
We have brought a number of things to market that we are very proud of -- our daily home page photos, infinite scroll in image search, great travel and shopping experiences, a new and more useful visual approach to search, and partnerships with key leaders like Facebook and Twitter. If you are keeping tabs, you will notice Google has "copied" a few of these. Whether they have done it well we leave to customers. But more importantly, we take no issue and are glad we could help move the industry to adopt some good ideas.
That's the point that I tried to make in the original post. History has shown that innovation occurs via competition, and part of that competition often involves competitors building on each other's work. A few months back, I wrote a review of the excellent book Copycats by Oded Shenkar, which makes this point very, very clear. Innovation happens when companies build on each other's work. But, what you learn is that it's not just about "copying," it's about all of the players learning, innovating and expanding the overall market. Just straight up copying rarely does enough to make a difference (in fact, we've discussed this problem in the form of cargo cult copying, where companies just copy some superficial aspect, and discover that it's meaningless). That's clearly not what Microsoft was doing here.

In the comments to our original post, someone argued, in defense of Google, that if what Microsoft did was okay, then he could just go out and say "I've got a billion dollar search engine idea!" and then copy Google's results. But, of course, anyone who actually thinks this through will realize that copying Google's search results is not a billion dollar search idea. If, tomorrow, we launched a "new search engine" that gave results identical to Google's, almost no one would use it. Why would you? There's no real advantage to doing so. And for people who already use Google, it's probably much more integrated into their lives, with Gmail, Google Docs and more. The search results themselves are not the "billion dollar idea." It's the overall execution.

Hopefully Google learns from this and realizes that it has learned plenty from watching Microsoft as well, and complaining about Microsoft using clickstream data is a waste of time. Focus on continuing to innovate, Google, which'll probably mean learning more things from Microsoft, in addition to what you're doing yourself.

To be fair, Matt Cutts has also put together a decent response, where he points out that the real issue here may be disclosure -- in that Microsoft did not clearly disclose that it was using clickstream data (and especially how it was using that data). That's a perfectly reasonable point, but it was not the original point that Google raised. I agree that Microsoft could and should be much clearer in its disclosure -- but that's a totally separate issue. Cutts also explains why he thinks that Microsoft really is "copying," but again, even if we grant that premise (which I don't think is accurate), I still don't see why that matters. Copying and improving is a part of the innovative process. Google should embrace it.

Reader Comments

  1. cc, 4 Feb 2011 @ 6:05am

    Re: This word you keep using, Google does not think it means what you think it means

    That's more or less what I was arguing in the comments of the previous article, even though my thoughts focused only on innovation in search quality and not on presentation.

    The result of my discussion with Marcus Carab can be summarised thus: It depends on how much Microsoft is (indirectly) using Google's results, and we won't know that until more data is made available.

    My hunch is, a lot. They must be getting massive amounts of Google data, seeing how many people use Google, and it's all in pure query->document format, no less. In my view, instead of coming up with a better way to analyse the data it already has, Bing is trying to replicate Google's existing semantic links* between terms, which is possibly the hardest thing to tweak when you're making advanced document retrieval systems.

    That they say they use "over 1000 variables" is irrelevant, because, as any statistician will tell you, it's not the number of variables that counts but their weighting. If Bing is aiming to "become Google" because that's the search engine people want, they'll use the query->document data they get from Google to directly reinforce the query->document mappings in their system, which makes the other sources mostly irrelevant...
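
    To make the weighting argument concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the linear scoring function, the weights, and the idea that signal 0 stands in for clickstream data are assumptions chosen for illustration, not anything Microsoft or Google has described.

```python
# Hypothetical illustration: a ranker with 1,000 signals where one signal dominates.
# None of these numbers come from Bing or Google; they are made up for the example.

def score(features, weights):
    """Linear relevance score: the sum of weight * feature value."""
    return sum(w * f for w, f in zip(weights, features))

NUM_SIGNALS = 1000

# Assumption: signal 0 is the clickstream signal and carries 90% of the total weight.
# The remaining 999 signals share the other 10% equally.
weights = [0.9] + [0.1 / (NUM_SIGNALS - 1)] * (NUM_SIGNALS - 1)

# Document A: every non-clickstream signal is at its maximum, no clickstream support.
doc_a = [0.0] + [1.0] * (NUM_SIGNALS - 1)

# Document B: only the clickstream signal fires.
doc_b = [1.0] + [0.0] * (NUM_SIGNALS - 1)

score_a = score(doc_a, weights)
score_b = score(doc_b, weights)

print(f"A (999 signals agree): {score_a:.3f}")
print(f"B (clickstream only):  {score_b:.3f}")
```

    Under these made-up weights, document B outranks document A even though 999 of the 1,000 signals favor A. Whether anything like this holds for Bing depends on the real weights, which is exactly the data nobody outside Microsoft has.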

    And that's why this is cheating, in my opinion. Perhaps that's not necessarily a "bad" thing and their technology will eventually and inevitably catch up, but it leaves a bad taste in my mouth all the same.

    * For instance, Google may have decided to use a thesaurus (or even automatically learned a thesaurus!) to create a link between the terms "cat" and "feline", so when a user searches for cats, they also get documents about felines. This is not an obvious link for a computer, but it very likely improves retrieval performance. If Bing didn't think to do the same, and they only start showing documents about felines because they saw Google do the same, then their technology is still inferior, so in my book this cannot possibly count as innovation or as science. They are giving the illusion that they are competing with Google, but they are simply giving a "counterfeited" version of their competitor's results that they couldn't recreate by their own means.
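
    The "cat"/"feline" example can be sketched as simple query expansion. This is only a toy: the thesaurus, the three documents, and the function names are all hypothetical, and real search engines do something far more elaborate than matching whole words.

```python
# Toy query expansion with a hand-written thesaurus (all data here is hypothetical).

THESAURUS = {
    "cat": {"feline"},
    "feline": {"cat"},
}

DOCUMENTS = {
    "doc1": "how to feed your cat",
    "doc2": "feline health and nutrition",
    "doc3": "training your dog",
}

def expand(terms, thesaurus):
    """Return the query terms plus any synonyms listed in the thesaurus."""
    expanded = set(terms)
    for term in terms:
        expanded |= thesaurus.get(term, set())
    return expanded

def search(query, documents, thesaurus=None):
    """Return the ids of documents sharing at least one term with the query."""
    terms = set(query.lower().split())
    if thesaurus is not None:
        terms = expand(terms, thesaurus)
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if terms & set(text.lower().split())
    )

print(search("cat", DOCUMENTS))             # without the thesaurus
print(search("cat", DOCUMENTS, THESAURUS))  # with the thesaurus
```

    Without the thesaurus, a search for "cat" finds only the document that literally contains that word; with it, the document about felines shows up too. An engine that only observed the second result list could reproduce the output without ever having the thesaurus, which is the distinction the footnote is drawing.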
