In a move that looks designed more to get publicity than to be useful, someone has released, as open source, the code needed to scrape Google and present a Google clone -- sans ads. There's a bunch of blather from the guy who did this about how he's trying to take back the internet from commercial interests or something, and he fully expects Google to sue him, but it does raise some interesting legal questions. Part of his claim is that Google is making money by scraping other sites without permission and putting ads on the results. So, since it was legal for Google to scrape other sites, his belief is that it's perfectly legal to scrape Google's site and take the ads away. Of course, Google does let sites that request it be removed from its index. Either way, this does highlight the legal question of whether a compiled database of publicly available info is itself copyrightable. It's probably in Google's best interest to simply let this one lie. The guy is obviously looking for a fight, and honestly, the number of people likely to use such a thing is going to be tiny compared to how many will continue to just use Google. The argument that it's a "loss of advertising revenue" isn't going to hold much weight, as anyone who goes to the trouble of using this instead of Google directly is unlikely to click on the ads anyway. Also, Google could let this thing run its course to see how many people use it as a backdoor into Google as a platform for designing more compelling applications. If people do, that could give Google some direction in how to push forward with its own open API plans.

    sfb, 12 Jan 2005 @ 3:56pm


    Google's robots.txt pretty clearly forbids this, so there isn't any comparison to Google crawling other web sites, since Google DOES obey robots.txt.

