Sites Freak About A Feature Google Has Had For Years
from the deep-breath dept
Websites are wringing their hands about the fact that Google is adding a “search within a search” feature that makes it easier to use Google’s search engine to search a particular website. So, for example, if you want to find something at the New York Times, you can search for “nytimes.com,” and then Google will display a search box that will let you search for content on just the New York Times website. Apparently a lot of websites are up in arms because this will divert traffic away from their own search engines and give Google, rather than the target site, opportunities to serve up ads to those users. The question I was left with after reading the article is: am I the only one who’s been doing this for years with Google’s “site:” syntax? I assumed that anyone who uses Google on a regular basis already knew about this feature. If the ability to search within a particular site is problematic, these sites should have objected years ago, when Google added this functionality.
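For readers who haven’t tried it, the site: operator simply gets prepended to an ordinary query. A quick sketch in Python (standard library only; the URL is Google’s public search endpoint) of how a link to such a search could be built:

```python
from urllib.parse import urlencode

def site_search_url(site, query):
    """Build a Google search URL restricted to one site via the site: operator."""
    return "https://www.google.com/search?" + urlencode({"q": f"site:{site} {query}"})

# e.g. site_search_url("nytimes.com", "election coverage")
# -> https://www.google.com/search?q=site%3Anytimes.com+election+coverage
```

The same query string works typed straight into Google’s search box, which is all most users ever need.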
Anyway, there are two things to say about this. One is that sites should take this as a wake-up call to improve the search functionality on their own sites. For a company whose business is increasingly centered on the Internet, having a decent search engine should be a high priority. Furthermore, although few companies will be able to develop search algorithms as sophisticated as Google’s, they have one big advantage: access to a lot of metadata that Google doesn’t have. For example, news sites should be able to offer searches by date, author, category, and other criteria that Google might not be able to extract easily from a mere scrape of their pages. They might also be able to use information like the number of page views, the number of times a page has been emailed, etc. to decide which pages to list first. With all that extra information, it shouldn’t be that hard to develop (or license) a search engine whose performance on one’s own site is at least roughly comparable to Google’s. Secondly, if you don’t have a good search engine, isn’t it better to have Google helping users find the pages they want on your site than for those users not to find your content at all?
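To make the metadata point concrete, here is a minimal sketch in Python of blending text relevance with signals only the site itself holds, such as page views and email shares. The field names and weights are illustrative assumptions, not any real site’s schema:

```python
def score(page, query_terms):
    """Combine text relevance with metadata an outside crawler can't see."""
    text = (page["title"] + " " + page["body"]).lower()
    # Fraction of query terms that appear in the page text.
    relevance = sum(t.lower() in text for t in query_terms) / len(query_terms)
    # Site-only signals: raw view count, with email shares weighted higher.
    popularity = page["views"] + 5 * page["emails"]
    return relevance * (1 + popularity / 1000)

def site_search(pages, query_terms):
    """Rank pages by the blended score, best first."""
    return sorted(pages, key=lambda p: score(p, query_terms), reverse=True)
```

Given two pages that match a query equally well, the one readers actually view and forward ranks first, which is exactly the signal Google has to infer indirectly.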
Filed Under: search, site search
Comments on “Sites Freak About A Feature Google Has Had For Years”
I’m sure some websites out there have very good internal search engines, but unfortunately I’ve never been to any of them. I can recall many occasions when I was using a website’s search function and had no luck, only to Google the same keywords and there it was – right on the website that couldn’t find it.
Besides, if you’re looking for something on a particular site, don’t you want the answer from that site? Even if you do see an ad for a competitor’s page? The case the article describes, a search for job ads – well, yes, I’d probably look on the other sites too, but ALSO, and probably FIRST, look at the Post’s site since that’s where I intended to go anyway.
The internal search function on the Australian Taxation Office page is terrible. I’ve always used site:ato.gov.au on Google for my searches. The ATO native search doesn’t like finding things in the middle of words and won’t find g.s.t if you search for gst. It was even recommended by a lecturer to our class after I explained it to him when he was complaining about the search.
Re: ATO Search
Maybe it’s to do with taxes, then? My most frustrating experiences of that sort have been with the IRS website here in the States. ^_^
Just today, I was searching for an old French law from 1850 on the French legal website legifrance.fr. I tried every way I could think of and got nothing, until I did a simple Google search (not even restricted to the site) and got an instant hit on the very site I had been digging into for an hour.
Why don’t these people deal with Google then? Google could provide them with better search tools for their websites and get a part of the ad revenue.
Does it breach walled gardens?
I see NYT has caught on to BUGMENOT; I tried to follow a link there and none of the cached passwords worked. I can’t imagine what they have to gain; I’d give them false info if I cared enough to register anyhow. As it was, I just went to the next shiny object that caught my eye. Screw their story, it was probably BS anyhow.
Re: Does it breach walled gardens?
You didn’t miss out on anything. The Times is a lying sack of sh*t anyway, so unless you’re looking for fiction, stay away from them.
Actually they should be doing better than google
There is no reason why these sites can’t have better search than google within their own domains. Google knows what the rest of the web is linking to but publishers should know what their users are clicking on, which isn’t necessarily available to Google. For example, I wouldn’t use google to see what movies to rent since Netflix has better data on that question.
Also, the algorithms google uses are insanely asymptotically brilliant etc., but we are not talking about springfieldtimes.com going head-to-head on exabyte sized indexes. All they need to do is be useful to the people that go to their sites and we will get out of the habit of site: searches.
Everybody who enjoys this stuff should check out Toby Segaran’s incredible book “Programming Collective Intelligence”; all examples in Python and available here: http://blog.kiwitobes.com/?p=44
Yeah, I’ve been using that all the time for years because most sites’ built-in search engines suck balls.
I’ve been using it since forever. The reason it’s been brought out now is that Google has stupid-tized their advanced search page so it’s not hidden anymore (so people who are, say, busy with something else will end up noticing it).
I use that at work on my own company’s website. site:xyz.com is a much better search than anything my company has available, by a country mile.
Instead of complaining about Google, these people ought to spend that energy making their own sites better. Guess it’s easier to be lazy and blame someone else (Google).
Don’t know if it has been mentioned, but if you make the site dynamically generated there are no links for Google to bot through, so users have to use your site’s search engine because Google can’t index it.
why complain so much over a trivial problem
I think you guys are missing or hiding from the point.
Search within Search that then brings up additional paid advertising links (to competitors) is now offered to the non-savvy searcher, not just the techie site: users. The whole basis of the argument is that it has the potential to take business elsewhere, and to line Google’s pockets even more with clickthroughs. It’s a perfectly valid argument. If you can imagine this feature enabled for loads of sites, even more traffic will be directed to those that pay Google. That’s the argument, and it has a valid point; PPC AdWords are getting more expensive by the day. The only person that wins is Google, not the searcher: the non-savvy searcher is directed through ads that are not organic and so less likely to have what they are looking for.
The only person that wins is Google, not the searcher: the non-savvy searcher is directed through ads that are not organic and so less likely to have what they are looking for.
You’re right, Google profits. But as it’s a for-profit company, that should be neither shocking nor troubling.
The “non savvy” search does win. Read post #21. The search can find more stuff via Google’s “site:” command than you can find via Amazon’s internal search engine.
And not only does this help the person searching, it helps Amazon to sell more stuff.
And one last thing, I’ve never been “directed through Ads” via Google. “Directed through” means that you have to click through the ads to get to the search results. That’s complete BS and you know it.
I love that feature. Whoever thinks this is new or bad should just stop using the internet.
I see complicated sites...
…all day long, and I can’t find what I’m looking for within them. Generally, it ticks me off enough that I leave their site and go back to the search engine and try to find a different site that offers the same products/services but that cares enough to pay a web designer to make it user-friendly and not shit-tastic like at least 60% of the web pages that I see. (And that’s being kind.)
This is the typical Big Business attitude. Someone does something better than you? Whine. How about, if you don’t like what Google is doing, then make your site more user-friendly so that we neither need nor want to use a third-party service.
Compete, not complain!
Come on, Tim, you’re not that dense. Kudos to you and the rest of us who know about and use this feature, but as Graham mentions above, most searchers probably aren’t that savvy.
I can understand the concern over lost revenue, but I think it’s just fine for Google to offer this.
I like to configure IE7 to launch a google site search on wikipedia.org. It makes a good combination.
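For anyone who wants to replicate that setup, IE7 adds custom search providers via OpenSearch description documents. A minimal sketch (the ShortName is arbitrary, and you would host the file yourself) that routes every query through a Google site: search on wikipedia.org:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <ShortName>Wikipedia via Google</ShortName>
  <Url type="text/html"
       template="https://www.google.com/search?q=site:wikipedia.org+{searchTerms}"/>
</OpenSearchDescription>
```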
Get the Robots involved...
Of course, any website that doesn’t want Google or any other search engine peeking in and serving advertised links to its pages just has to set up a robots.txt file in the root of its webhost, which tells these search engines to bugger off. Searching inside the site is then no longer possible. Then again, finding such a site through Google will become more difficult too.
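A minimal robots.txt that does this (placed at the site root; `User-agent: *` addresses all compliant crawlers, or name `Googlebot` to shut out only Google):

```
User-agent: *
Disallow: /
```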
Don’t get me wrong, it’s a great tool to have at your disposal; I use site search all the time.
It’s irrelevant whether your site has a great on-site search or not, as the customer would never get there with this tool at their disposal, and they miss all the extra marketing opportunities you may have on your site.
Believe me, Google is not doing this for the benefit of the searcher; the longer they can keep eyeballs on the PPC ads the better for them, and that’s their one and only reasoning behind it. But as long as they only do it to the big guys, which means more eyeballs on my PPC ads, I’m not going to worry yet.
As for competing with Google, get real: there’s not one company on this planet that can compete at their web presence level. This is by no means the only ‘tool’ they have brought out lately; SEOs and those in the internet business will know exactly what I mean. How long before they start forcing subscriptions to Google Maps, or Google Local Listings, or start putting the top 5 or 10 paid ads at the top of organic listings? Then we’ll see how many more website owners start crying foul at this ‘unbiased’ search engine.
Google is the de facto engine and holds an incredible amount of power, too much to even comprehend sometimes. E-commerce sales in the US ALONE for 2007 were over 137 billion USD, with an estimated ‘web-influenced store sales’ figure of 470 billion USD (eMarketer figures). With Google holding a 60% search engine usage rate, that’s a heck of a lot of power they have. Too much, without regulation, some would argue. Think about it.
When did google add this feature? I can’t remember a time when I didn’t use it.
The same syntax works on yahoo, live, ask, and even dogpile.
I’ve been using live a lot more lately. Especially at work, since I now work for a Microsoft partner and they like to live a Microsoft culture :). The live maps are a lot better than google maps in my area, I’m glad I looked into it.
I wanted to find a book for my son on Amazon. I had the subject matter, Batman, the title, Double Trouble, and the publisher, Scholastic. I couldn’t find it. Maybe it doesn’t exist, I thought.
I decided to try google with the site:amazon.com command. I.e., “batman double trouble scholastic site:amazon.com” without the quotes, and sure enough the book was found.
Can someone explain to me why Amazon makes it so hard to find stuff? I have to wonder what other merchandise Amazon offers for sale that is hidden from the public!
I agree with ima fish, as Amazon’s search function is the ABSOLUTE WORST when it really should be a leader in the industry. There are times when a book has a one-word title and when you search for it, it’s not even on the first (or first few) pages of results, while dozens of books with longer titles are before it, which makes no sense!!!
Look, it’s very simple: if you don’t want your content found in search engines, don’t put it on the web – that’s the deal. You are free to put it in a binder on a dusty shelf somewhere and no one will bother you. Or you could robots.txt it, but why go to all that trouble if you hate your customers (or potential customers) that much?
Maybe if various site’s internal search tools didn’t fail so much, we wouldn’t turn to superior alternatives. I know many vBulletin sites that have the “flood control” feature. You cannot do two searches within 30 seconds of each other. On top of that, the search is slow, and the results are pathetic. Google has no flood control, and the results are superior and faster.
As for Google exposing users to more ads, that is nothing more than a positive side effect. I cannot live without Google’s site search functionality. I welcome the ads, since Google offers a superior search service. Web sites need to wise up and accept their shortcomings rather than cry foul when someone else comes along and does the work FOR them, and does it better.
Am I mistaken, or can I remember using either “site:” or the URL field on the advanced page as far back as about 1999?
Anyway, site-specific search is just a different way of displaying the info Google has already gathered. If someone doesn’t want Googlebot indexing their site, they’re always free to block it and forgo the traffic.
Google is about finding something about anything in a wildly heterogeneous collection, not about finding something specific, and especially not in a more homogeneous corpus. Using site: or any of Google’s other interfaces as site search is the approach of many publishers and companies that don’t care. It’s a cheap way out, and it probably misses a large portion of their site. Implementing a proper search strategy is expensive, and most companies don’t have the expertise in-house, don’t have consultants, and don’t even know where to look (most so-called “search consultants” don’t know terribly much about information retrieval and discovery). Backing out of the scheme is, however, not easy. One can, of course, define robot exclusion rules and/or specify that pages should not be cached. But doing this in a way that keeps a site in Google in a controlled manner, rather than removing it entirely, is non-trivial for most organizations: Google (like other harvesters and crawlers) does not index every page you specify, and what gets indexed and what does not is hidden in the mysteries of Google’s “black box”.
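The no-cache half, at least, is easy to express on a per-page basis with a standard robots meta tag in the page’s head, which asks crawlers to index the page but not serve a cached copy:

```html
<meta name="robots" content="noarchive">
```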