Sites Freak Out About A Feature Google Has Had For Years

from the deep-breath dept

Websites are wringing their hands about the fact that Google is adding a "search within a search" feature that makes it easier to use Google's search engine to search a particular website. So, for example, if you want to find something at the New York Times, you can search for "nytimes.com," and then Google will display a search box that will let you search for content on just the New York Times website. Apparently a lot of websites are up in arms because this will divert traffic away from their own search engines and give Google, rather than the target site, opportunities to serve up ads to those users. The question I was left with after reading the article is: am I the only one who's been doing this for years with Google's "site:" syntax? I assumed that anyone who uses Google on a regular basis already knew about this feature. If the ability to search within a particular site is problematic, these sites should have objected years ago, when Google added this functionality.
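For reference, the "site:" operator simply rides along in the query string; there is nothing to integrate on the publisher's side. A minimal sketch in Python of how a site-restricted query composes into a Google search URL (the helper name is mine; the /search?q= endpoint is Google's standard one):

```python
from urllib.parse import urlencode

def site_search_url(query, site):
    # Compose a Google web-search URL whose query uses the "site:" operator
    # to restrict results to a single domain.
    return "https://www.google.com/search?" + urlencode({"q": "%s site:%s" % (query, site)})

print(site_search_url("some article title", "nytimes.com"))
```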

Anyway, there are two things to say about this. One is that sites should take this as a wake-up call to improve the search functionality on their own sites. For a company whose business is increasingly centered on the Internet, having a decent search engine should be a high priority. Furthermore, although few companies will be able to develop search algorithms as sophisticated as Google's, they have the big advantage of access to a lot of metadata that Google doesn't have. For example, news sites should be able to offer searches by date, author, category, and other criteria that Google might not be able to extract easily from a mere scrape of their pages. They might also be able to use information like the number of page views, the number of times a page has been emailed, and so on to decide which pages to list first. With all that extra information, it shouldn't be that hard to develop (or license) a search engine whose performance is at least roughly comparable to Google's for one's own site. The other is that if you don't have a good search engine, isn't it better to have Google helping users find the pages they want on your site than for those users not to find your content at all?
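As a rough illustration of the metadata advantage described above, here is a toy ranking function, a sketch only with illustrative field names, that boosts plain keyword relevance with signals only the publisher can see (page views, times emailed):

```python
import math

def rank_results(docs, query_terms):
    """Toy internal-site ranking: keyword relevance boosted by site-private
    signals an external crawler can't see. All field names are illustrative."""
    def score(doc):
        text = doc["text"].lower()
        relevance = sum(text.count(term.lower()) for term in query_terms)
        # Log-dampen popularity so a viral page can't completely swamp relevance.
        popularity = math.log1p(doc.get("views", 0)) + 2.0 * math.log1p(doc.get("emails", 0))
        return relevance * (1.0 + popularity)
    return sorted(docs, key=score, reverse=True)

articles = [
    {"title": "A", "text": "election results analysis", "views": 10},
    {"title": "B", "text": "election results analysis", "views": 50000, "emails": 300},
]
print([a["title"] for a in rank_results(articles, ["election"])])  # B outranks A
```

Equally relevant pages are separated by the popularity signals, which is exactly the kind of tie-breaking a scrape-only crawler can't do.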



Reader Comments

  1.  
    Celes, Mar 25th, 2008 @ 3:38pm

    I'm sure some websites out there have very good internal search engines, but unfortunately I've never been to any of them. I can count many occasions where I was using a website's search function and had no luck, only to Google the same keywords and there it was - right on the website that couldn't find it.

    Besides, if you're looking for something on a particular site, don't you want the answer from that site? Even if you do see an ad for a competitor's page? The case the article describes, a search for job ads - well, yes, I'd probably look on the other sites too, but ALSO, and probably FIRST, look at the Post's site since that's where I intended to go anyway.

     


  2.  
    Lutomes (profile), Mar 25th, 2008 @ 4:01pm

    ATO Search

The internal search function on the Australian Taxation Office page is terrible. I've always used site:ato.gov.au at Google for my searches. The ATO's native search doesn't like finding things in the middle of words and won't find "g.s.t" if you search for "gst". It was even recommended by a lecturer to our class after I explained it to him when he was complaining about the search.

     


  3.  
    anonymous frogman, Mar 25th, 2008 @ 4:13pm

Just today, I was searching for an old French law from 1850 on the French legal website legifrance.fr. I tried all the ways I could think of and got nothing, until I made a simple Google search (not even on the site) and got an instant hit on the very site I had been digging into for an hour.
    Why don't these people deal with Google then? Google could provide them with better search tools for their websites and get a part of the ad revenue.

     


  4.  
    Haywood, Mar 25th, 2008 @ 4:15pm

Does it breach walled gardens?

I see the NYT has caught on to BugMeNot; I tried to follow a link there and none of the cached passwords worked. I can't imagine what they have to gain; I'd give them false info if I cared enough to register anyhow. As it was, I just went to the next shiny object that caught my eye. Screw their story, it was probably BS anyhow.

     


  5.  
    Timon, Mar 25th, 2008 @ 4:27pm

    Actually they should be doing better than google

    There is no reason why these sites can't have better search than google within their own domains. Google knows what the rest of the web is linking to but publishers should know what their users are clicking on, which isn't necessarily available to Google. For example, I wouldn't use google to see what movies to rent since Netflix has better data on that question.

Also, the algorithms google uses are insanely asymptotically brilliant etc., but we are not talking about springfieldtimes.com going head-to-head on exabyte-sized indexes. All they need to do is be useful to the people that go to their sites, and we will get out of the habit of site: searches.

    Everybody who enjoys this stuff should check out Toby Segaran's incredible book "Programming Collective Intelligence"; all examples in Python and available here: http://blog.kiwitobes.com/?p=44

     


  6.  
    Pinch Sulzberger, Mar 25th, 2008 @ 4:39pm

Re: Does it breach walled gardens?

You didn't miss out on anything. The Times is a lying sack of sh*t anyway, so unless you're looking for fiction, stay away from them.

     


  7.  
    Danno, Mar 25th, 2008 @ 4:41pm

    Yeah, I've been using that all the time for years because most sites' built-in search engines suck balls.

     


  8.  
    Hoeppner, Mar 25th, 2008 @ 4:43pm

I've been using it since forever. The reason it's being brought up now is that Google has stupid-tized their advanced search page so it's not hidden anymore (so people who are, say, busy with something else will end up noticing it).

     


  9.  
    Celes, Mar 25th, 2008 @ 5:52pm

    Re: ATO Search

    Maybe it's to do with taxes, then? My most frustrating experiences of that sort have been with the IRS website here in the States. ^_^

     


  10.  
    Kevin h, Mar 25th, 2008 @ 6:07pm

    site:

I use that at work on my own company's website. site:xyz.com is a better search than anything my company has available, by a country mile.

     


  11.  
    erica, Mar 25th, 2008 @ 6:30pm

Instead of complaining about Google, these people ought to spend that energy making their own sites better. Guess it's easier to be lazy and blame someone else (Google).

     


  12.  
    neil, Mar 25th, 2008 @ 7:20pm

Don't know if it has been mentioned, but if you make the site dynamically generated, there are no links for Google to crawl, so users have to use your site's search engine because Google can't index it.

Why complain so much over a trivial problem?

     


  13.  
    Graham, Mar 25th, 2008 @ 8:38pm

    I think you guys are missing or hiding from the point.

Search within search that then brings up additional paid advertising links (to competitors) is now offered to the non-savvy searcher, not just the techie "site:" users. The whole basis of the argument is that it has the potential to take business elsewhere, and to line Google's pockets even more with clickthroughs. It's a perfectly valid argument: if you can imagine this feature enabled for loads of sites, even more traffic will be directed to those that pay Google. That's the argument, and it has a valid point; PPC AdWords are getting more expensive by the day. The only person that wins is Google, not the searcher; the non savvy searcher is directed through Ads that are not organic and so less likely to have what they are looking for.

     


  14.  
    inc, Mar 25th, 2008 @ 8:47pm

    I love that feature. Whoever thinks this is new or bad should just stop using the internet.

     


  15.  
    Rose M. Welch, Mar 26th, 2008 @ 1:19am

    I see complicated sites...

...all day long, and I can't find what I'm looking for within them. Generally, it ticks me off enough that I leave their site, go back to the search engine, and try to find a different site that offers the same products/services but cares enough to pay a web designer to make it user-friendly and not shit-tastic like at least 60% of the web pages that I see. (And that's being kind.)

This is the typical Big Business attitude: someone does something better than you, whine. How about, if you don't like what Google is doing, make your site more user-friendly so that we don't need or want to use a third-party service?

    Compete, not complain!

     


  16.  
    Joe Schmoe, Mar 26th, 2008 @ 3:14am

    Come on, Tim, you're not that dense. Kudos to you and the rest of us who know about and use this feature, but as Graham mentions above, most searchers probably aren't that savvy.

    I can understand the concern over lost revenue, but I think it's just fine for Google to offer this.

     


  17.  
    Anonymous frog coward, Mar 26th, 2008 @ 4:01am

    I like to configure IE7 to launch a google site search on wikipedia.org. It makes a good combination.

     


  18.  
    Lisa Westveld, Mar 26th, 2008 @ 4:44am

    Get the Robots involved...

Of course, any website that doesn't want Google or any other search engine to peek into its site and provide advertised links just has to set up a robots.txt file in the root of its webhost, which will tell these search engines to bugger off. Searching inside the site is then not possible anymore. Then again, finding such a site through Google will become more difficult too.
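For reference, a minimal robots.txt along those lines might look like this (the paths are illustrative; blocking just part of a site is gentler than banning crawlers outright):

```
# Keep all crawlers out of the internal search pages only (path illustrative)
User-agent: *
Disallow: /search/

# Or, to tell Google's crawler to skip the whole site (and forgo the traffic):
# User-agent: Googlebot
# Disallow: /
```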

     


  19.  
    Graham, Mar 26th, 2008 @ 5:15am

Don't get me wrong, it's a great tool to have at your disposal; I use site search all the time.

It's irrelevant whether your site has a great on-site search or not, as the customer would never get there with this tool at their disposal, and they miss all the extra marketing opportunities you may have on your site.

Believe me, Google is not doing this for the benefit of the searcher; the longer they can keep eyeballs on the PPC ads the better for them, and that's their one and only reasoning behind it. But as long as they only do it to the big guys, which means more eyeballs for my PPC ads, I'm not going to worry yet.

As for competing with Google, get real, there's not one company on this planet that can compete at their web presence level. This is by no means the only 'tool' they have brought out lately; SEOs and those in the internet business will know exactly what I mean. How long before they start forcing subscription to Google Maps, or Google Local Listings, or start putting the top 5 or 10 paid ads at the top of organic listings? Then we'll see how many more website owners start crying foul at this 'unbiased' search engine.

Google is the de facto engine and holds an incredible amount of power, too much to even comprehend sometimes. E-commerce sales in the US ALONE for 2007 were over 137 billion USD, with an estimated 'web-influenced store sales' of 470 billion USD (eMarketer figures). With Google holding a 60% search engine usage rate, that's a heck of a lot of power they have. Too much, without regulation, some would argue. Think about it.

     


  20.  
    Joseph Durnal, Mar 26th, 2008 @ 5:52am

    When did google add this feature? I can't remember a time when I didn't use it.

    The same syntax works on yahoo, live, ask, and even dogpile.

    I've been using live a lot more lately. Especially at work, since I now work for a Microsoft partner and they like to live a Microsoft culture :). The live maps are a lot better than google maps in my area, I'm glad I looked into it.

     


  21.  
    Ima Fish, Mar 26th, 2008 @ 7:14am

I wanted to find a book for my son on Amazon. I had the subject matter, Batman; the title, Double Trouble; and the publisher, Scholastic. I couldn't find it. Maybe it doesn't exist, I thought.

    I decided to try google with the site:amazon.com command. I.e., "batman double trouble scholastic site:amazon.com" without the quotes, and sure enough the book was found.

    Can someone explain to me why Amazon makes it so hard to find stuff? I have to wonder what other merchandise Amazon offers for sale that is hidden from the public!

     


  22.  
    Ima Fish, Mar 26th, 2008 @ 7:23am

    Re:

"The only person that wins is Google, not the searcher, the non savvy searcher is directed through Ads that are not organic and so less likely to have what they are looking for."

    You're right, Google profits. But as it's a for-profit company, that should be neither shocking nor troubling.

The "non savvy" searcher does win. Read post #21. The searcher can find more stuff via Google's "site:" command than via Amazon's internal search engine.

    And not only does this help the person searching, it helps Amazon to sell more stuff.

    And one last thing, I've never been "directed through Ads" via Google. "Directed through" means that you have to click through the ads to get to the search results. That's complete BS and you know it.

     


  23.  
    nipseyrussell, Mar 26th, 2008 @ 9:19am

I agree with Ima Fish, as Amazon's search function is the ABSOLUTE WORST when it really should be a leader in the industry. There are times when a book has a one-word title and when you search for it, it's not even on the first (or first few) pages of results, while dozens of books with longer titles are before it, which makes no sense!!!
Look, it's very simple: if you don't want your content found in search engines, don't put it on the web - that's the deal. You are free to put it in a binder on a dusty shelf somewhere and no one will bother you. Or you could robots.txt it, but why go to all that trouble if you hate your customers (or potential customers) that much?

     


  24.  
    Buzz, Mar 26th, 2008 @ 1:45pm

    HAHAHA

Maybe if various sites' internal search tools didn't fail so much, we wouldn't turn to superior alternatives. I know many vBulletin sites that have the "flood control" feature: you cannot do two searches within 30 seconds of each other. On top of that, the search is slow, and the results are pathetic. Google has no flood control, and its results are superior and faster.

    As for Google exposing users to more ads, that is nothing more than a positive side effect. I cannot live without Google's site search functionality. I welcome the ads since Google offers superior search service. Web sites need to wise up and accept their shortcomings rather than cry foul when someone else comes along and does the work FOR them with more quality.

     


  25.  
    BR, Mar 29th, 2008 @ 1:10pm

    old news

    Am I mistaken, or can I remember using either "site:" or the URL field on the advanced page as far back as about 1999?

    Anyway, site-specific search is just a different way of displaying the info Google has already gathered. If someone doesn't want Googlebot indexing their site, they're always free to block it and forgo the traffic.

     


  26.  
    Edward C. Zimmermann, Apr 6th, 2008 @ 10:58pm

Google's about finding something about anything in a wildly heterogeneous collection, not about finding something specific, and especially not in a more homogeneous corpus. Using site: or any of the other interfaces to the various Google business concepts as site search is the approach of many publishers and companies that don't care. It's a cheap way out, and it probably misses a large portion of their site. Implementing proper search strategies is expensive, and most companies don't even have the expertise or consultants, or even know where to look (most so-called "search consultants" don't know terribly much about information retrieval and discovery). Backing out of the scheme is, however, not easy. One can, of course, define robot exclusion rules and/or specify that pages should not be cached. Doing this in a manner that does not completely remove a site from Google or other robots, yet includes it in a controlled manner, is non-trivial for most organizations, given that Google (like other harvesters and crawlers) does not index all the pages specified, and what gets indexed and what doesn't is hidden in the mysteries of Google's "black box".

     


