stories filed under: "robots.txt"
(Mis)Uses of Technology


by Mike Masnick


Filed Under:
newspapers, robots.txt

Companies:
google



Google To Newspapers: Here, Let Me Introduce You To Robots.txt

from the snappy dept

After the AP's silly introduction last week of a weird and totally unnecessary new data feed meant to keep out aggregators and search engines, it seems that Google has gotten fed up. Google execs and employees have made similar statements on various panels and discussions, but now Senior Business Product Manager Josh Cohen has put up a blog post directed at newspapers that can be summarized as: Dear newspapers: let me introduce you to a tool that's been around forever. It's called robots.txt. If you don't like us indexing you, use it. Otherwise, shut up. All in only slightly nicer language.
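For anyone who hasn't met the tool in question, here is a minimal sketch of how a well-behaved crawler reads a robots.txt file, using Python's standard `urllib.robotparser`. The rules, the `news.example.com` host, the `OtherBot` name, and the `can_crawl` helper are all made up for illustration; they are not taken from Google's post or from any real newspaper's site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary newspaper site.
# It shuts Googlebot out entirely and keeps every other
# crawler out of /archive/ only.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow: /archive/
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    story = "https://news.example.com/story.html"
    old_story = "https://news.example.com/archive/2009.html"
    print("Googlebot, story:", can_crawl(ROBOTS_TXT, "Googlebot", story))
    print("OtherBot, story:", can_crawl(ROBOTS_TXT, "OtherBot", story))
    print("OtherBot, archive:", can_crawl(ROBOTS_TXT, "OtherBot", old_story))
```

The point of the sketch is just that opting out takes a couple of lines in a text file, and the choice can be made per crawler and per path.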

Overhype


by Timothy Lee


Filed Under:
publishers, robots.txt, search engines

Companies:
associated press, google, microsoft, yahoo



Search Engines Should Ignore Bossy Publishers

from the disallow dept

James Grimmelmann has an in-depth look at ACAP, the new "standard" for website access control that we discussed last Friday. I put "standard" in scare quotes because, as Grimmelmann points out, the specs clearly weren't written by people with any experience in writing technical standards. While a well-written standard will very precisely specify which behaviors are required, which are prohibited, and under what circumstances, the ACAP spec is full of vague directives and confusing terminology. Some parts of the standard are apparently designed to "only be interpreted by prior arrangement." Also, despite the "1.0" branding, the latest version of the specification has several sections that are labeled "not yet fully ready for implementation." It is, in short, a big mess.

Of course, this shouldn't surprise us, because it's not really a technical standard at all. Robots.txt works just fine for almost everyone, and search engines aren't clamoring to replace it. Rather, some publishers are using the trappings of a technical standard to try to micromanage the uses to which search engines put their content, and they're laying the groundwork for lawsuits if search engines fail to heed the demands embedded in ACAP files. Not only are the rules vague and confused, but the "standard" also helpfully notes that the rules "may change or be withdrawn without notice." In other words, a search engine that committed to complying with ACAP directives would be setting itself up to have its functionality micromanaged by the publishers who control the ACAP specifications.

Luckily, as Mike pointed out on Friday, search engines have the upper hand here. So here's my suggestion for search engines: instead of trying to comply with every nitpicky detail of the ACAP standard, just announce that every line of an ACAP file will be interpreted as the equivalent of a "Disallow" line in a robots.txt file. Websites would discover pretty quickly that posting ACAP directives on their sites just caused their content to disappear from search engines. As much as they might bluster about other search engines "stealing" their content, the reality is that they can't afford to give up the traffic that search engines send their way. If search engines simply refused to include ACAP-restricted pages in their index, publishers would quickly realize that those old robots.txt files aren't so bad after all.
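To make the suggestion concrete, here is a rough sketch of the coarsest possible reading of it: if a site's file contains any ACAP directive at all, treat the whole site as disallowed, and otherwise fall back to ordinary robots.txt handling. It assumes ACAP directives show up as lines beginning with `ACAP-`; the sample directive, the `is_indexable` helper, the `ExampleBot` name, and the example URLs are hypothetical, and nothing here is how any real search engine behaves.

```python
from urllib.robotparser import RobotFileParser


def is_indexable(lines, user_agent: str, url: str) -> bool:
    """Decide whether url may be indexed under the 'ACAP means Disallow' policy.

    Assumption: an ACAP directive is any line whose field name starts
    with 'ACAP-'. One such line is enough to drop the site entirely.
    """
    lines = list(lines)
    for line in lines:
        if line.strip().lower().startswith("acap-"):
            # Any ACAP directive is read as a blanket "Disallow: /".
            return False

    # No ACAP directives: plain robots.txt rules apply as usual.
    parser = RobotFileParser()
    parser.parse(lines)
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    plain = ["User-agent: *", "Disallow: /private/"]
    # Illustrative ACAP-style line, not copied from the ACAP spec.
    with_acap = plain + ["ACAP-disallow-crawl: /archive/"]

    url = "https://news.example.com/story.html"
    print("plain robots.txt:", is_indexable(plain, "ExampleBot", url))
    print("with ACAP line:", is_indexable(with_acap, "ExampleBot", url))
```

A finer-grained version could map each ACAP line to a Disallow for just the path it names, but the blanket version is the one that would get a publisher's attention fastest.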

Timothy Lee is an expert at the Insight Community.

News You Could Do Without


by Mike Masnick


Filed Under:
publishers, robots.txt

Companies:
associated press



News Publishers Want To Change Robots.txt; Want To Make Sure Their Content Is Less Useful

from the deep-misunderstandings dept

Following on the speech given earlier this month by the head of the Associated Press, where it was made clear that the AP and news organizations still think that they can be gatekeepers of news, a bunch of publishers along with the AP are now trying to revise robots.txt so that they can hide content on a more selective level. Now, it is true that robots.txt can be rather broad in its sweep. But it's rather telling that it's the publishers who banded together and are telling search engines what changes are needed, rather than working with the search engines to come up with a reasonable solution. In the meantime, there really are some simple solutions if you don't want content indexed by search engines -- but we've yet to fully understand why publishers are so upset that Google, Yahoo and others are sending them so much traffic in the first place.
