June 10, 2009 at 9:53 pm

The fact that the scraper reads copyrighted content shouldn’t mean that it’s infringement. It’s not doing anything with that content other than using it to find the content it can make use of.

Anonymous Coward

June 10, 2009 at 10:36 pm

I guess what the judge is implying is that if the terms of service forbid access by automated tools, then any access by automated tools is unauthorized and is therefore “infringing.”

As far as I know, copyright infringement requires a medium – in other words, it is not infringement for Power.com to view copyrighted Facebook material without authorization, but it may be infringement for Power.com to republish copyrighted Facebook material without authorization. Admittedly I don’t know exactly how Power.com’s application works, but it would seem that if they are only republishing user information (info that Facebook does not own the rights to) then their scraper tool would fall into the first category.

Anonymous Coward

June 10, 2009 at 10:43 pm

Perhaps Facebook should have an option asking whether or not a user wants his information to be displayed on third-party sites, much like how Google and other search engines let webmasters keep their sites out of search results (e.g., using meta tags). It should ultimately be up to the users: perhaps some users don’t want their information to be easily searchable by others, and they should have that choice.
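For readers unfamiliar with the search-engine opt-out mentioned above, here is a minimal sketch of how a well-behaved crawler might honor it. It uses robots.txt, a close relative of the robots meta tag, and Python's standard `urllib.robotparser`. The robots.txt contents, the `ExampleScraper` user agent, and the example.com URLs are all invented for illustration; nothing here reflects what Facebook or Power.com actually do.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that a site might publish. The paths and the
# "ExampleScraper" user agent are made up for this example.
ROBOTS_TXT = """\
User-agent: ExampleScraper
Disallow: /profile/

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser, user_agent, url):
    """Return True if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_parser(ROBOTS_TXT)
    for agent, url in [
        ("ExampleScraper", "https://example.com/profile/alice"),
        ("ExampleScraper", "https://example.com/about"),
        ("OtherBot", "https://example.com/private/x"),
    ]:
        print(agent, url, may_fetch(rules, agent, url))
```

Note that this check is purely voluntary: the crawler has to choose to run it, and nothing in the protocol stops a scraper that ignores the file.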

Killer_Tofu (profile)

June 11, 2009 at 5:46 am

Re: Re:

@#3
But that would be logical and rational.
Asking any large company to be logical or rational is like pulling teeth. It just very rarely prevails.

Anonymous Coward the Second

June 11, 2009 at 5:50 am

Re: Re: Re:

But that would be logical and rational.

No it wouldn’t. How would such an option be enforced? At best Facebook could set up its API to play nice with such an option, but it wouldn’t stop things like scrapers, or even just a ‘friend’ cut-and-pasting info to their blog.

Will Sinclair

June 11, 2009 at 12:40 am

This is an interesting topic.

What does the law actually say regarding automated retrieval of content?

I write scrapers, and the very way that they work requires you to use either an HTTP POST or GET request to retrieve a target URL, the same way your browser works; then you parse the results for the content you are after. Mostly I do all this client side, so it isn’t done by my company’s servers but rather in JavaScript in the client’s own browser.

So really it’s not my company’s server that’s accessing the content, but a client’s computer, and in the case of a Facebook profile, presumably the client already has authorized access to Facebook. Surely, therefore, it’s not breaching the ToS to access your Facebook profile with your own browser through a secondary interface (a scraper)?

No webmaster likes their precious site being scraped, but in all honesty there’s not a great deal they can do about it unless they put a CAPTCHA image on every page, which would seriously detract from the user experience.
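The fetch-then-parse loop described in this comment can be sketched in a few lines. This is only an illustration under stated assumptions: it is written in Python rather than the client-side JavaScript the commenter uses, the `example-scraper/0.1` user agent is made up, and extracting link text is just a stand-in for whatever content a real scraper is after.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class LinkTextExtractor(HTMLParser):
    """Collects (href, text) pairs for every <a> element in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._in_link = False
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_link = True
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a> element.
        if self._in_link:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            self.links.append((self._href, "".join(self._text).strip()))
            self._in_link = False


def fetch(url):
    """Issue an HTTP GET, much as a browser would, and return the body."""
    request = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(request, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_links(html):
    """Parse the whole page, but keep only the parts we care about."""
    parser = LinkTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.links
```

Calling `extract_links(fetch(url))` downloads the full page and then discards everything except the link text, which is the point several commenters raise: the entire page is copied in transit even if only a small part is kept.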

Anonymous Coward

June 11, 2009 at 5:48 am

Re: Re:

the same way your browser works

This is what popped into my head when I read the post. Why would a Browser not be a problem but a Scraper would be?

Also, as far as I know, it’s not copyright infringement until you produce a copy; if the product is just information FB doesn’t control, there should be no case for infringement.

joe malley

June 20, 2009 at 8:05 am

Re: Re:

will sinclair

i would like to learn more about scrapers. plz email to chat in confidence at:

malleylaw@gmail.com

thanks
joe malley

Anonymous Coward

June 11, 2009 at 12:44 am

Why should the service be liable for what its users are choosing to do when they initiate a ‘scrape’ of Facebook?

Also, why should the service be held to terms of service that the users, not the service, have agreed to?

Andrew

June 11, 2009 at 12:46 am

Dirty scrapers

What about Firefox and IE? To my knowledge both of them make a copy of the raw code from the Facebook website, enabling the user to extract information from it.

Ok that’s a forced argument – but the point is, if you don’t want information to be available, don’t publish it on the web. And this is as nothing to Facebook asking for my email password. What a stupid question.

Allen Harkleroad (user link)

June 11, 2009 at 1:32 am

Scrapers

Generally speaking, in order to display content from another website using a page scraper, the entire page is copied, and most of the time regular expressions (RegEx) are used to extract specific information and then display it.

We often scrape our own websites’ content and then display it on another of our websites to reduce the number of times we publish content.

I am sure one could write a scraper that copied only specific content directly from a full web page and displayed it.
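As a rough illustration of the copy-the-page-then-extract approach described in this comment, the sketch below pulls headlines out of a page with a regular expression. The sample page and the `headline` class name are invented for the example, and regular expressions are brittle against real-world HTML, so treat this as a toy rather than a recommended technique.

```python
import re

# Hypothetical page content; in practice this would be the full HTML
# that the scraper has already downloaded.
PAGE = """\
<html><body>
<div class="ad">Buy now!</div>
<h2 class="headline">First story</h2>
<p>Body text.</p>
<h2 class="headline">Second story</h2>
</body></html>
"""

# Non-greedy match so each headline is captured separately.
HEADLINE_RE = re.compile(r'<h2 class="headline">(.*?)</h2>', re.DOTALL)


def extract_headlines(page):
    """Return the text of every headline; everything else is discarded."""
    return [match.strip() for match in HEADLINE_RE.findall(page)]


if __name__ == "__main__":
    for headline in extract_headlines(PAGE):
        print(headline)
```

Even in this small case the whole page has to be in memory before the pattern can run, so the scraper necessarily copies more than it ends up displaying.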

Doctor Strange

June 11, 2009 at 1:47 am

The rationale seems to be indirectly based on the Ticketmaster decision.

The argument, it seems to me, goes thusly:

When a program or a browser or some other software accesses a website, it is making a copy of the content of that site. That content may be copyrighted. By default, this would be a violation of copyright law. However, the Ticketmaster website included some terms of use that permitted this – effectively granting a limited license for some uses, but not others. Thus, an ordinary user with a browser was not in violation, because they were accessing the content under a license.

RMG’s product, on the other hand, explicitly went outside the bounds of this license, and thus fell back under the aegis of copyright law, where such copying is prohibited. I suppose RMG could have mounted a fair-use defense, but it appears they explicitly did not.

Now, you could argue, what about the millions of websites that do not have terms of use granting a license? Is just browsing these sites a commission of copyright infringement?

The answer is likely no, based on the Perfect 10 ruling. This ruling indicated that some caching can be fair use; in that case, the caching was “noncommercial, transformative, and has a minimal impact on the potential market for the original work.” The difference in Ticketmaster was that the fair use defense was not adequately asserted, and moreover the RMG software was in pretty clear violation of Ticketmaster’s terms of use.

Killer_Tofu (profile)

June 11, 2009 at 5:50 am

Re: Re:

Your post once again proves to me how stupid copyright laws are in the technological world we live in today.
None of those tiny details should matter at all.
Copyright is a horribly outdated pre-internet system that just isn’t needed anymore.
Scraping like this should be less than no problem. Simple.

Granting tiny limited licenses? C’mon. It’s dumb.

Anonymous Coward

June 11, 2009 at 5:54 am

Re: Re:

How is caching (or browsing) ‘transformative’?

Pangolin (profile)

June 11, 2009 at 3:59 am

What about Google?

Google’s crawler (and other search engines) must be infringing then as well as violating the TOS. Should google stop crawling facebook?

Anonymous Coward

June 11, 2009 at 12:59 pm

Re: What about Google?

Google’s crawler (and other search engines) must be infringing then as well as violating the TOS. Should google stop crawling facebook?

I thought the same thing, but it goes beyond Facebook. Is this yet another ruling that would effectively criminalize Google if it were applied to everyone (of course it won’t be)?

Designerfx (profile)

June 11, 2009 at 5:27 am

uh?

I’ve not heard a lot of times where a TOS is deemed not legally enforceable. This flies right in the face of that. So why is the judge overshooting his influence in trying to say that it is?

Truly there is bad lawyering, or the judge is unfamiliar with the internet.

Designerfx (profile)

June 11, 2009 at 5:27 am

Re: uh?

typo: I meant where they deem that a TOS cannot be legally enforced in this kind of way.

Anonymous Coward

June 11, 2009 at 1:04 pm

Re: uh?

Truly there is bad lawyering, or the judge is unfamiliar with the internet.

Nah, you seem to be laboring under the illusion that the law applies equally to everyone. It doesn’t.

Anonymous Coward

June 11, 2009 at 5:33 am

whaaaaa faceplant

“Yet Facebook charged the company with all sorts of complaints”

including,
1) we aren’t making any money off of it so it must be illegal somehow.
2) our TOS states that you can not out innovate us
3) we own all user submitted content, including copyright

CStrube (profile)

June 11, 2009 at 5:40 am

If the scraper is committing copyright infringement, then how is any web browser not guilty of the same? They both work in the same manner:

1) Get entire page
2) Parse for content (either automatically, or by a human reading the interesting parts and skipping things like ads)
3) ???
4) Profit

Anonymous Coward

June 11, 2009 at 5:57 am

Re: Re:

Browsers parse content automatically, too, it’s just the processing that’s different. Otherwise you’d just get a bunch of flat text, URLs, and HTML tags on your screen.

inc

June 11, 2009 at 5:47 am

So Facebook makes a page available to be viewed by users, or scraped by their browser and eyes. Using a different access method just lets users see the same available page. Also, how can a ToS create law? Last I checked, in the U.S. only Congress has the power to create law, not the courts or a ToS. Just because something is in a terms of service doesn’t extend copyright law or give the courts power to interpret any law in a different way. Unless Power.com is actually breaking trademark law by claiming to be Facebook, this is just a monopoly tactic.

Roland

June 11, 2009 at 6:52 am

Can Scraping Non-Infringing Content Become Copyright Infringement... Because Of How Scrapers Work?

So are web search engines (Google, Yahoo!) committing copyright infringement when they scrape content from a web page to be added to their searchable databases?

Hephaestus (profile)

June 11, 2009 at 6:58 am

Search Engines come to mind.....

If this is copyright infringement then pretty much any access to the facebook web site is also…. browsers, search engines, websites using iFrames to facebook pages, etc. Pretty stupid ruling IMHO.

André de Kock

January 7, 2014 at 7:40 am

Dirty scrapers

Absolutely agree. If you publish something on the internet it becomes public domain

Alexander Miller

January 2, 2015 at 11:14 pm

Hey! It gives you the opportunity to compare the prices of different hotels with ease. Awesome, man. Is this software coming out with an updated version? How does it find the info from the TripAdvisor website? It’s something unbelievable.
For More Details : https://www.youtube.com/watch?v=Megr1WncZk4

Can Scraping Non-Infringing Content Become Copyright Infringement… Because Of How Scrapers Work?

from the this-seems-troubling dept

Comments on “Can Scraping Non-Infringing Content Become Copyright Infringement… Because Of How Scrapers Work?”
