LinkedIn Appeals Important CFAA Ruling Regarding Scraping Public Info Just As Concerns Raised About Clearview

from the this-could-get-interesting dept

Last fall we were happy to see the 9th Circuit rule against LinkedIn in its CFAA case against HiQ. If you don't recall, the CFAA is the "anti-hacking" law that has been widely abused over the years to try to shut down perfectly reasonable activity. At issue is whether "scraping" information in violation of a site's terms of service also violates the CFAA. A few years back, the same court ruled in favor of Facebook against Power Ventures, saying that even though Power's users gave permission to Power and handed over their login credentials, Power was violating the CFAA in scraping Facebook, because the information was behind a registration wall -- and because Facebook had sent a cease-and-desist.

In the HiQ case, despite what seemed to be a similar fact pattern, the court ruled against LinkedIn, saying it could not block HiQ's scraping via a CFAA claim, with the main "difference" being that LinkedIn information was publicly viewable, and therefore should be open to scraping. I still don't quite see the difference between the cases -- because in the Facebook situation, once you have a login, the information is effectively available in the same manner, but that is how the courts ruled. After first asking (and not getting) an en banc review (and then asking for more time), LinkedIn has now asked the Supreme Court to weigh in on this issue (hat tip to Media Post). I worry that the court might make things much worse if it does take the case, and block all kinds of scraping.

Of course, one notable development since the 9th Circuit ruling came down is all of the attention that Clearview AI has received over the last few months for its frightening facial recognition app, built off of scraping "public" social media images and profiles. This use of scraping has convinced some -- even some who seemed to support the HiQ ruling -- that perhaps there should be limits on scraping. I think that's a kneejerk reaction that focuses too narrowly on the wrong issue. The issue there is not with scraping, but with the specific use of the data as an attack on privacy going well beyond the internet itself (i.e., tracking and identifying people out in the real world). It's one thing to focus on that issue; it's quite another to treat it as an argument against free scraping.

At a time when we're so worried about competition, the ability to scrape is incredibly important. It's how competitors can be built in a world with network effects. If other companies can build compatible services, without having to do a deal with Facebook or LinkedIn or YouTube or Twitter, that enables more competition much more easily. And yet, too many efforts are being made to cut off that kind of interoperability. The LinkedIn case is just one example. If the Supreme Court does take it up, let's hope they recognize just how important this kind of adversarial interoperability can be, rather than buying into some nonsense about how scraping must be blocked and not allowed.

As for the petition itself, the question LinkedIn is asking the Court to review is whether or not bots can scrape websites, even after receiving a cease-and-desist letter:

Whether a company that deploys anonymous computer “bots” to circumvent technical barriers and harvest millions of individuals’ personal data from computer servers that host public-facing websites—even after the computer servers’ owner has expressly denied permission to access the data—“intentionally accesses a computer without authorization” in violation of the Computer Fraud and Abuse Act.

While I can understand the Clearview-like horror stories some may put forth about this activity, allowing companies to block all scraping like this would create huge problems for both a functioning internet (hello search...) and competition.

Filed Under: cfaa, interoperability, scotus, scraping, supreme court
Companies: hiq, linkedin


Reader Comments



  • Anon, 13 Mar 2020 @ 6:49am

    Scraping public info with login?

    Isn't this what Aaron Swartz was threatened with 35 years in jail for doing?

    • Anonymous Coward, 13 Mar 2020 @ 7:32am

      Re: Scraping public info with login?

      I don't think Swartz had a login. When using the MIT network (as with those of other major universities), some sites will allow access without a login or paywall.

  • stine, 13 Mar 2020 @ 8:13am

    you should have at least mentioned

    robots.txt

    That's what this file is for.
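
For readers who have not worked with robots.txt, here is a minimal sketch of how a well-behaved scraper might consult it before fetching a page, using Python's standard `urllib.robotparser`. The rules, the `examplebot` user agent, the `example.com` URLs, and the `may_fetch` helper are all invented for illustration; they are not taken from LinkedIn's actual file or from the petition.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, made up for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: examplebot
Disallow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("somebot", "https://example.com/profile/1"),
        ("somebot", "https://example.com/private/notes"),
        ("examplebot", "https://example.com/profile/1"),
    ]:
        print(agent, url, may_fetch(ROBOTS_TXT, agent, url))
```

Note that robots.txt is a voluntary convention: it tells crawlers what the site operator would like, but nothing in the file itself stops a scraper that chooses to ignore it.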

  • Professor Ronny, 13 Mar 2020 @ 2:20pm

    > I worry that the court might make things much worse if it does take the case, and block all kinds of scraping.

    As I see it, scraping is taking my stuff off the internet without my permission. Stopping that is a good thing, not a bad thing.

    • Tanner Andrews, 14 Mar 2020 @ 6:00am

      Re:

      As I see it, scraping is taking my stuff off the internet without my permission

      Not a good classification. Viewing your publicly available web page is also taking your stuff off the internet without your permission.

      I may be viewing it immediately rather than at a later time as I review robot scrapings, but I would not want to be the person who must make a workable distinction. There is, after all, some delay involved in receiving and rendering the data. There may be proxies and routers involved, which never actually view the data.

      To add complexity, consider caching web browsers. Some web browsers will try not to re-fetch the same information because they keep it in a local store for potential re-use.

      You will need to make this distinction, and find a way to communicate it, if we are to give weight to your "permission".

    • Mike Masnick, 15 Mar 2020 @ 9:01pm

      Re:

      As I see it, scraping is taking my stuff off the internet without my permission. Stopping that is a good thing, not a bad thing.

      You see it wrong. Under your definition, anyone browsing the web is "taking your stuff off the internet" because it downloads to their computer. That's not how it works.

      Scraping enables all sorts of important web services, including search.

  • Anonymous Coward, 13 Mar 2020 @ 8:47pm

    LINKEDIN was sued for sending people notices their friends had signed them up for LINKEDIN when they had not. They scraped all those email contacts from people.

    FUCK LINKEDIN
