Federal Court Says Scraping Court Records Is Most Likely Protected By The First Amendment

from the public-access-by-any-means-necessary dept

Automated web scraping can be problematic. Just look at Clearview, which has leveraged open access to public websites to create a facial recognition program it now sells to government agencies. But web scraping can also be quite useful for people who don’t have the power or funding government agencies and their private contractors have access to.

The problem is the Computer Fraud and Abuse Act (CFAA). The act was written to give the government a way to go after malicious hackers. But instead of being used to prosecute malicious hackers, the government (and private companies allowed to file CFAA lawsuits) has gone after security researchers, academics, public interest groups, and anyone else who accesses systems in ways their creators haven’t anticipated.

Fortunately, things have been changing in recent years. In May 2022, the DOJ changed its prosecution policies, stating that it would not go after researchers and others who engaged in “good faith” efforts to notify others of data breaches or otherwise provide useful services to internet users. Web scraping wasn’t specifically addressed in this policy change, but the alteration suggested the DOJ was no longer willing to waste resources punishing people for being useful.

Web scraping is more than a CFAA issue. It’s also a constitutional issue. None other than Clearview claimed it had a First Amendment right to gather pictures, data, and other info from websites with its automated scraping.

Clearview may have a point. A few courts have found scraping of publicly available data to be something protected by the First Amendment, rather than a violation of the CFAA.

Unfortunately, all we really have is a pinkie swear from the DOJ and a handful of decisions that only have precedential weight in certain jurisdictions. But there’s more coming. As the ACLU reports, another federal court has come to the conclusion that government efforts banning web scraping violate the rights of would-be scrapers. But, as is the case in many legal actions, the details matter.

In an important victory, a federal judge in South Carolina ruled that a case to lift the categorical ban on automated data collection of online court records – known as “scraping” – can move forward. The case claims the ban violates the First Amendment.

The decision came in NAACP v. Kohn, a lawsuit filed by the American Civil Liberties Union, ACLU of South Carolina, and the NAACP on behalf of the South Carolina State Conference of the NAACP. The lawsuit asserts that the Court Administration’s blanket ban on scraping the Public Index – the state’s repository of court filings – violates the First Amendment by restricting access to, and use of, public information, and prohibiting recording public information in ways that enable subsequent speech and advocacy.

The case stems from the NAACP’s “Housing Navigator,” which scrapes publicly available info from government websites to find tenants subject to eviction in order to provide them assistance in fighting eviction orders or finding new housing. As the NAACP (and ACLU) point out, this valuable service would be impossible if the NAACP were limited to manual searches to find affected tenants.

The state of South Carolina, via a state appellate decision, claims the NAACP is only allowed limited access: the manual searches the NAACP says render its eviction assistance efforts impossible to achieve. The federal court says the state does have the power to limit access to public records, but those limits must align themselves with the tenets of the First Amendment, which presume open access to government records by the governed.

The state comes down on the losing side here, at least for the moment. The limits proposed by the state court order nullify the services the NAACP hopes to offer. As it stands now, the state cannot escape this lawsuit because there’s enough on the record at the moment that suggests there’s a viable constitutional claim.

The NAACP alleges that without scraping, it is impossible to gather the information quickly enough to meet the ten-day deadline to request a hearing. It alleges that scraping poses at most a de minimis burden on the functionality of the website.

As discussed above, it also contends suggested alternatives to scraping, such as Rule 610, are insufficient, and that Defendants have, in any event, indicated an unwillingness to provide the information under that rule. […]

True, the evidence may eventually show that Defendants have a sufficient reason to prohibit scraping. It may indicate that the NAACP’s access to the records is unburdened by the restriction. Or, it may demonstrate that Defendants have provided sufficient alternatives to access the information. But, as alleged, the restrictions state a claim for violation of the First Amendment.

The bottom line is this: automated access to government records is almost certainly protected by the First Amendment. What will be argued going forward is how much the government can restrict this access without violating the Constitution. There’s not a lot on the record at the moment, but this early ruling seems to suggest this court will err on the side of unrestricted access, rather than give its blessing to unfettered fettering of the presumption of open access that guides citizens’ interactions with public records.

Companies: aclu, naacp


Comments on “Federal Court Says Scraping Court Records Is Most Likely Protected By The First Amendment”



Anonymous Coward says:

Internet 2.0 Looks Really Good Already

Knowing how easy it is for a few countries to develop Internet 2.0 without commerce/advertising models already makes this century the best century for innovation. Internet 1.0 is for all the failed experiments.

The depressed people just have poor information sources. The plastic emotions are worthless. That’s what trained AI, which makes more fake emotions (scraped data) a very comical cesspool for their kids.

I think the smart people already abandoned 5+ billion illiterates in an echo chamber. That makes the scraped business models ill-suited for modern use. Internet 3.0 is even better.

Anonymous Coward says:

This is how APIs come into being.

On the one hand, the state isn’t required to make the data available online. It may not put the records in a locked bathroom with a sign “beware of the leopard”, but it could put the records in a physical ledger on the clerk’s desk.

But having made them available online, the leopard is off duty, the bathroom is unlocked, and the government is not supposed to card you on your way in.

But these scraping sites? They’re looking for the data, and trawling for it from what the web site presents the public. This could be done a lot easier… by simply creating an API that provided the data in an archive, updated periodically. The users get the data in an easy-to-use lump without depending on interpreting the web site. The government gets better traffic management on its site. Everyone wins.

Now if only someone was willing to pony up a small amount to develop that API….

LostInLoDOS (profile) says:

Anyone who believes the CFAA was about protecting the general population is incredibly naive. And despite the floor commentary on the War Games film and a series of books, there was no real concern about public systems.
The CFAA was about being able to stop a Nixon style leak from happening again with governmental-criminal-activity records.
It gave the government and intelligence sectors a way to force a gag by claiming that any leaks were part of an on-going criminal investigation.
Notice it was thrown out there for just that reason regarding the Iraq and Afghanistan document leaks. And many others.
It didn’t work out. And it’s always been abused. But that’s because it was supposed to be abused. From the moment pen hit paper.

Not (user link) says:

Short Sighted

The ACLU fails to consider that allowing web scraping of public records also significantly harms these same people via enabling the scraping and selling of public arrest records, divorce records, restraining orders, divorce pleadings, civil suits, etc. They should have more concern about protecting privacy, which is a human right. The NAACP can advertise. They don’t have to help people like Clearview AI codify web scraping under the guise of preventing evictions. Is the ACLU on Spokeo’s payroll now? Spokeo, ZoomInfo, Clearview AI all benefit greatly from this litigation and in turn make me into a product to be sold. I’ll never give another cent to the ACLU. How stupid are they? Disgusting.
