Reporters Find Exposed Personal Data Via Google, Threatened With CFAA Charges

from the sounds-familiar dept

In a story that sounds mighty similar to the Andrew “weev” Auernheimer situation, two reporters from the Scripps News service have been told that they may be hit with Computer Fraud and Abuse Act (CFAA) charges after a Google search they did turned up personal data on 170,000 customers that two telcos left exposed. At issue are low-income customers of YourTel and TerraCom, which provide service under the FCC’s Lifeline program, a subsidized phone service for people enrolled in state or federal assistance programs. Apparently, the real issue was a company called Vcare, to which the two telcos outsourced certain services. The Scripps reporters noted that they did nothing more than a Google search:

The unprotected TerraCom and YourTel records came to light through the simplest of tools: a reporter’s Google search of TerraCom.

The records include 44,000 application or certification forms and 127,000 supporting documents or “proof” files, such as scans or photos of food-stamp cards, driver’s licenses, tax records, U.S. and foreign passports, pay stubs and parole letters. Taken together, the records expose residents of at least 26 states.

The application records, drawn from 18 of those states and generally dated from last September through November, list potential customers’ names, signatures, birth dates, home addresses and partial or full Social Security numbers. The proof files, from last September through April, include residents of at least eight remaining states.

Of course, rather than be thankful to the reporters for letting them know about a huge security lapse, or be apologetic for exposing all sorts of key data on their customers, the companies decided to threaten legal action.

Vcare and the two telecom companies assert that the reporters “hacked” their way into the data using “automated” methods. And what was this malicious hacking tool that penetrated the security of Vcare’s servers? In a letter sent to Scripps News, Jonathan D. Lee, counsel for both of the cell carriers, said that Vcare’s research had shown the reporters were “using the ‘Wget’ program to search for and download the Companies’ confidential data.” GNU Wget is a free and open source tool used for batch downloads over HTTP and FTP. Lee claimed Vcare’s investigation found the files were bulk-downloaded via two Scripps IP addresses.
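For anyone unfamiliar with the tool, a recursive Wget run does nothing a browser can’t: it requests a page and then fetches whatever that page already links to. Here is a minimal Python sketch of that idea (an illustration only, not Wget’s actual implementation; the starting URL is hypothetical):

```python
# Rough sketch of what a recursive fetch like "wget -r" does. It only
# follows links the server itself publishes; nothing is guessed,
# brute-forced, or "broken into".
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href target of every anchor tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_linked_files(start_url):
    # Request the starting page exactly as a browser would.
    html = urlopen(start_url).read().decode("utf-8", errors="replace")
    collector = LinkCollector()
    collector.feed(html)
    # Resolve each published link and download whatever it points to.
    for href in collector.links:
        target = urljoin(start_url, href)
        print("downloading", target)
        data = urlopen(target).read()  # wget would save this to disk


# fetch_linked_files("http://example.com/exposed-directory/")  # hypothetical URL
```

If scanned passports and pay stubs show up in a run like that, it is because the server handed them to anyone who asked, which is rather the point.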

I’m not sure how anyone could claim that the mere use of Wget constitutes a form of hacking, even under the extremely loose interpretations of the CFAA. However, as mentioned, the story does have similarities to the weev case, except this time we’re talking about reporters for a well-known news service rather than someone with a reputation as an internet troll. Hopefully, if the telcos do decide to actually file a lawsuit, it gets laughed out of court.

Companies: scripps, terracom, vcare, yourtel


Comments on “Reporters Find Exposed Personal Data Via Google, Threatened With CFAA Charges”

Chronno S. Trigger (profile) says:

Re: Re: Re: Computer Fraud and Abuse

Google isn’t involved, but I can understand where AC’s confusion comes from.

“While the reporters claim to have discovered the data with a simple Google search, the firms’ lawyer claims they used “automated” means…”

Referencing the company Google by name and then referencing the telco firms only as “the firms” can be confusing. It takes high-school-level reading skills to parse that properly without having to read it two or three times.

GMacGuffin says:

Pure Half-Assed CYA ...

“I’m not sure how anyone could claim that the mere use of Wget constitutes a form of hacking …”

Because if the telcos did not claim the reporters hacked the information, then they are tacitly admitting they posted the personal info of 100,000-plus people openly online. And that’s a pretty big oops.

Watchit (profile) says:

Re: Pure Half-Assed CYA ...

It kinda reminds me of the story a while back about the guy who found a simple security loophole on his bank’s website and got charged under the CFAA. I can’t remember the details exactly, though. It pretty much boiled down to “the act of trying to circumvent the website’s security counts as hacking, no matter how simple and obviously open the system was.”

Chronno S. Trigger (profile) says:

Isn't there something missing?

I admit I’m not an experienced web admin; I only run a few IIS web servers. Do other web servers not have the most basic security that’s built directly into IIS? I can set a folder to require a password, and that applies to any file in that folder even if the file is accessed directly. This would stop anyone, including Google’s spider, from accessing any file. I won’t even go into their lack of basic use of robots.txt.
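To illustrate the idea: with folder-level authentication in place, a direct request for any file in that folder should come back as a 401, not as the file. A minimal sketch of that check in Python (the URL is purely hypothetical):

```python
# Minimal sketch: if a folder is protected by server-level
# authentication, a direct request for a file inside it should be
# refused with 401/403 rather than served.
from urllib.error import HTTPError
from urllib.request import urlopen


def check_direct_access(file_url):
    try:
        urlopen(file_url)
        print("served with no credentials -- folder is NOT protected")
    except HTTPError as err:
        if err.code in (401, 403):
            print("server demanded credentials -- folder is protected")
        else:
            print("unexpected response:", err.code)


# check_direct_access("http://example.com/protected/application-form.pdf")  # hypothetical URL
```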

For right now, I’m going to hold off judgement on the possible actions against the reporters, but shouldn’t there be an extra line in this article? Something along the lines of “TerraCom and YourTel are under investigation for gross negligence.”

This is bullshit says:

The article on Ars Technica states something to the effect that Vcare has a requirement not to retain this data (3rd paragraph).

So here’s the situation as I (and likely everyone else, except the asshats who are trying to cover up their gross negligence and extreme incompetence) see it:

1. Company collects data
2. Company is required NOT to hold on to data
3. Company holds on to data in violation of #2
4. Company further shows how retarded they are by making it publicly accessible
5. Someone finds publicly accessible documents (that shouldn’t exist in the first place) and lets the retards know
6. Retards sue, crying “Hacking! Hacking!”

What an amazing strategy!

Anyone else just wondering how long it’s going to be before someone figures this out for what it really is?

alanbleiweiss (profile) says:

Companies that have asshats who don’t give a crap about security are the norm – it’s inexcusable. The fact that they can even file such a suit, or that the police state can bring charges against people who expose such crap, is frightening.

Just this morning, in a cursory review of a prospective audit client’s online presence, I did a Google search and discovered over 1,000 PDFs of customer invoices that they had blocked via their robots.txt file. But since Google now includes the URLs of robots-blocked files and just slaps a “description not available due to robots instruction” on them, that shit is wide open to anyone on the web, no hacking needed.

Companies need to be held accountable for their massive security failings and Google needs to be held accountable as well, even though that shit should have been completely blocked and behind a secure firewall.

The fact that this situation involved a couple reporters gives me little comfort in the notion that asshat companies might eventually be held accountable for causing such massive failings.

We need a comprehensive overhaul of the system, one NOT determined by congress or lobbyists. One that severely penalizes the asshats that cause the problem and rewards the ones who expose it.

Anonymous Coward says:

Re: Re:

It’s not Google’s fault if incompetent companies publish information to the entire world and they happen to stumble over it accidentally. (And anyone using robots.txt as a ‘security measure’ should be punched in the face with a missile.) The problem here is that companies are greedy, stupid and negligent — and are looking for scapegoats. Prosecutors, eager to catch themselves a SOOOOPERHACKER and get their names in the press, are happy to oblige.

Anonymous Coward says:

Re: Re: Re: Re:

It’s actually even worse than that.

Neither Google nor wget magically know what files are available on a site. For the files to show up in a recursive wget (or to the Google spider, for that matter), it means that the files had to be actively linked to from other accessible pages on the site. Somewhere on their public-facing website, there was a link pointing to all those thousands of insecure, confidential documents that the company wasn’t even supposed to be keeping in the first place.

out_of_the_blue says:

Don't go out of your way to write a scraper program!

If there’s any data that looks personal, run, don’t walk, away. That’s just sound advice.

2nd point: “firms’ lawyer claims” is all I see of the “threat”, and yet Mike implies charges are imminent. So far this is just another of his panics.

Watchit (profile) says:

Re: Don't go out of your way to write a scraper program!

I don’t understand the first point?

Your second point: why would someone imply that someone else is liable to be sued, if not to threaten to sue? Even if said threat is empty?

Also, I don’t think the article implies charges are imminent. It’s implied that if the companies decide to sue it will be laughed out of court, so it’s unlikely that they actually will.

Anonymous Coward says:

If anyone did any ‘hacking,’ shouldn’t it be Google, since that’s where the web search that found the personal information was done?

Going after the reporters basically amounts to the following:
-Person A compiles a list of over 100,000 customers and their personal data
-Person B gets hired by Person A to manage the data, and leaves the data lying out in the open where anyone can grab it.
-Person C finds the data lying around and grabs it and dumps it in a public area where anyone can still read it, but it’s in a place with much more traffic.
-Person D finds the dumped personal data, reports it to Person A, and gets charged with hacking.

kitsune361 (profile) says:

Actually...

I hope the DoJ goes after the reporters for CFAA violations and wins. If the DoJ refuses to go after them, it shows a definite double standard (weev is a troll; AT&T is more important than TerraCom). That, and the court battle would be epic, and thanks to the more sympathetic defendants it has a better shot on appeal than weev’s case does.

Also, as evidenced by the DoJ subpoenas of AP and FoxNews’ phone records, the one surefire way to make sure “how abusive these laws are” gets into the goldfish-sized attention span of the professional news media is to use those laws against the professional news media.

Anonymous Coward says:

you can most definitely thank the government for this sorry state of affairs. it said it would protect whistleblowers, then turned turtle and crapped all over them just to protect its own ridiculously stupid mistakes, and now every freakin’ industry has jumped on the bandwagon! this was one of the things that started the ever-increasing slide down the shit chute for the USA. no one is safe from their own law enforcement. is it any wonder people rebel against them when all protection for doing the right thing gets thrown straight out the window? any wonder nobody cares a toss what happens when companies get caught out over things like this? ‘no good deed goes unpunished’ is a ‘truer words spoken in jest’ kind of thing.

Wally (profile) says:

I cannot wait to see how long it takes before Google fanboys defend Google’s lax security over this.

Google apparently no longer uses its spider crawler to help show people the most relevancy possible when they search…Instead they do relevancy in the way most web advertisers do by what and how many things an individual clicks on and move those items up on said individual’s search results based on how many times you click a link. They identify you via IP address and hold your search data for up to 9 months. Within the 9 months if you search on Google, they don’t delete previous searches at all when the 9 month mark rolls around.

If Google’s CEO has the power to internally check any user’s e-mails without a password…imagine what these reporters who pointed out the security flaw could have done.

alanbleiweiss (profile) says:

Re: Re: Re:

While Google is not ultimately responsible for the administration of other sites, they have chosen to take a stand against hacked sites and malware-ridden sites, going so far as to block them from search results pages.

Google claims to be on the side of security, yet they ignore the robots.txt file’s disallow instructions, and not only that, but publicly display links found on a site where those links were clearly delineated as “disallow” in that file. As such, they are complicit in the breach.

Gwiz (profile) says:

Re: Re: Re: Re:

… yet they ignore the robots.txt file’s disallow instructions, and not only that, but publicly display links found on a site where those links were clearly delineated as “disallow” in that file.

Do you have a citation for that? It’s not that I don’t believe you – just haven’t heard that one before.

alanbleiweiss (profile) says:

Re: Re: Re:2 Re:

A citation for it? Yeah, half the search marketing industry. As an SEO audit professional I routinely encounter it. They list URLs, but beneath them, where a description of the file would go, is the statement:

A description for this result is not available because of this site’s robots.txt – learn more.

They do not show all URLs that are blocked in the robots file; however, if their (extremely flawed) system sees enough “other indicators” to countermand the robots instruction, they ignore that instruction.

“Other indicators” is most often a link to that file somewhere on the site itself, or a link pointing to the URL from another site.

The “learn more” link points to this Google answer page where it states:

While Google won’t crawl or index the content of pages blocked by robots.txt, we may still index the URLs if we find them on other pages on the web. As a result, the URL of the page and, potentially, other publicly available information such as anchor text in links to the site, or the title from the Open Directory Project (www.dmoz.org), can appear in Google search results.

Which is complete bullshit, because while they’re not actually indexing the CONTENT of the page, they’re indexing the URL.

So in the case of a URL that includes variable parameters labeled with “order” or “customerID” or some-such, that opens up the can of WTF for anyone savvy enough to go snooping.

nasch (profile) says:

Re: Re: Re:3 Re:

Which is complete bullshit, because while they’re not actually indexing the CONTENT of the page, they’re indexing the URL.

So in the case of a URL that includes variable parameters labeled with “order” or “customerID” or some-such, that opens up the can of WTF for anyone savvy enough to go snooping.

You’re not suggesting this is a security issue, are you? Because if you’re relying on robots.txt to secure sensitive information, you’re doing it very, very wrong. So what is the problem with this behavior by Google? I’m not saying there isn’t one, I’m just not sure I’m even clear on why you would use robots.txt to keep search engines away. If it’s to save bandwidth, then this doesn’t cause a problem since Google isn’t downloading the page.

nasch (profile) says:

Re: Re: Re:5 Re:

What I’m saying is sites need to get their security methods right. At the same time, Google claims to be a security backstop, yet they allow those URLs into their system.

Google is saying robots.txt is a security measure? If that’s what you’re saying, do you have a reference? If not, what do you mean by security backstop?

nasch (profile) says:

Re: Re: Re:7 Re:

No, Google is NOT saying that. Yet their system is more than capable of keeping URLs that are listed in the robots file out of their index, so there’s no excuse for them, as a supposed security advocate, not to honor robots.txt instructions.

But you keep mentioning security in connection with robots.txt. If you acknowledge that it’s not a security measure, and Google doesn’t say it’s a security measure, why are you still talking about security?

Also, why is this a big deal? I’m not trying to defend Google, I just really don’t see why it’s important. Can you explain it?

alanbleiweiss (profile) says:

Re: Re: Re:8 Re:

Because it’s an opportunity for Google to help improve the securing of private information on the web. Since they already take proactive steps in other areas to improve security online, why not here?

For example – they proactively block sites their system detects to have malware or viruses. They don’t have to. It’s the responsibility of site owners to ensure their sites don’t have malware or viruses baked in. Yet Google has chosen to help.

This is no different.

nasch (profile) says:

Re: Re: Re:11 critical thinking

While robots.txt, being related to search engines, is not by original nature a means of security, Google has the power and resources to respect it for the sake of security.

So you’re saying it was never intended as a security measure, it is not appropriate to rely on it for security, Google does not recommend it be so used, and Google should try to make it as effective a security tool as possible? No wonder I was confused. 🙂

aldestrawk says:

Re: Re: Re: Re:

The use of robots.txt is in no way a security measure. It was never intended to be and definitely should not be used as such. It is simply intended to relieve servers of unnecessary traffic resulting from spiders’ actions. Any script kiddie can do the same thing as a spider and intentionally ignore the request that a robots.txt file represents.
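Python’s standard library even ships a robots.txt parser, and a minimal sketch shows just how voluntary the whole arrangement is (the sample robots.txt and URL below are made up for illustration):

```python
# Minimal sketch of why robots.txt is advisory only: the can_fetch()
# check below is one a *polite* crawler chooses to run. Nothing in the
# protocol stops a client from fetching the URL anyway.
from urllib.robotparser import RobotFileParser

SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /invoices/
"""

rp = RobotFileParser()
rp.parse(SAMPLE_ROBOTS_TXT.splitlines())

url = "http://example.com/invoices/12345.pdf"  # hypothetical URL
print(rp.can_fetch("*", url))  # False: a polite crawler stays out
# ...but urlopen(url) would still return the file unless the server
# itself required authentication for it.
```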

nasch (profile) says:

Re: Re: Re:3 it's called innovation

Just because something did not have an original intent to be used in a certain way does not mean it should not be used in a new way if that way is innovative and provides value to the world.

Yes, but if using it in that way is stupid and ineffective, then that doesn’t provide value. Maybe the illusion of value, which is even worse than nothing.

Anonymous Coward says:

Re: Re:

How Google orders searches, and what it does with previous search data, has nothing to do with this company making personal data it shouldn’t even have had searchable.

If you want any data related to you removed from Google, there is a button you can press to scrub yourself from their system. Otherwise, if you don’t want to give them access to the data you create, you can feel free not to use their free services.

Anonymous Coward says:

Re: Re: Re:

Where is this imaginary button? Unless you’re talking about the “Remove Search” or “do not save search” options Google provides… but then again, those are only saved as cookies on YOUR computer, so YOUR system is tracked even without that option. The truth is that Google does not provide a magic button for users of their search engine to have information on where they clicked stricken from Google’s servers.

3/4 of Google’s revenue in 2010 came from advertising, and less than 1/100 came from search technologies. When you do the math, you start to realize Google’s priorities have shifted toward catering to advertisers rather than web users.

Anonymous Coward says:

Re: Re: Re:

“If you want any data related to you removed from Google, there is a button you can press to scrub yourself from their system. Otherwise, if you don’t want to give them access to the data you create, you can feel free not to use their free services.”

Although, on second read… I recall that one could do this, but only if one is logged into Google+… which means they are still tracking your every move and can target ads to you with the advertising companies they own.

aldestrawk says:

Re: Re: Re: Re:

Wally has pointed out several issues with how Google operates. They are somewhat related but not closely enough for me to figure out what point he is trying to make.

-The use of spiders is the first step in setting up a search index of the web. It is the bottom layer here. Necessary, but any page-rank algorithm (relevancy?) takes this basic information and tweaks it in its own way. On its own, an index of the web does not determine page rank. That was Google’s innovation back in 1998.

-There is a basic issue with user privacy vs. Google’s business model of using search data as a basis to extract advertising dollars. I have not recently followed their data retention practices, so I accept his statement of a 9-month limit, or lack of one. However, what has this got to do with the story at hand here, about TerraCom’s security/privacy issue?

-Someone else (the user in the industry?) thinks that Google is being lax in security because they list the URLs for which the robots.txt file is asking not to be followed. This is not any kind of security failing. I assume that Google’s spider is not recursively following the hyperlink that URL represents or executing any code to produce a dynamic web page for that URL. That is the real intention of robots.txt. I think it is a minor issue that Google now lists the URL that begins a blocked branch. As the robots.txt file can be simply ignored, any real attempt to restrict access to the data on that page should require an authentication/authorization step.
Google’s security interest in labeling certain sites as dangerous is completely separate from any concern about indexing pages that the owner would rather have private. Such dangerous sites are identified, as best as possible, to contain malware such as a cross site scripting vulnerability.

In Wally’s final sentence, I don’t see the connection between Google’s technical ability to view the content of email on the gmail domain and the security failings of TerraCom/Vcare. A series of statements of varying validity that have no obvious connection is, to me at least, the definition of rambling.

Maybe I am being too critical. After all, this is just a forum where people, perhaps with limited time, just throw out thoughts to be consumed and either ridiculed or praised. I am loath to trot out my credentials, but I have worked on network protocols for thirty years and on network/computer security for six years.

Anonymous Coward says:

I’m not sure how anyone could claim that the mere use of Wget constitutes a form of hacking, even under the extremely loose interpretations of the CFAA.

I bet there are a few organizations in Washington DC that would be happy to pontificate on Wget being a dangerous hacker tool used by Chinese cyberhackers to perpetrate cyber-9/11 cyberterrorism on our cybercountry.

Donglebert the Needlessly Obtuse says:

People should not be allowed to use computers to freely access the internet

They should only be allowed to view passive pages via a screen in a reasonably public place, say, for example, their living rooms. Keyboards, mice, and touchscreens should be banned because they encourage hacking.

If this fails, the next step would be to distribute printed copies of approved web pages.

I Forgot says:

Trusting Any Company to Securely Protect Consumers

This is a systemic problem with corporations that require extensive personal data from their customers. This case does give the appearance of these companies attempting to cover their own arses after the fact, having mishandled or neglected to secure even the simplest of data.

It is so tiring to hear of yet more personal information that was entrusted to a company once again ending up in the wrong hands. There should be a law that brings full liability to bear upon the company’s CEO, vice president, and entire board when this occurs, as well as all top management, in proportion to the damage it causes or could cause.
