Is It Really That Big A Deal That Twitter Blocked US Intelligence Agencies From Mining Public Tweets?

from the it's-public-info dept

Over the weekend, some news broke about how Twitter was blocking Dataminr, a (you guessed it) social media data mining firm, from providing its analytics of real-time tweets to US intelligence agencies. Dataminr, which (as everyone is careful to point out) has investments from both Twitter and the CIA’s venture arm, In-Q-Tel, has access to Twitter’s famed “firehose” API of basically every public tweet. The company already has relationships with financial firms, big companies and other parts of the US government, including the Department of Homeland Security, which has been known to snoop around on Twitter for quite some time.

Apparently, the details suggest, some (unnamed) intelligence agencies within the US government had signed up for a free pilot program, and it was as this program was ending that Twitter reminded Dataminr that part of the terms of their agreement in providing access to the firehose was that it not then be used for government surveillance. Twitter insists that this isn’t a change; it is simply enforcing existing policies.

Many folks are cheering Twitter on in this move, and given the company’s past actions, the stance is perhaps not that surprising. The company was one of the very first to challenge government attempts to get access to Twitter account info (well before the whole Snowden stuff happened). Also, some of the Snowden documents revealed that Twitter was alone among internet companies in refusing to sign up for the NSA’s PRISM program, which made it easier for internet firms to supply the NSA with info in response to FISA Court orders. And, while most other big internet firms “settled” with the government over revealing government requests for information, Twitter has continued to fight on, pushing for the right to be much more specific about how often the government asks for what kinds of information. In other words, Twitter has a long and proud history of standing up to attempts to use its platform for surveillance purposes — and it deserves kudos for its principled stance on these issues.

That said… I’m not really sure that blocking this particular usage really makes any sense. This is public information, rather than private information. And, yes, not everyone has access to “the firehose,” so Twitter can put whatever restrictions it wants on usage of that firehose, but seeing as it’s public information, it’s likely that there are workarounds that others have (though, perhaps not quite as timely). But separately, reviewing public information actually doesn’t seem like a bad idea for the intelligence community. Yes, we can all agree (and we’ve been among the most vocal in arguing this) that the intelligence agencies have a long and horrifying history of questionable datamining of other databases that they should not have access to. But publicly posted tweet information seems like a weird thing for anyone to be concerned about. There’s no reasonable expectation of privacy in that information, and not because of some dumb “third party doctrine” concept, but because the individuals who tweet do, in fact, make a proactive decision to post that information publicly.

So, perhaps I’m missing something here (and I expect that some of you will explain what I’m missing in the comments), but I don’t see why it’s such a problem for intelligence agencies to do datamining on public tweets. We can argue that the intelligence community has abused its datamining capabilities in the past, and that’s true, but that’s generally over private info where the concern is raised. I’m not sure that it’s helpful to argue that the intelligence community shouldn’t even be allowed to scan publicly available information as well. It feels like it’s just “anti-intelligence” rather than “anti-abusive intelligence.”

Companies: dataminr, twitter

Comments on “Is It Really That Big A Deal That Twitter Blocked US Intelligence Agencies From Mining Public Tweets?”

Anonymous Coward says:

“I don’t see why it’s such a problem for intelligence agencies to do datamining on public tweets”

I’m in agreement. Public information should be available for the commons, which means not dictating limits on its use. Those uncomfortable with this may want to rethink their use of public sites or more private ones like Facebook where power is ceded to the hosting entity.

Anonymous Coward says:

Re: Re: ALPR databases

There’s nothing wrong with ALPR databases either as it’s covered by the first amendment.

Creating an ALPR database is currently legal because of the First Amendment connection. However, despite being lawful to create and to read, there are substantial concerns about how such a database can be used. There are a variety of questionable or outright prohibited uses which can be readily conducted once the data is aggregated. Depending on the particular collection and retention policies of the database, there may be few or no desirable uses for the long term data. For example, retaining five years (or more) of historical data has questionable investigative value. How often would a legitimate investigation benefit from knowing where a given vehicle was seen four or five years ago? If the answer is “rarely” or “never”, then perhaps the database should not retain enough history to answer that question. There are definitely some illegitimate uses of very old data, and retaining the data for a long period enables all uses, both legitimate and illegitimate.

Thus, my stance is that while collecting an ALPR database is legal, many of those who do it are maintaining it in ways that I would like to see changed. License plate recognition and tracking, like some other forms of surveillance, was historically limited by the impracticality of personally conducting it on a large scale. No one would seriously expect the police to spend the manpower to build a license plate database by having officers stand at major intersections and enter all those plates by hand, so no one thought to pass a law limiting such surveillance to contexts with an articulable connection to an open investigation. Now they can very cheaply collect such information via ALPRs, and there is still no law telling them not to do it, so they do because it looks neat and might help some day. I want a law that, at minimum, clearly disallows surveillance that has no better basis than “it might be useful some day.”

Anonymous Coward says:

Re: Re: Re: ALPR databases

ALPR tech is a weird issue that’s always bugged me, and I can’t figure out if my concerns are ridiculous, paranoid, or just too slippery-slope-ish.

Given that the hardware will get smaller and cheaper (and storage space bigger and cheaper), it’s possible that ALPRs could become ubiquitous: slap tiny little button cameras on any & every street sign, mile marker, and public structure available. Plus, since the devices are leased/sold by private companies that also host the databases that retain captured info, there’s nothing to stop mounting them on every bit of private property owned by someone who’ll do it for a little compensation.

As the density of ALPRs increases, the resolution of the monitored path traveled by an individual car increases. Eventually, this resolution is effectively GPS tracking of every car that leaves its driveway. The metaphor here goes beyond a cop with a camera on every street corner: it becomes a cop car with a running dash cam following every single person 24/7. We’ll all have to switch to self-driving cars, if only to avoid all the traffic fines we’d have to pay (or we’ll have to insert credit cards into manually-driven cars to pay the near-real-time penalties as they roll in). Then there’s the ol’ problem of “this car stops on a street known for activity x every Saturday; maybe we should pull it over for ‘failure to signal’ this coming weekend.” (Thinking that that sort of thing mightn’t be legal, the words ‘parallel’ and ‘construction’ spring to mind for some reason.)

Even ignoring LEA (warrant-less) access to this sort of information, the private sector control of the data is more than a bit creepy. Imagine health insurance companies being able to ask how often someone stops at McDonalds or Krispy Kreme, auto insurers with black-box access to your driving habits, your boss being able to check if you spend a lot of time at the local watering-hole after work every day, etc. etc.

Privatizing mass surveillance doesn’t quite feel like an egalitarian utopia with lots of togas & flyin’ cars. Feels more like one of those cases where corporations being nothing more than everyday people (just like you & me!) with everyday rights isn’t exactly ‘leveling the playing field’.

GMacGuffin (profile) says:

Re: Re:

“…what is the difference between data-mining “the firehose” and building ALPR record databases?”

In the aggregate, there is a big difference. Twitter posts track what the user chooses to say publicly. The user has total control over that.

But … aggregated ALPR data can track a person’s movements and locations, giving a picture of where and whom one deals with. This implicates privacy concerns (urologist office Wed 2:30; married coworker’s house at lunch), freedom of association concerns (stop by local KKK HQ), including false positives (cousin borrows your car for a murder), and so on. Big difference.

Anonymous Coward says:

Re: Re: Re: Re:

And a driver has total control over where they choose to drive publicly

Read the part of GP that you did not quote. It is unrealistic to say that people can freely visit embarrassing places and keep their movement secret from an ALPR database. They might keep their movement secret from a single individual by engaging in counter-surveillance techniques, but the only way to counter an ALPR is not to drive anywhere important. Driving is ubiquitous in modern society.

Tracking someone in public because you have a good reason to do so (probable cause) is morally very different from tracking everyone in public just because you can do it cheaply and you might find something interesting.

GMacGuffin (profile) says:

Re: Re: Re: Re:

And a driver has total control over where they choose to drive publicly, and even which route they choose to take to get there. What’s the difference?

Oh, I dunno … say you have finally gotten that appointment with that HIV specialist and have no other way to get there but your own car.

Or say you are active in a protest group that the government would like to take you down for, Constitution be damned. Lots of stories out there about The Man surveilling perfectly legal activism. And say an FBI agent asks you about it and you waffle, and he doesn’t like it, so he says you lied to a federal agent. Then you can go to jail for lying to a federal agent, just because. And you can’t prove otherwise because it’s his word/notes against yours as they don’t record interviews… that sort of thing.

Anonymous Coward says:

Re: Re: Re:2 Re:

i will go you one better: what if we citizens do some crowd-sourced public observation and tracking of LEOs, what bullshit would they come up with for why that is totally unacceptable ?
in general, OUR lives should be totally opaque to gummint, and gummint should be totally transparent to us, the theoretical owners and bosses of ‘our’ (sic) gummint…

Mason Wheeler (profile) says:

Re: Re: Re:2 Re:

Oh, I dunno … say you have finally gotten that appointment with that HIV specialist and have no other way to get there but your own car.

Why say that? Show me a scenario in which public transportation and cabs are unavailable, and yet normal businesses are open for business and the roads are clear for normal driving (ie. not a natural disaster situation) and I will show you a contrived scenario that has no place in a discussion of real-life events.

HegemonicDistortion says:

Re: Re: Re:

And it has First Amendment implications as well, particularly for the right of association. Ubiquitous license plate readers can be used to essentially collect association information, which, the Supreme Court has ruled, can violate one’s free association rights (when the government does/uses it).

Anonymous Coward says:

Re: Re:

The difference between the firehose and ALPR, is that ALPR record databases are transformative, whereas with twitter, you’re intending for people to be able to find and read your messages.

ALPR is taking public license information that is in many places mandated by government, and capturing its movement in bulk.

Tweets are fully voluntary broadcasts.

And that’s the difference.

Wyrm (profile) says:

Re: Re:

I’d say there are two main differences.

Others have already pointed out the choice factor. You actively choose to spread your words on Twitter. You cannot choose to hide your license plate when driving your car, hence forcibly “communicating” your movements to anyone looking at your car.

I would add that there is also a difference in the level of publicity. When you type a tweet, you know it will be available to everyone, and indexed for easy search. When driving your car, another individual would have to follow you around to know where you’ve gone. ALPR takes this to a level that is 1. not originally expected and 2. beyond what a single individual could match.

As I see it, those two points make ALPRs a completely different issue.

Anonymous Coward says:

Anti-abusive intelligence

It feels like it’s just “anti-intelligence” rather than “anti-abusive intelligence.”

Show me an intelligence agency that’s not abusive and I’ll show you an agency that deserves convenient access to public information. Currently, every relevant intelligence agency is either (a) known to be abusive or, (b) not known to make substantial efforts to avoid abuses (both at the individual and organizational levels). Thus, for now, there is no difference between penalizing all intelligence agencies and penalizing only the abusive ones. Penalizing all of them is easier than specifying who can and cannot have access, since that would mean building an access control list and then dumping all of them on the blacklist. For now, penalizing all of them targets exactly the right set.

theBlueSage (profile) says:

Piping Data has a cost

Coming from a pure firehose perspective, having a consumer of the firehose comes with a cost. It is not like the hose is pouring info into the ether and people stand under the spray. When someone attaches to the firehose it creates a separate hose connection. The data flowing through that hose connection to the consumer has a network bandwidth cost. How much that cost is will depend on the total amount of firehose consumed. IMHO, I would assume that the snoops will have that hose in wide open, 24/7 mode. Depending on the size of the draw (one listening process, 5000 listening processes) the bandwidth cost can get huge.

Dr. David T. Macknet (profile) says:

Posturing for PR Purposes

Part of this may be PR as a tech company to say, “hey, we’re not in the government’s pocket!” In other words, they’re trying to spin it as if they don’t take whatever contracts come their way, when in reality they haven’t ended those relationships.

Part of this may be PR as a company trying to attract job talent … and getting responses like mine:

A couple/three years ago I interviewed with Dataminr. I ended the interview process after I learned that their clients included various agencies, because I just didn’t want to be a part of a company that did analysis for “those people.”

The tech industry as a whole has tended to lean towards privacy over surveillance, and Dataminr is likely losing out on a lot of talent because they’re perceived as being in the pocket of those agencies.

As a data scientist (see that new buzzword?) I’m interested in their volume of data, and the science behind processing that data. However, until they cease to be a tool of oppression, they’ll not be attractive to me. And I don’t think I’m alone in making this assessment.

Anonymous Coward says:

Re: Posturing for PR Purposes

The tech industry as a whole has tended to lean towards privacy over surveillance

You obviously haven’t paid any attention to pretty much anything over the past decade or so if you believe that. Most of silicon valley has been built around the for-profit surveillance model.

John Fenderson (profile) says:

Re: Re: Posturing for PR Purposes

The tech industry has this bizarre attitude: privacy is sacred when it comes to “sharing” data outside of the tech industry. Inside of the tech industry, this data “sharing” is considered a tremendous virtue — because there’s money in that pile of data.

It’s a level of hypocrisy that borders on stunning.

orbitalinsertion (profile) says:

The problem. Hmmm, not exactly huge, but why should they have special treatment?

Anyone else trying to do the same would be flagged as abusive by servers.

Spook agencies already slurp data their own way. Maybe they should actually be using data gained by their methods rather than every possible convenience of access ever.

You know they are mostly bothering people who don’t do anything other than have an opinion, and only certain sorts, at that. They behave like everyone else who is “vigilant” about “terrorism”, freaking out over someone doing algebra, but go ahead and publicly demonstrate for white, christian, right-wing insurrection and that is hunky-dory. Gee, make a lifelong career of it. (Whether they run files on them or not, they aren’t the ones who mysteriously end up on no-fly lists, etc.)

More broadly, no one, whether agencies or marketing arms of companies, properly minimizes, anonymizes, or protects any data. Twitter firehose would be neither, by nature. And a very small percentage of the use of this sort of mass data is anything positive. And it could be. So why give access to such data to such poor stewards?

Anonymous Coward says:

What is concerning here is that the intelligence agencies are forcing their way into all the data streams on the Internet. They are not targeting their data collection, or rather are collecting data that is of political significance, rather than genuine national security interest. This goes double when it is data on foreigners that they are collecting, especially with the US propensity to interfere in foreign countries when it looks like the internal politics are going against their political and commercial interests.

Slinky (profile) says:

Snowden 60 minutes interview

As far as I remember, when Ed Snowden was interviewed by 60 minutes, he said something about a government program that was able to not only datamine what you type, but HOW you type. The program would be able to recognize or “fingerprint” a user from his/her characteristics/patterns. This way it was possible to profile people. (Please check out the 60 minutes interview with Snowden, I may remember it wrong :/)

Annonimus says:

Crying Wolf

The intelligence agencies have abused every angle they can get their hands on to expand their information haystack. Why on earth should Twitter trust them not to bring up that the firehose is something Twitter shares with them, but they are refusing other stuff in their program?

The Intelligence community and the Justice Department have abused every piece of good will that the Silicon Valley has extended them. Why on earth should Twitter trust them not to attempt to spin any sort of access into a story of Twitter being picky about what it shares and what it doesn’t?

That One Guy (profile) says:

Re: Crying Wolf

Yeah, the DOJ’s actions in the recent Apple case made it clear that helping out government agencies is only going to hurt you down the line if you ever refuse a request.

‘They were willing to give us X, but now they’re refusing to give us Y because they care more about their bottom line than helping us catch criminals/terrorists/communists.’

I think David T. Macknet above is probably partially right: this is at least partly a PR move, but I imagine part of it is simply to reduce the government’s ability to use Twitter’s willingness to work with them against them down the road.

@b (user link) says:

False dichotomy though....

Mason already has it covered above but

1) There’s mountains of difference between ten dudes tweeting and what an entire country is tweeting “in aggregate”. Or everything those ten dudes have ever tweeted since sign up / the age of consent.

2) Microbloggers living under oppressive governments are on an online hair-trigger. Ever make a copy-paste mistake? Accidentally typed your password in the wrong text field? Regret something u wrote online? Operated a phone when a bit drunk? Heard of someone famous deleting a tweet?

3) Social networks have utterly complicated Public/Private dualism. With every node in the network, and every bloop arriving from your Contacts List, comes 50,000 shades of grey.

Do we really want to CC the government on every digital message ever sent that didn’t require a password to read?

Will it be to hell with broadcasters? Encrypt it or die?
