Twitter Wishes 4.5 Million Osama Bin Laden-Related Tweets Into Their API Cornfield

from the tweets-or-it-didn't-happen dept

Fri, Jun 17th 2011 04:08pm - Tim Cushing

Considering Twitter was instrumental in breaking the story of Osama Bin Laden’s death, it seems somewhat strange that they would also be instrumental in limiting access to one of the biggest stories of 2011, if not the decade. (Of course, we’re barely into this decade so we probably shouldn’t be building these "best of" lists quite yet…) At the center of this unfortunate situation is a dataset constructed from public tweets using either "osama" or "bin laden," which was compiled using Twitter’s own API.

Shortly after hearing of Bin Laden’s unexpected mortal coil shuffling, Rob Domanski, who blogs as The Nerfherder, was informed of an archive of Osama Bin Laden-related tweets, all packaged up in handy XML format for use with DiscoverText software:

The datafiles were samples taken from live feed Twitter imports starting shortly after the announcement of Osama bin Laden’s death.

Twitter searches for "bin laden" (647,585 documents, 505 MB)
Twitter searches for "osama" (586,665 documents, 451 MB)

This was all for research purposes; however, Twitter quickly shut down the project, citing their Terms of Service (TOS) Agreement.

Stuart Shulman of DiscoverText had compiled the documents "using an authorized connection to Twitter via their API," but sharing the results is apparently a violation of Twitter’s API Terms of Service. He received an email from Twitter asking him to remove the datasets:

I’m writing about Twitter data being offered for sale on DiscoverText. Scraping the Twitter service is prohibited by our site Terms of Service, and furthermore, resyndicating data obtained through the Twitter API is prohibited by section I.4.a of our API Terms of Service (http://dev.twitter.com/pages/api_terms).

As such, we request you remove the datasets listed at http://discovertext.com/osamabinladen.aspx and any other datasets containing Tweets offered on your site.

Shulman responded:

Let’s be clear. We have never sold a Tweet. The data collected through the Twitter API and shared through our system is the same publicly available data other users capture with screenshots and share on blogs, Facebook or Twitter itself. Nonetheless, the datasets we have assembled and similar samples are being taken temporarily off the Web site pending a resolution of this issue with Twitter.

Well, "temporarily" has turned into "indefinitely." As of June 1st, Shulman’s dataset contained 4.5 million Osama Bin Laden-related tweets, all of which can only be marveled at as a REALLY BIG NUMBER but not shared in any usable fashion thanks to Twitter’s complaint.

If it’s just a "policy first" decision on Twitter’s part, it seems a little short-sighted. This information was (and is) of great interest to people worldwide. Perhaps some sort of warning could have been issued instead of a full takedown, thus allowing Twitter to assert its position on API usage without locking up the dataset. Once the dataset already exists, why block it? It’s disheartening to see something with as much potential as Shulman’s project getting thrown under the TOS bus.

Anonymous Coward

June 17, 2011 at 4:20 pm

Wait...

I thought only the evil ‘MAFIAA’ cared about protecting their content in a digital age? And all these upcoming tech companies were about ‘openness’ and a new way of doing things?

You people are ridiculous.

Anonymous Coward

June 17, 2011 at 4:21 pm

This information was (and is) of great interest to people worldwide.

I lol’ed. At least you didn’t say great importance.

Anonymous Howard, Cowering

June 17, 2011 at 4:55 pm

Maybe...

I wonder if the FBI or some other governmental/cryptically-initialled agency leaned on Twitter…

Time to adjust the tin-foil hat. They can hear you, you know.

Anonymous Coward

June 17, 2011 at 5:18 pm

List of Idiots

This is a fine example of Party A objecting to Party B making A’s data more valuable, for free. Mike has taught us well to recognize the pattern. Then A engages in some sort of sleaze to make the data less valuable again. Isn’t that the kind of management boneheadedness which is supposed to be restricted to members of the XXAA and the government? Looks like Twitter wants to be added to the list of idiots.

Rekrul

June 17, 2011 at 5:24 pm

“It’s a good thing you done there Twitter! A real good thing!” 🙂

Anonymous Coward

June 17, 2011 at 5:24 pm

How come it wasn’t just upped as an xml file over bittorrent?

Miff (profile)

June 17, 2011 at 5:52 pm

I’m honestly surprised Twitter still has an API capable of doing anything useful.

Anonymous Coward

June 17, 2011 at 7:44 pm

War on Terror is a hoax…

Androgynous Cowherd

June 18, 2011 at 1:51 pm

I’d like to know under what legal theory Twitter has any right to demand that data taken down. Surely the copyrights in the individual tweets are vested in the individual tweeters, and the United States doesn’t recognize a separate database right in a mere compilation of separately copyrighted works where that compilation doesn’t add any original creative expression (e.g. in the arrangement) — likely these archives are just chronological, or even disordered.

So I don’t think Twitter has a copyright claim. And I don’t see any other IPR being remotely applicable here. Since the tweets were public, trade secrecy clearly cannot apply, and they’re obviously not patentable, nor can Twitter have trademarked them, though the tweets may contain trademarks here and there. If there’s a publicity rights violation in there anywhere, again the right being violated would be an individual tweeter’s and not Twitter’s. (That’s leaving aside the question of whether a 140-character tweet contains enough creative expression to even be copyrightable at all.)

Which means that Twitter hasn’t a legal leg to stand on if that site operator puts the archives back up and keeps them that way. The most Twitter can do about an alleged TOS violation is a) terminate the alleged violator’s Twitter account and possibly b) sue for breach of contract. But they have no proprietary interest in that data, legally speaking.

Josh in CharlotteNC (profile)

June 20, 2011 at 6:13 am

Re: Re:

I’d like to know under what legal theory Twitter has any right to demand that data taken down.

The data was gathered using Twitter’s API.
In order to use the API, you must agree to a TOS.
TOS and EULAs are considered contracts, even though no one reads them.

I’m not a lawyer, but as far as I remember, no one’s really been willing to decisively challenge or defend click-through and shrink-wrap TOS and EULAs for fear of a judge making a ruling that turns out to be a precedent.

Tom Landry (profile)

June 18, 2011 at 1:58 pm

Utterly fantastic title for this piece Tim….lolol
