by Mike Masnick
Thu, Dec 1st 2011 7:59pm
Filed Under:
api, apps, innovation, music, platform
Companies:
spotify
Twitter Wishes 4.5 Million Osama Bin Laden-Related Tweets Into Their API Cornfield
from the tweets-or-it-didn't-happen dept
Considering Twitter was instrumental in breaking the story of Osama Bin Laden's death, it seems somewhat strange that they would also be instrumental in limiting access to one of the biggest stories of 2011, if not the decade. (Of course, we're barely into this decade so we probably shouldn't be building these "best of" lists quite yet...) At the center of this unfortunate situation is a dataset constructed from public tweets using either "osama" or "bin laden," which was compiled using Twitter's own API.
Shortly after hearing of Bin Laden's unexpected mortal coil shuffling, Rob Domanski, who blogs as The Nerfherder, was informed of an archive of Osama Bin Laden-related tweets, all packaged up in handy XML format for use with DiscoverText software:
The datafiles were samples taken from live feed Twitter imports starting shortly after the announcement of Osama bin Laden’s death. This was all for research purposes, however Twitter quickly shut down the project citing their Terms of Service (TOS) Agreement.
- Twitter searches for "bin laden" (647,585 documents, 505 MB)
- Twitter searches for "osama" (586,665 documents, 451 MB)
Stuart Shulman of DiscoverText had compiled the documents "using an authorized connection to Twitter via their API," which is apparently a violation of Twitter's API Terms of Service. He received an email from Twitter asking him to remove the datasets:
I'm writing about Twitter data being offered for sale on DiscoverText. Scraping the Twitter service is prohibited by our site Terms of Service, and furthermore, resyndicating data obtained through the Twitter API is prohibited by section I.4.a of our API Terms of Service (http://dev.twitter.com/pages/api_terms).
As such, we request you remove the datasets listed at http://discovertext.com/osamabinladen.aspx and any other datasets containing Tweets offered on your site.
Shulman responded:
Let’s be clear. We have never sold a Tweet. The data collected through the Twitter API and shared through our system is the same publicly available data other users capture with screenshots and share on blogs, Facebook or Twitter itself. Nonetheless, the datasets we have assembled and similar samples are being taken temporarily off the Web site pending a resolution of this issue with Twitter.
Well, "temporarily" has turned into "indefinitely." As of June 1st, Shulman's dataset contained 4.5 million Osama Bin Laden-related tweets, all of which can only be marveled at as a REALLY BIG NUMBER but not shared in any usable fashion thanks to Twitter's complaint.
If it's just a "policy first" decision on Twitter's part, it seems a little short-sighted. This information was (and is) of great interest to people worldwide. Perhaps some sort of warning could have been issued instead of a full takedown, thus allowing Twitter to assert its position on API usage without locking up the dataset. Once the dataset already exists, why block it? It's disheartening to see something with as much potential as Shulman's project getting thrown under the TOS bus.
Vevo Doesn't Put Ads In YouTube API, Gets Upset When Music Streaming Startup Uses That Fact
from the umm... dept
Vevo's latest screwup was that it didn't include its preroll ads in the YouTube API, meaning that others who used the API could access and repurpose Vevo content without the ads, and even show the content outside the US (which Vevo currently does not allow). It didn't take long for one enterprising startup, Muziic, to do exactly that. Muziic has received some attention for basically using the YouTube API to create an iTunes-like experience out of YouTube videos (it also gets attention for being founded by a 16-year-old). Muziic sent out an announcement this week about how it was using the YouTube API to add Vevo content, meaning you could access Vevo videos without the preroll ads and outside the US.
Vevo's first response? To send a cease and desist. At the very least, it wasn't a legal nastygram, but a more friendly cease & desist sent by Vevo CEO Rio Caraeff himself. But "cease" what? Muziic was using the API as designed, and even though Caraeff admits that Vevo is quickly scrambling to change the API, he still says Muziic needs to cease using the Vevo logo or referencing the company's name. But Muziic used the name in an accurate and descriptive manner. It accurately noted that it was now offering Vevo content -- without ads and outside the US -- all legally via the use of the API provided by YouTube/Vevo itself.
Muziic's co-founder responded to Caraeff's email over at Hypebot, saying that he "was as shocked as anyone when I realized there were not yet any 'pre-roll' advertisements for Vevo content in the API," but since it was how the company set up the API, it seems perfectly reasonable to use it that way. He also notes that he had reached out to Vevo prior to this to try to work out an arrangement with the company and got no response.
by Mike Masnick
Fri, Nov 20th 2009 5:28pm
Filed Under:
api, blocking, set top boxes, tv, video, youtube
Google Blocking Set Top Boxes From Showing YouTube Unless They Pay Up?
from the evil-is-as-evil-does dept
by Mike Masnick
Tue, Sep 29th 2009 8:53am
Filed Under:
api, copyright, movie times, services
Companies:
movieshowtimes.com, west world media
Can You Copyright Movie Times?
from the and-if-you-could,-why-would-you? dept
You need to know it is unlawful and a violation of our copyright and intellectual property rights for you to build a system that obtains our content from any source other than to obtain an expressed license from West World Media for legal usage of our content. Each violation of our Intellectual property rights allows us to collect damages of up to $150,000 per infringement. This would equate in liquid damages of over $600,000 per month if you violate our rights.
Anderson responded, asking the company how factual information (such as movie times) could be covered by copyright, and the company responded:
"It is not our responsibility or duty to explain complex US Intellectual Property rights law, we however enjoy many protections from them. I suggest you hire an IP attorney to explain it to you. From your response, it seems to me you have no intentions of moving forward in a legal manner. We closely monitor any and all usage of our content and if we discover your unlawful usage of it, we will exercise our rights to their fullest extent of the law."
Now, obviously, the company makes its money by licensing its database of showtimes to certain websites, but that information is factual, and it's difficult to see how the company could hold a copyright on it (at least in the US, where there's no real "database right" -- elsewhere... perhaps a different story). There's also no creative element in merely listing showtimes, and it's hard to see how they would possibly be covered by copyright. If the problem is that the company is upset that its business model can't handle other people sending it traffic, that's a business model problem, not a copyright problem. Time to redesign the business model to take a cut of sales, rather than to rely on artificial copyrights. Unfortunately, though, it doesn't stop a company from making such threats...
Separately, this reminds me of the fact that, just a few months ago, we were talking about how the movie times in newspapers were apparently paid advertisements by the theaters themselves. So, this seems like an odd switch as well: newspapers get paid for movie listings, but websites have to pay for them? How does that work?
by Mike Masnick
Wed, Mar 11th 2009 5:15am
Filed Under:
api, news, newspapers, platform
Companies:
ny times, the guardian
The Guardian Follows The NY Times In Making News A Platform
from the good-job dept
Perhaps even more interesting (though getting much less attention) is the companion bit of news from some editors at the Guardian -- who are pointing out that they hope and pray each day that the NY Times gives in to temptation and starts trying to charge for news... because it will create a huge opening for the Guardian to create a much larger online audience. This is what plenty of people have been pointing out for years: if clueless newspaper execs decide to start charging for news, it just opens the door wide for smarter news organizations to stay free and accumulate a much larger audience.
NY Times Turning News Into A Platform
from the smart dept
Of course, just because there are some folks on the digital side who "get it" at the NY Times, it doesn't mean management has quite figured things out yet. At the same time as releasing this API, the paper's Executive Editor, Bill Keller, is talking about trying to lock up their content and charge people for it, again. Yes, the newspaper needs new and innovative business models, but by now it should know that trying to charge for such content simply isn't a sustainable model. There's too much competition out there (which the NY Times discovered already when it tried and failed to charge for content a few years back). There are things that the paper can charge for -- but basic online content isn't one that will be successful.
NY Times Starting To Recognize That Data Is News
from the all-the-data-that's-fit-to-release dept
And that's exactly what the NY Times appears to be doing, if only on a limited scale (for now). It's set up an API for campaign finance data, allowing anyone to build useful tools or visualizations on top of it. And that's not all: they're also getting ready to release an API for movie reviews. In other words, the NY Times is definitely recognizing the value in not just freeing up their stories, but making core underlying data totally accessible and useful.
Where Are The APIs For Government Data?
from the open-'em-up dept





