We were surprised last year when the Associated Press decided to sue Meltwater News -- an online press clipping/news aggregator service. Meltwater indexes various news articles, just like Google, allowing you to search on them. But it does not deliver the full articles to users. Instead, it delivers snippets (just like Google News) and then points users to the versions online (increasing traffic to AP stories). It delivers two snippets: the first is only 300 characters, representing the opening of the article, and the second is less than 140 characters, highlighting the line that triggered the alert itself (i.e., matching keywords). However, Meltwater also has a feature that lets subscribers choose to archive the text of articles they find elsewhere -- like a ReadItLater/EverNote/Instapaper kind of thing. But that's all done at user discretion, not by Meltwater.
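To make the scale of those two snippets concrete, here's a rough sketch of what that kind of excerpting could look like. This is purely illustrative and is not Meltwater's actual code: the function names, the centering of the keyword window, and the sample text are all assumptions; only the 300-character and under-140-character limits come from the description above.

```python
# Hypothetical sketch of the two snippets described above.
# Not Meltwater's implementation; names and windowing logic are assumptions.

def opening_snippet(text: str, limit: int = 300) -> str:
    """Return the opening of the article, capped at `limit` characters."""
    return text[:limit]


def keyword_snippet(text: str, keyword: str, limit: int = 140) -> str:
    """Return a window of fewer than `limit` characters around the first match.

    Returns an empty string if the keyword does not appear.
    """
    pos = text.lower().find(keyword.lower())
    if pos == -1:
        return ""
    width = limit - 1  # "less than 140 characters"
    # Roughly center the window on the matched keyword.
    start = max(0, pos - (width - len(keyword)) // 2)
    return text[start:start + width]


if __name__ == "__main__":
    article = "A" * 500 + " merger " + "B" * 500  # stand-in for article text
    print(len(opening_snippet(article)))
    print(len(keyword_snippet(article, "merger")))
```

Either way, the point is that the excerpts are a few hundred characters out of a full article, with a link back to the original.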
Not surprisingly, we found Meltwater's response to the lawsuit quite compelling. It argued a strong fair use right, combined with claims of copyright misuse by the AP. Unfortunately, the court did not find those arguments compelling. The district court has sided with the Associated Press, granting summary judgment against Meltwater, a ruling that will surely be appealed.
The ruling itself has a ton of problems that hopefully will be fixed by an appeals court. The court seems strongly influenced by the fact that the AP has a service that competes with Meltwater, but that, alone, is no basis for finding infringement. It also notes that other "competitors" in the space do have AP licenses, including Google News. But that's misleading. The AP had threatened Google News with a similar lawsuit, and Google got around it with a silly agreement under which it licenses AP news to post full stories as if it were an AP syndication partner. But that's not what Meltwater does.
The court rejects Meltwater's fair use defense, but does so with some really wacky reasoning. It runs through the standard four-factor test, but has a unique interpretation of what counts as "transformative" use, somehow arguing that the use needs to be an entirely new form of use to be transformative:
Based on the undisputed facts in this record, Meltwater provides the online equivalent to the traditional news clipping service. Indeed, Meltwater has described itself as adding “game-changing technology for the traditional press clipping market.” There is nothing transformative about that function.
I don't think that's what transformative means. No one said it had to be innovative -- just different from the purpose of the original work.
Separately, the court argues that because Meltwater competes with one part of the AP's business, the transformative argument doesn't apply:
Meltwater copies AP content in order to make money directly from the undiluted use of the copyrighted material; this is the central feature of its business model and not an incidental consequence of the use to which it puts the copyrighted material. Thus, it is not surprising that Meltwater’s own marketing materials convey an intent to serve as a substitute for AP’s news service.
That, unfortunately, is confusing two different aspects of the AP's business. There's the reporting side which produces the news, and then there's the "news service" side, which sells the service of providing news alerts to various customers. Meltwater is competing with the latter, but the copyrights in question apply to the former -- and thus Meltwater's use should be seen as transformative since it is not competing with the copyright around the creation and presentation of the news itself, but rather with the service of finding the most relevant news.
Furthermore, the court keeps going back to the low rate of clickthrough on Meltwater's clips. But that, too, is misleading. First of all, this could just be because Meltwater's relevance engine isn't very good. If it's not delivering relevant content, then people won't click very much. But that should hardly weigh on the copyright claim. Alternatively, perhaps it's the Associated Press's content that isn't very good. That is, readers see little reason to click on the AP's versions of the story. Again, the actual clickthrough rate on such stories is somewhat meaningless and certainly doesn't suggest that people are using the AP less because of Meltwater. They may still be articles that people wouldn't see otherwise; it's just that, for whatever reason, people didn't click through on them that much. Also, you have to look at how many people use clipping services anyway -- it's not necessarily to read each and every article, but to do a quick skim and see if there's anything really important they need to know about, or to get a general sense of the coverage out there. None of those would result in clicks. But that doesn't mean Meltwater fails to increase absolute clicks to AP articles overall, simply by exposing them to more people.
In the discussion on the "amount" of the work used, the court again runs into trouble, arguing ridiculously that taking the opening (or "the lede") from every article somehow weighs against a finding of fair use:
Meltwater took between 4.5% and 61% of the Registered Articles. It automatically took the lede from every AP story. As described by AP’s Standards Editor, the lede is “meant to convey the heart of the story.” A lede is a sentence that takes significant journalistic skill to craft. There is no other single sentence from an AP story that is as consistently important from article to article –- neither the final sentence nor any sentence that begins any succeeding paragraph in the story.
In other words, Meltwater is smartly using the first sentence to highlight what an article is about. Without that, its service is a lot less useful. But that doesn't change the nature of the overall issue, which is that it's still only offering up a small portion of the content in question.
The court also seems to think it knows how to run a search engine:
Next, Meltwater argues that the extent of its copying is justified because its purpose is to serve as a search engine. But, Meltwater has failed to show that it takes only that amount of material from AP’s articles that is necessary for it to function as a search engine. Indeed, the evidence is compellingly to the contrary.
I'm curious. What is "the amount necessary to function as a search engine?" One might reasonably suggest that a search engine would be wise to index everything. Yet the court here seems to be suggesting otherwise. I'm curious how many search engines the judge has built.
Basically, Meltwater points out that what it does is no different than a search engine, and the court says (without much basis) that it doesn't think Meltwater really is a search engine, and thus these defenses don't apply. But this is extremely troubling for actual search engines, because you can take each of the pieces out and then try to apply them to a basic search engine, and you'll find that if this ruling stands, it makes being a search engine much more difficult as well.
The court also, worryingly, rejects Meltwater's argument that because the AP did not block it via robots.txt, the AP was providing an implicit license to index the site:
The implied license that Meltwater is advocating would reach to every web crawler with no distinction between those who make fair use and those who do not, or between those whose uses may be publicly observed and those whose uses are hidden within closed, subscriber systems. Meltwater has presented no evidence to suggest that robots.txt instructions are capable of communicating which types of use the copyright holder is permitting the web crawler to make of the content or the extent of the copying the copyright holder will allow.
Again, this appears to be off the mark, conflating a few separate issues. First, it's the combination of (a) publishing openly online and (b) not using robots.txt that suggests an implied license to index. The court ignores the (a) part, and treats the argument as if the license were being implied from (b) alone. Second, it ignores the fact that this is solely about the implied license to index. There could still be infringement for what's done with the index, but that's separate from the indexing itself.
The court doubles down on this mistake:
There is yet another policy reason against the use of robots.txt protocol to enforce the Copyright Act. The protocol is a helpful innovation that gives instructions to cooperating crawlers. But, in the interest of openness on the Internet, one would expect it to be used only when it is in the clear interest of the website to broadly limit access. It is fair to assume that most Internet users (and many owners of websites) would like crawlers employed by search engines to visit as many websites as possible, to include those websites in their search results, and thereby to direct viewers to a vast array of sites. Adopting Meltwater’s position would require websites concerned about improper copying to signal crawlers that they are not welcome.
Again, that's not true. Sites concerned about improper copying can still use copyright law against the actual improper copying outside of the indexing. It's never the indexing that's the problem, but what that index is used for.
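For anyone who hasn't looked at how robots.txt works in practice, here's a minimal sketch using Python's standard `urllib.robotparser` module. The crawler name `ExampleClipBot`, the example URL, and the two robots.txt files are made up for illustration; nothing here reflects the AP's actual robots.txt or Meltwater's crawler. What it shows is that the protocol only answers one question, whether a given crawler may fetch a given path, and is silent on what happens with the content afterward.

```python
from urllib.robotparser import RobotFileParser


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check whether `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse in memory, no network access
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that leaves the whole site open to every crawler.
OPEN_SITE = "User-agent: *\nDisallow:\n"

# Hypothetical robots.txt that blocks one named crawler and allows the rest.
BLOCKS_ONE_CRAWLER = (
    "User-agent: ExampleClipBot\n"
    "Disallow: /\n"
    "\n"
    "User-agent: *\n"
    "Disallow:\n"
)

if __name__ == "__main__":
    url = "https://example.com/news/story.html"  # placeholder URL
    print(may_crawl(OPEN_SITE, "ExampleClipBot", url))
    print(may_crawl(BLOCKS_ONE_CRAWLER, "ExampleClipBot", url))
    print(may_crawl(BLOCKS_ONE_CRAWLER, "OtherBot", url))
```

Note that a site can turn away a single crawler by name without shutting out everyone else, and that whatever it allows here is about access for crawling, not about downstream use.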
We had also noted, when Meltwater responded to the AP's claims, that it argued that the AP was engaged in copyright misuse. The court dismisses this point quickly. Not only is copyright misuse not established as a copyright defense in the Second Circuit, where the case is being heard, but the court argues that Meltwater has done little to show actual misuse by the AP.
There are numerous problems with this ruling, and I imagine the case will get even more interesting on appeal, as lots of internet companies who rely on fair use for scraping content suddenly need to be paying attention to the specifics of this case.