Copyright Enforcement Company Uses Sketchy Algorithms And Questionable Math In Hopes Of Becoming Copyright Trolls' Go-To Resource
from the DOES-NOT-COMPUTE dept
Yet another person thinks there’s money to be made (albeit indirectly) in the copyright trolling business. (h/t to the Cyberlaw and Policy Blog)
Stephen Moignard lives a quiet life in the Coonawarra wine district in South Australia, tending his vineyard and small wine company, the Hundred of Comaum.
He also beavers away until 4am most mornings writing software for a new business venture which he’s hoping will be a global winner in the internet age.
It detects breaches of international copyright on millions of websites and produces almost instantaneous legal letters of demand.
Moignard survived the turn-of-this-century dotcom bust. He used to have a successful company that installed high-speed internet connections in office buildings, but his fortunes crashed with many others in the early 2000s.
Now, he’s looking to make some money by using an algorithm to hunt down “substantially similar” text across multiple websites and serve demand letters to alleged copyright infringers. His new business is called Plfer, and its detection algorithm bears many similarities to commercial plagiarism detection software, albeit with a few tweaks that allow it to bypass web formatting and other obstacles that might throw off comparisons.
Moignard designates “victims” as “Plferees” and those using words written by others as “Plferers.” At the site, you can view scans requested by site visitors, along with some very sketchy math used to determine potential damages. (Bad news for those of you who block Java by default: nearly the entire site is Java, so you’ll be greeted with nothing but a banner. Incredibly annoying, but presumably there to prevent people like me from copying and pasting Moignard’s words and thus becoming one of those pesky “Plferers.”)
One such example of sketchy math and questionable algorithms involves perfume site Fragrantica and a short-lived WordPress blog. Somehow, the use of Cartier-related words adds up to more than $600,000 in potential damages. [pdf link to printed report]
The report contains a lot of cool-sounding “weights” and “scores,” all of which are presumably part of Plfer’s proprietary algorithm.
Shallow scan: (stage one)
Found with string: “Cartier gained notoriety in 1904 when Louis Cartier created the first wristwatch” on search page: 0
amongst total results of: 16 (weighted value: 1.6)
with snippet: “Cartier gained notoriety in 1904 when Louis Cartier created the first wristwatch for aviator Alberto Santos-Dumont. This famous timepiece was known as the …”
Recorded on Plfer search page:fragrantica.com (in full:fragrantica.com/designers/Cartier.html)
This string was number: 16 on the page.
It has an improbability weighting of: 520.
The infringement has a duration of: 708 days.
The Plfer score is:-1741.
The Plfer score is explained on the “Getting Started” page:
The complexity of the string of text, the time between the earliest and later dates and the total number of copies in existence can be used to create a score (plfer score)(10).
The lower the number (or the larger the negative number) the more serious the breach.
After a deep scan, the plfer score is updated with many more known factors. A shallow scan plfer score should not be solely relied upon to issue infringement notices.
Using both of these, Plfer arrives at this conclusion:
The plferer earned 1164 points which is greater than the score required to amount to an ‘actionable infringement’ .
The last sentence makes no sense, but there it is. “Actionable infringement” doesn’t need a score. Either it’s infringement or it isn’t, and much of what gets highlighted by Plfer’s “Deep Scan” seems to be nothing but language that would be common to two sites covering the same subject matter. Here’s a screenshot from one Plfer report on two SEO/web design companies’ websites.
“Substantially similar” phrases include “understanding… signals algorithmically” and “reach your audience.” In the Fragrantica report discussed earlier, the “substantially similar” wording consists of phrases that would be common to any Cartier biography. (“Cartier gained notoriety in 1904 when Louis Cartier created the first wristwatch…”)
Finding matching phrases and keywords across two marketing sites and claiming it’s copyright infringement is a bit like looking over the resume of someone applying for the same position as you and claiming the similar buzzwords and job descriptions are due to your competitor reading over your shoulder.
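To see how easily this kind of match turns up, here’s a rough sketch of a naive phrase matcher. To be clear, this is not Plfer’s algorithm, which isn’t public. It’s just a generic word n-gram overlap check of the sort plagiarism detectors are commonly built on, and the two sample sentences, the function names and the three-word window are all made up for illustration.

```python
import re

def word_ngrams(text, n=3):
    """Return the set of n-word phrases in text, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def shared_phrases(a, b, n=3):
    """Phrases of n words that appear in both texts, sorted alphabetically."""
    return sorted(word_ngrams(a, n) & word_ngrams(b, n))

# Hypothetical marketing copy, written independently for two invented sites.
site_a = "We help you reach your audience by understanding how search engines rank pages."
site_b = "Our team builds websites that reach your audience and rank well in search engines."

print(shared_phrases(site_a, site_b))
```

Two sentences that nobody copied from anybody still share “reach your audience,” because that’s simply how people in the same line of work write. A matcher like this can tell you two pages overlap. It can’t tell you why.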
Now, we get to the really fun stuff: potential damages. These numbers are key to Plfer’s success. Plfer charges very minimal fees. “Deep Scans” and “Shallow Scans” run $1 each plus $0.85 in fees. There will presumably be small fees for demand letters and other forms, but the site is still in beta and no pricing is available. Plfer, notably, does not want a cut of recovered damages, which makes it less a copyright troll than a copyright troll facilitator. From Moignard’s advertorial PDF “2015 – the end of copyright?”:
Plfer differs from other online copyright service providers in that it takes no pecuniary interest in any of the copyright infringements it uncovers. It does not become a party to any of the cases it reveals but merely assists to provide evidence, pro-forma documents and “wizards” for users and their advisors.
Plfer may not partake of any damages recovered, but it still needs to sell its services. And when a scan returns an amount in the low hundreds, it still looks like a bargain because the infringed party only spent a few bucks in return for this “evidence” of “actionable infringement.” (The PDF quoted above also hints at Plfer entering into mutually-beneficial contracts with IP-oriented law firms, but there appears to be nothing in place at the moment.)
In the case of Fragrantica, the potential damages are huge. Here’s the “math” behind the massive number.
The total value of fragrantica is $ 2,389,600 according to Alexa.com and WorthOfWeb.com. We have calculated the plferee’s actual losses as follows:
Our daily advertising income is valued at a minimum of $3314. The proportion of our site contained in parentalstyle.wordpress.com is 5.51%, giving a proportionate advertising revenue loss of $182.60 per day.
The value of this loss over 708 days is therefore $129280.8 USD. Applying a penalty multiplier of 5 times gives a total fair and just actual damages amount of $646,404.00 USD. A standard fee for enforcing an infringement of this nature and degree is $1,998.00 USD.
The total amount payable is therefore $1,998.00 + $646,404.00 = $648,402.00 USD.
Plferer Alexa ranking: 15,105,799
Plferer value: 64
Plferee Alexa ranking: 8,185
Plferee value: 2389600
Duration (years): + 1.94
Penalty: + 646404.00
Fee: + 1998.00
Total: + 648,402.00
That’s some, um, interesting math, especially when the “plifering” site ranks roughly 15 million places lower than the “victim” and would probably never surface in a search for Cartier products, which would seem to make it more difficult to claim damages. Sure, Fragrantica could pursue this payout and present Plfer’s proprietary Alexa math to a judge, but the numbers presented here as mathematically sound are well past speculative.
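For what it’s worth, the arithmetic itself can be retraced. The sketch below plugs in the figures quoted from the report and assumes, as the report’s numbers suggest, that the daily loss is rounded to the cent before being multiplied out. The function and variable names are mine, not Plfer’s.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

def report_damages(daily_income, share_pct, days, multiplier, fee):
    """Retrace the report's chain: daily loss -> loss over the period -> penalty -> total."""
    # Assumption: the daily loss is rounded to cents first, as the report's $182.60 implies.
    daily_loss = (daily_income * share_pct / 100).quantize(CENT, rounding=ROUND_HALF_UP)
    loss = daily_loss * days
    penalty = loss * multiplier
    return {"daily_loss": daily_loss, "loss": loss, "penalty": penalty, "total": penalty + fee}

# Inputs taken from the quoted report: $3314/day, 5.51%, 708 days, 5x multiplier, $1,998 fee.
figures = report_damages(Decimal("3314"), Decimal("5.51"), 708, 5, Decimal("1998.00"))
for name, value in figures.items():
    print(f"{name}: ${value:,.2f}")
```

The multiplication holds up: $182.60 a day for 708 days is $129,280.80, five times that is $646,404.00, and the fee brings it to $648,402.00. The problem is everything fed into it. The daily income figure, the 5.51% share and the 5x “penalty multiplier” are asserted rather than justified, and the report values the supposed infringer’s entire site at $64.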
Going beyond the sketchy math, there’s the reality of the situation. Has anyone ever made money going after “scrapers,” who “republish” posts of others in their entirety and whose sites contain 100% infringing material? Of course not. Smaller infringements like these — which are closer to plagiarism than copyright infringement — won’t be moneymakers either. Plfer might have limited success selling $1 scans to the curious and litigiously stupid, but it’s not going to change the face of copyright enforcement, much less supplant Moignard’s vineyard as his primary moneymaker.
So, why is Moignard doing this? Well, according to his own statements, it appears to be some sort of crusade against the internet’s “devaluing” of copyright-protected content. In the FAQ, under the heading “Is copyright evil?,” Moignard first points out that copyright isn’t a moral right…
[C]opyright, like all intellectual property rights, is an incentive device, designed to elicit more of certain kinds of ‘learning’ or knowledge creation and certain kinds of knowledge processing by government, rather than being any fundamental sort of moral right…
… before going on to make this a moral issue by quoting two supposed copyright opponents (at least one of whom will be very familiar to Techdirt readers)…
For instance, Mike Masnick at TechDirt says:
“People copy stuff all the time, because it’s a natural and normal thing to do. People make copies because it’s convenient and it serves a purpose — and quite often they know that doing so causes no harm in those situations.”
There are a raft of similar postings by annonymous file-sharing fans such as Enigmax [TorrentFreak], who argues that all information should be free and authors should not receive anything.
… and summing it up by claiming the high ground.
Plfer stands in total opposition to the Enigmaxs and Mike Masnick’s of this world, and can prove that the technology that makes copying easy also makes prosecuting infringers just as easy.
He also presents the copyright industry’s attitude towards technological advancement in a far better light than it deserves, while simultaneously portraying innovation as an “attack” on rightholders. (From the “End of copyright” PDF.)
Digital ‘internet’ transmissions have obviously increased the risk that copyrighted works will be ‘reproduced’ and ‘distributed’ in violation of the exclusive rights granted to copyright owners. Copyright law, however, has withstood attacks from other developing media.
Specifically, copyright has coped with the invention of broadcast media, copy machines, and the video cassette recorder, and technology is assisting copyright law to step up again today.
Yeah, if by “coped” you mean “pushed for favorable legislation” and “sued endlessly.” That’s not coping. That’s finally relenting to the inevitable because you’ve exhausted all your options.
Plfer is positioning itself as a “volume” business, making money from quantity rather than quality.
Its developers’ are assuming that the sheer volume of infringements will enable it to generate significant income despite offering these services at a fraction of the cost of equivalent legal advice.
This puts it in the same group as copyright trolls like Malibu Media and Prenda Law, even if it doesn’t directly benefit from settlements and awarded damages. What it hopes to do is become the starting point for aspiring copyright trolls, using questionable algorithms and damage assessments. It even wants to further limit fair use protections — again, by using some questionable rationalizations.
With the increasingly commercial nature of all aspects of the public internet and the “monetisation” of site traffic via ubiquitous advertising services such as Google™ AdSense™ and other variants, it is difficult to argue any part of the internet is truly “non-commercial” and so the application of the “fair use” defence would seem to remain limited.
Fair use isn’t limited to non-commercial enterprises. This misconception refuses to die, and self-proclaimed copyright enforcers like Plfer are doing their best — either out of spite or ignorance — to keep it alive. You can make money and still avail yourself of the fair use defense.
Plfer is a mess. Moignard may be ambitious, but his “solution” to small-time infringement will either become another also-ran or the tool of copyright trolls. Nothing here points to any other outcome.