Kindle Spam Is A Filter Issue, Not A Spam Issue

from the filter-away dept

Via Slashdot, we learn that spammers have discovered the ability to publish cheap “ebooks”:

Thousands of digital books, called ebooks, are being published through Amazon’s self-publishing system each month. Many are not written in the traditional sense.

Instead, they are built using something known as Private Label Rights, or PLR content, which is information that can be bought very cheaply online then reformatted into a digital book.

These ebooks are listed for sale, often at 99 cents, alongside more traditional books on Amazon’s website, forcing readers to plow through many more titles to find what they want.

The article makes it sound like this is a big problem, calling it “the dark side” of self-publishing, but I don’t get it. Assuming no one wants this crap, then it seems likely that Amazon will start to filter it out of any search results or top lists.

There is some slightly more legitimate concern about outright plagiarism, where some of these “spammers” are merely copying other books and then re-branding them and selling them as ebooks. But, once again, this seems like a filter problem more than anything else. In fact, I’m a bit surprised that Amazon doesn’t do a basic check to make sure the content of an ebook hasn’t already been offered by someone else, and do a further investigation if that’s the case. Others have suggested that Amazon charge a small fee to upload a book, as that might prevent spammers from going crazy with such copies, and that could make sense as well. I just have trouble believing that this is such a serious “problem” that it can’t easily be stopped.
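
As a rough illustration of the kind of basic check described above, here is a minimal Python sketch that compares a new submission against texts already in a catalog using word-shingle overlap (Jaccard similarity), and flags close matches for human review rather than blocking them. The function names, the 5-word shingle size, and the 0.8 threshold are all assumptions made for illustration; nothing here reflects how Amazon actually handles submissions.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def flag_for_review(submission, catalog, threshold=0.8):
    """Return titles of catalog entries that closely match the submission.

    catalog maps title -> full text (a hypothetical structure). A match only
    triggers further investigation; it is not proof of plagiarism.
    """
    new = shingles(submission)
    return [title for title, text in catalog.items()
            if jaccard(new, shingles(text)) >= threshold]
```

Comparing every submission against every stored text this way would be slow at store scale, so a real system would presumably need an index of some kind; the point is only that the comparison itself is not exotic.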

Companies: amazon


Comments on “Kindle Spam Is A Filter Issue, Not A Spam Issue”

26 Comments
Chargone (profile) says:

Re: Re:

you say that as if the absolute quantity of signal is reducing. it’s not.

also: that’s what filters are for. give each entry a bunch of filter tags for various things and let the person looking for stuff browse by filter category as well as searching. or even search within a category. in a lot of contexts good categorisation is more useful than a search engine for finding what you want. especially when you’ve got a less than complete understanding of what that Is.
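
To make the tag idea above concrete, here is a minimal Python sketch of browsing by filter category and then searching within it. The entry layout, the tag names (including a "plr" tag), and the `filter_by_tags` helper are all hypothetical, chosen only to illustrate the suggestion.

```python
def filter_by_tags(entries, required_tags, query=""):
    """Return entries that carry every required tag and whose title contains query."""
    required = set(required_tags)
    query = query.lower()
    return [e for e in entries
            if required <= set(e["tags"]) and query in e["title"].lower()]


# Made-up sample data for illustration only.
books = [
    {"title": "Sourdough Basics", "tags": {"cooking", "original"}},
    {"title": "Bread Secrets", "tags": {"cooking", "plr"}},
    {"title": "Garden Basics", "tags": {"gardening", "original"}},
]

# Browse one category, then search within it.
cooking_originals = filter_by_tags(books, {"cooking", "original"})
basics_in_cooking = filter_by_tags(books, {"cooking"}, query="basics")
```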

Michael Long (profile) says:

Re: Re: Re:

That’s why I think Amazon should require a $99 “publishing fee” per book.

In fact, requiring a fee would be the “first filter”. If you don’t think a book is good enough to earn back $99, it’s probably not good enough to be on the store in the first place.

Some guy has spammed the store with over 8,000 titles ripped off from hither and yon and selling for a buck a head. Would he have still done that if doing so would have cost him nearly $800,000 up front?

Anonymous Coward says:

Re: Re: Re:

You said: “you say that as if the absolute quantity of signal is reducing. it’s not.”

Me: I don’t think so. It is a question of ratio. If you have 100 good works, and 10 bad ones, your signal to noise ratio is high and you have no problems. But if that shifts to 100 good works and 1000 bad ones, the chance that you find the signal (same level as before) is very low.
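
The arithmetic behind that ratio argument can be checked with a few lines of Python. This is only a sketch of the numbers used above (100 good works against 10 or 1,000 bad ones); `signal_fraction` is a made-up helper name, and it assumes a reader picks a listing uniformly at random, which real search results do not do.

```python
def signal_fraction(good, bad):
    """Fraction of all listed works that are good."""
    return good / (good + bad)


before = signal_fraction(100, 10)    # 100 of 110 listings
after = signal_fraction(100, 1000)   # 100 of 1,100 listings
print(f"{before:.1%} -> {after:.1%}")  # prints "90.9% -> 9.1%"
```

The absolute amount of signal is unchanged in both cases, which is the earlier commenter's point; what drops is the share of listings that are signal.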

In a world where anyone can publish anything at any time with little or no real cost, you will get more noise. Spammers, jammers, and scammers will figure out how to make money in the noise, and the noise increases.

One only has to look at the number of twitter bots, automated posters, automated follow bots, and auto-retweeters to see where the noise comes from.

Bruce Partington says:

“I’m a bit surprised that Amazon doesn’t do a basic check to make sure the content of an ebook hasn’t already been offered by someone else, and do a further investigation if that’s the case.”

Hmmm, so you expect Amazon to have a searchable list of all ebook content that submissions are compared to. But you regularly ridicule people who expect YouTube or torrent sites to have the same thing for copyrighted materials.

Anonymous Coward says:

Re: Re:

“Hmmm, so you expect Amazon to have a searchable list of all ebook content that submissions are compared to.”

He doesn’t expect it to be a government imposed requirement, just a good business model suggestion. Also, Youtube already does such basic checks that attempt to identify infringing materials, but there are almost always ways around these checks. MM suggests that Amazon should do more to reduce plagiarism, not that it will ever be stopped completely.

“But you regularly ridicule people who expect YouTube or torrent sites to have the same thing for copyrighted materials.”

Maybe copyright shouldn’t exist to begin with.

The eejit (profile) says:

Re: Re:

You missed a spot:

“do a further investigation”.

In Viacom v. YouTube, Viacom was expecting all their content to be removed without investigation.

Can you spot the difference between the two?

I do agree, however, that it would be silly to police on the behalf of content ‘creators’, as that would be a dangerous precedent (if not legally, then rationally).

Anonymous Coward says:

Re: Re:

Amazon is perfectly in the right to allow spam books to be released and indexed en masse. It’s a business choice if they pre-police their content.

YouTube is also perfectly in the right if they choose not to police their uploads either, because it is ALSO a business choice to pre-police their content.

The problem here, that you don’t seem to understand, is that no one has the right to demand that Amazon or YouTube make the choice one way or another. However, if the problem really is bad enough for Amazon that it drives customers away, then it is a problem that a smart business would address.

Mike Masnick (profile) says:

Re: Re:

“Hmmm, so you expect Amazon to have a searchable list of all ebook content that submissions are compared to. But you regularly ridicule people who expect YouTube or torrent sites to have the same thing for copyrighted materials.”

No. I expect Amazon to have the database of existing ebooks in its store, because it already has that. That’s all.

And then I’m not expecting them to compare for copyright issues and automatically block, but as I said (I thought clearly, but perhaps not), all it should trigger is further investigation.

Not sure where the confusion comes in. Apologies if I wasn’t clear.

Anonymous Coward says:

Re: Re: Re:

“I expect Amazon to have the database of existing ebooks in its store”

Exactly, there is a difference between knowing whether you have duplicate information and knowing what information does and doesn’t infringe. It’s much easier to know that you have duplicate information than it is to magically know that some segment of information infringes. The latter requires a psychic, the former simply requires some comparison tools.
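
For what such a comparison tool might look like in its simplest form, here is a minimal Python sketch that groups books whose text is identical after trivial normalization. It uses only the standard library (`hashlib`, `re`); the `fingerprint` and `find_duplicates` names and the title-to-text mapping are assumptions for illustration. Note that it only answers "is this a duplicate of something already in the store?" and says nothing about who, if anyone, is infringing.

```python
import hashlib
import re


def fingerprint(text):
    """SHA-256 of the text after lowercasing and collapsing whitespace."""
    normalized = re.sub(r"\s+", " ", text.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicates(books):
    """Group titles whose normalized text is identical.

    books maps title -> full text (a hypothetical structure). Returns a list
    of title lists, one per fingerprint shared by more than one book.
    """
    groups = {}
    for title, text in books.items():
        groups.setdefault(fingerprint(text), []).append(title)
    return [titles for titles in groups.values() if len(titles) > 1]
```

An exact hash like this is easy to defeat by changing a single word, so it would catch only the laziest re-branding; fuzzier comparisons are a separate and harder problem.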

Anonymous Coward says:

Re: Re:

Unsurprisingly, you seem to miss the point entirely.

Implementing a filter to improve search results for ease-of-access to customers is a business decision. It’s not about copyright – just profits. I find it hard to believe Mike seriously thinks Amazon should do any sort of policing over the content itself, only that they should verify whether the content in question is indeed what it’s labeled as. If for no other reason than to avoid false advertising repercussions.

If you’re trying to find an e-book and you get 40 false results for every positive it’s going to be pretty annoying pretty quick to look for what you want. Find a different vendor who has only positive results and you’ll probably just shop from there instead and give them any future business.

The biggest question is: how much money is Amazon making off the sale of all the spam? I would have to guess not very much, but if it’s somehow generating revenue for them then no point shutting it down.

Asking third-party aggregate sites such as youtube or torrents to police and enforce copyright law is first and foremost granting them too much authority to declare what is or is not infringing. Second, the cost of implementing any form of workforce to go over the amount of data being uploaded to these sites would make it impossible for any sustainable service of the sort.

You cannot reasonably scrutinize thousands of terabytes of data without creating digitized signatures of the files that are being uploaded. If the file has a copyright on it then you are thereby violating that copyright by using an unauthorized copy of the work in your filtering software without express written permission by the content holder.

Either way – it’s all a matter of profits. Do whatever makes you the most.

Anonymous Coward says:

“In fact, I’m a bit surprised that Amazon doesn’t do a basic check to make sure the content of an ebook hasn’t already been offered by someone else, and do a further investigation if that’s the case.”

Why should someone have some exclusive right to publish a book just because they wrote it? If you wrote a book and someone does a better job of publishing it than you, why do you deserve to make any money from it at all? Execution matters, not the idea. Besides which, if someone sees your book published by someone else and likes it, and you can’t figure out how to make money that way, whose fault is that? They are giving you free promotion by selling your book for you. Get with the times, copyright maximalists!

Anonymous Coward says:

This is not a new problem...

…all that’s new is that the press is now reporting it. Those of us who work in the anti-spam arena have known about it for quite some time, have alerted Amazon, and quietly provided them with some free consulting advice on how to put a stop to it. It’s unfortunate that they haven’t used that advice, doubly so given that it comes from people vastly more experienced than anyone on their staff, but that’s their choice.

Unsurprisingly, the same scumbags who are engaged in this are also engaged in other abuse: link farming, content farming, SEO, Usenet spam, email spam, IM spam, text spam, etc. And one of their current strategies seems to be to combine these modalities into integrated “campaigns” designed to annoy as many people as possible.

Gene Cavanaugh (profile) says:

Amazon filtering based on content

WAIT! Please explain. Having search engines do this sort of filtering is wrong, they aren’t responsible for content; but requiring it (or encouraging, semantics again) from Amazon is reasonable?
I get the impression that we are in the famous “I don’t know how to distinguish it, but I know it when I see it” (not an exact quote, I am sure) territory.
Wouldn’t it be nice if we had consistency? Too much to ask, I assume.

Joel (user link) says:

As a legitimate author, I can attest to the millions of crap books that proliferate in the marketplace. Spammers tend to ruin everything that is internet related. Look at Pinterest, Facebook and any number of social sites.

The problem is that it is like sticking your finger in a dam. Another hole quickly opens up. I applaud Amazon for tightening up their publishing standards and at least removing the PLR books people were submitting. But they are running uphill against this epidemic….
