Kindle Spam Is A Filter Issue, Not A Spam Issue

from the filter-away dept

Via Slashdot, we learn that spammers have discovered the ability to publish cheap "ebooks":
Thousands of digital books, called ebooks, are being published through Amazon's self-publishing system each month. Many are not written in the traditional sense.

Instead, they are built using something known as Private Label Rights, or PLR content, which is information that can be bought very cheaply online then reformatted into a digital book.

These ebooks are listed for sale - often at 99 cents - alongside more traditional books on Amazon's website, forcing readers to plow through many more titles to find what they want.
The article makes it sound like this is a big problem, calling it "the dark side" of self-publishing, but I don't get it. Assuming no one wants this crap, then it seems likely that Amazon will start to filter it out of any search results or top lists.

There is some slightly more legitimate concern about outright plagiarism, where some of these "spammers" are merely copying other books and then re-branding them and selling them as ebooks. But, once again, this seems like a filter problem more than anything else. In fact, I'm a bit surprised that Amazon doesn't do a basic check to make sure the content of an ebook hasn't already been offered by someone else, and do a further investigation if that's the case. Others have suggested that Amazon charge a small fee to upload a book, as that might prevent spammers from going crazy with such copies, and that could make sense as well. I just have trouble believing that this is such a serious "problem" that it can't easily be stopped.
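One way to picture the "basic check" is a near-duplicate comparison that only flags a submission for human review rather than blocking it. The sketch below is an assumption about how such a check could work, not a description of anything Amazon does: the function names, the five-word shingle size, and the 0.5 threshold are all illustrative.

```python
import re


def shingles(text: str, size: int = 5) -> set[tuple[str, ...]]:
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short texts become a single shingle (or nothing at all).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: shared shingles divided by all distinct shingles."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def needs_review(new_text: str, catalog: dict[str, str],
                 threshold: float = 0.5) -> list[str]:
    """Return catalog titles whose text overlaps new_text at or above threshold.

    A non-empty result means "look closer", not "reject".
    """
    new_shingles = shingles(new_text)
    return [
        title
        for title, text in catalog.items()
        if jaccard(new_shingles, shingles(text)) >= threshold
    ]
```

Comparing a submission against every existing title this way would be slow at store scale. A real system would need some kind of index, but the idea of "compare, then investigate" is the same.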


Reader Comments

  1.  
    Anonymous Coward, Jun 20th, 2011 @ 10:07pm

    Calling Mike Masnick.... Clean up needed in the "small-businesses-are-the-backbonehead dept".

     

  2.  
    Anonymous Coward, Jun 20th, 2011 @ 10:16pm

    Re:

    Yes, because only big business is a good thing.

     

  3.  
    Anonymous Coward, Jun 20th, 2011 @ 10:47pm

    It seems that this is the start of the problem of signal to noise. More and more noise, drowning out whatever signal is left.

     

  4.  
    Chargone (profile), Jun 20th, 2011 @ 11:03pm

    Re:

    you say that as if the absolute quantity of signal is reducing. it's not.

    also: that's what filters are for. give each entry a bunch of filter tags for various things and let the person looking for stuff browse by filter category as well as searching. or even search within a category. in a lot of contexts good categorisation is more useful than a search engine for finding what you want. especially when you've got a less than complete understanding of what that is.

     

  5.  
    Bruce Partington, Jun 20th, 2011 @ 11:06pm

    "I'm a bit surprised that Amazon doesn't do a basic check to make sure the content of an ebook hasn't already been offered by someone else, and do a further investigation if that's the case."

    Hmmm, so you expect Amazon to have a searchable list of all ebook content that submissions are compared to. But you regularly ridicule people who expect YouTube or torrent sites to have the same thing for copyrighted materials.

     

  6.  
    Anonymous Coward, Jun 20th, 2011 @ 11:11pm

    Re:

    Boom

    Head shot

     

  7.  
    Anonymous Coward, Jun 20th, 2011 @ 11:12pm

    Re:

    One man's noise is another man's signal.

     

  8.  
    Anonymous Coward, Jun 20th, 2011 @ 11:28pm

    Re:

    "Hmmm, so you expect Amazon to have a searchable list of all ebook content that submissions are compared to."

    He doesn't expect it to be a government imposed requirement, just a good business model suggestion. Also, Youtube already does such basic checks that attempt to identify infringing materials, but there are almost always ways around these checks. MM suggests that Amazon should do more to reduce plagiarism, not that it will ever be stopped completely.

    "But you regularly ridicule people who expect YouTube or torrent sites to have the same thing for copyrighted materials."

    Maybe copyright shouldn't exist to begin with.

     

  9.  
    The eejit (profile), Jun 20th, 2011 @ 11:35pm

    Re:

    You missed a spot:

    "do a further investigation".

    In Youtube v Viacom, Viacom was expecting all their content to be removed without investigation.

    Can you spot the difference between the two?

    I do agree, however, that it would be silly to police on the behalf of content 'creators', as that would be a dangerous precedent (if not legally, then rationally).

     

  10.  
    The eejit (profile), Jun 20th, 2011 @ 11:36pm

    Re: Re:

    And as for you, you spineless Coward, got anything productive to add to this discussion, or would you rather play the Taliban in CoD: MW2?

    Oh, wait, YOU CAN'T.

     

  11.  
    Anonymous Coward, Jun 21st, 2011 @ 12:11am

    Re:

    Amazon is perfectly in the right to allow spam books to be released and indexed en masse. It's a business choice if they pre-police their content.

    YouTube is also perfectly in the right if they choose not to police their uploads either, because it is ALSO a business choice to pre-police their content.

    The problem here, which you don't seem to understand, is that no one has the right to demand that Amazon or YouTube make the choice one way or another. However, if the problem really is bad enough that it drives customers away from Amazon, then it is a problem that a smart business would address.

     

  12.  
    Michael Long (profile), Jun 21st, 2011 @ 12:38am

    Re: Re:

    That's why I think Amazon should require a $99 "publishing fee" per book.

    In fact, requiring a fee would be the "first filter". If you don't think a book is good enough to earn back $99, it's probably not good enough to be on the store in the first place.

    Some guy has spammed the store with over 8,000 titles ripped off from hither and yon and selling for a buck a head. Would he have still done that if doing so would have cost him roughly $800,000 up front?

     

  13.  
    Anonymous Coward, Jun 21st, 2011 @ 1:13am

    "In fact, I'm a bit surprised that Amazon doesn't do a basic check to make sure the content of an ebook hasn't already been offered by someone else, and do a further investigation if that's the case."

    Why should someone have some exclusive right to publish a book just because they wrote it? If you wrote a book and someone does a better job of publishing it than you, why do you deserve to make any money from it at all? Execution matters, not the idea. Besides which, if someone sees your book published by someone else and likes it, and you can't figure out how to make money that way, whose fault is that? They are giving you free promotion by selling your book for you. Get with the times, copyright maximalists!

     

  14.  
    Mike Masnick (profile), Jun 21st, 2011 @ 1:23am

    Re:

    "Hmmm, so you expect Amazon to have a searchable list of all ebook content that submissions are compared to. But you regularly ridicule people who expect YouTube or torrent sites to have the same thing for copyrighted materials."

    No. I expect Amazon to have the database of existing ebooks in its store, because it already has that. That's all.

    And then I'm not expecting them to compare for copyright issues and automatically block, but as I said (I thought clearly, but perhaps not), all it should trigger is further investigation.

    Not sure where the confusion comes in. Apologies if I wasn't clear.

     

  15.  
    Anonymous Coward, Jun 21st, 2011 @ 3:22am

    This is not a new problem...

    ...all that's new is that the press is now reporting it. Those of us who work in the anti-spam arena have known about it for quite some time, have alerted Amazon, and quietly provided them with some free consulting advice on how to put a stop to it. It's unfortunate that they haven't used that advice, doubly so given that it comes from people vastly more experienced than anyone on their staff, but that's their choice.

    Unsurprisingly, the same scumbags who are engaged in this are also engaged in other abuse: link farming, content farming, SEO, Usenet spam, email spam, IM spam, text spam, etc. And one of their current strategies seems to be to combine these modalities into integrated "campaigns" designed to annoy as many people as possible.

     

  16.  
    Anonymous Coward, Jun 21st, 2011 @ 4:44am

    Re:

    Oooh... engineer talk. Cool.

     

  17.  
    Anonymous Coward, Jun 21st, 2011 @ 5:30am

    Re: Re:

    "I expect Amazon to have the database of existing ebooks in its store"

    Exactly, there is a difference between knowing whether you have duplicate information and knowing what information does and doesn't infringe. It's much easier to know that you have duplicate information than it is to magically know that some segment of information infringes. The latter requires a psychic, the former simply requires some comparison tools.

     

  18.  
    Anonymous Coward, Jun 21st, 2011 @ 6:04am

    Re:

    Unsurprisingly, you seem to miss the point entirely.

    Implementing a filter to improve search results for ease-of-access to customers is a business decision. It's not about copyright - just profits. I find it hard to believe Mike seriously thinks Amazon should do any sort of policing over the content itself, only that they should verify whether the content in question is indeed what it's labeled as, if for no other reason than to avoid false advertising repercussions.

    If you're trying to find an e-book and you get 40 false results for every positive it's going to be pretty annoying pretty quick to look for what you want. Find a different vendor who has only positive results and you'll probably just shop from there instead and give them any future business.

    The biggest question is: how much money is Amazon making off the sale of all the spam? I would have to guess not very much, but if it's somehow generating revenue for them then no point shutting it down.

    Asking third-party aggregate sites such as youtube or torrents to police and enforce copyright law is first and foremost granting them too much authority to declare what is or is not infringing. Second, the cost of implementing any form of workforce to go over the amount of data being uploaded to these sites would make it impossible for any sustainable service of the sort.

    You cannot reasonably scrutinize thousands of terabytes of data without creating digitized signatures of the files that are being uploaded. If the file has a copyright on it then you are thereby violating that copyright by using an unauthorized copy of the work in your filtering software without express written permission by the content holder.

    Either way - it's all a matter of profits. Do whatever makes you the most.

     

  19.  
    Anonymous Coward, Jun 21st, 2011 @ 7:03am

    Re:

    Says the man who has obviously never written anything, but has stolen much.

     

  20.  
    Jake, Jun 21st, 2011 @ 8:33am

    Re: Re: Re:

    $99 is a bit on the high side, unless it were to take the form of a security deposit to be forfeit if your ebook had to be removed from their listings for spam or some other offence, but the basic idea seems sensible.

     

  21.  
    Anonymous Coward, Jun 21st, 2011 @ 8:55am

    Re: Re:

    You said: "you say that as if the absolute quantity of signal is reducing. it's not."

    Me: I don't think so. It is a question of ratio. If you have 100 good works, and 10 bad ones, your signal to noise ratio is high and you have no problems. But if that shifts to 100 good works and 1000 bad ones, the chance that you find the signal (same level as before) is very low.

    In a world where anyone can publish anything at any time with little or no real cost, you will get more noise. Spammers, jammers, and scammers will figure out how to make money in the noise, and the noise increases.

    One only has to look at the number of twitter bots, automated posters, automated follow bots, and auto-retweeters to see where the noise comes from.
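The ratio argument in this comment can be checked with a few lines of arithmetic. This is a minimal sketch that assumes a reader picks titles uniformly at random with no filtering or ranking at all, which is exactly the assumption the article disputes. The function name is made up for illustration.

```python
def signal_share(good: int, bad: int) -> float:
    """Fraction of listings that are 'signal' when titles are picked at random."""
    total = good + bad
    if total == 0:
        raise ValueError("catalog is empty")
    return good / total


print(f"{signal_share(100, 10):.1%}")    # 100 good works, 10 bad ones
print(f"{signal_share(100, 1000):.1%}")  # 100 good works, 1000 bad ones
```

The absolute amount of signal (100 works) is the same in both calls. Only the share changes, which is why the quality of the filter matters more than the raw count of bad titles.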

     

  22.  
    Atkray (profile), Jun 21st, 2011 @ 9:09am

    Re: Re: Re:

    While I don't agree with someone spamming 8000 titles, I find myself back at square one with spam: if it didn't work, people wouldn't do it.

    Just looking at the raw numbers, 8000 titles at a buck apiece seems like it could provide a pretty decent revenue stream even at a low percentage.

     

  23.  
    Gene Cavanaugh (profile), Jun 21st, 2011 @ 9:58am

    Amazon filtering based on content

    WAIT! Please explain. Having search engines do this sort of filtering is wrong, since they aren't responsible for content; but requiring it (or encouraging it, semantics again) from Amazon is reasonable?
    I get the impression that we are in the famous "I don't know how to distinguish it, but I know it when I see it" territory (not an exact quote, I am sure).
    Wouldn't it be nice if we had consistency? Too much to ask, I assume.

     

  24.  
    Chris Rhodes (profile), Jun 21st, 2011 @ 11:01am

    Re: Amazon filtering based on content

    Did he say require? I took his statement to mean it would make good business sense for Amazon to root out poor quality items.

     

  25.  
    Griff (profile), Jun 21st, 2011 @ 4:30pm

    Re:

    I thought he was saying just check the content hasn't already been uploaded once before.
    No need for an "official list". Just hash the content of every book at upload and block any whose hash is not unique.

    You can be pretty sure that among these 1000's of PLR books there will be a lot of duplication.
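The "hash every book at upload" idea can be sketched in a few lines. This is only an illustration under stated assumptions: `UploadRegistry` is a hypothetical in-memory stand-in for a store's records, and lowercasing plus whitespace collapsing is one arbitrary choice of normalization. `hashlib.sha256` is the standard-library call.

```python
import hashlib


def content_hash(text: str) -> str:
    """SHA-256 of the text after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class UploadRegistry:
    """Hypothetical in-memory record of content hashes already accepted."""

    def __init__(self) -> None:
        self._seen: dict[str, str] = {}  # hash -> title of the first upload

    def submit(self, title: str, text: str) -> bool:
        """Accept the upload and return True unless identical content exists."""
        digest = content_hash(text)
        if digest in self._seen:
            return False  # same content already uploaded under another title
        self._seen[digest] = title
        return True
```

An exact hash only catches verbatim copies. Changing a single word produces a different digest, so a spammer who lightly edits each PLR copy would slip past this check.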

     

  26.  
    Joel, Apr 20th, 2013 @ 4:56pm

    As a legitimate author, I can attest to the millions of crap books that proliferate the marketplace. Spammers tend to ruin everything that is internet related. Look at Pinterest, Facebook and any number of social sites.

    The problem is that it is like sticking your finger in a dam. Another hole quickly opens up. I applaud Amazon for tightening up their publishing standards and at least removing the PLR books people were submitting. But they are running uphill against this epidemic....

     
