Spammers Adding Text From Books To Avoid Filters
from the tricks-of-the-trade dept
Every time the filters get better, spammers try to figure out a way around them. You might think spammers would realize that people who go to so much trouble to block them are unlikely to be happy to receive the spam, but apparently that’s not the case. Instead of focusing on the small group of people who actually are interested in spam, they’re wasting their time trying to get messages to people who will never respond to the stuff. Anyway, according to this article, now that spammers know that some filters look for certain keywords to estimate how much of an email message is likely to be spam, they’re apparently cutting and pasting text from various classic pieces of literature right into the spam. They’re figuring that since so much of the message isn’t “spammy,” the filters won’t catch it. I doubt this is going to work very well. Most spam filters look at a combination of factors, including headers, to filter out spam. Just adding more useless text on top of the actual spam should only trick a few of the most basic spam filters.
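To make the trick concrete, here is a minimal sketch of the kind of very basic filter that padding could fool: one that flags a message only when enough of its words are on a keyword list. The word list, the 20% threshold, and the function names are all made up for illustration; no real filter is being quoted here.

```python
import re

# Hypothetical keyword list and threshold, chosen only for this illustration.
SPAMMY_WORDS = {"cheap", "mortgage", "free", "winner", "viagra"}
THRESHOLD = 0.2


def spam_fraction(text):
    """Return the fraction of words in the message that are on the keyword list."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in SPAMMY_WORDS)
    return hits / len(words)


def is_spam(text, threshold=THRESHOLD):
    """Flag the message when the keyword fraction reaches the threshold."""
    return spam_fraction(text) >= threshold


pitch = "Cheap mortgage rates! Free quote, winner guaranteed!"
padding = (
    "It was the best of times, it was the worst of times, "
    "it was the age of wisdom, it was the age of foolishness"
)

print(is_spam(pitch))                  # the bare pitch is flagged
print(is_spam(pitch + " " + padding))  # the padded pitch slips under the threshold
```

The pitch is identical in both messages; only the denominator changes. A filter that also weighs headers or other signals would not depend on that ratio alone, which is why the padding should only help against the simplest filters.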
Comments on “Spammers Adding Text From Books To Avoid Filters”
Aimed at ISP/Corporate Filters
The spammers are more likely attempting to circumvent the filters put in place by ISPs and businesses for the benefit of their non-technical users (people who might ordinarily be inclined to be interested in some of the advertising).
Re: Aimed at ISP/Corporate Filters
I agree with the previous poster. I’d also add that the more spam is blocked, the more of an impact the successful spam will have, and the less competition it will have, so it might be worth the effort.
No Subject Given
I tend to agree with you that this isn’t going to have an effect on decent anti-spam solutions. However, I’ve heard that a few have already been battling this with their Bayes implementations.
funny
I laughed when I started reading literature in my spam folder, immediately realizing what it was and how futile it would be. I set my email client to display the text only, not the HTML. As a result, most spam looks like blank messages.
Now that spammers have been putting all kinds of obviously cut-and-pasted text into the text part of the message, checking the spam folder has become entertaining. I learned about the 19th century British economy last time I looked…
Any adaptive filter should easily recognize this text as statistically different from the other kinds of email you receive. The only way the added literature would succeed is if you normally receive similar text. But even that would quickly fail as the filters learned new statistical word sets that separated the spam literature from the literature that you don’t consider spam.
I’ve been pretty impressed with POPFile, a free Bayesian filter. There’s even a cool way to integrate it into Outlook called Outclass.
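The point above about adaptive filters learning the pasted literature can be sketched in a few lines. This is a toy word-count scorer in the spirit of Bayesian filtering, not POPFile's actual algorithm; the class name, the add-one smoothing, and the sample messages are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class WordCountFilter:
    """Toy adaptive filter: scores text by smoothed spam/ham word-count ratios."""

    def __init__(self):
        self.spam = Counter()
        self.ham = Counter()

    def train(self, text, is_spam):
        """Add the words of a labeled message to the matching counter."""
        (self.spam if is_spam else self.ham).update(tokens(text))

    def score(self, text):
        """Sum of log((spam count + 1) / (ham count + 1)) over the tokens.

        Positive leans spam, negative leans ham, and a word never seen
        in training contributes exactly zero.
        """
        return sum(
            math.log((self.spam[w] + 1) / (self.ham[w] + 1))
            for w in tokens(text)
        )


f = WordCountFilter()
f.train("Meeting moved to Tuesday, please bring the budget report", is_spam=False)
f.train("Cheap mortgage rates, free quote today", is_spam=True)

passage = "It was the best of times, it was the worst of times"
before = f.score(passage)  # the passage alone does not look like spam yet

# The user marks one padded message as spam, so the filter sees the passage.
f.train("Cheap mortgage rates, free quote today " + passage, is_spam=True)
after = f.score(passage)   # the same passage now counts as spam evidence

print(before, after)
```

Once a single padded message has been marked as spam, the literary words themselves start counting against the next message that carries them, unless the user's legitimate mail uses the same words often enough to keep the ham counts high.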
copyright implications?
If spammers are broadcasting excerpts from copyrighted publications, can’t someone come down on them like a ton of bricks?
Perhaps the DMCA could have a valid use…
Re: copyright implications?
Amazingly, according to the article, the spammers are using texts that are in the public domain.