In The Last Six Months Techdirt’s Antispam Algorithm Has Stopped Over A Million Spam Comments; Should We Lose 230 Protections For That?
The Supreme Court is currently deliberating whether or not algorithms deserve protections under Section 230. And I hear from lots of people that maybe Section 230 wasn’t meant to cover algorithmic policing and recommendations of content. But that’s utter nonsense.
The whole area of content moderation first came about as a response to the earliest versions of spam. And one thing that people learned quite quickly is that you can’t manually police for spam if your site has even the slightest level of popularity. It will get flooded.
This is why any site needs to have some sort of automation to deal with spam. Indeed, for years, Techdirt has actually been using a combination of multiple different tools and setups to fight spam comments, but I’d never really looked into the numbers until just recently (mostly on a whim because I found the setting where those stats are!), and it’s kinda stunning. First off, we get way more spam attempts than I had even realized.
In the last six months alone, Techdirt received over 1.3 million attempts to spam our comments. That’s compared to the slightly over 40,000 legitimate comments we received in the same period. Here’s a chart of the spam comments per month:
I have no idea why spam grew so rapidly in January and February before falling in March, but even with 125,000+ spam messages in March, it completely dwarfs the amount of legitimate comments we got. The only possible way to keep up is to use automated systems. And, to be clear, while some small percentage of spam does get through, and we have a few legit comments caught in the spam filter, I’d argue we do a pretty good job of catching most spam, and allowing through most comments.
But, the larger point: without the multiple algorithmic systems we use to catch spam, we’d never be able to manage that amount of spam. Hell, we couldn’t handle manually dealing with less than 1% of the spam we actually get attempted right now. It would overwhelm us.
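As a rough back-of-the-envelope check on those figures, here is a minimal sketch. It treats the rounded numbers above as exact and assumes the six-month window is about 182 days; both are assumptions, so the output is only an order-of-magnitude guide.

```python
# Figures quoted in the post (rounded); the 182-day window is an assumption.
SPAM_ATTEMPTS = 1_300_000
LEGIT_COMMENTS = 40_000
DAYS = 182


def spam_per_legit(spam: int, legit: int) -> float:
    """Spam attempts received for every legitimate comment."""
    return spam / legit


def manual_load_per_day(spam: int, share: float, days: int) -> float:
    """Spam comments per day a human would face if `share` of spam were handled by hand."""
    return spam * share / days


if __name__ == "__main__":
    ratio = spam_per_legit(SPAM_ATTEMPTS, LEGIT_COMMENTS)
    load = manual_load_per_day(SPAM_ATTEMPTS, 0.01, DAYS)
    print(f"{ratio:.1f} spam attempts per legitimate comment")
    print(f"about {load:.0f} spam comments a day at just 1% of the volume")
```

Under those assumptions, even 1% of the spam works out to dozens of comments to review by hand every single day, on top of everything else.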
So if the courts (or, horror of horrors, Congress) were to decide that “algorithms” no longer are protected under Section 230, it would destroy Techdirt. While the 1st Amendment would eventually protect us, the lack of 230 protections would make using a spam filter a liability that would open up the risk of having to fight a full legal battle just to prove our right to block spam comments.
As such, our choices would be to turn off the algorithms and let spam flow, shut down our comments entirely, or risk ruinous lawsuits for the “harm” of trying to stop spam with an automated filter.
Technological filters (i.e., algorithms) should obviously be protected by Section 230, because without them, we lose the ability to fight spam, and the amount of such content is truly overwhelming. And that’s just for us, a pretty small site. Imagine how larger sites are dealing with this stuff.
spam is evil
Without spam countermeasures there would be no forum. I used to use Usenet quite a bit as a forum. Then spammers flooded it so badly it wasn’t any good for anything else.
I would mostly agree (before the site migration I don’t think I ever had a comment caught). After the migration, it seems that some small percentage (5-10%, maybe?) fail to go through. Sometimes there’s a weird 429, which basically only occurs when I stop browsing to read an article, and then post a comment. If I go through and refresh several pages at once to look for new comments… it works fine. It also does seem like the comment system has been less ornery in the last month or so (but maybe that’s because I’ve been posting a bit less often).
Anyhow: Techdirt’s comment system is probably the best one I’ve seen. SCOTUS making its legality unclear would be a huge blow… probably to all of the US. And it would also be pretty damaging for any self-hosting with user-generated content (and I guess most Mastodon instances?).
It seems about the same to me, and the “Preview” button still does absolutely nothing (as has been the case since the “upgrade”). And the comment box still doesn’t show enough context, which results in errors such as accidentally-top-level posting.
As I’ve said elsewhere on TD, you cannot have an open comment section without robust, and ideally pre-emptive, ways to stop bad actors. Otherwise you have essentially nothing but spam.
Granted, it was in the context of saying that TD could be more aggressive in removing trolls and harassers and still retain anonymous commentary, but either way, yeah, if you allow UGC you have to be a fortress.
It isn't just Techdirt
Block lists would be sued into the ground, regardless of whether they are located in the US or not, if algorithmic filtering is no longer protected under 230.
They don’t need to go after Techdirt; just preventing access to, for example, Spamhaus (and similar efforts) would increase the spam that gets through.
I always get astonished by the sheer amount of spam big companies report. It’s several hundred billion messages if you consider email alone. This has a cost to send and filter, both in terms of money and environmentally speaking. But crooks profit from it, and these crooks are usually in countries that are not very interested in actually doing something to curb the problem. A coordinated international effort and some heavy prison time would help a lot, including putting pressure on those countries. Filtering is obvious and welcome, but it’s like drying ice: it doesn’t go to the root of the problem.
No, it doesn’t. In fact, while filtering may be a cost on the receiving end, there is no such commensurate cost on the sending end. That’s the problem.
Institute an across-the-board fee to send messages, and all spam will stop in about 20 seconds, perhaps sooner. The details have to be worked out, such as volume per time segment, but I’ve no doubt that if Grandma were required to spend a penny to send an email to her grandson, she’d do it in a heartbeat.
Small clubs with a monthly newsletter, say 500 to 1,000 members, would incur a couple of dollars in cost, and they’d gladly pay it. Contrast that with spamming outfits that send more than a million pieces an hour: they’d not see enough ROI to make it worthwhile… even at a penny a piece. But of course, if I were in charge, I’d raise the fee per piece on a tier basis – instead of a penny a piece, those million spams would cost a quarter a piece. $250K versus $10K starts to eat into the profit picture pretty quickly.
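The arithmetic behind that tiered fee can be sketched roughly as follows. Only the one-cent and 25-cent prices come from the comment above; the volume threshold at which the higher price kicks in is a hypothetical placeholder, since those details "have to be worked out".

```python
# Hypothetical tiers: (minimum messages per hour, price per message in cents).
# Only the 1-cent and 25-cent prices come from the comment; the 1,000,000
# threshold is an assumption made purely for illustration.
TIERS = [(1_000_000, 25), (0, 1)]


def rate_cents(volume_per_hour: int) -> int:
    """Return the per-message price for the highest tier the volume reaches."""
    for minimum, cents in TIERS:
        if volume_per_hour >= minimum:
            return cents
    raise ValueError("volume must be non-negative")


def cost_dollars(volume_per_hour: int) -> float:
    """Total fee in dollars; the whole batch is billed at its tier's price."""
    return volume_per_hour * rate_cents(volume_per_hour) / 100


if __name__ == "__main__":
    for volume in (1, 500, 1_000, 1_000_000):
        print(f"{volume:>9,} messages -> ${cost_dollars(volume):,.2f}")
```

With these assumed tiers, a 1,000-member newsletter pays $10 per mailing, while a million-message blast pays $250,000 rather than the $10,000 it would cost at a flat penny.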
To those who would answer “yes” to the headline, I have One Simple Question for you.
Yes or no: Do you believe the government should have the legal right to compel any privately owned interactive web service into hosting legally protected speech that the owners/operators of said service don’t want to host?
Preempting Tobi, Koby, Mathew or whatever is the flavor of the troll of the moment? They seem to avoid you when you post that question like vampires would avoid Jesus himself lol
"What If" Scenario "Plan B"
The following question may have been asked and answered before but I could use a refresher. I assume that, for a small to medium operation like Techdirt, the anti-spam system mainly works in the cloud and that, as a practical and technical matter, the servers in the cloud could be based in some optimal (for certain purposes) non-U.S. location. So if Techdirt were forced to abandon its own anti-spam measures, could it not provide a raw data feed to a remote server that deployed the same, or equivalent, technology? Then couldn’t any of us who wished look at that filtered data instead of the raw data? Wouldn’t this avoid any liability on the part of Techdirt?
Someone please tell me the defect in this scheme.
The defect is the assumption that Techdirt would keep comments open if it were banned from filtering comments.
Re: Re: Please elaborate
I am speculating about a scenario where sites that moderate (via anti-spam technology and/or a hybrid of technology and manual methods) have liability but sites that allow comments without moderation are not subject to that liability. As everyone agrees, in that scenario, the raw data feed of a site that does not moderate becomes unreadable because it is buried in spam. Why can’t a third-party filtering service, perhaps from outside the U.S., solve or mitigate that problem?
Re: Re: Re:
Such a system might work, but there is the problem of bandwidth costs in sending all that spam to the filter, possibly repeatedly.
Re: Re: Re:
I would assume the site’s primary servers and owners being in the U.S. would still subject them to the laws of the U.S.
But again, that’s getting into a hypothetical that assumes Techdirt wouldn’t close comments if they were barred from filtering for spam. That remains the primary defect in your scheme, and you can’t avoid it by going “okay but what if [x]”.
Re: Re: Re:
This does nothing to protect them from the true cost of losing Section 230.
The whole benefit of Section 230, the reason it works so well, is that it quickly and cheaply halts litigation in its tracks.
It turns ruinously expensive lawsuits into simple open-and-shut legal motions that cost less than a percentage point of the cost of the normal lawsuit.
The move you suggest doesn’t help out Techdirt in avoiding the ruinous lawsuit. They’d still have to jump through the hoops for the court to determine if this setup means they’re out of suit jurisdiction, etc. etc. – discovery would probably be required and that is capable of bankrupting companies.
So all it takes is a deep-pocketed person who wants to kill the small business and even if the lawsuit turns out meritless it can still end them.
Re:
The “cloud” probably wouldn’t protect anyone if the target of the block was in the same country/state as TD, but that’s where 1A/230 come in.
The “cloud” means you run software on other people’s hardware, not that you’re immune from its effects. The most important thing is that a venue that doesn’t want assholes causing trouble is allowed to block them, and I don’t think I’ve ever seen a venue as lenient online or offline as this one.
Well, spam and language, I’d say. Family friendly services like Prodigy also didn’t want people swearing online.
(Going way way back, once in a while in the 70s and 80s someone would try to buy or sell something online, like a used bike or something, and would often be chided for using a government owned network for private commercial purposes, but it wasn’t really spam or moderation as we think of them)
That’s a huge volume. The current captcha seems quite harmless nowadays (I barely ever see any window asking for input). Wouldn’t something like this ease the burden on TD servers?
I wonder where they originate from as well. If I were in charge I’d probably blanket ban certain IP ranges belonging to the worst offenders from commenting without an account. I mean, you can still avoid revealing any identity even if you sign up. I certainly would understand if you took such actions and my country was hit (I wouldn’t be surprised if Brazil is one of the main sources of spam content).
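To make the idea concrete, here is a minimal sketch of that kind of rule using Python’s standard `ipaddress` module. The ranges are documentation-only placeholders (not real offenders), and `may_comment` is a hypothetical helper, not anything Techdirt is known to run.

```python
import ipaddress

# Placeholder ranges reserved for documentation (RFC 5737), not real offenders.
BLOCKED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def may_comment(ip: str, has_account: bool) -> bool:
    """Allow signed-in users everywhere; block anonymous posts from listed ranges."""
    if has_account:
        return True
    addr = ipaddress.ip_address(ip)
    return not any(addr in network for network in BLOCKED_RANGES)


if __name__ == "__main__":
    print(may_comment("192.0.2.15", has_account=False))   # anonymous, listed range
    print(may_comment("192.0.2.15", has_account=True))    # same address, signed in
    print(may_comment("203.0.113.9", has_account=False))  # anonymous, unlisted range
```

The account check comes first so that people in a blocked range still have a way to comment, which is the trade-off I’m suggesting.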
If they remove Section 230 protection for algorithms, they remove it completely, as web engines are algorithms, as I am sure some lawyers looking to extract money from a tech company would point out.
Betteridge’s law of headlines rings true once again.
Backup Plan
Hopefully, Gonzalez will lose the case. But even if they win, a spam filter is different from a recommendation algorithm. Gonzalez is arguing over the promotion of material regarding Section 230(c)(1), while the restrictions of spam filters would fall under Section 230(c)(2)(A). Even if recommendation algorithms become unusable, spam filters fall under a different legal category.
Re:
For now.
And it would depend on the ruling, if the Supreme Court ruled to strike down Section 230 in whole or in part.
And there’s the risk of a ruinous lawsuit to figure out what the new status quo is in the aftermath of a ruling that kills 230. Regardless of whether spam filters were actually under a different section of it, there will be blood in the water as the legal sharks fight it out.
Re:
Do you really think that anybody should listen to what you have to say about Section 230 when you posted that Facebook could use §230 to dismiss a lawsuit against Facebook’s own speech?
Just to remind you of what you said Koby:
How so? A spam filter essentially recommends that certain posts/emails/etc should be considered junk and recommends that others are not junk.