from the car-washes-and-mass-shootings dept
For several years now, we’ve been hammering home the idea that content moderation at scale is impossible to get right, otherwise known as Masnick’s Impossibility Theorem. The point is not that platforms shouldn’t do any moderation, or that they shouldn’t keep trying to improve their moderation methods. Instead, this is all about expectation setting: partially for a public that simply wants better content to show up on their various devices, but even more so for political leaders who often see a problem happening on the internet and assume that the answer is simply “moar tech!”.
Being an internet behemoth, Facebook catches a lot of heat when its moderation practices suck. Several years ago, Mark Zuckerberg announced that Facebook had developed an AI-driven moderation program, claiming that it would capture “the vast majority” of objectionable content. Anyone who has spent 10 minutes on Facebook in the years since realizes how badly Facebook fell short of that goal. And, as it turns out, it failed in both directions.
By that I mean that, while much of our own commentary on all this has focused on how often Facebook’s moderation ends up blocking non-offending content, a recent Ars Technica post on just how much hate speech makes its way onto the platform includes some specific notes on how the AI moderation system misclassifies some of the most objectionable content.
Facebook’s internal documents reveal just how far its AI moderation tools are from identifying what human moderators were easily catching. Cockfights, for example, were mistakenly flagged by the AI as a car crash. “These are clearly cockfighting videos,” the report said. In another instance, videos livestreamed by perpetrators of mass shootings were labeled by AI tools as paintball games or a trip through a carwash.
It’s not entirely clear to me just why the AI system is seeing mass shootings and animals fighting and thinking it’s paintball or carwashes, though I unfortunately have some guesses, and they aren’t fun to think about. Either way, this… you know… sucks! If the AI you’re relying on to filter out extreme and violent content labels a mass shooting as a trip through the carwash, well, that really should send us back to the drawing board, shouldn’t it?
It’s worse in other countries, as the Ars post notes. There are countries where Facebook has no database of racial slurs in native languages, meaning it cannot even begin blocking such content on the site, via AI or otherwise. Polled Facebook users routinely identify hate on the platform as its chief problem, but the company seems to be erring in the opposite direction.
Still, Facebook’s leadership has been more concerned with taking down too many posts, company insiders told WSJ. As a result, they said, engineers are now more likely to train models that avoid false positives, letting more hate speech slip through undetected.
Which may actually be the right thing to do. I’m not prepared to adjudicate that point in this post. But what we can say definitively is that Facebook has an expectation setting problem on its hands. For years it has touted its AI and human moderators as the solution to the most vile content on its platform… and it doesn’t work. Not at scale, at least. And outside of America and a handful of other Western nations, barely at all.
It might be time for the company to just say so, and to tell the public and its representatives that it will be a long, long while before Facebook gets this anywhere close to right.