Techdirt's think tank, the Copia Institute, is working with the Trust & Safety Professional Association and its sister organization, the Trust & Safety Foundation, to produce an ongoing series of case studies about content moderation decisions. These case studies are presented in a neutral fashion, not aiming to criticize or applaud any particular decision, but to highlight the many different challenges that content moderators face and the tradeoffs that result. Find more case studies here on Techdirt and on the TSF website.

Content Moderation Case Study: Twitter's Algorithm Misidentifies Harmless Tweet As 'Sensitive Content' (April 2018)

from the content-moderation-isn't-easy dept

Summary: While some Twitter users welcome the chance to view and interact with “sensitive” content, most do not. Twitter utilizes algorithms to detect content average users would like to avoid seeing, especially if they’ve opted in to Twitter’s content filtering via their user preferences.

Unfortunately, software can’t always tell what’s offensive and what just looks offensive to the programmable eye that constantly scans uploads for anything that should be hidden from public view unless the viewer has expressed a preference to see it.
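Twitter has not published how its pipeline works, but the general shape of an opt-in filter like the one described above can be sketched in a few lines. Everything in the sketch below (the scoring function, the threshold, the preference flag) is a hypothetical stand-in for illustration, not Twitter's actual system.

```python
# Minimal sketch of an opt-in "sensitive media" gate, assuming a
# classifier that returns a score between 0.0 and 1.0. The scoring
# function, threshold, and preference flag are hypothetical stand-ins,
# not Twitter's actual implementation.
from dataclasses import dataclass


@dataclass
class ViewerPrefs:
    show_sensitive_media: bool = False  # most users keep the default


def sensitivity_score(image_bytes: bytes) -> float:
    """Placeholder for a trained image classifier (0.0 = benign, 1.0 = sensitive)."""
    raise NotImplementedError("stand-in for a real model")


def should_hide(image_bytes: bytes, viewer: ViewerPrefs, threshold: float = 0.8) -> bool:
    """Hide media behind a warning unless the viewer has opted in to seeing it.

    A false positive here (say, a flesh-toned costume head scoring above
    the threshold) hides harmless content from every viewer who kept the
    default preference.
    """
    if viewer.show_sensitive_media:
        return False
    return sensitivity_score(image_bytes) >= threshold
```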

A long-running and well-respected Twitter account that focused on the weirder aspects of Nintendo’s history found itself caught in Twitter’s filters. The tweeted image featured an actor putting on his Princess Peach costume. It focused on the massive Princess Peach head, which apparently contained enough flesh color and “sensitive” shapes to get it — and the Twitter account — flagged as “sensitive.”

The user behind the account tested Twitter to see if it was its algorithm or something else setting off the “sensitive” filter. Dummy accounts tweeting the image were flagged almost immediately, indicating it was the image — rather than other content contained in the user’s original account — that had triggered the automatic moderation.

Unfortunately, the account was likely followed by several users who never expected it to suddenly shift to “sensitive” content. Thanks to the algorithm, the entire account was flagged as “sensitive,” possibly resulting in the account losing followers.

Twitter ultimately removed the block, but the user was never directly contacted by Twitter about the alleged violation.

Decisions to be made by Twitter:

  • Are false positives common enough that a notification process should be implemented?
  • Should the process be backstopped by human moderators? If so, at what point does double-checking the algorithm become unprofitable?
  • Would a challenge process that involved affected users limit collateral damage caused by AI mistakes?
  • Does sensitive content negatively affect enough users that over-blocking/over-moderation is acceptable?

Questions and policy implications to consider:

  • Should Twitter change its content rules to further deter the posting of sensitive content?
  • Given Twitter’s reputation as a porn-friendly social media platform, would stricter moderation of sensitive content result in a noticeable loss of users?
  • Should Twitter continue to remain one of the only social media outlets that welcomes “adult” content?
  • If users are able to opt out of filtering at any point, is Twitter doing anything to ensure younger users aren’t exposed to sensitive material?

Resolution: Twitter removed the flag on the user’s account. According to the user behind the account, it took the work of an employee “behind the scenes” to remove the “sensitive content” warning. Since there was no communication between Twitter and the user, it’s unknown if Twitter has implemented any measures to limit future mischaracterizations of uploaded content.

Companies: twitter


Comments on “Content Moderation Case Study: Twitter's Algorithm Misidentifies Harmless Tweet As 'Sensitive Content' (April 2018)”

19 Comments
Samuel Abram (profile) says:

"flesh color"

The thing is, not all flesh color is pasty white; there is also flesh color that is darker in complexion, and unfortunately, many places in Silicon Valley fail to take that into account. The reason it's relevant to this article is that darker skin tones don't always get tested while lighter skin tones get over-tested. Over-testing produces many false positives, but false negatives can proliferate for people with darker skin tones, because the lack of diversity at Silicon Valley firms means that people with more melanin don't get true positives, let alone false ones.

This comment has been deemed insightful by the community.
Anonymous Coward says:

My question is: If they are using "AI", why does it never seem to get any better at its job?

I’ve watched people put video and images through Google’s API for their cloud machine learning wotsit, and it is frequently incredibly awful.

(I don't know where the interface lives now, or if it still does, or whether one needs an account, but see https://cloud.google.com/vision/docs/how-to and in particular the Safe Search bit.)

Anonymous Coward says:

Re: Re:

Because "AI" isn’t some kind of magical label that you can slap onto something to make it better. As of now, anything labeled "AI" is either (1) one of several variations on optimization algorithms which are not explicitly coded or (2) labeled incorrectly for marketing reasons.

Assuming that it’s (1), it still runs into the same problems as any other optimization algorithm… access to additional training data does not guarantee improvement, and is routinely either wholly or partially detrimental.

This is exacerbated in many cases by the sparsity of labeled inputs relative to all potential inputs, by the fact that labeled inputs are not representative (it's often not even possible to define what such representative sets would look like with our current understanding of the field), and by the fact that significant portions of those labeled inputs are internally inconsistent due to disagreement among human moderators, changes in strategy over time, accidental separation from context, etc.

You will find in particular that nobody has yet found a way to generally recognize the contents of an image (though some progress has been made on recognizing specific types of images, e.g. human faces or landmarks). What algorithms there are simply don't "see" images in any way similar to how humans see images; most image algorithms still struggle to reliably ignore compression artifacts in otherwise identical images… something that many humans don't even notice.
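The point about compression artifacts is easy to demonstrate. The sketch below (assuming Pillow and NumPy are installed; the file path and quality setting are arbitrary) re-encodes an image as a lossy JPEG and measures how many raw pixel values change: the two versions look identical to a person, but at the level a pixel-based algorithm "sees", a large share of the underlying numbers differ.

```python
# Rough demonstration that one lossy JPEG re-encode changes the pixel
# values an algorithm operates on, even when humans see no difference.
# Assumes Pillow and NumPy; the example path is hypothetical.
import io

import numpy as np
from PIL import Image


def fraction_of_pixels_changed(path: str, quality: int = 75) -> float:
    """Return the fraction of pixel values altered by a single JPEG re-encode."""
    original = Image.open(path).convert("RGB")

    buffer = io.BytesIO()
    original.save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)
    recompressed = Image.open(buffer).convert("RGB")

    a = np.asarray(original)
    b = np.asarray(recompressed)
    return float(np.mean(a != b))


# e.g. fraction_of_pixels_changed("costume_photo.png") -- the two images are
# indistinguishable to the eye, yet many of the underlying values differ.
```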

Anonymous Coward says:

Re: Re: Re:

Labels: Yeah, that’s why "AI" is in quotes. AI is a field of study, not a product, and certainly not even a working model anywhere. Machine learning is central to AI, and still super loosey-goosey as to whether any of that works in any of the domains where people apparently really really want to use it. (Remember expert systems? I guess that term is like time-sharing is to the cloud.)

Anyway, my starting assumption is that AI and machine learning are indeed not what they are portrayed to be. I suppose I should have directly asked "if they are so bad at rating, classification, or identification, and do not seem to improve over years of input*, why are they being used at all?" (OK, the answer to that is the do-something pressure and the fact that people like releasing pre-alpha crap into production – the good-enough philosophy. I should not have bothered writing anything, I suppose.)

*Input: If there isn’t manual review more often, the negative feedback is never input to the system…

PaulT (profile) says:

Re: Re:

"My question is: If they are using "AI", why does it never seem to get any better at its job?"

It does. It's just that identifying subjective content will never be perfect, and the best it can ever do is match a human being, who will never be perfect at such a task either, given that you can sit 100 human beings in a room and they will never agree on a subject. I'll guarantee that if you did so, at least one of the 100 would have flagged the above image.

The advantages of AI in this setting are speed and volume of processing. If you want accuracy surrounding subjective material, you want magic.

Anonymous Coward says:

Re: Re: Re:

Well sure, no one will ever agree on subjective matters, but I do not see any improvement in the putative machine learning for identifying even nudity or "raciness". Fine, it will of course be based on someone's operant definition of "racy" or whatever, but lol, a black-and-white photo of a cartoonish costume head (among endless other things)? No, no improvements there.

This comment has been deemed insightful by the community.
Anonymous Coward says:

It might be easier to understand moderation if you think about something that is universally despised. Like politics, but less complicated. Say … SPAM.

Everyone knows what spam is. Everyone agrees that in a perfectly just world spammers would be slow-cooked while their skin was being removed by an acid mist–before their bones were ground up to make latrine bricks.

Gmail does an extremely good job of filtering out spam. And yet, and yet–who hasn’t (very occasionally) seen important email show up in their spam folder? And the spammers are still operating, so apparently enough spam is getting through the filters to make the abhorrent habit profitable.

How do you react?

Well, if you’re an insane egocentric idiot, you immediately go across the web, posting that Google has it in for your bank, or nonprofit org, or second-cousin-once-removed, because THEIR email was deprecated, whereas some other parties’ email did not get filtered. You get your congresscritter (whichever side of the aisle they lair and liar in) to fulminate and spray froth all over the Sacred Halls of Our Republic. And you wrap yourself in a putrescent cloak of victimhood.

If you are sane, or less stupid than yeast, or you have any consideration at all for the difficulties other people are having in their quest to make your online experience less painful, then you try a different approach. In fact, even if you’re a vicious spammer, you take the different approach!

You look carefully at the deprecated email, looking for words or word-parts that could appear (to a stupid computer, not that there is any other kind) to be commercial/promotional. You look at the email address and sending server and linked-to sites to see if they show patterns that are commonly associated with spam. You remove your second cousin, bank, and charitable org from the blacklist and add them to the whitelist. And, if you’re a spammer, you try to recraft your spam so as not to LOOK like spam to the stupid computer. YOU DO NOT TAKE IT PERSONALLY, BECAUSE THE STUPID COMPUTER IS NOT A PERSON; IT IS CONTROLLED BY AN ARITHMETIC EXPRESSION, NOT A SOUL.

And if life is still sometimes complicated, frustrating, and inexplicable–welcome to the human condition.
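The debugging approach the comment above describes (checking for promotional-sounding words, looking at sender patterns, maintaining allow- and deny-lists) is essentially what a naive rule-based filter does on the other side. A toy version, with a word list, weights, domains, and cutoff invented purely for illustration, also shows how an innocent message can trip it:

```python
# Toy rule-based spam scorer in the spirit of the comment above.
# The token list, weights, domains, and cutoff are invented for illustration.
PROMO_TOKENS = {"free": 2.0, "winner": 3.0, "limited offer": 2.5, "unsubscribe": 1.5}
ALLOWLIST = {"bank.example.com", "charity.example.org"}
DENYLIST = {"bulkmail.example.net"}
CUTOFF = 4.0


def spam_score(sender_domain: str, body: str) -> float:
    if sender_domain in ALLOWLIST:
        return 0.0              # explicit allow-list overrides everything
    if sender_domain in DENYLIST:
        return CUTOFF + 1.0     # explicit deny-list: no need to read the body
    text = body.lower()
    # An innocent email that happens to mention "free" and "unsubscribe"
    # accumulates points the same way real spam does.
    return sum(weight for token, weight in PROMO_TOKENS.items() if token in text)


def is_spam(sender_domain: str, body: str) -> bool:
    return spam_score(sender_domain, body) >= CUTOFF
```

In this toy setup, a relative's newsletter that says "feel free to unsubscribe" and announces a "limited offer" on raffle tickets scores 6.0 and gets binned, which is exactly the kind of false positive the comment suggests debugging rather than taking personally.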

Samuel Abram (profile) says:

Re: Re:

YOU DO NOT TAKE IT PERSONALLY, BECAUSE THE STUPID COMPUTER IS NOT A PERSON; IT IS CONTROLLED BY AN ARITHMETIC EXPRESSION, NOT A SOUL.

It reminds me of what I asked my father when I was young:

Me: "Daddy, are computers perfect?"
My father: "Computers aren’t perfect, but they expect us to be perfect."

If anything, computers are only as good as what we make out of them.

PaulT (profile) says:

Re: Re: Re:

"My father: "Computers aren’t perfect, but they expect us to be perfect.""

The better way of explaining this is the old adage GIGO – Garbage In, Garbage Out. Computers perfectly do what they're told to do. But a human operator needs to tell them what to do, and humans are far from perfect. If someone gives them a bad instruction, be that the coder who created the programs they run or a user not using the program correctly, they will perfectly follow the bad instruction.
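A trivial, invented illustration of GIGO in this context: the computer below applies its instruction flawlessly to every image; the instruction itself is the garbage.

```python
# The computer executes this perfectly; the instruction is the bug.
# Threshold and score are invented for illustration.
MISCALIBRATED_THRESHOLD = 0.1   # the coder meant to type 0.9


def flag_as_sensitive(score: float) -> bool:
    # Faithfully applies the bad threshold to every single image.
    return score >= MISCALIBRATED_THRESHOLD


print(flag_as_sensitive(0.2))   # True -- a harmless costume head gets flagged
```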

Tin-Foil-Hat says:

There should be some rules

YouTube is notorious. They encourage users to create content, and when those users quit their day jobs to create content, YouTube demonetizes them for some unknown reason. It's difficult to get reinstated, too.

There really should be some obligation. They want business owners to use the service but when the business’ communication is shut down the platform is 100% void of responsibility even though the business is harmed.

Youtube, Twitter and Facebook should be lower priority methods of communication if you care about consistency and reliability.

Stephen T. Stone (profile) says:

Re:

They want business owners to use the service but when the business’ communication is shut down the platform is 100% void of responsibility even though the business is harmed.

Why should we hold YouTube responsible for the decision of a third party to rely on one service so heavily that getting the boot from said service can fuck up their entire business model?

PaulT (profile) says:

Re: There should be some rules

"The encourage users to create content and when they quit their day job to create content YouTube demonetizes them for some unknown reason."

Perhaps the problem isn’t YouTube, but the idiot who decided to base his entire business on a single supplier, then violated the T&Cs of that supplier’s contract?

"They want business owners to use the service but when the business’ communication is shut down the platform is 100% void of responsibility even though the business is harmed."

There’s not zero recourse. But, the user is not their customer, and if the user decides to violate YouTube’s policies in a way that puts off their paying customers (i.e. advertisers), YouTube do not have an obligation to throw free money at people who are costing it customers.

"Youtube, Twitter and Facebook should be lower priority methods of communication if you care about consistency and reliability."

Perhaps true. So why, in your example, is the user who decided to base their entire business on an unreliable and inconsistent platform not responsible for that decision?

Anonymous Coward says:

Banned from FB for violating unknowable "community standards"

Banned four times for posting images that violated "community standards". Several of the images were also posted by political groups who weren't banned. One was "Don't wash your MAGA hat with your klan outfit". I cannot post any image which suggests the GOP are similar to Nazis. No swastikas, etc. Yet I'm banned now for 30 days. Nice, right?
