Techdirt's think tank, the Copia Institute, is working with the Trust & Safety Professional Association and its sister organization, the Trust & Safety Foundation, to produce an ongoing series of case studies about content moderation decisions. These case studies are presented in a neutral fashion, not aiming to criticize or applaud any particular decision, but to highlight the many different challenges that content moderators face and the tradeoffs those decisions involve. Find more case studies here on Techdirt and on the TSF website.

Content Moderation Case Study: Facebook's Internal 'Hate Speech' Guidelines Appear To Leave Protected Groups Unprotected (June 2017)

from the making-rules-is-difficult dept

Summary: Facebook has struggled to moderate “hate speech” over the years, drawing steady criticism not only from users, but from government officials around the world. Part of this struggle stems from the term “hate speech” itself, which is often vaguely defined. Definitions can also vary from country to country, adding to the confusion and general difficulty of moderating user content.

Facebook’s application of local laws to moderate “hate speech” has resulted in collateral damage and the silencing of voices that such laws are meant to protect. In the United States, there is no law against “hate speech,” but Facebook is still trying to limit the amount of abusive content on its site as advertisers flee and politicians continue to apply pressure.

Facebook moderators use a set of internal guidelines to determine what is or isn’t hate speech. Unfortunately for many users, the guidelines — which users had never seen until ProPublica published them — result in some unexpected moderation decisions.

Users wondered why hate speech targeting Black children was allowed while similar speech targeting, for instance, white men wasn’t. The internal guidelines explained the factors moderators consider — factors that led directly to these seemingly inexplicable content removal decisions.

According to Facebook’s internal guidelines, these categories are “protected,” which means moderators will remove “hateful” content targeting anything on this list.

  • Sex
  • Race
  • Religious affiliation
  • Ethnicity
  • National origin
  • Sexual orientation
  • Gender identity
  • Serious disability/disease

And this is the list of categories not considered “protected” by Facebook:

  • Social class
  • Occupation
  • Continental origin
  • Political ideology
  • Appearance
  • Religions
  • Age
  • Countries

Critics pointed out that the internal standards would seem to lead directly to harassment of groups supposedly protected (Black children), while shielding groups historically viewed — at least in the United States — as not in need of additional protections (white men).

This seemingly incongruous outcome is due to the way moderators apply the rules. If a “protected” class is modified by an “unprotected” category (“Black” [race/protected] + “children” [age/unprotected]), the resulting combination is deemed “unprotected.” In the case of white men, both categories — race and sex — are protected. What seems to be a shielding of an already well-protected group (white men) is actually just the proper application of Facebook’s internal moderation guidelines.
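The intersection rule described above can be expressed compactly. The following is a minimal, hypothetical sketch (the function name and category strings are illustrative, not Facebook's actual implementation): a target is treated as protected only if every category describing it appears on the protected list.

```python
# Protected categories per Facebook's internal guidelines (as published
# by ProPublica). Any category not in this set is unprotected.
PROTECTED = {
    "sex", "race", "religious affiliation", "ethnicity",
    "national origin", "sexual orientation", "gender identity",
    "serious disability/disease",
}

def is_protected_target(categories):
    """A target is protected only if ALL of its describing
    categories are themselves protected."""
    return all(c in PROTECTED for c in categories)

# "white men" = race + sex: both protected, so the target is protected.
print(is_protected_target({"race", "sex"}))    # True
# "Black children" = race + age: age is unprotected, so the target is not.
print(is_protected_target({"race", "age"}))    # False
```

This is why a single unprotected modifier ("children", "drivers", "Irish") strips protection from an otherwise protected group, while a combination of only protected attributes keeps it.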

In response to criticism about outcomes like these, Facebook pointed out it operated globally. What might be considered a ridiculous (or even harmful) moderation decision here in the United States makes more sense in other areas of the world where white men might not make up a large percentage of the population or have historically held a great number of positions of power.

Decisions to be made by Facebook:

  • Should content be removed if it conveys hateful rhetoric against certain groups or individuals even if it doesn’t specifically violate the internal guidelines?
  • Should context be considered when moderating posts that violate the internal guidelines to ensure users who are spreading awareness/criticizing other users’ hateful speech aren’t subjected to the same moderation efforts or account limitations?
  • Which first principles should Facebook be operating on when creating anti-hate policies, and are these policies holding up those principles in practice?

Questions and policy implications to consider:

  • When moderating hate speech, should more discretion be used by moderators to ensure better protection of marginalized groups?
  • Would altering or expanding the scope of the internal guidelines result in users switching to other social media services?
  • Do seemingly inconsistent internal rules (i.e., moderation that protects white men while leaving Black children open to abuse) confuse users and/or result in loss of advertising revenue?

Resolution: Facebook moderators continue to use lists like these to make decisions about perceived “hate speech.” The company continues to consider all stakeholders, including foreign governments that have passed “hate speech” laws that surpass what the site’s internal guidelines already target for removal.

Companies: facebook


Comments on “Content Moderation Case Study: Facebook's Internal 'Hate Speech' Guidelines Appear To Leave Protected Groups Unprotected (June 2017)”

Anonymous Coward says:

Re: Re:

I find it silly to focus on who is being criticized and not what they’re being criticized about. E.g., is a white man being criticized for being a white man, or for being an adult, a.k.a. "okay, boomer"?

Similar to the article’s mention of criticizing a black child. Are they being criticized for being black or for saying something childish?

This comment has been flagged by the community.

Anonymous Coward says:

Facebook’s policies are insane. I was just banned for a month for "hate speech" for quoting a term in a reply. "I think hillbilly white trash complained about being compared to him…" Someone had previously used the term "hillbilly white trash" and got a warning for it. Simply using the adjective "white" should not get you a ban…

Glenn says:

When you start to censor anything you don’t like, you tend to find more and more things to not like. Mission creep happens. Compound that with censorship by committee, and your lowest common denominator starts to approach zero. Meaning: since everyone hates something, almost nothing isn’t hated… and you’re not allowed to talk about it.

2120: "First Amendment? …what’s that?"

This comment has been deemed insightful by the community.
Stephen T. Stone (profile) says:

Moderation is a platform/service owner or operator saying “we don’t do that here”. Personal discretion is an individual telling themselves “I won’t do that here”. Editorial discretion is an editor saying “we won’t print that here”, either to themselves or to a writer. Censorship is someone saying “you won’t do that anywhere” alongside threats or actions meant to suppress speech.

Now, with that in mind, please explain how Facebook’s moderation is actually censorship.

Stephen T. Stone (profile) says:


We should all be protected from hate. But some groups of people — e.g., gay people — have always been marginalized in society by hatred of the dominant group (in this case, straight people). Society enacts laws to protect such groups from “hate” (read: discrimination) because we’ve seen what happens when we don’t offer such protections to those groups. It isn’t pretty, and it isn’t a period of time worth revisiting.

Also: The same laws that protect gay people from discrimination on the basis of sexual orientation also apply to straight people. Marginalized groups aren’t asking for “special rights” — they’re asking for equitable treatment under the law.
