Techdirt's think tank, the Copia Institute, is working with the Trust & Safety Professional Association and its sister organization, the Trust & Safety Foundation, to produce an ongoing series of case studies about content moderation decisions. These case studies are presented in a neutral fashion, not aiming to criticize or applaud any particular decision, but to highlight the many different challenges that content moderators face and the tradeoffs those decisions entail. Find more case studies here on Techdirt and on the TSF website.

Content Moderation Case Study: Facebook Struggles To Correctly Moderate The Word 'Hoe' (2021)

from the language-is-a-funny-thing dept

Summary: One of the many challenges of content moderation is the flexibility of language. When applying blocklists — lists of terms deemed inappropriate for the platform — moderators must account for innocuous uses of words that, stripped of their context, appear to violate the platform’s terms of use.

Multiple platforms have run into the phenomenon known as the “Scunthorpe problem.” In the eponymous case, an English town whose name no one would mistake for offensive was flagged by moderation blocklists simply because the town’s name contains the word “cunt,” which many blocklists forbid.
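The failure mode is easy to reproduce. Below is a minimal sketch in Python, with a hypothetical one-entry blocklist (this is not any platform's actual implementation): a naive substring match flags the town's name, while a word-boundary match does not.

import re

# Hypothetical one-entry blocklist, purely for illustration.
BLOCKLIST = ["cunt"]

def naive_filter(text: str) -> bool:
    # Flags text if any blocklisted term appears anywhere as a substring.
    lowered = text.lower()
    return any(term in lowered for term in BLOCKLIST)

def word_boundary_filter(text: str) -> bool:
    # Flags text only when a blocklisted term appears as a whole word.
    return any(re.search(rf"\b{re.escape(term)}\b", text, re.IGNORECASE)
               for term in BLOCKLIST)

print(naive_filter("Welcome to Scunthorpe!"))          # True: false positive
print(word_boundary_filter("Welcome to Scunthorpe!"))  # False: the town name passes

Whole-word matching fixes Scunthorpe-style false positives, but as the incidents below show, it does nothing for words that are themselves ambiguous.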

Deploying automated blocklists can be even more challenging when dealing with specialized or niche content, which may use terms that are offensive outside of a specific context but essential to discussing and understanding the relevant subject matter. A paleontologists’ conference was derailed when the moderation blocklist made it impossible for participants to use words like “bone,” “pubic,” “stream,” and “beaver.”

Facebook has worked continuously to refine its moderation processes, but it still occasionally makes the wrong call with its blocklists. In January 2021, residents of (and visitors to) Plymouth, in Devon, England, were surprised to find their posts and comments vanishing from the site. After a little investigation, it became clear Facebook was deleting posts containing references to the local landmark known as Plymouth Hoe.

In addition to being the name of a common garden tool (more on that in a moment), “hoe” also refers to a “sloping ridge shaped like an inverted foot or heel,” such as Plymouth Hoe, which is known locally as the Hoe. Users were temporarily forced to self-censor the harmless term to avoid moderation, either by adding unnecessary punctuation or dropping the “h.” It appeared Facebook’s automated processes believed these comments and posts were using a derogatory term for a romantic partner who is only in a relationship to better their own financial position.
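Note that word-boundary matching would not have saved Plymouth Hoe: “hoe” appears there as a complete, correctly spelled word, so any fix has to reason about context. One crude mitigation, sketched below with a hypothetical phrase allowlist (an assumption of ours, not Facebook's actual approach), is to exempt known-innocuous phrases before flagging:

import re

# Hypothetical allowlist of known-innocuous phrases containing the term.
SAFE_PHRASES = ("plymouth hoe", "the hoe", "push pull hoe", "garden hoe")

def flag_hoe(text: str) -> bool:
    # Whole-word matching still fires on "Plymouth Hoe", so check the
    # surrounding phrase against an allowlist before flagging.
    lowered = text.lower()
    if not re.search(r"\bhoe\b", lowered):
        return False
    return not any(phrase in lowered for phrase in SAFE_PHRASES)

print(flag_hoe("Lovely sunset over Plymouth Hoe tonight"))    # False: allowlisted
print(flag_hoe("My favorite weeding tool is a stirrup hoe"))  # True: still a false positive

The second example shows the weakness of allowlists: every innocuous phrasing has to be anticipated in advance, which is exactly what the gardening-group incident below demonstrates.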

Facebook soon apologized for the moderation error and stated it was “taking steps to rectify the error” and figure out what caused the mistaken moderation in the first place. Problem solved?

Not really.

The same problem popped up again, this time affecting a New York gardening group. WNY Gardeners, a group with more than 8,000 members, was the next to be affected by Facebook’s “hoe” pruning. A member responded to the prompt “most loved & indispensable weeding tool” with “Push pull hoe!” Not long after that, the member was informed by Facebook that the comment violated the site’s policy on bullying and harassment.

Company Considerations:

  • How could blocklists and keyword searches be better utilized to detect and remove violations of site policies? 
  • How much collateral damage from automated moderation should be considered acceptable? Is it an acceptable trade-off for the lower costs that come with relying on more automation and fewer human moderators?
  • Can AI-based moderation more reliably detect actual violations (rather than innocuous uses of blocklisted terms) as the technology advances? What are the trade-offs with AI-based moderation tools as compared to simple blocklists? 
  • What mitigation measures might be put in place to deal with a blocklist that catches words with different meanings depending on context?
  • Who should be in charge of reviewing a blocklist and how frequently should it be updated? 

Issue Considerations:

  • Does prohibiting words like “hoe” make a significant dent in online harassment and abuse? Can the technology “catch up” with (or surpass) humans’ ability to route around moderation efforts?
  • Should more resources go to staffing human moderators, both to prevent errors and to allow for a more robust challenge process in which content remains “live” until review has concluded?
  • In what ways might automation and human reviewers be combined to avoid the more egregious automated blocklist mistakes? (One possible arrangement is sketched below.)
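One common hybrid arrangement (sketched here with illustrative names; this is not a description of Facebook's actual pipeline) has automation merely flag content into a review queue, with the post staying live until a human makes the final call:

from dataclasses import dataclass
from queue import Queue

@dataclass
class Post:
    text: str
    visible: bool = True  # stays live while awaiting review

review_queue: Queue = Queue()

def automated_pass(post: Post, flagged_by_blocklist: bool) -> None:
    # Automation only escalates; it never removes content on its own.
    if flagged_by_blocklist:
        review_queue.put(post)

def human_review(post: Post, violates_policy: bool) -> None:
    # A human moderator makes the removal decision.
    post.visible = not violates_policy

The trade-off is throughput: human review is slow and expensive, which is why platforms tend to reserve it for flagged edge cases rather than every post.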

Resolution: Once again, Facebook apologized for failing to recognize the word “hoe” in contexts where its use is appropriate. But after two highly publicized incidents in less than a year — both involving the same word — Facebook has added human moderators to backstop automated calls on flagged terms like these, in order to prevent unjustified removals of posts, accounts, or groups.

Originally posted to the Trust & Safety Foundation website.


Comments on “Content Moderation Case Study: Facebook Struggles To Correctly Moderate The Word 'Hoe' (2021)”

Ben (profile) says:

English only?

And you’ve not even considered the difficulties Facebook and many other sites have with languages other than English… and, even more difficult, languages with loan words from other languages, where the loaned word has derogatory implications in one language but not the other.
That’s before we even come close to considering differences in culture within multi-lingual societies, or languages shared among different cultures with varying mores around offense in written form.

Paul B says:

Re: English only?

Chinese filler words…

There is a very common Chinese filler word that happens to sound like a term for former slaves from Africa in the US. It got at least one professor who was teaching a Chinese-language class fired.

Just wait till some voice AI hears a Chinese speaker and emergency-bans an event over a filler word.

PaulT (profile) says:

Re: English only?

"languages with loan words from other languages where the loaned word has derogatory implications in one language, but not the other"

You don’t even have to go as far as loan words. For example, the Swedish word for "end" is "slut". Good luck with a multilingual filter that has no false positives.

"That’s before we even come close to considering differences in culture in a multi-lingual cultures, or languages shared amongst different cultures with varying mores around offense in a written form."

Then, of course, language evolves over time. A sentence written a decade ago could be perfectly innocent but carry some offensive connotation today. Then there are acronyms: whether your WAP comment is lascivious in nature or a boring discussion of wireless internet options could simply be a matter of timing.

PaulT (profile) says:

Re: Re:

"For a while they were blocking posts in a group I know of that contained the word “white.”"

I’ve never heard that before so I’m guessing there must be some other context. Certainly I’ve never come across such things in groups I’m involved with, be that talking about a popular chef in my area with the surname White or talking about the movie White Christmas in movie related groups when the appropriate season arrives.

"It boggles the mind that they have been left in charge of our cultural commons, by our communal laziness and inaction."

It boggles the mind that someone who genuinely believes this is nevertheless too lazy to use their competitors instead. If they have any control over the commons at all, it’s only because people choose to use them, so it’s weird that someone who objects to them so strongly would keep doing so.

Uriel-238 (profile) says:

Re: Farming implements and sex workers

At this point I’ve only seen “ho” used as colloquial slang for “whore.” It has its own ambiguities, between Santa’s laugh and the humorous slang for the Hostess product; Whoopi Goldberg’s company is One Ho Productions.

Whenever I’ve read hoe it was referring to the gardening tool…unless it was in a rant or screed rife with misspellings and apostrophes used in plurals rather than possessives. In that case, I still read it as a farm tool because it amuses me.

The solution, in my opinion, is to de-scandalize sex work, especially considering ours is an economy fit to drive workers to desperation.

In the meantime, our moderating systems are not only going to have to monitor for a continuous run of slurs, but will also have to watch for misspellings, intentional or otherwise. I’ve only heard of simps around August.

Anonymous Coward says:

Word filters never work. I once played a mobile game where the chat had such aggressive substring filters it made a normal conversation look like drunken sailors with all the bleep-outs. It would not only block the usual over-inclusive lists of bad words some intern googled, but also parts of bad words (shortest only 2 letters long), misspellings, and anagrams, all of it including across punctuation and in the middle of longer words. It was difficult to carry conversations, constantly having to evade the filters or explain the bleeped bits, especially with new players who weren’t familiar with the bleeps.

But despite turning the Scunthorpe problem up to eleven, people still managed to cuss each other out just fine when they were actually trying to evade the system.
