

Posted on Techdirt - 16 February 2022 @ 03:32pm

Content Moderation Case Study: YouTube Doubles Down On Questionable 'Graphic Content' Enforcement Before Reversing Course (2020)


YouTube creators have frequently complained about the opaque and frustrating nature of the platform’s appeals process for videos that are restricted or removed for violating its Community Guidelines. Beyond simply removing content, these takedowns can be severely damaging to creators, as they can result in “strikes” against a channel. Strikes incur temporary restrictions on the user’s ability to upload content and use other site features, and enough strikes can ultimately lead to permanent channel suspension.

Creators can appeal these strikes, but many complain that the response to appeals is inconsistent, and that rejections are deemed “final” without providing insight into the decision-making process or any further recourse. One such incident in 2020 involving high-profile creators drew widespread attention online and resulted in a rare apology and reversal of course by YouTube.

On August 24, 2020, YouTube creator MoistCr1TiKaL (aka Charlie White, who also uses the handle penguinz0), who at the time had nearly six million subscribers, posted a video in which he reacted to a viral 2014 clip of a supposed “road rage” incident involving people dressed as popular animated characters. The authenticity of the original video is unverified and many viewers suspect it was staged for comedic purposes, as the supposed “violence” it portrays appears to be fake, and the target of the “attack” appears uninjured. Soon after posting his reaction video, White received a strike for “graphic content with intent to shock” and the video was removed. On September 1, White revealed on Twitter that he had appealed the strike, but the appeal was rejected.

White then posted a video expressing his anger at the situation, and pointed out that another high-profile YouTube creator, Markiplier (aka Mark Fischbach), had posted his own reaction to the same viral video nearly four years earlier but had not received a strike. Fischbach agreed with White and asked YouTube to address the inconsistency. To the surprise of both creators, YouTube responded by issuing a strike to Fischbach’s video as well.

The incident resulted in widespread backlash online, and the proliferation of the #AnswerUsYouTube hashtag on Twitter, with fans of both creators demanding a reversal of the strikes and/or more clarity on how the platform makes these enforcement decisions.

Company considerations:

  • If erroneous strikes are inevitable given the volume of content being moderated, what are the necessary elements of an appeals process to ensure creators have adequate recourse and receive satisfactory explanations for final decisions?
  • What are the conditions under which off-platform attention to a content moderation decision should result in further manual review and potential reversals outside the normal appeals process?
  • How can similar consideration be afforded to creators who face erroneous strikes and rejected appeals, but do not have large audiences who will put off-platform pressure on the company?

Issue considerations:

  • How can companies balance the desire to directly respond to controversies involving highly popular creators with the desire to employ consistent, equitable processes for all creators?
  • How should platforms harmonize their enforcement decisions when they are alerted to clear contradictions between the decisions on similar pieces of content?


On September 2, a few hours after Fischbach announced his strike and White expressed his shock at that decision, the TeamYouTube Twitter account replied to White and to Fischbach with an apology, stating that it had restored both videos and reversed both strikes and calling the initial decision “an over-enforcement of our policies.” Both creators expressed their appreciation for the reversal, while also noting that they hope the company makes changes to prevent similar incidents from occurring in the future. Since such reversals by YouTube are quite rare, and apologies even rarer, the story sparked widespread coverage in a variety of outlets.

Originally posted to the Trust and Safety Foundation website.

Posted on Techdirt - 9 February 2022 @ 03:46pm

Content Moderation Case Study: Russia Slows Down Access To Twitter As New Form Of Censorship (2021)


On March 10, 2021, the Russian government deliberately slowed down access to Twitter after accusing the platform of repeatedly failing to remove posts about illegal drug use and child pornography, and posts pushing minors toward suicide.

State communications watchdog Roskomnadzor (RKN) claimed that “throttling” the speed of uploading and downloading images and videos on Twitter protected Russian citizens by making the offending content less accessible. Using Deep Packet Inspection (DPI) technology, RKN essentially filtered internet traffic for Twitter-related domains. Under Russia’s controversial 2019 Sovereign Internet Law, all Russian Internet Service Providers (ISPs) were required to install this technology, which allows internet traffic to be filtered, rerouted, and blocked with granular rules through a centralized system. In this case, it blocked or slowed down access to specific content (images and videos) rather than the entire service. DPI technology also gives Russian authorities unilateral and automatic access to ISPs’ information systems, along with keys to decrypt user communications.

[Image: Twitter throttling in Russia meme. Translation: “Runet users; Twitter”]

Researchers at the University of Michigan reported that connection speeds to Twitter were reduced by 87 percent on average, and some Russian internet service providers reported a wider slowdown in access. Because the filtering matched any domain containing the substring t.co (Twitter’s shortened domain name), the throttling inadvertently affected unrelated websites, including a Russian state-operated news site and several Russian government websites, including RKN’s own.

Although reports suggest that Twitter has a limited user base in Russia, perhaps as low as 3% of the population (from an overall population of 144 million), it is popular with politicians, journalists and opposition figures. The ‘throttling’ of access was likely intended as a warning shot to other platforms and a test of Russia’s technical capabilities. Russian parliamentarian Aleksandr Khinshtein, an advocate of the 2019 Sovereign Internet Law, was quoted as saying:

Putting the brakes on Twitter traffic “will force all other social networks and large foreign internet companies to understand Russia won’t silently watch and swallow the flagrant ignoring of our laws.” The companies would have to obey Russian rules on content or “lose the possibility to make money in Russia.” — Aleksandr Khinshtein

The Russian government has a history of trying to limit and control citizens’ access to and use of social media. In 2018, it tried and ultimately failed to shut down Telegram, a popular messaging app. Telegram, founded by Russian émigré Pavel Durov, refused to hand over its encryption keys to RKN, despite a court order. Telegram was able to thwart the shutdown attempts by shifting the hosting of its website to Google Cloud and Amazon Web Services through ‘domain fronting’ – a technique the Russian government later banned. The government eventually backed down in the face of technical difficulties and strong public opposition.

Many news outlets suggest that these incidents demonstrate that Russia, where the internet has long been a last bastion of free speech even as the government shuttered independent news organizations and obstructed political opposition, is now tipping toward the more tightly controlled Chinese model and replicating aspects of its famed Great Firewall, including creating home-grown alternatives to Western platforms. They also warn that as Russia’s tactics become bolder and its censorship technology more sophisticated, they will be easily co-opted and scaled up by other autocratic governments.

Company considerations:

  • To what extent should companies comply with these types of government demands?
  • Where should companies draw the line between acquiescing to government demands or local laws that are contrary to their values or could result in human rights violations, versus expanding into a market or ensuring that their users have access?
  • To what extent should companies align their response and/or mitigation strategies with that of other (competitor) US companies affected in a similar way by local regulation?
  • Should companies try to circumvent the ‘throttling’ or access restrictions through technical means such as reconfiguring content delivery networks?
  • Should companies alert their users that their government is restricting/throttling access?

Issue considerations:

  • When are government takedown requests too broad and overreaching? Who – companies, governments, civil society, a platform’s users – should decide when that is the case?
  • How transparent should companies be with their users about why certain content is taken down because of government requests and regulation? Are there times when companies should not be too transparent?
  • What can users and advocacy groups do to challenge government restrictions on access to a platform?
  • Should – as the United Nations suggests – access to the internet be seen as part of a suite of digital human rights?


The ‘throttling’ of access to Twitter content initially lasted two months. According to RKN, Twitter removed 91 percent of the content flagged in its takedown requests after RKN threatened to block the platform if it didn’t comply. Normal speeds for desktop users resumed in May after Twitter complied, but reports indicate that throttling would continue for Twitter’s mobile app users until the company complied fully with RKN’s takedown requests.

Originally posted to the Trust and Safety Foundation website.

Posted on Techdirt - 12 January 2022 @ 03:33pm

Content Moderation Case Study: Facebook Knew About Deceptive Advertising Practices By A Group That Was Later Banned For Operating A Troll Farm (2018-2020)


In the lead-up to the 2018 midterm elections in the United States, progressive voters in seven competitive races in the Midwest were targeted with a series of Facebook ads urging them to vote for Green Party candidates. The ads, which came from a group called America Progress Now, included images of and quotes from prominent progressive Democrats, including Bernie Sanders and Alexandria Ocasio-Cortez, implying that these politicians supported voting for third parties.

The campaign raised eyebrows for a variety of reasons: two of the featured candidates stated that they did not approve the ads, nor did they say or write the supposed quotes that were run alongside their photos, and six of the candidates stated that they had no connection with the group. The office of Senator Sanders asked Facebook to remove the campaign, calling it “clearly a malicious attempt to deceive voters.” Most notably, an investigation by ProPublica and VICE News revealed that America Progress Now was not registered with the Federal Election Commission nor was any such organization present at the address listed on its Facebook page.

In response to Senator Sanders’ office, and in a further statement to ProPublica and VICE, Facebook stated that it had investigated the group and found no violation of its advertising policies or community standards.

Two years later, during the lead-up to the 2020 presidential election, an investigation by the Washington Post revealed a “troll farm”-type operation directed by Rally Forge, a digital marketing firm with connections to Turning Point Action (an affiliate of the conservative youth group Turning Point USA), in which multiple teenagers were recruited and directed to post pro-Trump comments using false identities on both Facebook and Twitter. This revelation resulted in multiple accounts being removed by both companies, and Rally Forge was permanently banned from Facebook.

As it turned out, these two apparently separate incidents were in fact closely connected: an investigation by The Guardian in June of 2021, aided in part by Facebook whistleblower Sophie Zhang, discovered that Rally Forge had been behind the America Progress Now ads in 2018. Moreover, Facebook had been aware of the source of the ads and their deceptive nature, and of Rally Forge’s connection to Turning Point, when it determined that the ads did not violate its policies. The company did not disclose these findings at the time. Internal Facebook documents, seen by The Guardian, recorded concerns raised by a member of Facebook’s civic integrity team, noting that the ads were “very inauthentic” and “very sketchy.” In the Guardian article, Zhang asserted that “the fact that Rally Forge later went on to conduct coordinated inauthentic behavior with troll farms reminiscent of Russia should be taken as an indication that Facebook’s leniency led to more risk-taking behavior.”

Company considerations:

  • What is the best way to address political ads that are known to be intentionally deceptive but do not violate specific advertising policies?
  • What disclosure policies should be in place for internal investigations that reveal the questionable provenance of apparently deceptive political ad campaigns?
  • When a group is known to have engaged in deceptive practices that do not violate policy, what additional measures should be taken to monitor the group in case future actions involve escalations of deceptive and manipulative tactics?

Issue considerations:

  • How important should the source and intent of political ads be when determining whether or not they should be allowed to remain on a platform, as compared to the content of the ads themselves?
  • At what point should apparent connections between a group that violates platform policies and a group that did not directly engage in the prohibited activity result in enforcement actions against the latter group?


A Facebook spokesperson told The Guardian that the company had “strengthened our policies related to election interference and political ad transparency” in the time since the 2018 investigation, which revealed no violations by America Progress Now. The company also introduced a new policy aiming to increase transparency regarding the operators of networks of Facebook Pages.

Rally Forge and one of its page administrators remain permanently banned from Facebook following the 2020 troll farm investigation, while Turning Point USA and Turning Point Action deny any involvement in the specifics of either campaign, and Facebook has taken no direct enforcement action against those groups.

Originally posted to the Trust and Safety Foundation website.

Posted on Techdirt - 5 January 2022 @ 03:50pm

Content Moderation Case Study: Roblox Moderators Combat In-Game Reenactments Of Mass Shootings (2021)

Online game platform Roblox has gone from a niche offering to a cultural phenomenon over its 15 years of existence. Rivaling Minecraft in its ability to attract young users, Roblox is played by over half of American children and counts 164 million active users.

Roblox also gives players access to a robust set of creation tools, allowing users to create and craft their own experiences, as well as enjoy those created by others. 

A surge in users during the COVID-19 pandemic created problems Roblox’s automated moderation systems — as well as its human moderators — are still attempting to solve. Roblox employs 1,600 human moderators who handle not only content flowing through in-game chat features but also content created and shared with other users via Roblox’s creation tools.

Users embraced the creation tools, some in healthier ways than others. If it happens in the real world, someone will try to approximate it online. Users have turned a kid-focused game into virtual red-light districts where players gather to engage in simulated sex with other players — activity that tends to evade moderation because direct links to this content are shared on out-of-game chat platforms like Discord.

Perhaps more disturbingly, players are recreating mass shootings — many of them containing a racial element — inside the game, inviting players to step into the shoes of mass murderers. Anti-Defamation League researcher Daniel Kelley was easily able to find recreations of the Christchurch mosque shootings that occurred in New Zealand in 2019.

While Roblox proactively polices the platform for “terrorist content,” the continual resurfacing of content like this remains a problem without an immediate solution. As Russell Brandom of The Verge points out, 40 million daily users generate more content than can be manually reviewed by human moderators. And the use of a keyword blocklist would leave users unable to discuss (or recreate) the New Zealand city itself.

Company considerations:

  • How does catering to a younger user base affect moderation efforts?
  • What steps can be taken to limit access to or creation of content when users utilize communication channels the company cannot directly monitor? 
  • What measures can be put in place to limit unintentional interaction with potentially harmful content by younger users? What tools can be used to curate content to provide “safer” areas for younger users to explore and interact with?

Issue considerations:

  • How should companies respond to users who wish to discuss or otherwise interact with each other with content that involves newsworthy, but violent, events? 
  • How much can a more robust reporting process ease the load on human and AI moderation?
  • Can direct monitoring of users and their interactions create additional legal risks when most users are minors? How can companies whose user bases are mostly children address potential legal risks while still giving users freedom to create and communicate on the platform?


Roblox updated its Community Standards to let users know this sort of content was prohibited. It also said it would engage in “proactive detection” that would put human eyes on content related to terms like this, allowing geographic references but not depictions of the mosque shooting. 

Originally posted to the Trust and Safety Foundation website.

Posted on Techdirt - 15 December 2021 @ 03:59pm

Content Moderation Case Study: Nintendo Blocks Players From Discussing COVID, Other Subjects (2020)

Summary: Nintendo has long striven to offer the most family-friendly of game consoles. Its user base tends to skew younger, and its attempts to ensure its offerings are welcoming and non-offensive have produced a long string of moderation decisions that have mostly, to this point, only affected game content. Many of these changes were made to make games less offensive to users outside of Nintendo’s native Japan.

Nintendo’s most infamous content moderation decision involved a port of the fighting game Mortal Kombat. While owners of consoles from Sega (Nintendo’s main rival at that point) were treated to the original red blood found in the arcades, Nintendo users had to make do with a gray-colored “sweat” — a moderation move that cemented Nintendo’s reputation as a console for kids.

Nintendo still has final say on content that can be included in its self-produced products, leading contributors to find their additions stripped out of games if Nintendo’s moderators feel they are possibly offensive. While Nintendo has backed off from demanding too many alterations from third-party game developers, it still wields a heavy hand when it comes to keeping its own titles clean and family-friendly.

With the shift to online gaming came new moderation challenges for Nintendo to address. Multiple players interacting in shared spaces controlled by the company produced some friction between what players wanted to do and what the company would allow. The first challenges arrived nearly a decade ago with the Wii, which featured online spaces where players could interact with each other using text or voice messages. This was all handled by moderators who apparently reviewed content three times before allowing it to arrive at its destination, something that could result in an “acceptable” thirty-minute delay between a message’s sending and its arrival.

Thirty minutes is no longer an acceptable delay, considering the instantaneous communications allowed by other consoles. And there are more players online than ever, thanks to popular titles like Animal Crossing, a game whose social aspects are a large part of its appeal.

While it’s expected Nintendo would shut down offensive and sexual language, given its perception of the desire of its target market, the company’s desire to steer users clear of controversial subjects extended to a worldwide pandemic and the Black Lives Matter movement in the United States.

Here’s what gaming site Polygon discovered after Nintendo issued a patch for Animal Crossing in September 2020:

According to Nintendo modder and tinkerer OatmealDome, Ver. 10.2.0 expands the number of banned words on the platform, including terms such as KKK, slave, nazi, and ACAB. The ban list also includes terms such as coronavirus and COVID. Polygon tested these words out while making a new user on a Nintendo Switch Lite and found that while they resulted in a warning message, the acronym BLM was allowed by the system. Most of these words seem to be a response to the current political moment in America.

— Patricia Hernandez, Polygon

As this report from the Electronic Frontier Foundation notes, Nintendo often steers clear of political issues, even going so far as to ban the use of any of its online games for “political advocacy,” which resulted in the Prime Minister of Japan having to cancel a planned Animal Crossing in-game campaign event.

Company considerations:

  • How does limiting discussion of current/controversial events improve user experience? How does it adversely affect players seeking to interact?
  • How should companies respond to users who find creative ways to circumvent keyword blocking? 
  • How does a company decide which issues/terms should be blocked/muted when it comes to current events?

Issue considerations:

  • How should companies approach controversial issues that are of interest to some players, but may make other players uncomfortable? 
  • How can suppressing speech involving controversial topics adversely affect companies and their user bases?
  • How can Nintendo avoid being used by governments to control speech related to local controversies, given its willingness to preemptively moderate speech related to issues of great interest to its user base?

Resolution: Nintendo continues its blocking of these terms, apparently hoping to steer clear of controversial issues. While this may be at odds with what players expect to be able to discuss with their online friends, it remains Nintendo’s playground where it gets to set the rules.

But, as the EFF discovered, moderation could be easily avoided by using variations that had yet to end up on Nintendo’s keyword blocklist.
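The variant problem the EFF observed is a generic weakness of exact-match blocklists: a list that blocks "covid" does nothing about "c0vid" or "cooovid" unless input is normalized first. A minimal sketch of both sides, using a hypothetical blocklist (this is illustrative, not Nintendo's actual filter, and aggressive normalization brings its own false positives):

```python
import re

# Illustrative sketch, not Nintendo's actual system: exact-match blocklists
# miss trivial variants ("c0vid", "cooovid") unless input is normalized first.
BLOCKLIST = {"covid", "coronavirus"}  # hypothetical blocklist entries

def naive_blocked(text: str) -> bool:
    """Exact word match against the blocklist."""
    return any(word in BLOCKLIST for word in text.lower().split())

def normalize(text: str) -> str:
    """Undo common evasions: leetspeak digit substitutions and repeated letters."""
    leet = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a", "5": "s"})
    collapsed = text.lower().translate(leet)
    return re.sub(r"(.)\1+", r"\1", collapsed)  # "cooovid" -> "covid"

def normalized_blocked(text: str) -> bool:
    return any(word in BLOCKLIST for word in normalize(text).split())

print(naive_blocked("c0vid"))       # False: the variant slips through
print(normalized_blocked("c0vid"))  # True: caught after normalization
```

Even with normalization, this remains a cat-and-mouse game: each new evasion pattern must be anticipated, which is why keyword lists like Nintendo's tend to trail user behavior.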

Originally posted to the Trust and Safety Foundation website.

Posted on Techdirt - 8 December 2021 @ 03:45pm

Content Moderation Case Study: Twitter Briefly Restricts Account Of Writer Reporting From The West Bank (2021)

Summary: In early May 2021, writer and researcher Mariam Barghouti was reporting from the West Bank on escalating conflicts between Israeli forces and Palestinian protestors, and making frequent social media posts about her experiences and the events she witnessed. Amidst a series of tweets from the scene of a protest, shortly after one in which she stated “I feel like I’m in a war zone,” Barghouti’s account was temporarily restricted by Twitter. She was unable to post new tweets, and her bio and several of her recent tweets were replaced with a notice stating that the account was “temporarily unavailable because it violates the Twitter Media Policy”.

The incident was highlighted by other writers, some of whom noted that the nature of the restriction seemed unusual, and the incident quickly gained widespread attention. Fellow writer and researcher Joey Ayoub tweeted that Barghouti had told him the restriction would last for 12 hours according to Twitter, and expressed concern for her safety without access to a primary communication channel in a dangerous situation.

The restriction was lifted roughly an hour later. Twitter told Barghouti (and later re-stated to VICE’s Motherboard) that the enforcement action was a “mistake” and that there was “no violation” of the social media platform’s policies. Motherboard also asked Twitter to clarify which specific policies were initially believed to have been violated, but says the company “repeatedly refused”.

Company Considerations:

  • In cases where enforcement actions are taken involving sensitive news reporting content, how can the reasons for enforcement be better communicated to both the public and the reporters themselves?
  • How can the platform identify cases like these and apply additional scrutiny to prevent erroneous enforcement actions?
  • What alternatives to account suspensions and the removal of content could be employed to reduce the impact of errors?
  • How can enforcement actions be applied with consideration for journalists’ safety in situations involving the live reporting of dangerous events?

Issue Considerations:

  • With so much important news content, especially live reporting, flowing through social media platforms, what can be done to prevent policy enforcement (erroneous or otherwise) from unduly impacting the flow of vital information?
  • Since high-profile enforcement and reversal decisions by platforms are often influenced by widespread public attention and pressure, how can less prominent reporters and other content creators protect themselves?

Resolution: Though the account restriction was quickly reversed by Twitter, many observers did not accept the company’s explanation that it was an error, instead saying the incident was part of a broader pattern of social media platforms censoring Palestinians. Barghouti said:

“I think if I was not someone with visibility on social media, that this would not have garnered the attention it did. The issue isn’t the suspension of my account, rather the consideration that Palestinian accounts have been censored generally but especially these past few weeks as we try to document Israeli aggressions on the ground.”

Posted on Techdirt - 1 December 2021 @ 03:40pm

Content Moderation Case Study: Discord Adds AI Moderation To Help Fight Abusive Content (2021)

Summary: In the six years since Discord debuted its chat platform, it has seen explosive growth. Over that time, Discord’s chat options have expanded to include GIFs, video, audio, and streaming. This growth and these expanded offerings have brought a number of new moderation challenges and required adapting to changing scenarios.

Discord remains largely text-based, but even when limited to its original offering — targeted text-oriented forums/chat channels — users were still subjected to various forms of abuse. And, because the platform hosted multiple users on single channels, users sometimes found themselves targeted en masse by trolls and other malcontents. While Discord often relies on the admins of servers to handle moderation on those servers directly, the company has found that it needs to take a more hands-on approach to handling content moderation.

Discord’s addition of multiple forms of content created a host of new content moderation challenges. While it remained text-based, Discord was able to handle moderation using a blend of AI and human moderators.

Some of the moderation load was handed over to users, who could perform their own administration to keep their channels free of content they didn’t like. For everything else (meaning content that violates Discord’s guidelines), the platform offered a mixture of human and AI moderation. The platform’s Trust & Safety team handled content created by hundreds of millions of users, but its continued growth and expanded offerings forced the company to find a solution that could scale to meet future demands.

To continue to scale, Discord ended up purchasing Sentropy, an AI company launched just a year earlier with the goal of building AI tools to help companies moderate disruptive behavior on their platforms. Just a few months prior to the purchase, Sentropy had launched its first consumer-facing product, an AI-based tool to help Twitter users weed out and block potentially abusive tweets. After being purchased, however, Sentropy shut down the tool and is now focused on building out its AI content moderation tools for Discord.

Discord definitely has moderation issues it needs to solve — which range from seemingly-omnipresent spammers to interloping Redditors with a taste for tasteless memes — but it remains to be seen whether the addition of another layer of AI will make moderation manageable.

Company Considerations:

  • What advantages can outside services offer above what platforms can develop on their own? 
  • What are the disadvantages of partnering with a company whose product was not designed to handle a platform’s specific moderation concerns?
  • How do outside acquisitions undermine ongoing moderation efforts? Conversely, how do they increase the effectiveness of ongoing efforts? 
  • How should platforms handle outside integration of AI moderation as it applies to user-based moderation efforts by admins running their own Discord servers?
  • How much input should admins have in future moderation efforts? How should admins deal with moderation calls made by AI acquisitions that may impede efforts already being made by mods on their own servers?

Issue Considerations:

  • What are the foreseeable negative effects of acquiring content moderation AI designed to handle problems observed on different social media platforms?
  • What problems can outside acquisitions introduce into the moderation platform? What can be done to mitigate these problems during integration?
  • What negative effect can additional AI moderation efforts have on “self-governance” by admins entrusted with content moderation by Discord prior to acquisition of outside AI?

Resolution: So far, the acquisition has yet to produce much controversy. Indeed, Discord as a whole has managed to avoid many of the moderation pitfalls that have plagued other platforms of its size. Its most notorious action to date was its takeover of the WallStreetBets server as it went supernova during a week or two of attention-getting stock market activity. An initial ban was rescinded once the server’s own moderators began removing content that violated Discord guidelines, accompanied by Discord’s own moderators who stepped in to handle an unprecedented influx of users while WallStreetBets continued to make headlines around the nation.

Other than that, the most notable moderation efforts were made by server admins, rather than Discord itself, utilizing their own rules which (at least in one case) exceeded the restrictions on content delineated in Discord’s terms of use.

Originally posted to the Trust & Safety Foundation website.

Posted on Techdirt - 24 November 2021 @ 03:31pm

Content Moderation Case Studies: Facebook Suspends Account For Showing Topless Aboriginal Women (2016)

Summary: Facebook’s challenges in moderating “nudity” have been covered many times, but part of the reason the discussion comes up so often is that there are so many scenarios to consider that it is difficult to create policies covering them all.

In March of 2016, activist Celeste Liddle gave the keynote address at the Queen Victoria Women’s Centre’s annual International Women’s Day event. The speech covered many aspects of the challenges facing aboriginal women in Australia, and mentioned in passing that Liddle’s Facebook account had been repeatedly suspended for posting images of topless aboriginal women that were shown in a trailer for a TV show.

“I don’t know if people remember, but last year the Indigenous comedy show 8MMM was released on ABC. I was very much looking forward to this show, particularly since it was based in Alice and therefore I knew quite a few people involved.

“Yet there was controversy because when 8MMM released a promotional trailer for the show prior to it going to air. This trailer was banned by Facebook because it featured topless desert women painted up for ceremony engaging in traditional dance.

“Facebook saw these topless women as “indecent” and in violation of their no nudity clause. On hearing this, I was outraged that Arrernte woman undertaking ceremony could ever be seen in this way so I posted the trailer up on my own page stating as such.

“What I didn’t count on was a group of narrow-minded little white men deciding to troll my page so each time I posted it, I not only got reported by them but I also got locked out and the video got removed.” — Celeste Liddle

New Matilda published a transcript of the entire speech, which Liddle then linked to herself, leading to her account being suspended for 24 hours and New Matilda’s post being removed, highlighting the very point Liddle was making. As she told New Matilda in a follow-up article about the removal and the suspension:

“My ban is because I’ve previously published images of nudity… I’m apparently a ‘repeat nudity poster offender’…

“I feel decidedly smug this morning, because everything I spoke about in my speech on this particular topic just seems to have been proven completely true…

“It’s actually a highly amusing outcome.” — Celeste Liddle

Facebook’s notice to New Matilda claimed that the post was restricted for “nudity” and said that the policy has an exception for content posted for “educational, humorous or satirical purposes,” but did not give New Matilda a way to argue that the usage in the article was “educational.”

Many publications, starting with New Matilda, highlighted the contrast that the same day Liddle gave her speech (International Women’s Day), Esquire released a cover story about Kim Kardashian which featured an image of her naked but partially painted. Both images, then, involved topless women, with their skin partially painted. However, those posting the aboriginal women faced bans from Facebook, while the Kardashian image not only remained up, but went viral.

Company Considerations:

  • How can policies regarding nudity be written to take into account cultural and regional differences?
  • Is there a way to adequately determine if nudity falls into one of the qualified exemptions, such as “educational” use?
  • What would be an effective and scalable appeals process that would allow users like Liddle to inform Facebook of the nature of the content that resulted in the temporary suspension?

Issue Considerations:

  • Questions about moderating “nudity” have been challenging for many websites. Are there reasonable and scalable policies that can be put in place that adequately take context into account?
  • Many websites start out with a “no nudity” policy to avoid having to deal with adult material on their websites. What other factors should any website consider regarding why a more nuanced policy may make more sense?

Resolution: After this story got some attention, Liddle launched a petition asking Facebook to recognize that aboriginal women “practicing culture are not offensive.”

Facebook’s standards are a joke. They are blatantly racist, sexist and offensive. They show a complete lack of respect for the oldest continuing culture in the world. They also show that Facebook continually fails to address their own shortfalls in knowledge. Finally, they show that Facebook is more than willing to allow scurrilous bullying to continue rather than educate themselves. — Celeste Liddle

New Matilda requested comment from Facebook regarding the removal of the link to its story and was told that even if the sharing was for an “awareness campaign,” Facebook still believed the post should be removed because some audiences in Facebook’s “global community” would be “sensitive” to such content. The company also noted that in order to allow its content moderators to apply rules “uniformly,” its policies sometimes need to be “more blunt than we would like.”

“We are aware that people sometimes share content containing nudity for reasons like awareness campaigns, artistic projects or cultural investigations. The reason we restrict the display of nudity is because some audiences within our global community may be sensitive to this type of content – particularly because of cultural background or age. In order to treat people fairly and respond to reports quickly, it is essential that we have policies in place that our global teams can apply uniformly and easily when reviewing content. As a result, our policies can sometimes be more blunt than we would like, and restrict content shared for legitimate purposes. We encourage people to share Celeste Liddle’s speech on Facebook by simply removing the image before posting it.”

Originally posted to the Trust & Safety Foundation website.

Posted on Techdirt - 17 November 2021 @ 03:30pm

Content Moderation Case Study: Game Developer Deals With Sexual Content Generated By Users And Its Own AI (2021)

Summary: Dealing with content moderation involving user generated content from humans is already quite tricky — but those challenges can reach a different level when artificial intelligence is generating content as well. While the cautionary tale of Microsoft’s AI chatbot Tay may be well known, other developers are still grappling with the challenges of moderating AI-generated content.

AI Dungeon wasn’t the first online text game to leverage the power of artificial intelligence. For nearly as long as gaming has been around, attempts have been made to pair players with algorithmically-generated content to create unique experiences.

AI Dungeon has proven incredibly popular with players, thanks to its use of powerful machine learning models created by OpenAI, the latest of which was trained on substantially more data and is capable of generating text that, in many cases, is indistinguishable from content created by humans.

For its first few months of existence, AI Dungeon used an older version of OpenAI’s machine learning software. It wasn’t until OpenAI granted access to the most powerful version of this software (Generative Pre-trained Transformer 3 [GPT-3]) that content problems began to develop.

As Tom Simonite reported for Wired, OpenAI’s monitoring of AI Dungeon input and interaction uncovered some disturbing content being crafted by players as well as by the game’s own AI.

A new monitoring system revealed that some players were typing words that caused the game to generate stories depicting sexual encounters involving children. OpenAI asked Latitude to take immediate action. “Content moderation decisions are difficult in some cases, but not this one,” OpenAI CEO Sam Altman said in a statement. “This is not the future for AI that any of us want.”

Latitude (AI Dungeon’s developer) had used only limited moderation methods during the game’s first few iterations, but its new partnership with OpenAI, and the inappropriate content that surfaced, made it impossible for Latitude to allow this content to remain unmoderated. It was clear that the inappropriate content wasn’t always a case of users feeding input to the AI to steer it towards generating sexually abusive material. Some users reported seeing the AI generate sexual content on its own, without any prompts from players. What may have originally been limited to a few users specifically seeking to push the AI towards creating questionable content had expanded through the AI’s own behavior, which treated all input sources as valid and usable when generating its own text.

Company Considerations:

  • How can content created by a tool specifically designed to iteratively generate content be effectively moderated to limit the generation of impermissible or unwanted content?
  • What should companies do to stave off the inevitability that their powerful algorithms will be used (and abused) in unexpected (or expected) ways? 
  • How should companies apply moderation standards to published content? How should these standards be applied to content that remains private and solely in the possession of the user?
  • How effective are blocklists when dealing with a program capable of generating an infinite amount of content in response to user interaction?

Issue Considerations:

  • What steps can be taken to ensure a powerful AI algorithm doesn’t become weaponized by users seeking to generate abusive content?

Resolution: AI Dungeon’s first response to OpenAI’s concerns was to implement a blocklist that would prevent users from nudging the AI towards generating questionable content, and prevent the AI from creating this content in response to user interactions.

Unfortunately, this initial response generated a number of false positives and many users became angry once it was apparent that their private content was being subjected to keyword searches and read by moderators.
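A naive keyword blocklist of the kind described above helps explain why false positives were so common: plain substring matching flags innocent words that merely contain a blocked term. This is a minimal sketch; the terms and matching logic are purely illustrative, not Latitude’s actual implementation.

```python
# Illustrative sketch of a keyword blocklist, showing why naive
# substring matching over-blocks. All terms here are hypothetical
# examples, not Latitude's actual list.

BLOCKLIST = {"ass", "kill"}

def substring_flag(text: str) -> bool:
    """Flag if any blocked term appears anywhere, even inside a word."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKLIST)

def word_flag(text: str) -> bool:
    """Flag only whole-word matches, producing fewer false positives."""
    words = text.lower().split()
    return any(term in words for term in BLOCKLIST)

story = "The assassin passed the castle gate."
print(substring_flag(story))  # True: "ass" appears inside "assassin" and "passed"
print(word_flag(story))       # False: no blocked term appears as a whole word
```

Even the whole-word variant still over-blocks legitimate fiction (a fantasy story can mention “kill” harmlessly), which is why simple keyword filters tend to trade one moderation failure for another.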

AI Dungeon’s creator made tweaks to the filters in hopes of mitigating the collateral damage. Finally, Latitude arrived at a solution that addressed over-blocking while still retaining access to OpenAI’s algorithm. This is from the developer’s latest update on AI Dungeon’s moderation efforts, published in mid-August 2021:

We’ve agreed upon a new approach with OpenAI that will allow us to shift AI Dungeon’s filtering to have fewer incorrect flags and allow users more freedom in their experience. The biggest change is that instead of being blocked from playing when input triggers OpenAI’s filter, those requests will be handled by our own AI models. This will allow users to continue playing without broader filters that go beyond Latitude’s content policies.

While the fix addressed the overblocking problem, it did create other issues for players, as AI Dungeon’s developer acknowledged in the same post: users shunted to AI Dungeon’s own models would see lower performance due to slower processing. On the other hand, routing around OpenAI’s filtering system would give AI Dungeon users more flexibility when crafting stories and limit false flags and account suspensions.
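The routing change Latitude describes amounts to a simple fallback pattern: input that trips the external provider’s filter is served by an in-house model instead of being rejected outright. The function names and filter below are placeholders for illustration, not Latitude’s or OpenAI’s actual APIs.

```python
# Hypothetical sketch of the fallback routing described above. No real
# OpenAI or Latitude APIs are used; all names are illustrative.

def external_filter_triggered(prompt: str) -> bool:
    # Placeholder for the provider-side content filter.
    return "forbidden" in prompt.lower()

def external_model(prompt: str) -> str:
    # Placeholder for the large, fast third-party model.
    return f"[external model reply to: {prompt}]"

def in_house_model(prompt: str) -> str:
    # Placeholder for the developer's slower fallback model.
    return f"[in-house model reply to: {prompt}]"

def generate(prompt: str) -> str:
    """Route filtered requests to the fallback model instead of blocking."""
    if external_filter_triggered(prompt):
        return in_house_model(prompt)
    return external_model(prompt)

print(generate("enter the tavern"))    # served by the external model
print(generate("a forbidden ritual"))  # rerouted to the in-house model
```

The design trade-off is visible in the sketch: no player request is rejected, but flagged requests land on the slower path, which matches the performance complaints the developer acknowledged.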

Originally posted to the Trust & Safety Foundation website.

Posted on Techdirt - 10 November 2021 @ 03:41pm

Content Moderation Case Study: Electric Truck Company Uses Copyright Claims To Hide Criticism (2020)

Summary: There are many content moderation challenges that companies face, but complications arise when users or companies try to use copyright law as a tool to block criticism. In the US, the laws around content that allegedly infringes a copyright holder’s rights differ from those governing most other types of content, and that creates some interesting challenges in the content moderation space.

Specifically, under Section 512 of the Digital Millennium Copyright Act (DMCA), online service providers who do not wish to be held liable for user-posted material that infringes copyright need to take a few steps to be free of liability. Key among those steps is having a “notice-and-takedown” process, in which a copyright holder can notify the website of allegedly infringing material; and if the website removes access to the work, it cannot be held liable for the infringement.

This process creates a strong incentive for websites to remove content upon receiving a takedown notice, as doing so automatically protects the site. However, this strong incentive for the removal of content has also created a different kind of incentive: those who wish to have content removed from the internet can submit takedown notices claiming copyright infringement, even if the work does not infringe on copyright. This creates an interesting challenge for companies hosting content: determining when a copyright takedown notice has been submitted for illegitimate purposes.

In September of 2020, a research firm released a report alleging that a promotional video from Nikola, an alternative energy truck company, showing its new hydrogen fuel cell truck driving along a highway, was faked: the truck did not move under its own propulsion. As it turned out, the truck did not actually have a hydrogen fuel cell and had instead been filmed rolling downhill; Nikola admitted that it had faked its promotional video. In its response, Nikola conceded that the truck did not move on its own, but still claimed that the original report was “false and defamatory.” While the response does highlight areas where Nikola disagrees with the way the research firm wrote about the company’s efforts, it does not identify any actual “false” statements of fact.

Soon after this, many YouTube creators discovered that their videos about the incident were being removed due to copyright claims from Nikola. While these creators did use some footage of the faked promotional video in their YouTube videos, they noted that doing so was clearly fair use: they were reporting on the controversy and using only a short snippet of Nikola’s faked video, often within much longer videos with commentary.

When asked about the situation, Nikola and YouTube spokespeople seemed to give very different responses. Ars Technica’s Jon Brodkin posted the comments from each side by side:

“YouTube regularly identifies copyright violations of Nikola content and shares the lists of videos with us,” a Nikola spokesperson told Ars. “Based on YouTube’s information, our initial action was to submit takedown requests to remove the content that was used without our permission. We will continue to evaluate flagged videos on a case-by-case basis.”

YouTube offered a different description, saying that Nikola simply took advantage of the Copyright Match Tool that’s available to people in the YouTube Partner Program.

“Nikola has access to our copyright match tool, which does not automatically remove any videos,” YouTube told the [Financial Times]. “Users must fill out a copyright removal request form, and when doing so we remind them to consider exceptions to copyright law. Anyone who believes their reuse of a video or segment is protected by fair use can file a counter-notice.”

Company Considerations:

  • Given the potential liability from not taking down an infringing video, how much should YouTube investigate whether or not a copyright claim is legitimate?
  • Is there a scalable process that will allow the company to review copyright takedowns to determine whether or not they are seeking to take down content for unrelated reasons?
  • What kind of review process should be put in place to handle situations like Nikola’s, where videos that featured the copyrighted material as news or commentary were reported as copyright violations and taken down on improper infringement requests?
  • Improper takedowns can reflect poorly on the platform that removes the content, but removals often make sense to avoid potential liability. Are there better ways to balance these two competing pressures?

Issue Considerations:

  • Copyright is one of the few laws in the US that can be used to pressure a website to take down content. Given that the incentives support both overblocking and false reporting, are there better approaches that might protect speech, while giving companies more ability to investigate the legitimacy of infringement claims?
  • Under the current DMCA 512 structure, users can file a counternotice with the website, but the copyright holder is also informed of this and given 10 days to file a lawsuit. The threat of a lawsuit often disincentivizes counternotices. Are there better systems enabling those who feel wrongfully targeted to express their concerns about a copyright claim?

Resolution: After the press picked up on the story of these questionable takedown notices, many of the YouTube creators found that the takedown demands had been dropped by Nikola.

In July of 2021, nine months after the news broke of the faked videos, Nikola’s founder Trevor Milton was charged with securities fraud by the SEC for the faked videos.

Originally posted to the Trust & Safety Foundation website.
