Content Moderation At Scale Is Impossible; Naughty Kids In Wuhan Edition

from the masnick's-impossibility-theorem dept

I keep trying to point out that content moderation at scale is impossible to do well for a whole variety of reasons, including the fact that sooner or later some people -- or some large groups of people -- may try to game the system in totally unexpected ways. Witness this amusing example from the London Review of Books, reporting on the situation in Wuhan, China, which was ground zero for the Covid-19 coronavirus outbreak. With everything shut down in and around Wuhan, schools have moved to online learning -- and some naughty kids seem to have worked out a way to try to get out of having to do schoolwork: getting the app the schools rely on pulled from the app store via fake negative ratings.

Schools are suspended until further notice. With many workplaces also shut, notoriously absent Chinese fathers have been forced to stay home and entertain their children. Video clips of life under quarantine are trending on TikTok. Children were presumably glad to be off school – until, that is, an app called DingTalk was introduced. Students are meant to sign in and join their class for online lessons; teachers use the app to set homework. Somehow the little brats worked out that if enough users gave the app a one-star review it would get booted off the App Store. Tens of thousands of reviews flooded in, and DingTalk’s rating plummeted overnight from 4.9 to 1.4. The app has had to beg for mercy on social media: ‘I’m only five years old myself, please don’t kill me.’

I must tip my cap to the cleverness here, but the episode shows, yet again, just how difficult content moderation is. No one running an app store or other platform prepares for a situation like this. In this case, at least, it seems likely that with so many negative reviews -- and now press attention -- the platform will take notice and discount the most recent thousands of reviews. But imagine having to keep track of every case where this is happening, often on a much smaller, less obvious scale.

What seems easy about content moderation almost never is. Everyone seems to think it's easy until they're actually running a platform.

Filed Under: content moderation, content moderation at scale, coronavirus, covid-19, dingtalk, ios, remote learning, students, wuhan
Companies: apple


Reader Comments


  • Anonymous Coward, 10 Mar 2020 @ 10:57am

    This is why I still think that an up or down choice works better than a rating. Providing people with a way to amplify their own like (or dislike) of something leads to a skewing of results.

    • Anonymous Coward, 10 Mar 2020 @ 11:54am

      Re:

      It's especially ridiculous that you have to give someone a star to indicate they're terrible. How does that make any sense? Besides that, it's well documented that these ratings skew high. "Got what was promised" should be the average experience with a 50% rating, whereas people are encouraged to give maximum ratings for that.

      • Anonymous Coward, 10 Mar 2020 @ 12:05pm

        Re: Re:

        I actually prefer a four-tier choice. Good or bad, great or terrible. No middle tier "3 star" choice that provides no help. No pointless 10 or 100 point scale that only allows trolls to bring down scores because, as you say, the larger the scale, the higher it tends to skew because people don't normalize their scores. 100 point scales are the worst because people think in terms of school grades, where anything under 60% is usually considered a failure instead of just average.

        • Anonymous Coward, 10 Mar 2020 @ 2:30pm

          Re: Re: Re:

          No middle tier "3 star" choice that provides no help.

          As long as it's unambiguous what people should pick for something that goes well, but is nothing special. That's the usual interaction ("taxi got me where I asked to go, at the listed price"—that's not "5 stars"). I don't even know that "bad" vs. "terrible" is a useful choice; in practice that seems more based on how vindictive someone feels, rather than a real judgment of the product/service.

          Really, I'd say anyone reporting a bad experience should have to identify the type of problem they had. At least whether the problem is in the product, the advertising/listing, or the seller. (What does "1 star" mean in the context of this Wuhan thing? "I didn't find it useful", "wasn't as advertised", "product is malware"—Apple could have very different responses to those.)

  • Anonymous Coward, 10 Mar 2020 @ 11:17am

    Those ratings are super useful when they're honest. But it's far too easy to weaponize them to either artificially boost an app or product's rating or to tear it down. A safer metric might be number of downloads versus number of uninstalls.

    • Samuel Abram, 10 Mar 2020 @ 1:34pm

      Re:

      But even that's not a good metric: I've uninstalled lots of great games on Steam, GOG, and itch.io for the very simple reason that I finished them.

      • urza9814, 10 Mar 2020 @ 1:53pm

        Re: Re:

        Meh, that's still providing useful information about replayability, so it would be a good metric to have in addition to a star rating. But even better I think would be to weight star ratings based on the play/use time. A one star review from someone who has played a hundred hours counts a hundred times more than a one star review from someone who played for less than an hour. And you'd need some expiry method too, so that a five star review from someone who hasn't logged in for a year isn't overruling newer ratings. So maybe [star rating] * [play time - time since last login]...and then normalize that by dividing by the value as if every user had given five stars.
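The weighting proposed in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Review` class, its field names, and the choice to measure both play time and time since last login in hours are hypothetical, and clamping negative weights to zero is an added assumption the comment does not spell out.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Review:
    """Hypothetical review record; field names are illustrative only."""
    stars: int                # 1 to 5
    hours_played: float       # total play/use time
    hours_since_login: float  # time since the reviewer last logged in


def review_weight(review: Review) -> float:
    """Weight = play time minus time since last login.

    Assumption: a negative result is clamped to zero, so a lapsed
    reviewer simply stops counting rather than counting against the app.
    """
    return max(review.hours_played - review.hours_since_login, 0.0)


def weighted_score(reviews: Iterable[Review]) -> Optional[float]:
    """Return a score in (0, 1]: weighted stars divided by the value
    the same reviewers would produce if all had given five stars."""
    reviews = list(reviews)
    total_weight = sum(review_weight(r) for r in reviews)
    if total_weight == 0:
        return None  # no reviewer currently carries any weight
    earned = sum(r.stars * review_weight(r) for r in reviews)
    return earned / (5 * total_weight)
```

With these assumptions, a one-star review from a reviewer with 100 hours of play outweighs a five-star review from a reviewer with one hour, and a reviewer who has been away longer than they ever played contributes nothing.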

        • Tally, 28 Mar 2020 @ 9:02am

          Re: Re: Re:

          I feel like that would then be biased towards the hardcore fans who put hundreds or thousands of hours into a game, instead of just the gamers who played the game enough to be well informed.

          Clearly there'd need to be a cap on how far playtime weighs into it in that hypothetical rating system.

          But really, I'm not sure if such a system would actually work well.
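A cap of the kind this comment asks for might look like the sketch below. The function names and the 20-hour default are hypothetical, chosen only to illustrate the idea; the comment itself does not say where the cap should sit.

```python
from __future__ import annotations


def capped_weight(hours_played: float, cap_hours: float = 20.0) -> float:
    """Weight grows with playtime but stops growing at cap_hours."""
    return min(max(hours_played, 0.0), cap_hours)


def capped_score(
    reviews: list[tuple[int, float]], cap_hours: float = 20.0
) -> float | None:
    """Playtime-weighted average star rating on the 1 to 5 scale.

    Each review is a (stars, hours_played) pair. Returns None when
    no review carries any weight.
    """
    weights = [capped_weight(hours, cap_hours) for _, hours in reviews]
    total = sum(weights)
    if total == 0:
        return None
    weighted_stars = sum(stars * w for (stars, _), w in zip(reviews, weights))
    return weighted_stars / total
```

Under this sketch, a fan with 2,000 hours counts exactly as much as anyone who has reached the cap, so a single devotee can no longer drown out several well-informed players.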

  • urza9814, 10 Mar 2020 @ 11:26am

    The problem isn't the moderation

    I would argue that moderation isn't really the problem here; the problem is schools relying on third-party services that they have no control over and which were designed for a very different use case. It's an easy enough problem to solve: just post an APK on your own website and nobody can take that down but you.

    • Graham J, 10 Mar 2020 @ 11:46am

      Re: The problem isn't the moderation

      And then tell people to side load apps made by whoever that haven't been checked with Google's scanner (assuming they use Android)?

      That solution may be easy, but it isn't good.

      • urza9814, 10 Mar 2020 @ 12:25pm

        Re: Re: The problem isn't the moderation

        If you don't trust the manufacturer of the app, why are you installing it? In most cases I trust them more than I trust Google.

        Google catches the obvious malware, but they're also the delivery system for the less obvious malware. Moderating for viruses is no easier than moderating for content (it's probably harder, as bad content often isn't trying to hide that fact), and malware has gotten through in the past and will in the future. Better to download from a reputable source in the first place rather than downloading any random garbage that pops up in a search result and assuming it's safe.

      • Thad, 10 Mar 2020 @ 12:32pm

        Re: Re: The problem isn't the moderation

        Given that it's China, they're very probably using Android but not Google Play.

  • Anonymous Coward, 10 Mar 2020 @ 11:43am

    Steam

    During the beginning of the whole Epic Store exclusive fuss, Steam saw several games severely tank on review (most notably the newest Metro game)... they managed to put a solution into place as an attempt to mitigate review bombing... it's changed how the review bombing impacts the overall reviews, but still doesn't really solve the problem.
    So there will be "something" done, but overall it's just going to muddy reviews altogether (there are legitimate cases for a sudden spike in poor reviews, like the removal of a feature for no reason... or to put it in a 'higher tier')
    Maybe someone should make a game about reviews :0

  • Graham J, 10 Mar 2020 @ 11:43am

    Actually

    Guess you haven't heard of Wikipedia.

  • Anonymous Coward, 10 Mar 2020 @ 1:38pm

    Click-through "reviews" are rendered pretty much useless by self-selection bias. Basically, the only people who bother to leave them are sycophants and haters giving maximum or minimum rating, respectively, without thinking. The majority, who don't have any strong opinion, usually won't bother to review at all.

    IMO, everyone who uses an app without reviewing should be given a default neutral rating, to represent that silent majority.
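As a rough illustration of this suggestion, the sketch below counts every non-reviewing user as a neutral rating. The function name, the choice of 3.0 as "neutral" on a five-star scale, and the user counts in the usage note are all hypothetical.

```python
from typing import Optional, Sequence


def rating_with_silent_users(
    star_ratings: Sequence[int],
    total_users: int,
    neutral: float = 3.0,
) -> Optional[float]:
    """Average rating where every user who left no review counts as `neutral`.

    star_ratings holds the explicit 1 to 5 star reviews; total_users is
    however the platform chooses to count "users" (an open question).
    """
    silent = max(total_users - len(star_ratings), 0)
    counted = len(star_ratings) + silent
    if counted == 0:
        return None
    return (sum(star_ratings) + neutral * silent) / counted
```

For example, with made-up numbers, 90 one-star and 10 five-star reviews average 1.4 on their own, but if the app has 1,000 users in total the adjusted figure is 2.84, because the 900 silent users pull the result toward the middle.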

    • urza9814, 10 Mar 2020 @ 2:06pm

      Re:

      But how do you define "users"? Is it everyone who ever downloaded the app? Everyone who ever logged in? Everyone who logged in in the past day? Everyone who installed it once upon a time and forgot about it and left it running as a background service? I have a lot of installed apps that could break and I wouldn't notice. I even have some apps that I use daily that could have major bugs in major features and I would never notice because I'm only using one minor piece of the app. I don't want to be giving others a false impression that these apps work well when I really don't know or care.

      As I posted elsewhere on this article, I think a better method would be to weight reviews based on the play/usage time of the user writing the review. So you still only get reviews from people who are actually invested in the app in some way, but one review from a loyal, long-term user will overrule hundreds from people who are just review bombing.

  • Rico R., 10 Mar 2020 @ 1:46pm

    At least dogs can breathe a sigh of relief with this news; they’re no longer being framed for eating homework the kids didn’t do!

  • ECA, 10 Mar 2020 @ 1:55pm

    moderation.

    Isn't moderated.

    There are a lot of things that you can cut, delete, or augment, but moderation tends to be hard.
    Trying to get things What??
    you can get them to moderate
    Speech,
    how things are said and expressed
    How things are done
    But you can't balance things when the group DOESN'T LIKE YOU..

  • Anonymous Coward, 10 Mar 2020 @ 3:38pm

    Communication works well in a community--one where people genuinely care about each other and about the subject. It doesn't work--at all--where people are driven by malice. If neighbors hate each other, if their idea of a good time is to get together to burn cars or churches or Asian grocery stores, then telling them to TALK NICE is going to be rather a waste.

    People who don't have enough local fellow-misanthropes used to go to the big city. Now they just go online. And you can't tell them not to do something because people don't like it. They are doing that thing precisely because people don't like it.

  • Lawrence D’Oliveiro, 10 Mar 2020 @ 5:13pm

    “Impossible” Or Not, It Has To Be Done

    When these companies have the power to make or break livelihoods on a whim, it becomes clear that their behaviour cannot continue unchecked.

    This is why we have Governments and legal systems: to step in and impose rules once the “Wild West” no longer becomes a tenable way to live.

    • That One Guy, 11 Mar 2020 @ 1:23am

      Still nope

      When these companies have the power to make or break livelihoods on a whim, it becomes clear that their behaviour cannot continue unchecked.

      That example of yours does not show what you think it does. While it's certainly aggravating to have trolls and/or puritanical pinheads flagging photos, Facebook is not the one deciding to 'break' anyone's livelihood there. That is simply another example of the difficulty of moderation at large scale, of why you don't put all your eggs in one basket, and yet another instance of a story as old as human civilization: one or more losers deciding to screw with someone using the tools available.

      This is why we have Governments and legal systems: to step in and impose rules once the “Wild West” no longer becomes a tenable way to live.

      Still waiting for you to list exactly what you think the platforms should be forced into doing that will magically make those problems go away. I'll even narrow it down: what 'rules' do you think should have been in place that would have prevented the example you linked above and that wouldn't cause even more damage?

    • Scary Devil Monastery, 11 Mar 2020 @ 4:34am

      Re: “Impossible” Or Not, It Has To Be Done

      "When these companies have the power to make or break livelihoods on a whim, it becomes clear that their behaviour cannot continue unchecked."

      Oh, hey, Baghdad Bob, welcome back. I thought I recognized your usual brand of inflammatory anti-google rhetoric.

      And no, the fact that humans are being human STILL isn't a reasonable excuse to abolish actual freedom of speech. Rumors being harmful is something we've lived with for some time now and we still haven't put a cop in every pub to closely monitor what the patrons are saying.

