from the not-how-it-works dept
Content moderation at scale is impossible. This time, it’s email content moderation. This week a new publication called The Markup launched. It’s a super smart group of folks who are doing deep data-driven investigative reporting of companies in and around the tech space — and I’m very excited to see what they do. I was going to write about the project overall and its goals, but instead I’m going to write about one of its first stories, done in partnership with the Guardian, entitled Swinging the Vote?, and which looks at Gmail’s filtering system, specifically as it regards political emails from Presidential candidates.
A few years back, Google added the “Promotions” tab to Gmail as a way of automagically sorting emails that are not quite spam, but are general promotional messages you probably don’t want cluttering up your inbox. Personally, I don’t use it, as I use a different filtering setup entirely that overrides Gmail’s defaults. However, for many people it’s proven to be quite useful. The reporters at The Markup conducted a worthwhile experiment:
The Markup set up a new Gmail account to find out how the company filters political email from candidates, think tanks, advocacy groups, and nonprofits.
We found that few of the emails we’d signed up to receive, 11 percent, made it to the primary inbox, the first one a user sees when opening Gmail and the one the company says is “for the mail you really, really want.”
Half of all emails landed in a tab called “promotions,” which Gmail says is for “deals, offers, and other marketing emails.” Gmail sent another 40 percent to spam.
Very interesting! What was perhaps even more interesting was the chart — which quickly rocketed around social media — showing that some candidates had their emails go into the Primary Inbox at a much, much higher rate than others:
You’ll notice a few standouts. 63% of Pete Buttigieg’s emails made it to the Inbox, as did 47% of Andrew Yang’s. Everyone else was much closer to 0% with quite a few — including both Elizabeth Warren and Joe Biden — at 0%.
The reporters at The Markup also published a companion piece that gives the details of how they went about doing this research and (yes!) they even provide the data and the code on GitHub. This is a fantastic and transparent way of doing such journalism, and I applaud them for that.
However, the very framing of the original story itself… is problematic. It’s one thing to be open about how you conducted the research. But starting with a title like “Swinging the vote” and highlighting the chart above almost immediately resulted in lots of people on Twitter assuming (or suggesting) that Google was doing this deliberately, and that they were purposely making the decision to tilt the playing field towards Buttigieg. This includes vocal big tech critic Roger McNamee, who declared this was evidence that “Gmail has its thumb in the scale.” Another Google critic, who is fond of misleading conspiracy theories about the company, called it “election meddling” and claims that Google was giving certain candidates “special treatment.”
Except… that’s almost certainly not the case. No one at Google on the Gmail spam team is thinking about promoting one Presidential candidate over another. Instead, this is just yet another example of Masnick’s Impossibility Theorem, but applied to email moderation, rather than social media. Content moderation at scale is impossible to do well and will always piss off some people.
Indeed, looking over the data, the most obvious and most likely explanation is simply this: Buttigieg and Yang hired competent email marketers who know how to craft emails that are (1) less likely to trigger the algorithm, and (2) less likely to be marked as spam by users (an important signal that feeds back into the algorithm). The rest of the candidates… did not. And thus, their emails went to the promotions and spam folders because they had characteristics that are more closely associated with promotions and spam. And yet, The Markup story doesn’t bother to get into any of that, and thus leaves the speculation wide open, allowing plenty of folks to leap in.
Again, I’m super excited about The Markup as a project and believe it will put out plenty of important and impactful journalism in the days, weeks, months and years to come. I recommend people read over The President’s Letter from the site’s President Nabiha Syed (a past podcast guest) and Editor’s Letter from Julia Angwin — both of which present a compelling vision of what The Markup will be.
But this story shows how important context is in presenting a story. This is a data-driven story, which is great. But if the necessary context is not provided, especially on a topic so fraught with speculation, people are going to rush in and jump to conclusions. The Markup itself did not directly say that Google was doing this deliberately, but its total failure to suggest why this might be happening, along with a cringe-worthy headline, opened the door for others to jump in and assume as much. And that’s a shame.