Facebook Has Many Sins To Atone For, But 'Selling Data' To Cambridge Analytica Is Not One Of Them

from the let's-at-least-be-accurate dept

Obviously, over the past few days there's been plenty of talk about the big mess concerning Cambridge Analytica using data on 50 million Facebook users. And, with that talk has come all sorts of hot takes and ideas and demands -- not all of which make sense. Indeed, it appears that there's such a rush to condemn bad behavior that many are not taking the time to figure out exactly what bad behavior is worth condemning. And that's a problem. Because if you don't understand the actual bad behavior, then your "solutions" will be misplaced. Indeed, they could make problems worse. And... because I know that some are going to read this post as a defense of Facebook, let me be clear (as the title of this post notes): Facebook has many problems, and has done a lot of bad things (some of which we'll discuss below). But if you mischaracterize those "bad" things, then your "solutions" will not actually solve them.

One theme that I've seen over and over again in discussions about what happened with Facebook and Cambridge Analytica is the idea that Facebook "sold" the data it had on users to Cambridge Analytica (alternatively that Cambridge Analytica "stole" that data). Neither is accurate, and I'm somewhat surprised to see people who are normally careful about these things -- such as Edward Snowden -- harping on the "selling data" concept. What Facebook actually does is sell access to individuals based on their data and, as part of that, open up the possibility for users to give some data to companies, but often unwittingly. There's a lot of nuance in that sentence, and many will argue that for all reasonable purposes "selling data" and my much longer version are the same thing. But they are not.

So before we dig into why they're so different, let's point out one thing that Facebook deserves to be yelled at over: it does not make this clear to users in any reasonable way. Now, perhaps that's because it's not easy to make this point, but, really, Facebook could at least do a better job of explaining how all of this works. Now, let's dig in a bit on why this is not selling data. And for that, we need to talk about three separate entities on Facebook. First are advertisers. Second are app developers. Third are users.

The users (all of us) supply a bunch of data to Facebook. Facebook, over the years, has done a piss poor job of explaining to users what data it actually keeps and what it does with that data. Despite some pretty horrendous practices on this front early on, the company has tried to improve greatly over the years. And, in some sense, it has succeeded -- in that users have a lot more granular control and ability to dig into what Facebook is doing with their data. But, it does take a fair bit of digging and it's not that easy to understand -- or to understand the consequences of blocking some aspects of it.

The advertisers don't (as is all too commonly believed) "buy" data from Facebook. Instead, they buy the ability to put ads into the feeds of users who match certain profiles. Again, some will argue this is the same thing. It is not. From merely buying ads, the advertiser gets no data in return about the users. It just knows what sort of profile info it asked for the ads to appear against, and it knows some very, very basic info about how many people saw or interacted with the ads. Now, if the ad includes some sort of call to action, the advertiser might then gain some information directly from the user, but that's still at the user's choice.
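To make the distinction concrete, here is a minimal toy sketch of "selling access, not data." Everything in it (the `User` and `AdReport` types, the `run_campaign` function, the sample users) is hypothetical and made up for illustration; it is not Facebook's actual ad system or API. The point is only the shape of the exchange: the matching happens on the platform's side, and the advertiser gets back totals rather than user records.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    age: int
    interests: frozenset = frozenset()


@dataclass
class AdReport:
    """What the advertiser gets back: aggregate counts, no user records."""
    impressions: int
    clicks: int


def run_campaign(users, min_age, interest, clicked_by):
    """Hypothetical platform-side matching.

    The advertiser supplies targeting criteria; the platform finds the
    matching users internally and returns only totals.
    """
    audience = [u for u in users if u.age >= min_age and interest in u.interests]
    clicks = sum(1 for u in audience if u.name in clicked_by)
    return AdReport(impressions=len(audience), clicks=clicks)


# Made-up sample data held by the platform, never handed to the advertiser.
users = [
    User("alice", 34, frozenset({"hiking", "politics"})),
    User("bob", 19, frozenset({"gaming"})),
    User("carol", 41, frozenset({"politics"})),
]

report = run_campaign(users, min_age=21, interest="politics", clicked_by={"carol"})
print(report)  # AdReport(impressions=2, clicks=1)
```

The advertiser in this sketch learns that two people saw the ad and one clicked. It never learns that those people were "alice" and "carol" unless one of them chooses to follow the call to action and hand over information directly.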

The app developer ecosystem is a bit more complicated. Back in April of 2010, Facebook introduced the Open Graph API, which allowed app developers to hook into the data that users were giving to Facebook. Here's where "things look different in retrospect" comes into play. The original Graph API allowed developers to access a ton of information. In retrospect, many will argue that this created a privacy nightmare (which, it kinda did!), but at the same time, it also allowed lots of others to build interesting apps and services, leveraging that data that users themselves were sharing (though, not always realizing they were sharing it). It was actually a move towards openness in a manner that many considered benefited the open web by allowing other services to build on top of the Facebook social graph.

There is one aspect of the original API that does still seem problematic -- and really should have been obviously problematic right from the beginning. And this is another thing that it's entirely appropriate to slam Facebook for not comprehending at the time. As part of the API, developers could not only get access to all this information about you... but also about your friends. Like... everything. From the original Facebook page, you can see all the "friend permissions" that were available. These are better summarized in the following chart from a recent paper analyzing the "collateral damage of Facebook apps."

If you can't read that... it's basically a ton of info from friends, including their likes, birthdays, activities, religion, status updates, interests, etc. You can kind of understand how Facebook ended up thinking this was a good idea. If an app developer was designing an app to provide you a better Facebook experience, it might be nice for that app to have access to all that information so it could display it to you as if you were using Facebook. But (1) that's not how this ever worked (and, indeed, Facebook went legal against services that tried to provide a better Facebook experience) and (2) none of this was made clear to end-users -- especially the idea that in sharing your data with your friends, they might cough up literally all of it to some shady dude pushing a silly "personality test" game.
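A rough sketch of why friend permissions were the real problem, under loudly stated assumptions: the `PROFILES` data, the `FRIEND_FIELDS` list and the `app_fetch` function are all invented for illustration and do not reproduce the real Graph API, its endpoints or its permission names. The `friend_permissions` flag simply models the difference between an API that hands over friend data and one that does not.

```python
# Toy social graph with made-up users; each profile holds fields the user
# shared with friends, plus a friends list.
PROFILES = {
    "alice": {"likes": ["hiking"], "birthday": "04-12", "friends": ["bob", "carol"]},
    "bob": {"likes": ["gaming"], "birthday": "09-30", "friends": ["alice"]},
    "carol": {"likes": ["politics"], "birthday": "01-05", "friends": ["alice"]},
}

# Hypothetical subset of the fields an app could request.
FRIEND_FIELDS = ("likes", "birthday")


def app_fetch(installer, friend_permissions=True):
    """Return everything a hypothetical app can read after ONE user installs it."""
    me = PROFILES[installer]
    visible = {installer: {f: me[f] for f in FRIEND_FIELDS}}
    if friend_permissions:
        # Only the installer's consent is checked; the friends are never asked.
        for friend in me["friends"]:
            visible[friend] = {f: PROFILES[friend][f] for f in FRIEND_FIELDS}
    return visible


with_friends = app_fetch("alice", friend_permissions=True)
without_friends = app_fetch("alice", friend_permissions=False)
print(sorted(with_friends))     # ['alice', 'bob', 'carol']
print(sorted(without_friends))  # ['alice']
```

In the sketch, "bob" and "carol" never installed anything, yet their likes and birthdays end up in the app's hands because "alice" took a quiz. That is the part nobody made clear to end users.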

But, of course, as I noted in my original post, in some cases, this setup was celebrated. When the Obama campaign used the app API this way to reach more and more people and collect all the same basic data, it was celebrated as being a clever "voter outreach" strategy. Of course, the transparency levels were different there. Users of the Obama app knew what they were supporting -- though perhaps didn't realize they were revealing a lot of friend data at the same time. Users of Cambridge Analytica's app... just thought they were taking a personality quiz.

And that brings us to the final point here: Cambridge Analytica, like many others, used this setup to suck up a ton of data, much of it from friends of people who agreed to install a personality test app (and, a bunch of those users were actually paid via Mechanical Turk to basically cough up all their friends' data). There are reasonable questions about why Facebook set up its API this way (though, as noted above, there were defensible, if short-sighted, reasons). There are reasonable questions about why Facebook wasn't more careful about watching what apps were doing with the data they had access to. And, most importantly, there are reasonable questions about how transparent Facebook was to its end users through all of this (hint: it was not at all transparent).

So there are plenty of things that Facebook clearly did wrong, but it wasn't about selling data to Cambridge Analytica and it wasn't Cambridge Analytica "stealing" data. The real problem was in how all of this was hidden. It comes back to transparency. Facebook could argue that this information was all "public" -- which, uh, okay, it was, but it was not public in a way that the average Facebook user (or even most "expert" Facebook users) truly understood. So if we're going to bash Facebook here, it should be for the fact that none of this was clear to users.

Indeed, even though Facebook shut down this API in April of 2015 (after deprecating it in April of 2014), most users still had no idea just how much information Facebook apps had about them and their friends. Today, the new API still coughs up a lot more info than people realize about themselves (and, again, that's bad and Facebook should improve that), but no longer your friends' data as well.

So slam Facebook all you want for failing to make this clear. Slam Facebook for not warning users about the data they were sharing -- or that their friends could share. Slam Facebook for not recognizing how apps were sucking up this data and the privacy implications related to that. But don't slam Facebook for "selling your data" to advertisers, because that's not what happened.

I was going to use this post to also discuss why this misconception is leading to bad policy prescriptions, but this one is long enough already, so stay tuned for that one next. Update: And here's that post.

Filed Under: apps, breach, data, selling data, social media, transparency, users
Companies: cambridge analytica, facebook

Reader Comments


  1. 3D Face Analysis, 21 Mar 2018 @ 10:49am

    Facebook shouldn't own the "copyright" to the database...

    The real problem is that Facebook claims de facto ownership of the user data. In reality the owners of the data are the authors themselves, not Facebook. In copyright law, the owners of the data are the authors themselves, unless the copyright is transferred between parties. However, the only way that copyright could be transferred between parties is by a signed document, or in the case of work-for-hire law, the employer owns the employee's copyrighted work. Virtually none of its users transferred copyright to Facebook that way. (Except for the extremely tiny 0.00000001% of data posted by Zuckerberg himself or his employees.) The users have not transferred their copyrights to Facebook. Therefore, Facebook does not own the data.

    Only the owners of the copyright could sue people. However, because Facebook does not actually own the data, Facebook could not sue. People should be legally allowed to scrape Facebook public data without worrying about copyright infringement.

    However, this is not the case. Facebook sued Power Ventures for scraping data from Facebook. Facebook, like LinkedIn, claims that only they can sell data, and sues anyone else for selling "their" data without their authorization.

    The web should be open. People should be free to scrape public data without getting sued. In this way more innovation and competition will take place.

    Facebook might argue that they own the "collective work" for the selection or the arrangement of the data. But still they do not own it. Facebook does not "select" or "arrange" the user's data. It's not like there are moderators on Facebook who pick and choose which posts will be allowed to be published on the site. Facebook is NOT a "publisher". Facebook is NOT a "content publisher" by any means. Facebook is instead a service provider, like Gmail. It's absurd to believe that Gmail owns the copyright of all of your email messages. It should be the same for Facebook. Facebook does not own the messages. Facebook does not own the database, because, like Gmail, they are not the author of the data. The authors of the database are the users themselves, not Facebook. Facebook does not "manage" or "compile" the data in the database. Please stop calling Facebook a "publisher", because it isn't. Facebook is a service provider (like a post office), not a publisher (like a magazine).

    Facebook might argue that they "own" the database because its software "manipulates" the data, but that is still wrong. Yes, the software underlying Facebook might sort data chronologically or alphabetically, but the mere act of sorting data is too trivial for copyright protection. There is no creativity in the mere act of sorting your friends' posts reverse-chronologically in a feed or sorting your friends list alphabetically by name. Sorting or aggregating posts alphabetically or chronologically is a mechanical process that is too trivial for copyright protection.

    See "503.03(a) Works-not originated by a human author."

    In order to be entitled to copyright registration, a work must be the product of human authorship. Works produced by mechanical processes or random selection without any contribution by a human author are not registrable.

    Facebook might claim that it "moderates" the database, so it could claim "ownership" of the database. But this is still flawed. By "moderation" they mean deleting posts that users report as violating its terms of use. For example, deleting posts that advocate violence or posts that infringe copyright. But this is too trivial for copyright protection. Facebook deletes only a tiny minority of posts, and the act of "deleting" certain things from the database is too trivial and does not meet the threshold of originality for copyright protection.

    Facebook might again argue that if they don't claim copyright ownership of the user's data, they wouldn't have as much incentive to develop or maintain their software. But this argument is, again, absurd. It's like claiming that if the post office does not claim copyright ownership of all of the mail that they deliver, the post office wouldn't have any economic incentive to continue to exist.
