Don't Let This Get Lost In The Shuffle: The Data Transfer Project Is Expanding, And Could Help Create Real Competition Online

from the this-is-important dept

While lots of people are angling to break up the big internet companies in the belief that this will lead to more competition, we've long argued that such a plan is unlikely to work. If you truly want more competition, you need to end these companies' ability to lock up your data. We need to allow third parties access, so that data is not stuck in silos, and so that users themselves have both control and alternative options they can easily move to.

That's why we were quite interested a year ago when Google, Facebook, Microsoft and Twitter officially announced the Data Transfer Project (which initially began as a Google project, but expanded to those other providers a year ago). The idea was that the companies would make it ridiculously easy to let users automatically transfer their own data (via their own control) to a different platform. While some of the platforms had previously allowed users to "download" all their data, this project was designed to be much more: to make switching from one platform to another much, much easier -- effectively ending the siloing of data and (worse) the lock-in effects that help create barriers to competition. As we noted last year:

But the really important thing that this may lead to is not so much about transferring your data between one of the giant platforms, but hopefully in opening up new businesses which would allow you to retain much greater control over your data, while limiting how much the platforms themselves keep. This is something we've talked about in the past concerning the true power of data portability. Rather than having it tied up in silos connected to the services you use, wouldn't it be much better if I could keep a "data bank" of my data in a place that is secure -- and where if and when I want to I can allow various services to access that data in order to provide the services I want?

In other words, for many years I've complained about how we've lost the promise of cloud computing by just building up giant silos of data connected to the various online services. If we can separate out the data layer from the service layer, then we can get tremendous benefits, including (1) more end-user control over their own data, (2) more competitive services, and (3) less power for the biggest platforms to dominate everything. Indeed, we could even start to move towards a world of protocols instead of platforms.
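To make the idea concrete, here's a hypothetical sketch of such a "data bank": the user holds the data and grants each service only the slices it needs. Every class name, scope, and service name here is invented for illustration, not taken from any real system.

```python
# Hypothetical "data bank" sketch: the user stores data once, and each
# service can only read the scopes the user has explicitly granted.

class DataBank:
    def __init__(self):
        self._records = {}   # scope -> data the user has stored
        self._grants = {}    # service -> set of scopes it may read

    def store(self, scope, data):
        self._records[scope] = data

    def grant(self, service, scope):
        self._grants.setdefault(service, set()).add(scope)

    def read(self, service, scope):
        # Services get nothing unless the user granted that exact scope.
        if scope not in self._grants.get(service, set()):
            raise PermissionError(f"{service} has no access to {scope}")
        return self._records[scope]

bank = DataBank()
bank.store("contacts", ["alice", "bob"])
bank.grant("photo-app", "contacts")

print(bank.read("photo-app", "contacts"))  # only the granted slice
# bank.read("ad-network", "contacts") would raise PermissionError
```

The point of the design is that switching services no longer means moving data: the data stays put, and access is granted or revoked by the user.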

So it's good news to see the latest announcement that the project is expanding once again. While the headlines are focused on Apple joining the program (rounding out the biggest internet companies), it's also notable that two other very interesting, but much smaller, players are joining as well: the federated Mastodon project and Tim Berners-Lee's Solid, which is an attempt to build the kind of "protocols, not platforms" approach that we keep advocating for.

There are still many open questions about how well all of this will work -- but if you believe in true competition among internet services, this is the project to pay attention to, as it has the highest likelihood of actually creating such competition. Plans to "break up" big tech just create a few more data silos and effectively lock in some pre-selected (slightly smaller) giants, thanks to network effects. What the Data Transfer Project does is flip the equation: it makes it so that more competition can thrive without taking away the network effects that make the internet so powerful. It's the most interesting and most compelling approach to generating actual competition among internet services.

I still hope that the project goes even further in knocking down silos and opening things up for competition, but it's already quite encouraging. Of course, it got almost no attention at all, because antitrust is sexy, whereas companies opening themselves up to competition through technological means is apparently boring.



Filed Under: competition, data portability, data transfer project, lock-in, mastodon, solid
Companies: apple, facebook, google, microsoft, solid

Reader Comments

The First Word

That's why we were quite interested a year ago when Google, Facebook, Microsoft and Twitter officially announced the Data Transfer Project (which initially began as a Google project, but expanded to those other providers a year ago). The idea was that the companies would make it ridiculously easy to let users automatically transfer their own data (via their own control) to a different platform.

Have you actually looked at Facebook's "ridiculously easy" data? I downloaded mine a few months ago, and looking at it from the perspective of a programmer, it's garbage. It's exactly what I would do if I wanted to set up a system specifically designed to look like openness to an unskilled outside observer (such as a politician or regulator) while being worthless for the purpose of actually enabling data transfer to a competitor.

The devil, it has been said, is in the details, and when you look at the details of the data Facebook gives you, (and what they don't give you), you definitely see a diabolical entity emerge. The most important subtle little problem is that there are no unique identifiers.

For example, in your Friends data, it gives you the name of each Friend, and a few bits of data they've shared, but no username or other token that identifies them specifically. Then in your Comments data, it says which post you commented on, and the name of the person who posted it... but without a unique identifier you have no way of knowing if this Bob Smith is the same Bob Smith in your Friends list or someone else who happens to have that name.

You may say "well sure, but how likely are you to have two friends by the same name, or to comment on a post by someone with the same name as one of your friends?" And you'd probably be right... but that's exactly what makes this such a subtly evil problem: it looks just fine to any individual user, but if you try to use the data for its primary intended purpose -- to facilitate competition by enabling people to move to a competing system -- the lack of unique identifiers makes it impossible to reconstruct the social graph. If I'm running the MasonBook network and I import data from Dave, Fred, and Janet, and all of them have a friend named Bob Smith, I have no way to determine whether they're all friends with the same person or not.
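Here's a minimal Python sketch of the problem; all names and data below are invented for illustration:

```python
# Reconstructing a social graph from exports that identify friends by
# display name only: distinct people who share a name collapse into
# one node, and the graph comes out wrong.

def merge_by_name(exports):
    """Merge several users' friend lists keyed only by display name."""
    graph = {}
    for owner, friends in exports.items():
        for name in friends:
            graph.setdefault(name, set()).add(owner)
    return graph

def merge_by_id(exports):
    """Same merge, but with (id, name) pairs: no collisions."""
    graph = {}
    for owner, friends in exports.items():
        for uid, name in friends:
            graph.setdefault(uid, (name, set()))[1].add(owner)
    return graph

# Name-only exports: two *different* Bob Smiths in reality.
name_only = {
    "Dave":  ["Bob Smith"],
    "Janet": ["Bob Smith"],
}
print(len(merge_by_name(name_only)))  # 1 node -- the two Bobs merged

# With stable identifiers the two Bobs stay separate.
with_ids = {
    "Dave":  [(101, "Bob Smith")],
    "Janet": [(202, "Bob Smith")],
}
print(len(merge_by_id(with_ids)))  # 2 nodes -- graph preserved
```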

Facebook's "participation" in the Data Transfer Project is nothing but transparency theater, to borrow a concept from the world of security. It's just more of the same from a company that's never bothered to even pretend they're not being evil.

—Mason Wheeler

    Anonymous Coward, 2 Aug 2019 @ 7:51am

    Re: Re:

    Part of what led us to "the cloud" was the drive (starting somewhere in the 80s to 90s I reckon, and ramping up as the Great Information Security Flame War intensified) to firewall networks and take the control of which protocols could reach an endpoint out of that endpoint's hands. This "denied by default" posture, while a good decision from a tactical, information-security standpoint, led to a situation where only a very small set of protocols are allowed to cross the "border posts" between networks. The winners in this were the widely used client-server protocols of the time: HTTP(S), POP3 (largely replaced by IMAP now), DNS (out of sheer necessity), and FTP (in PASV mode). SSH was frequently permitted as well, but not always (due to its tunneling capability or non-inspectability compared to HTTPS); if you were some other protocol, though, you had to be prepared for being unreachable from a significant subset of endpoints (take IRC or BitTorrent for instance) or being restricted to proxied flows (SMTP, which you can only use to reach a border-guard-controlled mail transfer agent). Furthermore, the addition of secondary screening (application-layer security proxies) to these border posts made trying to "masquerade" protocols by hiding them within the normal flows of say HTTPS a dicey effort, at best.

    In addition, the notion of unsolicited inbound traffic was treated as an insecure horror by these newfound border police forces (through the use of stateful firewalling and address translation), further cementing the dominance of centralized, client-server models over decentralized, "host anywhere" protocols. Furthermore, even if you had enough control over the border police to get them to send the unsolicited visitors to you (port forwarding), the relative unwillingness of ISPs to assign static addresses to consumer Internet endpoints posed an additional barrier, requiring the development and use of dynamic DNS updating over HTTP(S) as the mechanisms for this that are native to the DNS protocol will not work in a restrictive environment where DNS is proxied, or DNS servers are otherwise locked down.

    With all these challenges, and the desire of users to access the services they wished coming into conflict with what the border guards were yammering about, everything got multiplexed into valid HTTP running over port 80 (later to be replaced with HTTPS over port 443), stunting the growth of new protocols to environments where a) the application could receive official blessing and paperwork from all the border posts involved (site-to-site VPN protocols, SIP), b) the protocol was never intended to cross a border to begin with (SMB, NFS, and so on, internal chat systems), or c) the protocol was either client-server OR equipped with ways to pierce address translation, AND was intended for use by end consumers only, outside of managed environments (online gaming, "olden days" chat/instant messaging).
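    To make that "denied by default" border posture concrete, here's an illustrative iptables ruleset of the kind described above. Interface names and rule details are invented, not drawn from any real deployment:

```shell
# Illustrative default-deny border: drop everything, then permit only
# the "blessed" client-server protocols to leave the internal network.
iptables -P FORWARD DROP
iptables -A FORWARD -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A FORWARD -i lan0 -p tcp --dport 80  -j ACCEPT   # HTTP
iptables -A FORWARD -i lan0 -p tcp --dport 443 -j ACCEPT   # HTTPS
iptables -A FORWARD -i lan0 -p udp --dport 53  -j ACCEPT   # DNS
# Unsolicited inbound traffic never matches an ACCEPT rule and is
# dropped, so anything "host anywhere" behind this border is
# unreachable -- which is exactly how everything ended up multiplexed
# over ports 80 and 443.
```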
