The Mirage of Data Portability - Marginal REVOLUTION

In The Facebook Trials: It’s Not “Our” Data, I wrote:

Facebook hasn’t taken our data—they have created it.

…Moreover, it’s the prospect of profits that has led Facebook and Google to invest in the technology and tools that have created “our data.” The more difficult it is to profit from data, the less data there will be. Proposals to require data to be “portable” miss this important point. Try making your Facebook graph portable before joining Facebook.

In an important post, Will Rinehart adds detail:

Contrary to the claims of portability proponents, however, it isn’t data that gives Facebook power.

…Facebook’s technology stack, the suite of technologies that it uses behind the scenes, clearly shows the importance of scaling, as much of the architecture was developed in-house to address the unique problems facing Facebook’s vast troves of data. Facebook created BigPipe to dynamically serve pages faster, Haystack to efficiently store billions of photos, Unicorn for searching the social graph, TAO for storing graph information, Peregrine for querying, and MysteryMachine to help with end-to-end performance analysis. Nearly all of this design is open for others to use, and has been a significant boon to programmers in the ecosystem. The company also invested billions in content delivery networks to quickly deliver video, and it split the cost of an undersea cable with Microsoft to speed up information travel.

The vast investment that Facebook has put into programs for understanding and processing its users’ data points to the fundamental flaw in the argument for data portability.

…Requiring data portability does little to deal with the very real challenges that face the competitors of Facebook, Amazon, and Google. Entrants cannot merely compete by collecting the same kind of data. They need to build better sets of tools to understand information and make it useful for consumers.