Wednesday, April 18, 2012

The State of Data Portability in Social Media, Part I – A Closer Look at Facebook


[The following is not a commentary on data portability policies at Facebook; that will be a follow-up to this series. Instead, this article attempts to document the current state of data portability within social media and, in this case, at Facebook in particular.]

Every day, Facebook consumes billions of snippets of people’s lives in the form of freely provided pictures, comments, messages, and more, and stores them away in server facilities scattered throughout the world. This information is added to the large store of information Facebook already holds and is then used to render details of our lives upon request. But beyond the Facebook website, how can a user interact directly with their information?

Facebook’s Personal Archive

Accessing your Facebook information is as simple as visiting your Facebook page, or that of your friends. This structured format is constantly being tweaked to provide what Facebook believes is the best way for you to interact with all of this information. But Facebook also provides a mechanism for you to take your data “offline” by downloading a “personal archive”.

Getting to this result starts easily enough: simply access your account settings from your Facebook page and select “Download a copy of your Facebook data” at the bottom of the GENERAL ACCOUNT SETTINGS tab.

After a bit of security validation, the process begins. Facebook starts gathering your information into your personal archive and emails you when complete. Not all of your information is provided, however, particularly things that involve activity with others. Specific information includes:

Now, I believe that I am a light-to-moderate Facebook user. I do not use it every day, although I do have several hundred “friends” and my Twitter feed auto-posts to Facebook. Still, it took almost a full hour to gather my information and package it up, with neatly organized directories, into a 44MB zip file ready for me to download.

And here’s the content of the zip file representing my personal archive from Facebook:

Now that I have my data on my own computer, I can browse through it without having to be connected to the internet. I have successfully downloaded an offline copy of the data that Facebook allows me to access. There were some issues browsing the data, the biggest being the lack of “pagination”: when I tried to view all my messages, the browser locked up trying to render so much information at once.

From a pure data portability perspective, this process is more of a “backup” of data than true data portability. The information provided is pre-formatted into HTML documents that make it easy to interact with IN THE FORMAT CHOSEN BY FACEBOOK; however, much of the underlying data is unavailable to non-programmers.

I had hoped to see additional formats in the archive, or even just one, other than formatted HTML. JSON would have been my first choice and, in fact, there was a time when Facebook did provide this option, but alas no more.
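
For readers comfortable with a little scripting, the gap between formatted HTML and JSON can be narrowed by hand. The following is only a minimal sketch in Python: it pulls the visible text out of an HTML page and writes it out as JSON. It makes no assumptions about the archive's actual file names or markup; the sample page, the `TextExtractor` class, and the `html_to_json` function are hypothetical names introduced for illustration.

```python
import json
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the visible text of an HTML page, skipping scripts and styles."""

    def __init__(self):
        super().__init__()
        self.chunks = []   # text fragments in document order
        self._skip = 0     # depth inside <script>/<style> elements

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip:
            self.chunks.append(text)


def html_to_json(html_text):
    """Return a JSON string holding the text fragments found in html_text."""
    parser = TextExtractor()
    parser.feed(html_text)
    parser.close()
    return json.dumps({"text": parser.chunks}, indent=2)


# Hypothetical stand-in for one page of an archive (not real archive markup).
SAMPLE_HTML = "<html><body><h1>Wall</h1><p>Hello world</p></body></html>"

if __name__ == "__main__":
    print(html_to_json(SAMPLE_HTML))
```

This recovers only the text, not the structure behind it (who posted what, and when), which is exactly the information a native JSON export would preserve.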

Facebook’s Graph API

Programmers have considerably more options through Facebook’s extensive SOCIAL GRAPH API and related tools and resources, but this is not for the average user. Most applications today that integrate with Facebook are doing so in one form or another through the API interface (or one of its related components or plugins).

Interaction with graph data is extensive. From Facebook’s Developer pages:

“The Graph API presents a simple, consistent view of the Facebook social graph, uniformly representing objects in the graph (e.g., people, photos, events, and pages) and the connections between them (e.g., friend relationships, shared content, and photo tags).”

Facebook has also provided public access to the GRAPH API through the use of its RESTful interface. This makes it extremely easy to gather specific information from the social graph simply by referencing a web address, as in:
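
A minimal sketch of this idea in Python, assuming the `https://graph.facebook.com/<id>` address pattern and the `fields` and `access_token` query parameters. The object name `example.page` and the sample response are hypothetical, and the sketch deliberately makes no network call, since what the API returns (and whether a token is required) depends on Facebook's current access rules.

```python
import json
from urllib.parse import quote, urlencode

GRAPH_BASE = "https://graph.facebook.com"  # assumed base address


def graph_url(object_id, fields=None, access_token=None):
    """Build the web address for a single object in the social graph."""
    url = f"{GRAPH_BASE}/{quote(str(object_id), safe='')}"
    params = {}
    if fields:
        params["fields"] = ",".join(fields)      # limit the returned fields
    if access_token:
        params["access_token"] = access_token    # needed for non-public data
    if params:
        url += "?" + urlencode(params)
    return url


def parse_graph_response(body):
    """Decode a JSON response body, raising if it carries an error object."""
    data = json.loads(body)
    if isinstance(data, dict) and "error" in data:
        raise RuntimeError(data["error"].get("message", "unknown error"))
    return data


# Hypothetical response body, used here in place of a live request.
SAMPLE = '{"id": "12345", "name": "Example Page"}'

if __name__ == "__main__":
    print(graph_url("example.page", fields=["id", "name"]))
    print(parse_graph_response(SAMPLE)["name"])
```

Because each object lives at its own address and comes back as JSON, the same two steps (build an address, decode the response) apply whether the object is a person, a photo, an event, or a page.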

3rd Party Alternatives to Facebook’s Personal Archive

Facebook itself is not the only option for users interested in extracting their information. One of the most interesting alternatives is a site called (with an accompanying Facebook App), built by assistant professor of art Owen Mundy of Florida State University. It essentially provides a user interface to many of the programmatic aspects of the Facebook Graph API.


A variety of other options exist that can help users access and interact with their information, but ultimately Facebook has the biggest opportunity, as well as responsibility, to see these initiatives through.

Coming up next: The State of Data Portability in Social Media, Part II – A Closer Look at Google.

-- Steve Repetti