Social Network Data: Making Sense of What's Online

There was a bit of confusion about where we were in the program, but we’re all good now, so let’s get into the session notes on social network data. Allons-y!

Open Standards for Social Data Exchange and Archiving
Evan Prodromou (StatusNet)

Talking about social network data and standards.

Classes of social data include: profile data (who user is, contact information, what user likes, etc.), social media (text, images, audio, video, polls, checkins, events, Q&A), social graph (record of relationships and connections), social curation (commenting, tagging, sharing).

Challenges to archiving social network data:
Most social networks limit what users can do to archive their own data, through API access rules, winner-take-all business models, etc.

Motivations for preservation of digital social network data: digital civil liberties, open source implementers, enterprise social networks, and social network federation. There is more pressure now to create open data formats in order to preserve social network data.

Standards used in Social Network Media
FOAF “Friend of a Friend”: RDF-based
RSS and/or Atom
SIOC: RDF-based (pronounced “shock”), works with RSS and FOAF
Portable Contacts aka PoCo: vCard-like, XML
Activity Streams: social media linked; upward compatible with Atom and RSS; JSON version available. Exciting, so keep your eye on it; use in libraries is increasing.

Interesting to hear about standards being used, but presentation was too fast to get down all the important information. Check out the links above for more information.

Recommendations: Produce Activity Streams and consume ActivityStreams, RSS, and Atom.
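To make the "produce Activity Streams" recommendation a little more concrete, here is a minimal sketch of building one activity in the JSON flavor of Activity Streams (the 1.0-style actor/verb/object shape). The helper name make_activity, the example.org identifiers, and the sample note are all my own assumptions for illustration, not anything shown in the talk.

```python
import json
from datetime import datetime, timezone


def make_activity(actor_name, actor_id, verb, object_type, object_id,
                  content, published=None):
    """Build one activity as a plain dict (hypothetical helper).

    The field names follow the actor/verb/object pattern of the JSON
    Activity Streams format; all values are illustrative.
    """
    # Default to "now" and always serialize the timestamp in UTC.
    published = (published or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "published": published.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "actor": {
            "objectType": "person",
            "id": actor_id,
            "displayName": actor_name,
        },
        "verb": verb,
        "object": {
            "objectType": object_type,
            "id": object_id,
            "content": content,
        },
    }


# Made-up example: a user posting a short note.
example = make_activity(
    actor_name="Alice Example",
    actor_id="acct:alice@example.org",
    verb="post",
    object_type="note",
    object_id="http://example.org/notes/1",
    content="Session notes on social network data are up.",
    published=datetime(2012, 1, 1, 12, 0, tzinfo=timezone.utc),
)
print(json.dumps(example, indent=2))
```

Because the result is ordinary JSON, the same record can be handed to another service or written to disk for archiving.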

Charting Collections of Connections in Social Media: Mapping and Measuring Social Media Networks to Find Key Positions and Structures
Marc Smith (Connected Action Consulting Group)

Talking about NodeXL and the fact that most people do not capture information about their networks. People are social and crowds are important. Crowds now gather online (interaction with physical crowds is very interesting too). Social media platforms, where people now come together online, serialize comments.

NodeXL builds a network graph from social media data. For example, it can create graphs from tweets that mention a certain word. You can find some examples of these graphs on Flickr.
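To get a feel for what "a graph from tweets that mention a certain word" means as data, here is a rough Python sketch. It is not how NodeXL works internally (I don't know that); it just assumes tweets arrive as (author, text) pairs, keeps the ones containing a keyword, and turns each @mention into a directed edge. The sample tweets and the function name mention_edges are invented.

```python
import re
from collections import Counter

# Matches @username handles inside tweet text.
MENTION = re.compile(r"@(\w+)")


def mention_edges(tweets, keyword):
    """Return a Counter of (author, mentioned_user) edges.

    Only tweets whose text contains `keyword` (case-insensitive)
    contribute edges; self-mentions are ignored.
    """
    edges = Counter()
    for author, text in tweets:
        if keyword.lower() not in text.lower():
            continue
        for mentioned in MENTION.findall(text):
            if mentioned.lower() != author.lower():
                edges[(author.lower(), mentioned.lower())] += 1
    return edges


# Invented sample data, for illustration only.
sample_tweets = [
    ("alice", "Great #archives talk by @bob and @carol"),
    ("bob", "Thanks @alice! More on archives soon"),
    ("dave", "Lunch time"),
    ("carol", "@bob agreed, archives matter"),
]

edges = mention_edges(sample_tweets, "archives")
for (source, target), weight in sorted(edges.items()):
    print(source, "->", target, weight)
```

The resulting edge list is the kind of table a graphing tool can lay out visually.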

In social networks, the most important thing is “position, position, position.” Archiving connections is possible, but few of the resellers or archives of social media do so. Archiving connections is as important as archiving digital objects (great for contextualization).
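One simple way to put a number on "position" is degree: how many connections point at a node and how many leave it. This is only the most basic measure, and I'm not claiming it is the one the speaker had in mind. The sketch below assumes connections are stored as (source, target) pairs; the names are made up.

```python
from collections import defaultdict


def degree_table(edges):
    """Map each node to (in_degree, out_degree) for a directed edge list."""
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    for source, target in edges:
        out_deg[source] += 1
        in_deg[target] += 1
    nodes = set(in_deg) | set(out_deg)
    return {n: (in_deg.get(n, 0), out_deg.get(n, 0)) for n in nodes}


# Invented connections: who mentions whom.
sample_edges = [
    ("alice", "bob"),
    ("carol", "bob"),
    ("dave", "bob"),
    ("bob", "alice"),
]

table = degree_table(sample_edges)
# The node with the highest in-degree is the one most others point to.
most_mentioned = max(table, key=lambda n: table[n][0])
print(most_mentioned, table[most_mentioned])
```

None of this can be computed later unless the connections themselves are archived, not just the posts.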

NodeXL makes really interesting, sometimes confusing, but cool-looking graphics. My colleague who researches social networks is all over these types of data representations and analyses. Very interesting.

“We envision hundreds of NodeXL data collectors around the world collectively generating a free and open archive of social media network snapshots on a wide range of topics.”

The Social Networks and Archival Context (SNAC) Project
Ray Larson (UC Berkeley)

Dealing with metadata surrounding collections held in archives. Project funded by NEH.

Data from: EAD finding aids from LoC, OAC, Northwest Digital Archive, and Virginia Heritage; authority records from LoC, Getty Vocabulary Program, and Virtual International Authority File; other biographical sources (e.g., DBpedia).

EAD is now complemented by EAC, or Encoded Archival Context: an XML-based standard for descriptions of record creators, i.e., authority control. We want controlled vocabularies because we have the problem of many different names for the same person and the same name for different people. (Are they also adding these authority files to LoC? We need standards, but we don’t need a ton of overlapping standards that leave us deciding which one to use.)
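The name problem is easier to see with a toy example. The sketch below is not EAC and not SNAC's matching method; it is just an assumed, simplified authority file in Python where each record lists a preferred name and known variants, and a lookup returns every record a name could refer to. The record ids and the two "John Smith" entries are invented.

```python
from collections import defaultdict


def normalize(name):
    """Lowercase, drop punctuation, and turn 'Last, First' into 'first last'."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch in ", ")
    if "," in cleaned:
        last, _, first = cleaned.partition(",")
        cleaned = f"{first} {last}"
    return " ".join(cleaned.split())


# Toy authority file; ids and the John Smith records are hypothetical.
authority_records = [
    {"id": "rec-1", "preferred": "Twain, Mark",
     "variants": ["Clemens, Samuel Langhorne", "Mark Twain"]},
    {"id": "rec-2", "preferred": "Smith, John", "variants": [],
     "note": "hypothetical painter"},
    {"id": "rec-3", "preferred": "Smith, John", "variants": [],
     "note": "hypothetical merchant"},
]


def build_index(records):
    """Map each normalized name form to the set of record ids that use it."""
    index = defaultdict(set)
    for record in records:
        for name in [record["preferred"], *record["variants"]]:
            index[normalize(name)].add(record["id"])
    return index


def lookup(index, name):
    """Return the sorted ids of all records that could match this name."""
    return sorted(index.get(normalize(name), set()))


index = build_index(authority_records)
print(lookup(index, "Mark Twain"))          # many names, one person
print(lookup(index, "Samuel Langhorne Clemens"))
print(lookup(index, "John Smith"))          # one name, several people
```

Variant names collapse to one record, while an ambiguous name comes back with several candidates, which is where extra context (dates, occupation, related collections) has to do the disambiguating.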

Very nice-looking interface for the authority files. Nice touch: noting which archives the names in the authority file are derived from. They then use the data to create a pretty infographic of connections (still under development). Check the SNAC website for the latest version to download and try out.

Take away: Connections are super-important and we need sophisticated ways to capture this information. I’m definitely going to download NodeXL and play around with it. If you use it, let me know how you like it.