Forensics, Privacy, Security

Last session before closing keynote. On to the summaries of forensics, privacy, and security!

Questions posed: What is the proper boundary between public and private data? How far should archivists go in collecting what might be private data?

Session moderated by Elizabeth Churchill (Yahoo! Research)

Archival Applications of Digital Forensics Tools and Techniques or Why I started reading Forensic Cop Journal
Kam Woods (University of North Carolina)

Parallels between forensics and archiving: case files and archival packaging; forensics exploits private data to support criminal prosecution, while archives identify private and legally encumbered data in order to redact or protect it; etc.

Archives increasingly have to deal with ingesting heterogeneous fixed and removable media. Need to ensure reliable data extraction and reduce hidden risk. Need to know what you’re given. Want to establish “ground truth” about what is on the media. Looking at residual and system data.

Handle issues of privacy via forensic techniques such as cryptographic hashing and unique identifiers. Working with the bulk_extractor tool to process data with proactive detection and decompression, stream processing during disk imaging, and parallelized processing. Also using fiwalk, which creates DFXML from disk images using SleuthKit, creates Dublin Core metadata for files, and performs file-level hashing. Tools, APIs, etc. available at
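To make the idea of file-level hashing concrete, here is a minimal sketch in Python. It is not how fiwalk or bulk_extractor are implemented. It only illustrates the general technique of fingerprinting every file in a directory tree so that later copies can be checked against the original. The function names hash_file and hash_tree and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def hash_file(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of one file, reading it in fixed-size chunks.

    Hypothetical helper for illustration; the algorithm name is passed
    straight to hashlib.new, so any algorithm hashlib supports will work.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root, algorithm="sha256"):
    """Map each regular file under root (as a relative POSIX path) to its digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hash_file(p, algorithm)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
```

Running hash_tree on a mounted or extracted copy of the media at ingest, and again later, gives two dictionaries that can be compared to see whether any file has changed, gone missing, or been added.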

Developing other projects mentioned in an earlier session: BitCurator and Realistic corpora for archival education and training.

The Personal in the Organizational: Value and Ethics
Sam Meister (University of Maryland)

Discussion of the issues and implications of personal data embedded in the records of failed businesses. The talk frames privacy as an ethical matter. Part of a larger research endeavor.

Sherwood is a restructuring firm that offers a private bankruptcy option. Sherwood becomes the new owner of the company and winds it down (managed liquidation). As a result, it also ends up holding a lot of private data and records.

Records of start-ups are messy. No records management. “These small organizations are like big people”: both are a bit messy and disorganized. Lots of personal, private data goes to Sherwood when Sherwood is given the company.

No comprehensive legislation when it comes to preserving personal data. Therefore it is an ethical issue. Can look at codes of ethics: privacy statements from the Society of American Archivists and the International Council on Archives.

Selection and appraisal: How can we collect these records? Difficult to know how much personal information is in the records and where it is in the files. Options include redaction or not collecting employee records; each method has downsides.
Access: How do we give access to the records? How do we keep the private data private?

Transition of private records to public. The rights involved are complex, and there are ethical questions about access and privacy (or the tension between the two). All about maintaining trust.

Takeaway: Many issues to think about with regard to forensics, privacy, and access. More questions than answers at this point, but I definitely learned more about valuable projects that further our understanding of, and practical ability to deal with, the records coming to our archives.