- Keynote: "Code for Liberation", Kate Krauss, Tor Project
- 'Can't wait for Perfect: Implementing "Good Enough" Digital Preservation': Shira Peltzman & Alice Prael
- "Enabling Access to Old Wu-Tang Clan Fan Sites: Facilitating Interdisciplinary Web Archive Collaboration": Nick Ruest & Ian Milligan
- "Digital Preservation 101, or, How to Keep Bits for Centuries": Julie Swierczek, Harvard Art Museums
- "Guerrilla Usability Testing & Communicating Value", Eka Grguric
- "Get Your Recon", Christina Harlow
- interviewing activists to find out what they need from libraries
- BLM
- transgender
- personal experience
- 19-year-old college student / sandwich delivery guy shot in his car by Philadelphia police near her house
- endemic problem in Philadelphia & many other cities
- BLM first to surface this in an organized way
- BLM now under social media surveillance by FBI
- cf. Occupy: Homeland Security, Joint Terrorism Task Force, NCIS, IMSI-catchers ("cell site simulators" used by police to intercept mobile phone traffic)
- cf. COINTELPRO & civil rights
- our anti-terrorism agencies have massive budgets to find terrorists, but there just aren't many terrorists to find, so they're finding terrorists where there aren't any
- discrimination -> control of personal information critical to job security, family relationships; can be life or death
- there are political, social prices to pay when we collect data that we don't need
- e.g. Aeon special collection system asks for considerable amounts of personal data, which is then accessible to library staff for generations
- "when you keep that data, you are siding with your institution; you are not necessarily siding with the researcher ... that information can be subpoenaed"
- "what to collect, what to retain, what to distribute, who can see what... these are political decisions, moral decisions"
- ALA position: "the most minimal amount of data for the shortest possible time" (see ALA privacy policy guidelines)
- sysadmins & technicians may think they're not surveillance targets, but Snowden docs indicate GCHQ & others have targeted techs at telecoms as way to get to their customers
- teach workshops, e.g. using Signal for iPhone / Android
- examine your access rules
- "if you don't have it, you won't have to worry about it"
- "don't collect what you can't protect"
- "these are moral choices"
- "if you have privacy at home, but you don't have privacy on line, you have no privacy"
- DC Public Library had a privacy month (dclibrary.org/1984):
- livestreamed a 1984 readathon
- privacy workshop for teens
- film series
- etc.
- Tor exit nodes in libraries:
- first implemented in NH
- DHS got the local police department to get the library to shut it down
- Host crypto parties
- Host unconferences to learn activist or advocacy community needs
- how to address a community that you don't know
- "there's a whole bunch of software developers I know that want to make software that solves a problem that no one needs"
- Q: Google Analytics?
- A: "It's free, it works so beautifully, why can't we use it?" -> Tor doesn't collect any analytics, so they have to find ways to solve problems without analytics (e.g. whole-day A/B testing; see the sketch after this Q&A)
- Q: Opt-in seems like a solution. [Ed. not a question]
- A: People are so used to opting in, is it really informed consent? Opting in should be baseline.
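The "whole-day A/B testing" mentioned above can be sketched without any user tracking: derive the variant from the date and keep only aggregate counters. This is a hypothetical illustration, not Tor's actual tooling; the function names and in-memory counter are assumptions.

```python
import datetime
from collections import Counter

# Aggregate counters only: no cookies, no user IDs, no IP addresses retained.
outcomes = Counter()

def todays_variant(today=None):
    """Serve variant A or B for the whole day, based on the date alone."""
    today = today or datetime.date.today()
    return "A" if today.toordinal() % 2 == 0 else "B"

def record_outcome(success):
    """Increment an aggregate counter for today's variant."""
    outcomes[(todays_variant(), "success" if success else "failure")] += 1

# e.g. call record_outcome(True) when a task completes and record_outcome(False)
# when it's abandoned; compare per-variant totals across days afterwards.
```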
'Can't wait for Perfect: Implementing "Good Enough" Digital Preservation': Shira Peltzman & Alice Prael
- Shira Peltzman: Digital Archivist, UCLA
- Alice Prael: National Digital Stewardship Resident at the JFK Presidential Library
- Bit preservation
- Content management
- ensuring files can be found, delivered, opened, read / played back
- Ongoing management
- preservation is an active, continuous process
- requires ongoing funding & engagement
- standards: OAIS, TRAC (aka ISO 16363 / TDR)
- "the most you can do with your current resources"
- "probably a little more than you are"
- institution-dependent
- moving target
- Don't go it alone
- look at existing policies etc.
- Inventory
- for advocacy
- for prioritization & budgeting
- high priority items need more copies, geographically distributed, more fixity checks & monitoring (see the fixity sketch at the end of this section)
- break down into sub-tasks, look for low-hanging fruit (naming conventions, basic metadata...), incremental daily progress
- be an effective advocate for the material you try to preserve
- you're going to have to educate people about / sell them on the whole idea of digital preservation
- you'll need to communicate differently to different audiences
- get digital preservation into your institution's mission statement
- NDSA preservation levels are a good benchmark... but don't have any access-related guidelines.
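As a concrete, hedged illustration of the fixity checks the NDSA levels ask for: a minimal Python sketch that writes and later re-verifies a SHA-256 manifest for a directory. The manifest format and paths are assumptions; a real workflow would more likely use BagIt-style tooling.

```python
import hashlib
import json
from pathlib import Path

def sha256(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large files never sit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def write_manifest(root, manifest_path):
    """Record a checksum for every file under `root` (the inventory step)."""
    root = Path(root)
    checksums = {str(p.relative_to(root)): sha256(p)
                 for p in sorted(root.rglob("*")) if p.is_file()}
    Path(manifest_path).write_text(json.dumps(checksums, indent=2))

def verify_manifest(root, manifest_path):
    """Return files whose current checksum no longer matches the manifest."""
    root = Path(root)
    expected = json.loads(Path(manifest_path).read_text())
    return [name for name, digest in expected.items()
            if sha256(root / name) != digest]

# Hypothetical usage, run on a schedule; any output means bit rot or tampering:
# write_manifest("/archive/collection-001", "collection-001-manifest.json")
# print(verify_manifest("/archive/collection-001", "collection-001-manifest.json"))
```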
"Enabling Access to Old Wu-Tang Clan Fan Sites: Facilitating Interdisciplinary Web Archive Collaboration": Nick Ruest & Ian Milligan
- "Every day of my life, I wished these archives were bigger ... my biggest problem is now abundance. I spend almost every day ... wishing we had less information."
- Wayback machine requires you to know the URL
- everybody knows that's not how people want to work
- everyone's working on discovery
- but discovery can't be a black box of ranking algorithms ("that black box is writing my book")
- webarchives.ca: Warcbase + Shine
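webarchives.ca itself pairs Warcbase (Scala/Spark) with the Shine front end; as a small-scale illustration of discovery beyond known URLs, here is a sketch using the Python `warcio` library (my choice, not part of their stack) to pull URLs and page titles out of a WARC file for indexing.

```python
import re
from warcio.archiveiterator import ArchiveIterator  # pip install warcio

def html_pages(warc_path):
    """Yield (url, title) pairs for HTML responses in a WARC file."""
    with open(warc_path, "rb") as stream:
        for record in ArchiveIterator(stream):
            if record.rec_type != "response" or record.http_headers is None:
                continue
            content_type = record.http_headers.get_header("Content-Type") or ""
            if "html" not in content_type.lower():
                continue
            url = record.rec_headers.get_header("WARC-Target-URI")
            body = record.content_stream().read()
            match = re.search(rb"<title[^>]*>(.*?)</title>", body, re.I | re.S)
            title = match.group(1).decode("utf-8", "replace").strip() if match else ""
            yield url, title

# e.g. feed these into a full-text index so users aren't stuck guessing URLs:
# for url, title in html_pages("wu-tang-fan-sites.warc.gz"):
#     print(url, "|", title)
```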
"Digital Preservation 101, or, How to Keep Bits for Centuries": Julie Swierczek, Harvard Art Museums
- "when I say digital preservation, people think I mean digital storage"
- "when I say digital archive, people think I'm talking about a backup drive"
- OAIS is rocket science; it came out of NASA
- Principles:
- Provenance
- Original order
- Ingest
- Special forensic floppy controllers, e.g. KryoFlux
- Formats (see the format-identification sketch at the end of these notes)
- "If anybody tells you you should get a one-time grant for digital preservation, you have my permission to get mad."
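The format notes above are sparse; as one hedged example of the format identification that typically happens at ingest, here is a sketch using the `python-magic` bindings to libmagic (archive-specific tools like Siegfried or FIDO, which return PRONOM IDs, are the more usual choice). The paths are hypothetical.

```python
from pathlib import Path
import magic  # pip install python-magic (bindings to libmagic)

def identify_formats(root):
    """Yield (path, MIME type) for every file under `root`."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            yield str(path), magic.from_file(str(path), mime=True)

# Hypothetical usage over files pulled off a freshly imaged floppy:
# for name, mime in identify_formats("/ingest/floppy-0042"):
#     print(f"{mime:30} {name}")
```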
"Guerrilla Usability Testing & Communicating Value": Eka Grguric
- schedule regular usability testing sessions as part of your development schedule
- guerrilla vs. standard
- guerrilla is cheaper
- minimal equipment, no labs, can be done by amateurs
- 3-6 participants is enough to catch the broad strokes and most of the bugs
- answers over statistical validity
- figure out what you're testing
- specific features: color scheme, can people log out
- figure out stakeholders
- subsets of user group: undergrads vs. grads vs. faculty
- personas
- you don't need robust, detailed personas
- just get at needs & goals
- goals
- focus on the user, not on what you need to get out of testing the design
- refine goals into concrete tasks
- e.g. "look up grades" -> "look up your grades on the midterm exam"
- bad tasks lead users ("log in, go to x, tell me what you think you would click on")
- 2 facilitators: talker + notetaker (or record everything & then take notes)
- bad prompt: (to frustrated user) "try logging out"
- good prompt: "can you describe to me how you're feeling / what you're trying to do"
"Get Your Recon": Christina Harlow
- data is really messy
- we don't want entities incorrectly not linked
- we don't want entities incorrectly linked
- solution: "get a student" to clean up the data, manually search authorities
- limits of linked data services (OpenRefine, LODRefine)
- mismatches, incomplete data, lots of data munging
- different LOC services use incompatible APIs, aren't set up for bulk queries
- some MARC fields just become opaque blank nodes when converted to RDF
- Wikidata services don't provide fuzzy matching
- theoretically standard identifiers turn up in different formats on different servers
- "To fully realize the benefits of LD, a huge amount of entity matching / data remediation work needs to occur"