- Keynote: "Code for Liberation", Kate Krauss, Tor Project
- 'Can't wait for Perfect: Implementing "Good Enough" Digital Preservation': Shira Peltzman & Alice Prael
- "Enabling Access to Old Wu-Tang Clan Fan Sites: Facilitating Interdisciplinary Web Archive Collaboration": Nick Ruest & Ian Milligan
- "Digital Preservation 101, or, How to Keep Bits for Centuries": Julie Swierczek, Harvard Art Museums
- "Guerrilla Usability Testing & Communicating Value", Eka Grguric
- "Get Your Recon", Christina Harlow
- interviewing activists to find out what they need from libraries
- BLM
- transgender
- personal experience
- 19-year-old college student / sandwich delivery guy shot in his car by Philadelphia police near her house
- endemic problem in Philadelphia & many other cities
- BLM first to surface this in an organized way
- BLM now under social media surveillance by FBI
- cf. Occupy: Homeland Security, Joint Terrorism Task Force, NCIS, IMSI-catchers ("cell site simulators" used by police to intercept mobile phone traffic)
- cf. COINTELPRO & civil rights
- our anti-terrorism agencies have massive budgets to find terrorists, but there just aren't many terrorists to find, so they're finding terrorists where there aren't any
- discrimination -> control of personal information critical to job security, family relationships; can be life or death
- there are political, social prices to pay when we collect data that we don't need
- e.g. Aeon special collection system asks for considerable amounts of personal data, which is then accessible to library staff for generations
- "when you keep that data, you are siding with your institution; you are not necessarily siding with the researcher ... that information can be subpoenaed"
- "what to collect, what to retain, what to distribute, who can see what... these are political decisions, moral decisions"
- ALA position: "the most minimal amount of data for the shortest possible time" (see ALA privacy policy guidelines)
- sysadmins & technicians may think they're not surveillance targets, but Snowden docs indicate GCHQ & others have targeted techs at telecoms as way to get to their customers
- teach workshops, e.g. using Signal for iPhone / Android
- examine your access rules
- "if you don't have it, you won't have to worry about it"
- "don't collect what you can't protect"
- "these are moral choices"
- "if you have privacy at home, but you don't have privacy on line, you have no privacy"
- DC Public Library had a privacy month (dclibrary.org/1984):
- livestreamed a 1984 readathon
- privacy workshop for teens
- film series
- etc.
- Tor exit nodes in libraries:
- first implemented in NH
- DHS got the local police department to get the library to shut it down
- Host crypto parties
- Host unconferences to learn activist or advocacy community needs
- how to address a community that you don't know
- "there's a whole bunch of software developers I know that want to make software that solves a problem that no one needs"
- Q: Google Analytics?
- A: "It's free, it works so beautifully, why can't we use it?" -> Tor doesn't collect any analytics, so they have to find ways to solve problems without analytics (e.g. whole-day A/B testing; see the sketch after this Q&A)
- Q: Opt-in seems like a solution. [Ed. not a question]
- A: People are so used to opting in, is it really informed consent? Opting in should be baseline.
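The "whole-day A/B testing" mentioned above can be sketched without any user tracking: derive the variant from the date and keep only aggregate counters. This is a hypothetical illustration, not Tor's actual tooling; the function names and in-memory counter are assumptions.

```python
import datetime
from collections import Counter

# Aggregate counters only: no cookies, no user IDs, no IP addresses retained.
outcomes = Counter()

def todays_variant(today=None):
    """Serve variant A or B for the whole day, based on the date alone."""
    today = today or datetime.date.today()
    return "A" if today.toordinal() % 2 == 0 else "B"

def record_outcome(success):
    """Increment an aggregate counter for today's variant."""
    outcomes[(todays_variant(), "success" if success else "failure")] += 1

# e.g. call record_outcome(True) when a task completes and record_outcome(False)
# when it's abandoned; compare per-variant totals across days afterwards.
```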
'Can't wait for Perfect: Implementing "Good Enough" Digital Preservation': Shira Peltzman & Alice Prael
- Shira Peltzman: Digital Archivist, UCLA
- Alice Prael: National Digital Stewardship Resident at the JFK Presidential Library
- Bit preservation
- Content management
- ensuring files can be found, delivered, opened, read / played back
- Ongoing management
- preservation is an active, continuous process
- requires ongoing funding & engagement
- standards: OAIS, TRAC (aka ISO 16363 / TDR)
- "the most you can do with your current resources"
- "probably a little more than you are"
- institution-dependent
- moving target
- Don't go it alone
- look at existing policies etc.
- Inventory
- for advocacy
- for prioritization & budgeting
- high priority items need more copies, geographically distributed, more fixity checks & monitoring (see the fixity sketch at the end of this section)
- break down into sub-tasks, look for low-hanging fruit (naming conventions, basic metadata...), incremental daily progress
- be an effective advocate for the material you try to preserve
- you're going to have to educate people about / sell them on the whole idea of digital preservation
- you'll need to communicate differently to different audiences
- get digital preservation into your institution's mission statement
- NDSA preservation levels are a good benchmark... but don't have any access-related guidelines.
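As a concrete, hedged illustration of the fixity checks the NDSA levels ask for: a minimal Python sketch that writes and later re-verifies a SHA-256 manifest for a directory. The manifest format and paths are assumptions; a real workflow would more likely use BagIt-style tooling.

```python
import hashlib
import json
from pathlib import Path

def sha256(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large files never sit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def write_manifest(root, manifest_path):
    """Record a checksum for every file under `root` (the inventory step)."""
    root = Path(root)
    checksums = {str(p.relative_to(root)): sha256(p)
                 for p in sorted(root.rglob("*")) if p.is_file()}
    Path(manifest_path).write_text(json.dumps(checksums, indent=2))

def verify_manifest(root, manifest_path):
    """Return files whose current checksum no longer matches the manifest."""
    root = Path(root)
    expected = json.loads(Path(manifest_path).read_text())
    return [name for name, digest in expected.items()
            if sha256(root / name) != digest]

# Hypothetical usage, run on a schedule; any output means bit rot or tampering:
# write_manifest("/archive/collection-001", "collection-001-manifest.json")
# print(verify_manifest("/archive/collection-001", "collection-001-manifest.json"))
```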
"Enabling Access to Old Wu-Tang Clan Fan Sites: Facilitating Interdisciplinary Web Archive Collaboration": Nick Ruest & Ian Milligan
- "Every day of my life, I wished these archives were bigger ... my biggest problem is now abundance. I spend almost every day ... wishing we had less information."
- Wayback machine requires you to know the URL
- everybody knows that's not how people want to work
- everyone's working on discovery
- but discovery can't be a black box of ranking algorithms ("that black box is writing my book")
- webarchives.ca: Warcbase + Shine
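webarchives.ca itself pairs Warcbase (Scala/Spark) with the Shine front end; as a small-scale illustration of discovery beyond known URLs, here is a sketch using the Python `warcio` library (my choice, not part of their stack) to pull URLs and page titles out of a WARC file for indexing.

```python
import re
from warcio.archiveiterator import ArchiveIterator  # pip install warcio

def html_pages(warc_path):
    """Yield (url, title) pairs for HTML responses in a WARC file."""
    with open(warc_path, "rb") as stream:
        for record in ArchiveIterator(stream):
            if record.rec_type != "response" or record.http_headers is None:
                continue
            content_type = record.http_headers.get_header("Content-Type") or ""
            if "html" not in content_type.lower():
                continue
            url = record.rec_headers.get_header("WARC-Target-URI")
            body = record.content_stream().read()
            match = re.search(rb"<title[^>]*>(.*?)</title>", body, re.I | re.S)
            title = match.group(1).decode("utf-8", "replace").strip() if match else ""
            yield url, title

# e.g. feed these into a full-text index so users aren't stuck guessing URLs:
# for url, title in html_pages("wu-tang-fan-sites.warc.gz"):
#     print(url, "|", title)
```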
"Digital Preservation 101, or, How to Keep Bits for Centuries": Julie Swierczek, Harvard Art Museums
- "when I say digital preservation, people think I mean digital storage"
- "when I say digital archive, people think I'm talking about a backup drive"
- OAIS is rocket science; it came out of NASA
- Principles:
- Provenance
- Original order
- Ingest
- Special forensic floppy controllers, e.g. KryoFlux
- Formats (see the format-identification sketch at the end of these notes)
- "If anybody tells you you should get a one-time grant for digital preservation, you have my permission to get mad."
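The format notes above are sparse; as one hedged example of the format identification that typically happens at ingest, here is a sketch using the `python-magic` bindings to libmagic (archive-specific tools like Siegfried or FIDO, which return PRONOM IDs, are the more usual choice). The paths are hypothetical.

```python
from pathlib import Path
import magic  # pip install python-magic (bindings to libmagic)

def identify_formats(root):
    """Yield (path, MIME type) for every file under `root`."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            yield str(path), magic.from_file(str(path), mime=True)

# Hypothetical usage over files pulled off a freshly imaged floppy:
# for name, mime in identify_formats("/ingest/floppy-0042"):
#     print(f"{mime:30} {name}")
```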
"Guerrilla Usability Testing & Communicating Value": Eka Grguric
- schedule regular usability testing sessions as part of your development schedule
- guerrilla vs. standard
- guerrilla is cheaper
- minimal equipment, no labs, can be done by amateurs
- 3-6 participants is enough to catch the broad strokes and most of the bugs
- answers over statistical validity
- figure out what you're testing
- specific features: color scheme, can people log out
- figure out stakeholders
- subsets of user group: undergrads vs. grads vs. faculty
- personas
- you don't need robust, detailed personas
- just get at needs & goals
- goals
- focus on the user, not on what you need to get out of testing the design
- refine goals into concrete tasks
- e.g. "look up grades" -> "look up your grades on the midterm exam"
- bad tasks lead users ("log in, go to x, tell me what you think you would click on")
- 2 facilitators: talker + notetaker (or record everything & then take notes)
- bad prompt: (to frustrated user) "try logging out"
- good prompt: "can you describe to me how you're feeling / what you're trying to do"
"Get Your Recon": Christina Harlow
- data is really messy
- we don't want entities incorrectly not linked
- we don't want entities incorrectly linked
- solution: "get a student" to clean up the data, manually search authorities
- limits of linked data services (OpenRefine, LODRefine)
- mismatches, incomplete data, lots of data munging
- different LOC services use incompatible APIs, aren't set up for bulk queries
- some MARC fields just become opaque blank nodes when converted to RDF
- Wikidata services don't provide fuzzy matching
- theoretically standard identifiers turn up in different formats on different servers
- "To fully realize the benefits of LD, a huge amount of entity matching / data remediation work needs to occur"