- "Linked Open Dime Novels; or, 19th Century Fiction and 21st Century Data": Matthew Short, Demian Katz
- "Beyond the Keyword: Creative Search and Query Expansions based on DBpedia": Marya Sawaf
- "So you think you want to migrate to RDF": Eben English, Steven Carl Anderson
- "How not to waste catalogers' time: Making the most of subject headings": John Mark Ockerbloom
- "The Modern Day Sisyphus: #libtech Burnout and You": Becky Yoose
- "Janus - Node.js Handler for all Library Searches": David Naughton
- "Getty Research Portal Reboot: Angular and Elasticsearch for Metadata Search Aggregation": Susan Ley, Adam Cahan (Getty)
- "Architecture is politics: The power and the perils of systems design": Andreas Orphanides (NCSU Libraries)
- "Transcending Traditional Systems and Labels: An API-First Archives Approach at NPR"
- "Building Desktop Applications using Web Technologies with Electron": Jason Ronallo
- "Beyond the Bento Box: Using linked data and smart algorithms to integrate repository data in context": Jordan Fields & Mark Noble (Marmot)
- "What does it take to get a job these days? Analyzing jobs.code4lib.org data to understand current technology skillsets": Monica Maceli
- "Building a user-friendly authorities browse in Blacklight": Jennifer Colt & Frances Webb (Cornell)
"Linked Open Dime Novels; or, 19th Century Fiction and 21st Century Data": Matthew Short, Demian Katz
- bibliography of 19th century popular fiction as LD
- four classes: CreativeWork, Edition, Copy, Series aligning with current data model
- hard to figure out line between "work" and "expression" etc. in more abstract models
- properties: RDA Unconstrained (not FRBR)
- defined English equivalents for opaque URIs to simplify coding
- model captures relationships between multiple editions, info previously only existing in editorial notes
- nominal authors (house pen names) and actual authors tracked separately
- "With limited time and resources, you can actually do real things"
- MODS and MARC allow you to wedge URIs into records -- using that for identifiers is a "tiny bit of linked data" that allows interop with related data in full LD
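A rough Python sketch of the four-class shape described above, for readers who think in code rather than RDF. Every field name here is hypothetical, and attaching nominal and actual authors to the work is an assumption; the project itself models this with RDA Unconstrained properties.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Series:
    title: str


@dataclass(frozen=True)
class CreativeWork:
    title: str
    nominal_author: str                  # name printed on the item, often a house pen name
    actual_author: Optional[str] = None  # the person who wrote it, when known


@dataclass(frozen=True)
class Edition:
    identifier: str
    work: CreativeWork
    series: Optional[Series] = None
    reprint_of: Optional[str] = None     # identifier of an earlier edition, if any


@dataclass(frozen=True)
class Copy:
    edition: Edition
    holding_library: str


def editions_of(work: CreativeWork, editions: List[Edition]) -> List[Edition]:
    """Return every edition that points at the given work object."""
    return [edition for edition in editions if edition.work is work]
```

Linking editions through a shared work, plus an explicit `reprint_of` pointer, is one way to hold the edition-to-edition relationships that previously lived only in editorial notes.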
"Beyond the Keyword: Creative Search and Query Expansions based on DBpedia": Marya Sawaf
- semantic search -- within a single knowledge space
- DBpedia as source of variations and synonyms
- serendipitous or "creative" search -- get outside the original knowledge
space, share ideas across disciplines
- synonyms of synonyms of synonyms -> exponentially growing dataset
- use word frequencies to filter for common expressions
- fault tolerance to get around DBpedia quirks
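The expansion idea in these notes (synonyms of synonyms, filtered by word frequency, tolerant of lookup failures) might look roughly like the sketch below. `lookup_synonyms` and `word_frequency` are hypothetical callables standing in for a DBpedia query and a frequency table; the depth limit and threshold are assumed values, not anything stated in the talk.

```python
from collections import deque


def expand_query(term, lookup_synonyms, word_frequency,
                 max_depth=2, min_frequency=1e-6):
    """Breadth-first expansion of `term` through synonyms of synonyms."""
    seen = {term}
    queue = deque([(term, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # the depth cap keeps the exponential growth in check
        try:
            candidates = lookup_synonyms(current)
        except Exception:
            # Fault tolerance: a failed or malformed lookup skips this branch only.
            continue
        for candidate in candidates:
            if candidate in seen:
                continue
            if word_frequency(candidate) < min_frequency:
                continue  # drop rare expressions
            seen.add(candidate)
            queue.append((candidate, depth + 1))
    return seen
```

The depth cap and the frequency filter are two separate brakes: one bounds how far the search wanders, the other bounds how obscure the results get.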
"So you think you want to migrate to RDF": Eben English, Steven Carl Anderson
- "reuse is how vocabularies gain value"
- "always prefer using an existing [predicate] IRI over inventing a new one"
- linked open vocabularies, sameAs
- "with RDF you're not limited to a single vocabulary, you can mix-and-match"
- predicates have domains (valid subjects) and ranges (valid objects); not
all URIs are predicates
- ...of course, some people (DPLA, Europeana) aren't actually following the definitions... "there is no Semantic Web police"
- try to conform to accepted usages, or
- use a less popular predicate that does have the right range, or mint your own
- "domains actually mean very little"
- you don't have to explicitly declare classes
- but try not to do invalid things, e.g. use a predicate with a book domain for music
- extinction: URIs that don't resolve
- "if there's data that you care about at that URI, you still need to store that text locally"
- don't be afraid to create a new predicate
- "we've all seen" enough jamming of data where it doesn't belong in MARC, etc.
- services like id.loc.gov can be rate-limited
- "you're going to need to cache everything"
- Rails Linked Data Fragments: front end to Blazegraph, Marmotta, in-memory
- downside: batch downloads may not be made available often enough
- public users can't tell the difference
- RDF doesn't magically mean aggregatable or harvestable
- need tightly-defined data structures, need to follow standards
- "this is where things are going, you're going to have to deal with it"
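The advice to "cache everything" and to keep label text locally even when you store a URI could be sketched as follows. This is a minimal illustration, not the Rails Linked Data Fragments design: `LabelCache` and `fetch_label` are hypothetical names, and a JSON file stands in for whatever local store a real system would use.

```python
import json
from pathlib import Path


class LabelCache:
    """Keep a local copy of the label for every URI we have looked up."""

    def __init__(self, path, fetch_label):
        self.path = Path(path)
        self.fetch_label = fetch_label  # hypothetical callable: URI -> label text
        if self.path.exists():
            self.labels = json.loads(self.path.read_text())
        else:
            self.labels = {}

    def label_for(self, uri):
        if uri not in self.labels:
            # Only hit the remote service on a cache miss, then persist the
            # text so it survives rate limits and URIs that stop resolving.
            self.labels[uri] = self.fetch_label(uri)
            self.path.write_text(json.dumps(self.labels, indent=2))
        return self.labels[uri]
```

Because the labels are written to disk, a later process can still display them when the remote service is rate-limited or the URI no longer resolves.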
"How not to waste catalogers' time: Making the most of subject headings": John Mark Ockerbloom
- OPACs don't do a good job with subject browse
- Solr and faceting aren't everything
- "You can't just throw your subject headings into a weighted search and call it done"
- most relevant book is not necessarily the one with the best term score
- faceting is good for slice-and-dice, not for explore: narrow or broaden, not lateral
- if you look at how catalogers work, they assign subjects in a certain order
by relevance
- plea for those converting to RDF: "please go out of your way to preserve subject ordering"
- dates in subject headings can be mined to raise scores for works contemporary to events
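Two of the ideas above, using the order of subject headings as a relevance signal and mining dates in headings, can be sketched in a few lines. The weighting scheme (reciprocal of position) and the boost value are assumptions chosen for illustration, not the speaker's actual formula.

```python
import re

YEAR_RANGE = re.compile(r"(\d{4})-(\d{4})")


def subject_score(subjects, query):
    """Score a record by the position of the first heading matching the query.

    The first-assigned heading scores 1.0, the second 0.5, and so on.
    """
    query = query.lower()
    for position, heading in enumerate(subjects):
        if query in heading.lower():
            return 1.0 / (position + 1)
    return 0.0


def contemporary_boost(subjects, query, publication_year, boost=0.5):
    """Return a boost if the work was published within a date range
    found in a heading that matches the query."""
    query = query.lower()
    for heading in subjects:
        if query not in heading.lower():
            continue
        match = YEAR_RANGE.search(heading)
        if match and int(match.group(1)) <= publication_year <= int(match.group(2)):
            return boost
    return 0.0
```

Neither function works if heading order was lost in conversion, which is the practical reason behind the plea to preserve it.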
"The Modern Day Sisyphus: #libtech Burnout and You": Becky Yoose
- see notes here
"Janus - Node.js Handler for all Library Searches": David Naughton
- you have one problem ->
- "I'll just use node.js" ->
- you have uncountably infinite problems
- node.js is not a robust HTTP server
- you need nginx etc. as a proxy
- you need something else to keep node running if it goes down
- forever, Supervisor
- apache + passenger + node.js works OK
- asynchrony in node is harder than it looks
"Getty Research Portal Reboot: Angular and Elasticsearch for Metadata Search Aggregation": Susan Ley, Adam Cahan (Getty)
- angular.js + Elasticsearch
- angular.js
- Google MVC framework for JS-based web apps
- benefits: dependency injection, 2-way data binding, testability, DOM filtering
- large community
- good styleguide (which Getty followed)
- Elasticsearch
- "you don't have to use Java, you can write a bunch of funky JSON instead"
- "Elasticsearch scales"
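To give a sense of the "funky JSON" mentioned above, here is a small sketch that builds an Elasticsearch search body as a plain Python dict. `multi_match` is a standard Query DSL clause; the field names and the `^2` title boost are made up for the example and are not taken from the Getty portal.

```python
import json


def build_search_body(text, fields, size=10):
    """Return an Elasticsearch search request body as a plain dict."""
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text,
                "fields": fields,  # e.g. "title^2" boosts matches in the title
            }
        },
    }


body = build_search_body("dime novels", ["title^2", "description"])
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent as the body of a search request to an index; no Java and no client library are needed to construct it.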
"Architecture is politics: The power and the perils of systems design": Andreas Orphanides (NCSU Libraries)
- system design controls what users can or can't do
- "design ethics: a thing"
- 3 key lessons in the ethics of system design
- "persuasive design"
- "dark patterns": exploiting cognitive biases
- pre-checked opt-in boxes
- mixing required and optional checkboxes
- highlighting and mis-identifying non-lowest airfares as lowest
- clickable things should look clickable
- calls to action should be prominently placed
- Ethical principles
- implement constraints/affordances to the user's benefit
- design affordances the user will recognize
- don't disguise constraints
- "architecture is politics" -- Mitch Kapor
- e.g. Robert Moses' transit-proof overpasses
- design sends a message about how designers value customers
- "your metadata schema is a social justice issue"
- your design choices reflect your values even if you don't intend it
- do you value collecting metadata more than you value user privacy
- Ethical principles
- seek out & recognize your biases
- diversify your design practices (and your team)
- understand your culture and its mores
- 80/20 rule
- if you spend 80% of your developer time supporting your 20% power users, you're devaluing the vast majority of your users
- content:advertising ratio
- popular websites might have 1:5 content:advertising ratio
- suggests advertisers are 5x more important than users
- "your data validation schema is a social justice issue"
- e.g. "your name must match your ID" vs. allowing only roman characters, modeling names as first/middle/last, etc.
- 15% of internet users depend on mobile devices
- "your mobile website is a social justice issue"
- Ethical principles for compassionate design
- recognize & acknowledge compromises
- know your users
- design with empathy
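The data-validation point above (forms that accept only roman characters or assume a first/middle/last structure) can be made concrete with a small contrast. Both validators are hypothetical illustrations rather than anything shown in the talk, and the length limit is an arbitrary assumption.

```python
import re

# A restrictive pattern of the kind the talk warns about: ASCII letters only,
# so no spaces, hyphens, apostrophes, diacritics, or non-Latin scripts.
ASCII_ONLY = re.compile(r"^[A-Za-z]+$")


def restrictive_is_valid(name):
    return bool(ASCII_ONLY.match(name))


def permissive_is_valid(name, max_length=200):
    """Accept any non-blank name, in any script, as a single field."""
    stripped = name.strip()
    return 0 < len(stripped) <= max_length
```

The permissive version deliberately checks almost nothing: the fewer assumptions the schema makes about what a name looks like, the fewer people it rejects.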
"Transcending Traditional Systems and Labels: An API-First Archives Approach at NPR"
- iterating on front end independent of back end development
- simple, frequent front-end deployments
- coming from backbone & jquery
- "angular is way easier than backbone"
- all application state is stored in the URL
- no state in the browser session
- everything is bookmarkable, shareable, embeddable in bug reports
- proxy layer: an API in front of the API
- microservice between UI and API
- authentication, caching, connecting to multiple internal APIs
- moving from MySQL to NoSQL (Elastic + DynamoDB):
- lots of HTTP calls
- a million records -> N million API requests
- SQL dump: 1 minute per year of data
- inserting data into API: 1 hour per year of data
- 40 years of data -> 1 week to load
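The "all application state is stored in the URL" idea from earlier in this talk can be sketched with the standard library. The NPR front end is JavaScript, so this Python version only illustrates the round trip; the base URL and state keys are made up.

```python
from urllib.parse import parse_qsl, urlencode


def state_to_url(base_url, state):
    """Serialize application state into a query string.

    Keys are sorted so that equal states always produce the same URL.
    """
    return base_url + "?" + urlencode(sorted(state.items()))


def url_to_state(url):
    """Recover the state dictionary from a URL built by state_to_url."""
    _, _, query = url.partition("?")
    return dict(parse_qsl(query))


url = state_to_url("https://example.org/search",
                   {"q": "fresh air", "page": "2", "sort": "date"})
print(url)
```

Since nothing lives in the browser session, pasting the URL into a bug report is enough to reproduce the exact view. This sketch handles string values only; nested state would need its own encoding.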
"Building Desktop Applications using Web Technologies with Electron": Jason Ronallo
- slides
- Why desktop applications
- stand out from sea of browser tabs
- focus w/o distraction by sea of browser tabs
- Don't want to learn desktop GUI toolkits? Use HTML/CSS/JS.
- Electron: one of several available platforms for that
- used by e.g. Slack
- Chromium + Node.js
- Issues
- cross-platform, but:
- need to build a native installer
- still some OS differences
- still need to recompile native modules
"Beyond the Bento Box: Using linked data and smart algorithms to integrate repository data in context": Jordan Fields & Mark Noble (Marmot)
- public library users probably want books first
- but we also have archives, articles....
- Marmot has 16 public, 6 academic, 5 school libraries
- one discovery system for all these different user groups
- federated discovery across different ILSs
- Pika: Marmot's new (alpha) discovery layer
- Linked data sources:
- Who's On First
- Geonames
- Find a Grave
- Wikipedia
- Internal catalog, genealogy, archive
- Different subject catalogs between article database, archive, EBSCO catalog
- primarily using LD for well-known relationships
"What does it take to get a job these days? Analyzing jobs.code4lib.org data to understand current technology skillsets": Monica Maceli
- curriculum, jobs, practitioners
- curriculum study
- "somebody does that every couple of years" -> automate it via web-scraping to identify trends over time
- jobs
- code4lib jobs tagged and curated by volunteers
- shortimer: "a django web app that collects job announcements from the code4lib discussion list and puts them on the Web."
- text mining, correlations, groupings with R
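The talk's analysis was done in R; as a rough Python stand-in, the simplest version of the text-mining step is counting how many postings mention each skill. The function name is hypothetical, and whole-word matching is a simplification that does not handle terms ending in punctuation, such as "C++".

```python
import re
from collections import Counter


def skill_counts(postings, skills):
    """Count how many postings mention each skill (case-insensitive, whole words)."""
    counts = Counter()
    for text in postings:
        for skill in skills:
            pattern = r"\b" + re.escape(skill) + r"\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                counts[skill] += 1
    return counts
```

Running the same count over postings grouped by year would give the trend-over-time view the notes mention.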
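"Building a user-friendly authorities browse in Blacklight": Jennifer Colt & Frances Webb (Cornell)
- Cornell's Blacklight implementation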
- existing (Voyager) lists subjects and authors according to vocabulary
- new (Blacklight) lists according to field, provides easy access to narrowing
- links to main Blacklight catalog search results
- headings only appear if heading, "see", or "see also" will provide search results in main catalog
- cross-references come from the authority record
- main catalog now indexes alternate forms as well as preferred forms (e.g.
records catalogued as "myocardial infarction" now show up under "heart
attack")
- "users will find the records they want, but they won't necessarily realize we've done anything interesting to help them find the records they want"
- can be done w/o setting up a separate authority browse
- separate Solr index to facilitate searching records at the same level
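Indexing alternate forms alongside preferred forms, and hiding headings that lead nowhere, can be sketched as below. This is a simplified illustration rather than Cornell's implementation: the record shape (`preferred`, `see_from`) and the `catalog_counts` lookup are assumptions, and it checks only the preferred heading for results, whereas the notes say "see" and "see also" targets count too.

```python
def build_browse_index(authority_records, catalog_counts):
    """Map every heading form (preferred or alternate) to its preferred heading.

    Headings whose preferred form has no catalog results are left out.
    """
    index = {}
    for record in authority_records:
        preferred = record["preferred"]
        if catalog_counts.get(preferred, 0) == 0:
            continue  # would lead to an empty result list, so do not show it
        index[preferred.lower()] = preferred
        for alternate in record.get("see_from", []):
            index[alternate.lower()] = preferred
    return index
```

With an index like this, a search for "heart attack" resolves to records catalogued under "Myocardial infarction" without the user needing to know the preferred form.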