View threaded_reader_bench.rb
require 'marc'
require 'benchmark'
require 'concurrent'
require 'concurrent-edge'
module MARC
class ThreadedReader < Reader
def each
records = 20)
View marc_vs_marc4j_bench.rb
# A very, *very* imperfect bench, but gives us a rough idea
# tl;dr -- marc-binary is a wash, marc-xml is about 3.5 times faster using marc4j
# > bundle exec ruby --server bench.rb
# jruby (2.5.0) 2018-05-24 81156a8 Java HotSpot(TM) 64-Bit Server VM 25.112-b16 on 1.8.0_112-b16 +jit [darwin-x86_64]
# Warmup: 45. Runtime: 15
# Comparison:
# marc4j-xml: 8197.4 i/s
View multi_file.rb
class MultiFile
include Enumerable
def initialize(filenames_or_handles, open_mode: 'r:utf-8')
@names_and_handles = Array(filenames_or_handles).map do |fn|
if fn.kind_of?(IO)
name = if fn.respond_to? :to_path
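The preview above is cut off, so the rest of `MultiFile` is not visible. As a rough illustration of the general idea (one Enumerable that walks several files in order), here is a minimal sketch. The class name `MultiFileSketch` is hypothetical, it accepts only filenames (not open handles), and yielding one line at a time is an assumption rather than the gist's confirmed behavior.

```ruby
require 'tempfile'

# Hypothetical sketch: iterate over the lines of several files as one stream.
# Unlike the truncated original, this version handles filenames only.
class MultiFileSketch
  include Enumerable

  def initialize(filenames, open_mode: 'r:utf-8')
    @filenames = Array(filenames)
    @open_mode = open_mode
  end

  # Yield each line of each file, in the order the filenames were given.
  # Each file is opened lazily and closed before the next one is opened.
  def each
    return enum_for(:each) unless block_given?

    @filenames.each do |fn|
      File.open(fn, @open_mode) do |f|
        f.each_line { |line| yield line }
      end
    end
  end
end

# Usage: two temporary files read back as a single sequence of lines.
a = Tempfile.new('a')
a.write("one\ntwo\n")
a.close
b = Tempfile.new('b')
b.write("three\n")
b.close

lines = MultiFileSketch.new([a.path, b.path]).map(&:chomp)
```

Opening each file inside `each` keeps only one handle open at a time, which matters when the list of files is long.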
View indexer.rb
require 'traject'
require_relative 'recusive_json_reader'
require 'traject/debug_writer'
settings do
store "reader_class_name", "MyJsonHierarchyReader"
store "writer_class_name", "Traject::DebugWriter"
store "output_file", "recursive.out"
View recursive_yield_example.rb
require 'json'
class MyJsonHierarchyReader
# @param [#each] input_stream Probably a file, but any iterator will do
# so long as it returns a valid JSON object from #each
def initialize(input_stream, settings)
# ... whatever you need to do. Let's pretend it's
# a newline-delimited JSON file, since you didn't
# specify anything
@json_lines = input_stream
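The preview stops before the `each` method, so the recursive part is not shown. A minimal sketch of how a reader over newline-delimited JSON might yield every node of a nested document follows. The class name `HierarchyReaderSketch` and the `"children"` key are assumptions made for illustration, not taken from the gist.

```ruby
require 'json'
require 'stringio'

# Hypothetical sketch of a reader that yields a parent and then, recursively,
# each of its descendants. Nesting under "children" is an assumption.
class HierarchyReaderSketch
  include Enumerable

  CHILD_KEY = 'children'

  # Same two-argument shape as the reader in the preview above.
  def initialize(input_stream, settings = {})
    @json_lines = input_stream
    @settings = settings
  end

  # Parse each non-blank line as JSON and walk the resulting tree.
  def each(&block)
    return enum_for(:each) unless block_given?

    @json_lines.each do |line|
      next if line.strip.empty?
      walk(JSON.parse(line), &block)
    end
  end

  private

  # Depth-first, parent before children.
  def walk(node, &block)
    block.call(node)
    (node[CHILD_KEY] || []).each { |child| walk(child, &block) }
  end
end

# Usage: any object whose #each returns JSON strings will do.
input = StringIO.new(<<~NDJSON)
  {"id": "a", "children": [{"id": "b", "children": [{"id": "c"}]}]}
  {"id": "d"}
NDJSON

ids = HierarchyReaderSketch.new(input, {}).map { |node| node['id'] }
```

Passing the block down explicitly lets the private helper recurse without building an intermediate array of records.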
View marc21_changed_code.rb
def extract_marc(spec, options = {})
# ... stuff deleted for clarity
ppchain = Marc21.create_post_processing_chain(options, translation_map)
lambda do |record, accumulator, context|
accumulator.concat extractor.extract(record)
View Talk About Fedora.adoc

Samvera#General talking about Fedora

Tuesday, August 29, 2017

Mike Giarlo (5:56 PM)

Have folks here been hearing all manner of rumors today about Samvera, or certain Samvera institutions, walking away from Fedora and other community components? Some of us are hearing these rumors as of a few hours ago, and we’re trying to figure out where the misinformation is coming from.

It seems to center on Valkyrie. We did discuss Valkyrie and Fedora futures on today’s Fedora Leadership group, but not in the context the rumors are in.

View safer_reindex_everything.rb
require 'active-fedora'
require 'json'
def descendant_uris(uri)
resource =, uri)
STDERR.puts "Failed to create resource for uri #{uri}"
return []
View solread_patch_benchmark.rb
require 'benchmark'
require 'uri'
require 'solr_ead'
require 'concurrent'
# Make a subclass with all the speed patches
class IndexerWithPatches < SolrEad::Indexer
def additional_component_fields(node, addl_fields =
# Clear or create the cache
View arclight_monkeypatch.rb
require 'uri'
require 'solr_ead'
class SolrEad::Indexer
def additional_component_fields(node, addl_fields =
p_ids = parent_id_list(node)
p_unittitles = parent_unittitle_list(node)