Bill Dueber billdueber

@billdueber
billdueber / multi_file.rb
Last active June 1, 2018 19:14
Enumerate over multiple files
class MultiFile
include Enumerable
def initialize(filenames_or_handles, open_mode: 'r:utf-8')
@names_and_handles = Array(filenames_or_handles).map do |fn|
if fn.kind_of?(IO)
name = if fn.respond_to? :to_path
fn.to_path
else
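The preview above stops before the class's `each` method. As a rough, independent sketch of the same idea (not the gist's actual code), a class can include `Enumerable` and yield lines from each source in turn. The class name `ChainedLines`, its treatment of strings as paths, and its choice to leave passed-in handles open are all assumptions made for illustration.

```ruby
require 'stringio'

# Hypothetical sketch: iterate over the lines of several files or IO-like
# objects as if they were one stream. This is NOT the gist's MultiFile code.
class ChainedLines
  include Enumerable

  # sources: a filename, an IO-like object, or an array of either.
  # Kernel#Array is avoided on purpose: it would call #to_a on a single
  # IO/StringIO and read its lines instead of wrapping the handle.
  def initialize(sources, open_mode: 'r:utf-8')
    @sources = sources.is_a?(Array) ? sources : [sources]
    @open_mode = open_mode
  end

  def each
    return enum_for(:each) unless block_given?

    @sources.each do |src|
      if src.is_a?(String)
        # Assumption: a String is a path. The block form closes the file.
        File.open(src, @open_mode) { |f| f.each_line { |line| yield line } }
      else
        # Assumption: anything else is an open handle that responds to
        # #each_line; it is read but left open for the caller to close.
        src.each_line { |line| yield line }
      end
    end
  end
end

# Usage with in-memory handles standing in for files
lines = ChainedLines.new([StringIO.new("a\nb\n"), StringIO.new("c\n")]).to_a
# => ["a\n", "b\n", "c\n"]
```

Because the class includes `Enumerable`, methods such as `map`, `select`, and `each_with_index` work across all sources without extra code.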
require 'traject'
require_relative 'recusive_json_reader'
require 'traject/debug_writer'
settings do
store "reader_class_name", "MyJsonHierarchyReader"
store "writer_class_name", "Traject::DebugWriter"
store "output_file", "recursive.out"
end
require 'json'
class MyJsonHierarchyReader
# @param [#each] input_stream Probably a file, but any iterator will do
# so long as it returns a valid JSON object from #each
def initialize(input_stream, settings)
# ... whatever you need to do. Let's pretend it's
# a newline-delimited JSON file, since you didn't
# specify anything
@json_lines = input_stream
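The preview ends before the reader's `each` method. A minimal sketch of how a reader over newline-delimited JSON might yield records is shown here, assuming one JSON object per line and that blank lines should be skipped. The class name `NdjsonReaderSketch` is hypothetical, and only the constructor signature is taken from the preview.

```ruby
require 'json'
require 'stringio'

# Hypothetical sketch of a reader over newline-delimited JSON.
# Only the (input_stream, settings) constructor mirrors the preview;
# the rest is assumed for illustration.
class NdjsonReaderSketch
  include Enumerable

  def initialize(input_stream, settings = {})
    @json_lines = input_stream
    @settings = settings
  end

  # Yield one parsed JSON value per non-blank line of the input
  def each
    return enum_for(:each) unless block_given?

    @json_lines.each do |line|
      next if line.strip.empty? # assumption: blank lines are ignored
      yield JSON.parse(line)
    end
  end
end

# Usage with an in-memory stream standing in for a file
reader = NdjsonReaderSketch.new(StringIO.new(%({"id": 1}\n\n{"id": 2}\n)), {})
ids = reader.map { |rec| rec["id"] }
```

Any object whose `#each` yields lines would work as `input_stream` here, which matches the comment in the preview that the input is "probably a file, but any iterator will do".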
billdueber / marc21_changed_code.rb
Last active October 31, 2017 17:20
changed code and the simplistic config used for the benchmark
def extract_marc(spec, options = {})
# ... stuff deleted for clarity
## CREATE THE CHAIN
ppchain = Marc21.create_post_processing_chain(options, translation_map)
lambda do |record, accumulator, context|
accumulator.concat extractor.extract(record)

Samvera#General talking about Fedora

Tuesday, August 29, 2017

Mike Giarlo (5:56 PM)

Have folks here been hearing all manner of rumors today about Samvera, or certain Samvera institutions, walking away from Fedora and other community components? Some of us are hearing these rumors as of a few hours ago, and we’re trying to figure out where the misinformation is coming from.

It seems to center on Valkyrie. We did discuss Valkyrie and Fedora futures on today’s Fedora Leadership group, but not in the context the rumors are in.

require 'active-fedora'
require 'json'
def descendant_uris(uri)
begin
resource = Ldp::Resource::RdfSource.new(ActiveFedora.fedora.connection, uri)
rescue
STDERR.puts "Failed to create resource for uri #{uri}"
return []
end
billdueber / solread_patch_benchmark.rb
Last active July 27, 2017 19:24
A self-contained (read: monkeypatch) benchmarking program for SolrEad based on https://github.com/awead/solr_ead/pull/20
require 'benchmark'
require 'uri'
require 'solr_ead'
require 'concurrent'
# Make a subclass with all the speed patches
class IndexerWithPatches < SolrEad::Indexer
def additional_component_fields(node, addl_fields = Hash.new)
# Clear or create the cache
billdueber / arclight_monkeypatch.rb
Last active July 26, 2017 18:33
Monkeypatch of SolrEad to try to make indexing faster
require 'uri'
require 'solr_ead'
class SolrEad::Indexer
def additional_component_fields(node, addl_fields = Hash.new)
p_ids = parent_id_list(node)
p_unittitles = parent_unittitle_list(node)

Install using a template:

This is slightly different in that it installs the gems inside the app directory, in yourappname/.bundle, and uses the native libxml2 to compile nokogiri.

$ ruby -v # need a 2.3 or later
   => 2.4.0
$ rails -v
   => 5.0.2

Install a skeleton Rails app with Hyrax

(from Seth Johnson)

$ ruby -v
    => ruby 2.3.0p0 (2015-12-25 revision 53290) [x86_64-linu

$ rails -v
    => Rails 5.0.2