fuzzy_hash_cop.rb
Last active August 29, 2015 14:20
Using fuzzy hashing to identify source code duplication
#!/usr/bin/env ruby
# Notes:
# - Depends on gem russdeep-1.2.1
# - cutoff of 65 is arbitrary
# - results may be unreliable for files < 4k
#   (i.e. most source files)
# - config files for different environments
#   are often crazy-similar
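The notes above describe scoring pairs of files and reporting those at or above a cutoff. A minimal sketch of that loop follows. It does not use the fuzzy-hashing gem: the `similarity` function is a stand-in (Jaccard overlap of character trigrams, scaled to 0..100) chosen only so the example runs without dependencies, and the names `similarity` and `similar_pairs` are hypothetical, not taken from the gist.

```ruby
require 'set'

# Stand-in similarity score on a 0..100 scale: Jaccard overlap of
# character trigrams. This is NOT a fuzzy hash; it only mimics the
# score range so the cutoff logic can be shown.
def similarity(a, b)
  grams = ->(s) { s.each_char.each_cons(3).map(&:join).to_set }
  ga = grams.call(a)
  gb = grams.call(b)
  union = ga | gb
  return 0 if union.empty?
  ((ga & gb).size * 100.0 / union.size).round
end

# Compare every pair of entries and keep those scoring at or above
# the cutoff. `sources` maps a path to that file's contents.
# The default cutoff of 65 mirrors the (arbitrary) value in the notes.
def similar_pairs(sources, cutoff: 65)
  pairs = sources.keys.combination(2).map do |p1, p2|
    score = similarity(sources[p1], sources[p2])
    [score, p1, p2] if score >= cutoff
  end
  pairs.compact
end
```

To run this over a source tree, one could build `sources` from the files on disk, for example by reading each path returned by `Dir.glob('./**/*.rb')`. Comparing every pair is quadratic in the number of files, which is fine for a single repository but may need pruning for larger trees.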
Result of running fuzzy_hash_cop.rb @1b230eb on sipity @790338d
293 ./app/models/sipity/models/submission_window_work_type.rb
352 ./app/models/sipity/models/work_submission.rb
1270 ./app/runners/sipity/runners/work_enrichment_runners.rb
1462 ./app/runners/sipity/runners/work_event_trigger_runners.rb
312 ./db/migrate/20150201002853_create_sipity_models_processing_strategy.rb
412 ./db/migrate/20150201002900_create_sipity_models_processing_strategy_state.rb
Sample data generator for stash-wrapper
#!/usr/bin/env ruby
require 'mime/types'
require 'nokogiri'
require 'set'
require 'stash/wrapper'
ST = Stash::Wrapper
# ------------------------------------------------------------
Notes from BIDS Docker Workshop day 1, 7 Jan 2016
Stash / Docker diary

Creating a Dockerfile for Stash development

Emulating the AWS development environment

It is possible to find Docker images for Amazon Linux, but because the RPM repositories for Amazon Linux are only available within AWS, they only work if the Docker host is also on AWS.

Instead, we're going to try to use CentOS, which

msg = "required #{__FILE__}, #{caller[0]}"
File.open('/tmp/require.log', 'a') { |f| f.puts(msg) }
puts msg

Fedora breakout session


Islandora CLAW: Islandora + Fedora 4

  • in process of moving from F3 to F4, possibly Drupal 7 to Drupal 8
  • timeline?
    • Nick Ruest (York U):
  • four less-than-part-time developers