@dchandekstark
Last active August 29, 2015 14:13
Indexing API

ActiveFedora Indexing API

TL;DR: This Gist sketches an API for managing inheritable indexing configurations in ActiveFedora. The primary motivations are (1) to avoid (for most purposes) putting indexing logic in an overridden #to_solr method, and (2) to make it possible to introspect a model's or instance's indexing.

An earlier version of this idea was kicked around on IRC, and I've refined it significantly since then, mainly to accommodate the current property indexing API (from ActiveTriples). Originally, I just aimed to deal with indexing calculated values (basically model methods, as opposed to attributes or properties which already support indexing), for which one currently has to override #to_solr.

Essentially, I wanted to be able to do this type of thing on a model:

# Stupid example
class Image < ActiveFedora::Base
  # index key, *args [, &block]
  # key is interpreted as method name unless block is given
  index :dimensions, :stored_sortable do
    crazy_dimensions
  end

  def crazy_dimensions
    "#{height}x#{width}"
  end
end

Currently, I basically have to do this:

def to_solr(solr_doc = {})
  solr_doc[ActiveFedora::SolrService.solr_name(:dimensions, :stored_sortable)] = crazy_dimensions
  solr_doc # return the document, not just the assigned value
end

which (besides being ugly) means that if I want to search on that index somewhere else, I have to recalculate the index field name, because it's not actually stored anywhere.

In the proposed code I would be able to get the field name with

Image.index_name(:dimensions) # => "dimensions_ssi"
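
To make the idea concrete without ActiveFedora, here is a minimal standalone sketch of an `index` class macro that records each key's field name so it can be looked up later. Everything in it is an assumption for illustration: `SketchModel`, `SketchImage`, `SUFFIXES` and the top-level `solr_name` are hypothetical stand-ins, and the only suffix mapping used (`:stored_sortable` to `_ssi`) is the one shown above.

```ruby
# Standalone sketch; does not use ActiveFedora.
# SUFFIXES and solr_name are hypothetical stand-ins for the real
# field-name resolver (ActiveFedora::SolrService.solr_name).
SUFFIXES = { stored_sortable: "_ssi" }.freeze

def solr_name(key, type)
  "#{key}#{SUFFIXES.fetch(type)}"
end

class SketchModel
  class << self
    # key => [field name, block or method name]
    # (per-class storage only; inheritance is not handled in this sketch)
    def index_map
      @index_map ||= {}
    end

    # key is interpreted as a method name unless a block is given
    def index(key, type, &block)
      index_map[key] = [solr_name(key, type), block || key]
    end

    # The field name is stored, so it never has to be recalculated
    def index_name(key)
      index_map.fetch(key).first
    end
  end

  def to_solr
    self.class.index_map.each_with_object({}) do |(_key, (name, source)), doc|
      doc[name] = source.respond_to?(:call) ? instance_exec(&source) : send(source)
    end
  end
end

class SketchImage < SketchModel
  attr_reader :height, :width

  def initialize(height, width)
    @height = height
    @width = width
  end

  index :height, :stored_sortable
  index :dimensions, :stored_sortable do
    "#{height}x#{width}"
  end
end

SketchImage.index_name(:dimensions) # => "dimensions_ssi"
SketchImage.new(600, 800).to_solr   # => {"height_ssi"=>600, "dimensions_ssi"=>"600x800"}
```

The point of the sketch is only that the key-to-field-name mapping lives in one place, so searching code can ask for it instead of rebuilding it.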

In response to a desire not to have to put index statements on models, and to extend this capability to properties, I modified it to support multiple "indexers", which can be added like this:

class Image < ActiveFedora::Base
  index_with ImageIndexer
end

where ImageIndexer is a subclass of ActiveFedora::Indexer (a PORO) and contains index statements.

For model-level and property index statements, there is a "default indexer" to which those statements are added.

Indexers are inherited by subclasses and instances via Rails's class_attribute mechanism.
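
The inheritance behaviour I'm relying on can be imitated in plain Ruby. The sketch below is not ActiveSupport's implementation of class_attribute; it only mimics the part that matters here (a subclass reads its parent's list until it assigns its own), and `SketchBase`, `SketchPicture` and `SketchThumbnail` are hypothetical names.

```ruby
# Plain-Ruby imitation of the inheritance behaviour; runs without Rails.
class SketchBase
  class << self
    attr_writer :indexers

    # Fall back to the superclass's list until this class assigns its own
    def indexers
      return @indexers if instance_variable_defined?(:@indexers)
      superclass.respond_to?(:indexers) ? superclass.indexers : []
    end

    def index_with(*new_indexers)
      # Reassign rather than mutate, so the parent's list is left untouched
      self.indexers = indexers + new_indexers
    end
  end
end

class SketchPicture < SketchBase
  index_with :picture_indexer
end

class SketchThumbnail < SketchPicture
  index_with :thumbnail_indexer
end

SketchBase.indexers      # => []
SketchPicture.indexers   # => [:picture_indexer]
SketchThumbnail.indexers # => [:picture_indexer, :thumbnail_indexer]
```

Reassigning with `+` instead of appending with `<<` is the important design choice: mutating the inherited array in place would leak a subclass's indexers into its parent.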

With all this, when an object has to generate its solr document, it can simply iterate through its indexers, passing itself to each one and collecting all the (index name, value) pairs.

module ActiveFedora
  class Base
    include Indexing
    index_with ObjectIndexer
  end
end

class Image < ActiveFedora::Base
  index_with ImageIndexer

  # Property builder would need to be modified so that the block
  # is executed in the context of the model class, i.e.
  # something like:
  #
  #   def build(&block)
  #     NodeConfig.new(name, options[:predicate], options.except(:predicate))
  #     # pass the property name through so the block receives it
  #     model.instance_exec(name, &block) if block_given?
  #   end
  #
  property :foo, predicate: RDF::DC.title do |property_name|
    index property_name, :stored_sortable
  end
end

class ImageIndexer < ActiveFedora::Indexer
  index :height, :stored_sortable # gets value from #height instance method
  index :width, :stored_sortable do
    horizontal # gets value from #horizontal instance method
  end
end

# Represents a single index "field"
module ActiveFedora
  class Index
    attr_reader :name, :method

    def initialize(name, method)
      @name = name
      @method = method
    end

    def value(obj)
      if method.respond_to?(:call)
        obj.instance_exec(&method)
      else
        obj.send(method)
      end
    end
  end
end

module ActiveFedora
  class Indexer
    module IndexerMethods
      # Adds an index
      # @api public
      def index(key, *args, &block)
        if index_map.key?(key)
          raise "Another index has been mapped to the key #{key.inspect}: #{index_map[key].inspect}"
        end
        method = block || key
        name = SolrService.solr_name(key, *args)
        add_index(key, name, method)
      end

      # Adds an index to the class
      def add_index(key, name, method)
        self.index_map = self.index_map.merge(key => Index.new(name, method))
      end

      # Returns the list of indexes for this indexer
      def indexes
        index_map.values
      end

      # Returns the names of the index fields
      def index_names
        indexes.map(&:name)
      end

      # Returns the name of the index field mapped to the key
      # @raise [KeyError]
      def index_name(key)
        index_map.fetch(key).name
      end

      # Returns a hash of index names and values - i.e., for generating a solr doc
      def index_names_and_values(object)
        indexes.each_with_object({}) do |idx, fields|
          fields[idx.name] = idx.value(object)
        end
      end
    end # IndexerMethods

    class << self
      attr_accessor :index_map

      def inherited(subclass)
        subclass.index_map = index_map.dup
        super
      end
    end

    self.index_map = {}

    attr_accessor :index_map

    extend IndexerMethods
    include IndexerMethods

    def initialize
      @index_map = self.class.index_map.dup
    end
  end
end

module ActiveFedora
  module Indexing
    extend ActiveSupport::Concern

    included do
      #
      # A list of indexers
      #
      # An indexer must implement this interface:
      #
      #   .index_map => Hash of key, index
      #   .index_names_and_values(obj) => Hash of index name, value for obj
      #
      # @see ActiveFedora::Indexer
      #
      class_attribute :indexers, :default_indexer
      self.indexers = []
    end

    module ClassMethods
      def inherited(subclass)
        subclass.default_indexer = Indexer.new
        # Register the subclass's own default indexer, not the parent's
        subclass.index_with subclass.default_indexer
        super
      end

      # Adds an index to the default indexer
      def index(key, *args, &block)
        default_indexer.index(key, *args, &block)
      end

      # Adds one or more indexers (classes)
      #
      #   index_with MyIndexer, YourIndexer
      #
      def index_with(*idxers)
        # Explicit receiver needed: a bare `indexers += ...` would assign a local variable
        self.indexers += idxers.map { |idxer| idxer.is_a?(Class) ? idxer.new : idxer }
      end

      # Returns a combined list of indexes from all the indexers
      def indexes
        indexers.map(&:indexes).flatten
      end

      # Returns combined key, index map
      def index_map
        indexers.map(&:index_map).reduce({}, :merge)
      end

      # Returns list of all index field names (i.e. solr names)
      def index_names
        indexes.map(&:name)
      end

      # Returns index field name for a key
      def index_name(key)
        index_map[key].name
      end
    end # ClassMethods

    def to_solr
      doc = {}
      # rdf stuff
      # relationship stuff
      # whatever
      doc.merge(index_names_and_values)
    end

    def index_names_and_values
      indexers.map { |idxer| idxer.index_names_and_values(self) }.reduce({}, :merge)
    end
  end
end

module ActiveFedora
  class ObjectIndexer < Indexer
    index :object_profile, :displayable do
      # something goes here
    end

    index :system_create, :stored_sortable do
      ctime
    end

    index :system_modified, :stored_sortable do
      mtime
    end

    index :active_fedora_model, :stored_sortable do
      # something goes here
    end

    # TODO Solr Document ID
    index :has_model, :symbol # verify :symbol
  end
end
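
For anyone who wants to try the document-building step in isolation, the following is a minimal standalone sketch of merging the output of several indexers into one solr document. It assumes only the interface described above (`index_names_and_values(obj)` returning a hash); `SketchPhoto`, `HeightIndexer`, `DimensionsIndexer` and `build_solr_doc` are hypothetical names, not part of ActiveFedora.

```ruby
# Standalone sketch; the indexers here are hypothetical stand-ins that
# only implement #index_names_and_values(obj).
SketchPhoto = Struct.new(:height, :width)

class HeightIndexer
  def index_names_and_values(obj)
    { "height_ssi" => obj.height }
  end
end

class DimensionsIndexer
  def index_names_and_values(obj)
    { "dimensions_ssi" => "#{obj.height}x#{obj.width}" }
  end
end

# Ask every indexer for its fields, then merge them over the base document
def build_solr_doc(obj, indexers, base_doc = {})
  fields = indexers.map { |indexer| indexer.index_names_and_values(obj) }
                   .reduce({}, :merge)
  base_doc.merge(fields)
end

photo = SketchPhoto.new(600, 800)
build_solr_doc(photo, [HeightIndexer.new, DimensionsIndexer.new], "id" => "photo-1")
# => {"id"=>"photo-1", "height_ssi"=>600, "dimensions_ssi"=>"600x800"}
```

Because the hashes are merged in order, a later indexer silently overwrites a field produced by an earlier one; whether that should raise instead is an open design question.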
@escowles

👍 as someone who's gone down the path of overriding to_solr, this looks like a much better approach.

@hectorcorrea

👍

@awead

awead commented Jan 16, 2015

Good stuff. Maybe it could be a topic at the Hydra meeting in Portland?

@dchandekstark
Author

Yeah, Portland seems like a good opportunity to explore this further with the community.
