I want to derive a mental picture of a blacklight search. I know that I submit a form and that a query is made against a solr db, the result of which is processed and returned to me. However, I need a more detailed picture of the interim steps.
Specifically, I want to get at:
- What is the function that first receives the form's input?
- Where and how can I make changes to that process?
- How does blacklight_advanced_search fit in?
- What ultimately calls the solr db?
- What function processes the solr response?
- (How)/is it possible to add custom preprocessing to the returned search input?
(For those who would rather just get to the point):
- What is the function that first receives the form's input?
  - The entry point is CatalogController#index
  - But the interesting things start at Blacklight::SearchHelper.search_results.
- Where and how can I make changes to that process?
  - There are many opportunities to change how the search behaves. The easiest and most intuitive is to update the default_processor_chain with the methods that you want to apply to search params before they are sent to the solr server.
- How does blacklight_advanced_search fit in?
  - blacklight_advanced_search uses mixins and updates the default_processor_chain in order to change the behavior of the search.
- What ultimately calls the solr db?
  - RSolr.connect(connection_config.merge(adapter: connection_config[:http_adapter]))
- What function processes the solr response?
  - solr_response = blacklight_config.response_model.new(res, solr_params, document_model: blacklight_config.document_model, blacklight_config: blacklight_config)
- (How)/is it possible to add custom preprocessing to the returned search input?
  - Yes, and both the response_model and document_model are configurable, so one idea could be to wrap one of the default objects and enhance it with whatever functionality we are seeking to add.
The first step in our journey starts with the search form:
<form class="search-query-form clearfix navbar-form" role="search" action="http://localhost:32823/" accept-charset="UTF-8" method="get" _lpchecked="1"><input name="utf8" type="hidden" value="✓">
<!-- .. -->
</form>
As we can see the action path is set to the root of the application ("/") and the method is set to get.
We look inside config/routes.rb to see how a GET request to "/" is handled by the application.
Inside that file we see the following line:
root to: "catalog#index"
What this means is that requests to the root of our application get routed to the CatalogController#index action. So we need to determine how CatalogController is defined and where the index action comes from.
In ./app/controllers/catalog_controller.rb we see the definition for CatalogController:
class CatalogController < ApplicationController
include BlacklightAdvancedSearch::Controller
include BlacklightRangeLimit::ControllerOverride
include Blacklight::Catalog
# ...
And with a little sleuthing, we determine that Blacklight::Catalog#index gets defined in blacklight-6.12.0/app/controllers/concerns/blacklight/catalog.rb. The definition reveals our entry point:
def index
  (@response, @document_list) = search_results(params)
  respond_to do |format|
    format.html { store_preferred_view }
    format.rss { render :layout => false }
    format.atom { render :layout => false }
    format.json do
      @presenter = Blacklight::JsonPresenter.new(@response,
                                                 @document_list,
                                                 facets_from_request,
                                                 blacklight_config)
    end
    additional_response_formats(format)
    document_export_formats(format)
  end
end
Fortunately this is a relatively straightforward function at this level. We can see that the @response and @document_list instance variables are assigned from the return value of search_results(params). Note that params is a hash of the URL parameters as provided to the controller by Rails.
It's also important to note that because @response and @document_list are defined as instance variables, they are available to any view or helper method that the CatalogController uses.
In the next step the controller (via the index action) sets the response format based on the MIME type of the request via the respond_to DSL: https://apidock.com/rails/ActionController/MimeResponds/respond_to
Following search_results to its definition, we find it at blacklight-6.12.0/app/controllers/concerns/blacklight/search_helper.rb:
module Blacklight::SearchHelper
  extend ActiveSupport::Concern
  include Blacklight::RequestBuilders
  # a solr query method
  # @param [Hash] user_params ({}) the user provided parameters (e.g. query, facets, sort, etc)
  # @yield [search_builder] optional block yields configured SearchBuilder, caller can modify or create new SearchBuilder to be used. Block should return SearchBuilder to be used.
  # @return [Blacklight::Solr::Response] the solr response object
  def search_results(user_params)
    builder = search_builder.with(user_params)
    builder.page = user_params[:page] if user_params[:page]
    builder.rows = (user_params[:per_page] || user_params[:rows]) if user_params[:per_page] || user_params[:rows]
    builder = yield(builder) if block_given?
    response = repository.search(builder)
    if response.grouped? && grouped_key_for_results
      [response.group(grouped_key_for_results), []]
    elsif response.grouped? && response.grouped.length == 1
      [response.grouped.first, []]
    else
      [response, response.documents]
    end
  end
This is where things start to get interesting. We can see, for instance, that we will be using a builder to generate a response. This builder seems both configurable and overridable: it is first generated from the user_params (builder = search_builder.with(user_params)) and can then be replaced by the return value of a block, if one is given (builder = yield(builder) if block_given?).
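To make the "generate, then optionally override via a block" flow concrete, here is a minimal plain-Ruby sketch. ToyBuilder and toy_search_results are hypothetical stand-ins invented for illustration; they only mimic the shape of the excerpt and are not Blacklight's classes.

```ruby
# Toy stand-in for a search builder (hypothetical, not Blacklight's class).
class ToyBuilder
  attr_reader :params, :chain

  def initialize(chain = [:add_query])
    @chain = chain
    @params = {}
  end

  # Same shape as the with call in the excerpt: store a copy, return self.
  def with(user_params)
    @params = user_params.dup
    self
  end

  # Return a new builder without the named steps.
  def except(*steps)
    self.class.new(chain - steps).with(params)
  end

  # Return a new builder with extra steps added at the end.
  def append(*steps)
    self.class.new(chain + steps).with(params)
  end
end

# Same control flow as search_results: build from the user params, let the
# caller swap the builder through a block, then hand it on (here: return it).
def toy_search_results(user_params)
  builder = ToyBuilder.new.with(user_params)
  builder = yield(builder) if block_given?
  builder
end

plain = toy_search_results({ q: "maps" })
custom = toy_search_results({ q: "maps" }) do |b|
  b.except(:add_query).append(:add_facets)
end

p plain.chain   # [:add_query]
p custom.chain  # [:add_facets]
```

The point of the sketch is that the caller never has to reach into the method: whatever the block returns is the builder that gets used.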
[Quick aside]: As an example of how the builder override is invoked, we can take a look at the Blacklight::Catalog#index override defined in the blacklight_advanced_search gem at blacklight_advanced_search-6.3.1/app/controllers/blacklight_advanced_search/advanced_controller.rb, where you can see search_results is called using a block.
class BlacklightAdvancedSearch::AdvancedController < CatalogController
  def index
    @response = get_advanced_search_facets unless request.method == :post
  end
  protected
  # Override to use the engine routes
  def search_action_url(options = {})
    blacklight_advanced_search_engine.url_for(options.merge(action: 'index'))
  end
  def get_advanced_search_facets
    # We want to find the facets available for the current search, but:
    # * IGNORING current query (add in facets_for_advanced_search_form filter)
    # * IGNORING current advanced search facets (remove add_advanced_search_to_solr filter)
    response, _ = search_results(params) do |search_builder|
      search_builder.except(:add_advanced_search_to_solr).append(:facets_for_advanced_search_form)
    end
    response
  end
end
Going back to the search_results function, we note that the main pattern is the creation of a builder object (also referred to as a query) and passing the builder/query object to the repository.search method:
def search_results(user_params)
# ...
builder = search_builder.with(user_params)
response = repository.search(builder)
# ...
end
# ...
def get_facet_field_response(facet_field, user_params = params || {}, extra_controller_params = {})
# ...
query = search_builder.with(user_params).facet(facet_field)
repository.search(query.merge(extra_controller_params))
#...
end
# ...
def get_previous_and_next_documents_for_search(index, request_params, extra_controller_params={})
#...
query = search_builder.with(request_params).start(p.delete(:start)).rows(p.delete(:rows)).merge(extra_controller_params).merge(p)
response = repository.search(query)
#...
end
# ...
def get_opensearch_response(field = nil, request_params = params || {}, extra_controller_params = {})
#...
query = search_builder.with(request_params).merge(solr_opensearch_params(field)).merge(extra_controller_params)
response = repository.search(query)
#...
end
Given this general pattern, the next obvious things to take a look at are the search_builder and repository definitions.
In the current file we find a definition for the repository:
delegate :repository_class, to: :blacklight_config
def repository
repository_class.new(blacklight_config)
end
And in blacklight-6.12.0/app/controllers/concerns/blacklight/request_builders.rb we find a definition for search_builder:
module Blacklight
module RequestBuilders
extend ActiveSupport::Concern
#...
# Override this method to use a search builder other than the one in the config
delegate :search_builder_class, to: :blacklight_config
def search_builder
search_builder_class.new(self)
end
#...
end
end
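Both excerpts lean on ActiveSupport's delegate to forward repository_class and search_builder_class to the config object. As a rough sketch of what that forwarding amounts to, the following uses Ruby's standard-library Forwardable instead (a swapped-in technique, not what Blacklight uses). DemoConfig, DemoRepository, DemoSearchBuilder and DemoController are hypothetical names made up for the example.

```ruby
require "forwardable"

# Hypothetical config object holding the classes to instantiate.
DemoConfig = Struct.new(:repository_class, :search_builder_class)

class DemoRepository
  attr_reader :config

  def initialize(config)
    @config = config
  end
end

class DemoSearchBuilder
  attr_reader :scope

  def initialize(scope)
    @scope = scope
  end
end

class DemoController
  extend Forwardable
  # Stdlib counterpart to `delegate :repository_class, to: :blacklight_config`
  def_delegators :blacklight_config, :repository_class, :search_builder_class

  def blacklight_config
    @blacklight_config ||= DemoConfig.new(DemoRepository, DemoSearchBuilder)
  end

  # Same bodies as the excerpts: ask the config which class to build.
  def repository
    repository_class.new(blacklight_config)
  end

  def search_builder
    search_builder_class.new(self)
  end
end

controller = DemoController.new
p controller.repository.class      # DemoRepository
p controller.search_builder.class  # DemoSearchBuilder
```

Swapping either class in the config changes what the controller builds without touching the controller itself, which is why the config object is the thing worth chasing next.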
In both the repository and search_builder definitions we note that blacklight_config is invoked (meaning these objects are configurable). So our next task is to find where and how blacklight_config is defined. To answer that question, let's go back and take a look at the definition for Blacklight::Catalog found at blacklight-6.12.0/app/controllers/concerns/blacklight/catalog.rb:
module Blacklight::Catalog
extend ActiveSupport::Concern
include Blacklight::Base
include Blacklight::DefaultComponentConfiguration
include Blacklight::Facet
#...
One note of interest here is that Blacklight::Catalog is not a controller but a concern, which as you may recall is included into the CatalogController (see the CatalogController section above).
You may think that the line include Blacklight::DefaultComponentConfiguration would suggest that blacklight_config is defined there, but in fact that file only uses blacklight_config. The next place to look is Blacklight::Base, which is defined at blacklight/app/controllers/concerns/blacklight/base.rb:
module Blacklight::Base
extend ActiveSupport::Concern
include Blacklight::Configurable
include Blacklight::SearchHelper
include Blacklight::SearchContext
OK, we are making some progress. What is inside of Blacklight::Configurable? Well, that ends up also being a concern, but this time it's defined under the models directory: blacklight-6.12.0/app/models/concerns/blacklight/configurable.rb
module Blacklight::Configurable
extend ActiveSupport::Concern
included do
helper_method :blacklight_config if respond_to? :helper_method
end
#instance methods for blacklight_config, so get a deep copy of the class-level config
def blacklight_config
@blacklight_config ||= self.class.blacklight_config.deep_copy
end
attr_writer :blacklight_config
module ClassMethods
def copy_blacklight_config_from(other_class)
self.blacklight_config = other_class.blacklight_config.inheritable_copy
end
# lazy load a deep_copy of superclass if present, else
# a default_configuration, which will be legacy load or new empty config.
# note the @blacklight_config variable is a ruby 'instance method on class
# object' that won't be automatically available to subclasses, that's why
# we lazy load to 'inherit' how we want.
def blacklight_config
@blacklight_config ||= if superclass.respond_to?(:blacklight_config)
superclass.blacklight_config.deep_copy
else
default_configuration
end
end
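The "lazy load to inherit" comment in that excerpt is easier to follow with a small experiment. The sketch below is plain Ruby with hypothetical names (ToyConfigurable, toy_config, ParentController, ChildController), and it uses a Marshal round trip as an assumed stand-in for deep_copy; it only illustrates the pattern, not Blacklight's implementation.

```ruby
module ToyConfigurable
  def self.included(base)
    base.extend(ClassMethods)
  end

  # Instances work on their own deep copy of the class-level config.
  def toy_config
    @toy_config ||= Marshal.load(Marshal.dump(self.class.toy_config))
  end

  module ClassMethods
    # On first access, copy the superclass config if there is one;
    # otherwise fall back to a default. The copy lives in an instance
    # variable on the class object, so subclasses never share it.
    def toy_config
      @toy_config ||= if superclass.respond_to?(:toy_config)
                        Marshal.load(Marshal.dump(superclass.toy_config))
                      else
                        { per_page: 10 }
                      end
    end
  end
end

class ParentController
  include ToyConfigurable
end

ParentController.toy_config[:per_page] = 20

class ChildController < ParentController
end

# The child starts from the parent's settings, then diverges.
ChildController.toy_config[:sort] = "title"

puts ParentController.toy_config.key?(:sort)  # false
puts ChildController.toy_config[:per_page]    # 20
```

So a subclass picks up whatever its parent had configured at the moment it first asks, and later changes on either side stay local.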
Paydirt?
Unfortunately there is nothing obvious in the blacklight_config definition that suggests that we can delegate either search_builder_class or repository_class to it, which is what we've been after in this part of the journey. So what gives? Well, we need to follow to where the default_configuration is set.
# ...
attr_writer :blacklight_config
#simply a convenience method for blacklight_config.configure
def configure_blacklight(*args, &block)
blacklight_config.configure(*args, &block)
end
##
# The default configuration object
def default_configuration
Blacklight::Configurable.default_configuration.inheritable_copy
end
end
def self.default_configuration
@default_configuration ||= Blacklight::Configuration.new
end
def self.default_configuration= config
@default_configuration = config
end
The default configuration is set to be Blacklight::Configuration.new. So, that is where we need to go next:
blacklight/lib/blacklight.rb
require 'kaminari'
require 'deprecation'
require 'blacklight/utils'
require 'active_support/hash_with_indifferent_access'
module Blacklight
autoload :AbstractRepository, 'blacklight/abstract_repository'
autoload :Configuration, 'blacklight/configuration'
autoload :Exceptions, 'blacklight/exceptions'
autoload :Parameters, 'blacklight/parameters'
autoload :Routes, 'blacklight/routes'
autoload :RuntimeRegistry, 'blacklight/runtime_registry'
autoload :SearchBuilder, 'blacklight/search_builder'
autoload :SearchState, 'blacklight/search_state'
autoload :Solr, 'blacklight/solr'
extend Deprecation
require 'blacklight/version'
require 'blacklight/engine' if defined?(Rails)
OK, we are getting closer according to this line:
autoload :Configuration, 'blacklight/configuration'
Next we go look in that file at blacklight/lib/blacklight/configuration.rb
And, we found our answer!!!
module Blacklight
##
# Blacklight::Configuration holds the configuration for a Blacklight::Controller, including
# fields to display, facets to show, sort options, and search fields.
class Configuration < OpenStructWithHashAccess
# ...
# Set up Blacklight::Configuration.default_values to contain
# the basic, required Blacklight fields
class << self
# ...
def repository_class
super || Blacklight::Solr::Repository
end
# ...
def search_builder_class
super || locate_search_builder_class
end
def locate_search_builder_class
::SearchBuilder
end
So now we know that the default repository_class is Blacklight::Solr::Repository and that the default search_builder_class is ::SearchBuilder.
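The super || SomeDefault idiom in that class is worth a second look, because it is what lets a configured value win over the built-in one. Here is a minimal sketch of the idea in plain Ruby. ToyOpenStruct is a hypothetical, much-simplified stand-in for OpenStructWithHashAccess, and the repository class names are invented for the example.

```ruby
# Minimal OpenStruct-like base: unknown getters read from a table,
# unknown setters write to it. (Hypothetical stand-in.)
class ToyOpenStruct
  def initialize(values = {})
    @table = values.dup
  end

  def method_missing(name, *args)
    key = name.to_s
    if key.end_with?("=")
      @table[key.chomp("=").to_sym] = args.first
    else
      @table[name]
    end
  end

  def respond_to_missing?(name, include_private = false)
    name.to_s.end_with?("=") || @table.key?(name) || super
  end
end

class DefaultRepository; end
class CustomRepository; end

class ToyConfiguration < ToyOpenStruct
  # `super` finds no real method upstream, so Ruby falls through to
  # method_missing, which returns the stored value or nil.
  def repository_class
    super || DefaultRepository
  end
end

p ToyConfiguration.new.repository_class
# DefaultRepository
p ToyConfiguration.new(repository_class: CustomRepository).repository_class
# CustomRepository
```

In other words, the default only applies while nothing has been set, so assigning a different class in the configuration is enough to replace it.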
We'll want to take a look at what the basic query.with(user_params) and repository.search(query) do.
First let's take a look at query.with:
blacklight/lib/blacklight/search_builder.rb
##
# Set the parameters to pass through the processor chain
def with(blacklight_params = {})
params_will_change!
@blacklight_params = blacklight_params.dup
self
end
So the most basic version of the query object simply sets the @blacklight_params instance variable to a copy of whatever is passed into the with method. That's pretty straightforward. Anything fancier would more likely happen manually or via a passed-in block.
Now taking a quick look at the default repository search method:
module Blacklight::Solr
class Repository < Blacklight::AbstractRepository
##
# Execute a search query against solr
# @param [Hash] params solr query parameters
def search params = {}
send_and_receive blacklight_config.solr_path, params.reverse_merge(qt: blacklight_config.qt)
end
##
# Execute a solr query
# @see [RSolr::Client#send_and_receive]
# @overload find(solr_path, params)
# Execute a solr query at the given path with the parameters
# @param [String] solr path (defaults to blacklight_config.solr_path)
# @param [Hash] parameters for RSolr::Client#send_and_receive
# @overload find(params)
# @param [Hash] parameters for RSolr::Client#send_and_receive
# @return [Blacklight::Solr::Response] the solr response object
def send_and_receive(path, solr_params = {})
benchmark("Solr fetch", level: :debug) do
key = blacklight_config.http_method == :post ? :data : :params
res = connection.send_and_receive(path, {key=>solr_params.to_hash, method: blacklight_config.http_method})
solr_response = blacklight_config.response_model.new(res, solr_params, document_model: blacklight_config.document_model, blacklight_config: blacklight_config)
There are several interesting lines here. First, let's talk about the obvious one:
solr_response = blacklight_config.response_model.new(res, solr_params, document_model: blacklight_config.document_model, blacklight_config: blacklight_config)
This line is interesting because we can see that the response model and the document model are both configurable (i.e. we can substitute our own wrapper classes via configuration):
def response_model
super || Blacklight::Solr::Response
end
def document_model
super || ::SolrDocument
end
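As a sketch of what "substitute our own wrapper classes" could look like, here is a plain-Ruby toy. ToyResponse, ToyDocument and SortedResponse are hypothetical stand-ins, and the hash named config only plays the role of blacklight_config; none of this is Blacklight's actual API.

```ruby
# Toy document model: wraps one raw document hash.
class ToyDocument
  attr_reader :fields

  def initialize(fields)
    @fields = fields
  end
end

# Toy response model: wraps the raw response and builds documents
# with whichever document class it was handed.
class ToyResponse
  attr_reader :raw, :document_model

  def initialize(raw, document_model:)
    @raw = raw
    @document_model = document_model
  end

  def documents
    raw.fetch("docs", []).map { |doc| document_model.new(doc) }
  end
end

# Hypothetical customisation: post-process the documents after the fact.
class SortedResponse < ToyResponse
  def documents
    super.sort_by { |doc| doc.fields["title"].to_s }
  end
end

# Which classes get instantiated is just configuration data.
config = { response_model: SortedResponse, document_model: ToyDocument }

raw = { "docs" => [{ "title" => "Zebra" }, { "title" => "Apple" }] }
response = config[:response_model].new(raw, document_model: config[:document_model])
titles = response.documents.map { |doc| doc.fields["title"] }
p titles  # ["Apple", "Zebra"]
```

Subclassing the default and calling super keeps the stock behavior intact and layers the extra processing on top, which is the "wrap and enhance" idea from the summary.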
But the line right before it in send_and_receive is just as interesting, although indirectly:
res = connection.send_and_receive(path, {key=>solr_params.to_hash, method: blacklight_config.http_method})
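Specifically, the bit that says solr_params.to_hash is super important, as we find out if we look up the definition of the to_hash method in this case (solr_params here appears to be the search builder that search_results handed to repository.search):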
def initialize(*options)
  # ...
  @processor_chain ||= default_processor_chain.dup
  # ...
end
def to_hash
  return @params unless params_need_update?
  @params = processed_parameters.
              reverse_merge(@reverse_merged_params).
              merge(@merged_params).
              tap { self.clear_changes }
end
alias_method :query, :to_hash
alias_method :to_h, :to_hash
# ...
def processed_parameters
  request.tap do |request_parameters|
    processor_chain.each do |method_name|
      send(method_name, request_parameters)
    end
  end
end
The above code extraction (edited for clarification) reveals the basic pattern at the heart of a configurable way to manipulate search attributes prior to making the solr request. We can see that processed_parameters uses the configurable @processor_chain to mutate the request_parameters, and that to_hash/query/to_h uses the processed_parameters method.
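To check my understanding of the processor chain, I find it helps to rebuild the pattern in a few lines of plain Ruby. ChainBuilder and DowncasingBuilder below are hypothetical toys, and the step names (add_query, add_paging, downcase_query) are invented for illustration; the sketch assumes nothing about Blacklight beyond the shape shown above.

```ruby
class ChainBuilder
  # Names of the methods to run, in order. (A plain class method keeps
  # this sketch dependency-free.)
  def self.default_processor_chain
    [:add_query, :add_paging]
  end

  attr_reader :processor_chain

  def initialize
    @processor_chain = self.class.default_processor_chain.dup
    @user_params = {}
    @params = nil
  end

  def with(user_params)
    @user_params = user_params.dup
    @params = nil # force re-processing on the next to_hash
    self
  end

  # Run every step against one shared hash, then cache the result.
  def to_hash
    @params ||= {}.tap do |solr_params|
      processor_chain.each { |step| send(step, solr_params) }
    end
  end

  def add_query(solr_params)
    solr_params[:q] = @user_params[:q]
  end

  def add_paging(solr_params)
    solr_params[:rows] = @user_params.fetch(:per_page, 10)
  end
end

# Hypothetical custom step, added by extending the chain in a subclass.
class DowncasingBuilder < ChainBuilder
  def self.default_processor_chain
    super + [:downcase_query]
  end

  def downcase_query(solr_params)
    solr_params[:q] = solr_params[:q].to_s.downcase
  end
end

solr_params = DowncasingBuilder.new.with({ q: "Old MAPS", per_page: 5 }).to_hash
puts solr_params[:q]     # old maps
puts solr_params[:rows]  # 5
```

Each step is just a method that receives the hash being built and mutates it, so adding behavior means writing one method and listing its name in the chain.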
At this point we have enough detail to derive a relatively accurate mental picture of the basic components of a generic blacklight search.
However, we do not yet have a clear understanding of how blacklight_advanced_search fits into this picture.
Let's go back to our routes configuration file (config/routes.rb) to see if we can find some clues.
mount BlacklightAdvancedSearch::Engine => "/"
This tells us that blacklight_advanced_search is an engine, which according to the Rails docs is a type of plugin that is essentially a mini application that we can mount on our (host) application: http://guides.rubyonrails.org/engines.html
So how does this engine change our application? Well, to get an idea of what it will do, we should take a look at the generator code that is used to add it to our code:
blacklight_advanced_search-6.3.1/lib/generators/blacklight_advanced_search/install_generator.rb
Of those changes, the ones that we are most interested in are as follows:
def inject_search_builder
inject_into_file 'app/models/search_builder.rb', after: /include Blacklight::Solr::SearchBuilderBehavior.*$/ do
"\n include BlacklightAdvancedSearch::AdvancedSearchBuilder" \
"\n self.default_processor_chain += [:add_advanced_parse_q_to_solr, :add_advanced_search_to_solr]"
end
end
def install_catalog_controller_mixin
inject_into_class "app/controllers/catalog_controller.rb", "CatalogController" do
" include BlacklightAdvancedSearch::Controller\n"
end
end
def configuration
inject_into_file 'app/controllers/catalog_controller.rb', after: "configure_blacklight do |config|" do
"\n # default advanced config values" \
"\n config.advanced_search ||= Blacklight::OpenStructWithHashAccess.new" \
"\n # config.advanced_search[:qt] ||= 'advanced'" \
"\n config.advanced_search[:url_key] ||= 'advanced'" \
"\n config.advanced_search[:query_parser] ||= 'dismax'" \
"\n config.advanced_search[:form_solr_parameters] ||= {}\n"
end
Above we can see that we will be adding two new methods to the processor chain. Remember from our discussion earlier that configuring the default_processor_chain is one of the many ways to override the blacklight search behavior; the blacklight_advanced_search gem is taking advantage of that possibility.
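The two lines the generator injects (an include plus a += on the chain) can be mimicked in plain Ruby to see why both are needed. Everything below is a hypothetical toy: BaseBuilder, ToyAdvancedSearch, AppSearchBuilder and the add_toy_advanced step are invented names, and the hand-rolled class-level accessor only approximates how an inheritable class attribute behaves.

```ruby
class BaseBuilder
  class << self
    attr_writer :default_processor_chain

    # Fall back to the superclass chain until this class sets its own.
    def default_processor_chain
      return @default_processor_chain if @default_processor_chain
      superclass.respond_to?(:default_processor_chain) ? superclass.default_processor_chain : []
    end
  end

  self.default_processor_chain = [:add_query]

  def add_query(solr_params)
    solr_params[:q] = "maps"
  end

  # Run each named step against a fresh hash and return it.
  def to_hash
    self.class.default_processor_chain.each_with_object({}) do |step, solr_params|
      send(step, solr_params)
    end
  end
end

# The mixin supplies the step's implementation...
module ToyAdvancedSearch
  def add_toy_advanced(solr_params)
    solr_params[:toy_advanced] = true
  end
end

class AppSearchBuilder < BaseBuilder
  include ToyAdvancedSearch
  # ...and the chain update is what actually gets the step called.
  self.default_processor_chain += [:add_toy_advanced]
end

puts AppSearchBuilder.default_processor_chain.inspect  # [:add_query, :add_toy_advanced]
puts BaseBuilder.default_processor_chain.inspect       # [:add_query]
```

Including the module without updating the chain would define a method nobody calls, and updating the chain without the module would name a method that does not exist, so the generator has to do both.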
The other changes include mixins and default configuration settings, and those behave as we would expect them to, so I won't go into detail about them.