@harikt
Forked from mankyKitty/SolrQuerySyntaxPrimer.md
Created July 13, 2020 15:26
The documentation around the basics of the Solr query syntax is terrible; this is an attempt to alleviate the doc-shock associated with trying to learn some fundamentals.

Solr Query Syntax Basics

This is a super basic beginner's guide to Solr Lucene query syntax. We're going to cover running a straightforward query, as well as some of the more useful functionality such as filtering and creating facets. We'll point out some things you can't do and generally give you enough instruction so that you can get yourself into trouble.

For testing you need a REST client capable of sending requests to your Solr instance. RESTClient for Firefox and Postman for Chrome are both good choices.

Misc

Request Specific Fields

To specify a list of fields to return instead of the default Solr response, use fl and provide a comma-delimited list of fields:

&fl=ss_destination,fs_total_price,ds_departure_date

Limit Number of results

To specify the number of results to return include the rows parameter with a numeric value:

&rows=10

This can be combined with the start parameter to create pagination queries:

&rows=10&start=0
...
&rows=10&start=10
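
The relationship between a page number and the rows/start pair can be sketched in a few lines of Python. The helper name page_params and the 1-based page numbering are assumptions made for illustration; Solr itself only sees rows and start.

```python
def page_params(page, page_size=10):
    """Return the rows/start parameters for a 1-based page number.

    `page_params` is a hypothetical helper, not part of Solr or any client.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    # start is a zero-based offset into the full result list
    return {"rows": page_size, "start": (page - 1) * page_size}


first_page = page_params(1)   # rows=10, start=0
second_page = page_params(2)  # rows=10, start=10
```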

Query String

To provide Apache Solr with a query string to search on you must use the q parameter like so:

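q=Cruises

You can also use AND as well as OR (both must be uppercase) in this string to create boolean queries. Wrap multi-word phrases in double quotes, otherwise each word is treated as a separate term:

q=Athens OR "United Kingdom"
...
q=Athens AND Istanbul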
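
When a query is sent from code rather than a REST client, the spaces and other special characters in q need to be URL-encoded. A minimal sketch using Python's standard library follows; the host, port, and collection name in SOLR_SELECT_URL are placeholders, and build_select_url is a hypothetical helper.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: replace host, port and collection with your own.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycollection/select"


def build_select_url(query, **params):
    """Build a select URL with q first, followed by any extra parameters."""
    pairs = [("q", query)] + sorted(params.items())
    # urlencode percent-encodes values and turns spaces into '+'
    return SOLR_SELECT_URL + "?" + urlencode(pairs)


url = build_select_url("Athens AND Istanbul", rows=10)
```

The resulting URL can be pasted into a browser or passed to whichever HTTP client you prefer.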

Filter Queries

Filter queries are used to narrow the results of a search and are normally used to restrict the search based on the value of specific fields that have been indexed. Multiple filter queries can be added to the same base query; Solr returns only the documents that match all of them. This allows you to have some pre-made filter queries that you can programmatically apply to the query as needed and they will all work in tandem. The basic structure of a filter query is as follows:

&fq=<field_id>:<query>
...
&fq=ss_destination:(Bali)

Filter queries also support the use of wildcards * to create STARTS WITH, CONTAINS, or ENDS WITH style queries:

&fq=ss_destination:(Bali*)
...
&fq=ss_destination:(*Bali*)
...
&fq=ss_destination:(*Bali)

Filter queries can be combined as well to create more complex boolean filter conditions. Be sure to include () around the items:

&fq=(ss_destination:(Bali) OR ss_destination:(Sydney))
...
&fq=(ss_destination:(Bali) AND ss_one_way:(Yes))
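
Applying several pre-made filters programmatically amounts to repeating the fq parameter once per filter. The sketch below shows one way to do that with Python's standard library; with_filters is a hypothetical helper and the field names are the ones used in this guide.

```python
from urllib.parse import urlencode


def with_filters(query, filters):
    """Return a query string with one fq parameter per filter."""
    # A list of pairs (rather than a dict) lets the fq key repeat.
    pairs = [("q", query)] + [("fq", f) for f in filters]
    return urlencode(pairs)


query_string = with_filters(
    "Cruises",
    ["ss_destination:(Bali)", "ss_one_way:(Yes)"],
)
```

Keeping each condition in its own fq, rather than joining them with AND, keeps the filters independent so they can be added or removed one at a time.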

Ranges BETWEEN

Filter queries also support the use of ranges to filter for any results that occur between set boundaries.

&fq=fs_total_price:[<min> TO <max>]
...
&fq=fs_total_price:[100 TO 999]
...
&fq=fs_total_price:[1 TO *]
...
&fq=fs_total_price:[* TO 999]
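
If you build range filters in code, the open-ended forms are easy to produce by substituting * for a missing bound. A small sketch, with range_filter as a hypothetical helper name:

```python
def range_filter(field, low=None, high=None):
    """Format an inclusive range clause; a missing bound becomes '*'."""
    low_text = "*" if low is None else str(low)
    high_text = "*" if high is None else str(high)
    return f"{field}:[{low_text} TO {high_text}]"


between = range_filter("fs_total_price", 100, 999)
at_least = range_filter("fs_total_price", low=1)
at_most = range_filter("fs_total_price", high=999)
```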

Negate Operator

Any filter query can have a negate operator applied to it, turning it into a NOT-matching filter. Simply prepend - to the field name to declare it a NOT filter.

&fq=-fs_total_price:[1000 TO *]
...
&fq=-ss_destination:(Bali)

These can be included in combined filter queries as well.

&fq=(ss_destination:(Bali) AND -fs_total_price:[1000 TO *])
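
Negated and combined filters are plain string manipulation, so they can be assembled from smaller pieces. The helpers negate and all_of below are hypothetical names used only for this sketch.

```python
def negate(clause):
    """Turn a filter clause into its NOT form by prefixing '-'."""
    return "-" + clause


def all_of(*clauses):
    """Join clauses with AND and wrap the result in parentheses."""
    return "(" + " AND ".join(clauses) + ")"


combined = all_of(
    "ss_destination:(Bali)",
    negate("fs_total_price:[1000 TO *]"),
)
```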

Facets

To request facets from Solr you first need to tell Solr to activate them for this particular query; without this there will be no facet results:

&facet=true

You must then specify the fields to return faceted results for, using the facet.field parameter:

&facet=true&facet.field=<field_id>
...
&facet=true&facet.field=ss_destination

This will add a facet section to the query response, and the facet results will include the number of results for each entry in the faceted field. This basic usage returns every entry, even those with zero results. So if a value of the faceted field matches none of the documents in the current result set, that entry will still appear, but with a count of 0.

To eliminate these results you can add the facet.mincount parameter to the query to specify the minimum number of results that each facet must have before it will be included.

&facet=true&facet.field=ss_destination&facet.mincount=1

Setting it to 1 will eliminate any zero result facets from the query result.

To specify facet results for multiple fields, add multiple facet.field parameters to the query string:

&facet=true&facet.field=ss_destination&facet.field=fs_total_price&facet.mincount=1
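
Reading the facet section back out of a response takes a little care. This sketch assumes the response was requested as JSON with Solr's default flat list layout, where each field's counts arrive as alternating value and count entries. The sample response is made up for illustration, and facet_counts is a hypothetical helper.

```python
def facet_counts(response, field):
    """Convert the flat [value, count, value, count, ...] list into a dict."""
    flat = response["facet_counts"]["facet_fields"][field]
    # Even positions hold the values, odd positions hold their counts.
    return dict(zip(flat[0::2], flat[1::2]))


# Made-up response fragment, shaped like a JSON facet section.
sample_response = {
    "facet_counts": {
        "facet_fields": {
            "ss_destination": ["Bali", 12, "Sydney", 7, "Athens", 0],
        }
    }
}

destination_counts = facet_counts(sample_response, "ss_destination")
```

Here the Athens entry with a count of 0 is the kind of result that facet.mincount=1 would remove.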

Facet Ranges

In the final example above we included a facet field for fs_total_price. This would create a facet with an entry for every unique value that is indexed in the fs_total_price field, which could produce an unwieldy facet result that doesn't add much value because it is too granular. In this case we need a facet that creates a set of result ranges based on your specifications.

The main components required to create a ranged facet are:

  • facet.range which takes a field id as its value
  • f.<field_id>.facet.range.start to indicate what the starting value of the range is. A complete entry would look like: f.fs_total_price.facet.range.start.
  • f.<field_id>.facet.range.end to indicate where the range ends. This is the final value in the range. It works the same as the facet.range.start value.
  • f.<field_id>.facet.range.gap specifies the step value for the range. It is a numeric value that indicates the distance between each facet result.

A complete facet range for our fs_total_price field that went from 200 to 3000 in increments of 500 would look like this:

&facet=true&facet.field=fs_total_price&facet.range=fs_total_price&f.fs_total_price.facet.range.start=200&f.fs_total_price.facet.range.end=3000&f.fs_total_price.facet.range.gap=500

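Note that a facet range query such as the one above will exclude any result that is below 200 or above the upper bound of the last range from the facet result. This may be the desired behaviour, but if you would like those counts included in the facet, add the facet.range.other parameter (for example before, after, or all). The related facet.range.include parameter controls whether the range boundaries themselves are inclusive. The options for both are listed in the Solr faceting documentation.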
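
The way start, end, and gap translate into buckets can be sketched as below. This is an illustration of my reading of the documented behaviour of facet.range.hardend (default false, meaning the last bucket keeps the full gap width even if it runs past the end value), not code taken from Solr. The function name range_buckets is hypothetical.

```python
def range_buckets(start, end, gap, hardend=False):
    """List the (lower, upper) bounds of each facet range bucket."""
    if gap <= 0:
        raise ValueError("gap must be positive")
    buckets = []
    low = start
    while low < end:
        high = low + gap
        if hardend and high > end:
            high = end  # clip the final bucket to the requested end
        buckets.append((low, high))
        low += gap
    return buckets


default_buckets = range_buckets(200, 3000, 500)
clipped_buckets = range_buckets(200, 3000, 500, hardend=True)
```

With the values from the example query this gives six buckets, the last one starting at 2700.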

You may include multiple ranges for different fields on the one query, but they must be for different fields. You cannot have multiple ranges on one field in a single query. Or at least I haven't worked out how to do it.

You cannot create facet ranges for fields that are multivalue.

Sorting

To sort a Solr query you must supply a field ID and a sort direction. You can specify multiple sort clauses in a single sort parameter, separated by commas, and each one will be applied in turn:

&sort=fs_total_price asc
...
&sort=fs_total_price asc,fs_number_of_travellers desc
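
Solr's sort parameter accepts several clauses as one comma-separated value, so a multi-level sort can be assembled from (field, direction) pairs. A minimal sketch, with sort_param as a hypothetical helper:

```python
def sort_param(*clauses):
    """Build a sort value from (field, direction) pairs, in priority order."""
    parts = []
    for field, direction in clauses:
        if direction not in ("asc", "desc"):
            raise ValueError(f"unknown sort direction: {direction!r}")
        parts.append(f"{field} {direction}")
    return ",".join(parts)


sort_value = sort_param(
    ("fs_total_price", "asc"),
    ("fs_number_of_travellers", "desc"),
)
```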

You cannot sort on fields that are multivalue fields.

Grouping

You are able to create groups of results in a Solr query. Depending on your query, this is a way of aggregating results into sections. Adding the group parameters to your query will group all results. You first need to tell Solr that you are going to specify group information by setting group to true, then specify a field to group on using group.field with a value of the field ID.

&group=true&group.field=ss_destination
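
A grouped response is nested differently from a normal one: the documents sit under grouped, then the field name, then a groups list whose entries each carry a groupValue and a doclist. The sketch below assumes that JSON layout. The sample response is made up for illustration, and docs_by_group is a hypothetical helper.

```python
def docs_by_group(response, field):
    """Map each group value to the list of documents returned for it."""
    groups = response["grouped"][field]["groups"]
    return {g["groupValue"]: g["doclist"]["docs"] for g in groups}


# Made-up response fragment, shaped like a JSON grouped result.
sample_response = {
    "grouped": {
        "ss_destination": {
            "matches": 3,
            "groups": [
                {
                    "groupValue": "Bali",
                    "doclist": {
                        "numFound": 2,
                        "start": 0,
                        "docs": [{"fs_total_price": 450.0}],
                    },
                },
                {
                    "groupValue": "Sydney",
                    "doclist": {
                        "numFound": 1,
                        "start": 0,
                        "docs": [{"fs_total_price": 900.0}],
                    },
                },
            ],
        }
    }
}

grouped_docs = docs_by_group(sample_response, "ss_destination")
```

In this made-up sample the Bali group reports numFound of 2 but carries a single document, which is why numFound is the value to read when you need the group's total.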