@tlasica
Created January 19, 2017 00:28
<?xml version="1.0" encoding="UTF-8" ?>
<!--
=======
Copyright DataStax, Inc.
Please see the included license file for details.
-->
<!--
For more details about configurations options that may appear in
this file, see http://wiki.apache.org/solr/SolrConfigXml.
-->
<config>
<!-- In all configuration below, a prefix of "solr." for class names
is an alias that causes solr to search appropriate packages,
including org.apache.solr.(search|update|request|core|analysis)
You may also specify a fully qualified Java classname if you
have your own custom plugins.
-->
<!-- Set this to 'false' if you want Solr to continue working after
it has encountered a severe configuration error. In a
production environment, you may want Solr to keep working even
if one handler is misconfigured.
You may also set this to false by setting the system
property:
-Dsolr.abortOnConfigurationError=false
-->
<abortOnConfigurationError>${solr.abortOnConfigurationError:true}</abortOnConfigurationError>
<!-- Controls what version of Lucene various components of Solr
adhere to. Generally, you want to use the latest version to
get all bug fixes and improvements. It is highly recommended
that you fully re-index after changing this setting as it can
affect both how text is indexed and queried.
-->
<luceneMatchVersion>LUCENE_4_10_3</luceneMatchVersion>
<!-- Enable DSE Search new type mappings -->
<dseTypeMappingVersion>2</dseTypeMappingVersion>
<!-- lib directives can be used to instruct Solr to load any Jars
identified and use them to resolve any "plugins" specified in
your solrconfig.xml or schema.xml (ie: Analyzers, Request
Handlers, etc...).
All directories and paths are resolved relative to the
instanceDir.
If a "./lib" directory exists in your instanceDir, all files
found in it are included as if you had used the following
syntax...
<lib dir="./lib" />
-->
<!-- A dir option by itself adds any files found in the directory to
the classpath. This is useful for including all jars in a
directory.
-->
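<!-- Illustrative sketch of the dir form described above. The directory
name is hypothetical and would be resolved relative to the instanceDir.
<lib dir="./extra-lib" />
-->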
<!-- an exact path can be used to specify a specific file. This
will cause a serious error to be logged if it can't be loaded.
-->
<!--
<lib path="../a-jar-that-does-not-exist.jar" />
-->
<!-- Data Directory
Used to specify an alternate directory to hold all index data
other than the default ./data under the Solr home. If
replication is in use, this should match the replication
configuration.
<dataDir>${solr.data.dir}</dataDir>
-->
<!-- The DirectoryFactory to use for indexes.
solr.StandardDirectoryFactory, the default, is filesystem
based and tries to pick the best implementation for the current
JVM and platform. One can force a particular implementation
via solr.MMapDirectoryFactory, solr.NIOFSDirectoryFactory, or
solr.SimpleFSDirectoryFactory.
DSE Search does not support solr.RAMDirectoryFactory or any other
non-persistent DirectoryFactory implementation.
-->
<directoryFactory name="DirectoryFactory" class="solr.StandardDirectoryFactory"/>
<indexConfig>
<rt>false</rt>
<useCompoundFile>false</useCompoundFile>
<ramBufferSizeMB>512</ramBufferSizeMB>
<mergeFactor>10</mergeFactor>
<!-- The number of concurrent merges (maxThreadCount) to perform and
the size of the merge backlog (maxMergeCount). Be aware that the backlog of merges
is stored as threads.
For compatibility, the defaults of maxThreadCount=1 and maxMergeCount=2 are preserved,
but these settings do not provide enough parallelism for a typical server.
Setting maxThreadCount to # of cores / 2 and maxMergeCount to maxThreadCount * 2
is a good starting point.
-->
<!--<mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
<int name="maxThreadCount">1</int>
<int name="maxMergeCount">2</int>
</mergeScheduler>-->
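<!-- A worked example of the starting point suggested above, assuming a
hypothetical 8-core server: maxThreadCount = 8 / 2 = 4 and
maxMergeCount = 4 * 2 = 8. The core count is an assumption made only for
illustration; substitute the figures for your own hardware.
<mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
<int name="maxThreadCount">4</int>
<int name="maxMergeCount">8</int>
</mergeScheduler>
-->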
<!-- Unlock On Startup
If true, unlock any held write or commit locks on startup.
This defeats the locking mechanism that allows multiple
processes to safely access a lucene index, and should be used
with care.
This is not needed if lock type is 'none' or 'single'
-->
<unlockOnStartup>true</unlockOnStartup>
<!-- If true, IndexReaders will be reopened (often more efficient)
instead of closed and then opened.
-->
<reopenReaders>true</reopenReaders>
<!-- Number of parallel delete tasks to submit at once
when processing deletes in parallel.
Defaults to number of available processors.
-->
<!--<parallelDeleteTasks>4</parallelDeleteTasks>-->
<!-- The strategy used to match deleted terms with documents.
seekExact does an m * n check (m = # of terms, n = # of segments)
and uses the bloom filters to avoid having to check the block tree
for most segments. It will still have to check several due to false positives
unless you have increased the bloom filter precision.
seekCeiling also does a worst case m * n check, but it doesn't use the bloom filters.
It may be able to stop checking terms against segments that can't contain the
remaining terms, but this doesn't work if the terms in segments are randomly distributed.
If your unique key field is some kind of ordered sequence, like a time UUID, and the keys were
also inserted in order, then many segments will not be in range for most deleted terms, assuming
there is locality among deleted terms. In that case seekCeiling can be slightly faster.
Defaults to seekExact because it performs reliably in all scenarios. seekCeiling may be faster
but depends on the data and insertion order. When testing the two be aware that performance
differs drastically when the block trees stop fitting in memory. Any test that doesn't reflect
whether the block trees fit in memory will produce inaccurate results.
-->
<!--<deleteApplicationStrategy>seekExact</deleteApplicationStrategy>-->
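<!-- Illustrative only: if your unique key is an ordered sequence that was
inserted in order, as described above, trying the alternative strategy
would look like the line below. Measure it against seekExact with an index
large enough that the block trees no longer fit in memory before adopting it.
<deleteApplicationStrategy>seekCeiling</deleteApplicationStrategy>
-->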
<!-- Commit Deletion Policy
Custom deletion policies can be specified here. The class must
implement org.apache.lucene.index.IndexDeletionPolicy.
http://lucene.apache.org/java/2_9_1/api/all/org/apache/lucene/index/IndexDeletionPolicy.html
The standard Solr IndexDeletionPolicy implementation supports
deleting index commit points on number of commits, age of
commit point and optimized status.
The latest commit point should always be preserved regardless
of the criteria.
-->
<deletionPolicy class="solr.SolrDeletionPolicy">
<!-- The number of commit points to be kept -->
<str name="maxCommitsToKeep">1</str>
<!-- The number of optimized commit points to be kept -->
<str name="maxOptimizedCommitsToKeep">0</str>
<!--
Delete all commit points once they have reached the given age.
Supports DateMathParser syntax, e.g. 30MINUTES or 1DAY.
-->
<!--
<str name="maxCommitAge">30MINUTES</str>
<str name="maxCommitAge">1DAY</str>
-->
</deletionPolicy>
<infoStream file="INFOSTREAM.txt">false</infoStream>
</indexConfig>
<!-- JMX
This example enables JMX if and only if an existing MBeanServer
is found. Use this if you want to configure JMX through JVM
parameters. Remove this to disable exposing Solr configuration
and statistics to JMX.
For more details see http://wiki.apache.org/solr/SolrJmx
-->
<jmx />
<!-- The default high-performance update handler -->
<!-- IN DSE THIS CANNOT BE CHANGED -->
<updateHandler class="solr.DirectUpdateHandler2">
<autoSoftCommit>
<maxTime>10000</maxTime>
</autoSoftCommit>
</updateHandler>
<query>
<!-- Max Boolean Clauses
Maximum number of clauses in each BooleanQuery; an exception
is thrown if exceeded.
** WARNING **
This option actually modifies a global Lucene property that
will affect all SolrCores. If multiple solrconfig.xml files
disagree on this property, the value at any given moment will
be based on the last SolrCore to be initialized.
-->
<maxBooleanClauses>1024</maxBooleanClauses>
<!-- Filter Cache
Cache used by SolrIndexSearcher for filters (DocSets),
unordered sets of *all* documents that match a query.
There are two types of filter caches.
Per segment size-based caches, with the following parameters:
class - LFUCache, LRUCache or FastLRUCache.
size - the maximum number of entries in the cache.
initialSize - the initial capacity (number of entries) of the cache.
WARNING: These size-based caches are now deprecated.
Per segment memory-based caches with LRU eviction, with the following parameters:
class - SolrFilterCache.
highWaterMarkMB - the maximum number of used MBs before triggering asynchronous LRU eviction.
lowWaterMarkMB - the maximum number of used MBs after eviction is completed.
-->
<filterCache class="solr.FastLRUCache" size="4096" initialSize="1024" autowarmCount="1024" />
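<!-- A minimal sketch of the memory-based alternative described above, for
moving off the deprecated size-based cache. The class and attribute names
come from the comment above; the "solr." prefix is assumed by analogy with
the active element, and the water mark values are assumptions chosen only
for illustration, to be sized against your own heap. Only one filterCache
element should be active at a time.
<filterCache class="solr.SolrFilterCache" highWaterMarkMB="2048" lowWaterMarkMB="1024" />
-->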
<enableLazyFieldLoading>true</enableLazyFieldLoading>
<useColdSearcher>true</useColdSearcher>
<maxWarmingSearchers>16</maxWarmingSearchers>
</query>
<!-- Request Dispatcher
This section contains instructions for how the SolrDispatchFilter
should behave when processing requests for this SolrCore.
handleSelect affects the behavior of requests such as /select?qt=XXX
handleSelect="true" will cause the SolrDispatchFilter to process
the request and will result in consistent error handling and
formatting for all types of requests.
handleSelect="false" will cause the SolrDispatchFilter to
ignore "/select" requests and fallback to using the legacy
SolrServlet and its Solr 1.1 style error formatting.
-->
<requestDispatcher handleSelect="true" >
<!-- Request Parsing
These settings indicate how Solr Requests may be parsed, and
what restrictions may be placed on the ContentStreams from
those requests
enableRemoteStreaming - enables use of the stream.file
and stream.url parameters for specifying remote streams.
multipartUploadLimitInKB - specifies the max size of
Multipart File Uploads that Solr will allow in a Request.
*** WARNING ***
The settings below authorize Solr to fetch remote files. You
should make sure your system has some authentication before
using enableRemoteStreaming="true"
-->
<requestParsers enableRemoteStreaming="true"
multipartUploadLimitInKB="2048000" />
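<!-- A hedged sketch of a more restrictive setting, for deployments that have
no authentication in front of Solr: turn remote streaming off so that the
stream.file and stream.url parameters are not honored. The upload limit is
kept as configured above. Only one requestParsers element should be active.
<requestParsers enableRemoteStreaming="false"
multipartUploadLimitInKB="2048000" />
-->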
<!-- HTTP Caching
Set HTTP caching related parameters (for proxy caches and clients).
The options below instruct Solr not to output any HTTP Caching
related headers
-->
<httpCaching never304="true" />
</requestDispatcher>
<requestHandler name="search" class="solr.SearchHandler" default="true">
<!-- Default values for query parameters can be specified; these
will be overridden by parameters in the request.
-->
<lst name="defaults">
<int name="rows">10</int>
</lst>
</requestHandler>
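<!-- Illustrative sketch of how the defaults above interact with a request.
The query strings are hypothetical examples:
  q=*:*            uses rows=10 from the defaults list
  q=*:*&rows=25    overrides the default and returns up to 25 rows
To pin a value so that a request cannot override it, a handler can declare
an "invariants" list instead, as the /admin/ping handler in this file does:
<lst name="invariants">
<int name="rows">10</int>
</lst>
-->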
<!-- SearchHandler for CQL Solr queries:
this handler doesn't support any additional components, only default parameters
-->
<requestHandler name="solr_query" class="com.datastax.bdp.search.solr.handler.component.CqlSearchHandler">
<lst name="defaults">
<int name="rows">10</int>
</lst>
</requestHandler>
<requestHandler name="/update"
class="solr.UpdateRequestHandler">
</requestHandler>
<!-- CSV Update Request Handler
http://wiki.apache.org/solr/UpdateCSV
-->
<requestHandler name="/update/csv"
class="solr.CSVRequestHandler"
startup="lazy" />
<!-- JSON Update Request Handler
http://wiki.apache.org/solr/UpdateJSON
-->
<requestHandler name="/update/json"
class="solr.JsonUpdateRequestHandler"
startup="lazy" />
<requestHandler name="/analysis/field"
startup="lazy"
class="solr.FieldAnalysisRequestHandler" />
<requestHandler name="/analysis/document"
class="solr.DocumentAnalysisRequestHandler"
startup="lazy" />
<!-- Admin Handlers
Admin Handlers - This will register all the standard admin
RequestHandlers.
-->
<requestHandler name="/admin/"
class="solr.admin.AdminHandlers" />
<!-- ping/healthcheck -->
<requestHandler name="/admin/ping" class="solr.PingRequestHandler">
<lst name="invariants">
<str name="qt">search</str>
<str name="q">solrpingquery</str>
</lst>
<lst name="defaults">
<str name="echoParams">all</str>
</lst>
</requestHandler>
<!-- Echo the request contents back to the client -->
<requestHandler name="/debug/dump" class="solr.DumpRequestHandler" >
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="echoHandler">true</str>
</lst>
</requestHandler>
<searchComponent name="terms" class="solr.TermsComponent"/>
<!-- A request handler for demonstrating the terms component -->
<requestHandler name="/terms" class="solr.SearchHandler" startup="lazy">
<lst name="defaults">
<bool name="terms">true</bool>
</lst>
<arr name="components">
<str>terms</str>
</arr>
</requestHandler>
<!-- Legacy config for the admin interface -->
<admin>
<defaultQuery>*:*</defaultQuery>
<!-- configure a healthcheck file for servers behind a
loadbalancer
-->
<!--
<healthcheck type="file">server-enabled</healthcheck>
-->
</admin>
</config>