@wizardbeard
Created December 16, 2013 05:32
Solr server setup on CentOS 6.4

Install packages

yum install java-1.7.0-openjdk tomcat6 tomcat6-webapps tomcat6-admin-webapps wget

Download Solr

wget http://mirror.catn.com/pub/apache/lucene/solr/4.3.1/solr-4.3.1.tgz
tar -xzf solr-*.tgz
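
Since the archive comes from a third-party mirror over plain HTTP, it may be worth checking it against a published checksum before unpacking. The following is a minimal sketch, assuming `sha1sum` is available and that you have obtained the expected SHA-1 digest from a source you trust; `verify_sha1` is a hypothetical helper name, not part of Solr.

```shell
#!/bin/sh
# verify_sha1 FILE EXPECTED_HEX
# Returns 0 when the SHA-1 digest of FILE equals EXPECTED_HEX, non-zero otherwise.
verify_sha1() {
    file=$1
    expected=$2
    # sha1sum prints "<digest>  <filename>"; keep only the digest column.
    actual=$(sha1sum "$file" | awk '{print $1}') || return 1
    [ "$actual" = "$expected" ]
}
```

Usage would look like `verify_sha1 solr-4.3.1.tgz "<published digest>" && tar -xzf solr-4.3.1.tgz`, where the digest is a placeholder you fill in yourself.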

Copy to an installation path

cd solr-*
cp -R example/solr /opt
cp dist/*.war /opt/solr/solr.war
chown tomcat:tomcat -R /opt/solr

Fixing Solr

Solr 4.3.1 does not start under Tomcat out of the box: as of 4.3 the logging jars are no longer bundled inside the WAR. To fix this, copy the logging jars and a log4j configuration onto Tomcat's class path.

cp example/lib/ext/* /usr/share/tomcat6/lib
cp example/resources/log4j.properties /usr/share/tomcat6/lib
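
If Solr still fails to start, a quick check that the logging files actually landed in Tomcat's lib directory can help. This is a rough sketch; the jar name prefixes below are an assumption about what `example/lib/ext` contains in 4.3.1, and `missing_logging_jars` is a hypothetical helper name.

```shell
#!/bin/sh
# missing_logging_jars LIB_DIR
# Prints each expected logging file that is absent from LIB_DIR and
# returns non-zero if anything is missing.
missing_logging_jars() {
    lib_dir=$1
    missing=0
    # Assumed jar name prefixes; the glob matches whatever version suffix follows.
    for prefix in slf4j-api slf4j-log4j12 jcl-over-slf4j jul-to-slf4j log4j; do
        # ls exits non-zero when the glob matches nothing.
        if ! ls "$lib_dir"/"$prefix"-*.jar >/dev/null 2>&1; then
            echo "missing: $prefix-*.jar"
            missing=1
        fi
    done
    if [ ! -f "$lib_dir/log4j.properties" ]; then
        echo "missing: log4j.properties"
        missing=1
    fi
    return "$missing"
}
```

Run it as `missing_logging_jars /usr/share/tomcat6/lib`; no output and a zero exit status mean everything it looks for is present.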

Deploy to and set up Tomcat

cat > /usr/share/tomcat6/conf/Catalina/localhost/solr.xml <<EOF
<?xml version="1.0" encoding="utf-8"?>
<Context docBase="/opt/solr/solr.war" debug="0" crossContext="true">
	<Environment name="solr/home" type="java.lang.String" value="/opt/solr/" override="true"/>
</Context>
EOF

Set up the manager username and password in /etc/tomcat6/tomcat-users.xml

<?xml version='1.0' encoding='utf-8'?>
<tomcat-users>
	<user name="tomcat" password="changeme" roles="admin,manager" />
</tomcat-users>

Make sure Tomcat starts at boot, then restart it for our changes to take effect.

chkconfig tomcat6 on
service tomcat6 restart

By this point Solr should be installed and running. Navigate to http://localhost:8080/solr to view the admin interface.
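
On a headless server the same check can be done from the command line. A minimal sketch, assuming `curl` is installed (it is not in the package list above) and that Tomcat listens on its default port 8080; `http_status` and `solr_is_up` are hypothetical helper names.

```shell
#!/bin/sh
# http_status URL
# Prints the HTTP status code returned for URL ("000" if the connection fails).
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# solr_is_up [BASE_URL]
# Succeeds when the Solr admin page answers with HTTP 200.
solr_is_up() {
    base_url=${1:-http://localhost:8080/solr}
    # The trailing slash avoids the redirect Tomcat issues for the bare context path.
    [ "$(http_status "$base_url/")" = "200" ]
}
```

For example, `solr_is_up && echo "Solr is answering"`. Tomcat may need a few seconds after the restart before the webapp responds.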

Create a new Solr core

By default there is a single core named collection1. Trying to create a new core via the admin interface will most likely fail with Error CREATEing SolrCore 'new_core': Unable to create core: new_core, so we need to create one manually, starting with its directory layout.

mkdir -p /opt/solr/core1/conf

Create a basic solrconfig.xml file

nano /opt/solr/core1/conf/solrconfig.xml

<?xml version="1.0" encoding="UTF-8" ?>
<config>
	<luceneMatchVersion>LUCENE_43</luceneMatchVersion>
	<dataDir>${solr.data.dir:}</dataDir>
	<directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>
	<codecFactory class="solr.SchemaCodecFactory"/>
	<schemaFactory class="ClassicIndexSchemaFactory"/>

	<indexConfig>
		<lockType>${solr.lock.type:native}</lockType>
	</indexConfig>

	<!-- JMX -->
	<jmx />

	<!-- The default high-performance update handler -->
	<updateHandler class="solr.DirectUpdateHandler2">
		<updateLog>
			<str name="dir">${solr.ulog.dir:}</str>
		</updateLog>
		<autoCommit>
			<maxTime>15000</maxTime>
			<openSearcher>false</openSearcher>
		</autoCommit>
	</updateHandler>

	<query>
		<maxBooleanClauses>1024</maxBooleanClauses>
		<filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
		<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
		<documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
		<enableLazyFieldLoading>true</enableLazyFieldLoading>
		<queryResultWindowSize>20</queryResultWindowSize>
		<queryResultMaxDocsCached>200</queryResultMaxDocsCached>
		<useColdSearcher>false</useColdSearcher>
		<maxWarmingSearchers>2</maxWarmingSearchers>
	</query>

	<!-- Request Dispatcher -->
	<requestDispatcher handleSelect="false" >
		<requestParsers enableRemoteStreaming="true" multipartUploadLimitInKB="2048000" formdataUploadLimitInKB="2048"/>
		<httpCaching never304="true" />
	</requestDispatcher>

	<!-- SearchHandler -->
	<requestHandler name="/select" class="solr.SearchHandler">
		<lst name="defaults">
			<str name="echoParams">explicit</str>
			<int name="rows">10</int>
			<str name="df">description</str>
		</lst>
	</requestHandler>

	<!-- A request handler that returns indented JSON by default -->
	<requestHandler name="/query" class="solr.SearchHandler">
		<lst name="defaults">
			<str name="echoParams">explicit</str>
			<str name="wt">json</str>
			<str name="indent">true</str>
			<str name="df">description</str>
		</lst>
	</requestHandler>

	<requestHandler name="/get" class="solr.RealTimeGetHandler">
		<lst name="defaults">
			<str name="omitHeader">true</str>
			<str name="wt">json</str>
			<str name="indent">true</str>
		</lst>
	</requestHandler>

	<requestHandler name="/update" class="solr.UpdateRequestHandler"></requestHandler>

	<requestHandler name="/update/json" class="solr.JsonUpdateRequestHandler">
		<lst name="defaults">
			<str name="stream.contentType">application/json</str>
		</lst>
	</requestHandler>

	<requestHandler name="/update/csv" class="solr.CSVRequestHandler">
		<lst name="defaults">
			<str name="stream.contentType">application/csv</str>
		</lst>
	</requestHandler>

	<requestHandler name="/update/extract" startup="lazy" class="solr.extraction.ExtractingRequestHandler" >
		<lst name="defaults">
			<str name="lowernames">true</str>
			<str name="uprefix">ignored_</str>
			<str name="captureAttr">true</str>
			<str name="fmap.a">links</str>
			<str name="fmap.div">ignored_</str>
		</lst>
	</requestHandler>

	<requestHandler name="/analysis/field" startup="lazy" class="solr.FieldAnalysisRequestHandler" />

	<requestHandler name="/analysis/document" class="solr.DocumentAnalysisRequestHandler" startup="lazy" />

	<requestHandler name="/admin/" class="solr.admin.AdminHandlers" />

	<requestHandler name="/admin/ping" class="solr.PingRequestHandler">
		<lst name="invariants">
			<str name="q">solrpingquery</str>
		</lst>
		<lst name="defaults">
			<str name="echoParams">all</str>
			<str name="df">id</str>
		</lst>
	</requestHandler>

	<requestHandler name="/debug/dump" class="solr.DumpRequestHandler" >
		<lst name="defaults">
			<str name="echoParams">explicit</str>
			<str name="echoHandler">true</str>
		</lst>
	</requestHandler>

	<queryResponseWriter name="json" class="solr.JSONResponseWriter">
		<str name="content-type">application/json; charset=UTF-8</str>
	</queryResponseWriter>

	<admin>
		<defaultQuery>*:*</defaultQuery>
	</admin>
</config>

Create a basic schema.xml file

nano /opt/solr/core1/conf/schema.xml

<?xml version="1.0" encoding="UTF-8" ?>
<schema name="core1" version="1.5">
	<fields>
		<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
		<field name="title" type="text_general" indexed="true" stored="true" multiValued="true" />
		<field name="description" type="text_general" indexed="true" stored="true" />

		<field name="_version_" type="long" indexed="true" stored="true" multiValued="false" />
	</fields>

	<uniqueKey>id</uniqueKey>

	<types>
		<fieldType name="string" class="solr.StrField" sortMissingLast="true" />
		<fieldType name="boolean" class="solr.BoolField" sortMissingLast="true"/>
		<fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/>
		<fieldType name="float" class="solr.TrieFloatField" precisionStep="0" positionIncrementGap="0"/>
		<fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
		<fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" positionIncrementGap="0"/>

		<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" positionIncrementGap="0"/>
		<fieldType name="tfloat" class="solr.TrieFloatField" precisionStep="8" positionIncrementGap="0"/>
		<fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" positionIncrementGap="0"/>
		<fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8" positionIncrementGap="0"/>

		<fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0"/>
		<fieldType name="tdate" class="solr.TrieDateField" precisionStep="6" positionIncrementGap="0"/>
		<fieldType name="binary" class="solr.BinaryField"/>

		<fieldType name="pint" class="solr.IntField"/>
		<fieldType name="plong" class="solr.LongField"/>
		<fieldType name="pfloat" class="solr.FloatField"/>
		<fieldType name="pdouble" class="solr.DoubleField"/>
		<fieldType name="pdate" class="solr.DateField" sortMissingLast="true"/>

		<fieldType name="random" class="solr.RandomSortField" indexed="true" />

		<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
			<analyzer>
				<tokenizer class="solr.WhitespaceTokenizerFactory"/>
			</analyzer>
		</fieldType>

		<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
			<analyzer type="index">
				<tokenizer class="solr.StandardTokenizerFactory"/>
				<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
				<filter class="solr.LowerCaseFilterFactory"/>
			</analyzer>
			<analyzer type="query">
				<tokenizer class="solr.StandardTokenizerFactory"/>
				<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
				<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
				<filter class="solr.LowerCaseFilterFactory"/>
			</analyzer>
		</fieldType>
	</types>
</schema>

Create synonyms and stopwords text files

touch /opt/solr/core1/conf/synonyms.txt
touch /opt/solr/core1/conf/stopwords.txt
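
The schema above references stopwords.txt and synonyms.txt by name, so the core depends on all four files being present in its conf directory. A small sketch that checks the layout before going further; `core_conf_ok` is a hypothetical helper name.

```shell
#!/bin/sh
# core_conf_ok CORE_DIR
# Prints every expected config file missing from CORE_DIR/conf and
# returns non-zero if any is absent.
core_conf_ok() {
    conf_dir=$1/conf
    status=0
    for f in solrconfig.xml schema.xml stopwords.txt synonyms.txt; do
        if [ ! -f "$conf_dir/$f" ]; then
            echo "missing: $conf_dir/$f"
            status=1
        fi
    done
    return "$status"
}
```

Run it as `core_conf_ok /opt/solr/core1`. It only checks that the files exist, not that the XML is valid.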

Now that we have set up our core, it's time to tell Solr about it.

nano /opt/solr/solr.xml

<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true">
	<cores adminPath="/admin/cores" defaultCoreName="collection1" host="${host:}" hostPort="${jetty.port:8983}" hostContext="${hostContext:solr}" zkClientTimeout="${zkClientTimeout:15000}">
		<core name="collection1" instanceDir="collection1" />
		<core name="core1" instanceDir="core1" />
	</cores>
</solr>

Make sure all the files have the correct owner, then restart Tomcat

chown tomcat:tomcat -R /opt/solr
service tomcat6 restart
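
Once Tomcat is back up, the new core can be exercised over HTTP. The helpers below are a sketch only, assuming `curl` is installed and Solr is reachable at http://localhost:8080/solr; the function names and the sample document are made up for illustration. They rely on the CoreAdmin path (/admin/cores) from solr.xml and on the /update/json and /select handlers defined in the solrconfig.xml above.

```shell
#!/bin/sh
# Base URL of the Solr webapp; override by exporting SOLR_URL first.
SOLR_URL=${SOLR_URL:-http://localhost:8080/solr}

# core_status CORE
# Asks the CoreAdmin API for the status of CORE (JSON response).
core_status() {
    curl -s "$SOLR_URL/admin/cores?action=STATUS&core=$1&wt=json"
}

# index_sample CORE
# Adds one sample document to CORE and commits immediately.
index_sample() {
    curl -s "$SOLR_URL/$1/update/json?commit=true" \
        -H 'Content-Type: application/json' \
        -d '[{"id":"doc1","title":["Hello Solr"],"description":"First test document"}]'
}

# search CORE QUERY
# Runs QUERY against CORE's /select handler and returns JSON.
search() {
    curl -s -G "$SOLR_URL/$1/select" \
        --data-urlencode "q=$2" \
        --data-urlencode 'wt=json'
}
```

A typical sequence would be `core_status core1`, then `index_sample core1`, then `search core1 'title:hello'`. Naming the field in the query avoids depending on the handler's default field.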