This is a step-by-step guide to creating a cluster of three Solr nodes running in cloud mode (SolrCloud). These instructions should work on both a local cluster (for testing, with three virtual hosts) and a remote cluster where each server runs on its own physical machine.
This was tested with Solr 6.2.1 and ZooKeeper 3.4.6.
We will assume that the hosts running the ZooKeeper servers are named zserver1, zserver2, and zserver3.
- Download and extract Solr on each machine:
curl -O http://mirror.metrocast.net/apache/lucene/solr/6.2.1/solr-6.2.1.tgz
mkdir /opt/solr
tar -zxvf solr-6.2.1.tgz -C /opt/solr --strip-components=1
- Following Solr installation best practices, create a dedicated /solr znode so that Solr's data stays out of the ZooKeeper root directory. First, open the ZooKeeper shell:
su - zookeeper
cd /usr/hdp/current/zookeeper-client/bin/
$ ./zkCli.sh -server zserver1:2181,zserver2:2181,zserver3:2181
- In the zookeeper shell, type the following:
create /solr []
- In the zookeeper shell, type the following to confirm the directory now exists:
ls /solr
- Exit the ZooKeeper shell and change to the Solr installation directory:
quit
cd /opt/solr
- Start a Solr instance on each of the three hosts and point it at the ZooKeeper ensemble:
# Note the /solr after the LAST ZooKeeper host;
# it makes Solr store all of its data under the /solr znode instead of the root directory.
# The host list must not contain spaces. Run the same command on each host:
$ ./bin/solr start -c -p 8983 -z zserver1:2181,zserver2:2181,zserver3:2181/solr
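The value passed to -z is a single string: host:port pairs joined by commas with no spaces, and the /solr chroot appended once after the last host. As a minimal sketch, a small helper can build that string so it is not retyped on every host. The function name zk_connect_string is hypothetical (it is not part of Solr or ZooKeeper), and it assumes every ZooKeeper server listens on port 2181, as in this guide.

```shell
# Hypothetical helper: join ZooKeeper hosts into one connection string.
# Usage: zk_connect_string CHROOT HOST [HOST...]
zk_connect_string() {
  chroot=$1
  shift
  port=2181   # assumed client port, the same one used throughout this guide
  joined=""
  for host in "$@"; do
    # Append host:port, separated by commas and no spaces
    joined="${joined:+$joined,}$host:$port"
  done
  # The chroot is appended once, after the last host
  printf '%s%s\n' "$joined" "$chroot"
}
```

It could then be used as: ./bin/solr start -c -p 8983 -z "$(zk_connect_string /solr zserver1 zserver2 zserver3)"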
- Upload the collection configuration to ZooKeeper. Use the zkcli.sh script that ships with the Solr installation, not the zkCli.sh script from the ZooKeeper installation.
$ ./server/scripts/cloud-scripts/zkcli.sh -cmd upconfig -zkhost zserver1:2181/solr \
-confdir ./server/solr/configsets/data_driven_schema_configs/conf/ \
-confname my-config
- Create a Solr collection using the uploaded configuration:
curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=my-collection&numShards=2&replicationFactor=1&collection.configName=my-config'
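When several collections have to be created, the CREATE URL can be assembled from its parts instead of being edited by hand. The sketch below only prints the URL. The function name create_collection_url and the example names logs and logs-config are hypothetical; the query parameters are the ones used in the curl command above.

```shell
# Hypothetical helper: print the Collections API CREATE URL for one collection.
# Usage: create_collection_url NAME NUM_SHARDS REPLICATION_FACTOR CONFIG_NAME [HOST:PORT]
create_collection_url() {
  # Default to the local Solr node started earlier in this guide
  base="http://${5:-localhost:8983}/solr/admin/collections"
  printf '%s?action=CREATE&name=%s&numShards=%s&replicationFactor=%s&collection.configName=%s\n' \
    "$base" "$1" "$2" "$3" "$4"
}
```

For example, curl "$(create_collection_url logs 2 1 logs-config)" would create a collection named logs from a configuration previously uploaded as logs-config.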
Notes:
- If you want to create multiple collections with different schemas, repeat the last two steps for each collection, uploading its configuration under a different name. Collections created with the same configuration name share that configuration in ZooKeeper, so you would end up with a single schema for all of them.
- In Solr, the default maxShardsPerNode is 1, meaning each node hosts at most one replica of a collection. This setup has 3 nodes, so a collection should not need more than three replicas in total (e.g., numShards=2 & replicationFactor=2 results in four replicas spread across three nodes). This causes a series of errors, since at least one node would have to host two replicas, which the default maxShardsPerNode setting does not allow.
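The arithmetic behind the last note can be checked before creating a collection: the total number of replicas is numShards times replicationFactor, and dividing that by the number of nodes (rounding up) gives the smallest per-node limit that fits. This is a minimal sketch; the function name min_shards_per_node is hypothetical, and it assumes replicas are spread as evenly as possible across the nodes.

```shell
# Hypothetical helper: smallest maxShardsPerNode that can fit a collection.
# Usage: min_shards_per_node NUM_SHARDS REPLICATION_FACTOR NUM_NODES
min_shards_per_node() {
  total=$(( $1 * $2 ))   # total replicas (cores) that must be placed
  # Ceiling division: replicas per node when spread as evenly as possible
  echo $(( (total + $3 - 1) / $3 ))
}
```

With the values from the note, min_shards_per_node 2 2 3 prints 2, which is above the default of 1. In that case the Collections API CREATE call also accepts a maxShardsPerNode parameter (for example &maxShardsPerNode=2) to raise the limit for that collection.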
Awesome, this is very helpful (much more helpful than the official page, as it turns out: https://cwiki.apache.org/confluence/display/solr/Setting+Up+an+External+ZooKeeper+Ensemble). Thanks very much.