@isoboroff
Created May 20, 2014 17:06
Getting out of Solr Zookeeper /solr/overseer/queue hell
I had a large indexing job crash at around document 75M, on CDH 4.6. I could not list the /solr/overseer/queue directory in ZooKeeper because it had millions of entries.
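One way to gauge how deep the queue is without trying to list it: zkCli's stat command reports a numChildren count without enumerating children. A minimal helper sketch, assuming Cloudera's zookeeper-client wrapper and a placeholder host zk1:

```shell
# Pull numChildren out of zkCli "stat" output on stdin. "stat" does not
# enumerate the children, so it stays fast even with millions of entries.
queue_depth() {
  awk -F' = ' '/^numChildren/ { print $2 }'
}

# Against a live ensemble (zk1:2181 is a placeholder):
#   zookeeper-client -server zk1:2181 stat /solr/overseer/queue | queue_depth
```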
Here are the steps I followed to avoid a re-index:
1. Shut down all solr-server and zookeeper-server instances
2. Run zookeeper-server-initialize --force --myid X on each ZK server. This should leave an empty ZK namespace.
3. solrctl --init
4. hadoop fs -mv /solr/the-collection /hold # set the collection's index data aside
5. Restart the solr-server instances
6. solrctl instancedir --create ... # re-upload the config info
7. solrctl collection --create # with the same number of shards and replicas as before
8. Shut down the solr-server instances again
9. hadoop fs -rm -r /solr/the-collection # drop the freshly created empty index
10. hadoop fs -mv /hold /solr/the-collection # put the original data back
11. Restart the solr-server instances and give them time to recover
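The steps above can be sketched as one runbook script. This is a hypothetical sketch, not a tested tool: the-collection, /hold, and CONFDIR are placeholders, the solrctl arguments elided above stay elided, and DRY_RUN defaults to on so the script only prints the command sequence until you turn it off.

```shell
#!/usr/bin/env bash
# Sketch of the recovery steps above. Placeholders: "the-collection",
# /hold, CONFDIR. Dry-run by default; export DRY_RUN=0 to execute.
set -euo pipefail

COLLECTION=${COLLECTION:-the-collection}
HOLD=${HOLD:-/hold}
CONFDIR=${CONFDIR:-/path/to/conf}   # placeholder for the instancedir config

# Print each command; only execute it when DRY_RUN=0.
run() {
  echo "+ $*"
  [ "${DRY_RUN:-1}" = 1 ] || "$@"
}

recover() {
  # Steps 1-2 (stop everything, wipe ZK) happen per host, outside this
  # script:  zookeeper-server-initialize --force --myid <id>

  run solrctl --init                                    # 3. empty Solr metadata in ZK
  run hadoop fs -mv "/solr/$COLLECTION" "$HOLD"         # 4. set index data aside
  # 5. restart solr-server instances, then:
  run solrctl instancedir --create "$COLLECTION" "$CONFDIR"  # 6. re-upload config
  run solrctl collection --create "$COLLECTION"         # 7. same shards/replicas as before (args elided)
  # 8. shut down solr-server instances again, then:
  run hadoop fs -rm -r "/solr/$COLLECTION"              # 9. drop the fresh empty index
  run hadoop fs -mv "$HOLD" "/solr/$COLLECTION"         # 10. put the original data back
  # 11. restart solr-server instances and let them recover
}
```

Calling recover with the default DRY_RUN=1 just echoes the commands, which is worth reviewing before pointing this at a live cluster.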
Undoubtedly some documents got dropped during indexing, but almost everything was still there.
Nowhere on the net does anyone really spell out how to do surgery on ZK and Solr metadata, so I wanted to gist this info before I forgot it.