@mykoweb
Last active December 15, 2016 06:19
Scaling Elasticsearch
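# Create one filtered alias per patient on top of the shared 'patients' index.
# The routing value sends a patient's documents to a single shard, and the
# filter restricts searches through the alias to that patient's documents.
req_body_for_patient_a = {
  routing: 'PatientA',
  filter: {
    term: { patient_name: 'PatientA' }
  }
}
req_body_for_patient_b = {
  routing: 'PatientB',
  filter: {
    term: { patient_name: 'PatientB' }
  }
}
raw_put 'patients/_alias/PatientA', req_body_for_patient_a
raw_put 'patients/_alias/PatientB', req_body_for_patient_b
client.indices.exists_alias? name: 'PatientA' # => true
client.indices.exists_alias? name: 'PatientB' # => true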
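
The two alias bodies above differ only in the patient name, so they could be generated rather than written by hand. The following is a minimal sketch, not part of the original gist: `alias_body_for` and `alias_path_for` are hypothetical helper names, and the `patients` index name is taken from the calls above.

```ruby
require 'json'

# Hypothetical helper (not in the original gist): builds the alias
# definition for one patient, mirroring req_body_for_patient_a above.
def alias_body_for(patient_name)
  {
    routing: patient_name,
    filter: {
      term: { patient_name: patient_name }
    }
  }
end

# Hypothetical helper: the path that raw_put expects for a patient's alias.
def alias_path_for(patient_name, index: 'patients')
  "#{index}/_alias/#{patient_name}"
end

# Print the requests instead of sending them, so the sketch runs
# without a live cluster.
%w[PatientA PatientB].each do |name|
  puts "PUT #{alias_path_for(name)} #{alias_body_for(name).to_json}"
end
```

With a running cluster, each pair could be passed to `raw_put` in place of the `puts` call.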
# Disable refresh while bulk indexing to improve indexing performance.
# Both aliases resolve to the 'patients' index, so the setting applies to it.
client.indices.put_settings index: 'PatientA', body: { refresh_interval: -1 }
client.indices.put_settings index: 'PatientB', body: { refresh_interval: -1 }
# Now index (store) the documents
patient_a_doc1 = {
  patient_name: 'PatientA',
  content: 'The quick brown fox'
}
patient_a_doc2 = {
  patient_name: 'PatientA',
  content: 'It was a dark and stormy night'
}
patient_b_doc1 = {
  patient_name: 'PatientB',
  content: 'Two roads diverged in a yellow wood'
}
patient_b_doc2 = {
  patient_name: 'PatientB',
  content: 'Lorem ipsum dolor sit amet'
}
bulk_index_body = [
  { index: { _index: 'PatientA', _type: 'patient_type', data: patient_a_doc1 } },
  { index: { _index: 'PatientA', _type: 'patient_type', data: patient_a_doc2 } },
  { index: { _index: 'PatientB', _type: 'patient_type', data: patient_b_doc1 } },
  { index: { _index: 'PatientB', _type: 'patient_type', data: patient_b_doc2 } }
]
client.bulk body: bulk_index_body
# Re-enable refresh once indexing is finished
client.indices.put_settings index: 'PatientA', body: { refresh_interval: '30s' }
client.indices.put_settings index: 'PatientB', body: { refresh_interval: '30s' }
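
The bulk body above repeats the same action hash for every document. As a rough sketch (not part of the original gist), it could be derived from the documents themselves. `bulk_body_for` is a hypothetical helper, and it assumes each document's `patient_name` matches the alias it should be indexed through, as is the case in this example.

```ruby
# Hypothetical helper (not in the original gist): turns a list of documents
# into the array-of-hashes form passed to client.bulk above, using each
# document's patient_name as the alias to index through.
def bulk_body_for(docs, type: 'patient_type')
  docs.map do |doc|
    { index: { _index: doc[:patient_name], _type: type, data: doc } }
  end
end

docs = [
  { patient_name: 'PatientA', content: 'The quick brown fox' },
  { patient_name: 'PatientB', content: 'Lorem ipsum dolor sit amet' }
]

# Print the actions instead of sending them, so the sketch runs
# without a live cluster.
bulk_body_for(docs).each { |action| puts action.inspect }
```

The result could then be sent with `client.bulk body: bulk_body_for(docs)`.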
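require 'json'
require 'faraday'

# Send a raw PUT request with a JSON body straight to Elasticsearch
def raw_put(path, body)
  conn = Faraday.new url: 'http://localhost:9200'
  conn.put(path) do |req|
    req.body = body.to_json
    req.headers['Content-Type'] = 'application/json'
  end
end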
# Create primary index
req_body = {
  settings: {
    index: {
      number_of_shards: 5, # In production, this number would be much larger
      number_of_replicas: 1
    }
  },
  mappings: {
    patient_type: {
      properties: {
        patient_name: { index: 'not_analyzed', type: 'string' },
        content: { index: 'not_analyzed', type: 'string' }
      }
    }
  }
}
raw_put 'patients', req_body
client.indices.exists? index: 'patients' # => true
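
With five primary shards, a routing value such as the one set on each patient alias decides which single shard a document lands on. Elasticsearch computes this as a hash of the routing value modulo the number of primary shards. The sketch below only illustrates that formula: `shard_for` is a hypothetical name, and CRC32 is used as a stand-in for Elasticsearch's own hash function, so the shard numbers it prints will not match a real cluster.

```ruby
require 'zlib'

# Illustration only: Elasticsearch picks a shard as
#   hash(routing) % number_of_primary_shards
# CRC32 stands in for Elasticsearch's internal hash function here,
# so the actual shard numbers on a cluster will differ.
def shard_for(routing, number_of_shards: 5)
  Zlib.crc32(routing) % number_of_shards
end

%w[PatientA PatientB].each do |routing|
  puts "#{routing} -> shard #{shard_for(routing)}"
end
```

Because the result depends on the number of primary shards, that number cannot be changed after the index is created without reindexing, which is why it is chosen up front.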
# Searching the primary index directly queries all 5 shards
pry(main)> client.search index: 'patients', q: 'content:The quick brown'
=> {"took"=>4,
 "timed_out"=>false,
 "_shards"=>{"total"=>5, "successful"=>5, "failed"=>0},
 "hits"=>
  {"total"=>1,
   "max_score"=>0.26442188,
   "hits"=>
    [{"_index"=>"patients",
      "_type"=>"patient_type",
      "_id"=>"AVkBDPk-1ECoVlT3taI2",
      "_score"=>0.26442188,
      "_routing"=>"PatientA",
      "_source"=>{"patient_name"=>"PatientA", "content"=>"The quick brown fox"}}]}}
# Searching through the alias applies its routing value, so only 1 shard is queried
pry(main)> client.search index: 'PatientA', q: 'content:The quick brown'
=> {"took"=>2,
 "timed_out"=>false,
 "_shards"=>{"total"=>1, "successful"=>1, "failed"=>0},
 "hits"=>
  {"total"=>1,
   "max_score"=>0.26442188,
   "hits"=>
    [{"_index"=>"patients",
      "_type"=>"patient_type",
      "_id"=>"AVkBDPk-1ECoVlT3taI2",
      "_score"=>0.26442188,
      "_routing"=>"PatientA",
      "_source"=>{"patient_name"=>"PatientA", "content"=>"The quick brown fox"}}]}}
require 'json'
require 'elasticsearch'
client = Elasticsearch::Client.new hosts: 'http://localhost:9200'
# Test that Elasticsearch is up
client.info
pry(main)> client.info
=> {"name"=>"Dredmund Druid",
 "cluster_name"=>"elasticsearch_mkim",
 "cluster_uuid"=>"f19ZvZWLTsaJeUM9ww1nQw",
 "version"=>
  {"number"=>"2.4.2",
   "build_hash"=>"161c65a337d4b422ac0c805f284565cf2014bb84",
   "build_timestamp"=>"2016-11-17T11:51:03Z",
   "build_snapshot"=>false,
   "lucene_version"=>"5.5.2"},
 "tagline"=>"You Know, for Search"}