Bootstrap, install and configure ElasticSearch with Chef Solo
Gist by @karmi, created March 16, 2012
.gitignore

.DS_Store
Gemfile.lock
*.pem
node.json
tmp/*
!tmp/.gitignore

Bootstrap, install and configure ElasticSearch with Chef Solo

The code in this repository bootstraps and configures a fully managed Elasticsearch installation on an EC2 instance with EBS-based local persistence.

Download or clone the files in this gist:

curl -# -L -k https://gist.github.com/2050769/download | tar xz --strip 1 -C .

First, in the downloaded node.json file, replace the access_key and secret_key values with proper AWS credentials.
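
If you keep the credentials in shell variables, a sed one-liner can fill them in for you; this is just a convenience sketch (it assumes AWS_ACCESS_KEY and AWS_SECRET_ACCESS_KEY are exported in your shell, and editing the file by hand works equally well):

# Substitute the <REPLACE> placeholders in node.json with credentials from the environment
sed -i.bak \
    -e "s|\"access_key\" : \"<REPLACE>\"|\"access_key\" : \"$AWS_ACCESS_KEY\"|" \
    -e "s|\"secret_key\" : \"<REPLACE>\"|\"secret_key\" : \"$AWS_SECRET_ACCESS_KEY\"|" \
    node.json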

Second, create a dedicated security group in the AWS console for the ElasticSearch nodes. We will be using a group named elasticsearch-test.

Make sure the security group allows connections on the following ports (an optional command-line sketch follows this list):

  • Port 22 for SSH is open for external access (the default 0.0.0.0/0)
  • Port 8080 for the Nginx proxy is open for external access (the default 0.0.0.0/0)
  • Port 9300 for in-cluster communication is open to the same security group (use the Group ID for this group, available on the "Details" tab, such as sg-1a23bcd)
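
If you prefer the command line to the console, roughly equivalent rules can be created with the AWS CLI; this is only a sketch (the AWS CLI is not part of this gist, and it assumes a default-VPC account where groups can be addressed by name):

# Create the group and open ports 22 and 8080 to the world, port 9300 to the group itself
aws ec2 create-security-group --group-name elasticsearch-test --description "ElasticSearch test nodes"
aws ec2 authorize-security-group-ingress --group-name elasticsearch-test --protocol tcp --port 22   --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-name elasticsearch-test --protocol tcp --port 8080 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-name elasticsearch-test --protocol tcp --port 9300 --source-group elasticsearch-test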

Third, launch a new instance in the AWS console (a rough command-line equivalent is sketched after this list):

  • Use a meaningful name for the instance. We will use elasticsearch-test-chef-1.
  • Create a new "Key Pair" for the instance, and download it. We will be using a key named elasticsearch-test.
  • Use the Amazon Linux AMI (ami-1b814f72). Amazon Linux comes with Ruby and Java pre-installed.
  • Use the m1.large instance type. You may use the small or even micro instance type, but the process will take a very long time due to AWS constraints (hours instead of minutes).
  • Use the security group created in the second step (elasticsearch-test).
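
The same launch can be sketched with the AWS CLI (again only a rough equivalent of the console steps above; replace the i-xxxxxxxx placeholder with the instance ID returned by run-instances):

# Launch the instance and give it a meaningful Name tag
aws ec2 run-instances --image-id ami-1b814f72 --instance-type m1.large --key-name elasticsearch-test --security-groups elasticsearch-test
aws ec2 create-tags --resources i-xxxxxxxx --tags Key=Name,Value=elasticsearch-test-chef-1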

Copy the SSH key downloaded from the AWS console to the tmp/ directory of this project and change its permissions:

cp ~/Downloads/elasticsearch-test.pem ./tmp
chmod 600 ./tmp/elasticsearch-test.pem

Once the instance is ready, copy its "Public DNS" from the AWS console (e.g. ec2-123-40-123-50.compute-1.amazonaws.com).
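
The same value can also be looked up from the command line; this is just a sketch, assuming the AWS CLI is installed and configured:

aws ec2 describe-instances --filters "Name=tag:Name,Values=elasticsearch-test-chef-1" --query "Reservations[].Instances[].PublicDnsName" --output text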

We can begin the "bootstrap and install" process now.

Let's set up the connection details first:

HOST=<REPLACE WITH YOUR PUBLIC DNS>
SSH_OPTIONS="-o User=ec2-user -o IdentityFile=./tmp/elasticsearch-test.pem -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null"

Let's copy the files to the machine:

scp $SSH_OPTIONS bootstrap.sh patches.sh node.json solo.rb $HOST:/tmp

Let's bootstrap the machine (i.e. install necessary packages, download cookbooks, etc.):

time ssh -t $SSH_OPTIONS $HOST "sudo bash /tmp/bootstrap.sh"
time ssh -t $SSH_OPTIONS $HOST "sudo bash /tmp/patches.sh"

Let's launch the Chef run with the chef-solo command to provision the system:

time ssh -t $SSH_OPTIONS $HOST "sudo chef-solo -N elasticsearch-test-1 -j /tmp/node.json"

Once the Chef run successfully finishes, you can check whether ElasticSearch is running on the machine (leave it a couple of seconds so ElasticSearch has a chance to start):

ssh -t $SSH_OPTIONS $HOST "curl localhost:9200/_cluster/health?pretty"

You can also connect to the Nginx-based proxy:

curl http://USERNAME:PASSWORD@$HOST:8080

And use it for indexing some data:

curl -X POST "http://USERNAME:PASSWORD@$HOST:8080/test_chef_cookbook/document/1" -d '{"title" : "Test 1"}'
curl -X POST "http://USERNAME:PASSWORD@$HOST:8080/test_chef_cookbook/document/2" -d '{"title" : "Test 2"}'
curl -X POST "http://USERNAME:PASSWORD@$HOST:8080/test_chef_cookbook/document/3" -d '{"title" : "Test 3"}'
curl -X POST "http://USERNAME:PASSWORD@$HOST:8080/test_chef_cookbook/_refresh"

Or for performing searches:

curl "http://USERNAME:PASSWORD@$HOST:8080/_search?pretty"

You can also use the provided service to check ElasticSearch status:

ssh -t $SSH_OPTIONS $HOST "sudo service elasticsearch status -v"

Of course, you can check the ElasticSearch status with Monit:

ssh -t $SSH_OPTIONS $HOST "sudo monit reload && sudo monit status -v"

(If the Monit daemon is not running, start it with sudo service monit start first. Note that the daemon has a startup delay of 2 minutes by default.)
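
For example, from your workstation:

ssh -t $SSH_OPTIONS $HOST "sudo service monit start"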

The provisioning scripts will configure the following on the target instance:

  • Install Nginx and Monit
  • Install and configure Elasticsearch via the cookbook
  • Create, attach, format and mount a new EBS disk
  • Configure Nginx as a reverse proxy for Elasticsearch with HTTP authentication
  • Configure Monit to check Elasticsearch process status and cluster health

This repository comes with a collection of Rake tasks which automatically create the server in Amazon EC2 and perform all the provisioning steps. Install the required Rubygems with bundle install and run:

time bundle exec rake create NAME=elasticsearch-test-from-cli
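
The setup task expects the AWS_ACCESS_KEY and AWS_SECRET_ACCESS_KEY environment variables to be set, and a destroy task is provided to terminate the instance again (see the Rakefile below); a rough session:

export AWS_ACCESS_KEY=<REPLACE WITH YOUR ACCESS KEY>
export AWS_SECRET_ACCESS_KEY=<REPLACE WITH YOUR SECRET KEY>
time bundle exec rake create NAME=elasticsearch-test-from-cli
bundle exec rake destroy NAME=elasticsearch-test-from-cli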

Full tutorial: http://www.elasticsearch.org/tutorials/2012/03/21/deploying-elasticsearch-with-chef-solo.html

echo -e "\nInstalling development dependencies, Ruby and essential tools..." \
"\n===============================================================================\n"
yum install gcc gcc-c++ make automake ruby-devel libcurl-devel libxml2-devel libxslt-devel vim curl git -y
echo -e "\nInstalling Rubygems..." \
"\n===============================================================================\n"
yum install rubygems -y
gem install json --no-ri --no-rdoc
echo -e "\nInstalling and bootstrapping Chef..." \
"\n===============================================================================\n"
test -d "/opt/chef" || curl -# -L http://www.opscode.com/chef/install.sh | sudo bash -s -- -v 11.6.0
mkdir -p /etc/chef/
mkdir -p /var/chef-solo/site-cookbooks
mkdir -p /var/chef-solo/cookbooks
if test -f /tmp/solo.rb; then mv /tmp/solo.rb /etc/chef/solo.rb; fi
echo -e "\nDownloading cookbooks..." \
"\n===============================================================================\n"
test -d /var/chef-solo/site-cookbooks/monit || curl -# -L -k http://s3.amazonaws.com/community-files.opscode.com/cookbook_versions/tarballs/915/original/monit.tgz | tar xz -C /var/chef-solo/site-cookbooks/
test -d /var/chef-solo/site-cookbooks/ark || git clone git://github.com/bryanwb/chef-ark.git /var/chef-solo/site-cookbooks/ark
if [ ! -d /var/chef-solo/cookbooks/elasticsearch ]; then
  git clone git://github.com/elasticsearch/cookbook-elasticsearch.git /var/chef-solo/cookbooks/elasticsearch
else
  cd /var/chef-solo/cookbooks/elasticsearch
  git fetch
  git reset origin/master --hard
fi
echo -e "\n*******************************************************************************\n" \
"Bootstrap finished" \
"\n*******************************************************************************\n"

Gemfile

source :rubygems
gem 'rake'
gem 'json'
gem 'fog'
gem 'ansi'

node.json

{
  "run_list": [ "recipe[monit]",
                "recipe[elasticsearch]",
                "recipe[elasticsearch::plugins]",
                "recipe[elasticsearch::ebs]",
                "recipe[elasticsearch::data]",
                "recipe[elasticsearch::aws]",
                "recipe[elasticsearch::nginx]",
                "recipe[elasticsearch::proxy]",
                "recipe[elasticsearch::monit]" ],

  "elasticsearch" : {
    "cluster_name" : "elasticsearch_test_with_chef",
    "bootstrap" : { "mlockall" : false },
    "discovery" : { "type": "ec2" },
    "data_path" : "/usr/local/var/data/elasticsearch/disk1",

    "data" : {
      "devices" : {
        "/dev/sda2" : {
          "file_system" : "ext3",
          "mount_options" : "rw,user",
          "mount_path" : "/usr/local/var/data/elasticsearch/disk1",
          "format_command" : "mkfs.ext3",
          "fs_check_command" : "dumpe2fs",
          "ebs" : {
            "size" : 25,
            "delete_on_termination" : true,
            "type" : "io1",
            "iops" : 100
          }
        }
      }
    },

    "cloud" : {
      "aws" : {
        "access_key" : "<REPLACE>",
        "secret_key" : "<REPLACE>",
        "region" : "us-east-1"
      },
      "ec2" : {
        "security_group": "elasticsearch-test"
      }
    },

    "plugins" : {
      "karmi/elasticsearch-paramedic" : {}
    },

    "nginx" : {
      "users" : [ { "username" : "USERNAME", "password" : "PASSWORD" } ],
      "allow_cluster_api" : true
    }
  },

  "monit" : {
    "notify_email" : "<REPLACE WITH YOUR E-MAIL>",
    "mail_format" : { "from" : "monit@amazonaws.com", "subject" : "[monit] $SERVICE $EVENT on $HOST", "message" : "$SERVICE $ACTION: $DESCRIPTION" }
  }
}

patches.sh

# Patch Monit cookbook problems
mkdir -p /etc/monit/conf.d/
rm -f /etc/monit.conf
touch /etc/monit/monitrc
chmod 700 /etc/monit/monitrc
ln -nfs /etc/monit/monitrc /etc/monit.conf
# Patch Nginx cookbook problems
mkdir -p /etc/nginx/sites-available/
useradd -s /bin/sh -u 33 -U -d /var/www -c Webserver www-data
echo -e "\n*******************************************************************************\n" \
"Patching finished" \
"\n*******************************************************************************\n"

Rakefile

require 'rubygems'
require 'json'
require 'fog'
require 'ansi'
module Provision
  class Server
    attr_reader :name, :options, :node, :ui

    def initialize(options = {})
      @options = options
      @name    = @options.delete(:name)
    end

    def create!
      create_node
      tag_node
      msg "Waiting for SSH...", :yellow
      wait_for_sshd and puts
    end

    def destroy!
      servers = connection.servers.select { |s| s.tags["Name"] == name && s.state != 'terminated' }

      if servers.empty?
        msg "[!] No instance named '#{name}' found!", :red
        exit(1)
      end

      puts "Will terminate #{servers.size} server(s): ",
           servers.map { |s| "* #{s.tags["Name"]} (#{s.id})" }.join("\n"),
           "Continue? (y/n)"
      exit(0) unless STDIN.gets.strip.downcase == 'y'

      servers.each do |s|
        @node = s
        msg "Terminating #{node.tags["Name"]} (#{node.id})...", :yellow
        connection.terminate_instances(node.id)
        msg "Done.", :green
      end
    end

    def connection
      @connection ||= Fog::Compute.new(
        :provider              => 'AWS',
        :aws_access_key_id     => options[:aws_access_key_id],
        :aws_secret_access_key => options[:aws_secret_access_key],
        :region                => options[:aws_region]
      )
    end

    def node
      @node ||= connection.servers.select do |s|
        s.tags["Name"] =~ Regexp.new(name) && s.state != 'terminated'
      end.first
    end

    def create_node
      msg "Creating EC2 instance #{name} in #{options[:aws_region]}...", :bold
      msg "-"*ANSI::Terminal.terminal_width

      @node = connection.servers.create(:image_id  => options[:aws_image],
                                        :groups    => options[:aws_groups].split(",").map {|x| x.strip},
                                        :flavor_id => options[:aws_flavor],
                                        :key_name  => options[:aws_ssh_key_id],
                                        :block_device_mapping => options[:block_device_mapping] || [ { "DeviceName" => "/dev/sde1", "VirtualName" => "ephemeral0" }]
                                       )

      msg_pair "Instance ID", node.id
      msg_pair "Flavor", node.flavor_id
      msg_pair "Image", node.image_id
      msg_pair "Region", options[:aws_region]
      msg_pair "Availability Zone", node.availability_zone
      msg_pair "Security Groups", node.groups.join(", ")
      msg_pair "SSH Key", node.key_name

      msg "Waiting for instance...", :yellow
      @node.wait_for { print "."; ready? }
      puts

      msg_pair "Public DNS Name", node.dns_name
      msg_pair "Public IP Address", node.public_ip_address
      msg_pair "Private DNS Name", node.private_dns_name
      msg_pair "Private IP Address", node.private_ip_address
    end

    def tag_node
      msg "Tagging instance in EC2...", :yellow
      custom_tags = options[:tags].split(",").map {|x| x.strip} rescue []
      tags = Hash[*custom_tags]
      tags["Name"] = @name

      tags.each_pair do |key, value|
        connection.tags.create :key => key, :value => value, :resource_id => @node.id
        msg_pair key, value
      end
    end

    def wait_for_sshd
      hostname = node.dns_name
      loop do
        begin
          print(".")
          tcp_socket = TCPSocket.new(hostname, 22)
          readable = IO.select([tcp_socket], nil, nil, 5)
          if readable
            msg "\nSSHd accepting connections on #{hostname}, banner is: #{tcp_socket.gets}", :green
            return true
          end
        rescue SocketError
          sleep 2
          retry
        rescue Errno::ETIMEDOUT
          sleep 2
          retry
        rescue Errno::EPERM
          return false
        rescue Errno::ECONNREFUSED
          sleep 2
          retry
        rescue Errno::EHOSTUNREACH
          sleep 2
          retry
        ensure
          tcp_socket && tcp_socket.close
        end
      end
    end

    def ssh(command)
      host = node.dns_name
      user = options[:ssh_user]
      key  = options[:ssh_key]
      opts = "-o User=#{user} -o IdentityFile=#{key} -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null"
      system "ssh -t #{opts} #{host} #{command}"
    end

    def scp(files, params={})
      host = node.dns_name
      user = options[:ssh_user]
      key  = options[:ssh_key]
      opts = "-o User=#{user} -o IdentityFile=#{key} -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -r"
      path = params[:path] || '/tmp'

      __command = "scp #{opts} #{files} #{host}:#{path}"
      puts __command
      system __command
    end

    def msg message, color=:white
      puts message.ansi(color)
    end

    def msg_pair label, value, color=:cyan
      puts (label.to_s.ljust(25) + value.to_s).ansi(color)
    end
  end
end

desc "Create, bootstrap and configure an instance in EC2"
task :create => :setup do
  @server = Provision::Server.new @args
  @server.create!
  Rake::Task[:upload].execute
  Rake::Task[:provision].execute
end

desc "Terminate an EC2 instance"
task :destroy => :setup do
  @server = Provision::Server.new @args
  @server.destroy!
end

desc "(Re-)provision an instance"
task :provision => :setup do
  @server ||= Provision::Server.new @args
  @server.ssh "sudo bash /tmp/bootstrap.sh"
  @server.ssh "sudo bash /tmp/patches.sh"
  @server.ssh "sudo chef-solo -N #{@server.name} -j /tmp/#{@args[:node_json]}"
  exit(1) unless $?.success?
  puts "_"*ANSI::Terminal.terminal_width
  puts "\nOpen " + "http://#{@args[:http_username]}:#{@args[:http_password]}@#{@server.node.dns_name}:8080".ansi(:bold) + " in your browser"
end

task :upload => :setup do
  @server ||= Provision::Server.new @args
  @server.scp "bootstrap.sh patches.sh #{@args[:node_json]} solo.rb", :path => '/tmp'
  exit(1) unless $?.success?
end

task :setup do
  ENV['AWS_ACCESS_KEY']        || ( puts "\n[!] Missing AWS_ACCESS_KEY environment variable...".ansi(:red) and exit(1) )
  ENV['AWS_SECRET_ACCESS_KEY'] || ( puts "\n[!] Missing AWS_SECRET_ACCESS_KEY environment variable...".ansi(:red) and exit(1) )

  node_json = ENV['NODE'] || 'node.json'
  json      = JSON.parse(File.read( File.expand_path("../#{node_json}", __FILE__) ))

  name                  = json['elasticsearch']['node_name']                      rescue nil
  aws_access_key        = json['elasticsearch']['cloud']['aws']['access_key']     rescue nil
  aws_secret_access_key = json['elasticsearch']['cloud']['aws']['secret_key']     rescue nil
  aws_region            = json['elasticsearch']['cloud']['aws']['region']         rescue nil
  aws_group             = json['elasticsearch']['cloud']['ec2']['security_group'] rescue nil
  http_username         = json['elasticsearch']['nginx']['users'][0]['username']  rescue nil
  http_password         = json['elasticsearch']['nginx']['users'][0]['password']  rescue nil

  @args = {}
  @args[:name]                  = ENV['NAME']              || name || 'elasticsearch-test'
  @args[:node_json]             = node_json
  @args[:aws_ssh_key_id]        = ENV['AWS_SSH_KEY_ID']    || 'elasticsearch-test'
  @args[:aws_access_key_id]     = ENV['AWS_ACCESS_KEY_ID'] || aws_access_key
  @args[:aws_secret_access_key] = ENV['AWS_SECRET_ACCESS_KEY'] || aws_secret_access_key
  @args[:aws_region]            = ENV['AWS_REGION']        || aws_region || 'us-east-1'
  @args[:aws_groups]            = ENV['GROUP']             || aws_group  || 'elasticsearch-test'
  @args[:aws_flavor]            = ENV['FLAVOR']            || 't1.micro'
  @args[:aws_image]             = ENV['IMAGE']             || 'ami-1624987f'
  @args[:ssh_user]              = ENV['SSH_USER']          || 'ec2-user'
  @args[:ssh_key]               = ENV['SSH_KEY']           || File.expand_path('../tmp/elasticsearch-test.pem', __FILE__)
  @args[:http_username]         = http_username
  @args[:http_password]         = http_password
end

solo.rb

file_cache_path "/var/chef-solo"
cookbook_path ["/var/chef-solo/site-cookbooks", "/var/chef-solo/cookbooks"]
@AbleCoder

When I try to use this I get a net-ssh dependency problem.

If you update the chef install version from 10.18.2 to 10.20.0 in bootstrap.sh it works.
(https://gist.github.com/karmi/2050769#file-bootstrap-sh-L12)

@jtandalai

https://github.com/elasticsearch/cookbook-elasticsearch/blob/master/recipes/ebs.rb seems to depend on the following default:

value_for_platform(
  'default' => %w|libxslt1-dev libxml2-dev|
)

whereas the bootstrap shell script installs libxml2-devel libxslt-devel.

@brownjohnf

I experienced the same issue as both @AbleCoder and @jtandalai. Fixing these two issues worked for me through the single-server section. I'll start playing with multi-node tomorrow...


ghost commented Mar 24, 2013

@brownjohnf, I get the same issue running through the tutorial code verbatim, even though it looks like @jtandalai's fix has been patched. I get the same issue again after changing the Chef version stamp to 10.20.0 and re-running. Is there a quick fix? I just want to play around with ES on AWS, but don't want to spend hours debugging ops code! :p

Error executing action create on resource 'ruby_block[Create EBS volume on /dev/sda2 (size: 25GB)]'

Gem::LoadError

Unable to activate net-scp-1.1.0, because net-ssh-2.2.2 conflicts with net-ssh (>= 2.6.5)

Cookbook Trace:

/var/chef-solo/cookbooks/elasticsearch/libraries/create_ebs.rb:19:in `block (2 levels) in create_ebs'

@webmat

webmat commented Mar 26, 2013

In the example node.json, the path is specified incorrectly. I'm guessing the cookbook was modified since this gist was posted.

Instead of

"data_path" : "/usr/local/var/data/elasticsearch/disk1",

you should use

"path": {
  "data": ["/usr/local/var/data/elasticsearch/disk1"]
},

@cimi

cimi commented Jun 26, 2013

I can confirm that after using @webmat's and @AbleCoder's fixes, the Rakefile ran for me and I have a nice two-node cluster running.

It would be best if the gist was updated.

@AliHichem

I had no luck with version 10.20.0, but it works well with @webmat's fix and version 10.22.0. One important note: make sure to set the appropriate region.

Besides, it's pretty impressive!

Btw, could anyone tell me where to set a specific ElasticSearch version?

@lou

lou commented Aug 28, 2013

We managed to deploy on an Ubuntu AMI, so we had to do a few little tricks, but it works like a charm.

In case it helps anyone, we have made a repo: https://github.com/octoly/deploy-elasticsearch

@tamlyn

tamlyn commented Jun 3, 2014

With the latest Amazon Linux AMI (ami-2918e35e) I had to run sudo yum groupinstall "Development Tools" on the server first, otherwise Chef would fail on chef_gem[fog].
