@vincentramirez
Last active December 18, 2019 09:43
HashiCorp Consul Geo-Failover demo

This is a step-by-step guide that demonstrates how HashiCorp Consul can provide automated geo-failover for a basic microservices-based application.

Pre-reqs:

Terraform Open Source, v.0.12.17+
Git
A registered trial account with Packet.com
Basic understanding of Linux commands, ssh, and the use of public and private RSA keys
I am a Mac user and leverage my RSA keys located in ~/.ssh
The Packet free trial also provides information on the use of RSA keys
as a secure method for gaining ssh access to a remote Linux server

Deploy the demo environment

git clone https://github.com/vincentramirez/consul_geofailover_demo
cd consul_geofailover_demo
cp terraform.tfvars.example ./terraform.tfvars

Edit terraform.tfvars with the text editor of your choice and provide a valid Packet API key
with read/write access at the Organization level.
This Terraform configuration will create a new project. *If you are using a Packet trial account,
you can only have one live project at any time during your trial.

vi terraform.tfvars
i
auth_token = "<yourPacketAPIkey>"
esc key
:wq!
enter
terraform init
terraform apply
yes

As written, this deployment targets the SJC1 and EWR1 Packet datacenters and takes ~2-3 minutes to complete.
If your deployment is successful you will receive a very basic output like this:

Outputs:
consul_servera_public-IP = 123.45.6.7
consul_serverb_public-IP = 123.45.6.8
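
If you would rather not copy the addresses by hand, the two values can be pulled out of the output text with a small helper. This is only a sketch: the function name get_output and the variable names are made up for illustration, and the sample text is the placeholder output shown above. On a real run you would pipe `terraform output` into get_output instead. It assumes plain `name = value` lines; newer Terraform releases may print string values in quotes, which this sketch does not strip.

```shell
# Print the value of one "name = value" line read from stdin.
# get_output is a hypothetical helper, not part of the demo repository.
get_output() {
  awk -F' = ' -v key="$1" '$1 == key { print $2 }'
}

# Placeholder text copied from the sample output above.
# On a real run, replace the printf with: terraform output
sample_outputs='consul_servera_public-IP = 123.45.6.7
consul_serverb_public-IP = 123.45.6.8'

SERVERA_IP=$(printf '%s\n' "$sample_outputs" | get_output consul_servera_public-IP)
SERVERB_IP=$(printf '%s\n' "$sample_outputs" | get_output consul_serverb_public-IP)

# Show the ssh commands that use the captured addresses.
echo "ssh -i ~/.ssh/id_rsa root@$SERVERA_IP"
echo "ssh -i ~/.ssh/id_rsa root@$SERVERB_IP"
```

Keeping the addresses in variables means every later command and URL in this guide can reuse them.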

Connect to the demo environment

This assumes you are a Mac or Linux user and have already provided Packet with a copy of your public RSA key,
which is injected into the servers you deploy for this demo.
To connect to the servers that were created by Terraform,
open a terminal window and run:

ssh -i ~/.ssh/id_rsa root@<publicIPofServera>
ssh -i ~/.ssh/id_rsa root@<publicIPofServerb>

Verify that Consul is up and running

systemctl status consul

*If you deploy this demo multiple times, you may encounter an issue where a public IP address gets released
and re-used, resulting in this error:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ECDSA key sent by the remote host is
SHA256:WYQz3YXzxOAzwyylj13EFiggjmCzmo9TJe5wfbIxs4k.
Please contact your system administrator.......

To resolve this run the following command:

ssh-keygen -R "<publicIPaddressOfserverYouAreTryingToConnectTo>"

Then try to ssh back into that IP

Prep the demo environment

This demo is very stripped down to reduce cost and complexity. All components run on a single server in each datacenter.
We will run a couple of microservices, Envoy proxies, and a Mesh Gateway per datacenter, all on the same host.
To accomplish this in a sane manner without multiple ssh terminal sessions into the same server, we will use a Linux tool called Tmux. This should already be installed on your server if you are using the CentOS 7 OS that this demo was originally created with. If Tmux is not present on your server, install it using your OS's package manager. Run the following commands on each server:

bash /tmp/wanjoin.sh 
tmux 
PORT=9003 counting-service 

We now need to open another virtual terminal window in Tmux. To do this, press the control key and the b key at the same time and release them.
If you see the letter b entered into your terminal window, try this again.
After releasing the control key and the b key, press the c key and release it.
This should open a fresh terminal screen for you. From now on I will use the term ctrl + b, c to indicate this.

consul connect envoy -sidecar-for=counting

ctrl + b, c

PORT=9002 COUNTING_SERVICE_URL="http://localhost:9001" dashboard-service

ctrl + b, c

consul connect envoy -sidecar-for=dashboard -admin-bind localhost:19001

ctrl + b, c

bash /tmp/mesh-gateway.sh

ctrl + b, w

This will now leave you with a view of the 5 terminal windows running in Tmux.
Go to the second server and repeat these exact steps.
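
The same five windows can also be opened from a script instead of by hand. The following is a minimal sketch, not part of the demo repository: the session name demo, the RUN switch, and the function name start_demo_windows are assumptions made for illustration. By default it only prints the tmux commands so you can review them; run it with RUN= (empty) to execute them. It assumes bash /tmp/wanjoin.sh has already been run on the server.

```shell
SESSION=${SESSION:-demo}   # hypothetical tmux session name
RUN=${RUN-echo}            # prints commands by default; set RUN= (empty) to execute

start_demo_windows() {
  # Window 0: the counting service
  $RUN tmux new-session -d -s "$SESSION" 'PORT=9003 counting-service'
  # Window 1: Envoy sidecar for counting
  $RUN tmux new-window -t "$SESSION" 'consul connect envoy -sidecar-for=counting'
  # Window 2: the dashboard service
  $RUN tmux new-window -t "$SESSION" 'PORT=9002 COUNTING_SERVICE_URL="http://localhost:9001" dashboard-service'
  # Window 3: Envoy sidecar for dashboard
  $RUN tmux new-window -t "$SESSION" 'consul connect envoy -sidecar-for=dashboard -admin-bind localhost:19001'
  # Window 4: the mesh gateway
  $RUN tmux new-window -t "$SESSION" 'bash /tmp/mesh-gateway.sh'
}

start_demo_windows
```

After executing it for real, attach with tmux attach -t demo. One trade-off to be aware of: a window started this way closes when its command exits, so if you stop the counting service with ctrl + c you will need to start it again in a new window rather than in the same one.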

Bring up the GUI interfaces for the demo

In a web browser go to http://<serveraPublicIP>:8500. This will open up the Consul UI.
In another browser tab go to http://<serveraPublicIP>:9002
and in one more browser tab go to http://<serverbPublicIP>:9002

What am I looking at?

You have now deployed HashiCorp Consul into 2 separate datacenters. These Consul environments are federated.
You have registered 2 microservices in each datacenter: the counting service and the dashboard service.
Each service has an Envoy proxy running with it.
You also deployed a Mesh Gateway into each location that will allow us to leverage Layer 7 controls to complete our demo of a geo-failover scenario.
In the browser tab that is connected to http://<serveraPublicIP>:8500 you should have landed on the Services tab of the UI. If not, click on Services.
You should see 6 services registered in the Consul Service Catalog, all with green health checks. In the upper left corner of the UI, you should see sjc1 or ewr1, which indicates which datacenter you are currently focused on.
Click on this name and then click on the second datacenter to change views to the six services running in the neighboring datacenter. They should be green and happy as well.

Let's kill a service!

If you look at the tabs you have open to http://<serveraPublicIP>:9002 and http://<serverbPublicIP>:9002 you should see a very basic dashboard with numbers counting up.
This is our Dashboard application, which requires access to a counting microservice that literally counts up from the number 1.
If you look below the counting numbers on the dashboard you should see the name of the datacenter where the counting service is running. If you are in sjc1, then sjc1 should be displayed below the numbers, and if you are in ewr1, then ewr1 should be displayed.

This indicates that we have a healthy application, the dashboard app, which is able to reach a microservice called counting in its local geographical location.
We have scheduled the same counting service in both locations under the same service name. Now let's see some Consul magic.
Go back to your Tmux terminal session for servera. If the list of windows is not shown, press ctrl + b, w. Use the arrow keys to go to the top of the Tmux list of windows (window 0) and press enter.
This should bring you back to the local instance of the counting service you manually started. KILL IT!!!:

control + c

Now go back to your web browser at http://<serveraPublicIP>:9002
Wait for 3 seconds or refresh the browser a few times. The dashboard should still be functional, but you should notice that underneath the counting numbers you are now connected to ewr1.
The dashboard application lost connectivity to a healthy instance of the counting service in its local datacenter. Consul automatically detected this and failed over to a neighboring Consul datacenter that has a healthy instance of the counting service.
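
To make the failover behaviour more concrete, here is a sketch of what a service-resolver config entry for the counting service could look like in sjc1. This is an assumption about the shape of the configuration, not a copy of what the demo scripts actually install. The file name and the FAILOVER_DC variable are made up for illustration. The snippet only writes the file. Applying it to a running Consul datacenter is a separate step using consul config write.

```shell
# Sketch only: write a service-resolver config entry for "counting".
# FAILOVER_DC and RESOLVER_FILE are hypothetical names chosen for this example.
FAILOVER_DC=${FAILOVER_DC:-ewr1}
RESOLVER_FILE=${RESOLVER_FILE:-counting-resolver.hcl}

cat > "$RESOLVER_FILE" <<EOF
Kind = "service-resolver"
Name = "counting"

# When no healthy local instance is available, try these datacenters in order.
Failover = {
  "*" = {
    Datacenters = ["$FAILOVER_DC"]
  }
}
EOF

echo "Wrote $RESOLVER_FILE; apply it with: consul config write $RESOLVER_FILE"
```

In ewr1 the mirror-image entry would list sjc1 as the failover datacenter, for example by running the snippet with FAILOVER_DC=sjc1.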

Bring the local service back online

While still in your Tmux terminal for servera, where you just killed the counting service, bring it back online:

PORT=9003 counting-service

Verify the service is back online in your Consul UI, focused on the sjc1 datacenter.
Now go back to your web browser at http://<serveraPublicIP>:9002 and see what is listed under the counting numbers.
You should be back on the local instance of the counting service, as indicated by sjc1 underneath the counting numbers.

This demo showcases the Service Resolver failover capability of HashiCorp Consul.

For more information on these concepts: https://www.consul.io/docs/agent/config-entries/service-resolver.html
Keep in mind that this demo only showcases two datacenters or geo-locations. Consul can be federated across more than two locations, and this technology can be used for both inter- and intra-datacenter failover scenarios.
Consul is a very portable solution that can run pretty much anywhere: bare metal servers, virtual machines, Kubernetes, private datacenters, and public clouds.
