I hereby claim:
- I am armon on github.
- I am armon (https://keybase.io/armon) on keybase.
- I have a public key ASByRbYLSFAAGnJf0iMdYR0t5U9u5uVjfP8vY6p2s0vFego
To claim this, I am signing this object:
// readPath fetches the policy and acts on it without any coordination.
func readPath(name string) {
    p := GetPolicy(name)
    DoSomething(p)
}

// writePath serializes mutations for a given name through the lock manager.
func writePath(name string) {
    p := GetPolicy(name)
    LockManager.Lock(name, func() {
        DoSomething(p)
    })
}
InfoQ: Vault is an online system that clients must request secrets from; what risk is there that a Vault outage causes downtime?
Armon: HashiCorp has been in the datacenter automation space for several years, and we understand the highly-available nature of modern infrastructure. When we designed Vault, high availability was a critical part of the design, not something we tried to bolt on later. Vault makes use of coordination services like Consul or Zookeeper to perform leader election. This means you can deploy multiple Vault instances, such that if one fails there is an automatic failover to a healthy instance. We typically recommend deploying at least
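For a concrete picture of the coordination piece described above, here is a minimal Go sketch of leader election on top of Consul's lock API, the same kind of primitive Vault's HA mode relies on; the key name and error handling are illustrative assumptions, not Vault's actual implementation.

package main

import (
    "log"

    "github.com/hashicorp/consul/api"
)

func main() {
    // Connect to the local Consul agent.
    client, err := api.NewClient(api.DefaultConfig())
    if err != nil {
        log.Fatal(err)
    }

    // All instances contend for the same well-known key; the one that
    // acquires the lock becomes active, the rest wait as standbys.
    lock, err := client.LockKey("service/vault-demo/leader") // illustrative key
    if err != nil {
        log.Fatal(err)
    }

    lostCh, err := lock.Lock(nil)
    if err != nil {
        log.Fatal(err)
    }
    log.Println("acquired leadership; serving requests")

    // If the underlying session is invalidated (instance failure, network
    // partition), the channel closes and a standby takes over.
    <-lostCh
    log.Println("lost leadership")
}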
The simplest way to do this with Consul is to run a single "global" datacenter.
This means the timing for the LAN
gossip needs to be tuned to be WAN-appropriate.
In consul/config.go
(https://github.com/hashicorp/consul/blob/master/consul/config.go#L267),
do something like:
// Make the 'LAN' more forgiving for latency spikes
conf.SerfLANConfig.MemberlistConfig = memberlist.DefaultWANConfig()
Then we need to tune the Raft layer to be extremely forgiving.
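As a sketch of what "forgiving" means here, the timing fields on the embedded hashicorp/raft config can be widened in the same place; the specific durations below are illustrative assumptions, not a recommended baseline.

// Relax Raft timings so WAN-scale latency spikes do not trigger
// spurious leader elections. Values are illustrative only.
conf.RaftConfig.HeartbeatTimeout = 3 * time.Second
conf.RaftConfig.ElectionTimeout = 3 * time.Second
conf.RaftConfig.LeaderLeaseTimeout = 1 * time.Second
conf.RaftConfig.CommitTimeout = 500 * time.Millisecond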
#!/bin/bash
# Store the live members
consul members | grep alive | awk '{ print $1 }' > /tmp/alive.txt

# Clean-up the collectd metrics
cd /data/graphite/whisper/collectd
ls | awk '{print substr($1, 0, index($1, "_node_")) }' > /tmp/monitored.txt

# Remove metrics for monitored nodes that are no longer alive
for NODE in `cat /tmp/monitored.txt`; do
    if grep -q $NODE /tmp/alive.txt; then
        echo $NODE alive
    else
        echo $NODE dead
        sudo rm -Rf ${NODE}_node_*
    fi
done
The initial observed cluster behavior:
1) Constant churn of nodes between Failed and Alive
2) Message bus saturated (~150 updates/sec)
3) Subset of cluster affected
4) Some nodes that are flapping don't exist! (Node dead, or agent down)

One immediate question is how the cluster remained in an unstable
state. We expect that the cluster should converge and return to
a quiet state after some time. However, there was a bug in the
low-level SWIM implementation (memberlist library).
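For context on the knobs involved, here is a small Go sketch of the memberlist (SWIM) timing parameters that govern how quickly suspicion and gossip converge; the values are illustrative assumptions, and this is not the fix for the bug itself.

// Illustrative only: the memberlist settings that control failure
// detection and gossip convergence.
conf := memberlist.DefaultLANConfig()
// More confirmation rounds before declaring a node dead reduces
// flapping caused by transient packet loss.
conf.SuspicionMult = 6
// Slower probing and gossip trade detection latency for less
// churn on an already-saturated message bus.
conf.ProbeInterval = 2 * time.Second
conf.GossipInterval = 400 * time.Millisecond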
Sent 5/1/2014

Hey Igor,

Glad you did a write-up! I’m one of the authors of Consul. You mention we get some
things wrong about SmartStack; we would love to get that corrected. The website
is generated from this file:
https://github.com/hashicorp/consul/blob/master/website/source/intro/vs/smartstack.html.markdown
armon:~/projects/consul-demo-tf/tf (master) $ TF_LOG=1 terraform plan
2014/10/15 19:51:31 Detected home directory from env var: /Users/armon
2014/10/15 19:51:31 [DEBUG] Discoverd plugin: aws = /Users/armon/projects/go/bin/terraform-provider-aws
2014/10/15 19:51:31 [DEBUG] Discoverd plugin: cloudflare = /Users/armon/projects/go/bin/terraform-provider-cloudflare
2014/10/15 19:51:31 [DEBUG] Discoverd plugin: consul = /Users/armon/projects/go/bin/terraform-provider-consul
2014/10/15 19:51:31 [DEBUG] Discoverd plugin: digitalocean = /Users/armon/projects/go/bin/terraform-provider-digitalocean
2014/10/15 19:51:31 [DEBUG] Discoverd plugin: dnsimple = /Users/armon/projects/go/bin/terraform-provider-dnsimple
2014/10/15 19:51:31 [DEBUG] Discoverd plugin: google = /Users/armon/projects/go/bin/terraform-provider-google
2014/10/15 19:51:31 [DEBUG] Discoverd plugin: heroku = /Users/armon/projects/go/bin/terraform-provider-heroku
2014/10/15 19:51:31 [DEBUG] Discoverd plugin: mailgun = /Users/armon/projects/go/bin/terraform-
2014/09/12 14:21:07 [DEBUG] http: Request /v1/agent/self (385.106us)
2014/09/12 14:21:07 [DEBUG] http: Request /v1/event/fire/mysql-available (80.68us)
2014/09/12 14:21:07 [DEBUG] consul: user event: mysql-available
2014/09/12 14:21:07 [DEBUG] agent: new event: mysql-available (be9e89d7-e66b-8dbf-6a3e-ac1f64cfbc27)
2014/09/12 14:21:07 [DEBUG] http: Request /v1/event/list?index=1&name=mysql-available (5.50690515s)
2014/09/12 14:21:07 [DEBUG] http: Request /v1/event/list?index=1&name=mysql-available (42.151us)
2014/09/12 14:21:07 [DEBUG] agent: watch handler 'cat >> events.out' output:
# jdyer at MacBook-Pro.local in ~/Projects/consul [15:48:45]
$ dig @localhost -p 8600 _sip._udp.service.consul srv

; <<>> DiG 9.10.0-P2 <<>> @localhost -p 8600 _sip._udp.service.consul srv
; (3 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 5926
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available