Ken Sipe kensipe

@kensipe
kensipe / gist:17eef1778de973c4003f
Last active September 8, 2015 22:03
Mesosphere Cluster with mesos-dns, hdfs and spark on GCE

Intro

This screencast demonstrates how to set up a Mesosphere cluster for analytics. We will show how to provision Mesosphere on Google Compute Engine (GCE) and install Mesos-DNS, HDFS, and Spark.

We start by setting up Mesosphere on GCE: point a browser at google.mesosphere.com.

GCE setup

  1. Complete the setup wizard
  2. Download and run OpenVPN
  3. Open the Mesos UI
kensipe / mesos-dns-setup-notes.md
Created May 29, 2015 17:03
details for setting up mesos-dns with docker

Mesos-DNS

Scripts for setting up

sudo mkdir /etc/mesos-dns
sudo vi /etc/mesos-dns/config.json

config.json
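As a minimal sketch of what these steps might produce, the helper below writes a starting-point config.json. The function name `write_mesos_dns_config`, the ZooKeeper address, the domain, and the resolver are all placeholder assumptions rather than values from these notes, and the field names should be checked against the Mesos-DNS documentation for the version in use.

```shell
# Hypothetical helper: writes a minimal Mesos-DNS config.json into a directory.
# Every value below is a placeholder assumption; adjust for your cluster.
write_mesos_dns_config() {
  dir="$1"   # target directory, e.g. /etc/mesos-dns
  zk="$2"    # ZooKeeper URL of the Mesos masters (assumed value)
  mkdir -p "$dir" || return 1
  cat > "$dir/config.json" <<EOF
{
  "zk": "$zk",
  "refreshSeconds": 60,
  "ttl": 60,
  "domain": "mesos",
  "port": 53,
  "resolvers": ["8.8.8.8"],
  "timeout": 5
}
EOF
}

# Example (not run here): write to a scratch directory first, then copy with sudo.
# write_mesos_dns_config /tmp/mesos-dns zk://10.0.0.1:2181/mesos
```

Writing to a scratch directory first makes it easy to inspect the file before copying it into /etc/mesos-dns with sudo.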

kensipe / dcos-ssh-slaves.md
Last active August 29, 2015 14:22
dcos ssh into slaves

SSH steps

  1. Go to the AWS console and filter by the name of your cluster.
  2. Find your master (it will be the one with a public IP and a security group whose name includes MasterSecurityGroup).
    1. Get its public DNS name.
  3. Add the following to ~/.ssh/config (replace the host name with your master's public DNS name):
	Host ec2-52-25-163-225.us-west-2.compute.amazonaws.com
	        Compression yes
	        ForwardAgent yes
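
The config entry from step 3 can also be generated rather than typed by hand. This is only a sketch: `ssh_host_block` is a hypothetical helper name, and it emits just the two options shown in these notes.

```shell
# Hypothetical helper: prints an ssh config block for the given master DNS name.
ssh_host_block() {
  host="$1"   # public DNS name of your master
  printf 'Host %s\n' "$host"
  printf '    Compression yes\n'
  printf '    ForwardAgent yes\n'
}

# Example (not run here): append the block for your own master.
# ssh_host_block ec2-52-25-163-225.us-west-2.compute.amazonaws.com >> ~/.ssh/config
```

Printing to stdout and appending explicitly keeps the helper from modifying ~/.ssh/config unless you ask it to.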
kensipe / myriad-demo.txt
Last active September 30, 2015 18:42
myriad demo
setup: You have a multi-purpose cluster environment used for end-user web traffic and in-house analytics. In this example we have 4 running Docker instances of nginx hosting our web application, fronted by HAProxy, plus 2 small and 2 medium instances of YARN running on MapR Hadoop with MapR-FS.
note: most organizations underutilize their datacenter resources by keeping these two workloads on separate clusters. In this demonstration we co-locate them.
<setup scripts>
1. look at master port 80 (web app)
- technical dive: look at /etc/haproxy/haproxy.cfg on master
2. let's run a TeraSort job
<launch TeraSort job>
kensipe / dcos-topology.png
Last active July 20, 2020 17:00
DCOS Topology
dcos-topology.png
kensipe / mapr-myriad-setup-notes.md
Last active July 20, 2020 17:00
MapR Myriad Setup Notes
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv E56151BF
DISTRO=$(lsb_release -is | tr '[:upper:]' '[:lower:]')
CODENAME=$(lsb_release -cs)

Add the repository

echo "deb http://repos.mesosphere.io/${DISTRO} ${CODENAME} main" | sudo tee /etc/apt/sources.list.d/mesosphere.list
sudo apt-get -y update
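
Before writing to /etc, it can help to preview the repository line the commands above will produce. A minimal sketch, where `mesosphere_repo_line` is a hypothetical helper name and the example distro and codename are only illustrative:

```shell
# Hypothetical helper: builds the apt source line for a given distro and codename.
mesosphere_repo_line() {
  distro="$1"     # lowercase distro id, e.g. ubuntu
  codename="$2"   # release codename, e.g. trusty
  printf 'deb http://repos.mesosphere.io/%s %s main\n' "$distro" "$codename"
}

# Preview the line for this machine without touching /etc (requires lsb_release):
# mesosphere_repo_line "$(lsb_release -is | tr '[:upper:]' '[:lower:]')" "$(lsb_release -cs)"
```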
kensipe / working-with-mapr-notes.txt
Last active July 20, 2020 16:59
Working With Mapr Notes
sudo /etc/init.d/mapr-warden restart
sudo maprcli node list -columns ip
maprcli volume list -columns n,p
hadoop fs -ls /var/mapr/local
hadoop fs -stat /var/mapr/local
hadoop fs -stat /var/mapr/local/demo-mapr-slave1.c.inbound-bee-664.internal
hadoop fs -lsr /var/
kensipe / marathon-metrics-70k.json
Last active July 20, 2020 16:58
Marathon Metrics
{
"version":"3.0.0",
"gauges":{
"api.mesosphere.marathon.core.event.impl.stream.HttpEventStreamActorMetrics.number-of-streams":{
"value":0
},
"jvm.buffers.direct.capacity":{
"value":856750
},
"jvm.buffers.direct.count":{
#!/bin/bash
# Expects git and aws (with prod creds) to be available.
# Expects to be run from the marathon dir, or to have MARATHON_PROJECT_DIR set.
if [ -z "$MARATHON_PROJECT_DIR" ]; then
  echo "MARATHON_PROJECT_DIR NOT set... using current directory"
else
  # Quote the path so directories containing spaces still work.
  pushd "$MARATHON_PROJECT_DIR"
fi
kensipe / DCOS Unreachable Strategy.md
Last active November 6, 2020 05:34
DCOS Unreachable Strategy

Unreachable Strategy

For Marathon to provide partition-aware unreachable strategy support, two high-level events must occur: 1) Mesos must communicate that a task is unreachable, and 2) Marathon must respond to that event if it remains unresolved within a specified amount of time. Each of these events has configuration options and DC/OS system defaults, which are worth reviewing to fully understand how and when Marathon will manage an unreachable task.
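
The timing of these two events can be estimated with simple arithmetic. The sketch below assumes the master reports an agent unreachable after a number of consecutive missed pings, each with a fixed timeout, and that Marathon's wait is added on top. Both function names are hypothetical, and the Marathon wait is an input you supply, not a documented default.

```shell
# Seconds until the Mesos master reports an agent unreachable:
# consecutive missed pings multiplied by the per-ping timeout.
mesos_detection_seconds() {
  max_pings="$1"      # value of --max_agent_ping_timeouts
  ping_timeout="$2"   # value of --agent_ping_timeout, in seconds
  echo $(( max_pings * ping_timeout ))
}

# Hypothetical: total delay once Marathon's own wait is added on top.
total_unreachable_seconds() {
  detection="$(mesos_detection_seconds "$1" "$2")"
  marathon_wait="$3"  # seconds Marathon waits before acting (assumed input)
  echo $(( detection + marathon_wait ))
}

mesos_detection_seconds 5 15    # Mesos defaults: prints 75
```

Keeping the two stages separate makes it easier to see which setting dominates the total delay when tuning a cluster.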

Apache Mesos Unreachable Strategies

Apache Mesos's ability to communicate that a task or node is unreachable is controlled by two concepts: 1) the mesos-agent health check and 2) the node rate limiter. For agent health checks, the controlling mesos-master flags are --max_agent_ping_timeouts and --agent_ping_timeout. The Mesos defaults are 5 and 15s respectively, which gives a 75-second notification event by default (assuming the loss of 1 agent). The default for DC/OS for [max_slave_ping_timeouts is 20](https://github.com/dcos/dcos/blob/9