@krisis
Last active May 19, 2017 11:21
Minio Marathon Host IP environment variable

Problem Statement

A distributed Minio setup can be configured to use up to 16 disks. These disks may be spread across up to 16 nodes (agents in Mesos). To set up distributed Minio, we need to know the (internal) IP address or hostname of every container instance before scheduling them. For example, in a distributed Minio setup with 4 disks spread across 4 nodes with IPs IP1, IP2, IP3 and IP4 respectively, the command to run in each container is:

minio server http://IP1:9000/disk http://IP2:9000/disk http://IP3:9000/disk http://IP4:9000/disk
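The command above follows a regular pattern, so it can be generated from a list of node addresses. The following is a minimal sketch, not part of Minio or Marathon; the function name `minio_server_command` and its parameters are hypothetical, and the port and path defaults are simply taken from the example above.

```python
def minio_server_command(hosts, port=9000, path="/disk"):
    """Build the `minio server` command line for the given node addresses.

    Hypothetical helper: `hosts` is a list of IPs or hostnames, one per disk.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    if len(hosts) > 16:
        # Upper limit as stated in the problem statement above.
        raise ValueError("a distributed setup uses at most 16 disks")
    endpoints = ["http://{}:{}{}".format(h, port, path) for h in hosts]
    return ["minio", "server"] + endpoints


print(" ".join(minio_server_command(["IP1", "IP2", "IP3", "IP4"])))
```

Every container runs the same command, which is why all addresses must be known before any container starts.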

Proposal for Marathon

Marathon sets up some environment variables for each task it launches, in addition to those set by Mesos. Currently, it exposes the assigned host ports, one variable per assigned resource port. These are named PORT0 through PORT{N-1}, where N is the number of assigned ports.

We propose that Marathon should similarly set up task environment variables HOST0 through HOST{N-1} with the IP addresses of the agents where the containers will be scheduled, where N is the number of nodes. For example, when we launch 4 containers in a distributed Minio setup, Marathon should set HOST0 through HOST3 to the IPs of the agents where the corresponding minio server containers are scheduled.

The command section of Minio's marathon.json would then look like this:

"args": [
  "server",
  "http://$HOST0:9000/disk",
  "http://$HOST1:9000/disk",
  "http://$HOST2:9000/disk",
  "http://$HOST3:9000/disk"
]
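To illustrate how a task could consume the proposed variables, here is a minimal sketch of a wrapper that collects HOST0, HOST1, ... from the environment and builds the server arguments. The HOST{i} variables are the proposal of this document and are not set by Marathon today; the function names `hosts_from_env` and `server_args` are hypothetical, and the URL form follows the command in the problem statement.

```python
import os


def hosts_from_env(env=None):
    """Collect HOST0, HOST1, ... until the first missing index.

    HOST{i} are the variables proposed above, not an existing Marathon feature.
    """
    env = os.environ if env is None else env
    hosts = []
    while "HOST{}".format(len(hosts)) in env:
        hosts.append(env["HOST{}".format(len(hosts))])
    return hosts


def server_args(env=None, port=9000, path="/disk"):
    """Build the argument list for `minio` from the proposed HOST{i} variables."""
    return ["server"] + [
        "http://{}:{}{}".format(h, port, path) for h in hosts_from_env(env)
    ]


# Example with a made-up environment; the addresses are placeholders.
example_env = {"HOST0": "10.0.0.1", "HOST1": "10.0.0.2",
               "HOST2": "10.0.0.3", "HOST3": "10.0.0.4"}
print(" ".join(server_args(example_env)))
```

Stopping at the first missing index mirrors the PORT0 through PORT{N-1} convention, where the variables are numbered contiguously from zero.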

Comparison with Kubernetes and Docker-Swarm

Kubernetes StatefulSets

Kubernetes provides a construct called StatefulSet that gives each requested pod a predetermined, unique identity. While the pods are still free to be scheduled wherever the Kubernetes scheduler deems suitable, the application knows deterministically the hostnames of the pods it runs in.

For example, in Minio's case, we know the hostnames in advance because they follow the naming pattern defined in the StatefulSet documentation.

Here is how we pass the Minio server arguments in Kubernetes .yaml files:

args:
  - server
  - http://minio-0.minio.default.svc.cluster.local/data
  - http://minio-1.minio.default.svc.cluster.local/data
  - http://minio-2.minio.default.svc.cluster.local/data
  - http://minio-3.minio.default.svc.cluster.local/data
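The hostnames above follow the StatefulSet pattern of pod name plus ordinal, then the governing service, namespace, and cluster domain. The sketch below generates the same argument list; the function name `statefulset_endpoints` and its parameters are hypothetical, and the default values are taken from the example above.

```python
def statefulset_endpoints(name, service, namespace, replicas,
                          path="/data", cluster_domain="cluster.local"):
    """Return one URL per pod of a StatefulSet, ordinals 0..replicas-1.

    Hypothetical helper; hostnames follow the pattern
    <name>-<ordinal>.<service>.<namespace>.svc.<cluster_domain>.
    """
    return [
        "http://{}-{}.{}.{}.svc.{}{}".format(
            name, i, service, namespace, cluster_domain, path)
        for i in range(replicas)
    ]


for url in statefulset_endpoints("minio", "minio", "default", 4):
    print(url)
```

Because the names depend only on the StatefulSet definition and not on where pods land, the list can be written into the manifest before anything is scheduled, which is the property the HOST{i} proposal aims to bring to Marathon.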

Docker Swarm

Docker Swarm lets you use service or container names directly as hostnames. So, as soon as we create a service we already know the name. This makes it easy to deploy distributed Minio on Docker Swarm.
