@jaceklaskowski
Last active August 1, 2018 11:28
How to run spark-jobserver on Docker and Mac OS (using docker-machine)

From https://github.com/spark-jobserver/spark-jobserver#getting-started-with-spark-job-server:

The easiest way to get started is to try the Docker container which prepackages a Spark distribution with the job server and lets you start and deploy it.

➜  spark-jobserver git:(master) docker-machine version
docker-machine version 0.7.0, build a650a40

// https://gist.github.com/radekg/ec5a1575c450a48e5cba

// "default" is the machine name docker-machine uses when no name is given
// https://docs.docker.com/machine/get-started/#operate-on-machines-without-specifying-the-name

➜  spark-jobserver git:(master) ✗ docker-machine stop
Stopping "default"...
Machine "default" was stopped.

➜  spark-jobserver git:(master) ✗ docker-machine rm default
About to remove default
Are you sure? (y/n): y
Successfully removed default
➜  spark-jobserver git:(master) ✗ docker-machine create --driver virtualbox \
>                       --virtualbox-cpu-count "4" \
>                       --virtualbox-memory "4096" \
>                       --virtualbox-disk-size "50000" default
Running pre-create checks...
Creating machine...
(default) Copying /Users/jacek/.docker/machine/cache/boot2docker.iso to /Users/jacek/.docker/machine/machines/default/boot2docker.iso...
(default) Creating VirtualBox VM...
(default) Creating SSH key...
(default) Starting the VM...
(default) Check network to re-create if needed...
(default) Waiting for an IP...
Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
Detecting the provisioner...
Provisioning with boot2docker...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
Checking connection to Docker...
Docker is up and running!
To see how to connect your Docker Client to the Docker Engine running on this virtual machine, run: docker-machine env default

➜  spark-jobserver git:(master) ✗ docker-machine ls
NAME      ACTIVE   DRIVER       STATE   URL   SWARM   DOCKER    ERRORS
default   -        virtualbox   Saved                 Unknown

➜  spark-jobserver git:(master) docker-machine start
Starting "default"...
(default) Check network to re-create if needed...
(default) Waiting for an IP...
Machine "default" was started.
Waiting for SSH to be available...
Detecting the provisioner...
Started machines may have new IP addresses. You may need to re-run the `docker-machine env` command.

➜  spark-jobserver git:(master) docker-machine ls
NAME      ACTIVE   DRIVER       STATE     URL                         SWARM   DOCKER    ERRORS
default   -        virtualbox   Running   tcp://192.168.99.100:2376           v1.11.0

➜  spark-jobserver git:(master) ✗ eval "$(docker-machine env)"
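
If the `eval` above did not take effect, every later `docker` command talks to the wrong daemon or to none. A minimal sketch of a sanity check follows; `docker_env_ready` is a hypothetical helper name, and it only assumes that `docker-machine env` exports `DOCKER_HOST` and `DOCKER_CERT_PATH` (it exports a few more variables as well).

```shell
# Hypothetical helper: succeed only when the variables exported by
# `eval "$(docker-machine env)"` are present in the current shell.
docker_env_ready() {
  # DOCKER_HOST points at the VM's daemon; DOCKER_CERT_PATH holds the TLS certs.
  [ -n "${DOCKER_HOST:-}" ] && [ -n "${DOCKER_CERT_PATH:-}" ]
}
```

It could be used as `docker_env_ready || eval "$(docker-machine env)"` before running `docker images`.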

➜  spark-jobserver git:(master) docker images
REPOSITORY            TAG                 IMAGE ID            CREATED             SIZE
cloudera/quickstart   latest              4239cd2958c6        2 weeks ago         6.336 GB
java                  7-jre               e833f836bc85        2 weeks ago         344.1 MB

// https://github.com/marcuslonnberg/sbt-docker#building-an-image

➜  spark-jobserver git:(master) ✗ sbt docker
[info] Loading global plugins from /Users/jacek/.sbt/0.13/plugins
[info] Loading project definition from /Users/jacek/dev/oss/spark-jobserver/project
...
[info] spark-1.6.1-bin-hadoop2.4/README.md
[info]  ---> 3c9c03d81c16
[info] Removing intermediate container 4872b1ed002a
[info] Step 18 : VOLUME /database
[info]  ---> Running in c78470bcd47a
[info]  ---> 26a2df3f3134
[info] Removing intermediate container c78470bcd47a
[info] Step 19 : ENTRYPOINT app/server_start.sh
[info]  ---> Running in d4efe2b8275f
[info]  ---> 86cb55b4accd
[info] Removing intermediate container d4efe2b8275f
[info] Successfully built 86cb55b4accd
[info] Tagging image 86cb55b4accd with name: velvia/spark-jobserver:0.6.2-SNAPSHOT.mesos-0.25.0.spark-1.6.1
[info] Warning: '-f' is deprecated, it will be removed soon. See usage.
[success] Total time: 121 s, completed Apr 23, 2016 8:26:27 PM

➜  spark-jobserver git:(master) docker images
REPOSITORY               TAG                                       IMAGE ID            CREATED             SIZE
velvia/spark-jobserver   0.6.2-SNAPSHOT.mesos-0.25.0.spark-1.6.1   86cb55b4accd        3 minutes ago       805.6 MB
cloudera/quickstart      latest                                    4239cd2958c6        2 weeks ago         6.336 GB
java                     7-jre                                     e833f836bc85        2 weeks ago         344.1 MB

➜  spark-jobserver git:(master) docker run -d -p 8090:8090 velvia/spark-jobserver:0.6.2-SNAPSHOT.mesos-0.25.0.spark-1.6.1
aeccf6dea167a21d3fb8852084bfb47a7f66f60043ff2845a4e66c6e01f75b11

➜  spark-jobserver git:(master) docker ps
CONTAINER ID        IMAGE                                                            COMMAND                 CREATED             STATUS              PORTS                              NAMES
aeccf6dea167        velvia/spark-jobserver:0.6.2-SNAPSHOT.mesos-0.25.0.spark-1.6.1   "app/server_start.sh"   13 seconds ago      Up 12 seconds       0.0.0.0:8090->8090/tcp, 9999/tcp   serene_varahamihira

➜  spark-jobserver git:(master) CONTAINER_ID=$(docker ps --format "{{.ID}}" --filter="ancestor=velvia/spark-jobserver:0.6.2-SNAPSHOT.mesos-0.25.0.spark-1.6.1")
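
The `ancestor` filter above returns one ID per matching container, so `CONTAINER_ID` may hold several lines if the image was started more than once. An alternative sketch is to capture the ID that `docker run -d` itself prints (as seen in the output of the `docker run` step). `start_jobserver` and the `IMAGE` variable are hypothetical names introduced for this example.

```shell
# Image tag as produced by `sbt docker` in this walkthrough.
IMAGE="velvia/spark-jobserver:0.6.2-SNAPSHOT.mesos-0.25.0.spark-1.6.1"

# Hypothetical helper: start the job server detached.
# `docker run -d` prints the full ID of the new container on stdout,
# so the caller can capture it with command substitution.
start_jobserver() {
  docker run -d -p 8090:8090 "$IMAGE"
}
```

Usage would then be `CONTAINER_ID=$(start_jobserver)` followed by `docker logs -f "$CONTAINER_ID"`.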

➜  spark-jobserver git:(master) ✗ docker logs -f $CONTAINER_ID
[2016-04-23 18:32:13,819] INFO  internal.command.DbMigrate [] [] - Successfully applied 1 migration to schema "PUBLIC" (execution time 00:00.034s).
[2016-04-23 18:32:13,855] INFO  k.jobserver.io.JobDAOActor [] [akka://JobServer/user/dao-manager] - Starting actor spark.jobserver.io.JobDAOActor
[2016-04-23 18:32:13,856] INFO  jobserver.DataManagerActor [] [akka://JobServer/user/data-manager] - Starting actor spark.jobserver.DataManagerActor
[2016-04-23 18:32:13,857] INFO  spark.jobserver.JarManager [] [akka://JobServer/user/jar-manager] - Starting actor spark.jobserver.JarManager
[2016-04-23 18:32:13,859] INFO  AkkaClusterSupervisorActor [] [akka://JobServer/user/context-supervisor] - Starting actor spark.jobserver.AkkaClusterSupervisorActor
[2016-04-23 18:32:13,861] INFO  ark.jobserver.JobInfoActor [] [akka://JobServer/user/job-info] - Starting actor spark.jobserver.JobInfoActor
[2016-04-23 18:32:13,920] INFO  AkkaClusterSupervisorActor [] [] - AkkaClusterSupervisor initialized on akka.tcp://JobServer@127.0.0.1:48324
[2016-04-23 18:32:13,921] INFO  k.jobserver.JobResultActor [] [akka://JobServer/user/context-supervisor/global-result-actor] - Starting actor spark.jobserver.JobResultActor
[2016-04-23 18:32:13,945] INFO  Cluster(akka://JobServer) [] [Cluster(akka://JobServer)] - Cluster Node [akka.tcp://JobServer@127.0.0.1:48324] - Node [akka.tcp://JobServer@127.0.0.1:48324] is JOINING, roles [supervisor]
[2016-04-23 18:32:14,079] INFO  spark.jobserver.WebApi [] [] - No authentication.
[2016-04-23 18:32:14,332] INFO  spark.jobserver.WebApi [] [] - Starting browser web service...
[2016-04-23 18:32:14,814] INFO  Cluster(akka://JobServer) [] [Cluster(akka://JobServer)] - Cluster Node [akka.tcp://JobServer@127.0.0.1:48324] - Leader is moving node [akka.tcp://JobServer@127.0.0.1:48324] to [Up]
[2016-04-23 18:32:14,845] INFO  ay.can.server.HttpListener [] [akka.tcp://JobServer@127.0.0.1:48324/user/IO-HTTP/listener-0] - Bound to /0.0.0.0:8090

// Open Spark JobServer UI
➜  spark-jobserver git:(master) open http://$(docker-machine ip):8090
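
The UI only answers once the log shows `Bound to /0.0.0.0:8090`, so opening the URL right after `docker run` may fail. The sketch below polls the port before opening it. `jobserver_url` and `wait_for_jobserver` are hypothetical helpers, and the retry count and one-second delay are arbitrary choices, not values from the project.

```shell
# Build the job server URL from the machine IP.
# Port 8090 is the one published by `docker run -p 8090:8090`.
jobserver_url() {
  echo "http://$1:${2:-8090}"
}

# Poll the URL until it answers or the attempts run out.
# $1 = URL, $2 = number of attempts (defaults to 30, an arbitrary value).
wait_for_jobserver() {
  url="$1"
  attempts="${2:-30}"
  while [ "$attempts" -gt 0 ]; do
    # -f makes curl fail on HTTP errors; -s/-S silence progress but keep errors.
    if curl -fsS -o /dev/null "$url"; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 1
  done
  return 1
}
```

For example: `URL=$(jobserver_url "$(docker-machine ip)"); wait_for_jobserver "$URL" && open "$URL"`.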

Stop the container and Docker Machine when you're done.

➜  spark-jobserver git:(master) ✗ docker stop $CONTAINER_ID
aeccf6dea167

➜  spark-jobserver git:(master) ✗ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES

// Optionally remove old containers (docker rm without -f refuses to remove running ones)
➜  spark-jobserver git:(master) ✗ docker rm $(docker ps -aq)
d80930ad5a79
50b8edc7ae01
bd3df8303bf0
af64c43f4938
aeccf6dea167
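
`docker ps -aq` lists every container, running or not. A slightly narrower sketch selects only exited containers with the `status=exited` filter and skips the `docker rm` call when there is nothing to remove. `remove_exited_containers` is a hypothetical helper name, and containers in other non-running states (such as `created`) are deliberately left alone here.

```shell
# Hypothetical helper: remove only containers whose status is "exited".
remove_exited_containers() {
  ids=$(docker ps -aq --filter "status=exited")
  if [ -z "$ids" ]; then
    echo "no exited containers"
    return 0
  fi
  # Word splitting is intended: each ID becomes its own argument.
  docker rm $ids
}
```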

➜  spark-jobserver git:(master) ✗ docker-machine stop
Stopping "default"...
Machine "default" was stopped.

Caveats

When you see the following error:

[info] Sending build context to Docker daemon 557.1 kB
[info] Sending build context to Docker daemon 1.114 MB
[info] Error response from daemon: client is newer than server (client API version: 1.23, server API version: 1.22)

upgrade your machine with docker-machine upgrade:

➜  spark-jobserver git:(master) ✗ docker-machine upgrade
Waiting for SSH to be available...
Detecting the provisioner...
Upgrading docker...
Stopping machine to do the upgrade...
Upgrading machine "default"...
Default Boot2Docker ISO is out-of-date, downloading the latest release...
Latest release for github.com/boot2docker/boot2docker is v1.11.0
Downloading /Users/jacek/.docker/machine/cache/boot2docker.iso from https://github.com/boot2docker/boot2docker/releases/download/v1.11.0/boot2docker.iso...
0%....10%....20%....30%....40%....50%....60%....70%....80%....90%....100%
Copying /Users/jacek/.docker/machine/cache/boot2docker.iso to /Users/jacek/.docker/machine/machines/default/boot2docker.iso...
Starting machine back up...
(default) Check network to re-create if needed...
(default) Waiting for an IP...
Restarting docker...
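
The mismatch can also be detected from the build output rather than by eye. A minimal sketch, assuming the build log was saved to a file: `needs_machine_upgrade` is a hypothetical helper that reads text on stdin and matches the error message quoted above.

```shell
# Hypothetical helper: read build output on stdin and succeed when it
# contains the client/server API mismatch error.
needs_machine_upgrade() {
  grep -q "client is newer than server"
}
```

For example, after `sbt docker 2>&1 | tee build.log` (the file name is an assumption), run `needs_machine_upgrade < build.log && docker-machine upgrade`.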

Further reading
