So, it comes to the point where you can deploy the cool #nlproc / #neuralempty tech you've built, and there's this Docker thing that everyone is telling you to use so that installing the libraries/tools you need for your tech is less painful...
What is Docker?
And now, "dockerize"... (Wave wand at the terminal)
Before continuing, take a look at these instructions:
If any of the above works for you, you can skip to here ;P
Install Docker on Ubuntu 14.04 (the TL;DR way)
Remove older versions of Docker
sudo apt-get remove docker docker-engine
Update your distro
sudo apt-get update
sudo apt-get install linux-image-extra-$(uname -r) linux-image-extra-virtual
Fetch the install script with wget and run it
wget -qO- https://get.docker.com/ | sudo sh
Add your user to the docker group
sudo usermod -aG docker $(whoami)
Simulate a logout + login to "activate" the group membership
su - $USER
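If you're not sure whether the group membership is really active in your current shell, here's a minimal sketch of a check. The function name docker_group_active is made up for this post; it relies on id -nG (called without a username) reporting the groups of the current session rather than what's written in /etc/group.

```shell
#!/bin/sh
# Succeeds (exit status 0) if the *current session* has the "docker" group.
# Note: `id -nG` without a username reports the groups of the running shell,
# which is what matters after `usermod -aG`.
docker_group_active() {
    id -nG | tr ' ' '\n' | grep -qx 'docker'
}

if docker_group_active; then
    echo "docker group is active in this shell"
else
    echo "docker group is not active yet; log out and back in, or run: su - \$USER"
fi
```

If it reports that the group is not active, the su - $USER trick above should sort it out.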
(Optional): Install Docker Compose
sudo apt-get -y install python-pip
sudo pip install docker-compose
Like any new programming language, deep learning framework or anything else that you have to write code for, here's Docker's version of Hello World:
docker run hello-world
BTW, I think we need some sort of Hello World for NLP.
You should see something like this:
$ docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
78445dd45222: Pull complete
Digest: sha256:c5515758d4c5e1e838e9cd307f6c6a0d620b5e07e6f927b07d05f6d12a1ac8d7
Status: Downloaded newer image for hello-world:latest

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://cloud.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/engine/userguide/
The True Poison
Now, to sell you the real poison....
Moses is great for Machine Translation (MT) and surely more deployable than any #NeuralEmpty tools. Although installation has come a long way from offensively obtuse to simply bjam, there's still no reason to re-type the installation commands every time you deploy Moses on a new system/machine.
Ulrich Germann showed how it can be done easily on http://lectures.ms.mff.cuni.cz/view.php?rec=291.
Slides on MT Marathon 2015 Wiki, user and password is both
TL;DR, just give me the docker file already: https://gist.github.com/alvations/f2727a6331e4a48c5a1905e47ef5c5f3
And the image on Docker hub: https://hub.docker.com/r/alvations/momo/
$ docker pull alvations/momo
$ docker run -it alvations/momo bash
Below, we will walk through how the Docker container was created.
Let's Dockerize Moses
I'll choose the latest Ubuntu distro, but feel free to choose a distro of your choice.
To find the name(s) of the available Docker images, you can use the docker search command, e.g.
$ docker search ubuntu   # Look for "ubuntu"
$ docker search centos   # Look for "centos"
$ docker search windows  # Look for "windows"
If you see a warning like:
Warning: failed to get default registry endpoint from daemon (Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?). Using system default: https://index.docker.io/v1/
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
According to http://stackoverflow.com/a/33596140/610569, do this:
sudo usermod -aG docker $(whoami)
then log out and log back in. You can use the same su - $USER trick.
On Mac OSX
(Just in case you're on a mac, though this is a guide for Linux)
$ docker-machine start                    # Start virtual machine for docker
$ docker-machine env                      # It helps to get the environment variables
$ eval "$(docker-machine env default)"    # Set environment variables
Start Ubuntu in Docker
Start an empty Ubuntu image:
docker run -it ubuntu bash
You will be "teleported" to a bash shell inside an Ubuntu container. First, we update and upgrade the distro.
apt-get update
apt-get install -y apt-utils debconf-utils
echo 'debconf debconf/frontend select Noninteractive' | debconf-set-selections
apt-get update
apt-get -y upgrade
Possibly, you might see something like this at the end of the upgrade:
Setting up makedev (2.3.1-93ubuntu2~ubuntu16.04.1) ...
mv: cannot move 'console-' to 'console': Device or resource busy
makedev console c 5 1 root tty 0600: failed
You can just ignore it. For more details, see http://stackoverflow.com/questions/43269412/device-or-resource-busy-docker
For the sake of future sanity, let's install sudo, nano, Perl and Python:
apt-get install -y sudo nano
apt-get install -y perl
apt-get install -y python-dev python3-dev python-pip python3-pip
We'll also install some common Unix tools:
apt-get install -y curl wget tar dtrx
Now, let's install Moses' dependencies:
apt-get install -y libboost-all-dev
apt-get install -y build-essential git-core pkg-config automake libtool wget zlib1g-dev python-dev libbz2-dev
apt-get install -y cmake
Setup a user account
Still in the Docker container after the apt-get -y upgrade, we create a sudo user that we'll be using from now on. Let's use the username ubiwan with password mosesdocker:
useradd -m -s /bin/bash ubiwan          # create the user with a home directory
echo 'ubiwan:mosesdocker' | chpasswd    # set the password (useradd -p expects an encrypted hash, not plaintext)
usermod -aG sudo ubiwan                 # add user to sudo list
su - ubiwan                             # login to the ubiwan user
From here on, we'll use the ubiwan user to install Moses.
Still in the Docker container, after logging in to ubiwan, we continue with the Moses installation:
cd $HOME
git clone https://github.com/moses-smt/mosesdecoder.git
cd mosesdecoder
make -f contrib/Makefiles/install-dependencies.gmake
./compile.sh --max-kenlm-order=20 --max-factors=1000
cd $HOME
Now let's install MGIZA++ (the word aligner):
cd $HOME
git clone https://github.com/moses-smt/mgiza.git
cd mgiza/mgizapp
cmake .
make
make install
cp scripts/merge_alignment.py bin/
cd $HOME
We know that mkcls is EXTREMELY slow, so let's replace it with @jonsafari's clustercat:
cd $HOME
git clone https://github.com/jonsafari/clustercat.git
cd clustercat
make -j 4
cd $HOME
Finally, we create a directory to keep all the external binaries that Moses will need:
cd $HOME
mkdir moses-training-tools
cp mgiza/mgizapp/bin/* moses-training-tools/
cp clustercat/bin/clustercat moses-training-tools/
cp clustercat/bin/mkcls moses-training-tools/mkcls-clustercat
mv moses-training-tools/mkcls moses-training-tools/mkcls-original
cp moses-training-tools/mkcls-clustercat moses-training-tools/mkcls
cd $HOME
moses-training-tools should contain these files:
$ ls moses-training-tools/
clustercat  d4norm  hmmnorm  mgiza  mkcls  mkcls-clustercat  mkcls-original  plain2snt  snt2cooc  snt2coocrmp  snt2plain  symal
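Rather than eyeballing the listing, you could check it with a few lines of shell. This is only a sketch: the function name check_training_tools is made up here, and the list of tool names is simply copied from the ls output above.

```shell
#!/bin/sh
# Print "missing: <name>" for every expected tool that is absent or not
# executable in the given directory. Exit status is 0 only if all are present.
check_training_tools() {
    tools_dir="$1"
    missing=0
    for tool in clustercat d4norm hmmnorm mgiza mkcls mkcls-clustercat \
                mkcls-original plain2snt snt2cooc snt2coocrmp snt2plain symal; do
        if [ ! -x "$tools_dir/$tool" ]; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call inside the container:
#   check_training_tools "$HOME/moses-training-tools" && echo "all tools present"
```

If something is reported as missing, the usual suspect is a build step above that failed quietly.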
(Optional): Delete the source and minimize the Docker container size:
rm -rf mgiza/
rm -rf clustercat/
strip mosesdecoder/bin/* mosesdecoder/lib/* moses-training-tools/*
Let's exit from everywhere and out of the Docker container:
exit  # out of the ubiwan user
exit  # out of the Docker container
Create a Docker Hub account and repo
Now, we're ready to "save" our image.
First let's create a Docker Hub account. Go to Docker Hub and sign up:
Follow the instructions on https://docs.docker.com/engine/getstarted/step_five/ and create a repo named my-momo.
You should now find the
my-momo repository on the url with your username, something like:
<username> is your username (without the angle brackets).
Find the image ID and push to Docker Hub
To push the image we've created into the my-momo repo on your Docker Hub account, we must first commit the container we just exited to an image, and we must give that image a name. Let's call it momo:
# Commit the last-created container to an image named momo
docker commit $(docker ps -q -l) momo
Then, we tag the momo image to the my-momo repo:
# Tag *momo* image to *my-momo* repo in the Docker Hub
docker tag momo <username>/my-momo
Finally, we push the image into the my-momo repo:
# Push to *my-momo*
docker push <username>/my-momo
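If you end up doing the commit, tag and push dance more than once, the three commands can be wrapped in one function. A minimal sketch, not an official recipe: the name publish_last_container and its three arguments are my own invention, and it assumes you have already authenticated with docker login (a step not shown in this post).

```shell
#!/bin/sh
# Commit the most recently created container to a local image, tag that image
# for a Docker Hub repo, and push it.
# Usage: publish_last_container <local-image-name> <dockerhub-username> <repo-name>
publish_last_container() {
    image="$1"
    hub_user="$2"
    repo="$3"
    container_id="$(docker ps -q -l)"   # ID of the last created container
    if [ -z "$container_id" ]; then
        echo "no container found to commit" >&2
        return 1
    fi
    # Stop at the first step that fails.
    docker commit "$container_id" "$image" &&
        docker tag "$image" "$hub_user/$repo" &&
        docker push "$hub_user/$repo"
}

# Example call (replace <username> with your Docker Hub username):
#   publish_last_container momo "<username>" my-momo
```

Chaining with && means a failed commit never results in pushing a stale image.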
Voila, now you have the Moses Docker image on your Docker Hub repo at
Run the Docker container on another machine
With Docker installed on the new machine and your my-momo repo, you can simply do this to get Moses running:
docker pull <username>/my-momo
docker run -it <username>/my-momo bash
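To train anything you'll want your corpus visible inside the container. One way (an assumption on my part, not something the steps above set up) is to bind-mount a host directory with docker run -v. The function name run_with_data and both paths in the example are hypothetical. The host path should be absolute, otherwise Docker treats it as a named volume.

```shell
#!/bin/sh
# Start an interactive bash in the given image with a host directory
# bind-mounted at the given path inside the container.
# Usage: run_with_data <image> <absolute-host-dir> <container-dir>
run_with_data() {
    image="$1"
    host_dir="$2"
    container_dir="$3"
    docker run -it -v "$host_dir:$container_dir" "$image" bash
}

# Example call (hypothetical paths):
#   run_with_data "<username>/my-momo" "$PWD/corpus" /home/ubiwan/corpus
```

Since the committed image does not set a default user, the shell most likely starts as root, and su - ubiwan gets you back to the account that owns the Moses build.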
Now you can train Moses models, decode and have fun!!!
Do read up on using a Dockerfile to install the tools/software you need in the Docker image too. It'll help a lot to automate the above steps. See Dockerfile tutorial
If you've made it up to this point, you deserve this. (Or perhaps, you've just pressed the TL;DR link on the content page that leads to here -_-||| )
Without following any of the steps you can simply use the pre-prepared Moses Docker image I've created:
docker pull alvations/momo
docker run -it alvations/momo bash
Or if you would like to use/modify the Dockerfile: https://gist.github.com/alvations/f2727a6331e4a48c5a1905e47ef5c5f3
And to build the image from the Dockerfile:
wget https://gist.githubusercontent.com/alvations/f2727a6331e4a48c5a1905e47ef5c5f3/raw/7d566e3ae03443ae9b20bbf2da1dadf9c649e958/momo.dock -O momo.dockerfile
docker build -t momo - < momo.dockerfile
Cannot connect to Docker daemon
If you encounter an issue of
cannot connect to Docker daemon:
username@server:~/momodocker$ docker pull alvations/momo
Using default tag: latest
Warning: failed to get default registry endpoint from daemon (Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?). Using system default: https://index.docker.io/v1/
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
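Before trying fixes at random, it helps to know whether the client is missing or the daemon is simply unreachable. A small sketch of that check; the function name docker_status is made up, and it relies on docker info returning a non-zero exit status when it cannot talk to the daemon.

```shell
#!/bin/sh
# Report whether the docker client exists and whether it can reach the daemon.
docker_status() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker client not found"
    elif docker info >/dev/null 2>&1; then
        echo "daemon reachable"
    else
        echo "daemon not reachable"
    fi
}

docker_status
```

If it says the daemon is not reachable on Linux, the docker group fix from earlier in this post is the first thing to try.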
Error checking TLS connection: Host is not running
If you see the error above, running
username@server:~/momodocker$ eval "$(docker-machine env default)"
will sometimes fix it. Other times, the next error is:
Error checking TLS connection: Host is not running
TL;DR, try this:
$ docker-machine rm default
About to remove default
WARNING: This action will delete both local reference and remote instance.
Are you sure? (y/n): y
Successfully removed default
username@server:~/momodocker$ docker-machine create default --driver virtualbox
Running pre-create checks...
(default) Default Boot2Docker ISO is out-of-date, downloading the latest release...
(default) Latest release for github.com/boot2docker/boot2docker is v17.04.0-ce
(default) Downloading /Users/liling.tan/.docker/machine/cache/boot2docker.iso from https://github.com/boot2docker/boot2docker/releases/download/v17.04.0-ce/boot2docker.iso...
(default) Creating VirtualBox VM...
(default) Creating SSH key...
(default) Starting the VM...
(default) Check network to re-create if needed...
(default) Waiting for an IP...
Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
Detecting the provisioner...
Provisioning with boot2docker...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
Checking connection to Docker...
Docker is up and running!
To see how to connect your Docker Client to the Docker Engine running on this virtual machine, run: docker-machine env default
How to continue a Docker container that has exited?
When we started a container with docker run -it ubuntu bash, Docker automatically assigned a container ID to it.
To locate the container ID, we look up the container that was created last:
docker ps -q -l
-q: list only container IDs
-l: list only last created container
The command above will print something like
e19f19d62ffd to the terminal.
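With that ID in hand, the exited container can be started again and attached to your terminal. A minimal sketch using docker start with its -a (attach) and -i (interactive) flags; the function name resume_last_container is made up for this post.

```shell
#!/bin/sh
# Restart the most recently created container and attach to it interactively.
resume_last_container() {
    container_id="$(docker ps -q -l)"
    if [ -z "$container_id" ]; then
        echo "no container to resume" >&2
        return 1
    fi
    docker start -a -i "$container_id"
}

# Example call:
#   resume_last_container
```

You should land back in the same bash session's filesystem, with everything you installed before exiting still in place.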