We now provide a script that sets up the Bösen and STRADS systems on a single machine with just one command. After setup, you can run two demo applications to verify that they are working. If you need a larger deployment or prefer a more detailed hands-on experience, please refer to this full installation guide. Also check out Poseidon, Petuum's distributed multi-GPU deep learning framework.
Before starting, run the following commands to prepare the necessary environment. If you do not have sudo privileges, please ask your administrator for help. Once these are ready, you can run Petuum with or without sudo.
sudo apt-get -y update && sudo apt-get -y install g++ make autoconf git \
libtool uuid-dev openssh-server cmake libopenmpi-dev openmpi-bin libssl-dev \
libnuma-dev python-dev python-numpy python-scipy python-yaml protobuf-compiler \
subversion libxml2-dev libxslt-dev zlibc zlib1g zlib1g-dev libbz2-1.0 libbz2-dev
If you have sudo privilege, run the following command to install Petuum's dependencies.
sudo apt-get -y install libgoogle-glog-dev libzmq3-dev libyaml-cpp-dev \
libgoogle-perftools-dev libsnappy-dev libsparsehash-dev
Then run the setup command; it takes approximately 10 minutes to set up Petuum on a 2-core machine.
python petuum.py setup
The script enables passwordless SSH to localhost using the default id_rsa.pub key, generating one if none exists. It then downloads and compiles Petuum's source code and its customized dependencies.
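For reference, the passwordless-SSH step can be reproduced by hand roughly as follows. This is a sketch of what the setup script automates, not its actual code; the key type and paths are assumptions.

```shell
# Sketch: enable passwordless SSH to localhost with a default id_rsa key.
mkdir -p ~/.ssh && chmod 700 ~/.ssh
# Generate a default key pair only if one does not already exist.
if [ ! -f ~/.ssh/id_rsa.pub ] && command -v ssh-keygen >/dev/null; then
  ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
fi
# Authorize the public key for local logins, avoiding duplicate entries.
if [ -f ~/.ssh/id_rsa.pub ]; then
  touch ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys
  grep -qxF "$(cat ~/.ssh/id_rsa.pub)" ~/.ssh/authorized_keys \
    || cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
fi
```

You can check the result with `ssh localhost true`, which should return without a password prompt (assuming sshd is running).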
After compilation, run the Multi-class Logistic Regression demo (on the Bösen system) with
python petuum.py run_mlr
The app launches locally and trains a multi-class logistic regression model on a subset of the Covertype dataset. You should see output like the following; the numbers will differ slightly from run to run because the multi-threaded execution is nondeterministic.
40 400 0.253846 0.61287 520 0.180000 50 7.43618
I0701 00:35:00.550900 9086 mlr_engine.cpp:298] Final eval: 40 400 train-0-1: 0.253846 train-entropy: 0.61287 num-train-used: 520 test-0-1: 0.180000 num-test-used: 50 time: 7.43618
I0701 00:35:00.551867 9086 mlr_engine.cpp:425] Loss up to 40 (exclusive) is saved to /home/ubuntu/petuum/app/mlr/out.loss in 0.000955387
I0701 00:35:00.552652 9086 mlr_sgd_solver.cpp:160] Saved weight to /home/ubuntu/petuum/app/mlr/out.weight
I0701 00:35:00.553907 9031 mlr_main.cpp:150] MLR finished and shut down!
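If you want to post-process such logs, the metrics on the "Final eval" line can be extracted with a small regex. This is a standalone sketch, not part of Petuum; it assumes only the `key: value` layout shown in the log line above.

```python
# Parse the metrics out of an MLR "Final eval" log line (sketch, not Petuum code).
import re

line = ("I0701 00:35:00.550900 9086 mlr_engine.cpp:298] Final eval: 40 400 "
        "train-0-1: 0.253846 train-entropy: 0.61287 num-train-used: 520 "
        "test-0-1: 0.180000 num-test-used: 50 time: 7.43618")

# Match every "name: number" pair; hyphenated metric names are allowed.
metrics = {k: float(v) for k, v in re.findall(r"([\w-]+): ([\d.]+)", line)}
print(metrics["test-0-1"])  # -> 0.18 (the test-set 0-1 error)
```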
To run the MedLDA supervised topic model (on the STRADS system), run
python petuum.py run_lda
The app launches 3 workers locally and trains on the 20 Newsgroups dataset. You will see output like the following. Once all workers have reported "Ready to exit program", you can press Ctrl-C to terminate the program.
......
Rank (2) Ready to exit program from main function in ldall.cpp
I1222 20:38:31.271615 2687 trainer.cpp:464] (rank:0) Dict written into /tmp/dump_dict
I1222 20:38:31.271632 2687 trainer.cpp:465] (rank:0) Total num of words: 53485
I1222 20:38:46.930896 2687 trainer.cpp:487] (rank:0) Model written into /tmp/dump_model
Rank (0) Ready to exit program from main function in ldall.cpp
Use the following command to display the top 10 words in each of the topics just generated.
python petuum.py display_topics
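Ranking words within each topic by their counts is the usual way such a listing is produced. The sketch below illustrates that ranking on made-up data; the actual format of /tmp/dump_model is internal to STRADS and is not assumed here.

```python
# Illustration only: rank the top words per topic from a word-topic count map.
# The topics and counts below are toy values, not output from the demo.
from collections import Counter

topic_word_counts = {
    0: Counter({"space": 40, "nasa": 35, "orbit": 20, "the": 5}),
    1: Counter({"hockey": 50, "team": 30, "game": 25, "the": 6}),
}

for topic, counts in topic_word_counts.items():
    top = [word for word, _ in counts.most_common(10)]
    print(f"topic {topic}: {' '.join(top)}")
# topic 0: space nasa orbit the
# topic 1: hockey team game the
```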
If you don't have sudo privileges, run the setup command with the --no-sudo argument. Instead of installing dependencies through sudo apt-get, the script will compile and install them in its local folder. This setup process takes about 20 minutes.
python petuum.py setup --no-sudo
Then you can run Petuum's demo applications as stated above.