@jmuhlich
Last active May 14, 2022 21:03
Notes on StarCluster and EC2

Instructions for running StarCluster on EC2 instances

Install StarCluster locally

Configure software on StarCluster

  • Start cluster and SSH into root node:

    starcluster start mycluster
    starcluster sshmaster mycluster
    
  • The StarCluster AMI includes numpy, scipy, matplotlib, ipython, pip, unzip, git, libmpich2, mpich2.

  • Change directory to /home/sgeadmin so that any downloaded software goes there. Only files under /home are visible to all nodes, since StarCluster NFS-shares /home across the cluster:

    cd /home/sgeadmin
    
  • The mpi4py package included with the StarCluster AMI was compiled against the wrong MPI library (mpich instead of openmpi). To fix this, it needs to be uninstalled and explicitly compiled against openmpi:

    pip uninstall -y mpi4py
    update-alternatives --set mpi /usr/lib/openmpi/include
    pip install mpi4py
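
    To confirm the rebuilt mpi4py actually links against openmpi, a quick sanity check (run on the master node after the steps above) is:

```shell
# Launch two ranks; this should print 0 and 1 with no MPI library errors.
# If mpi4py were still linked against the wrong MPI, mpirun would fail
# or every rank would report rank 0.
mpirun -np 2 python -c 'from mpi4py import MPI; print(MPI.COMM_WORLD.Get_rank())'
```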
    

Automating cluster configuration with a script

The above steps (and any other custom configuration or dependency setup for your specific project) can be put into a script, which is copied to the cluster and then run, as in the following:

starcluster put mycluster provision_cluster.sh /home/sgeadmin
starcluster sshmaster mycluster '/home/sgeadmin/provision_cluster.sh'
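A minimal provision_cluster.sh covering the mpi4py fix above might look like this (a sketch; extend it with your own project's dependencies):

```shell
#!/bin/bash
# provision_cluster.sh -- run on the master node after "starcluster put"
set -e  # stop at the first failed command

cd /home/sgeadmin

# Rebuild mpi4py against openmpi (see the steps above)
pip uninstall -y mpi4py
update-alternatives --set mpi /usr/lib/openmpi/include
pip install mpi4py

# Project-specific setup goes here, e.g.:
# git clone <your project repository>
```

Make the script executable (chmod +x provision_cluster.sh) before uploading it, or invoke it on the master node as 'bash /home/sgeadmin/provision_cluster.sh'.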

Configuring SGE with MPI

It may be preferable for MPI jobs to allocate slots in a fill-up fashion (assign all slots on a given machine before moving to the next) rather than round-robin (assign one slot from each machine in turn). To check the current allocation method, run the following and inspect the allocation_rule field ($fill_up or $round_robin):

qconf -sp orte
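To switch the orte parallel environment to fill-up allocation non-interactively, one approach (a sketch using standard qconf flags; run on the master node) is:

```shell
# Dump the current PE definition, change the allocation rule, and reload it
qconf -sp orte > /tmp/orte.pe
sed -i 's/^allocation_rule.*/allocation_rule    $fill_up/' /tmp/orte.pe
qconf -Mp /tmp/orte.pe
```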

Parallel jobs can be submitted through SGE as follows:

qsub -b y -cwd -pe orte 24 mpirun ./mpi-executable arg1 arg2 [...]
  • -b y specifies that the executable is a binary.
  • -cwd executes the job from the current working directory.
  • -pe orte 24 specifies the name of the parallel environment (orte) and the number of slots to allocate (24).

Performance notes

EC2 vCPUs are equivalent to hyperthreads, so if your workload needs full cores (e.g. FPU-heavy code) you should schedule half as many jobs per instance as there are vCPUs.

If you are allocating one CPU per worker, and your code uses OpenBLAS, you should disable OpenBLAS's automatic threading by exporting OMP_NUM_THREADS=1 in your environment.
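
For example, the environment variable can be passed into each job at submission time with qsub's -v flag (a sketch, reusing the hypothetical mpi-executable from the example above):

```shell
# Disable OpenBLAS/OpenMP threading inside each MPI rank
qsub -b y -cwd -pe orte 24 -v OMP_NUM_THREADS=1 mpirun ./mpi-executable arg1 arg2
```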
