@ChristopherHogan
Last active December 5, 2018 21:38
Setting up cfncluster to run Meep
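#!/bin/bash
# install_conda_meep.sh: cfncluster post_install script.
# Install Miniconda into /opt/miniconda (-b: batch mode, -p: install prefix).
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
chmod +x miniconda.sh
./miniconda.sh -b -p /opt/miniconda
export PATH=/opt/miniconda/bin:$PATH
# Create the "mp" environment with parallel PyMeep and boto3.
# -y skips the confirmation prompt, since post_install scripts run unattended.
conda create -y -n mp -c chogan/label/dev -c chogan -c conda-forge pymeep-parallel boto3
chown -R ec2-user:ec2-user /opt/miniconda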
function fix_boot_disable_ht() {
    # Persist the setting across reboots by capping the CPUs the kernel brings up.
    # Relies on $total_cores, which disable_ht sets before calling this function.
    echo "${0} Updating kernel line"
    if [[ -x /sbin/grubby ]] ; then
        /sbin/grubby --update-kernel=ALL --args=maxcpus=$total_cores
    fi
    if [ -e /etc/default/grub ]; then
        if grep -q maxcpus /etc/default/grub; then
            sed -i "s/maxcpus=[0-9]*/maxcpus=$total_cores/g" /etc/default/grub
        else
            sed -i "/^GRUB_CMDLINE_LINUX_DEFAULT=/ s/\"$/ maxcpus=$total_cores\"/" /etc/default/grub
            sed -i "/^GRUB_CMDLINE_LINUX=/ s/\"$/ maxcpus=$total_cores\"/" /etc/default/grub
        fi
        if [ -e /etc/grub2.cfg ]; then
            grub2-mkconfig > /etc/grub2.cfg
        fi
        if which update-grub; then
            update-grub
        fi
    fi
}
function disable_ht {
    echo "${0}: disabling HT"
    # First hardware thread of each physical core: these stay online.
    parent_cores=$(cat /sys/devices/system/cpu/cpu*/topology/thread_siblings_list | cut -d, -f1 | cut -d- -f1 | tr '-' '\n' | tr ',' '\n'| sort -un)
    # If there are no parents, HT is probably already disabled.
    if [ "$parent_cores" == "" ]; then
        parent_cores=$(cat /sys/devices/system/cpu/cpu*/topology/thread_siblings_list)
    fi
    # Number of physical cores: fix_boot_disable_ht passes this to the kernel as maxcpus.
    total_cores=$(cat /sys/devices/system/cpu/cpu*/topology/thread_siblings_list | cut -d, -f1 | cut -d- -f1 | tr '-' '\n' | tr ',' '\n'| sort -un | wc -l)
    # Remaining hardware threads of each core: these are taken offline.
    sibling_cores=$(cat /sys/devices/system/cpu/cpu*/topology/thread_siblings_list | cut -d, -f2- | cut -d- -f2- | tr '-' '\n' | tr ',' '\n'| sort -un)
    # Ensure enabled cores are enabled - starting at 1, cpu0 can't be changed
    for p in $parent_cores; do
        if [ "$p" -ne 0 ]; then
            echo 1 > /sys/devices/system/cpu/cpu${p}/online
        fi
    done
    if [ "$parent_cores" == "$sibling_cores" ]; then
        echo "Hyperthreading already disabled"
    else
        # Ensure disabled threads are actually disabled
        for s in $sibling_cores; do
            echo 0 > /sys/devices/system/cpu/cpu${s}/online
        done
    fi
    fix_boot_disable_ht
}

disable_ht

Meep cfncluster setup

  1. Install cfncluster locally. This process has only been tested with version 1.5.4. I prefer to use a conda environment for this.
$ conda create -n cfn python=3 pip
$ source activate cfn
$ pip install cfncluster==1.5.4
  1. Save the following configuration as ~/.cfncluster/config, replacing the <...> placeholders with your own values. I use the default base_os, so it is not set here.
[aws]
aws_access_key_id = <ID>
aws_secret_access_key = <KEY>
aws_region_name = us-east-1

[cluster default]
key_name = <KEY NAME>
vpc_settings = test
master_instance_type = t2.small
compute_instance_type = c4.large
initial_queue_size = 0
maintain_initial_size = false
max_queue_size = <DESIRED MAX>
cluster_type = spot
spot_price = 0.10
post_install = s3://hogan-fragment-stats/install_conda_meep.sh
ec2_iam_role = CfnClusterEC2Role
extra_json = {"cfncluster": {"cfn_scheduler_slots": "1"}}

[vpc test]
vpc_id = vpc-<XXXXXXXX>
master_subnet_id = subnet-<XXXXXXXX>

[global]
cluster_template = default
update_check = true
sanity_check = true

[aliases]
ssh = ssh -i <PATH TO KEY> {CFN_USER}@{MASTER_IP} {ARGS}

The extra_json setting tells the scheduler to allocate a single slot per EC2 compute instance, so each instance runs only one process. Change the "1" to "cores" to allocate one slot per physical core instead. By default, one slot is allocated per virtual CPU (hyperthread).
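
To see how the slot setting changes the number of instances a job needs, the arithmetic can be sketched in a few lines of shell. The helper name instances_needed is hypothetical (it is not part of cfncluster or the scheduler), and the sketch assumes a job asking for N slots needs roughly ceil(N / slots per instance) instances. The c4.large figures (2 vCPUs, 1 physical core) come from the EC2 instance specifications.

```bash
#!/bin/bash
# Hypothetical helper (not part of cfncluster): number of compute instances
# needed to satisfy a slot request, given the slots offered per instance.
instances_needed() {
    local slots_requested=$1
    local slots_per_instance=$2
    # Integer ceiling division.
    echo $(( (slots_requested + slots_per_instance - 1) / slots_per_instance ))
}

# A job submitted with "-pe mpi 8" on c4.large compute instances:
instances_needed 8 1   # cfn_scheduler_slots = "1": prints 8
instances_needed 8 1   # "cores" (c4.large has 1 physical core): prints 8
instances_needed 8 2   # default, one slot per vCPU: prints 4
```

With the configuration above, an 8-slot job therefore needs max_queue_size to be at least 8.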

The install_conda_meep.sh script is included in this gist, and post_install must point to a copy of it in an S3 location that the cluster can read. It runs on all nodes, where it installs the latest pymeep-parallel package into a conda environment at /opt/miniconda/envs/mp and then disables hyperthreading.
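
One way to check that hyperthreading really is off on a node is to read the same thread_siblings_list files the install script uses. The following is a minimal sketch, not part of the gist: count_cpu_list and max_threads_per_core are hypothetical helper names, and the sysfs root is taken as an optional argument so the functions can also be pointed at a copy of the directory.

```bash
#!/bin/bash
# Hypothetical helper: count the CPUs named in a kernel CPU list
# such as "0", "0,4" or "0-1".
count_cpu_list() {
    local list=$1 part lo hi count=0
    local IFS=,
    for part in $list; do
        if [[ $part == *-* ]]; then
            lo=${part%-*}
            hi=${part#*-}
            count=$(( count + hi - lo + 1 ))
        else
            count=$(( count + 1 ))
        fi
    done
    echo "$count"
}

# Hypothetical helper: print the largest number of hardware threads that
# share one physical core. Defaults to the live sysfs tree.
max_threads_per_core() {
    local cpu_root=${1:-/sys/devices/system/cpu}
    local f n max=0
    for f in "$cpu_root"/cpu[0-9]*/topology/thread_siblings_list; do
        [ -r "$f" ] || continue
        n=$(count_cpu_list "$(cat "$f")")
        if (( n > max )); then
            max=$n
        fi
    done
    echo "$max"
}
```

Running max_threads_per_core with no arguments on a compute node should print 1 once the post-install script has taken the sibling threads offline. A value of 2 would suggest hyperthreading is still active.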

  1. Create the cluster and ssh into the master node.
$ cfncluster create default
$ cfncluster ssh default
  1. Clone the Meep repo to the shared filesystem to run some tests
$ cd /shared
$ git clone https://github.com/stevengj/meep.git
  1. Submit a test job to make sure the cluster runs a process on each instance. See test.sh below.
$ qsub test.sh
# Make sure 8 nodes are created (it can take a while)
$ qhost

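Once the job finishes, look for 8 different host names in the output file test.sh.o1 (the numeric suffix is the job ID). test.sh requests 8 slots, so with one slot per instance max_queue_size must be at least 8.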
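
Counting the host names by eye is easy to get wrong, so here is a small sketch that does it for you. The helper name count_hosts is hypothetical, and it assumes the output file contains only the host names printed by the job. Because the job uses -j y, any warnings written to stderr land in the same file and would inflate the count.

```bash
#!/bin/bash
# Hypothetical helper: count the distinct non-empty lines (host names)
# in a job output file.
count_hosts() {
    local output_file=$1
    grep -v '^[[:space:]]*$' "$output_file" | sort -u | wc -l | tr -d ' '
}

# Usage on the master node, from the directory where qsub was run:
#   count_hosts test.sh.o1
```

A result of 8 means each process ran on its own instance.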

  1. Run some Meep scripts (see run_meep_test.sh below)
$ qsub run_meep_test.sh
# Use qhost to see instances and load.

run_meep_test.sh

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe mpi 8
#$ -S /bin/bash
/opt/miniconda/envs/mp/bin/mpirun -n 8 /opt/miniconda/envs/mp/bin/python /shared/meep/python/tests/simulation.py

test.sh

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe mpi 8
#$ -S /bin/bash
/opt/miniconda/envs/mp/bin/mpirun -n 8 hostname