
@sean-smith
Last active April 5, 2022 19:33
Create a desktop visualization queue with AWS ParallelCluster and NICE DCV

DCV Visualization Queue

When DCV is enabled, the default behaviour of AWS ParallelCluster is to run a single DCV session on the head node. This is a quick and easy way to visualize the results of your simulations or to run a desktop application such as StarCCM+.

A common ask is to run DCV sessions on a compute queue instead of the head node. This has several advantages, namely:

  1. Run multiple sessions on the same instance (possibly with a different user per session)
  2. Keep the head node small and only spin up the more expensive DCV instances when needed. The job script below sets a 12-hour time limit that automatically ends sessions after we leave.

Setup

  1. Create a security group that allows you to connect to the compute nodes on port 8443. We'll reference it below in the AdditionalSecurityGroups section of the dcv queue.

(screenshot: security group inbound rule allowing TCP 8443)
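If you prefer the CLI to the console, the security group can be created with something like the following sketch (the VPC ID, group name, and CIDR below are placeholders; substitute your own values):

```shell
# Hypothetical values -- replace with your VPC and the CIDR you connect from
VPC_ID=vpc-123456789
MY_CIDR=203.0.113.0/24

# Create the group and capture its ID for the AdditionalSecurityGroups section
SG_ID=$(aws ec2 create-security-group \
  --group-name dcv-8443 \
  --description "DCV access on port 8443" \
  --vpc-id "$VPC_ID" \
  --query GroupId --output text)

# Allow inbound TCP 8443 (the DCV port) from your network only
aws ec2 authorize-security-group-ingress \
  --group-id "$SG_ID" \
  --protocol tcp --port 8443 \
  --cidr "$MY_CIDR"

echo "$SG_ID"   # use this value under AdditionalSecurityGroups
```

Restricting the rule to your own CIDR keeps the exposed port narrow; the No-Ingress section below avoids opening the port entirely.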

  2. Create a cluster with a queue named dcv using instance type g4dn.xlarge.

The g4dn.xlarge is ideal for our remote desktop use case: it has 1 NVIDIA T4 GPU, 4 vCPUs, and 16 GB of memory. Given the number of vCPUs, we can start up to 4 sessions on it.

Region: us-east-1
Image:
  Os: alinux2
HeadNode:
  InstanceType: t2.micro
  Networking:
    SubnetId: subnet-123456789
  Ssh:
    KeyName: blah
  Iam:
    AdditionalIamPolicies:
      - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
Scheduling:
  Scheduler: slurm
  SlurmQueues:
    - Name: dcv
      ComputeResources:
        - Name: dcv-g4dnxlarge
          InstanceType: g4dn.xlarge
          MinCount: 0
          MaxCount: 4
      Networking:
        SubnetIds:
          - subnet-123456789
        AdditionalSecurityGroups:
          - sg-031b9cd973e8f62b0 # security group you created above
      Iam:
        AdditionalIamPolicies:
          - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
        S3Access:
          - BucketName: dcv-license.us-east-1 # needed for DCV license access; the bucket name matches the cluster region
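Assuming the configuration above is saved as cluster-config.yaml (a filename chosen here for illustration), the cluster can be created with the ParallelCluster v3 CLI:

```shell
# Create the cluster from the config above; the cluster name is arbitrary
pcluster create-cluster \
  --cluster-name dcv-cluster \
  --cluster-configuration cluster-config.yaml

# Check progress until clusterStatus reaches CREATE_COMPLETE
pcluster describe-cluster --cluster-name dcv-cluster
```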
  3. Create a file called desktop.sbatch with the following contents:
#!/bin/bash
#SBATCH -p dcv
#SBATCH -t 12:00:00
#SBATCH -J desktop
#SBATCH -o "%x-%j.out"

# magic command to disable lock screen
dbus-launch gsettings set org.gnome.desktop.session idle-delay 0 > /dev/null
# Set a password
password=$(openssl rand -base64 32)
echo $password | sudo passwd ec2-user --stdin > /dev/null
# start DCV server and create session
sudo systemctl start dcvserver
dcv create-session $SLURM_JOBID


ip=$(curl -s http://169.254.169.254/latest/meta-data/public-ipv4)
printf "\e[32mClick the following URL to connect:\e[0m"
printf "\n=> \e[34mhttps://$ip:8443?username=ec2-user&password=$password\e[0m\n"

# keep the job (and thus the DCV session) alive until scancel or the time limit
while true; do
   sleep 1
done
  4. Submit a job:
sbatch desktop.sbatch # note the job id
  5. Once the job starts running, check the output file desktop-[job-id].out for connection details:
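The connection details land in the Slurm output file named by the -o "%x-%j.out" directive. For example (123 is a hypothetical job id printed by sbatch):

```shell
# Wait for the job to reach the running state (ST column shows "R");
# this can take a few minutes while the g4dn.xlarge boots
squeue -j 123

# Then read the URL (and generated password) from the output file
cat desktop-123.out
```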

(screenshot: job output showing the connection URL)

No-Ingress DCV

An alternative to the above approach where we opened up the Security Group to allow traffic from port 8443 is to create a Port Forwarding Session with AWS SSM. This allows us to lock down the Security Group and have no ingress.

  1. Install the Session Manager plugin for the AWS CLI
  2. Submit a job using the following submit script:
#!/bin/bash
#SBATCH -p dcv
#SBATCH -t 12:00:00
#SBATCH -J desktop
#SBATCH -o "%x-%j.out"

# magic command to disable lock screen
dbus-launch gsettings set org.gnome.desktop.session idle-delay 0 > /dev/null
# Set a password
password=$(openssl rand -base64 32)
echo $password | sudo passwd ec2-user --stdin > /dev/null
# start DCV server and create session
sudo systemctl start dcvserver
dcv create-session $SLURM_JOBID

instance_id=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
printf "\e[32mFor a no-ingress cluster, you'll need to run the following command (on your local machine):\e[0m"
printf "\n=> \e[37m\taws ssm start-session --target $instance_id --document-name AWS-StartPortForwardingSession --parameters '{\"portNumber\":[\"8443\"],\"localPortNumber\":[\"8443\"]}'\e[0m\n"

printf "\n\n\e[32mThen click the following URL to connect:\e[0m"
printf "\n=> \e[34mhttps://localhost:8443?username=ec2-user&password=$password\e[0m\n"

# keep the job (and thus the DCV session) alive until scancel or the time limit
while true; do
   sleep 1
done
  3. Run the port forwarding command from the job output on your local machine:

(screenshot: output of the port forwarding command)

  4. Connect to the URL!
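Sessions end automatically at the 12-hour limit set by #SBATCH -t 12:00:00, but you can release the instance sooner by cancelling the job (again, 123 is a hypothetical job id):

```shell
# Ends the DCV session; with MinCount: 0 the idle node scales back down shortly after
scancel 123
```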

Multiple Sessions Per-Instance

By default, Slurm schedules each session job on 1 vCPU, so our g4dn.xlarge instance type can fit up to 4 concurrent sessions:
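The capacity math is simple: Slurm's default allocation is one CPU per job, so sessions per instance is just the instance's vCPU count divided by the CPUs each job requests.

```shell
VCPUS=4          # g4dn.xlarge
CPUS_PER_JOB=1   # Slurm default: one task, one CPU per task
MAX_SESSIONS=$((VCPUS / CPUS_PER_JOB))
echo "$MAX_SESSIONS sessions per instance"   # prints "4 sessions per instance"

# Submitting four jobs therefore fills one node before MaxCount launches a second:
#   for i in 1 2 3 4; do sbatch desktop.sbatch; done
```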

(screenshot: four desktop sessions running on a single node)
