John Blischak 2014-05-14
Multiple users have observed that submitting jobs via Snakemake requires much more memory than is necessary to run the command (e.g. mailing list post, Bitbucket issue).
To document and explore this issue, I have created a few scripts to reproduce the problem. Each job simply sleeps for a short, random period of time and then creates a file. For comparison, I run this 100 times using three methods:
- shell - submit the job directly via the Bash shell
- subprocess - submit the job to qsub via the Python subprocess module
- snakemake - submit the job via Snakemake
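As a rough illustration of the subprocess method, the sketch below builds a small job script and pipes it to qsub. It is not the contents of submit.py: the function names, the job name, the output file name, and the `max_sleep` parameter are all hypothetical, and it assumes qsub reads the job script from standard input when no script file is given.

```python
import random
import subprocess


def build_job_script(out_file, max_sleep=10):
    """Return a shell script that sleeps for a random time, then creates a file."""
    seconds = random.randint(1, max_sleep)
    return "#!/bin/bash\nsleep {}\ntouch {}\n".format(seconds, out_file)


def build_qsub_command(job_name):
    """Return the qsub argument list; the job script itself is fed on stdin."""
    # -cwd runs the job in the current directory, -N sets the job name.
    return ["qsub", "-cwd", "-N", job_name]


def submit(job_name, out_file):
    """Submit one job and return qsub's reply. Requires qsub on the PATH."""
    proc = subprocess.Popen(
        build_qsub_command(job_name),
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    stdout, _ = proc.communicate(build_job_script(out_file))
    return stdout
```

Calling `submit("subprocess-01", "subprocess-01.txt")` on the head node would queue a single job; looping over 100 names would mirror the comparison described above.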
The analysis can be run using the commands below. The argument passed to the scripts is an ID number for that particular run. The first script should be run on the head node from within this directory.
bash submit_all.sh 01
bash check_and_clean.sh 01
bash analyze.sh 01
The virtual memory usage is similar (and minimal) when using the shell or subprocess. However, virtual memory usage is more than an order of magnitude higher when using Snakemake (1.44G to 2.56G versus about 120M). Furthermore, the virtual memory usage for Snakemake fluctuates by over 1G between jobs, which is more than the total virtual memory used by either of the other two methods.
The commands run through shell had the following distribution of virtual memory usage (number of jobs, then virtual memory used):
- 14 119.848M
- 83 119.910M
- 3 119.926M
The commands run through subprocess had the following distribution of virtual memory usage:
- 9 119.848M
- 88 119.910M
- 3 119.926M
The commands run through snakemake had the following distribution of virtual memory usage:
- 82 1.440G
- 1 1.441G
- 2 2.558G
- 15 2.559G
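The comparison can be checked with a short sketch like the one below, which tallies `maxvmem` values from qacct-style text and converts them to megabytes. It is not the contents of analyze.sh: the helper names are hypothetical, and treating 1G as 1024M is an assumption about how the scheduler reports sizes.

```python
from collections import Counter

# Assumed unit sizes relative to one megabyte (1G = 1024M is an assumption).
UNITS = {"K": 1.0 / 1024, "M": 1.0, "G": 1024.0}


def to_megabytes(value):
    """Convert a size string such as '119.910M' or '1.440G' to megabytes."""
    return float(value[:-1]) * UNITS[value[-1]]


def tally_maxvmem(qacct_text):
    """Count how often each maxvmem value appears in qacct-style output."""
    counts = Counter()
    for line in qacct_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "maxvmem":
            counts[fields[1]] += 1
    return counts


# Ratio of the most common Snakemake value to the most common shell value.
ratio = to_megabytes("1.440G") / to_megabytes("119.910M")

# Spread between the smallest and largest Snakemake values, in megabytes.
spread = to_megabytes("2.559G") - to_megabytes("1.440G")
```

Under these assumptions, `ratio` comes out at a little over 12 and `spread` exceeds 1024M, which is consistent with the order-of-magnitude increase and the fluctuation of more than 1G reported here.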
I used the following software versions:
- Red Hat Enterprise Linux Server release 5.4
- Sun Grid Engine 6.2u3
- Python 3.3.4
- Snakemake 2.5.1
Files:
- analyze.sh - Retrieves the virtual memory usage via qacct
- check_and_clean.sh - Reports if the jobs finished and removes all files
- submit_all.sh - Runs all three methods (run from the head node)
- Snakefile - Creates file via Snakemake
- submit.py - Creates file via subprocess
- submit.sh - Creates file via shell