@sbamin
Last active September 22, 2021 16:11
snakemake notes gathered during Boston Snakemake Days 2021, https://koesterlab.github.io/bsd2021.html

Tips from Johannes Köster, author of Snakemake, on advanced use of snakemake

  • When defining conda environments, prefer channel priority in the order
bioconda > conda-forge > anything else
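  • In scripts run through the script directive, send diagnostic output to the rule's log file. In R:
        # Redirect both regular output and messages to the rule's log file
        log <- file(snakemake@log[[1]], open="wt")
        sink(log)
        sink(log, type="message")
    In Python:
        # Redirect stderr to the rule's log file
        import sys
        sys.stderr = open(snakemake.log[0], "w")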
  • snakemake can use input functions: plain Python functions that take the wildcards (as determined from the output) and return input files. Similar functions work in the params section, including Python lambda functions. See details here. If an input function returns more than one file, it can return a dictionary of key-value pairs, which the snakemake unpack function turns into named inputs; details here.
  • In the current version, 6.8.0, snakemake will not rerun the entire workflow if you add input files in the configfile (e.g., more fastqs for the first rule) while the final calls/all.vcf is already present, as in here. To pick up such changes, use --list-input-changes, which lists the output files affected by changed inputs; that list can be passed to --forcerun (-R), e.g., snakemake -n -R $(snakemake --list-input-changes). An upcoming release is expected to introduce snakemake --rerun-changes to rerun the workflow when one or more input files change.
  • snakemake has experimental support for logging via Slack. Ref.: snakemake logging and engenegr/log_handler_slack.py
  • snakemake supports scatter-gather, similar in logic to HPC job arrays. Details here. Also, check the dna-seq-varlociraptor workflow.
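
The input-function and unpack notes above can be sketched in plain Python as below. This is a minimal illustration, not code from the talk: the sample table, file paths, rule name, and the helpers get_fastqs and sample_label are all hypothetical, and the Snakefile usage is shown only as a comment because rule syntax is not plain Python.

```python
from types import SimpleNamespace

# Hypothetical sample table; in a real workflow this would usually be
# built from the configfile or a sample sheet.
SAMPLES = {
    "A": {"r1": "reads/A_R1.fastq.gz", "r2": "reads/A_R2.fastq.gz"},
    "B": {"r1": "reads/B_R1.fastq.gz", "r2": "reads/B_R2.fastq.gz"},
}


def get_fastqs(wildcards):
    """Input function: return a dict of named input files for one sample."""
    return SAMPLES[wildcards.sample]


# A lambda works the same way and is handy for short params entries.
sample_label = lambda wildcards: f"sample-{wildcards.sample}"

# Assumed usage inside a Snakefile (Snakemake syntax, shown as a comment):
#
# rule merge_reads:
#     input:
#         unpack(get_fastqs)        # dict keys become input.r1 and input.r2
#     output:
#         "merged/{sample}.fastq.gz"
#     params:
#         label=sample_label
#     shell:
#         "echo {params.label}; cat {input.r1} {input.r2} > {output}"

# Outside Snakemake, a stand-in wildcards object is enough to try the functions.
wc = SimpleNamespace(sample="A")
fastqs = get_fastqs(wc)
print(fastqs["r1"], fastqs["r2"])
print(sample_label(wc))
```

Returning a dictionary and wrapping the function in unpack lets the shell command refer to inputs by name rather than by position, which tends to stay readable as the number of input files grows.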