chrisamiller/cromwell_workflows.md

## cromwell_workflows.md

There are three inputs you need to run a workflow:

1. A .cwl file that contains the steps to be run
2. A .yaml file that gives the inputs to that CWL
3. A config file that tells cromwell about its environment, how to submit jobs to the cluster, and where to stick the results

Let's start with #3, the config file. I've made this easy for you. Create a directory where you want to run things, then inside of it, run the following command:

```
/storage1/fs1/timley/Active/aml_ppg/src/utilities/create_cromwell_config -o cromwell.config -l logs -d output -q timley -G compute-timley
```

If you can't access that script, there is a copy here set up with values appropriate for our cluster: create_cromwell_config (as of Feb 2021)

Number 1 is also straightforward: check out the analysis-workflows repo from GitHub and find the path to the workflow in question. It might be something like ~/git/analysis-workflows/definitions/subworkflows/merge_readcounts_somatic.cwl

Number 2 is the part you need to create. It needs to have values for all of the required inputs in that CWL file. Often, you can find one of these to use as a template, either in the example_data directory or by asking in Slack. Swap in paths to your files as needed.
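
If you don't have a template handy, here is a rough sketch of what an inputs file looks like. The input names (`tumor_bam`, `sample_name`) and the paths are made up for illustration; the real keys have to match the `inputs` section of whichever CWL you are running. The only part that is standard CWL is the `class: File` / `path:` form for file inputs.

```shell
# Write a minimal CWL inputs file.
# Every key and path below is a hypothetical placeholder, not a real workflow input.
cat > example_inputs.yaml <<'EOF'
# File inputs use the CWL "class: File" form
tumor_bam:
  class: File
  path: /path/to/tumor.bam
# Plain values (strings, numbers) are given directly
sample_name: my_sample
EOF
```

Open the CWL file and check its `inputs` block to see which names are required and which have defaults you can leave out.
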

Then, once you've got it all set up, hop into a gsub container (which has cromwell installed):

```
gsub -m 6
```

We use -m 6 to give it a little extra memory.
Then run your command like this:
```
/usr/bin/java -Dconfig.file=$CONFIG -jar /opt/cromwell.jar run -t cwl -i $YAMLFILE $CWLFILE >cromwell.log
```
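
The command above assumes `$CONFIG`, `$YAMLFILE`, and `$CWLFILE` are already set. As a sketch, you could set them and check that all three files exist before launching. The paths here are examples only, `check_inputs` is a helper made up for this note, and the added `2>&1` (which also sends error output to the log) is my assumption about what you'd want, not part of the original command.

```shell
# Example paths only; substitute your own.
CONFIG="$PWD/cromwell.config"
YAMLFILE="$PWD/inputs.yaml"
CWLFILE="$HOME/git/analysis-workflows/definitions/subworkflows/merge_readcounts_somatic.cwl"

# Hypothetical helper: returns 0 if every argument is an existing file, 1 otherwise.
check_inputs() {
    _missing=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing input: $f" >&2
            _missing=1
        fi
    done
    return "$_missing"
}

# Only launch cromwell if all three inputs are present.
if check_inputs "$CONFIG" "$YAMLFILE" "$CWLFILE"; then
    /usr/bin/java -Dconfig.file="$CONFIG" -jar /opt/cromwell.jar \
        run -t cwl -i "$YAMLFILE" "$CWLFILE" > cromwell.log 2>&1
fi
```

Quoting the variables keeps the command intact if any path contains spaces.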

Give it a try with one case and see if you can get it running. The logs are somewhat cryptic, so if you hit problems, let me know.
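
As a first pass at a cryptic log, something like the following can help narrow things down. This is just a generic text search, not anything cromwell-specific: `show_log_errors` is a made-up helper name, and the search words are a guess at what is worth looking for.

```shell
# Hypothetical helper: print the last few log lines (with line numbers)
# that mention an error, failure, or exception, case-insensitively.
# $1 = log file (default cromwell.log), $2 = how many matches to show (default 20)
show_log_errors() {
    grep -n -i -E 'error|fail|exception' "${1:-cromwell.log}" | tail -n "${2:-20}"
}
```

Run `show_log_errors cromwell.log` from the directory where you launched the workflow, then open the log at the reported line numbers to see the surrounding context.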