@akhanf
Created November 19, 2021 19:55
snakebids-batch
global_app_opts:
  tmp: '$SLURM_TMPDIR'

apps:
  hippunfold:
    command:
      - 'singularity run -e /project/6050199/akhanf/singularity/bids-apps/khanlab_hippunfold_latest.sif'
    opts:
      - '--keep_work'
    rsync_maps:
      'results/sub-{subject}/': 'sub-{subject}' # watch out for the trailing slash on the origin folder; it is required here
      'work/sub-{subject}/': 'work/sub-{subject}'
    outputs:
      dirs:
        - 'sub-{subject}'
    resources:
      mem_mb: 128000
      time: 180
      app_type: snakemake
    threads: 32

  fmriprep_20.2.6:
    command:
      - 'singularity run -e /project/6050199/akhanf/singularity/bids-apps/nipreps_fmriprep_20.2.6.sif'
    opts:
      - '--fs-no-reconall --notrack'
    rsync_maps:
      'sub-{subject}': 'sub-{subject}'
    outputs:
      files:
        - 'sub-{subject}.html'
      dirs:
        - 'sub-{subject}'
    resources:
      mem_mb: 32000
      time: 180
      app_type: nipype
    threads: 8
bids_dir: '/path/to/bids_dir'
output_dir: '/path/to/output_dir'

#enable printing debug statements during parsing -- disable if generating dag visualization
debug: False

derivatives: False # will search in bids/derivatives if True; can also be path(s) to derivatives datasets

#list of analysis levels in the bids app
analysis_levels: &analysis_levels
  - participant

#mapping from analysis_level to set of target rules or files
targets_by_analysis_level:
  participant:
    - ''  # if '', then the first rule is run

#this configures the pybids grabber - create an entry for each type of input you want to grab
#  indexed by name of input
#  the dictionary for each input is passed directly to pybids get()
#  https://bids-standard.github.io/pybids/generated/bids.layout.BIDSLayout.html#bids.layout.BIDSLayout.get
#these inputs aren't actually used here, but they are currently required or snakebids will choke
pybids_inputs:
  bold:
    filters:
      suffix: 'bold'
      extension: '.nii.gz'
      datatype: 'func'
    wildcards:
      - subject
      - session
      - acquisition
      - task
      - run

#configuration for the command-line parameters to make available
#  each entry is passed to argparse add_argument()
parse_args:

  #--- core BIDS-app options --- (do not modify below)
  bids_dir:
    help: The directory with the input dataset formatted according
      to the BIDS standard.

  output_dir:
    help: The directory where the output files
      should be stored. If you are running group level analysis
      this folder should be prepopulated with the results of the
      participant level analysis.

  analysis_level:
    help: Level of the analysis that will be performed.
    choices: *analysis_levels

  --participant_label:
    help: The label(s) of the participant(s) that should be analyzed. The label
      corresponds to sub-<participant_label> from the BIDS spec
      (so it does not include "sub-"). If this parameter is not
      provided, all subjects will be analyzed. Multiple
      participants can be specified with a space-separated list.
    nargs: '+'

  --exclude_participant_label:
    help: The label(s) of the participant(s) that should be excluded. The label
      corresponds to sub-<participant_label> from the BIDS spec
      (so it does not include "sub-"). If this parameter is not
      provided, all subjects will be analyzed. Multiple
      participants can be specified with a space-separated list.
    nargs: '+'

  --derivatives:
    help: 'Path(s) to a derivatives dataset, or to folder(s) that contain multiple derivatives datasets (default: %(default)s)'
    default: False
    nargs: '+'

#--- workflow-specific configuration -- below is just an example:

#singularity containers
singularity:
  fsl: 'docker://brainlife/fsl/6.0.0'

general_app_opts:
#  - '--use-singularity'
#  - '--singularity-prefix /project/6050199/akhanf/singularity/snakemake_containers/'

global_app_opts:
  tmp: '/tmp'
  #tmp: '$SLURM_TMPDIR'

apps:
  # this app is hippunfold, either installed in the same venv, or as a pipx install in a separate venv (the latter is probably preferred),
  # or it could be run with singularity run -e
  hippunfold:
    command:
      # - 'hippunfold' # could replace this with: singularity run -e (or have a container opt)
      - 'singularity run -e /project/6050199/akhanf/singularity/bids-apps/khanlab_hippunfold_latest.sif'
    opts:
      - '--keep_work'
      # - '--use-singularity'
      # - '--singularity-prefix /project/6050199/akhanf/singularity/snakemake_containers/'
    rsync_maps:
      'results/sub-{subject}/': 'sub-{subject}' # watch out for the trailing slash on the origin folder; it is required here
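      # (rsync semantics: the trailing slash on the source makes rsync copy the *contents* of
      #  results/sub-{subject}/ into the destination folder; without it, rsync would create a
      #  nested sub-{subject}/sub-{subject} under the destination)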
      'work/sub-{subject}/': 'work/sub-{subject}'
    outputs:
      dirs:
        - 'sub-{subject}'
    resources:
      mem_mb: 128000
      time: 180
      app_type: snakemake
    threads: 32

  fmriprep_20.2.6:
    command:
      - 'singularity run -e /project/6050199/akhanf/singularity/bids-apps/nipreps_fmriprep_20.2.6.sif'
    opts:
      - '--fs-no-reconall --notrack'
    rsync_maps:
      'sub-{subject}': 'sub-{subject}'
    outputs:
      files:
        - 'sub-{subject}.html'
      dirs:
        - 'sub-{subject}'
    resources:
      mem_mb: 32000
      time: 180
      app_type: nipype
    threads: 8
[akhanf@gra-login3 config]$ cat ../workflow/Snakefile
#---- begin snakebids boilerplate ----------------------------------------------
import os

import snakebids
from snakebids import bids


configfile: 'config/snakebids.yml'

#writes inputs_config.yml and updates the config dict
config.update(
    snakebids.generate_inputs(
        bids_dir=config["bids_dir"],
        pybids_inputs=config["pybids_inputs"],
        derivatives=config["derivatives"],
        participant_label=config["participant_label"],
        exclude_participant_label=config["exclude_participant_label"],
    )
)

#this adds constraints to the bids naming
wildcard_constraints: **snakebids.get_wildcard_constraints(config["pybids_inputs"])

#---- end snakebids boilerplate ------------------------------------------------
app_config = config['apps']
apps = app_config.keys()
app_tmp = config['global_app_opts']['tmp']
root = config['root']


def get_resources(wildcards, resources, threads):
    """Translate the rule's generic resources into app-specific command-line options."""
    app_type = resources.app_type
    if app_type == 'snakemake':
        return f"--resources mem_mb={resources.mem_mb} --cores {threads}"
    elif app_type == 'nipype':
        return f"--mem {resources.mem_mb} --nprocs {threads} --omp-nthreads {threads}"
    else:
        return ""


def get_cmd_rsync_from_tmp(app):
    """Build the chained mkdir/rsync commands that copy an app's outputs from tmp to root."""
    rsync_cmds = list()
    rsync_maps = app_config[app]['rsync_maps']
    for map_from in rsync_maps.keys():
        map_to = rsync_maps[map_from]
        map_from_folder = os.path.join(f'{app_tmp}', f'{app}', 'sub-{subject}', f'{map_from}')
        map_to_folder = os.path.join(f'{root}', f'{app}', f'{map_to}')
        rsync_cmds.append(f"mkdir -p {map_to_folder}")
        rsync_cmds.append(f"rsync -av {map_from_folder} {map_to_folder}")
    return ' && '.join(rsync_cmds)


def get_outputs(app):
    """Collect the output directories and files declared for an app in the config."""
    outputs = list()
    if 'dirs' in app_config[app]['outputs'].keys():
        for dir in app_config[app]['outputs']['dirs']:
            outputs.append(directory(os.path.join(f'{root}', f'{app}', f'{dir}')))
    if 'files' in app_config[app]['outputs'].keys():
        for file in app_config[app]['outputs']['files']:
            outputs.append(os.path.join(f'{root}', f'{app}', f'{file}'))
    return outputs
rule all:
    input:
        expand(
            os.path.join(f'{root}', '{app}', 'sub-{subject}'),
            subject=config['subjects'],
            app=apps,
        )


#generate one rule per app, named after the app
for app in apps:

    rule:
        name: app
        input:
            bids_dir = config['bids_dir']
        params:
            out_tmp = os.path.join(f'{app_tmp}', app, 'sub-{subject}'),
            command = app_config[app]['command'],
            app_opts = app_config[app]['opts'],
            resources_opts = get_resources,
            rsync_cmds = get_cmd_rsync_from_tmp(app)
        output: get_outputs(app)
        threads: app_config[app]['threads']
        resources: **app_config[app]['resources']
        shell:
            "{params.command} {input.bids_dir} {params.out_tmp} participant "
            "--participant_label {wildcards.subject} {params.app_opts} {params.resources_opts} && "
            "{params.rsync_cmds}"

akhanf commented Nov 19, 2021

Happy to get some feedback on it! I am thinking one good use of this could be in a snakebids replacement for bidsBatch.


akhanf commented Nov 19, 2021

Also, I just realized that "analysis_level" could be added to the template as well.

@pvandyken

I remember when we discussed it, we had contrasted this sort of approach with wrappers, although I can't recall what our conclusion was. As far as I can tell from here, this functionality could be replicated pretty effectively with wrappers (and you wouldn't have workflow design in your config file). Do you remember if there was any reason to avoid the wrapper, though?
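
For concreteness, a rough sketch of what that wrapper alternative could look like (the wrappers/bids_app path, the params it reads, and the wrapper body are hypothetical, not from the gist):

rule hippunfold:
    input:
        bids_dir = config['bids_dir']
    output:
        directory('results/hippunfold/sub-{subject}')
    params:
        container = 'khanlab_hippunfold_latest.sif',
        opts = ['--keep_work']
    threads: 32
    resources:
        mem_mb=128000
    wrapper:
        "file://wrappers/bids_app"  # hypothetical local wrapper directory

# wrappers/bids_app/wrapper.py -- gets the rule's directives via the injected `snakemake` object
from snakemake.shell import shell

container = snakemake.params.get("container", "")      # hypothetical param
opts = " ".join(snakemake.params.get("opts", []))       # hypothetical param
shell(
    "singularity run -e {container} {snakemake.input.bids_dir} {snakemake.output[0]} "
    "participant --participant_label {snakemake.wildcards.subject} {opts}"
)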

@pvandyken

Actually, I can come up with one reason myself: I don't think wrappers would play well with any of the snakeboost plugins, so if you wanted auto-tarred directories, pipenv, etc., that would have to be native to the wrapper.

I'm wondering if we could roll out a Snakeboost version of this. Something like:

rule hippunfold:
  input:
    expand(os.path.join(f'{root}','{app}','sub-{subject}'),subject=config['subjects'],app=apps)
  output:
    'sub-{subject}'
  params:
    opts=[
      '--keep_work',
    ],
    rsync_maps={
      'results/sub-{subject}/': 'sub-{subject}', # watch out for the trailing slash on the origin folder; it is required here
      'work/sub-{subject}/': 'work/sub-{subject}'
    },
  resources:
    mem_mb=128000
  shell:
    bidsBatch('khanlab_hippunfold_latest.sif') # Just the end of the container, to be resolved automatically

Although I don't know if you get immediate access to params from within the bidsBatch function here, so they may have to be direct arguments of the bidsBatch function:

bidsBatch(
  opts=[
    '--keep_work',
  ],
  rsync_maps={
    'results/sub-{subject}/': 'sub-{subject}', # watch out for the trailing slash on the origin folder; it is required here
    'work/sub-{subject}/': 'work/sub-{subject}'
  },
  container='khanlab_hippunfold_latest.sif'
) 

You could even pre-write these rules and import them as modules. Just some thoughts.
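
Roughly, that module idea could look like this (the pre_written_rules path and the bids_apps module name are hypothetical):

module bids_apps:
    snakefile: "pre_written_rules/Snakefile"  # hypothetical location of the pre-written rules
    config: config

# re-use a pre-written hippunfold rule, overriding only the resources
use rule hippunfold from bids_apps with:
    resources:
        mem_mb=128000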


akhanf commented Dec 20, 2021

Yes, good point. I'm hoping to try out snakeboost sometime soon myself too.
