Genevieve Buckley GenevieveBuckley

  • Monash University
  • Melbourne
GenevieveBuckley / hlg-steps.md
Created December 10, 2021 09:28
Steps to implement a new high level graph in Dask

So, you want to make a new high level graph layer class?

Step 1:

A common starting point is to take something Dask already does and convert it to use a high level graph under the hood.

First, find the place in the code where the Dask task dictionary is created. Typically this is a variable called dsk or dsk_out: a dictionary mapping key names to individual tasks.

Found it? Great, this is the spot where we're going to insert an instance of your (new, not yet created) high level graph layer class, e.g. dsk_out = MyNewLayer(input_args, ...)
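As a rough illustration of that swap, here is a minimal sketch. The name MyNewLayer comes from the text above, but its arguments (name, input_name, n_blocks, value) and the _construct_graph helper are hypothetical. In Dask itself the class would subclass dask.highlevelgraph.Layer, which may require additional methods depending on the Dask version; to stay dependency-free, this sketch subclasses collections.abc.Mapping instead. The core idea is the same: the layer behaves like the task dictionary, but only builds it on first access.

```python
from collections.abc import Mapping
from operator import add


class MyNewLayer(Mapping):
    """Hypothetical layer: adds `value` to each of `n_blocks` input blocks.

    Assumption: a real Dask layer would subclass dask.highlevelgraph.Layer.
    Mapping is used here only to keep the sketch self-contained.
    """

    def __init__(self, name, input_name, n_blocks, value):
        self.name = name
        self.input_name = input_name
        self.n_blocks = n_blocks
        self.value = value
        self._cached_dict = None  # task dictionary, built lazily

    def _construct_graph(self):
        # Same dictionary the old code built eagerly as `dsk_out`.
        return {
            (self.name, i): (add, (self.input_name, i), self.value)
            for i in range(self.n_blocks)
        }

    @property
    def _dict(self):
        # Materialize the low level graph only when someone asks for it.
        if self._cached_dict is None:
            self._cached_dict = self._construct_graph()
        return self._cached_dict

    def __getitem__(self, key):
        return self._dict[key]

    def __iter__(self):
        return iter(self._dict)

    def __len__(self):
        return len(self._dict)


# Before: the task dictionary is created eagerly.
dsk_out_old = {("add-1", i): (add, ("x", i), 10) for i in range(3)}

# After: the layer object takes its place and materializes on demand.
dsk_out = MyNewLayer("add-1", "x", n_blocks=3, value=10)
```

Because the layer is a Mapping, code that only reads dsk_out (iterating keys, looking up tasks) should keep working unchanged, which is what makes the one-line replacement possible.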

GenevieveBuckley / hacky-dask-map-overlap.ipynb
Created November 11, 2021 05:22
Experimenting with a hacky dask map_overlap
GenevieveBuckley / slices_from_chunks_overlap.py
Last active November 10, 2021 00:29
slices_from_chunks_overlap
# Proposed slices_from_chunks_overlap function
# Modified from slices_from_chunks from dask.array.core
from itertools import product
from dask.array.slicing import cached_cumsum
def slices_from_chunks_overlap(chunks, array_shape, depth=1):
    """Translate chunks tuple to a set of slices in product order

    Parameters
    ----------
GenevieveBuckley / itkImagePython.py
Created October 29, 2021 06:07
itkImagePython hacks
# This file was automatically generated by SWIG (http://www.swig.org).
# Version 4.0.2
#
# Do not make changes to this file unless you know what you are doing--modify
# the SWIG interface file instead.
import os
import six
import collections
GenevieveBuckley / nuke-and-reinstall-conda.sh
Created October 12, 2021 23:18
Nuke and reinstall conda script
#!/bin/bash
# Fernando's nuke and reinstall conda script
# https://twitter.com/fperez_org/status/1447996737063317512
# Download and install miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh -O ~/Downloads/miniconda.sh
bash ~/Downloads/miniconda.sh -b -p $HOME/local/conda
conda config --add channels conda-forge
GenevieveBuckley / match_dask_keys.py
Created September 9, 2021 05:40
Match Dask keys, ignoring token strings
def match_key(key, keylist):
    # key to match
    key = list(key)
    key[0] = key[0].rsplit('-', 1)[0]  # remove token
    # search for matches
    for k in keylist:
        temp = list(k)
        temp[0] = temp[0].rsplit('-', 1)[0]  # remove token
        if temp == key:
            return tuple(k)
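A short usage sketch may help show what the matching does. The keys below are made up for illustration, and strip_token is a hypothetical helper that factors out the token removal; the matching logic is otherwise meant to mirror the function above.

```python
def strip_token(key):
    """Drop the trailing '-<token>' from the name part of a Dask key tuple."""
    name, *index = key
    return (name.rsplit("-", 1)[0], *index)


def match_key(key, keylist):
    """Return the first key in keylist equal to `key` once tokens are ignored."""
    target = strip_token(key)
    for k in keylist:
        if strip_token(k) == target:
            return tuple(k)
    return None  # explicit: no match found


# Made-up keys: same operation name and block index, different tokens.
keys = [("sum-aaa111", 0, 0), ("sum-aaa111", 0, 1), ("mean-bbb222", 0, 0)]

found = match_key(("sum-zzz999", 0, 1), keys)    # matches despite the token
missing = match_key(("max-ccc333", 0, 0), keys)  # no "max" key in the list
```

Here found is the key from the list, ("sum-aaa111", 0, 1), not the key that was passed in, and missing is None.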
GenevieveBuckley / 2021.ipynb
Created September 6, 2021 07:21
Dask survey 2021 analysis
GenevieveBuckley / correct_tensordot_auto_rechunkiing.py
Last active July 27, 2021 09:47
Tensordot auto-rechunking experiments
import dask
import dask.array as da
import numpy as np
def _inner_axes(a_ndim, b_ndim, axes):
    """Given tensordot axes argument, return list of axes to sum over."""
    if isinstance(axes, (int, float)):
        if axes == 0:
            inner_axes_a = []
GenevieveBuckley / alternative_to_scipy_ndi_findobjects.py
Last active July 23, 2021 06:51
Slightly simpler way to get the array location for any dask block
import numpy as np
import pandas as pd
def _find_slices(x):
    """An alternative to scipy.ndimage.find_objects"""
    unique_vals = np.unique(x)
    unique_vals = unique_vals[unique_vals != 0]
    result = {}
    for val in unique_vals:
def _tensordot_shape_output(a, b, axes):
    if isinstance(axes, (int, float)):
        if axes == 0:
            # Outer product: all axes of both arrays are kept
            shape_out = a.shape + b.shape
            chunks_out = a.chunks + b.chunks
        elif axes > 0:
            # Integer axes=N contracts the last N axes of a
            # with the first N axes of b
            shape_out = a.shape[:-axes] + b.shape[axes:]
            chunks_out = a.chunks[:-axes] + b.chunks[axes:]
    else:
        axes_a, axes_b = axes