Genevieve Buckley GenevieveBuckley

  • Monash University
  • Melbourne
@GenevieveBuckley
GenevieveBuckley / map-overlap-length-without-materializing.ipynb
Created June 28, 2021 13:27
map-overlap-length-without-materializing-the-task-graph
@GenevieveBuckley
GenevieveBuckley / combine_slicing.py
Last active June 28, 2021 12:58
combine_slicing.py
import math
import numpy as np
import pytest
def combine_slices(slices):
    starts = [s.start for s in slices if s.start is not None]
    stops = [s.stop for s in slices if s.stop is not None]
    steps = [s.step for s in slices if s.step is not None]
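The preview above stops before the function body is complete, so the gist's actual approach is not visible here. As a rough illustration of what "combining" consecutive slices can mean, the sketch below uses a different, swapped-in technique: Python `range` objects can themselves be sliced, so applying each slice to `range(length)` in turn yields the equivalent single slice. The name `compose_slices` and the `length` parameter are hypothetical and do not come from the gist.

```python
def compose_slices(slices, length):
    """Hypothetical helper: collapse successive slices into a single slice.

    Indexing a sequence of `length` items with the result selects the same
    elements as applying each slice in `slices` one after another.
    """
    r = range(length)
    for s in slices:
        r = r[s]  # slicing a range yields another range
    if len(r) == 0:
        return slice(0, 0, 1)  # canonical empty slice
    # A negative stop would be read as "count from the end", so use None.
    stop = r.stop if r.stop >= 0 else None
    return slice(r.start, stop, r.step)


data = list(range(10))
combined = compose_slices([slice(2, 8), slice(None, None, 2)], len(data))
print(data[combined])  # same elements as data[2:8][::2]
```

This relies only on the standard library and needs the sequence length up front, which a version working purely from `start`, `stop`, and `step` values (as the preview appears to begin doing) might avoid.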
@GenevieveBuckley
GenevieveBuckley / plot.py
Created June 23, 2021 08:15
Simple matplotlib line plot with error bars
import matplotlib.pyplot as plt
%matplotlib inline
plt.errorbar(
    [1, 2, 3],
    [1, 2, 3],
    yerr=[.1, .2, .3],
    color='blue', marker='o', linestyle='dashed',
)
plt.title("Example")
@GenevieveBuckley
GenevieveBuckley / highlevelgraph-html.py
Last active June 4, 2021 09:19
HighLevelGraph HTML
from html import escape
class HighLevelGraphHTML():
    def __init__(self, highlevelgraph):
        self.highlevelgraph = highlevelgraph

    def _repr_html_(self):
        highlevelgraph = self.highlevelgraph
@GenevieveBuckley
GenevieveBuckley / Dask-task-graph-handling-costs-on-the-client.ipynb
Last active June 1, 2021 06:49
Dask task graph handling costs on the client
@GenevieveBuckley
GenevieveBuckley / talk_proposal_scipy_2021.md
Last active May 11, 2021 04:02
Talk proposal SciPy 2021

TITLE:

Scaling Science: leveraging Dask for life sciences

SHORT ABSTRACT:

Managing big data in the life sciences is difficult, and scalable scientific computing is required to cope with the increasing demands of modern biology and neuroscience. Dask is a Python library for distributed computation. In this talk, we'll look at several case studies where Dask is used to scale up data processing for the life sciences, with examples from statistical genetics, single cell analysis, and image visualization and analysis. This will give you a better understanding of how you can extend your own code with Dask to scale your analysis.

DESCRIPTION:

@GenevieveBuckley
GenevieveBuckley / distributed-skeleton-analysis.ipynb
Last active May 6, 2021 10:04
distributed skeleton analysis
@GenevieveBuckley
GenevieveBuckley / install_cupy_9.0.0b3.txt
Created March 15, 2021 07:47
How to install the beta version of Cupy 9.0.0
conda install -c conda-forge/label/cupy_rc cudatoolkit=11.2 cupy=9.0.0b3
@GenevieveBuckley
GenevieveBuckley / PyConline 2020 proposal.md
Last active February 19, 2021 07:12
scipy Japan 2020 talk proposal

PyConlineAU 2020 https://pretalx.com/pycon-au-2020/talk/review/RP9LMHZUMYZZWUB73ESTKG9SGT9QCBMJ

dask-image: distributed image processing for large data

Abstract

This talk introduces dask-image, a Python library for distributed image processing. Targeted at applications involving array data too large to fit in memory, dask-image is built on top of NumPy, SciPy, and Dask, allowing easy scalability and portability from your laptop to a supercomputing cluster. It is of broad interest for a diverse range of data analysis applications such as video and streaming data, computer vision, and scientific fields including astronomy, microscopy, and geosciences. We will provide a general overview of the dask-image library, then discuss mixing and matching it with your own custom functions, and present a practical case study of a Python image processing pipeline.

Detailed abstract

Image datasets are large, and becoming larger. The widely used benchmark dataset COCO (Common Objects in Context) contains 330,000

@GenevieveBuckley
GenevieveBuckley / cellprofiler-environment-dev-ubuntu-20-04.yml
Created February 11, 2021 02:35
CellProfiler development conda environment export
name: cellprofiler-dev
channels:
- conda-forge
- defaults
dependencies:
- _libgcc_mutex=0.1=conda_forge
- _openmp_mutex=4.5=1_gnu
- alsa-lib=1.2.3=h516909a_0
- atk-1.0=2.36.0=h3371d22_4
- brotlipy=0.7.0=py38h497a2fe_1001