Peter Andreas Entschev pentschev

from dask.distributed import Client
from dask_cuda import LocalCUDACluster
from dask.array.utils import assert_eq
import dask.array as da
import cupy as cp
add_broadcast_kernel = cp.RawKernel(
r'''
extern "C" __global__
pentschev / cython_malloc_benchmark.ipynb
Last active July 31, 2019 18:43
Cython Malloc Benchmark
pentschev / Dockerfile
Created April 15, 2019 10:03
Dask dev build Dockerfile
FROM python:3
ENV PYTHON 3.7
ENV NUMPY 1.16.2
ENV UPSTREAM_DEV 1
ENV TEST true
ENV LINT true
ENV COVERAGE false
ENV PARALLEL true
ENV XTRATESTARGS ''
pentschev / osx_stacktrace_lldb.txt
Created April 11, 2019 15:28
OSX stacktrace lldb output
(lldb)
There is a running process, detach from it and attach?: [Y/n]
Process 6843 detached
_bt.cpython-37m-darwin.so was compiled with optimization - stepping may behave oddly; variables may not be available.
Process 6843 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
frame #0: 0x0000000114b54ab0 _bt.cpython-37m-darwin.so`backtrace_thread [inlined] _wait_and_reset_signal at bt.c:182:35 [opt]
179 static
180 void _wait_and_reset_signal(struct sigaction *old_sa) {
181 // spin and wait.
pentschev / fitting.png
Last active March 16, 2019 17:27
Dask-GLM Benchmarks
pentschev / fft_benchmarks.md
Created April 27, 2015 18:47
FFT Benchmarks Comparing In-place and Out-of-place performance on FFTW, cuFFT and clFFT

Description

Every benchmark consists of 10 batches of single-precision 2-dimensional matrices, with sizes ranging from 128x128 to 4096x4096.
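
To give a rough sense of why the in-place versus out-of-place distinction matters at these sizes, the sketch below estimates the buffer memory one batch would need. It rests on several assumptions not stated in the description: that the sweep uses square power-of-two sizes, that all 10 matrices are resident at once, that elements are single-precision complex (8 bytes, as a C2C transform implies), and that any workspace the FFT library allocates is ignored. The helper names `assumed_sizes` and `batch_bytes` are hypothetical.

```python
# Sketch: buffer memory per benchmark size (assumptions are labeled inline).

BATCH = 10                # from the description: 10 batches per benchmark
BYTES_PER_COMPLEX64 = 8   # single-precision complex: two 4-byte floats


def assumed_sizes(start=128, stop=4096):
    """Assumed sweep: square power-of-two sizes from start to stop inclusive."""
    sizes = []
    n = start
    while n <= stop:
        sizes.append(n)
        n *= 2
    return sizes


def batch_bytes(n, in_place):
    """Bytes of buffer space for BATCH matrices of n x n complex64 elements.

    An in-place transform overwrites its input buffer; an out-of-place
    transform needs a separate output buffer of the same size.
    Library workspace is not counted (assumption).
    """
    buffers = 1 if in_place else 2
    return BATCH * n * n * BYTES_PER_COMPLEX64 * buffers


if __name__ == "__main__":
    for n in assumed_sizes():
        ip = batch_bytes(n, in_place=True) / 2**20
        oop = batch_bytes(n, in_place=False) / 2**20
        print(f"{n}x{n}: in-place {ip:.2f} MiB, out-of-place {oop:.2f} MiB")
```

Under these assumptions the largest case, 4096x4096, takes 1.25 GiB of buffers in place and 2.5 GiB out of place, a substantial share of the memory on a GPU of this class.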

CUDA Results

NVIDIA Tesla K20

Matrix dimensions: 128x128
In-place C2C FFT time for 10 runs: 0.538662 ms