Frédéric Bastien nouiz

  • NVIDIA
  • Montréal
@nouiz
nouiz / numba_bug.py
Last active June 8, 2020 16:46
Test how to make a Theano op that calls Numba.
import numba
import numpy
# filter2d with the same signature as Theano's Op.perform,
# but as a plain function rather than a class method.
def filter2d_theano(node, inputs, outputs):
    image, filt = inputs
    M, N = image.shape
    Mf, Nf = filt.shape
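The preview above stops right after the shapes are unpacked. As a rough sketch of how a function with this signature might continue, the version below is a plain NumPy stand-in with no Numba or Theano involved. It assumes `outputs` follows Theano's output-storage convention (a list of one-element lists); the name `filter2d_sketch` and the loop body are illustrative assumptions, not the gist's actual code.

```python
import numpy


def filter2d_sketch(node, inputs, outputs):
    """Hypothetical stand-in with a Theano perform-style signature."""
    image, filt = inputs
    M, N = image.shape
    Mf, Nf = filt.shape
    Mf2, Nf2 = Mf // 2, Nf // 2
    # Border pixels that the filter cannot fully cover stay at zero.
    result = numpy.zeros_like(image)
    for i in range(Mf2, M - Mf2):
        for j in range(Nf2, N - Nf2):
            acc = 0.0
            for ii in range(Mf):
                for jj in range(Nf):
                    # Flipped filter indices: convolution, not correlation.
                    acc += (filt[Mf - ii - 1, Nf - jj - 1]
                            * image[i - Mf2 + ii, j - Nf2 + jj])
            result[i, j] = acc
    # Assumed convention: each output is a one-element list used as storage.
    outputs[0][0] = result


# The node argument is unused in this sketch, so None is passed.
image = numpy.ones((5, 5))
filt = numpy.ones((3, 3))
storage = [[None]]
filter2d_sketch(None, (image, filt), storage)
```

Nested loops like these are the kind of code Numba is usually pointed at. Moving them into a loop-only helper decorated with `numba.jit` is one way to try that, though the approach the gist actually takes is not visible in the preview.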
nouiz / time_cudaMemcpyAsync.cu
Created October 1, 2012 20:26 — forked from mrocklin/time_cudaMemcpyAsync.cu
A quick CUDA program to time the effectiveness of using asynchronous CPU-GPU memory transfers.
#include <stdio.h>
#include <sys/time.h>
const int n = 160000000;
// Print number of milliseconds between timevals
void printDuration(timeval a, timeval b, char* message)
{
    double elapsedTime = (b.tv_sec - a.tv_sec) * 1000.0;
    elapsedTime += (b.tv_usec - a.tv_usec) / 1000.0;
nouiz / gist:3378936
Created August 17, 2012 14:02
Theano vs minivect
$\time make
gfortran -O3 -fno-underscoring -fno-second-underscore -g -Wall -march=native -fPIC -c fbench.f90
python2.7 ~/repos/cython/bin/cython bench.pyx
CC=gcc LD="ld" LDFLAGS="" CFLAGS="-O3 -lgfortran -g -Wall -march=native -fPIC" python2.7 setup.py build_ext --inplace
running build_ext
skipping 'bench.c' Cython extension (up-to-date)
building 'bench' extension
gcc -DNDEBUG -O2 -O3 -lgfortran -g -Wall -march=native -fPIC -fPIC -I/opt/lisa/os/epd-7.1.2/lib/python2.7/site-packages/numpy/core/include -I/opt/lisa/os/epd-7.1.2/include/python2.7 -c bench.c -o build/temp.linux-x86_64-2.7/bench.o
bench.c: In function ‘get_memview_MemoryView_5array_7memview___get__’:
bench.c:5381:3: warning: dereferencing type-punned pointer will break strict-aliasing rules
nouiz / gist:2038703
Created March 14, 2012 19:04 — forked from benanne/gist:2025317
lngamma in theano
import numpy as np
import theano
import theano.tensor as T
def log_gamma_lanczos(z):
    assert z.dtype.startswith("float")
    # Reflection formula. Normally only used for negative arguments,
    # but here it's also used for 0 < z < 0.5 to improve accuracy in
    # this region.
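The comment above describes applying the reflection formula for 0 < z < 0.5. The scalar sketch below illustrates that idea with the standard library only. The helper name `log_gamma_sketch` and the coefficient set (a commonly published Lanczos table for g = 7, n = 9) are assumptions, since the gist's own constants and Theano expressions are cut off in the preview.

```python
import math

# Lanczos coefficients for g = 7, n = 9 (a commonly published set; assumed
# here because the gist's own constants are not visible in the preview).
_G = 7.0
_COEFFS = [
    0.99999999999980993,
    676.5203681218851,
    -1259.1392167224028,
    771.32342877765313,
    -176.61502916214059,
    12.507343278686905,
    -0.13857109526572012,
    9.9843695780195716e-6,
    1.5056327351493116e-7,
]


def log_gamma_sketch(z):
    """Scalar log-gamma for z > 0 (hypothetical helper, not the gist's code)."""
    if z < 0.5:
        # Reflection: Gamma(z) * Gamma(1 - z) = pi / sin(pi * z)
        return (math.log(math.pi / math.sin(math.pi * z))
                - log_gamma_sketch(1.0 - z))
    z -= 1.0
    series = _COEFFS[0]
    for i in range(1, len(_COEFFS)):
        series += _COEFFS[i] / (z + i)
    t = z + _G + 0.5
    # Log of sqrt(2*pi) * t**(z + 0.5) * exp(-t) * series
    return (0.5 * math.log(2.0 * math.pi)
            + (z + 0.5) * math.log(t) - t + math.log(series))
```

A quick sanity check is to compare the result against `math.lgamma` at a few points on both sides of 0.5. A Theano version would express the same branches symbolically on tensors rather than with a Python `if`.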