Skip to content

Instantly share code, notes, and snippets.

@dbalabka
Last active April 22, 2021 11:14
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save dbalabka/68fb2cdd6353a549273b56f57515637b to your computer and use it in GitHub Desktop.
Save dbalabka/68fb2cdd6353a549273b56f57515637b to your computer and use it in GitHub Desktop.
scikits-bootstrap Jackknife resampling with Numba and performance testing
import scikits.bootstrap as bootstrap
import numpy as np
import time
import numba
@numba.njit(parallel=True, fastmath=True)
def _calculate_jackknife_mean_stat(data: np.ndarray) -> np.ndarray:
n = data.shape[0]
jstat = np.zeros(n)
sum = data.sum()
for i in numba.prange(n):
# Alternative solution, which can be used when we want use custom stat function
# jstat[i] = np.concatenate((data[:i], data[i+1:])).mean()
jstat[i] = (sum - data[i]) / (n - 1)
return jstat
tdata = (np.random.randint(0, 5, 100_000), )
start = time.time()
jackindices = bootstrap.jackknife_indices(tdata[0])
jstat_old = [np.mean(*(x[indices] for x in tdata))
for indices in jackindices]
end = time.time()
print(end - start)
start = time.time()
jstat_new = _calculate_jackknife_mean_stat(tdata[0])
end = time.time()
print(end - start)
np.testing.assert_array_equal(jstat_old, jstat_new)
# Numba debug
# bootstrap._calculate_jackknife_mean_stat.parallel_diagnostics(level=4)
# bootstrap._calculate_jackknife_mean_stat.inspect_types()
@dbalabka
Copy link
Author

dbalabka commented Apr 22, 2021

JIT enabled:

63.456329107284546
1.3244829177856445
Traceback (most recent call last):
  File "/Users/torinaki/www/htdocs/bayesian-testing/test7.py", line 30, in <module>
    np.testing.assert_array_equal(jstat_old, jstat_new)
  File "/Users/torinaki/www/htdocs/bayesian-testing/.venv/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 933, in assert_array_equal
    verbose=verbose, header='Arrays are not equal')
  File "/Users/torinaki/www/htdocs/bayesian-testing/.venv/lib/python3.7/site-packages/numpy/testing/_private/utils.py", line 842, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not equal

Mismatched elements: 19999 / 100000 (20%)
Max absolute difference: 2.22044605e-16
Max relative difference: 1.11433496e-16
 x: array([1.99261, 1.99262, 1.99263, ..., 1.99261, 1.99263, 1.99261])
 y: array([1.99261, 1.99262, 1.99263, ..., 1.99261, 1.99263, 1.99261])

JIT disabled

51.36462903022766
0.07923197746276855

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment