I personally just implement the features when I need them, as I go (as far as I’m able anyways). I think with a few people pushing we could get most of the important functionality of numpy and scipy built on top of arraymancer in a reasonable time.
In principle using Python where needed (but still accelerating it with
Nim!) is pretty straight forward nowadays thanks to the advances in
nimpy. I’ll show an example below. One major thing for this is still
missing in nimpy
, which makes it slightly more complicated than needed.
Assuming we want to integrate a 2D gaussian within some radius (I took this from a calculation of an optical fiber, which is slighly offset from a gaussian spot to calculate the amount of flux lost by the offset) using scipy, we might write the following Python code:
import numpy as np
from scipy.integrate import dblquad
def integrand(r, theta, delta_y, sigma):
# integrand is the integrand of the function which will be 2D integrated
r_prime_2 = r**2 + delta_y**2 - 2 * r * delta_y * np.cos(3.0/2.0 * np.pi - theta)
value = 1.0/(2.0 * np.pi * sigma**2) * np.exp(-r_prime_2/(2.0 * sigma** 2)) * r
return value
a = 1.0 # some constant
delta_y = 0.5 # some shift
sigma = 2.0 # and our gaussians sigma
res = dblquad(integrand, # definition of our gaussian integrand
0.0, # outer integral lower
2.0*np.pi, # outer integral upper
lambda theta: 0.0, # inner integral lower
lambda theta: a, # inner integral upper
args=(delta_y, sigma) # additional arguments
)
Which is nice and all, but thanks to being pure Python, not very fast (in this case I needed to integrate a huge number of those).
So first we might just want to port it all over to Nim, but since we
don’t have a numerical integration function, we want to call
scipy.dblquad
for it. The code equivalent to the above might look
like the following:
let
a = 1.0
delta_y = 0.5
sigma = 2.0
proc integrand(r, theta, delta_y, sigma: float): float {.exportpy.} =
let r_prime_2 = pow(r, 2.0) + pow(delta_y, 2.0) - 2.0 * r * delta_y * cos(3.0 / 2.0 * Pi - theta)
result = 1.0/(2.0 * Pi * pow(sigma, 2) ) * exp(-r_prime_2/(2.0 * pow(sigma, 2.0))) * r
let scipy = pyImport("scipy.integrate")
let res = scipy.dblquad(integrand,
0.0,
2.0 * Pi,
proc(theta: float): float = 0.0,
proc(theta: float): float = a,
args = (delta_y, sigma)
)
The problem is, that this is currently not possible with nimpy
(unless I’m missing something), since we can’t hand our Nim proc
directly to Python, despite using the exportpy
pragma.
Our next best solution is what we might do in Python to accelerate our
calculation, e.g. use ctypes
, numba
etc. to calculate the
integrand. Let’s use Nim for that instead! So we define our integrand
in Nim:
# integrate.nim
import nimpy, math
proc integrand(r, theta, delta_y, sigma: float): float {.exportpy.} =
let r_prime_2 = pow(r, 2.0) + pow(delta_y, 2.0) - 2.0 * r * delta_y * cos(3.0 / 2.0 * Pi - theta)
result = 1.0/(2.0 * Pi * pow(sigma, 2) ) * exp(-r_prime_2/(2.0 * pow(sigma, 2.0))) * r
Then just compile this file with:
nim c -d:release --app:lib -o integrate.so integrate.nim
Note, it’s important that the shared object file is called exactly like the Nim source file!
And then we just rewrite our Python code as follows:
import numpy as np
import time
from scipy.integrate import dblquad
import integrate
a = 1.0 # some constant
delta_y = 0.5 # some shift
sigma = 2.0 # and our gaussians sigma
res = dblquad(integrate.integrand, # our Nim compiled function
0.0, # outer integral lower
2.0*np.pi, # outer integral upper
lambda theta: 0.0, # inner integral lower
lambda theta: a, # inner integral upper
args=(delta_y, sigma) # additional arguments
)
print(res)
And that is all! Our Python code accelerated by Nim in a trivial way! :)
Just as a proof that it’s actually a lot faster:
import numpy as np
import time
from scipy.integrate import dblquad
import integrate
def integrand(r, theta, delta_y, sigma):
# integrand is the integrand of the function which will be 2D integrated
r_prime_2 = r**2 + delta_y**2 - 2*r*delta_y*np.cos(3.0/2.0*np.pi - theta)
value = 1/(2*np.pi*sigma**2)*np.exp(-r_prime_2/(2*sigma**2))*r
return value
a = 1.0 # some constant
delta_y = 0.5 # some shift
sigma = 2.0 # and our gaussians sigma
t0 = time.clock()
for i in xrange(1000):
res = dblquad(integrand, # some python function
0.0, # outer integral lower
2.0*np.pi, # outer integral upper
lambda theta: 0.0, # inner integral lower
lambda theta: a, # inner integral upper
args=(delta_y, sigma) # additional arguments
)
t1 = time.clock()
print(res)
# instead call the nim proc!
t2 = time.clock()
for i in xrange(1000):
res = dblquad(integrate.integrand, # some python function
0.0, # outer integral lower
2.0*np.pi, # outer integral upper
lambda theta: 0.0, # inner integral lower
lambda theta: a, # inner integral upper
args=(delta_y, sigma) # additional arguments
)
t3 = time.clock()
print(res)
print("Duration for pure python: ", (t1 - t0))
print("Duration for nimified python: ", (t3 - t2))
# result on my machine:
# (0.11410585193904128, 9.342312648695034e-12)
# (0.11410585193904128, 9.342312648695034e-12)
# ('Duration for pure python: ', 1.1764960000000002)
# ('Duration for nimified python: ', 0.13569299999999984)
So that’s a ~8.7x speedup!
just add
numba.jit
on the top line ofdef integrand
,I get
but
@numba.jit(numba.float32(numba.float32[:], numba.float32[:], numba.float32, numba.float32))
takes too longI failed to understand https://stackoverflow.com/questions/49683653/how-to-pass-additional-parameters-to-numba-cfunc-passed-as-lowlevelcallable-to-s to write the numba.cfunc version of the
integrand
as I mentioned in https://stackoverflow.com/questions/52529526/how-to-use-numba-cfunc-handle-scipy-integrate-dblquad