I’m trying to remap all values in an array according to some 1-1 correspondence. This can be accomplished by the skimage.util.map_array function (inner-loop implementation here). There is also a pure Python wrapper that takes care of the array shapes and array allocation here.
Here’s how it looks in practice:
In [10]: values = np.random.randint(0, 5, size=10)
In [11]: inval = np.arange(5)
In [12]: outval = np.random.random(5)
In [13]: values
Out[13]: array([0, 0, 4, 0, 3, 2, 0, 2, 0, 2])
In [14]: inval
Out[14]: array([0, 1, 2, 3, 4])
In [15]: outval
Out[15]: array([0.595442 , 0.22325946, 0.16452037, 0.70457358, 0.37474462])
In [16]: map_array(values, inval, outval)
Out[16]:
array([0.595442 , 0.595442 , 0.37474462, 0.595442 , 0.70457358,
       0.16452037, 0.595442 , 0.16452037, 0.595442 , 0.16452037])
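For reference, the mapping semantics shown above can be sketched in plain Python with a dict. This is only an illustration of the expected result, not how skimage implements it; the helper name `map_array_reference` is made up for this example, and it assumes every entry of `values` appears in `inval`.

```python
import numpy as np

def map_array_reference(values, inval, outval):
    """Slow reference remap (hypothetical helper, not part of skimage)."""
    values = np.asarray(values)
    inval = np.asarray(inval)
    outval = np.asarray(outval)
    # Build the input -> output correspondence as an ordinary dict.
    lut = dict(zip(inval.tolist(), outval.tolist()))
    # Look up every element; a KeyError means a value was missing from inval.
    flat = [lut[v] for v in values.ravel().tolist()]
    return np.array(flat, dtype=outval.dtype).reshape(values.shape)

values = np.array([0, 0, 4, 0, 3, 2, 0, 2, 0, 2])
inval = np.arange(5)
outval = np.array([0.595442, 0.22325946, 0.16452037, 0.70457358, 0.37474462])
reference = map_array_reference(values, inval, outval)
```

A function like this is far too slow for images, but it is handy for checking faster implementations against.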
This works well but it’s about 4x slower than using array indexing, as in outval[values]:
In [39]: image = np.random.randint(0, 5, size=(2048, 2048))
In [40]: %timeit map_array(image, inval, outval)
35.6 ms ± 249 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [41]: %timeit outval[image]
9.48 ms ± 177 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
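Note that `outval[image]` only works directly here because `inval` is `np.arange(5)`. For other non-negative integer `inval`, the indexing approach needs a dense lookup table first. A minimal sketch, assuming non-negative integer inputs (the helper name `map_array_dense` is hypothetical):

```python
import numpy as np

def map_array_dense(values, inval, outval):
    """Remap via a dense lookup table (hypothetical helper, not part of skimage)."""
    inval = np.asarray(inval)
    outval = np.asarray(outval)
    # The table needs one slot for every integer from 0 to inval.max(),
    # whether or not that integer actually occurs in inval.
    lut = np.zeros(int(inval.max()) + 1, dtype=outval.dtype)
    lut[inval] = outval
    # Fancy indexing performs the whole remap in one vectorized step.
    return lut[values]

values = np.array([0, 0, 4, 0, 3, 2, 0, 2, 0, 2])
inval = np.arange(5)
outval = np.array([0.595442, 0.22325946, 0.16452037, 0.70457358, 0.37474462])
dense_result = map_array_dense(values, inval, outval)
```

Values that are missing from `inval` silently map to 0 in this sketch, which may or may not be what you want.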
And the problem with the NumPy indexing approach is that you end up with a really huge outval array if the values in image are large, even if you don’t actually have many of them. (For example, to map 2**32 and 2**32+1 to 0.5 and 1, you need a lookup table with more than 2**32 entries: over 4 GB even at one byte per entry, and about 32 GB with float64 outputs!)
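One way to sidestep the dense table without a hash map is to sort `inval` once and look values up with `np.searchsorted`, so memory scales with the number of distinct inputs rather than their magnitude. This is a sketch under the assumption that every entry of `values` is present in `inval`; the helper name `map_array_sorted` is made up, and I have not benchmarked it against the timings in this post.

```python
import numpy as np

def map_array_sorted(values, inval, outval):
    """Remap via binary search on sorted inputs (hypothetical helper)."""
    values = np.asarray(values)
    inval = np.asarray(inval)
    outval = np.asarray(outval)
    # Sort the correspondence by input value so it can be binary-searched.
    order = np.argsort(inval)
    sorted_in = inval[order]
    sorted_out = outval[order]
    # For each element of values, find its position in sorted_in.
    # Assumption: every element is present in inval; a missing value would
    # silently map to a neighbor or raise an IndexError.
    idx = np.searchsorted(sorted_in, values)
    return sorted_out[idx]

big = np.array([2**32, 2**32 + 1, 2**32], dtype=np.int64)
sorted_result = map_array_sorted(big, [2**32 + 1, 2**32], [1.0, 0.5])
```

The lookup costs O(log k) per element for k distinct inputs instead of O(1), so whether it beats the hash-map versions will depend on k and the array size.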
I thought I’d give Numba a go since dictionaries were implemented “recently”. (Thank you!) That turns out to be ~2x slower still than the C++ unordered_map approach.
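import numba

@numba.jit
def _map_array(inarr, outarr, inval, outval):
    # Build the lookup table from input values to output values.
    lut = {}
    for i in range(len(inval)):
        lut[inval[i]] = outval[i]
    # Remap every element of the (flattened) input array.
    for i in range(len(inarr)):
        outarr[i] = lut[inarr[i]]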
Measurement:
In [30]: nd._map_array(image.ravel(), outarr.ravel(), inval, outval)
In [31]: %timeit nd._map_array(image.ravel(), outarr.ravel(), inval, outval)
69.6 ms ± 1.35 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
Any ideas on how to speed this up?
Thank you!