Accelerate loops that use ctypes, c_char_p, numpy str_?

I wrote a Python library that wraps a C shared library with over 600 functions using pure ctypes and NumPy. The functions are typically simple (say, given a string, return a float), and for some functions I allow users to pass in lists or NumPy arrays, which I then loop through in Python, calling the function repeatedly and shepherding data to and from ctypes. But some users want to call certain functions millions of times, so I want to use Numba to JIT those wrapper functions and speed things up, without rewriting everything in Cython or SWIG.
Here is a pseudo-code example of some of the Python code I want to accelerate with Numba:

fs = []
f = ctypes.c_double()
for s in strings:  # strings in this case is a numpy.str_ array, but it could be a list of strings
    libsomething.str2float(s.encode("utf-8"), ctypes.byref(f))
    fs.append(f.value)
return numpy.array(fs)

What would be the best way to get Numba to work with these strings? c_char_p is not supported in Numba (numba/numba#3207), so I can't just use the jit decorators as-is. Would it be possible to get around this issue by enforcing that all lists/tuples/etc. become NumPy arrays and using the numpy.str_ datatype?

Any ideas on whether that could work, or proofs of concept, would be appreciated.
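For reference, one way to force everything into a bytes array that C code could read directly might be `np.char.encode` (a minimal sketch; the sample strings are just illustrative):

```python
import numpy as np

# A unicode ('U') array re-encoded into a fixed-width bytes ('S') array.
# NumPy stores 'U' arrays as UTF-32, so the encode step is still needed
# somewhere -- this just moves it out of the per-call loop.
strings = np.array(["1.5", "2.25"])         # dtype like '<U4'
encoded = np.char.encode(strings, "utf-8")  # dtype like 'S4', NUL-padded
```

Each row of `encoded` is then a contiguous, fixed-width byte string, which is the layout the pointer tricks discussed below rely on.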

I’d love to see a general purpose solution to this problem… It seems like one of those things that should be simple but isn’t (I’m sure for good reasons)

Numba will need to implement str.encode() for real by porting https://github.com/python/cpython/blob/8a64ceaf9856e7570cad6f5d628cce789834e019/Objects/stringlib/codecs.h#L262

If the strings were already encoded correctly in the NumPy array, could it work without str.encode()?

There may be a hack to do that. I guess you can have a NumPy array of bytes of the correctly encoded strings. Then you can just pass a pointer to the C library by doing pointer arithmetic from the base pointer; i.e. numpy_array.ctypes.data, or just something like numpy_array[item_index:].ctypes.data.

Note: I think NumPy uses UTF-32 internally (depending on compilation options) if you use its unicode char type.
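The pointer-arithmetic idea above can be sketched in plain ctypes (no Numba yet). Here libc's `atof` stands in for the wrapped C function, since it has the same shape as `str2float` (char* in, double out); `libsomething` from the original post would slot in the same way:

```python
import ctypes
import ctypes.util

import numpy as np

# Stand-in for the wrapped shared library: libc's atof takes a char*
# and returns a double, much like the str2float example above.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.atof.restype = ctypes.c_double
libc.atof.argtypes = [ctypes.c_char_p]

strings = ["1.5", "2.25", "3.0"]
# Pre-encode into a fixed-width, NUL-padded bytes array so each row is
# a contiguous C string (keep at least one spare byte for the NUL).
encoded = np.array([s.encode("utf-8") for s in strings], dtype="S16")

out = np.empty(len(encoded), dtype=np.float64)
base = encoded.ctypes.data          # base pointer of the buffer
itemsize = encoded.dtype.itemsize   # fixed width of each row
for i in range(len(encoded)):
    # Pointer arithmetic from the base pointer; no per-item encode().
    out[i] = libc.atof(ctypes.c_char_p(base + i * itemsize))
```

This loop is still pure Python, but because the data is already laid out as raw bytes, the per-call work reduces to address arithmetic, which is the part a JIT could in principle take over.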

I’ll give that a try :slight_smile: soonish

-Andrew Annex

This may actually be partially working: by casting the numpy array of strings to a bytes dtype I was able to get it to return the correct result, but I got a lot of Numba compilation issues. It seems that Numba does not understand `ctypes.c_double()` or `ctypes.byref`, even though from the docs it seems that Numba does support ctypes? Currently I am only using `@jit(nopython=False)`.

I am also trying a similar trick by creating an empty numpy array in the jit'd function, but I am running into surprising issues with something as simple as `res = np.empty(times.shape, dtype=np.float)`.
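For the `np.empty` issue specifically: `np.float` was only an alias for the Python builtin `float` (it was deprecated and later removed from NumPy), and Numba's type inference wants a concrete NumPy dtype, so `np.float64` should work where `np.float` fails. A minimal sketch, with a placeholder `times` array:

```python
import numpy as np

times = np.array([1.0, 2.0, 3.0])  # placeholder input

# np.float was just the builtin float; use an explicit NumPy dtype
# so Numba's type inference has something concrete to work with.
res = np.empty(times.shape, dtype=np.float64)
```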

specific warning:

Compilation is falling back to object mode WITH looplifting enabled because Function "nbstr2et" failed type inference due to: Unknown attribute 'c_double' of type Module(<module 'ctypes' from '/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ctypes/__init__.py'>)

File "", line 3:
def nbstr2et(times):
    et = ctypes.c_double()