Hey Numba group,
I’m passing NumPy arrays of strings with varying lengths to a jitted function. Numba compiles a separate specialization for each distinct string length, which increases compile time.
Is there a way to avoid this behavior, or is there a universal UnicodeCharSeq type that can be used in the signature to handle strings of varying lengths efficiently?
import numpy as np
import numba as nb
@nb.njit
def foo(arr):
    return arr
[foo(np.array(['a'*i], dtype=np.str_)) for i in range(1, 30)]
print(foo.signatures)
# [(Array(UnicodeCharSeq(1), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(2), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(3), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(4), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(5), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(6), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(7), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(8), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(9), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(10), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(11), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(12), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(13), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(14), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(15), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(16), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(17), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(18), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(19), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(20), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(21), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(22), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(23), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(24), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(25), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(26), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(27), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(28), 1, 'C', False, aligned=True),),
# (Array(UnicodeCharSeq(29), 1, 'C', False, aligned=True),)]