Why is Numba slow on a high performance computing cluster?

Following on from my previous post: rewriting alleleFreq and some of the other functions that use array expressions as explicit loops over scalar operations might improve the performance of your code with Numba, perhaps in conjunction with parallel=True. This post describes a similar-sounding situation in which that approach helped, and it might help guide your optimization efforts.
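As a rough illustration of what I mean, here is a minimal sketch. I don't know what your alleleFreq actually does, so `allele_freq_vectorized` and `allele_freq_loops` are hypothetical stand-ins: I'm assuming a 2D integer array of genotypes (individuals by loci, coded 0/1/2) and a per-locus frequency. The first version uses an array expression; the second spells the same calculation out as scalar loops, with `prange` marking the outer loop as parallelisable under `parallel=True`.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:
    # Fallback so the sketch still runs (slowly) where Numba is not installed.
    prange = range

    def njit(*args, **kwargs):
        def wrap(func):
            return func
        return wrap


def allele_freq_vectorized(genotypes):
    # Array-expression version: sum allele counts over individuals (axis 0),
    # then divide by the number of allele copies (2 per diploid individual).
    return genotypes.sum(axis=0) / (2.0 * genotypes.shape[0])


@njit(parallel=True)
def allele_freq_loops(genotypes):
    # Same calculation written as explicit loops over scalars.
    n_ind, n_loci = genotypes.shape
    freqs = np.empty(n_loci, dtype=np.float64)
    for j in prange(n_loci):      # loci are independent, so this loop can run in parallel
        total = 0.0
        for i in range(n_ind):    # plain sequential loop over individuals
            total += genotypes[i, j]
        freqs[j] = total / (2.0 * n_ind)
    return freqs


# Small made-up example: 2 individuals, 3 loci.
example = np.array([[0, 1, 2],
                    [1, 1, 0]])
print(allele_freq_vectorized(example))
print(allele_freq_loops(example))
```

When timing this, call the jitted function once beforehand so that compilation time isn't counted, and compare both versions on arrays of realistic size; whether the loop version wins will depend on your actual functions and data.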