Edit: added the quotes that I should have referenced in the first place…
Me too, a very stimulating conversation!
Both great insights… I got anchored to the ‘sort question’ and googled for a fast sort algorithm. @sschaer’s implementation broke through that and reduced the problem to core principles, demonstrating yet again that algorithms are the key to performance.
Right: Numba runs faster with unsigned indices into NumPy arrays, because it doesn’t have to check for negative-index wraparound… casting the index works as well:
a, b, c, d = sort4(im[uint64(i)], im[uint64(i + 1)], im[uint64(i + w)], im[uint64(i + w + 1)])
ce[uint64(i)] = (b + c) / 2
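For anyone who wants to try the two lines above in context, here is a minimal, self-contained sketch. The function name `median2x2`, the loop bounds, and the particular five-comparator `sort4` are my assumptions for illustration, not necessarily what @sschaer wrote; the fallback `njit` only exists so the snippet still runs where Numba isn't installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: fall back to plain Python so the sketch runs without Numba.
    def njit(func):
        return func


@njit
def sort4(a, b, c, d):
    # Five-comparator sorting network for four values (hypothetical stand-in
    # for the sort4 discussed in the thread).
    if a > b:
        a, b = b, a
    if c > d:
        c, d = d, c
    if a > c:
        a, c = c, a
    if b > d:
        b, d = d, b
    if b > c:
        b, c = c, b
    return a, b, c, d


@njit
def median2x2(im, w, ce):
    # im: flattened image, w: row width, ce: output array of the same length.
    # The loop stops early enough that i + w + 1 is always a valid index.
    for i in range(im.size - w - 1):
        # Casting each index to uint64 tells Numba it can never be negative.
        a, b, c, d = sort4(im[np.uint64(i)], im[np.uint64(i + 1)],
                           im[np.uint64(i + w)], im[np.uint64(i + w + 1)])
        # Median of four values is the mean of the two middle ones.
        ce[np.uint64(i)] = (b + c) / 2
```

Call it with a flattened float array, e.g. `median2x2(im.ravel(), im.shape[1], out)` where `out = np.zeros(im.size)`. Note that this sketch doesn't skip the last column of each row, so those 2x2 windows wrap into the next row.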