Numba crashing IPython kernel/python interpreter

Hey there,

as Numba sadly doesn’t support `np.nanmin` or `np.nanmax` for parallelization, I (and someone else) tried to implement it ourselves:

Code
```python
import numpy as np
import numba as nb
from math import isnan, inf

@nb.njit(fastmath=True)
def _minmax_nan(x):
    maximum = -inf
    minimum = inf
    for i in x:
        if not isnan(i):
            if i > maximum:
                maximum = i
            if i < minimum:
                minimum = i
    return minimum, maximum

@nb.njit(parallel=True)
def _minmax_chunks_nan(x, chunk_ranges):
    overall_maxima = []
    overall_minima = []
    for i in nb.prange(chunk_ranges.shape[0]):
        start = chunk_ranges[i, 0]
        end = chunk_ranges[i, 1]
        chunk_minimum, chunk_maximum = _minmax_nan(x[start:end])
        overall_maxima.append(chunk_maximum)
        overall_minima.append(chunk_minimum)
    return min(overall_minima), max(overall_maxima)

def even_chunk_sizes(dividend, divisor):
    quotient, remainder = divmod(dividend, divisor)
    cells = [quotient for _ in range(divisor)]
    for i in range(remainder):
        cells[i] += 1
    return cells

def even_chunk_ranges(dividend, divisor):
    sizes = even_chunk_sizes(dividend, divisor)
    ranges = []
    start = 0
    for s in sizes:
        end = start + s
        ranges.append((start, end))
        start = end
    return ranges

def nanminmax_parallel(x, n_chunks):
    chunk_ranges = np.array([
        [start, end]
        for start, end
        in even_chunk_ranges(len(x), n_chunks)
    ], dtype=np.int64)
    return _minmax_chunks_nan(x, chunk_ranges)
```
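For reference, the chunking helpers can be sanity-checked on their own without Numba; this is a standalone copy of the two pure-Python functions above, with one example call:

```python
# Standalone copy of the chunking helpers, runnable without numba.
# even_chunk_ranges(n, k) should cover [0, n) with no gaps or overlaps,
# and chunk sizes differing by at most one.

def even_chunk_sizes(dividend, divisor):
    quotient, remainder = divmod(dividend, divisor)
    cells = [quotient for _ in range(divisor)]
    for i in range(remainder):
        cells[i] += 1  # spread the remainder over the first chunks
    return cells

def even_chunk_ranges(dividend, divisor):
    ranges = []
    start = 0
    for s in even_chunk_sizes(dividend, divisor):
        ranges.append((start, start + s))
        start += s
    return ranges

print(even_chunk_ranges(10, 4))  # [(0, 3), (3, 6), (6, 8), (8, 10)]
```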

Doing things like this in a Jupyter notebook is a sure way to kill the kernel:

```python
arr = np.random.rand(10)
%timeit nanminmax_parallel(arr, 4)
```

Just calling the function in quick succession seems to cause a crash.

Can someone help with this issue? Why does it crash, seemingly at random?
Also, I’m pretty new to Numba and JIT compilation, so any suggestions for improving this piece of code would be much appreciated.

Best regards

PS: relevant GitHub repo with ipynb + Binder: GitHub - rynkk/misc-ipynbs

Hey rynkk ,

I don’t think it’s related to Jupyter (or timeit). I get similar crashes when running it from a normal .py file. When I run it in a loop, it crashes after between 0 and 5 executions (judging by what it prints to the terminal).

The crash disappears when I disable `parallel=True` in your `_minmax_chunks_nan` function. Perhaps appending to a list in parallel is not supported?

Regards,
Rutger

Hi @rynkk

@Rutger’s assessment is correct: the issue is that concurrent write operations on container types are not thread-safe. The docs for the upcoming 0.54.0 release have this highlighted: Automatic parallelization with @jit — Numba 0.54.0rc1+0.g9bed2ebb2.dirty-py3.7-linux-x86_64.egg documentation

Hi @stuartarchibald and @Rutger,
thank you very much for the hint. I have altered `_minmax_chunks_nan` as follows, and now it works flawlessly:

```python
@nb.njit(parallel=True)
def _minmax_chunks_nan(x, chunk_ranges):
    n_chunks = len(chunk_ranges)
    max_results = [-inf] * n_chunks
    min_results = [inf] * n_chunks
    for i in nb.prange(n_chunks):
        start = chunk_ranges[i, 0]
        end = chunk_ranges[i, 1]
        chunk_minimum, chunk_maximum = _minmax_nan(x[start:end])
        min_results[i] = chunk_minimum
        max_results[i] = chunk_maximum

    return min(min_results), max(max_results)
```