Errors due to UTF-8 docstrings, force docstring codec?

Hello,

I’m running into a string codec error when compiling code on a computing cluster. There are no problems when using the code locally. The issue seems to be caused by Numba wanting to use the ascii codec to process docstrings of jitted functions.

Example traceback:

Traceback (most recent call last):
  File "<string>", line 35, in val
  File "/home/157/lb4583/.local/lib/python3.10/site-packages/pydrex/minerals.py", line 319, in update_orientations
    solver = RK45(
  File "/home/157/lb4583/.local/lib/python3.10/site-packages/scipy/integrate/_ivp/rk.py", line 94, in __init__
    self.f = self.fun(self.t, self.y)
  File "/home/157/lb4583/.local/lib/python3.10/site-packages/scipy/integrate/_ivp/base.py", line 138, in fun
    return self.fun_single(t, y)
  File "/home/157/lb4583/.local/lib/python3.10/site-packages/scipy/integrate/_ivp/base.py", line 20, in fun_wrapped
    return np.asarray(fun(t, y), dtype=dtype)
  File "/home/157/lb4583/.local/lib/python3.10/site-packages/pydrex/minerals.py", line 236, in eval_rhs
    orientations_diff, fractions_diff = _core.derivatives(
  File "/home/157/lb4583/.local/lib/python3.10/site-packages/numba/core/dispatcher.py", line 468, in _compile_for_args
    error_rewrite(e, 'typing')
  File "/home/157/lb4583/.local/lib/python3.10/site-packages/numba/core/dispatcher.py", line 409, in error_rewrite
    raise e.with_traceback(None)
numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Failed in nopython mode pipeline (step: nopython frontend)
Internal error at <numba.core.typeinfer.CallConstraint object at 0x14ee4345c610>.
'ascii' codec can't decode byte 0xe2 in position 2819: ordinal not in range(128)
During: resolving callee type: type(CPUDispatcher(<function get_rrss at 0x14ee886c3490>))
During: typing of call at /home/157/lb4583/.local/lib/python3.10/site-packages/pydrex/core.py (328)

Enable logging at debug level for details.

File "../../../../../../home/157/lb4583/.local/lib/python3.10/site-packages/pydrex/core.py", line 328:
def _get_rotation_and_strain(
    <source elided>
    """
    rrss = _minerals.get_rrss(phase, fabric)
    ^

During: resolving callee type: type(CPUDispatcher(<function _get_rotation_and_strain at 0x14ee503ec3a0>))
During: typing of call at /home/157/lb4583/.local/lib/python3.10/site-packages/pydrex/core.py (278)

During: resolving callee type: type(CPUDispatcher(<function _get_rotation_and_strain at 0x14ee503ec3a0>))
During: typing of call at /home/157/lb4583/.local/lib/python3.10/site-packages/pydrex/core.py (278)


File "../../../../../../home/157/lb4583/.local/lib/python3.10/site-packages/pydrex/core.py", line 278:
def derivatives(
    <source elided>
    for grain_index in range(n_grains):
        orientation_change, strain_energy = _get_rotation_and_strain(

The relevant error message:

Failed in nopython mode pipeline (step: nopython frontend)
Internal error at <numba.core.typeinfer.CallConstraint object at 0x14ee4345c610>.
'ascii' codec can't decode byte 0xe2 in position 2819: ordinal not in range(128)

So my question is: can I somehow force Numba to use UTF-8 for docstrings, or to skip docstrings altogether?
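
For reference, the same message can be reproduced outside Numba. This is only a sketch: the sample text is hypothetical, and the character at position 2819 of the real input is not known. It shows that 0xe2 is the lead byte of the UTF-8 encoding of common typographic punctuation, such as the right single quotation mark (U+2019), and that decoding such bytes with the ascii codec fails in the same way.

```python
# Hypothetical docstring fragment containing a typographic apostrophe (U+2019).
text = "the grain\u2019s orientation"
raw = text.encode("utf-8")

# U+2019 is encoded as three bytes in UTF-8; the first one is 0xe2.
apostrophe_bytes = "\u2019".encode("utf-8")
print(apostrophe_bytes)  # b'\xe2\x80\x99'

# Decoding the same bytes as ASCII fails on that lead byte.
try:
    raw.decode("ascii")
    message = None
except UnicodeDecodeError as err:
    message = str(err)
    print(message)

# Decoding with the codec that was used for encoding round-trips cleanly.
roundtrip = raw.decode("utf-8")
```

If the byte in the real traceback comes from a character like this, the source file itself is likely fine and the problem is which codec gets picked when the text is read.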

A few more bits of circumstantial info:

  • By default, login nodes on the cluster don’t set LANG. I have now manually set it to a UTF-8 language code using ssh SetEnv
  • I redownloaded my code after logging in with the new LANG setting, and file -bi shows charset=utf-8 for my source files
  • This is a PBS cluster that uses the Environment Modules system for loading dependencies, so I load a Python interpreter with module load python/3.X.X and then run python -m pip install foo to get the Python packages (incl. Numba) that I need. In other words, it is possible to pip install a local package as well, so let me know if a dev version/git HEAD of Numba has a workaround or fix
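
A small diagnostic along these lines may help narrow things down. It is a sketch using only the standard library; the function name encoding_report is made up for illustration. It prints the encoding-related settings that the interpreter actually sees. Running it inside the batch job, not only on the login node, could be worthwhile, on the assumption that the job environment might not inherit the LANG value set through ssh.

```python
import locale
import os
import sys


def encoding_report():
    """Collect the encoding-related settings visible to this interpreter."""
    return {
        # Locale-related environment variables (None if unset).
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        "PYTHONUTF8": os.environ.get("PYTHONUTF8"),
        # Encoding used by default for open() and similar text I/O.
        "preferred_encoding": locale.getpreferredencoding(False),
        # Encoding used for file names and other OS-level strings.
        "filesystem_encoding": sys.getfilesystemencoding(),
        # Default for str.encode()/bytes.decode() without arguments.
        "default_encoding": sys.getdefaultencoding(),
        # True when Python's UTF-8 mode is active (-X utf8 or PYTHONUTF8=1).
        "utf8_mode": bool(sys.flags.utf8_mode),
    }


report = encoding_report()
for key, value in report.items():
    print(f"{key}: {value!r}")
```

If preferred_encoding comes back as an ASCII variant on the compute nodes, that would be consistent with the error above, although it would not by itself show where the ascii decode happens.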

Hi @adigitoleo,

This looks like it could be a bug/issue in Numba itself. Any chance you could please open an issue on the Numba issue tracker with a minimal working reproducer?

It also might be possible to get a more specific traceback by setting the environment variable:

NUMBA_CAPTURED_ERRORS="new_style"

(docs: the "Environment variables" page of the Numba 0.56.4 documentation)

which will treat the exception raised (presumably) in the decoder as a signal to stop compilation and provide a full traceback.
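
One way to apply this from a Python entry point is sketched below. It assumes the variable needs to be in place before numba is first imported, since Numba reads its configuration from the environment on import; exporting it in the job script before launching Python should have the same effect.

```python
import os

# Set this before the first "import numba" anywhere in the process
# (assumption: Numba picks up its configuration when it is imported).
os.environ["NUMBA_CAPTURED_ERRORS"] = "new_style"

# Import numba, or the package that uses it, only after this point, e.g.:
# import numba
```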

Many thanks.

Am currently away but will try to come up with a repro in early December. Thanks for the pointers.