AOT Compilation Succeeds - But Import Still Slow

In a library I’m building, Numba AOT compilation succeeds, but the initial import of the library still takes 10-15 seconds. This seems too long. Is Numba re-compiling the library at import? Can I make the initial import quicker?

To be clearer, here is the library structure:

numbastats/
    - __init__.py
    - beta.py
    - config.py
    - gamma.py
setup.py
...(other irrelevant files)...

The setup.py file is:

from setuptools import setup, find_packages
from numbastats.beta import cc_beta
from numbastats.gamma import cc_gamma

setup(
    name="numbastats",
    version="0.2.1",
    description="Numba `@njit` compatible hypothesis test statistics (and underlying functions).",
    author="Ryan Chien",
    license="Proprietary",
    zip_safe=False,
    classifiers=[
        "Development Status :: 3 - Alpha",
        "Intended Audience :: Science/Research",
        "Topic :: Scientific/Engineering :: Mathematics",
        "Programming Language :: Python :: 3.8",
        "Programming Language :: Python :: 3.7",
        "Programming Language :: Python :: 3 :: Only",
        "License :: Other/Proprietary License",
        "Operating System :: OS Independent",
    ],
    keywords="statistics probability f-distribution beta gamma psi",
    project_urls={},
    packages=find_packages(),
    ext_modules=[
        cc_beta.distutils_extension(),
        cc_gamma.distutils_extension()
    ],
    install_requires=[
        "numpy",
        "scipy",
        "numba"
    ],
    python_requires=">=3",
    test_suite="tests"
)

A heavily truncated version of beta.py is shown below. Sorry, I cannot post more at the moment; I haven’t received permission to release the code in its entirety.

import math
import numpy as np
from numba import njit
from numba.pycc import CC as nbcc
from .gamma import algdiv, gam1, gamln1, gamma, psi, gammainc
from .config import FMTRUE, NGTRUE, CTARG

cc_beta = nbcc('cc_beta')
cc_beta.target_cpu = CTARG

@njit('f8(f8,f8,f8,f8)', fastmath=FMTRUE, nogil=NGTRUE)
@cc_beta.export('_basym', 'f8(f8,f8,f8,f8)')
def _basym(a: float, b: float, lam: float, eps: float):
    """ Perform the asymptotic expansion for the incomplete
        beta function.
    """
    # imagine python code here

@njit('f8(f8,f8,f8,f8)', fastmath=FMTRUE, nogil=NGTRUE)
@cc_beta.export('_bpser', 'f8(f8,f8,f8,f8)')
def _bpser(a, b, x, eps):
    """ Evaluate the incomplete beta function using the binomial
        expansion.
    """
    # imagine python code here

And python setup.py install returns the following:

(numba-stats) root@b671ebd7c405:/workspaces/numba-stats# python setup.py install
running install
running bdist_egg
running egg_info
writing numbastats.egg-info/PKG-INFO
writing dependency_links to numbastats.egg-info/dependency_links.txt
writing requirements to numbastats.egg-info/requires.txt
writing top-level names to numbastats.egg-info/top_level.txt
reading manifest file 'numbastats.egg-info/SOURCES.txt'
writing manifest file 'numbastats.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
generating LLVM code for 'cc_f' into build/temp.linux-x86_64-3.7/cc_f.cpython-37m-x86_64-linux-gnu.o
generating LLVM code for 'cc_helpers' into build/temp.linux-x86_64-3.7/cc_helpers.cpython-37m-x86_64-linux-gnu.o
generating LLVM code for 'cc_beta' into build/temp.linux-x86_64-3.7/cc_beta.cpython-37m-x86_64-linux-gnu.o
generating LLVM code for 'cc_gamma' into build/temp.linux-x86_64-3.7/cc_gamma.cpython-37m-x86_64-linux-gnu.o
creating build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/numbastats
copying build/lib.linux-x86_64-3.7/numbastats/__init__.py -> build/bdist.linux-x86_64/egg/numbastats
copying build/lib.linux-x86_64-3.7/numbastats/beta.py -> build/bdist.linux-x86_64/egg/numbastats
copying build/lib.linux-x86_64-3.7/numbastats/cc_beta.cpython-37m-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/egg/numbastats
copying build/lib.linux-x86_64-3.7/numbastats/cc_f.cpython-37m-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/egg/numbastats
copying build/lib.linux-x86_64-3.7/numbastats/cc_gamma.cpython-37m-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/egg/numbastats
copying build/lib.linux-x86_64-3.7/numbastats/cc_helpers.cpython-37m-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/egg/numbastats
copying build/lib.linux-x86_64-3.7/numbastats/config.py -> build/bdist.linux-x86_64/egg/numbastats
copying build/lib.linux-x86_64-3.7/numbastats/f.py -> build/bdist.linux-x86_64/egg/numbastats
copying build/lib.linux-x86_64-3.7/numbastats/gamma.py -> build/bdist.linux-x86_64/egg/numbastats
copying build/lib.linux-x86_64-3.7/numbastats/helpers.py -> build/bdist.linux-x86_64/egg/numbastats
copying build/lib.linux-x86_64-3.7/numbastats/legacy.py -> build/bdist.linux-x86_64/egg/numbastats
byte-compiling build/bdist.linux-x86_64/egg/numbastats/__init__.py to __init__.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/numbastats/beta.py to beta.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/numbastats/config.py to config.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/numbastats/f.py to f.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/numbastats/gamma.py to gamma.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/numbastats/helpers.py to helpers.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/numbastats/legacy.py to legacy.cpython-37.pyc
creating stub loader for numbastats/cc_f.cpython-37m-x86_64-linux-gnu.so
creating stub loader for numbastats/cc_helpers.cpython-37m-x86_64-linux-gnu.so
creating stub loader for numbastats/cc_beta.cpython-37m-x86_64-linux-gnu.so
creating stub loader for numbastats/cc_gamma.cpython-37m-x86_64-linux-gnu.so
byte-compiling build/bdist.linux-x86_64/egg/numbastats/cc_f.py to cc_f.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/numbastats/cc_helpers.py to cc_helpers.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/numbastats/cc_beta.py to cc_beta.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/numbastats/cc_gamma.py to cc_gamma.cpython-37.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying numbastats.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying numbastats.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying numbastats.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying numbastats.egg-info/not-zip-safe -> build/bdist.linux-x86_64/egg/EGG-INFO
copying numbastats.egg-info/requires.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying numbastats.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt
creating 'dist/numbastats-0.2.1-py3.7-linux-x86_64.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing numbastats-0.2.1-py3.7-linux-x86_64.egg
removing '/opt/conda/envs/numba-stats/lib/python3.7/site-packages/numbastats-0.2.1-py3.7-linux-x86_64.egg' (and everything under it)
creating /opt/conda/envs/numba-stats/lib/python3.7/site-packages/numbastats-0.2.1-py3.7-linux-x86_64.egg
Extracting numbastats-0.2.1-py3.7-linux-x86_64.egg to /opt/conda/envs/numba-stats/lib/python3.7/site-packages
numbastats 0.2.1 is already the active version in easy-install.pth

Installed /opt/conda/envs/numba-stats/lib/python3.7/site-packages/numbastats-0.2.1-py3.7-linux-x86_64.egg
Processing dependencies for numbastats==0.2.1
Searching for numba==0.51.2
Best match: numba 0.51.2
Adding numba 0.51.2 to easy-install.pth file

Using /opt/conda/envs/numba-stats/lib/python3.7/site-packages
Searching for scipy==1.5.2
Best match: scipy 1.5.2
Adding scipy 1.5.2 to easy-install.pth file

Using /opt/conda/envs/numba-stats/lib/python3.7/site-packages
Searching for numpy==1.19.1
Best match: numpy 1.19.1
Adding numpy 1.19.1 to easy-install.pth file
Installing f2py script to /opt/conda/envs/numba-stats/bin
Installing f2py3 script to /opt/conda/envs/numba-stats/bin
Installing f2py3.7 script to /opt/conda/envs/numba-stats/bin

Using /opt/conda/envs/numba-stats/lib/python3.7/site-packages
Searching for llvmlite==0.34.0
Best match: llvmlite 0.34.0
Adding llvmlite 0.34.0 to easy-install.pth file

Using /opt/conda/envs/numba-stats/lib/python3.7/site-packages
Searching for setuptools==49.6.0.post20200917
Best match: setuptools 49.6.0.post20200917
Adding setuptools 49.6.0.post20200917 to easy-install.pth file
Installing easy_install script to /opt/conda/envs/numba-stats/bin

Using /opt/conda/envs/numba-stats/lib/python3.7/site-packages
Finished processing dependencies for numbastats==0.2.1

Thanks!

Hi @ryanchien,

How are you importing this library once you have an AOT compiled C-extension?

Thanks.

Hi!

I’m importing via import numbastats as ns or from numbastats.beta import betainc. An example is below. Is this the info you’re looking for?

(numba-stats) root@2ec0eb62c6d7:/usr# python
Python 3.7.3 | packaged by conda-forge | (default, Dec  6 2019, 08:54:18) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.listdir()
['games', 'share', 'sbin', 'include', 'local', 'src', 'bin', 'lib']
>>> import numbastats as ns
>>> ns.betainc(10.5, 3.5, 0.6)
0.10357928541507373

Thanks, what I am wondering is whether you are importing the python module or the C extension. As the code above isn’t a reproducer, I can’t really tell.

The C extension you are compiling takes its name from the name passed when instantiating numba.pycc.CC for the exports, whereas the Python module follows the standard Python naming scheme. So if you have cc_beta = nbcc('cc_beta') in your code, you’ll get a C extension library named cc_beta.cpython-<some-stuff>.<library extension>, e.g. cc_beta.cpython-38-x86_64-linux-gnu.so. This is the AOT library, and it can be imported with from mymodule import cc_beta. Is this what you are doing, and is this what takes 15s to import?
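One way to check which kind of module an import actually resolved to is to look at its loader. The sketch below uses only the standard library; describe_module and expected_extension_filename are hypothetical helper names (not part of Numba), and the suffix printed depends on the interpreter and platform.

```python
import importlib.machinery
import sys


def expected_extension_filename(cc_name):
    """Filename a C extension called `cc_name` would typically have here."""
    # EXTENSION_SUFFIXES lists the suffixes this interpreter accepts for
    # extension modules; the first entry is normally the fully tagged one.
    return cc_name + importlib.machinery.EXTENSION_SUFFIXES[0]


def describe_module(module):
    """Classify an imported module as 'extension', 'source', or 'other'."""
    spec = getattr(module, "__spec__", None)
    loader = getattr(spec, "loader", None)
    if isinstance(loader, importlib.machinery.ExtensionFileLoader):
        return "extension"  # compiled .so/.pyd, e.g. an AOT-built module
    if isinstance(loader, importlib.machinery.SourceFileLoader):
        return "source"     # ordinary .py file
    return "other"          # built-in, frozen, namespace package, ...


print(expected_extension_filename("cc_beta"))
print(describe_module(sys))  # sys is built in, so it is neither of the first two
```

Assuming the package layout from this thread, describe_module(numbastats.cc_beta) would be expected to report 'extension', while describe_module(numbastats.beta) would be expected to report 'source'.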

Oh - I am certainly not. Let me give that a try. Thanks, I figured it would be a simple thing I missed.

I’m still missing something:

test_importcext.py

import unittest
import time
import os

class test_import_cext(unittest.TestCase):
    """ Unit tests for imports. """

    def test_import_cc(self):
        ts0 = time.time()
        from numbastats import cc_beta
        te0 = time.time()
        print('Time in seconds import numbastats.cc_beta: %s' % str(te0 - ts0))

(numba-stats) root@2ec0eb62c6d7:/workspaces# python -m unittest numba-stats/tests/test_importcext.py
Time in seconds import numbastats.cc_beta: 24.013480186462402
.
----------------------------------------------------------------------
Ran 1 test in 24.014s

OK

@stuartarchibald, should my __init__.py import cc_beta or beta or both? Thanks.

@ryanchien how long does a plain import cc_beta take when run from the directory in which the extension exists? If that also takes 15 seconds, or whatever it was, then I think either the module is huge or there’s potentially a problem somewhere in what was generated. If it’s much less than 15 seconds, then I suspect the issue lies somewhere else in the import sequence for your package, quite possibly something that triggers a lot of compilation.

Another way to help test this might be to remove Numba from your environment; as an AOT module, it should work without Numba being present. If your package refuses to import due to Numba compilation errors etc., it hints that there’s compilation occurring on import. TBH I suspect that this is the problem: import mymodule.beta will likely trigger compilation, as declarations with a signature specified, like @njit('f8(f8,f8,f8,f8)', fastmath=FMTRUE, nogil=NGTRUE), force JIT compilation on import.
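A lighter-weight variant of the "remove Numba" test, sketched here with the standard library only, is to make the import of a package fail temporarily and time the import of interest. module_blocked and timed_import are hypothetical helper names; the approach relies on the documented behaviour that a None entry in sys.modules makes the import raise an error. It only gives a meaningful answer in a fresh interpreter where the package under test has not been imported yet.

```python
import importlib
import sys
import time
from contextlib import contextmanager


@contextmanager
def module_blocked(name):
    """Temporarily make `import name` fail, as if it were not installed."""
    missing = object()
    saved = sys.modules.get(name, missing)
    sys.modules[name] = None  # a None entry makes the import raise ImportError
    try:
        yield
    finally:
        # Restore whatever was there before entering the block.
        if saved is missing:
            sys.modules.pop(name, None)
        else:
            sys.modules[name] = saved


def timed_import(module_name):
    """Import `module_name` and return (module, elapsed seconds)."""
    start = time.perf_counter()
    module = importlib.import_module(module_name)
    return module, time.perf_counter() - start


# Demonstration on a standard-library module so the sketch runs anywhere.
module, seconds = timed_import("colorsys")
print(f"{module.__name__} imported in {seconds:.4f}s")

# For the package in this thread, the check would look like:
#     with module_blocked("numba"):
#         timed_import("numbastats.cc_beta")
# An ImportError mentioning numba would indicate that something on the
# import path still depends on Numba at import time.
```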

Running import cc_beta takes only about 0.4 seconds from the extension directory:

(numba-stats) root@2ec0eb62c6d7:/workspaces/numba-stats/build/lib.linux-x86_64-3.7/numbastats# python
Python 3.7.3 | packaged by conda-forge | (default, Dec  6 2019, 08:54:18) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.getcwd()
'/workspaces/numba-stats/build/lib.linux-x86_64-3.7/numbastats'
>>> os.listdir()
['__init__.py', 'beta.py', 'cc_beta.cpython-37m-x86_64-linux-gnu.so', 'cc_error.cpython-37m-x86_64-linux-gnu.so', 'cc_f.cpython-37m-x86_64-linux-gnu.so', 'cc_gamma.cpython-37m-x86_64-linux-gnu.so', 'cc_helpers.cpython-37m-x86_64-linux-gnu.so', 'config.py', 'error.py', 'f.py', 'gamma.py', 'helpers.py', 'legacy.py']
>>> import time
>>> t0 = time.time(); import cc_beta; t1 = time.time();
>>> t1 - t0
0.3951716423034668
>>> cc_beta.betainc(10, 20, 0.2)
0.04926351730421266

This suggests that something in my code is triggering compilation. Trying to import the library after uninstalling numba confirms this: the import errors with ModuleNotFoundError: No module named 'numba'.

What can I do to solve this problem? Many of the functions are interdependent.
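One direction that might help, offered as a sketch under assumptions rather than a confirmed fix: have the package's __init__.py prefer the compiled extension and only fall back to the JIT source module when the extension is unavailable, so that a normal import never touches the @njit declarations. first_importable is a hypothetical helper, and this does not by itself address how functions in different extensions call each other.

```python
import importlib


def first_importable(candidates, package=None):
    """Return the first module in `candidates` that imports, tried in order."""
    errors = []
    for name in candidates:
        try:
            return importlib.import_module(name, package=package)
        except ImportError as exc:
            errors.append(f"{name}: {exc}")
    raise ImportError("none of the candidates could be imported: " + "; ".join(errors))


# Demonstration with a name that does not exist followed by one that does.
module = first_importable(["no_such_module_hypothetical", "math"])
print(module.__name__)

# In numbastats/__init__.py this might look like (assumed layout from the thread):
#     _beta = first_importable([".cc_beta", ".beta"], package=__name__)
#     betainc = _beta.betainc
```

With this arrangement, the slow path (importing beta and compiling) would only be taken when the AOT build is missing.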