Numba with multiprocessing

Hello,
I’ve tried to run some njit functions with Python’s multiprocessing, as in the minimal example code below.
But, somehow, running with multiprocessing is slower (around 0.01x the speed of a normal run).

I suspect that it’s because of compilation overhead: each process has to compile the njit function by itself.
Is there any workaround to make the multiprocessing version faster?

Thank you

Update: it’s not about compilation overhead. I’ve pre-compiled the function following the "Compiling code ahead of time" guide (Numba 0.52.0.dev0+274.g626b40e-py3.7-linux-x86_64.egg documentation) and called it from multiprocessing, but the speed is still the same.

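import multiprocessing

import numba


@numba.njit()
def a(arg):
    # placeholder for a CPU-intensive task
    pass


@numba.njit()
def b(arg):
    # placeholder for a CPU-intensive task
    pass


@numba.njit()
def run_all(arg):
    # run both tasks on the same argument
    a(arg)
    b(arg)


if __name__ == "__main__":
    # placeholder inputs (not shown in the post): one argument tuple per call
    args = [(i,) for i in range(1000)]

    # running normally
    for call_args in args:
        run_all(*call_args)

    # running with multiprocessing
    with multiprocessing.Pool(8) as p:
        p.starmap(run_all, args)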
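
One way to check whether compilation is really the bottleneck, as the update above discusses, is to time the first call of a jitted function separately from a later call. The sketch below is only an illustration: sum_of_squares and time_call are hypothetical names that do not appear in the original post, and the fallback decorator is an assumption added so the snippet still runs where Numba is not installed (in that case there is no compilation step to observe).

```python
import time

try:
    from numba import njit
except ImportError:
    # Assumption: fall back to plain Python if Numba is missing,
    # so the sketch still runs (without any JIT compilation).
    def njit(func):
        return func


@njit
def sum_of_squares(n):
    # Hypothetical stand-in for a CPU-intensive task.
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_call(func, *args):
    """Call func(*args) once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # With Numba, the first call includes JIT compilation for this argument type.
    _, first = time_call(sum_of_squares, 1_000_000)
    # The second call reuses the already compiled code.
    _, second = time_call(sum_of_squares, 1_000_000)
    print(f"first call:  {first:.4f} s")
    print(f"second call: {second:.4f} s")
```

If the two timings are close, compilation is probably not what makes the multiprocessing run slow. If the first call dominates, Numba's cache=True option on the decorator may be worth trying, since it stores compiled code on disk for reuse.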

Hi @doliolarzz,
Are you sure Numba is a factor in this? Have you tried running the exact same code without the njit decorator?
The reason I ask is that multiprocessing adds some overhead (starting worker processes, and pickling arguments and results to pass them between processes), so any simple function (njit or normal Python) will be slower when run in parallel. Only when each task is complex enough will you see an advantage.
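
A rough way to see this effect without Numba is to time a trivial task and a heavier one, each run serially and through a pool. This is a minimal sketch under stated assumptions: cheap_task, heavy_task and timed are hypothetical names, and the pool size and input sizes are arbitrary choices for illustration.

```python
import multiprocessing
import time


def cheap_task(x):
    # Hypothetical trivial task: far cheaper than sending its
    # argument to a worker process and the result back.
    return x * x


def heavy_task(n):
    # Hypothetical CPU-bound task with much more work per call.
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed(label, func):
    """Run func(), print how long it took, and return its result."""
    start = time.perf_counter()
    result = func()
    print(f"{label}: {time.perf_counter() - start:.3f} s")
    return result


if __name__ == "__main__":
    cheap_args = list(range(20_000))
    heavy_args = [500_000] * 16

    with multiprocessing.Pool(4) as pool:
        serial_cheap = timed("cheap, serial", lambda: [cheap_task(x) for x in cheap_args])
        # chunksize=1 sends one item per message, which makes the overhead visible.
        pool_cheap = timed("cheap, pool  ", lambda: pool.map(cheap_task, cheap_args, chunksize=1))
        serial_heavy = timed("heavy, serial", lambda: [heavy_task(n) for n in heavy_args])
        pool_heavy = timed("heavy, pool  ", lambda: pool.map(heavy_task, heavy_args))

    # Both execution styles must produce identical results.
    assert serial_cheap == pool_cheap
    assert serial_heavy == pool_heavy
```

The actual timings depend on the machine and the start method, so none are claimed here. The point is only to compare the serial and pool lines for each kind of task.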

Luk