Skip lower_builtin impl for CPU runs

Hi,
I have a function that is shared between GPU and CPU (njit) runs.
It calls a math function, rcbrt (essentially x ** -(1 / 3)).
For GPU runs, we currently implement “impl_rcbrt” and use @lower_builtin to point it to __nv_rcbrt.
My question is: how can I skip that whole implementation for CPU runs? I am fine with directly calling a function that returns x ** -(1 / 3). Or is there a way to write a similar implementation for the CPU? (Since rcbrt is not in the standard C++ math library, I can’t simply change “__nv_rcbrt” to “rcbrt”. For functions like cbrt, that would work.)
Thanks in advance.

Hi ZzzCesare,

Apologies for the slow response here; this passed me by when it was first posted. Did you manage to work out a way to use a different lowering implementation on the CPU target?

If not, could you post the code implementing your lowering, and a small example use case, and we can see if there’s a way to implement what you’re looking for?
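
In the meantime, here is a rough sketch of the CPU-only fallback you describe, just to have something concrete to discuss. The names rcbrt_cpu and scale are hypothetical and not taken from your code. The try/except around the import is only there so the snippet also runs as plain Python without Numba installed. It uses copysign because, as far as I know, a plain x ** -(1 / 3) gives NaN for negative floats in compiled code, whereas a reciprocal cube root is real for negative inputs.

```python
import math

try:
    from numba import njit
except ImportError:
    # Assumption: fall back to plain Python so the sketch still runs
    # when Numba is not installed.
    def njit(func):
        return func


@njit
def rcbrt_cpu(x):
    # Hypothetical CPU fallback: reciprocal cube root, x ** -(1 / 3).
    # Taking abs() first and restoring the sign keeps negative inputs real.
    return math.copysign(abs(x) ** (-1.0 / 3.0), x)


@njit
def scale(x):
    # Hypothetical caller, standing in for the function shared with the GPU run.
    return 2.0 * rcbrt_cpu(x)
```

This only covers the CPU side. How to make the same call resolve to __nv_rcbrt on the GPU and to something like this on the CPU depends on how your lowering is registered, which is why seeing your code would help.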