Passing tuples to device functions

I’m trying to call a device function (union), passing it two tuples (x, y and z are integers), like so:

union(labels_matrix, (z, y, x), (z - 2, y - 2, x - 2))

I get the following error:

No implementation of function Function(<numba.cuda.compiler.DeviceDispatcher object at 0x000002E674B31910>) found for signature:
 
 >>> union <CUDA device function>(array(uint32, 3d, C), UniTuple(int64 x 3), UniTuple(int64 x 3))
 
There are 2 candidate implementations:
      - Of which 2 did not match due to:
      Overload in function 'union <CUDA device function>': File: ..\..\..\Segmentation.py: Line 951.
        With argument(s): '(array(uint32, 3d, C), UniTuple(int64 x 3), UniTuple(int64 x 3))':
       Rejected as the implementation raised a specific error:
         TypingError: Failed in cuda mode pipeline (step: nopython frontend)
       Cannot unify UniTuple(int64 x 3) and Tuple(UniTuple(int64 x 3), int64) for 'p1.3', defined at D:/Work/College Work/Thesis/Segmentation/Segmentation.py (956)
       
       File "Segmentation.py", line 956:
       def union(labels_matrix, p1, p2):
           <source elided>
           while not done:
               p1, p2 = find(labels_matrix, p1), find(labels_matrix, p2)
               ^
       
       During: typing of assignment at D:/.../Segmentation.py (956)
       
       File "Segmentation.py", line 956:
       def union(labels_matrix, p1, p2):
           <source elided>
           while not done:
               p1, p2 = find(labels_matrix, p1), find(labels_matrix, p2)
               ^

I should add that union calls another function (find) with the same tuples.
Is the problem that tuples can’t be passed to device functions, or is there something else I’m missing?

Thanks for the assistance

This looks like there’s some inconsistent treatment of p1 in your kernel - in some places it’s treated like a tuple of three integers (UniTuple(int64 x 3)), and in others as a tuple containing a tuple of three integers, and one more integer (Tuple(UniTuple(int64 x 3), int64)).
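
To make the mismatch concrete, here is a small plain-Python sketch (no Numba or GPU needed) of the two shapes that end up bound to p1. The helper describe is hypothetical and only mimics how a statically typed compiler sees the values; it is not part of Numba, and the value 123 is an arbitrary stand-in for the extra integer.

```python
def describe(value):
    """Return a nested description of a value's structure (illustrative helper)."""
    if isinstance(value, tuple):
        return tuple(describe(item) for item in value)
    return type(value).__name__


# What the caller passes in: three integer coordinates.
p1 = (4, 5, 6)
# What the error message says p1 becomes: a 3-tuple plus one more integer.
found = (p1, 123)

print(describe(p1))     # ('int', 'int', 'int')
print(describe(found))  # (('int', 'int', 'int'), 'int')
```

The two descriptions differ, which is roughly what the "Cannot unify" message is reporting: a single name is being asked to hold both structures.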

Thanks for the reply.
That does help me identify the problem, though I’m not sure why it happens.
I do reuse p1: it is passed to find as UniTuple(int64 x 3), and the result, a Tuple(UniTuple(int64 x 3), int64), is assigned back to it. But I only read from that result and, since this happens in a loop, I reassign p1 to a UniTuple(int64 x 3) before the next iteration.
In fact, simply changing the variable names solved the issue. That being said, I assume there is no problem with calling device functions with tuples, so that question is answered.

Hi @JRibeiro

If the code for union is not too unwieldy and you are allowed to share it, then posting it here might help us see what the problem is.

I am not sure I fully follow your last explanation, but it sounds as if p1 changes from a 3-tuple to a (3-tuple, 1)-tuple and back to a 3-tuple. I have noticed before that Numba does not like this kind of thing. The type of a variable should not change over the course of your function (keep in mind that it is compiled with static types). Whether this is by design or just a “shortfall” of Numba, I do not know. But it may explain why changing the variable name helped: it allows Numba to assign a suitable type to the new variable name.

I think this is just a present shortcoming in Numba, though it may improve in future (with @stuartarchibald's work on SSA in the pipeline).

Thanks for the interest.

I think it is exactly as you both said. The variables p1/p2 are indeed being changed from UniTuple(int64 x 3) to Tuple(UniTuple(int64 x 3), int64) and that is what is causing the problem.
Here’s union:

@cuda.jit(device=True)
def union(labels_matrix, p1, p2):
    done = False

    while not done:
        p1, p2 = find(labels_matrix, p1), find(labels_matrix, p2)
        if p1[1] < p2[1]:
            old = cuda.atomic.min(labels_matrix, p2[0], p1[1])
            done = old == p2[1]
            p2 = get_3d_index(labels_matrix.shape, old)
        elif p2[1] < p1[1]:
            old = cuda.atomic.min(labels_matrix, p1[0], p2[1])
            done = old == p1[1]
            p1 = get_3d_index(labels_matrix.shape, old)
        else:
            done = True

p1 and p2 are passed to union as UniTuple(int64 x 3), but the find function turns them into Tuple(UniTuple(int64 x 3), int64) (concretely, the 3D coordinates plus their linear index). I thought something was off because the code should technically work, whether or not the reassignments are good practice: p1 and p2 are both changed back to 3D coordinates in case of a next iteration.
Storing the results of find in new variables p1_t/p2_t, instead of reassigning p1/p2, solved the issue:

@cuda.jit(device=True)
def union(labels_matrix, p1, p2):
    done = False

    while not done:
        p1_t, p2_t = find(labels_matrix, p1), find(labels_matrix, p2)
        if p1_t[1] < p2_t[1]:
            old = cuda.atomic.min(labels_matrix, p2_t[0], p1_t[1])
            done = old == p2_t[1]
            p2 = get_3d_index(labels_matrix.shape, old)
        elif p2_t[1] < p1_t[1]:
            old = cuda.atomic.min(labels_matrix, p1_t[0], p2_t[1])
            done = old == p1_t[1]
            p1 = get_3d_index(labels_matrix.shape, old)
        else:
            done = True
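
The thread never shows find or get_3d_index, so the following is only a CPU sketch in plain Python of how the renamed version might fit together. Everything except the control flow of union is an assumption: labels live in a dict keyed by (z, y, x), each voxel initially holds its own row-major linear index, find follows labels to a root, get_linear_index is a hypothetical helper, and atomic_min is a sequential stand-in for cuda.atomic.min. Because a dict has no .shape, the shape is passed explicitly.

```python
def get_linear_index(shape, p):
    """Row-major linear index of coordinates p = (z, y, x). Assumed layout."""
    _, height, width = shape
    z, y, x = p
    return (z * height + y) * width + x


def get_3d_index(shape, index):
    """Inverse of get_linear_index: linear index -> (z, y, x)."""
    _, height, width = shape
    z, rest = divmod(index, height * width)
    y, x = divmod(rest, width)
    return (z, y, x)


def find(labels, shape, p):
    """Follow labels until a voxel points at itself; return (coords, label)."""
    label = labels[p]
    while label != get_linear_index(shape, p):
        p = get_3d_index(shape, label)
        label = labels[p]
    return p, label


def atomic_min(labels, p, value):
    """Sequential stand-in for cuda.atomic.min: store the min, return the old value."""
    old = labels[p]
    labels[p] = min(old, value)
    return old


def union(labels, shape, p1, p2):
    """Same control flow as the renamed device function; p1/p2 stay 3-tuples."""
    done = False
    while not done:
        # find results get their own names, so p1/p2 never change type.
        p1_t, p2_t = find(labels, shape, p1), find(labels, shape, p2)
        if p1_t[1] < p2_t[1]:
            old = atomic_min(labels, p2_t[0], p1_t[1])
            done = old == p2_t[1]
            p2 = get_3d_index(shape, old)
        elif p2_t[1] < p1_t[1]:
            old = atomic_min(labels, p1_t[0], p2_t[1])
            done = old == p1_t[1]
            p1 = get_3d_index(shape, old)
        else:
            done = True


shape = (1, 1, 4)
labels = {get_3d_index(shape, i): i for i in range(4)}
union(labels, shape, (0, 0, 1), (0, 0, 3))
print(find(labels, shape, (0, 0, 3)))  # ((0, 0, 1), 1)
```

Note that each branch reassigns only one of p1/p2. With separate names for the find results, the other one simply keeps its 3-tuple; with the original names it would apparently still hold the (coordinates, index) pair when the next iteration calls find again.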