How to manipulate lexsort-like operation using Numba Cuda and swap rows

I have two 3-dimensional arrays, one is called the ‘Feature_Arr’ and the other is ‘Centroids_Arr’.

Each 2d array in the Feature_Arr is like:

[[-0.16578992,  1.302     ,  1.        ,  1.        ],
 [ 0.        ,  0.        ,  0.        ,  1.        ],
 [ 0.00392716,  0.2551    ,  1.        ,  0.        ],
 [ 0.16186276,  1.1104    ,  1.        ,  1.        ]]

and each 2d array in the Centroid_Arr is like:

[[ -4.3150377 ,   2.586869  ,   0.9627161 ],
 [ -1.0964384 ,   1.350665  ,   0.28387254],
 [ -6.1319566 ,  -2.8796217 ,  -1.9866784 ],
 [ -8.299599  ,  -7.908891  ,  -4.6172056 ]]

I’d like to sort each 2D Feature_Arr based on the ascending of the first column, if there are equal numbers, then sorting based on the corresponding numbers from the second column. Then swap the rows of both Feature_Arr and Centroid_Arr based on the sorting result of the Feature_Arr.

This is what I’ve tried:

import numpy as np
from numba import cuda

@cuda.jit(device=True)
def sort_2d_arrays(feature_arr, centroids_arr):
    for i in range(feature_arr.shape[0] - 1):
        for j in range(feature_arr.shape[0] - i - 1):
            if (feature_arr[j, 0] > feature_arr[j + 1, 0]) or \
               (feature_arr[j, 0] == feature_arr[j + 1, 0] and 
                feature_arr[j, 1] > feature_arr[j + 1, 1]):
                for k in range(feature_arr.shape[1]):
                    temp = feature_arr[j, k]
                    feature_arr[j, k] = feature_arr[j + 1, k]
                    feature_arr[j + 1, k] = temp
                for k in range(centroids_arr.shape[1]):
                    temp = centroids_arr[j, k]
                    centroids_arr[j, k] = centroids_arr[j + 1, k]
                    centroids_arr[j + 1, k] = temp
        
@cuda.jit
def sort_features_and_centroids(pharmacophores, centroids):
    x = cuda.grid(1)
    if x < pharmacophores.shape[0]:
        sort_2d_arrays(pharmacophores[x], centroids[x])
d_features = cuda.to_device(test_features)
d_centroid_coords = cuda.to_device(test_coords)

threadsperblock = (16, 16)
blockspergrid_x = int(np.ceil(test_features.shape[0] / threadsperblock[0]))
blockspergrid_y = int(np.ceil(test_features.shape[1] / threadsperblock[1]))
blockspergrid = (blockspergrid_x, blockspergrid_y)

sort_features_and_centroids[blockspergrid, threadsperblock](d_features, d_centroid_coords)

sorted_features = d_features.copy_to_host()
sorted_centroid_coords = d_centroid_coords.copy_to_host()

However, some of 2D arrays were not correctly sorted, for example:

One of the original 2D arrays:

 [[ 0.14940944  1.5281      1.          1.        ]
  [ 0.          0.          0.          0.        ]
  [-0.2631746   0.8732      1.          1.        ]
  [ 0.1137652   0.6502      0.          1.        ]]

Sorted:

[[-0.2631746  0.         0.         0.       ]
 [ 0.         0.8732     1.         1.       ]
 [-0.2631746  0.8732     1.         1.       ]
 [ 0.1137652  0.6502     0.         1.       ]]

As you can see, one row was copied to another row but no swap. And some other arrays were just incorrectly sorted.

1 Like