CPU vs GPU version

@njit
def srch(compare_desc, pe_descs):  # I think this could also be parallelized across processes
    tupZ = [(1, 18.265, 100, 0.1)]

    for x in range(pe_descs.shape[0]):
        dist = np.dot(compare_desc, pe_descs[x].T)
        if float(0,8) < 1:
            # a_string = people_namesTUPLE[x]
            name_number = x
            a_float = dist
            an_int = x
            small_float = sim
            tupZ.append((name_number, a_float, an_int, small_float))

    return tupZ

pe_desc = nd.random_normal(0, 1, shape=(5000000, 256))
compare_desc=nd.random_normal(0, 1, shape=(1, 256))
srch(compare_desc, pe_desc)

I don't know how to implement the code above on the GPU. Would it be faster if it ran on the GPU?

Can anyone help or give it a try?

best

I’ve edited the code slightly to include imports:

from numba import njit, cuda
from mxnet import nd
import numpy as np

@njit
def srch(compare_desc, pe_descs):  # I think this could also be parallelized across processes
    tupZ = [(1, 18.265, 100, 0.1)]

    for x in range(pe_descs.shape[0]):
        sim = np.dot(compare_desc, pe_descs[x].T)
        if float(0,8) < threshold:
            # a_string = people_namesTUPLE[x]
            name_number = x
            a_float = dist
            an_int = x
            small_float = sim
            tupZ.append((name_number, a_float, an_int, small_float))


    return tupZ

pe_desc = nd.random_normal(0, 1, shape=(5000000, 256))
compare_desc=nd.random_normal(0, 1, shape=(1, 256))

res = srch(compare_desc, pe_desc)

However, this doesn’t work, with the error:

Traceback (most recent call last):
  File "repro.py", line 25, in <module>
    res = srch(compare_desc, pe_desc)
  File "/home/gmarkall/numbadev/numba/numba/core/dispatcher.py", line 415, in _compile_for_args
    error_rewrite(e, 'typing')
  File "/home/gmarkall/numbadev/numba/numba/core/dispatcher.py", line 358, in error_rewrite
    reraise(type(e), e, None)
  File "/home/gmarkall/numbadev/numba/numba/core/utils.py", line 80, in reraise
    raise value.with_traceback(tb)
numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
NameError: name 'threshold' is not defined

This error may have been caused by the following argument(s):
- argument 0: cannot determine Numba type of <class 'mxnet.ndarray.ndarray.NDArray'>
- argument 1: cannot determine Numba type of <class 'mxnet.ndarray.ndarray.NDArray'>

dist and threshold are not defined - could you provide a modified version of your example that defines these and is executable please? This will make it possible to understand your example better, and to suggest how it might work in CUDA.
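
For reference, here is a minimal sketch of what an executable CPU version might look like. It is not your original code: it assumes NumPy float32 inputs (an MXNet NDArray can be converted with `.asnumpy()`), a 1-D query vector, a hypothetical `threshold` argument, and that a row is kept when its dot product with the query exceeds that threshold. Writing the dot product as an explicit loop and returning arrays instead of a list of tuples are choices made for this sketch only.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs (slowly) without Numba installed.
    def njit(func):
        return func


@njit
def srch(compare_desc, pe_descs, threshold):
    """Return indices and dot products of rows scoring above `threshold`."""
    n, d = pe_descs.shape
    idx = np.empty(n, dtype=np.int64)
    sims = np.empty(n, dtype=np.float64)
    count = 0
    for x in range(n):
        # Explicit dot product of the query with row x.
        sim = 0.0
        for k in range(d):
            sim += compare_desc[k] * pe_descs[x, k]
        # Assumed comparison; the condition in the original is unclear.
        if sim > threshold:
            idx[count] = x
            sims[count] = sim
            count += 1
    return idx[:count], sims[:count]


# Small stand-in data; the original example used 5,000,000 rows.
rng = np.random.default_rng(0)
pe_desc = rng.standard_normal((1000, 256)).astype(np.float32)
compare_desc = pe_desc[42].copy()  # query equals row 42, so row 42 must match
threshold = 0.5 * float(np.dot(compare_desc, compare_desc))  # hypothetical value

idx, sims = srch(compare_desc, pe_desc, threshold)
```

With `dist`, `sim` and `threshold` pinned down like this, it becomes much easier to say how the loop could be mapped onto CUDA.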

Many thanks!

Edited

Also, you can change the MXNet NDArray to a NumPy array if you want.
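
With NumPy arrays as input, one way the per-row dot product might map onto CUDA is one thread per row. The following is only a sketch under assumptions, not a tuned implementation: `dot_rows_kernel`, `similarities` and the threshold value are hypothetical, the query is taken to be a 1-D float32 vector, and the threshold filtering is left on the host. It falls back to NumPy when Numba or a CUDA device is not available.

```python
import numpy as np

try:
    from numba import cuda
    HAVE_GPU = cuda.is_available()
except ImportError:
    HAVE_GPU = False

if HAVE_GPU:
    @cuda.jit
    def dot_rows_kernel(query, descs, out):
        # One thread per row of `descs`.
        i = cuda.grid(1)
        if i < descs.shape[0]:
            acc = 0.0
            for k in range(descs.shape[1]):
                acc += query[k] * descs[i, k]
            out[i] = acc


def similarities(query, descs):
    """Dot product of `query` with every row of `descs`."""
    if not HAVE_GPU:
        return descs @ query  # CPU fallback when no CUDA device is found
    d_query = cuda.to_device(query)
    d_descs = cuda.to_device(descs)
    d_out = cuda.device_array(descs.shape[0], dtype=np.float32)
    threads = 256
    blocks = (descs.shape[0] + threads - 1) // threads
    dot_rows_kernel[blocks, threads](d_query, d_descs, d_out)
    return d_out.copy_to_host()


# Small stand-in data; the original example used 5,000,000 rows.
rng = np.random.default_rng(0)
pe_desc = rng.standard_normal((10_000, 256)).astype(np.float32)
compare_desc = rng.standard_normal(256).astype(np.float32)

sims = similarities(compare_desc, pe_desc)
threshold = 0.8  # hypothetical value
matches = np.nonzero(sims > threshold)[0]  # filtering done on the host
```

Whether this beats the CPU depends largely on data movement: 5,000,000 x 256 float32 values come to about 5.1 GB, which has to fit in device memory and be copied there. If the same descriptors are searched many times, keeping them on the device between queries would avoid repeating that copy.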