CPU vs GPU version

MyraBaba · July 28, 2020, 2:21pm

@njit
def srch(compare_desc, pe_descs, ):  # I think this could also be spread across processes
    tupZ = [(1, 18.265, 100, 0.1)]

    for x in range(pe_descs.shape[0]):
        dist = np.dot(compare_desc, pe_descs[x].T)
        if float(0,8) < 1:
            # a_string = people_namesTUPLE[x]
            name_number = x
            a_float = dist
            an_int = x
            small_float = sim
            tupZ.append((name_number, a_float, an_int, small_float))

    return tupZ

pe_desc = nd.random_normal(0, 1, shape=(5000000, 256))
compare_desc=nd.random_normal(0, 1, shape=(1, 256))
srch(compare_desc, pe_desc)

I don't know how to implement the code above on the GPU. Would it be faster if it ran on the GPU?

Can anyone help or give it a try?

best

gmarkall · July 28, 2020, 4:21pm

I’ve edited the code slightly to include imports:

from numba import njit, cuda
from mxnet import nd
import numpy as np

@njit
def srch(compare_desc, pe_descs, ):  # I think this could also be spread across processes
    tupZ = [(1, 18.265, 100, 0.1)]

    for x in range(pe_descs.shape[0]):
        sim = np.dot(compare_desc, pe_descs[x].T)
        if float(0,8) < threshold:
            # a_string = people_namesTUPLE[x]
            name_number = x
            a_float = dist
            an_int = x
            small_float = sim
            tupZ.append((name_number, a_float, an_int, small_float))


    return tupZ

pe_desc = nd.random_normal(0, 1, shape=(5000000, 256))
compare_desc=nd.random_normal(0, 1, shape=(1, 256))

res = srch(compare_desc, pe_desc)

However, this doesn’t work, with the error:

Traceback (most recent call last):
  File "repro.py", line 25, in <module>
    res = srch(compare_desc, pe_desc)
  File "/home/gmarkall/numbadev/numba/numba/core/dispatcher.py", line 415, in _compile_for_args
    error_rewrite(e, 'typing')
  File "/home/gmarkall/numbadev/numba/numba/core/dispatcher.py", line 358, in error_rewrite
    reraise(type(e), e, None)
  File "/home/gmarkall/numbadev/numba/numba/core/utils.py", line 80, in reraise
    raise value.with_traceback(tb)
numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
NameError: name 'threshold' is not defined

This error may have been caused by the following argument(s):
- argument 0: cannot determine Numba type of <class 'mxnet.ndarray.ndarray.NDArray'>
- argument 1: cannot determine Numba type of <class 'mxnet.ndarray.ndarray.NDArray'>

dist and threshold are not defined - could you provide a modified version of your example that defines these and is executable please? This will make it possible to understand your example better, and to suggest how it might work in CUDA.

Many thanks!

MyraBaba · July 28, 2020, 4:48pm

Edited

You can also change the MXNet NDArray to a NumPy array if you want.
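
For reference, here is a minimal sketch of the search using plain NumPy arrays, as suggested above. It is not the original poster's final code: the `threshold` value, the `sim < threshold` comparison, the reduced array size, and the function names `srch_loop` and `srch_vectorized` are all assumptions made for illustration. It shows that the per-row loop is equivalent to a single matrix-vector product, which is the form that tends to map well onto a GPU.

```python
import numpy as np


def srch_loop(compare_desc, pe_descs, threshold):
    """Reference version: one dot product per row, keep rows below the threshold."""
    matches = []
    for x in range(pe_descs.shape[0]):
        sim = float(np.dot(compare_desc, pe_descs[x]))
        if sim < threshold:  # assumed comparison; the thread does not define it
            matches.append((x, sim))
    return matches


def srch_vectorized(compare_desc, pe_descs, threshold):
    """Same search expressed as a single matrix-vector product."""
    sims = pe_descs @ compare_desc          # shape: (n_rows,)
    idx = np.nonzero(sims < threshold)[0]   # indices of rows that pass the test
    return idx, sims[idx]


# Small stand-in data; the thread uses shape (5000000, 256).
rng = np.random.default_rng(0)
pe_desc = rng.standard_normal((1000, 256))
compare_desc = rng.standard_normal(256)
threshold = 0.8  # hypothetical value, not taken from the thread

loop_matches = srch_loop(compare_desc, pe_desc, threshold)
idx, sims = srch_vectorized(compare_desc, pe_desc, threshold)
```

If the data starts out as MXNet NDArrays, calling `.asnumpy()` on them gives NumPy arrays, which Numba can type. Whether a GPU version is actually faster will depend on the cost of copying the descriptor matrix to the device, so it is worth timing both.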
