Hi... I am very eager to convert the following code to use cuda.jit. What is the correct way to do that?


import os
import cv2
import time
import numpy as np
import numba as nb
from numba import njit
from google.colab import drive

drive.mount('/content/gdrive')

@njit('f8[:,:](u1[:,:])', parallel=True, cache=True)
def normalize_mat(depth_src):
    # Rescale the 8-bit depth map into the range [0, 1]
    depth_min = depth_src.min()
    depth_max = depth_src.max()
    depth = (depth_src - depth_min) / (depth_max - depth_min)

    return depth

def generate_stereo(depth_dir, depth_prefix, filename):
    print("=== Start processing:", filename, "===")
    depth_src = cv2.imread(os.path.join(depth_dir, depth_prefix + filename + ".jpg"))

    # cv2.imread returns a 3-channel BGR image, so collapse it to a single grayscale channel
    if len(depth_src.shape) == 3:
        depth_src = cv2.cvtColor(depth_src, cv2.COLOR_BGR2GRAY)

    depth = normalize_mat(depth_src)

    # Scale back to 8-bit so cv2.imwrite can save it as a JPEG
    depth = np.round(depth * 255).astype(np.uint8)

    cv2.imwrite(os.path.join(depth_dir, "normalized_depth_" + filename + ".jpg"), depth)

def file_processing_im(depth_dir, depth_prefix):
    for f in os.listdir(depth_dir):
        filename = f.split(".")[0]
        generate_stereo(depth_dir, depth_prefix, filename)

def main():
    start_time = time.time()

    depth_dir = 'gdrive/MyDrive/depth/'
    depth_prefix = 'Depth_'

    file_processing_im(depth_dir, depth_prefix)

    print(time.time() - start_time, "seconds for base generation")


if __name__ == "__main__":
    main()

If you want to convert this code to use CUDA, it’s not clear to me that Numba is the right tool for the job. I’m not familiar with OpenCV, but I understand it has some CUDA functionality already - is that usable for your use case? I wonder if a combination of that and CuPy (to replace the functionality in normalize_mat()) would be more appropriate.
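For example, the normalization could look something like this with CuPy (an untested sketch - normalize_mat_cupy is just a name I’ve made up, and I’m assuming depth_src is the uint8 grayscale array you get back from cv2.cvtColor()):

import cupy as cp

def normalize_mat_cupy(depth_src):
    # Copy the host (NumPy) array onto the GPU
    d_depth = cp.asarray(depth_src, dtype=cp.float64)

    # The reduction and elementwise ops below run as CUDA kernels
    depth_min = d_depth.min()
    depth_max = d_depth.max()
    depth = (d_depth - depth_min) / (depth_max - depth_min)

    # Copy the result back to the host so cv2.imwrite can use it
    return cp.asnumpy(depth)

Something like that should drop straight into generate_stereo() in place of the current normalize_mat() call.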

If you really want to replace the functionality with CUDA-jitted Numba kernels, then I think the main approach you’ll need will be:

  • Replace the njit decorators with cuda.jit.
  • In normalize_mat, you need to implement the max and min calculations and normalization using scalar operations indexed by thread ID (there’s a rough sketch of this after the list). You’ll also need to change it so the output array is passed in, because you won’t be able to create the depth array in the function.
  • In generate_stereo, you’ll need to replace the call to cv2.cvtColor() with a kernel you write yourself that provides the same functionality. You’ll also need to replace the call to np.round() with an implementation that operates on scalars.
  • You’ll need to move the data loading out of generate_stereo(), and allocate space on the device and transfer data to it before calling generate_stereo().
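Something along these lines might be a starting point for the normalize_mat part. This is an untested sketch: normalize_mat_cuda, normalize_kernel and the 16x16 launch configuration are placeholders I’ve chosen, I’ve used @cuda.reduce for the min/max rather than hand-writing the reduction, and I’ve folded the np.round(depth * 255) step into the kernel as a scalar round(). For brevity the device allocation and transfer live inside the helper here, but ideally you’d hoist them out as described above.

import math
import numpy as np
from numba import cuda

# Min and max implemented as GPU reductions built from scalar binary ops
@cuda.reduce
def min_reduce(a, b):
    return min(a, b)

@cuda.reduce
def max_reduce(a, b):
    return max(a, b)

@cuda.jit
def normalize_kernel(depth_src, depth_min, depth_max, out):
    # Each thread handles one pixel, indexed by its 2D thread ID
    i, j = cuda.grid(2)
    if i < depth_src.shape[0] and j < depth_src.shape[1]:
        scaled = (depth_src[i, j] - depth_min) / (depth_max - depth_min)
        # Scalar replacement for the np.round(depth * 255) step
        out[i, j] = round(scaled * 255.0)

def normalize_mat_cuda(depth_src):
    # Allocate device memory and transfer the input up front
    d_src = cuda.to_device(depth_src.astype(np.float64))
    d_out = cuda.device_array(depth_src.shape, dtype=np.float64)

    depth_min = min_reduce(d_src.ravel())
    depth_max = max_reduce(d_src.ravel())

    # 16x16 threads per block, with enough blocks to cover the whole image
    threads_per_block = (16, 16)
    blocks = (math.ceil(depth_src.shape[0] / threads_per_block[0]),
              math.ceil(depth_src.shape[1] / threads_per_block[1]))
    normalize_kernel[blocks, threads_per_block](d_src, depth_min, depth_max, d_out)

    return d_out.copy_to_host().astype(np.uint8)

generate_stereo() would then call normalize_mat_cuda() and write the returned array directly, since the rounding and scaling already happened on the device. The cv2.cvtColor() replacement would follow the same pattern as normalize_kernel: one thread per output pixel, computing the grayscale value from the three input channels.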

Thank you so much…will try and update