Here is my code:
from numba import cuda

# GPU-based binary search kernel: counts the occurrences of a search item in the database.
@cuda.jit
def cuda_BinarySearch(srcitem, dsrcDB, dsrange, threadCount):
    tid = cuda.grid(1)  # global thread id
    # Threads beyond threadCount have no chunk to search
    if tid >= threadCount:
        return
    last = first = -1
    # Each thread searches its own contiguous chunk [low, high] of the sorted database
    low = (tid * len(dsrcDB) // threadCount)
    high = ((tid+1) * len(dsrcDB) // threadCount) - 1
    # First loop: find the first occurrence of the key in this chunk
    while low <= high:
        # Calculate mid to divide the search domain
        mid = low + (high - low) // 2
        # if key is found, record it and keep searching the left half
        if srcitem == dsrcDB[mid][1]:
            first = mid
            high = mid - 1
        # if key is less than the mid element, discard right half
        elif srcitem < dsrcDB[mid][1]:
            high = mid - 1
        # if key is more than the mid element, discard left half
        else:
            low = mid + 1
    # End of first while loop
    if first != -1:
        # Reinitialize low & high to find the last occurrence
        low = first
        high = ((tid+1) * len(dsrcDB) // threadCount) - 1
        while low <= high:
            # Calculate mid to divide the search domain
            mid = low + (high - low) // 2
            # if key is found, record it and keep searching the right half
            if srcitem == dsrcDB[mid][1]:
                last = mid
                low = mid + 1
            # if key is less than the mid element, discard right half
            elif srcitem < dsrcDB[mid][1]:
                high = mid - 1
            # if key is more than the mid element, discard left half
            else:
                low = mid + 1
        # End of last while loop
    # Store this chunk's count; dsrange[tid] is left untouched when nothing is found
    if first != -1 and last != -1:
        dsrange[tid] = (last - first + 1)
# Driver code: launch 1 block of 8 threads
cuda_BinarySearch[1, 8](srcItem, dsrcDB, dsrange, noOfThreads)
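For checking the per-thread counts, here is a minimal CPU sketch of the same chunk partitioning, using only the standard `bisect` module. The helper names `chunk_bounds` and `count_in_chunks` and the sample `keys` list are hypothetical and only for illustration; the sketch assumes the key column is sorted, as the kernel does.

```python
from bisect import bisect_left, bisect_right

def chunk_bounds(tid, n, thread_count):
    """Inclusive [low, high] index range that thread `tid` searches, as in the kernel."""
    low = tid * n // thread_count
    high = (tid + 1) * n // thread_count - 1
    return low, high

def count_in_chunks(keys, item, thread_count):
    """Occurrences of `item` per chunk of the sorted list `keys`."""
    counts = []
    for tid in range(thread_count):
        low, high = chunk_bounds(tid, len(keys), thread_count)
        # bisect takes a half-open range, hence high + 1
        first = bisect_left(keys, item, low, high + 1)
        last = bisect_right(keys, item, low, high + 1)
        counts.append(last - first)
    return counts

keys = [1, 3, 3, 3, 5, 7, 7, 9, 9, 9, 9, 12]  # hypothetical sorted key column
counts = count_in_chunks(keys, 9, 4)
print(counts, sum(counts))  # per-chunk counts and their total
```

Summing the per-chunk counts gives the total number of matches, which is what summing `dsrange` on the host should produce if the kernel is correct.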
Note:
My database, “dsrcDB”, is large: almost 255 MB.
I am logging the time taken by both the GPU and the CPU.
E.g., to search for 55 items:
Time taken by GPU (1 grid, 1 block, 8 threads): 0.04008 sec
Time taken by CPU: 0.25186 sec
The problem is that when I increase the number of threads and blocks to 16, 32, 64, 128, 256, 512, 1024, 2048, I see no visible change in the time taken by the GPU.
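To make the scaling question concrete, here is a rough sketch of how a requested thread count could be split into blocks, and how many loop iterations one binary search needs per chunk. The helpers `launch_config` and `max_search_steps`, the 256 threads-per-block default, and the row count of 10,000,000 are all assumptions for illustration; the real number of rows is not stated above.

```python
import math

def launch_config(total_threads, threads_per_block=256):
    """Split a requested thread count into (blocks, threads_per_block)."""
    tpb = min(total_threads, threads_per_block)
    blocks = math.ceil(total_threads / tpb)
    return blocks, tpb

def max_search_steps(n_rows, total_threads):
    """Upper bound on iterations of one binary search over one thread's chunk."""
    chunk = math.ceil(n_rows / total_threads)
    # A binary search over `chunk` elements needs at most ceil(log2(chunk + 1)) steps
    return max(1, math.ceil(math.log2(chunk + 1)))

n_rows = 10_000_000  # hypothetical row count, not taken from the real database
for t in (8, 64, 512, 2048):
    blocks, tpb = launch_config(t)
    print(t, (blocks, tpb), max_search_steps(n_rows, t))
```

Under these assumptions, going from 8 to 2048 threads only reduces each search from at most 21 to at most 13 iterations, because binary search cost grows with the logarithm of the chunk size. That may be one reason the measured GPU time barely moves, although launch overhead and host-to-device transfers could also be contributing.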
GPU Machine details:
Device 1: “GeForce GTX 1080 Ti”
CUDA Driver Version / Runtime Version 11.0 / 11.0
CUDA Capability Major/Minor version number: 6.1
Total amount of global memory: 11178 MBytes (11721506816 bytes)
(28) Multiprocessors, (128) CUDA Cores/MP: 3584 CUDA Cores
MemoryInfo(free=10626727936, total=11721506816)
numba version: 0.50.1
NumPy version: 1.18.5
llvmlite version: 0.33.0+1.g022ab0f
Any help appreciated. Thank you!