I am computing properties of two graphs: graph X has x1 nodes and x2 edges, and graph Y has y1 nodes and y2 edges. On the first run I pass the edges of X to the jitted function; when I re-execute the code, I pass the edges of Y. In both cases the edges are passed as an ndarray.
With cache=True and nopython=True, I run the same script (including the jitted function) multiple times for both X and Y and observe the following. For X, the first execution takes 82 seconds and each later re-execution takes about 66 seconds. For Y, the first execution takes 123 seconds and each later one takes about 102 seconds. I believe the differences, 82 - 66 = 16 seconds for X and 123 - 102 = 21 seconds for Y, arise because, with cache=True, the function is not compiled again on later runs, which is why the re-executions are faster.
The reason I am so concerned about this is that I have a third data set, Z, with 15 times as many edges as Y. Running Z would therefore take a lot of time if the code were also re-compiled for Z. Note that the edges of X, Y and Z are all ndarrays, and numba.typeof reports the type of the edge lists in the following code as array(int16, 2d, C).
The code is as follows.
import numpy as np
from numba import njit, prange

@njit(cache=True)
def case1(edge_list,signs,i,j,rows,ppp,nnn,ppn_I,ppn_II,ppn_III,ppn_IV,pnn_I,pnn_II,pnn_III,pnn_IV):
    # prange behaves like range here because parallel=True is not set
    for k in prange(j+1,rows):
        # edge k closes a triangle with edges i and j (in either direction)
        if (edge_list[i,1]==edge_list[k,0] and edge_list[j,1]==edge_list[k,1]) or (edge_list[i,1]==edge_list[k,1] and edge_list[j,1]==edge_list[k,0]):
            if np.sum(signs[i]+signs[j]+signs[k])==3:
                # all three edges positive
                ppp=ppp+1
            elif np.sum(signs[i]+signs[j]+signs[k])==1:
                # two positive edges, one negative edge
                x=np.zeros((3,1),dtype=np.int8)
                if edge_list[i,1]==edge_list[k,0]:
                    x[0]=signs[i]
                elif edge_list[j,1]==edge_list[k,0]:
                    x[0]=signs[j]
                if x[0,0]==1 and x[1,0]==1 and x[2,0]==-1:
                    ppn_I=ppn_I+1
                elif x[0,0]==1 and x[1,0]==-1 and x[2,0]==1:
                    ppn_II=ppn_II+1
                elif x[0,0]==-1 and x[1,0]==1 and x[2,0]==1:
                    ppn_III=ppn_III+1
            elif np.sum(signs[i]+signs[j]+signs[k])==-1:
                # one positive edge, two negative edges
                x=np.zeros((3,1),dtype=np.int8)
                if edge_list[i,1]==edge_list[k,0]:
                    x[0]=signs[i]
                elif edge_list[j,1]==edge_list[k,0]:
                    x[0]=signs[j]
                if x[0,0]==-1 and x[1,0]==-1 and x[2,0]==1:
                    pnn_I=pnn_I+1
                elif x[0,0]==-1 and x[1,0]==1 and x[2,0]==-1:
                    pnn_II=pnn_II+1
                elif x[0,0]==1 and x[1,0]==-1 and x[2,0]==-1:
                    pnn_III=pnn_III+1
            elif np.sum(signs[i]+signs[j]+signs[k])==-3:
                # all three edges negative
                nnn=nnn+1
    return ppp,nnn,ppn_I,ppn_II,ppn_III,ppn_IV,pnn_I,pnn_II,pnn_III,pnn_IV
@njit(cache=True)
def signed_directed_triangles_numba1(edge_list):
    edge_list=edge_list[:,0:3]
    rows_edge_lists=edge_list.shape
    rows=rows_edge_lists[0]
    # sign of each edge, taken from the third column
    signs=np.zeros((rows_edge_lists[0],1),dtype=np.int8)
    for ii in range(0,rows_edge_lists[0]):
        if edge_list[ii,2]>=0:
            signs[ii]=1
        elif edge_list[ii,2]<0:
            signs[ii]=-1
    edge_list=edge_list[:,0:2]
    # one row of triangle counts per starting edge i
    extracted_triangles=np.zeros((rows_edge_lists[0]-2,10),dtype=np.uint16)
    for i in range(0,rows_edge_lists[0]-2):
        flag_i1_j1=0
        ppp,nnn=0,0
        ppn_I,ppn_II,ppn_III,ppn_IV=0,0,0,0
        pnn_I,pnn_II,pnn_III,pnn_IV=0,0,0,0
        for j in range(i+1,rows-1):  # rows_edge_lists[0]-1
            if edge_list[i,0]==edge_list[j,0]:
                flag_i1_j1=1
                ppp,nnn,ppn_I,ppn_II,ppn_III,ppn_IV,pnn_I,pnn_II,pnn_III,pnn_IV=case1(edge_list,signs,i,j,rows,ppp,nnn,ppn_I,ppn_II,ppn_III,ppn_IV,pnn_I,pnn_II,pnn_III,pnn_IV)
        extracted_triangles[i]=np.array([ppp,nnn,ppn_I,ppn_II,ppn_III,ppn_IV,pnn_I,pnn_II,pnn_III,pnn_IV],dtype=np.uint16)
    triangles_each_type=np.sum(extracted_triangles,axis=0)
    print(triangles_each_type)
    return
# edge_list = np.genfromtxt(r'C:\Users\alpha.csv', delimiter=",")
# edge_list = np.genfromtxt(r'C:\Users\bitcoinotc.csv', delimiter=",")
# edge_list = np.genfromtxt(r'C:\Users\epinions.csv', delimiter=",")
# edge_list = edge_list.astype(np.int16)
edge_list=np.array([[7188,1,10,1],[430,1,10,2],[3134,1,10,3],[3026,1,10,4],[3010,1,10,5],[7188,5,10,6],[7188,3,5,7],[430,5,2,8],[3134,6,10,9],[3134,100,10,10],[1,5,-2,11],[1,5,-2,12],[1,3,5,13],[6,100,-4,14]],dtype=np.int16)
#edge_list=np.array([[6,2,4,1],[6,5,2,2],[1,15,1,4],[4,3,7,4],[13,16,8,5],[13,10,8,6],[7,5,1,7],[2,5,-1,8],[2,100,-10,9],[13,100,10,10],[16,10,-6,11],[16,100,-4,12],[5,100,-3,13],[13,200,-4,14],[100,200,3,15],[7,300,1,16],[7,400,1,17],[5,300,-5,18],[5,400,3,19],[111,222,-4,20]],dtype=np.int16)
signed_directed_triangles_numba1(edge_list)
Here X refers to the edge list obtained by edge_list = np.genfromtxt(r'C:\alpha.csv', delimiter=",")
Y refers to the edge list obtained by edge_list = np.genfromtxt(r'C:\bitcoinotc.csv', delimiter=",")
Z refers to the edge list obtained by edge_list = np.genfromtxt(r'C:\epinions.csv', delimiter=",")
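Since np.genfromtxt returns floating-point values and the code then converts them with astype(np.int16), it may be worth confirming that every value fits in int16 before converting, especially for a larger data set such as Z. The following is a minimal sketch of such a check; the CSV text, the value 40000 and the helper `fits_in_dtype` are hypothetical and only illustrate the idea.

```python
import io

import numpy as np


def fits_in_dtype(values, dtype):
    # True if every value lies inside the representable range of the integer dtype.
    info = np.iinfo(dtype)
    return bool(values.min() >= info.min and values.max() <= info.max)


# Hypothetical CSV content standing in for one of the real files.
csv_text = "7188,1,10,1\n430,1,10,2\n40000,2,-3,3\n"
raw = np.genfromtxt(io.StringIO(csv_text), delimiter=",")

print(raw.dtype)                          # float64
print(fits_in_dtype(raw[:2], np.int16))   # True
print(fits_in_dtype(raw, np.int16))       # False: 40000 > 32767
print(fits_in_dtype(raw, np.int32))       # True

# Pick the narrowest of the two dtypes that holds all values.
target = np.int16 if fits_in_dtype(raw, np.int16) else np.int32
edge_list = raw.astype(target)
```

If a wider dtype such as int32 were needed for Z, its Numba type would differ from array(int16, 2d, C), so a separate compilation for that type would be expected.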
Please note that I have shortened the code (and skipped some other functions) to keep the listing short and the problem easy to follow.