Introducing nbstl: Fast and Simple Containers for Python
Hey everyone! ![]()
I’ve been working on a small library called nbstl—a collection of high-performance containers for Python, built with numba and numpy. If you’re working on numerical computing or need fast data structures, this might be useful!
What’s in the box?
- Common Containers: Hash, Heap, Red-Black Tree, AVL Tree, and more.
- Array-Based: Everything is built on
ndarray, so it plays nicely with NumPy. - Fast: Hash and Red-Black Tree are as fast as C++ STL (yes, really!).
- Flexible: Supports key-value pairs and shared memory modes for complex workflows.
Why use it?
- Need a fast heap or hash table for your numerical data? Done.
- Working with large, complex data structures? Shared memory mode has you covered.
- Just want something simple and efficient?
nbstlis designed to be easy to use.
Check it out and let me know what you think! Feedback and contributions are welcome. ![]()
Image-Py/nbstl: stl container for numba
Convex Hull
Algorithm Implementation:
- Sort the points by their X-coordinate.
- Build the upper half: Start from the leftmost point and push it onto the stack. For each new point, check the last two points on the stack. If they form a right turn with the new point, push the new point onto the stack. Otherwise, pop the top element from the stack until a right turn is obtained.
- Build the lower half: Start from the rightmost point and push it onto the stack. For each new point, check the last two points on the stack. If they form a right turn with the new point, push the new point onto the stack. Otherwise, pop the top element from the stack until a right turn is obtained.
- Combine the upper and lower halves to obtain the convex hull.
import numpy as np
import numba as nb
import nbstl
# build Point dtype and PointStack
t_point = np.dtype([('x', np.float32), ('y', np.float32)])
PointStack = nbstl.TypedStack(t_point)
# push to stack one by one, if not turn right, pop
@nb.njit
def convex_line(pts, idx):
hull = PointStack(128)
for i in idx:
p2 = pts[i]
while hull.size>1:
p1 = hull.top(0)
p0 = hull.top(1)
s = p0.x*p1.y - p0.y*p1.x
s += p1.x*p2.y - p1.y*p2.x
s += p2.x*p0.y - p2.y*p0.x
if s<-1e-6: break
hull.pop()
hull.push((p2.x, p2.y))
return hull.body[:hull.size]
# get up line and down line, then concat the hull
@nb.njit
def convexhull(pts):
idx = np.argsort(pts['x'])
up = convex_line(pts, idx)
down = convex_line(pts, idx[::-1])
return np.concatenate((up, down[1:]))
if __name__ == '__main__':
from time import time
# use randn to make 102400 random points
pts = np.random.randn(102400, 2).astype(np.float32)
pts = pts.ravel().view(t_point)
hull = convexhull(pts)
start = time()
hull = convexhull(pts)
print('convex hull of 102400 point cost:', time()-start)
Then, we perform performance comparison using Shapely for convex hull computation on datasets of the same.
from shapely import geometry as geom
pts = np.random.randn(102400, 2).astype(np.float32)
mpoints = geom.MultiPoint(pts)
start = time()
cvxh = mpoints.convex_hull
print('the same points by shapely cost:', time()-start)
we got the result below:
convex hull of 102400 point cost: 0.01891
the same points by shapely cost: 0.04986
convex hull of 1024000 point cost: 0.23035
the same points by shapely cost: 1.08539
