Charex - Enhance Numba with NumPy's string operations

nmehran · February 23, 2024, 10:45pm

I'd like to share a project that may be useful to the Numba ecosystem: charex. The package extends Numba by making NumPy’s string operations usable inside Numba-optimized functions.

Overview

charex stands for “Character Extensions”. It gives Numba access to the string comparison operations and the occurrence and property methods available in NumPy’s char module:

import charex

If you’d like to try charex, the package and more details are on GitHub in the charex repository. The benchmarks and tests are in the same repository and give a picture of the package’s performance.

Comparison operations:

char.equal
char.not_equal
char.greater_equal
char.less_equal
char.greater
char.less
char.compare_chararrays
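
As a quick illustration of what these comparison functions compute, here is a minimal sketch in plain NumPy. It does not call charex or Numba. The assumption, taken from the description above and not verified here, is that with charex imported the same np.char calls can be made inside a Numba-jitted function. The array contents are made up for the example.

```python
import numpy as np

# Fixed-width string arrays, the kind of input np.char functions operate on.
left = np.array(["apple", "banana", "cherry"])
right = np.array(["apple", "berry", "cherry"])

# Each call compares the arrays element by element and returns a boolean array.
equal_mask = np.char.equal(left, right)
less_mask = np.char.less(left, right)            # lexicographic "<"
ge_mask = np.char.greater_equal(left, right)     # lexicographic ">="

print(equal_mask.tolist())  # [True, False, True]
print(less_mask.tolist())   # [False, True, False]
print(ge_mask.tolist())     # [True, False, True]
```

The results are ordinary boolean arrays, so they can be used directly as masks, for example `left[equal_mask]`.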

Occurrence and Property information:

char.count
char.endswith
char.startswith
char.find
char.rfind
char.index
char.rindex
char.str_len
char.isalpha
char.isalnum
char.isspace
char.isdecimal
char.isdigit
char.isnumeric
char.istitle
char.isupper
char.islower
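
For readers less familiar with this group, the sketch below shows a few of the occurrence and property functions in plain NumPy. Again, it does not use charex or Numba; whether every argument form shown here is accepted inside a jitted function is an assumption to check against the charex repository. The sample strings are invented.

```python
import numpy as np

words = np.array(["numba", "numpy42", "2024", "Char Ex"])

lengths = np.char.str_len(words)               # length of each element
n_counts = np.char.count(words, "n")           # occurrences of "n" per element
starts_num = np.char.startswith(words, "num")  # prefix test per element
first_m = np.char.find(words, "m")             # lowest index of "m", -1 if absent
digits = np.char.isdigit(words)                # True where all characters are digits
titles = np.char.istitle(words)                # True where the element is titlecased

print(lengths.tolist())     # [5, 7, 4, 7]
print(n_counts.tolist())    # [1, 1, 0, 0]
print(starts_num.tolist())  # [True, True, False, False]
print(first_m.tolist())     # [2, 2, -1, -1]
print(digits.tolist())      # [False, False, True, False]
print(titles.tolist())      # [False, False, False, True]
```

Note that char.find returns -1 for a missing substring, whereas char.index raises a ValueError in the same situation.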

Further Information

I’ve tested charex for compatibility and performance; the latest tests were run with Numba 0.59.0 and NumPy 1.26.3. Please feel free to contribute, validate the results, and discuss how charex can be improved. The aim is to gather feedback and refine the project for integration into Numba (see: #8500).

Oyibo · February 24, 2024, 9:25pm

Kudos to you, @nmehran, for introducing charex, a great extension that enhances Numba’s capabilities with NumPy string operations.

Charex currently aligns with the existing NumPy “char” namespace. As outlined in NEP-55, NumPy will slowly transition from the “char” namespace to the “strings” namespace. The new namespace may arrive as early as NumPy 2.0, so the timing might be a bit unfortunate.
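
To make the namespace point concrete, here is a minimal sketch of code that works with either namespace. It is plain NumPy only and says nothing about how charex itself handles the transition; the name `strings_ns` and the sample array are hypothetical.

```python
import numpy as np

# Use np.strings where it exists (NumPy 2.0 and later), else fall back to np.char.
strings_ns = getattr(np, "strings", np.char)

names = np.array(["charex", "numba", "numpy"])

# startswith is available under both namespaces with the same basic call shape.
mask = strings_ns.startswith(names, "num")

print(mask.tolist())  # [False, True, True]
```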

Thank you for your valuable contribution.

nmehran · February 26, 2024, 7:43pm

Hi @Oyibo, I appreciate your insight with regard to NEP-55.
