Charex - Enhance Numba with NumPy's string operations

I’d like to share charex, a project that may be useful to the Numba ecosystem. The package extends Numba by making NumPy’s string operations usable inside Numba-optimized functions.

Overview

charex stands for “Character Extensions”. It gives Numba access to the string comparison operations and the occurrence and property methods available in NumPy’s char module:

import charex

If you would like to try charex, the package and further details are on GitHub in the charex repository. The repository also contains the benchmarks and tests, which give a picture of the package’s performance.

Comparison operations:

  • char.equal
  • char.not_equal
  • char.greater_equal
  • char.less_equal
  • char.greater
  • char.less
  • char.compare_chararrays

Occurrence and property information:

  • char.count
  • char.endswith
  • char.startswith
  • char.find
  • char.rfind
  • char.index
  • char.rindex
  • char.str_len
  • char.isalpha
  • char.isalnum
  • char.isspace
  • char.isdecimal
  • char.isdigit
  • char.isnumeric
  • char.istitle
  • char.isupper
  • char.islower
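As a rough illustration of what a few of the listed operations return, here is a minimal sketch in plain NumPy. The arrays `words` and `others` are made-up sample data. The post describes charex as making these same `np.char` calls usable inside Numba-compiled functions after `import charex`; that jitted usage is not shown here, so please check the repository for the exact pattern.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
words = np.array(["numba", "NumPy", "charex", "42"])
others = np.array(["numba", "numpy", "char", "42"])

# Comparison operations: element-wise, returning boolean arrays.
equal_mask = np.char.equal(words, others)
less_mask = np.char.less(words, others)

# Occurrence information: counts, prefix checks, and first positions.
a_counts = np.char.count(words, "a")
starts_with_ch = np.char.startswith(words, "ch")
first_m = np.char.find(words, "m")  # -1 where "m" does not occur

# Property information: lengths and character-class checks.
lengths = np.char.str_len(words)
all_digits = np.char.isdigit(words)

print(equal_mask, less_mask, a_counts, starts_with_ch, first_m, lengths, all_digits)
```

Each call works element by element and returns an array of the same shape as its input, which is what makes these functions a natural fit for compiled loops over string arrays.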

Further Information

I’ve tested charex for compatibility and performance; the latest tests were run with Numba 0.59.0 and NumPy 1.26.3. Contributions, independent validation, and feedback are all welcome. The aim is to refine the project for eventual integration into Numba (see: #8500).


Kudos to you, @nmehran, for introducing charex, a great extension that brings NumPy’s string operations to Numba.

Charex currently aligns with the existing NumPy “char” namespace. As outlined in NEP-55, NumPy will gradually transition from the “char” namespace to a “strings” namespace, and the new namespace may arrive as soon as NumPy 2.0. The timing might be a bit unfortunate.
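To make the namespace point concrete, here is a small sketch of how calling code could stay neutral between the two. It assumes `np.strings` is present from NumPy 2.0 onward and falls back to `np.char` otherwise; the alias `string_ops` and the sample array are hypothetical names used only for illustration. Whether charex would follow the same route is not something I can say.

```python
import numpy as np

# Prefer the newer namespace when this NumPy build has it,
# otherwise fall back to the long-standing np.char module.
string_ops = getattr(np, "strings", np.char)

# Hypothetical sample data.
names = np.array(["alpha", "beta", "gamma"])

# Both namespaces offer these functions under the same names.
starts_with_a = string_ops.startswith(names, "a")
lengths = string_ops.str_len(names)

print(string_ops.__name__, starts_with_a, lengths)
```

Not every `char` function has a counterpart under `strings` (compare_chararrays, for one, is a `char` function), so a fallback like this only covers the overlapping names.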

Thank you for your valuable contribution.


Hi @Oyibo, I appreciate your insight regarding NEP-55.