We’re experimenting with using llvmlite in xDSL, which uses strict type checking. Instead of using the pyright auto-generated type stubs, which aren’t quite right, or adding the types ourselves, I’m wondering whether you would be interested in adding type annotations in llvmlite itself? I’d also be happy to contribute these if you would find this helpful.
Me too. I would be glad to help.
Hello and thank you for asking about this!
Several attempts have been made over the past decade to retrofit Numba with typing annotations. IIRC it didn’t go very well, and so now different parts of Numba are type annotated according to different standards. There have also been attempts at synchronising Numba’s type specification (the one you can use in the @jit decorator to specify the signature) with Python type annotations, but I don’t think that ever materialized in a productive way.
Generally, I don’t know anyone working on the development of Numba who is truly enthusiastic about adding typing annotations. These have been considered more of a nuisance than anything else and there have been voices in the community that have argued for the complete removal of all type annotation code again since it brings no benefit and just causes issues.
The last attempt for llvmlite was this:
and as you can see it was just abandoned eventually.
The main “problem” I see with adding type annotations is that it would require large-scale changes to the code-base. This reminds me very much of the futile attempts that were made in the past to reformat the code-base according to flake8 or black. The Numba/llvmlite code has been around for a while; it wasn’t formatted according to any standard when it was first developed, so it’s quite hard to add this later. Large-scale code changes across tens or hundreds of files are likely to break all in-flight PRs, so they are generally discouraged and unlikely to be accepted. You could do this incrementally, but that will take a lot of patience and is likely to last for several years. Also, the pure reformatting of source code will break unit tests. Quite a number of tests are sensitive to line-number changes, for example when testing the compilation of functions.
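To illustrate the line-number point, here is a minimal sketch (not taken from the Numba or llvmlite test suites; `SRC`, `REFORMATTED` and `first_line` are made-up names). It compiles the same function twice, once as-is and once with two blank lines added above it, and reads the line number recorded in the code object:

```python
# The same function, before and after a purely cosmetic change.
SRC = "def f(x):\n    return x + 1\n"
REFORMATTED = "\n\n" + SRC  # two blank lines inserted above the function


def first_line(source: str) -> int:
    """Compile `source` and return the line on which `f` is defined."""
    namespace = {}
    exec(compile(source, "<example>", "exec"), namespace)
    return namespace["f"].__code__.co_firstlineno


print(first_line(SRC))          # 1
print(first_line(REFORMATTED))  # 3
```

A test that hard-codes the first value would fail after the reformat, even though the behaviour of `f` is unchanged.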
So, to summarize:
- No one on the development team is too enthusiastic about adding this (fixing bugs has priority) so you will need to do a lot of convincing.
- Type-annotation code would need to be added incrementally to avoid breaking in-flight PRs
- The work would have to be very high quality in terms of the implementations submitted
- The contributors would need to have a lot of patience (probably a few years) – the project has limited resources and reviews may stall for months at a time
Sorry that I don’t bring better news; I know this sounds rather discouraging. I’m not saying that it is impossible to tackle this task, but it will be a significant amount of work, both in terms of arguing and convincing people and in terms of delivering high-quality implementations.
Anyway, to continue the discussion, what would be the true material gain of adding type annotations to llvmlite? What is the proposed benefit and how will this improve the overall situation for llvmlite? How is this motivated and why should the maintainers care? I am asking because I think that crafting a solid motivation would be a required first step in this heroic endeavour.
Thank you for giving this context. I had searched on GH and in this forum and hadn’t found anything, but it makes sense that there were previous attempts. It also makes sense that a very large change would be difficult to review and merge; when adding type hints to xDSL, an incremental approach was much more successful than doing it all at once.
If I understand correctly from your comments and reading the GH comments, while there are no guarantees in terms of review times, PRs adding typing would be welcome. Is that right?
To answer your question, the main idea here is that for other projects to rely on llvmlite it helps to add typing support, as types can signal errors in code paths without test coverage. This matters especially in the case of JITs and production software where I would like to provide some kind of pinky promise of stability and reliability. This only matters insofar as the llvmlite maintainers would like non-numba projects to also rely on llvmlite for LLVM integration, especially as anecdotally newer Python projects seem to be typed-by-default, and often expect dependencies to also have type annotations.
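As a small illustration of “types can signal errors in code paths without test coverage”, here is a sketch with a hypothetical helper, `format_error`, which is not part of llvmlite or xDSL. The rarely taken branch is the kind of place where a static checker such as pyright can report a mistake without the branch ever being executed:

```python
from typing import Optional


def format_error(code: int, detail: Optional[str] = None) -> str:
    """Hypothetical helper used only for illustration."""
    if detail is None:
        # Rarely taken branch, easy to leave untested.
        # Writing `"error " + code` here would only raise TypeError at
        # runtime when this branch actually runs; with the annotations
        # above, a static type checker flags `str + int` up front.
        return "error " + str(code)
    return f"error {code}: {detail}"


print(format_error(3, "bad operand"))  # error 3: bad operand
print(format_error(3))                 # error 3
```

The same reasoning applies to callers: once a library’s public functions are annotated, a downstream project’s checker can flag a wrongly typed argument in a branch that no test reaches.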
I’ve scheduled this topic for discussion at the next developer meeting on the 20th of Jan. I’ll circulate your request to get some feedback on it from the other developers, and I will report back after the meeting. Getting such a feature added is no small feat, so it’s good to find out ahead of time whether there will be buy-in from the developers. You’ll need someone to partner with here, I think, as this might go through several rounds of discussion and feedback.
Maybe, in order to have something more tangible to use as a basis for further discussion and evaluation, I would invite you to provide some sort of technical spike: a reasonably short, self-contained PR that demonstrates the feasibility of your proposed approach, a proof of concept for an incremental addition to some meaningful part of llvmlite. I would recommend timeboxing any effort spent on this spike, as the outcome is uncertain. The spike should help facilitate a discussion regarding the merits and drawbacks of adding typing annotations, with a binary yes/no outcome from the developers as the goal.
Sorry for the delay here; the meeting last week was quite busy and we didn’t get around to speaking about this. It was discussed today, however, and the general consensus from the folks who contributed to the discussion is -0/+0, i.e. it’s not something we strongly desire, but if the implementation is really good and provides value for users, then it may be considered. I suggest proceeding with the technical spike and taking it from there. Please only annotate public APIs, annotate inside the source file, and really make sure an incremental/iterative approach is followed. Good luck!
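For what “only annotate public APIs, annotate inside the source file” could look like in practice, here is a rough sketch. The `Module` class below is a made-up stand-in, not llvmlite’s actual code, and which helpers count as private is an assumption made for the example:

```python
from typing import get_type_hints


class Module:
    """Hypothetical stand-in for a library class; not llvmlite's real code."""

    def __init__(self, name: str = "") -> None:
        self.name = name
        self._symbols = {}  # private detail, left unannotated in this step

    def add_symbol(self, name: str, value: int) -> None:
        """Public API: annotated in place, in the source file itself."""
        self._check(name)
        self._symbols[name] = value

    def get_symbol(self, name: str) -> int:
        """Public API: annotated in place."""
        return self._symbols[name]

    def _check(self, name):
        # Private helper: no annotations added in this incremental step.
        if name in self._symbols:
            raise ValueError(f"duplicate symbol: {name}")


# The annotations are visible to tools directly from the source.
print(get_type_hints(Module.get_symbol))
```

Each PR in an incremental series could then cover one module’s public surface in this way, leaving private helpers and function bodies untouched so that in-flight PRs are disturbed as little as possible.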