-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consider alternatives to tree-sitter #41
Comments
ANTLR4 seems to be quite popular, it has typescript support as well and XPath grammar already written. |
Specifically I’m interested in support for generating (or otherwise producing) TypeScript types corresponding to the parsed syntax tree nodes. We use dts-tree-sitter for this currently. It’s not amazing but we rely on it and supplemental types built on that to keep the parser and evaluator in sync. That’s something that Ohm does OOTB, and my impression from the spike does it better than dts-tree-sitter. I’m not sure if that’s something ANTLR4 supports, but I didn’t find anything about that when I looked last. It’s not a deal breaker by any means, as the grammar will likely be stable. But it’s something I’m particularly nervous about trying to augment for an existing parser without really being familiar with its output. |
It's worth mentioning now that #166 is a really good frame of reference for why a typed grammar is valuable for us. Quite a lot of that change would be much more complicated without the types we have to express which syntax nodes can occur where in the syntax tree. It's also worth mentioning that I recently did a spike into ANTLR4's XPath grammar. It seems like it partially addresses these concerns: ANTLR4 has a TypeScript target which produces some grammar-specific types. Unfortunately, what I found was that the generated parse API would require significant rework for us to use... basically across the board, anywhere we use XPath syntax as anything other than an opaque string. The API is heavily influenced by its Java underpinnings, and heavily generalized around OOP-ish concepts that don't really fit our current model of parsed syntax (as a tree of nodes which are basically plain objects). |
Another dependency update PR #209, another round of tree-sitter pain. I chose to work through the issues that came up, despite a strong inclination to put off the update. Mainly because it’s another data point we can consider when we think about the cost/benefit of investigating alternatives and potential rework to adopt one. My instinct today (largely unchanged): tree-sitter was a great choice for prototyping and early iterations, but its stability challenges are an outsized maintenance burden; and it continues to have an outsized impact in terms of bytes on the wire. I’m not in a big rush to look at replacements today, but I do recommend we investigate other options before we release 1.0. |
Build woes. @brontolosone has had them. I think we all have at this point. tree-sitter requires either of these to be available on
The docs/CLI are pretty vague about anything more detailed than that, but it is actually more detailed than that. This makes it hard for us to document our own requirements, and makes for a big barrier to contribution. This is especially problematic because that barrier applies to something that's otherwise very stable (I'm pretty sure we've made one grammar change since this project was very much a prototype!). I think these build woes should be weighed as we consider this quesiton. In the meantime, some shorter term mitigations worth considering:
|
I’m filing quickly on mobile ahead of our check in, but briefly:
I briefly explored Ohm over the weekend and it’s a pretty compelling alternative option (albeit with some reservations around maturity). Beyond its TS support, I was encouraged by how easy it was to build a cromulent XPath grammar, and how much more closely it matches the grammar from the spec.
The text was updated successfully, but these errors were encountered: