Switch to exact versions for tree-sitter parsers #22

Open
wenkokke opened this issue Feb 18, 2022 · 4 comments

@wenkokke
Contributor

The web versions of tree-sitter parsers are unfortunately quite brittle:

  1. Tree-sitter ships with no infrastructure for testing the generated WebAssembly versions, and as a consequence most parsers do not test their generated WebAssembly.
    See: Flag for tree-sitter parse and test which use the generated wasm (tree-sitter/tree-sitter#1565)
  2. The native and web versions of the tree-sitter libraries use different runtime systems, which do not always behave the same.
    See: Native and WASM parsers behave differently (tree-sitter/tree-sitter-haskell#69)
  3. Due to limitations in emscripten, the tree-sitter library must determine which functions from the C standard library are exposed to each individual parser, and the set of functions they have chosen to expose is rather limited. Therefore, a benign change that works perfectly with the native versions of the libraries might completely break the web versions.
    See: Compile to wasm with patch to web-tree-sitter (UPDATE) (tree-sitter/tree-sitter-haskell#56, comment)

As a consequence of these three factors, we cannot rely on semantic versioning for web-tree-sitter parsers, and we have to be quite conservative in which versions we allow.

For the short term, I propose that we pin each of the parsers to the exact version currently recorded in the yarn.lock file, whether that is an exact version number or a commit hash.
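A minimal sketch of how such pinning could be checked automatically, assuming dependency specifiers are either plain semver strings or git references ending in `#<ref>`. The helper names (`isPinned`, `findUnpinned`) and the package names in the example are hypothetical and not taken from this repository.

```typescript
// Hypothetical helper: flags dependency specifiers that are not pinned exactly.
// The names and the example dependency map are illustrative only.
type DependencyMap = Record<string, string>;

// An exact version such as "0.19.0" or "0.19.0-beta.1" (no ^, ~, or ranges).
const EXACT_SEMVER = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$/;

// A git specifier that ends in a full 40-character commit hash.
const COMMIT_HASH_SUFFIX = /#[0-9a-f]{40}$/;

function isPinned(spec: string): boolean {
  return EXACT_SEMVER.test(spec) || COMMIT_HASH_SUFFIX.test(spec);
}

// Returns the names of all dependencies whose specifier is not pinned, sorted.
function findUnpinned(deps: DependencyMap): string[] {
  return Object.keys(deps)
    .filter((name) => !isPinned(deps[name]))
    .sort();
}

// Example input, shaped like the "dependencies" field of a package.json.
const exampleDeps: DependencyMap = {
  "tree-sitter-a": "0.19.0",
  "tree-sitter-b": "^0.19.0",
  "tree-sitter-c":
    "github:example/tree-sitter-c#0123456789abcdef0123456789abcdef01234567",
  "tree-sitter-d": "github:example/tree-sitter-d#main",
};

const unpinned = findUnpinned(exampleDeps);
```

With the example map, `unpinned` lists the caret range and the branch reference, since neither fixes a single version. A check like this could run in CI to stop a range or branch name from slipping back in.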

For the longer term, I propose that we build a test suite which repeatedly loads files from several major projects written in the supported programming languages and checks the generated parse trees to see if (1) they are free from errors, and (2) they match our golden files (once we have those). We can then use this test suite to guide version bumps.
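The two checks could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's test suite: `ParseNode` is a simplified stand-in for a real tree-sitter node, `ParseFn` is a hypothetical wrapper around whichever parser is under test, and the S-expression golden format is an assumption. The only tree-sitter behaviour relied on is that unparseable regions appear as nodes of type `ERROR`.

```typescript
// Simplified node shape for this sketch; real tree-sitter nodes carry more data.
interface ParseNode {
  type: string;
  children: ParseNode[];
}

// Hypothetical wrapper around the parser under test (e.g. a wasm build).
type ParseFn = (source: string) => ParseNode;

// Renders a tree as an S-expression, the assumed format of the golden files.
function toSExpression(node: ParseNode): string {
  if (node.children.length === 0) {
    return `(${node.type})`;
  }
  return `(${node.type} ${node.children.map(toSExpression).join(" ")})`;
}

// Counts nodes of type "ERROR" anywhere in the tree.
function countErrorNodes(node: ParseNode): number {
  const own = node.type === "ERROR" ? 1 : 0;
  return node.children.reduce((sum, child) => sum + countErrorNodes(child), own);
}

interface CheckResult {
  errorNodes: number;
  matchesGolden: boolean;
}

// Check (1): the tree is free from errors. Check (2): it matches the golden tree.
function checkAgainstGolden(
  parse: ParseFn,
  source: string,
  golden: string
): CheckResult {
  const root = parse(source);
  return {
    errorNodes: countErrorNodes(root),
    matchesGolden: toSExpression(root) === golden,
  };
}
```

Taking the parse function as a parameter means the same check can run against both the native and the web build of a parser, which is where the behavioural differences listed above would show up.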

@pokey
Member

pokey commented Feb 19, 2022

All sounds good! Well captured. One further thought: we may want to add a step in CI that runs the Cursorless test suite.

Alternatively, we could fold parse tree into Cursorless for the purposes of CI testing, but still publish both extensions from the CI deploy.

@pokey
Member

pokey commented Jul 15, 2022

@wenkokke do we need to pin to SHAs within package.json? It would be simpler to use main / master in package.json and rely on yarn.lock / package-lock.json to capture the exact SHAs. That would make bumping much easier; e.g. we could even let Renovate bot's lockfile maintenance handle it once we migrate this repo into the Cursorless monorepo so that we get CI.

@wenkokke
Contributor Author

If there are no tags, then it’s probably for the best to have the option for consistency when updating lockfiles?

@pokey
Member

pokey commented Jul 16, 2022

@wenkokke sorry, I'm not sure I understand. Could you possibly elaborate?
