python3Packages.tree-sitter-grammars: init at 0.22.5 #320783
  };
};
in
# TODO pkgset or flattened?
We didn't see any package sets inside `python3Packages`, so we will go ahead and flatten this unless we're told otherwise. But we didn't see any flattening in top-level either, so we guess this will be the first.
We changed our minds. We saw at least two package sets nested in `python3Packages`: `bootstrap` and `qt6`. And this one is literally mapped from an existing package set (`tree-sitter.builtGrammars`), so it seems appropriate.
6937418 to 253c53f
253c53f to 1ff067a
1ff067a to fe9e96c
4eeaa9e to 8c63440
8c63440 to 0550b06
0550b06 to 5947cdb
5947cdb to ea460ca
pkgs/top-level/python-packages.nix
Outdated
# ImportError: /nix/store/7w7piy6hpdj1swdg5r0bz47gk2g3q855-tree-sitter-perl-grammar-0.22.5/parser: undefined symbol: isnumber
"tree-sitter-perl"
# ImportError: /nix/store/6rkhjf2mhwnfqwq2962fvv9bnd4xmpk7-python3.11-python-tree-sitter-ql-dbscheme-0.22.5/lib/python3.11/site-packages/tree_sitter_ql_dbscheme/_binding.abi3.so: undefined symbol: tree_sitter_ql_dbscheme
"tree-sitter-ql-dbscheme"
# ImportError: /nix/store/ybmnykhrgj6sghw4r7ypf897c4wrncr5-python3.11-python-tree-sitter-org-nvim-0.22.5/lib/python3.11/site-packages/tree_sitter_org_nvim/_binding.abi3.so: undefined symbol: tree_sitter_org_nvim
"tree-sitter-org-nvim"
Each of these might be resolved by simply bumping its source.
Some could possibly be resolved with some tinkering, which we decided should not delay submitting this PR for review.
Resolved. All grammars pass!
For the record, this is already ready for review, even though it's marked as draft.
Co-authored-by: yakampe <[email protected]> Co-authored-by: GetPsyched <[email protected]> Co-authored-by: Shahar "Dawn" Or <[email protected]> Co-authored-by: Robert James Hernandez <[email protected]>
ea460ca to d3890e6
be0a31b to 93a9562
pkgs/top-level/python-packages.nix
Outdated
# ImportError: /nix/store/7w7piy6hpdj1swdg5r0bz47gk2g3q855-tree-sitter-perl-grammar-0.22.5/parser: undefined symbol: isnumber
"tree-sitter-perl"
May be blocked by ganezdragon/tree-sitter-perl#45.
This is resolved.
93a9562 to 615c75b
The final commit message and PR title should be updated.
615c75b to e5dc900
I updated the final commit message. The PR title should be amended. @sarcasticadmin can do that.
Did you consider taking a look at https://github.com/tree-sitter/py-tree-sitter before doing all the work here?
Taking a look at ngi-nix/ngipkgs#139, I'm honestly a bit confused by the `shell.nix` part of the request.
Because you can just do something like:
let
  pkgs = import <nixpkgs> { };
  tree-sitter-with-grammars = pkgs.tree-sitter.override {
    extraGrammars = with pkgs.tree-sitter-grammars; [
      # put your grammars here
      tree-sitter-go
      tree-sitter-yang
    ];
  };
in pkgs.mkShell {
  buildInputs = [
    tree-sitter-with-grammars
  ];
}
And get a shell with tree-sitter and the grammars you want, but I might be missing something.
Regarding armijnhemel/proximity_matcher_webservice and the extra context in https://discourse.nixos.org/t/need-help-enabling-grammars-in-treesitter-python/39500: from just googling a bit, it seems like there are official tree-sitter Python bindings (https://github.com/tree-sitter/py-tree-sitter/) and we already have them packaged in nixpkgs.
It appears that our package needs a little additional work so you can configure grammars, but that should be an easy <15-min job since you just need to add the grammar Python wheel to the pythonPath (at least that's how I read the language section in the Python binding readme).
Additionally, please provide reasoning and considerations in your commit message, especially if you want to add a bunch of hard-to-maintain code like this. And I'm pretty certain that we should not go with this solution, but instead use the official Python bindings, which are already packaged and a lot easier to maintain.
"date": "2021-12-16T17:14:17+00:00",
"date": "2021-12-16T17:14:17Z",
I wonder what happened here, in `tree-sitter-scss.json` and in `tree-sitter-smithy.json`. Seems like there shouldn't be any change there. Also, the shortening of `+00:00` to `Z` seems odd; might that be something related to the locale of the person that ran the update script?
ISO 8601 gives you the option to use either `Z` or `+00:00`, so both are the same timestamp, but it's still odd that it changed.
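A quick way to confirm that the two spellings denote the same instant is to parse both and compare them. This is only a minimal sketch using Python's standard library; it says nothing about how the update script itself formats dates.

```python
from datetime import datetime

# %z accepts both a numeric offset such as +00:00 and the literal Z (Python 3.7+).
FORMAT = "%Y-%m-%dT%H:%M:%S%z"

with_offset = datetime.strptime("2021-12-16T17:14:17+00:00", FORMAT)
with_z = datetime.strptime("2021-12-16T17:14:17Z", FORMAT)

# Both values are timezone-aware and compare equal as instants.
same_instant = with_offset == with_z
print(same_instant)
```

So the JSON change is purely cosmetic; the open question is only which tool or version started emitting the shorter form.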
Thanks, I didn't know about that :D
In that case, we should take a look at the script and try to check what changed the behavior, to avoid cluttering the git log/blame with unnecessary commits.
How about we allow this to happen here, and see what happens with the next update?
assert (lib.assertMsg (!(grammarDrv.meta ? license)) ''
  As of this writing, ${grammarDrv.pname} surprisingly doesn't have a license.
  This trap is set here to guarantee that if it ever does have a license, this package will inherit the license.
'');
What, why? This code will just be licensed under MIT since that's what nixpkgs is licensed under. Yes, the original grammar might have another license, but that really doesn't matter here, and Nix allows you to keep track of your (license) dependency graph. For example, Neovim is licensed under Apache License 2.0 while tree-sitter, one of its build inputs, is licensed under MIT, but this doesn't mean that Neovim inherits the license. Please remove this nonsensical assert statement.
> Please remove this nonsensical assert statement.
Please no name-calling my work.
This statement, on a technical level, doesn't make sense, so I think it's nonsensical.
We've set all the licenses to MIT.
snakeCaseName = lib.replaceStrings [ "-" ] [ "_" ] name;
drvPrefix = "python-${name}";
langIdentOverrides = {
  tree_sitter_org_nvim = "tree_sitter_org";
};
langIdent = langIdentOverrides.${snakeCaseName} or snakeCaseName;
What are you doing here??
So if the name is `tree_sitter_org_nvim`, then you want to replace it with `tree_sitter_org`, and otherwise just use `snakeCaseName`? Why do it in such a complicated manner, instead of just using a simple if statement?
Also, please put a comment stating why you are replacing the name here.
Personally, I believe this is cleaner and easier to maintain than an if statement.
What if a new tree-sitter grammar gets added with a weird name like `tree_sitter_org_nvim` 😂? We can simply put another "override" in there.
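For readers less familiar with the Nix `attrs.${key} or default` idiom, the same lookup-with-fallback can be sketched in Python. This is an illustration of the pattern under discussion, not code from the PR; the function name `lang_ident` is made up for the example.

```python
# Table of exceptions; every other grammar keeps its snake_case name.
LANG_IDENT_OVERRIDES = {
    "tree_sitter_org_nvim": "tree_sitter_org",
}


def lang_ident(name):
    """Map a package name such as tree-sitter-json to its C symbol name."""
    snake_case_name = name.replace("-", "_")
    # Equivalent of `langIdentOverrides.${snakeCaseName} or snakeCaseName`.
    return LANG_IDENT_OVERRIDES.get(snake_case_name, snake_case_name)


print(lang_ident("tree-sitter-org-nvim"))
print(lang_ident("tree-sitter-json"))
```

Adding another exception means adding one entry to the table, which is the maintainability argument made above; an if statement would need a new branch per exception.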
> Also, please put a comment stating why you are replacing the name here.
Could you please tell me why you are renaming here?
Looking at the commit, it seems to have been given this name rather deliberately: 1705882
And before introducing hacky workarounds here, we should change the name in general or just leave it as is.
> What if a new tree-sitter grammar gets added with a weird name like tree_sitter_org_nvim 😂?
Then more packages have to worry about it, since it would also result in conflicts in our general tree-sitter infrastructure, so doing this workaround here appears to be a bad idea because it will cause more overhead in the long term.
preCheck = ''
  rm -r ${snakeCaseName}
'';
Could you please leave a comment on why you are removing the directory here?
This is "common" for Python packages: it seems that `pytest` gets confused if a directory with the same name as the module is present during tests (in this case `${snakeCaseName}` also corresponds to the module name provided by the generated grammar package).
It looks like it is fairly common for Python packages (here is an example: https://github.com/NixOS/nixpkgs/blob/nixos-unstable/pkgs/development/python-modules/tree-sitter/default.nix#L45).
Yes, and if you take a look at the file you just linked, you will notice that it has a comment linking to the issue that explains why you have to remove this directory, so please also add a comment here.
The correct comment would be to mention #255262.
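The likely mechanism behind this (an assumption on my side, the thread does not spell it out) is plain import shadowing: a directory in the working tree that has the module's name is found before the installed package. The sketch below reproduces that with two throwaway packages; `demo_grammar` and the helper names are hypothetical.

```python
import importlib
import shutil
import sys
import tempfile
from pathlib import Path


def make_package(root, name, origin):
    """Create a tiny package whose ORIGIN constant records where it lives."""
    pkg = root / name
    pkg.mkdir(parents=True)
    (pkg / "__init__.py").write_text(f"ORIGIN = {origin!r}\n")


def import_origin(search_path, name):
    """Import name with search_path in front of sys.path and report ORIGIN."""
    saved_path = list(sys.path)
    sys.modules.pop(name, None)
    sys.path[:0] = search_path
    importlib.invalidate_caches()
    try:
        return importlib.import_module(name).ORIGIN
    finally:
        sys.path[:] = saved_path
        sys.modules.pop(name, None)


with tempfile.TemporaryDirectory() as tmp:
    source_tree = Path(tmp) / "source"
    installed = Path(tmp) / "site-packages"
    make_package(source_tree, "demo_grammar", "source tree")
    make_package(installed, "demo_grammar", "installed")
    search_path = [str(source_tree), str(installed)]

    # The working directory comes first, so the unbuilt source copy wins.
    shadowed = import_origin(search_path, "demo_grammar")

    # Equivalent of `rm -r ${snakeCaseName}` in preCheck.
    shutil.rmtree(source_tree / "demo_grammar")
    after_removal = import_origin(search_path, "demo_grammar")

print(shadowed, "->", after_removal)
```

For a grammar binding the source copy has no compiled `_binding` extension, so importing it instead of the installed package would fail, which is consistent with the workaround.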
def test_language():
    lang = Language(language())
    assert lang is not None
    parser = Parser()
    parser.language = lang
    tree = parser.parse(bytes("", "utf-8"))
    assert tree is not None
''
I'm not convinced by the soundness of this test. Did you at least try to parse some actual code with some of the language grammars?
IIRC, in one of our mob sessions someone pointed out that tree-sitter parsing cannot fail even if the provided code doesn't follow the grammar, and no exception will be thrown no matter what (I'm not 100% sure, though).
I agree that asserting `not None` might not be enough, but I think the best we can do here is to change the `assert not None` into a check that the resulting parse tree is an instance of some type that py-tree-sitter provides.
That's not answering my question. I asked if you tried to parse some actual code (manually) and check if the grammars work. I'm aware that improving this test isn't too easy; you could have a bunch of multiline strings in an attrset with actual language code and insert that instead of an empty string, at least for some languages.
Like for example:
{
  tree_sitter_python = {
    testCase = ''
      foo = 20
    '';
    expectedResult = $whateverTreeSitterShouldReturn;
  };
}
is how I would probably go about this problem, and if a language isn't in the attrset, then you can still go with an empty string as the default value.
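The per-language test data idea can be sketched as a small generator that fills a test template with a sample snippet and falls back to the empty string. Everything here is an assumption for illustration: the template text, the `from {module} import language` import, and the names `SAMPLE_SOURCES` and `render_test` are not taken from the PR.

```python
# Sample inputs per grammar module; grammars not listed get the empty string.
SAMPLE_SOURCES = {
    "tree_sitter_python": "foo = 20\n",
}

TEST_TEMPLATE = '''\
from tree_sitter import Language, Parser
from {module} import language


def test_language():
    parser = Parser()
    parser.language = Language(language())
    tree = parser.parse({source!r}.encode("utf-8"))
    assert tree is not None
'''


def render_test(module):
    """Return the source of a pytest file for one grammar module."""
    source = SAMPLE_SOURCES.get(module, "")
    return TEST_TEMPLATE.format(module=module, source=source)


print(render_test("tree_sitter_python"))
```

An expected-result field could be added next to each sample once someone has checked what the grammar actually produces for it; the sketch deliberately leaves that out rather than guessing.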
        ["-std=c11"] if system() != 'Windows' else []
    ),
    define_macros=[
        ("Py_LIMITED_API", "0x03080000"),
What does the magic number/address `0x03080000` here mean? Could you please add a comment?
`0x03080000` defines the lowest Python version our extension supports to be 3.8.
Official docs: https://docs.python.org/3/c-api/stable.html#c.Py_LIMITED_API
The usage here follows upstream examples:
- https://github.com/tree-sitter/tree-sitter-ocaml/blob/0b12614ded3ec7ed7ab7933a9ba4f695ba4c342e/setup.py#L49
- https://github.com/tree-sitter/tree-sitter-typescript/blob/4f3eb6655a1cd1a1f87ef10201f8e22886dcd76e/setup.py#L49
- https://github.com/tree-sitter/tree-sitter-ruby/blob/0ffe457fb6aabf064f173fd30ea356845cef2513/setup.py#L48
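The value uses the same byte layout as CPython's hex version number (major, minor, micro in the top three bytes), so it can be decoded with a few shifts. A small sketch, with `decode_hex_version` being a helper name made up for this example:

```python
import sys


def decode_hex_version(value):
    """Split a CPython hex version into (major, minor, micro)."""
    major = (value >> 24) & 0xFF
    minor = (value >> 16) & 0xFF
    micro = (value >> 8) & 0xFF
    return major, minor, micro


minimum = decode_hex_version(0x03080000)
print(minimum)  # (3, 8, 0)

# The running interpreter exposes its own version in the same encoding.
print(sys.hexversion >= 0x03080000)
```

This is the kind of one-line explanation that could go into a comment next to the macro.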
That should be documented, and based on that you should also set `disabled = pythonOlder "3.8"`.
src = symlinkJoin {
  name = "${drvPrefix}-source";
  paths = [
    (writeTextDir "${snakeCaseName}/__init__.py"
To be entirely frank, this seems like madness. It's hard-to-maintain code, just inlined in some file, and I'm fairly certain you could have accomplished the same result using https://github.com/grantjenks/py-tree-sitter-languages (which is already packaged) and by putting less than 15 minutes of work into https://github.com/NixOS/nixpkgs/blob/nixos-unstable/pkgs/development/python-modules/tree-sitter/default.nix so it has a package option for additional grammars.
grantjenks/py-tree-sitter-languages is using a deprecated API:
- code: https://github.com/grantjenks/py-tree-sitter-languages/blob/42f4baffec92848be4937b0cc52b2872201fe322/tree_sitter_languages/core.pyx#L14
- deprecation notice: https://github.com/tree-sitter/py-tree-sitter/tree/v0.21.3#build-from-source
- relevant discussion: https://discourse.nixos.org/t/need-help-enabling-grammars-in-treesitter-python/39500/7
Hey, I'm sorry, I copy-pasted the wrong link. I meant to refer to https://github.com/tree-sitter/py-tree-sitter (the same one I referred to in #320783 (review)). The official one isn't deprecated, is still maintained, and is the one we have in nixpkgs.
Yep, that's the first thing we've done and we are using it in the
This is right for exposing
That's what we did in this PR: giving users the ability to get tree-sitter grammar python bindings, which will further allow
> What ngi-nix/ngipkgs#139 wants, is exposing Python bindings for multiple grammars to dev shells (original request)
As I said, I wasn't sure if it meant just tree-sitter, or if it meant Python bindings.
> What this PR adds, is to give py-tree-sitter in nixpkgs the ability to load any/all grammars exposed through tree-sitter.builtGrammars as python bindings
No, it does not, I think? You aren't touching the py-tree-sitter package, nor are you using py-tree-sitter in your code.
> py-tree-sitter is indeed in nixpkgs, but you'll need to load a grammar to it before you can parse anything (usage)
Yes, that's the point I made, and you aren't adding support for it here.
All you would have needed to do is a small patch like:
From 12e830ed9a7bace1ef0df64ac088b0a4b309d8cd Mon Sep 17 00:00:00 2001
From: "Janik H." <[email protected]>
Date: Wed, 26 Jun 2024 13:28:58 +0200
Subject: [PATCH] python3Packages.tree-sitter: add support for additional
grammars
---
pkgs/development/python-modules/tree-sitter/default.nix | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/pkgs/development/python-modules/tree-sitter/default.nix b/pkgs/development/python-modules/tree-sitter/default.nix
index fdaa03554433..0127b1518fa1 100644
--- a/pkgs/development/python-modules/tree-sitter/default.nix
+++ b/pkgs/development/python-modules/tree-sitter/default.nix
@@ -5,11 +5,12 @@
pytestCheckHook,
pythonOlder,
setuptools,
- tree-sitter-python,
- tree-sitter-rust,
tree-sitter-html,
tree-sitter-javascript,
tree-sitter-json,
+ tree-sitter-python,
+ tree-sitter-rust,
+ extraGrammars ? [ ]
}:
buildPythonPackage rec {
@@ -29,6 +30,8 @@ buildPythonPackage rec {
build-system = [ setuptools ];
+ dependencies = extraGrammars;
+
nativeCheckInputs = [
pytestCheckHook
tree-sitter-python
--
2.45.1
(I'm not sure if the above example patch is the best way to do this; you might also want to craft your own Python environment in `passthru` to avoid unnecessary recompiles, similar to what we are doing in octodns.)
Example usage would be:
let
  pkgs = import ./. { };
in pkgs.writers.writePython3Bin "tree-sitter-test.py" {
  libraries = [
    (pkgs.python3Packages.tree-sitter.override {
      extraGrammars = [
        pkgs.python3Packages.tree-sitter-javascript
      ];
    })
  ];
} ''
  import tree_sitter_javascript as tsjavascript
  from tree_sitter import Language, Parser
  JS_LANGUAGE = Language(tsjavascript.language())
  parser = Parser(JS_LANGUAGE)
  tree = parser.parse(
      bytes("console.log('Hello World!');", "utf8")
  )
  cursor = tree.walk()
  cursor.goto_first_child()
  print(cursor.node.type)
''
> Please note that Python bindings exposed through official tree-sitter org (it's mostly distributed through one of the maintainers' personal Pypi account) on Pypi is incomplete, 24 as of me writing this. With our way, we can expose 100+ bindings readily available in nixpkgs
How can you be certain that your bindings work? You barely do any testing, and there is a bunch of work going into this upstream to do this properly. And you are basically duplicating out-of-tree efforts like https://github.com/grantjenks/py-tree-sitter-languages, but with worse support and worse testing.
You can also just use `pkgs.python3Packages.tree-sitter-languages` instead of `pkgs.python3Packages.tree-sitter` in the example above (and replace the language and parser definition) and get access to all the languages.
We also don't need to depend on the wheels distributed through PyPI; for example, the tree-sitter-javascript package listed above was built from source.
cc @doronbehar since you did a lot of work on tree-sitter Python bindings in #316901 recently.
I'm sorry but this PR is very confusing to me, and I tend to agree with most of @Janik-Haag's comments, and for sure using I'm mostly confused because I didn't find any details about the target development setup in the discourse thread, at ngi-nix/ngipkgs#139, or here. For example I wonder:
I'm not familiar with the ecosystem of all of these Python packages, including many of those I added in #316901, so I won't be able to help a lot.
from setuptools import Extension, setup


setup(
Most of this should be moved to `pyproject.toml`, see https://setuptools.pypa.io/en/latest/userguide/ext_modules.html#building-extension-modules
inherit version;
pname = drvPrefix;

src = symlinkJoin {
It would probably be cleaner to keep the files in-tree, point `src` there, and use `substituteInPlace --subst-var-by` in `postPatch` to pass the variables.
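For context, `--subst-var-by name value` replaces every `@name@` placeholder in a file with the given value. The effect on an in-tree template can be emulated in a few lines of Python. This is only an emulation for illustration, not the nixpkgs implementation, and the template line and placeholder names are hypothetical.

```python
def subst_var_by(text, var_name, value):
    """Replace every @var_name@ placeholder in text with value."""
    return text.replace(f"@{var_name}@", value)


# Hypothetical line from a setup.py kept in-tree as a template.
template = 'setup(name="@pname@", version="@version@")'

rendered = subst_var_by(template, "pname", "python-tree-sitter-json")
rendered = subst_var_by(rendered, "version", "0.22.5")
print(rendered)
```

With this approach the Python files stay ordinary, reviewable files in the repository, and only the few per-grammar values are injected at build time.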
Thanks. This is pretty much what I imagined in https://discourse.nixos.org/t/need-help-enabling-grammars-in-treesitter-python/39500/7?u=jtojnar.
The grammars have nothing to do with And as you show in your example – you just pull in
Again, no support is necessary in
I do not think this will do anything that could not be achieved by just adding the
Yes, testing is an issue
but tree-sitter-languages is a dead end as I mentioned in https://discourse.nixos.org/t/need-help-enabling-grammars-in-treesitter-python/39500/7?u=jtojnar The only other alternative to have all those grammars available from Python with the current API (to my knowledge, as of 2024-04-12) would be going around and opening PRs to add a trivial binding like this https://github.com/tree-sitter/tree-sitter-python/tree/71778c2a472ed00a64abf4219544edbf8e4b86d7/bindings/python/tree_sitter_python for every upstream grammar, having them upload it to PyPI, and then packaging that. That does not scale.
That would be fine, but it still does not solve the issue for grammars that do not have Python bindings. I would say the requirement to have m×n bindings for each combination of language and grammar is itself something that should be addressed, but I do not have a solution for that 🤷‍♀️
Hi, I just want to mention that every recent enough grammar should have the binding, since the official template contains it: https://github.com/tree-sitter/tree-sitter-embedded-template. So maybe all (maintained) upstream grammars should at some point contain the binding.
@adfaure That is actually a grammar for ERB and EJS template languages. But it looks like a template does exist, so you might be right that it will be everywhere eventually. So maybe we could indeed use the upstream sources. For example, here is how I would convert
--- a/pkgs/development/python-modules/tree-sitter-javascript/default.nix
+++ b/pkgs/development/python-modules/tree-sitter-javascript/default.nix
@@ -1,6 +1,6 @@
{ lib
, buildPythonPackage
-, fetchFromGitHub
+, tree-sitter-grammars
, setuptools
, wheel
, tree-sitter
@@ -8,16 +8,9 @@
buildPythonPackage rec {
pname = "tree-sitter-javascript";
- version = "0.21.3";
+ inherit (tree-sitter-grammars.tree-sitter-javascript) version src;
pyproject = true;
- src = fetchFromGitHub {
- owner = "tree-sitter";
- repo = "tree-sitter-javascript";
- rev = "v${version}";
- hash = "sha256-jsdY9Pd9WqZuBYtk088mx1bRQadC6D2/tGGVY+ZZ0J4=";
- };
-
build-system = [
setuptools
wheel
Oh, I see. Thank you for pointing that out and finding the actual template.
Co-authored-by: Robert James Hernandez <[email protected]> Co-authored-by: Yifei Sun <[email protected]> Co-authored-by: Ali Jamadi <[email protected]> Co-authored-by: yakampe <[email protected]> Co-authored-by: GetPsyched <[email protected]> Co-authored-by: Adrien Faure <[email protected]> Co-authored-by: Shahar "Dawn" Or <[email protected]>
e5dc900 to b938c0a
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/2024-summer-of-nix-program-updates/46053/3
Description of changes
As requested in ngi-nix/ngipkgs#139.
This generates a Python binding package for each tree-sitter grammar that exists in `tree-sitter.builtGrammars`.