-
-
Notifications
You must be signed in to change notification settings - Fork 13.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
python3Packages.tree-sitter-grammars: init at 0.22.5 #320783
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -497,6 +497,13 @@ | |
githubId = 1773511; | ||
name = "Adrien Devresse"; | ||
}; | ||
adfaure = { | ||
email = "[email protected]"; | ||
matrix = "@adfaure:matrix.org"; | ||
github = "adfaure"; | ||
githubId = 8026586; | ||
name = "Adrien Faure"; | ||
}; | ||
adisbladis = { | ||
email = "[email protected]"; | ||
matrix = "@adis:blad.is"; | ||
|
@@ -707,6 +714,13 @@ | |
githubId = 45179933; | ||
name = "Alex Jackson"; | ||
}; | ||
a-jay98 = { | ||
name = "Ali Jamadi"; | ||
email = "[email protected]"; | ||
matrix = "@a.jamadi:matrix.org"; | ||
github = "a-jay98"; | ||
githubId = 23138252; | ||
}; | ||
ajgrf = { | ||
email = "[email protected]"; | ||
github = "ajgrf"; | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,125 @@ | ||
{ lib | ||
, buildPythonPackage | ||
, pytestCheckHook | ||
, tree-sitter | ||
, symlinkJoin | ||
, writeTextDir | ||
# `name`: grammar derivation pname in the format of `tree-sitter-<lang>` | ||
, name | ||
, grammarDrv | ||
}: | ||
let | ||
inherit (grammarDrv) version; | ||
|
||
snakeCaseName = lib.replaceStrings [ "-" ] [ "_" ] name; | ||
drvPrefix = "python-${name}"; | ||
langIdentOverrides = { | ||
tree_sitter_org_nvim = "tree_sitter_org"; | ||
}; | ||
langIdent = langIdentOverrides.${snakeCaseName} or snakeCaseName; | ||
in | ||
buildPythonPackage { | ||
inherit version; | ||
pname = drvPrefix; | ||
|
||
src = symlinkJoin { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It would probably be cleaner keeping the files in-tree, pointing |
||
name = "${drvPrefix}-source"; | ||
paths = [ | ||
(writeTextDir "${snakeCaseName}/__init__.py" | ||
Comment on lines
+25
to
+28
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. To be entirely frank, this seems like madness. It's hard to maintain code, just inline in some file, and I'm fairly certain you could have accomplished the same result using https://github.com/grantjenks/py-tree-sitter-languages (which is already packaged) and by putting less than 15-mins of work into https://github.com/NixOS/nixpkgs/blob/nixos-unstable/pkgs/development/python-modules/tree-sitter/default.nix so it has a package option for additional grammars. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. grantjenks/py-tree-sitter-languages is using a deprecated API:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hey, I'm sorry I copy-pasted the wrong link, I meant to refer to https://github.com/tree-sitter/py-tree-sitter (the same one I referred to in #320783 (review)). The official one isn't deprecated and still maintained, and is the one we have in nixpkgs. |
||
'' | ||
from ._binding import language | ||
|
||
__all__ = ["language"] | ||
'' | ||
) | ||
(writeTextDir "${snakeCaseName}/binding.c" | ||
'' | ||
#include <Python.h> | ||
|
||
typedef struct TSLanguage TSLanguage; | ||
|
||
TSLanguage *${langIdent}(void); | ||
|
||
static PyObject* _binding_language(PyObject *self, PyObject *args) { | ||
return PyLong_FromVoidPtr(${langIdent}()); | ||
} | ||
|
||
static PyMethodDef methods[] = { | ||
{"language", _binding_language, METH_NOARGS, | ||
"Get the tree-sitter language for this grammar."}, | ||
{NULL, NULL, 0, NULL} | ||
}; | ||
|
||
static struct PyModuleDef module = { | ||
.m_base = PyModuleDef_HEAD_INIT, | ||
.m_name = "_binding", | ||
.m_doc = NULL, | ||
.m_size = -1, | ||
.m_methods = methods | ||
}; | ||
|
||
PyMODINIT_FUNC PyInit__binding(void) { | ||
return PyModule_Create(&module); | ||
} | ||
'' | ||
) | ||
(writeTextDir "setup.py" | ||
'' | ||
from platform import system | ||
from setuptools import Extension, setup | ||
|
||
|
||
setup( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Most of this should be moved to |
||
name="${snakeCaseName}", | ||
version="${version}", | ||
packages=["${snakeCaseName}"], | ||
ext_package="${snakeCaseName}", | ||
ext_modules=[ | ||
Extension( | ||
name="_binding", | ||
sources=["${snakeCaseName}/binding.c"], | ||
extra_objects = ["${grammarDrv}/parser"], | ||
extra_compile_args=( | ||
["-std=c11"] if system() != 'Windows' else [] | ||
), | ||
define_macros=[ | ||
("Py_LIMITED_API", "0x03080000"), | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What does the magic number/address There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Official docs: https://docs.python.org/3/c-api/stable.html#c.Py_LIMITED_API The usage here follows upstream examples:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That should be documented, and based on that you should also set |
||
("PY_SSIZE_T_CLEAN", None) | ||
], | ||
py_limited_api=True, | ||
) | ||
], | ||
) | ||
'' | ||
) | ||
(writeTextDir "tests/test_language.py" | ||
'' | ||
from ${snakeCaseName} import language | ||
from tree_sitter import Language, Parser | ||
|
||
|
||
def test_language(): | ||
lang = Language(language()) | ||
assert lang is not None | ||
parser = Parser() | ||
parser.language = lang | ||
tree = parser.parse(bytes("", "utf-8")) | ||
assert tree is not None | ||
'' | ||
Comment on lines
+101
to
+108
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not convinced by the soundness of this test, did you at least try to parse some actual code with some of the language-grammars? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. IIRC in one of our mob sessions, someone pointed out tree-sitter parsing cannot fail even if provided code doesn't follow the grammar, and no exception will be thrown no matter what (I'm not 100% sure tho) I agree that asserting There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's not answering my question. I asked if you tried to parse some actual code (manually) and check if the grammars work, I'm aware that improving this test isn't too easy, you could have a bunch of multiline strings in an attrset with actual language code and insert that instead of an empty string, at least for some languages. {
tree_sitter_python = {
testCase = ''
foo = 20
'';
expectedResult = $whateverTreeSitterShouldReturn
;
};
} is how I would probably go about this problem, and if a language isn't in the attr set, then you can still go with an empty string as default value. |
||
) | ||
]; | ||
}; | ||
|
||
preCheck = '' | ||
rm -r ${snakeCaseName} | ||
''; | ||
Comment on lines
+113
to
+115
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could you please leave a comment on why you are removing the directory here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is "common" for python packages, it seems that It looks like it is fairly common for python packages (here an example: https://github.com/NixOS/nixpkgs/blob/nixos-unstable/pkgs/development/python-modules/tree-sitter/default.nix#L45). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, and if you take a look at the file you just linked, you will notice that it has a comment linking to the issue in question on why you have to remove this directory, so please also add a comment here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The correct comment would be to mention #255262 . |
||
|
||
nativeCheckInputs = [ tree-sitter pytestCheckHook ]; | ||
pythonImportsCheck = [ snakeCaseName ]; | ||
|
||
meta = { | ||
description = "Python bindings for ${name}"; | ||
license = lib.licenses.mit; | ||
maintainers = with lib.maintainers; [ a-jay98 adfaure mightyiam stepbrobd ]; | ||
}; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are you doing here??
So if the name is
tree_sitter_org_nvim
, then you want to replace it withtree_sitter_org
and otherwise just usesnakeCaseName
? Why do it in such a complicated manner, instead of just using a simple if statement?Also, please put a comment stating why you are replacing the name here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I believe this is cleaner and easier to maintain than a if statement
What if new a tree-sitter grammar gets added with a weird name like
tree_sitter_org_nvim
😂? We can simply put another "override" in thereThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please tell me why you are renaming here?
Looking at the commit, it seems to have been named this name rather deliberately: 1705882
And before introducing hacky workarounds here, we should change the name in general or just leave it as is.
Then more packages have to worry about it since it would also result in conflicts in our general tree-sitter infrastructure, so doing this workaround here appears to be a bad idea because it will cause more overhead in the long term.