-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Alternative AST tool #4
Comments
Thank you very much for your interest and for your suggestion!!! I definitely will look into it! Glad you liked it! This project still needs a lot of work but I will try to do my best to keep improving it! |
firstly, O(+oo) thanks for your interest and for your suggestion, I really appreciate it:) in the past couple of weeks This looks really powerful and has so many features but I absolutely like and make it very strong candidate for after I do a rewrite. I find it very elegant that by means of a unified interface they make it so simple to switch between languages and without any dependencies other than the parsers for your language of interest. I also really like the whole incremental stuff,as having constantly an up to date syntax tree background could be helpful with some ideas I have. And S-expressions can be pretty cool:) However, the way errors are handled gives me some concern. don't want to bother you with details but it seems that depending on where and what kind of error is present, the S expression and the tree hierarchy can be broken. It seems that for some single errors things are not that bad, and the recovery should be possible, but when you have two or three (or the error is on the right-hand side of an assignment) things can start breaking really bad. not so small sections of tree can become flat losing a lot of information, nodes/tokens can be split and become children of sibling/parent/grandparent nodes, subsequent errors can be ignored and and so can be ; terminating logical lines and so on... The other day I was able to construct test cases where two different pieces of code seem to produce the same tree even though they are semantically different and I expected an error node in one of them:) furthermore,there are cases where it does not seem to hint anything about indentation errors but that is a little bit of a double-edged knife. in general I like the whole idea and I really hope that there is a neat workaround these issues or that is just me doing something wrong or missing something here, but I fear error recovery might give me more trouble than it currently does. I intent to keep investigating and experimenting as otherwise I really like it! So thanks a lot and sorry for the late reply! |
As there was some discussion in the dictation toolbox general chat Regarding error recovery of tree sitter, I am posting my comment from there for future reference I first came into contact with tree sitter from an issue at my PythonVoiceCodingPlugin. At first glance it appeared pretty awesome(and it still does) but After playing around a little bit quickly realized that there are some fluctuations in the performance of its error recovery: while generally up to the task, sometimes it felt less than satisfying. without going into full detail
I will post some quick examples which you can also give them a try online For instance, if in b:
x
print( in b ,f)
[x for in b] produces
which is not fully satisfying for me at least.Still all of these examples are kinda individual issues and have a localized impact, so they're not really the end of the world. More frustrating are cases where the structure of the tree gets kinda crippled if x in :
print(y in)
b = x module [0, 0] - [3, 0])
ERROR [0, 0] - [2, 9])
identifier [0, 3] - [0, 4])
ERROR [0, 8] - [0, 9])
identifier [1, 4] - [1, 9])
comparison_operator [1, 10] - [2, 5])
identifier [1, 10] - [1, 11])
ERROR [1, 14] - [1, 15])
identifier [2, 4] - [2, 5])
identifier [2, 8] - [2, 9]) which can sometimes be caused by multiple or carefully placed single errors( by the way please notice that it believes that x =
y = 4
x = y +
z = 0
x =
if i ==0:
print(i) module [0, 0] - [9, 0])
expression_statement [0, 0] - [1, 5])
assignment [0, 0] - [1, 5])
left: expression_list [0, 0] - [0, 1])
identifier [0, 0] - [0, 1])
right: assignment [1, 0] - [1, 5])
left: expression_list [1, 0] - [1, 1])
identifier [1, 0] - [1, 1])
right: expression_list [1, 4] - [1, 5])
integer [1, 4] - [1, 5])
expression_statement [3, 0] - [4, 5])
assignment [3, 0] - [4, 5])
left: expression_list [3, 0] - [3, 1])
identifier [3, 0] - [3, 1])
right: assignment [3, 4] - [4, 5])
left: expression_list [3, 4] - [4, 1])
binary_operator [3, 4] - [4, 1])
left: identifier [3, 4] - [3, 5])
right: identifier [4, 0] - [4, 1])
right: expression_list [4, 4] - [4, 5])
integer [4, 4] - [4, 5])
expression_statement [6, 0] - [8, 12])
assignment [6, 0] - [8, 12])
left: expression_list [6, 0] - [6, 1])
identifier [6, 0] - [6, 1])
right: assignment [7, 0] - [8, 12])
left: expression_list [7, 0] - [7, 8])
comparison_operator [7, 0] - [7, 8])
identifier [7, 0] - [7, 2])
ERROR [7, 3] - [7, 4])
identifier [7, 3] - [7, 4])
integer [7, 7] - [7, 8])
type: type [8, 4] - [8, 12])
call [8, 4] - [8, 12])
function: identifier [8, 4] - [8, 9])
arguments: argument_list [8, 9] - [8, 12])
identifier [8, 10] - [8, 11]) as it then believes that what ever comes after it is a continuation of that assignment, even though they are separated by newline tokens.These can especially be a problem if you enjoy using a workflow like "delete right, paste something else" like me. There was also a major issue with not allowing for empty blocks which meant that anything coming right after an if,for would be considered to belong in its block regardless of the indentation(!) but these fortunately got batched recently As a final note, please do not interpret these post as a discouragement attempt against the usege of tree sitter. end of the day I have only carried out a handful of small-scale experiments and I still think it is a pretty powerful and more importantly general-purpose library that has much to offer. I would just like to point out that there are some issues one may not expect at first glance, some of which may require us to search for workarounds depending on what we want our final program to be capable of and what we deem to be important for the end user experience. |
http://tree-sitter.github.io/tree-sitter/
This is an incremental AST parsing tool used by atom for its parsing. It may solve some of your AST parsing problems you hinted at in the plug-in description.
I was saying very excited by this development, and it may even get me to buy a sublime license :-)
The text was updated successfully, but these errors were encountered: