Create an LSP for yuck #1241

Fejiberglibstein · 2024-12-09T04:13:34Z

Description

Closes #1232.

Features of the LSP (will) include

autocomplete for widgets and widget's variables
hover documentation for builtin widgets
diagnostic messages for any errors
goto definition for user-made widgets
auto formatting
Document aware syntax highlighting
- Treesitter grammar exists, this would allow for more syntax highlighting possibilities, perhaps even allowing for syntax highlighting inside (literal) expressions.

Additional Notes

I am far from finished for now.

The goals of this draft PR is

To make sure my handling of yuck configuration is correct.
To discuss future implementation of LSP features, as I believe some things will need to be altered to allow the LSP to have sufficient data to operate.

Check the top comment for further discussion

Checklist

Please make sure you can check all the boxes that apply to this PR.

All widgets I've added are correctly documented.
I added my changes to CHANGELOG.md, if appropriate.
The documentation in the docs/content/main directory has been adjusted to reflect my changes.
I used cargo fmt to automatically format all code before committing

Fejiberglibstein · 2024-12-09T04:33:38Z

Current implementation details

For my current implementation, I am creating a config using yuck::config::Config::generate() for the LSP, but perhaps a different method would be better.

I needed to do really hacky stuff to avoid mutating self.documents in Backend::change_document, and ideally this should not be done this way, since this function will be getting called any time a change is made to the yuck file.

Please let me know what direction I should go in for handling yuck files so I may begin working on the actual LSP.

Future implementation details

For implementing things like syntax highlighting and hover docs, I may need more information than just the start and end of individual widgets.I will also need to know the tokens of each ast node: (defvar foo "h") -> [(, defvar, foo, "h", )] which I do not believe is possible when using Config. I will need the AST nodes of each widget in addition to the semantic meaning that Config provides, which I am unsure how to accomplish at the moment.

Fejiberglibstein · 2024-12-09T17:36:35Z

Upon further inspection of the codebase, I have noticed that there exists some global state to hold the file database.

A few questions:

Should this global state be used instead of creating my own EwwConfig and FileDatabase?
Overall, how should configs be handled for the LSP?
Does it make sense that the LSP wouldn't work unless you have a daemon started for the workspace you're working in?
- If you don't have a daemon started for the workspace you're in (ex. you're working in ~/custom_eww_workspace/ and you have to start the daemon with eww daemon -c ~/custom_eww_workspace) should the LSP initialize the server daemon?

imo it makes sense to have the daemon and LSP share the same files & config, since this reduces memory usage & improves performance. I am just wondering how I should go about sharing this state.

Another thing of note is that currently I have the LSP as a separate executable. Should the LSP instead be started with something like eww lsp -c path/to/config/, and then LSP clients such as neovim or vscode can simply run this command in the current working directory?

DOD-101 · 2024-12-15T13:08:29Z

Should this global state be used instead of creating my own EwwConfig and FileDatabase?

This is something I'm also curious about for the formatter.

I have written up a basic implementation (for which I will also open a draft PR soon), but this is one part I have yet to touch.

It seems like almost everything I need regarding file handling is already implemented, just not exposed publicly.

DOD-101 · 2024-12-15T13:11:04Z

Future implementation details

This is something that it would make sense for us to coordinate closer on, since the formatter will also require adjustments to the AST.

elkowar · 2024-12-16T23:14:36Z

I'll need to look into this a bit further, but just as a general heads up already:

In most cases, a good LSP will still work when there are parsing errors (otherwise, autocomplete etc might be quite hard, and highlighting more than one error is very difficult as well). The eww parser is currently somewhat unfit for this, I'd fear. A lot of information is lost, and it definitely isn't super error-tolerant.

Thus, and especially given the relative simplicity of yuck, I'd recommend trying to work a bit further disjunct from the main eww codebase for this. I could see some sort of connection to a running eww daemon be quite valuable for use cases such as showing errors from the daemon, but even that would require a good bit of work to get the required info.

Writing a simple-ish CST-parser (maybe using something like tree-sitter, or basing on Rowan, which many LSPs use) would probably be a lot more flexible in the long run, even if it's of course a lot of initial effort.
The same thing applies for an autoformatter as well – the current eww AST is very unlikely to be sufficient there.

DOD-101 · 2024-12-22T14:15:18Z

Writing a simple-ish CST-parser (maybe using something like tree-sitter, or basing on Rowan, which many LSPs use) would probably be a lot more flexible in the long run, even if it's of course a lot of initial effort.
The same thing applies for an autoformatter as well – the current eww AST is very unlikely to be sufficient there.

I don't have much experience in this area, so take everything I am saying with a grain of salt.

But it seems to me that, if I am understanding you correctly, you think it would be best if both the LSP and the formatter implement their own parsers.

Why?

It seems to me that it would be a lot of additional work to try and maintain 2 or 3 different parsers. Would it not make more sense to collaborate on improving the parsers used in eww itself, so that eww, the LSP and the formatter all use the same one? This is what most languages seem to do, at least for the most part.

I can't speak to the LSP side of things, but for the formatter, the eww AST seems to mostly fulfill my requirements. Perhaps later down the line there would need to be some more significant changes made to facilitate some more complicated changes, but as it stands now I have a formatter that can handle about 80% of yuck code properly, with no changes to the AST.

It seems to be that if we were all to go out and implement our own parser solutions we would end up with 3 parsers, that all need to be maintained, which I foresee causing headaches down the line.

Fejiberglibstein · 2024-12-22T15:29:57Z

I can understand where @elkowar is coming from. His implementation lacks error recovery, a huge part of what's needed for LSPs to work. Right now if there is any error at all, I believe the parser will just quit.

I ran into a similar issue when I was working on another LSP for a language my friends and I made. We had to add a bit of extra code to the parser to make it so it could try its best to get all the tokens out if there was an error.

That being said, I initially was writing some comment that was against using the current yuck parser. However, I realized that for a language like this, it would be very difficult to have correct autocomplete and syntax highlighting working when there are syntax errors. Take for example this code

(foo
    (bar
)

(something-else
    (foo))

it would be impossible to tell if that something else is its own thing or if its inside foo, making whatever autocomplete the LSP has useless.

Therefore, I don't think we need to worry very much about errors. Perhaps I am missing something obvious here, and I haven't dug too deep into the parser's code, but could it be as simple as returning everything that was parsed up until the syntax error? That way we could have syntax highlighting for everything up until the syntax error. This is essentially what was done for the language my friends and I made, and it worked fine.

bbb651 · 2024-12-22T16:46:47Z

I think using tree-sitter as a parser can be a good approach - It already exists and needs to exist for good editor support (e.g. zed and neovim), it handles error recovery and support incremental updates. I found this discussion showing how this was done in asm-lsp

Eugenenoble2005 · 2024-12-24T07:32:11Z

(foo
    (bar
)

(something-else
    (foo))
it would be impossible to tell if that something else is its own thing or if its inside foo, making whatever autocomplete the LSP has useless.

Not quite, like elkowar said yuck is relatively simple. In this case (something-else is simple a child of the first parent (foo while (bar is a sibling. For working completions in yuck, i think we really only need simple context up to the point of the cursor. I achieved this in my implementation with simple parenthesis matching to find the immediate parent regardless of syntax errors(errors probably might be dealt with later)

The screenshot above is kind of like the example you posted, Tree sitter knows the syntax is not correct, but we're ignoring that. The LSP tries to suggest widgets because the cursor is currently a child of box which inturn is a child of defwindow.

Eugenenoble2005 · 2024-12-24T07:35:02Z

But it seems to me that, if I am understanding you correctly, you think it would be best if both the LSP and the formatter implement their own parsers.

Why?

It seems to me that it would be a lot of additional work to try and maintain 2 or 3 different parsers. Would it not make more sense to collaborate on improving the parsers used in eww itself, so that eww, the LSP and the formatter all use the same one? This is what most languages seem to do, at least for the most part.

My guess is that it would require a lot of rewrites on eww's side on an already stable product for a feature which is not sacrosanct to it's functionality at least for now.

elkowar · 2024-12-27T16:59:55Z

Yea, making the eww AST verbose enough to be convenient for a proper LSP would require quite a bit of internal changes, for reasons of error recovery as well as the fact that currently, we don't preserve all information (i.e., if theres two different ways to write something, we don't necessarily retain all information needed to reconstruct the original code, we don't retain all comments, etc)

The eww grammar has not changed in over two years, so there's not that much "maintenance" that'd be necessary here -- just write the grammar once, and that's probably it. The goals of an LSP/formatter parser and the main eww parser are just very different. I do think doing something with treesitter is probably a good first idea here.

Having your fully own AST would allow you to retain any additional metadata you might want to store, as well.

From all I gather, there isn't really a one-size-fits-all strategy for this across other languages. many languages have multi-step parsers and grammars, where things go through a tokenizer, then a CST, then an AST, etc etc. some languages might re-use part of the original parser for their LSPs. Rust-analyzer, from what I gather, has a completely separate CST and AST structure based on rowan for its parser, which also, again, just has very different requirements and restrictions from the parsing logic used in the compiler.

elkowar · 2024-12-27T17:02:19Z

However, if you're able to make the eww-parser and AST usable for these usecases, without having to re-structure the entire codebase, I'd definitely be willing to consider this as an option as well. I do, however, think that error recovery is very important for an LSP, and even a formatter. There's more "hints" for them than you might think:
i.e., it's very clear when a top-level element starts, because they all start with a clearly defined "defvar", "defwidget", "defwindow", etc. So you can at least use these top-level nodes as error recovery boundaries. Then, in many other cases, the parentheses should suffice to group errors by other means -- not trivial, but definitely not conceptually super complex either, and the added value there could be huge.

DOD-101 · 2024-12-27T17:12:19Z

Thanks, @elkowar, for the detailed reply.

As for the formatter:

I will finish experimenting around with using the eww AST and then go ahead and write another implementation using tree sitter, as suggested.

At that point, I will go forward with whatever feels more well suited / capable. Hopefully I will then have something presentable as a 0.0.1 in the next few weeks.

Fejiberglibstein added 2 commits December 8, 2024 16:26

Setup lsp server

72d9a24

Handling Documents

dd09932

Fejiberglibstein mentioned this pull request Dec 9, 2024

[FEATURE] Make an LSP for yuck #1232

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Create an LSP for yuck #1241

Create an LSP for yuck #1241

Fejiberglibstein commented Dec 9, 2024 •

edited

Loading

Fejiberglibstein commented Dec 9, 2024 •

edited

Loading

Fejiberglibstein commented Dec 9, 2024

DOD-101 commented Dec 15, 2024

DOD-101 commented Dec 15, 2024

elkowar commented Dec 16, 2024

DOD-101 commented Dec 22, 2024 •

edited

Loading

Fejiberglibstein commented Dec 22, 2024

bbb651 commented Dec 22, 2024

Eugenenoble2005 commented Dec 24, 2024

Eugenenoble2005 commented Dec 24, 2024

Why?

elkowar commented Dec 27, 2024

elkowar commented Dec 27, 2024

DOD-101 commented Dec 27, 2024

Create an LSP for yuck #1241

Are you sure you want to change the base?

Create an LSP for yuck #1241

Conversation

Fejiberglibstein commented Dec 9, 2024 • edited Loading

Description

Additional Notes

Checklist

Fejiberglibstein commented Dec 9, 2024 • edited Loading

Current implementation details

Future implementation details

Fejiberglibstein commented Dec 9, 2024

DOD-101 commented Dec 15, 2024

DOD-101 commented Dec 15, 2024

elkowar commented Dec 16, 2024

DOD-101 commented Dec 22, 2024 • edited Loading

Why?

Fejiberglibstein commented Dec 22, 2024

bbb651 commented Dec 22, 2024

Eugenenoble2005 commented Dec 24, 2024

Eugenenoble2005 commented Dec 24, 2024

Why?

elkowar commented Dec 27, 2024

elkowar commented Dec 27, 2024

DOD-101 commented Dec 27, 2024

Fejiberglibstein commented Dec 9, 2024 •

edited

Loading

Fejiberglibstein commented Dec 9, 2024 •

edited

Loading

DOD-101 commented Dec 22, 2024 •

edited

Loading