
Tracking token source positions #793

Open
dylanscott opened this issue Oct 18, 2024 · 0 comments

Hello 👋 We're big fans of sqlparse, particularly its leniency in the face of weird or just plain invalid syntax, and we use it to power parts of a SQL editing interface. We've long maintained a fork with a few minor tweaks and fixes (e.g. removing keywords irrelevant in our context). With the recent work to make the lexer easier to customize, it seems we can now handle most of that customization in a first-class way. But there is one other addition, critical to our use case, that I wanted to ask about upstreaming: tracking of source positions for tokens, which we use for syntax highlighting.

I've prepared a PR - #794 - with the changes we made to implement this. It's fairly non-invasive; the main thing I'm unsure about is whether it would be considered a breaking change. As far as I can tell it does not alter any of the API surface covered by the documentation, so it may not be. It would only potentially break for folks using the lexer directly, since it expands the raw/unwrapped token stream from 2-tuples to 3-tuples (in fact only two of the cases in test_tokenize.py had to be updated).
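To illustrate the idea, here is a minimal, hypothetical sketch of position-tracked tokenization. This is not sqlparse's actual implementation or API; the token types and patterns are simplified stand-ins. The point is the shape of the change: each yielded token carries its source offset, turning `(ttype, value)` 2-tuples into `(pos, ttype, value)` 3-tuples.

```python
import re

# Simplified stand-in patterns; sqlparse's real lexer rules are far richer.
TOKEN_PATTERNS = [
    ("KEYWORD", re.compile(r"\b(?:SELECT|FROM|WHERE)\b", re.IGNORECASE)),
    ("NAME", re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
    ("WHITESPACE", re.compile(r"\s+")),
    ("PUNCTUATION", re.compile(r"[,;*]")),
]

def tokenize_with_positions(sql):
    """Yield (pos, ttype, value) 3-tuples, where pos is the character
    offset of the token in the source string."""
    pos = 0
    while pos < len(sql):
        for ttype, pattern in TOKEN_PATTERNS:
            match = pattern.match(sql, pos)
            if match:
                yield pos, ttype, match.group()
                pos = match.end()
                break
        else:
            # Stay lenient with unrecognized characters: emit them
            # one at a time rather than raising.
            yield pos, "ERROR", sql[pos]
            pos += 1
```

With offsets attached, a consumer can map each token straight back to a span in the editor buffer, which is exactly what syntax highlighting needs.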

In any case, thank you for your work building and maintaining this library!
