Juxtaposing quotation mark #436

Open
sergey-goncharov opened this issue Feb 8, 2023 · 4 comments
Labels
bug, builtin (Concerning built-in tokens like Integer, String etc.), Haskell, lexer (Concerning the generated lexer), OCaml

Comments

@sergey-goncharov

The grammar Exp. Exp ::= Ident "'" ; generates a parser that correctly parses x ', but not x', for some reason. I've played with other characters, including other quotation symbols such as ’, and then both x’ and x ’ are recognized, which is IMHO the correct behaviour. Is there any particular reason why ' is treated specially, and how can I avoid it? Thanks!

@andreasabel added the labels bug, builtin, Haskell, lexer, and OCaml on Feb 8, 2023
@andreasabel
Member

Since the built-in Ident token may contain single quotes after its first letter, x' is lexed as one identifier rather than as x followed by '.
You can define your own identifier tokens, e.g.:

Exp. Exp ::= Id "'";
token Id letter (letter | digit | '_')*;

This works as expected in the "imperative" backends (C, C++, Java) but not in the "functional" ones (Haskell, OCaml).

The problem lies in how the functional backends implement the lexer: they always include a rule for Ident, so that keywords can first be lexed as Ident and then be reclassified as keywords. This keeps the lexer automaton from blowing up in size.
Unfortunately, it also leads to this bug.
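
The effect can be reproduced with a small longest-match lexer. The following is only a sketch in Python, not the code BNFC generates: the regular expressions approximate the built-in Ident and the user-defined Id token, and the names lex, ONLY_DECLARED and WITH_IDENT are made up for this illustration.

```python
import re

# Approximations of the token definitions (assumptions for this sketch).
IDENT = r"[A-Za-z][A-Za-z0-9_']*"  # built-in Ident: quotes allowed after the first letter
ID = r"[A-Za-z][A-Za-z0-9_]*"      # user token Id: letter (letter | digit | '_')*
QUOTE = r"'"                       # the terminal "'" from Exp ::= Id "'"


def lex(text, rules):
    """Longest-match lexer; on equal length the earlier rule wins."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"lexical error at position {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


# A lexer that contains only the tokens the grammar declares.
ONLY_DECLARED = [("Id", ID), ("Quote", QUOTE)]
# A lexer that additionally always contains the Ident rule.
WITH_IDENT = [("Id", ID), ("Quote", QUOTE), ("Ident", IDENT)]

print(lex("x'", ONLY_DECLARED))  # [('Id', 'x'), ('Quote', "'")]
print(lex("x'", WITH_IDENT))     # [('Ident', "x'")]
print(lex("x '", WITH_IDENT))    # [('Id', 'x'), ('Quote', "'")]
```

With the extra Ident rule, x' is consumed as a single Ident token because that match is longer than Id's match of x, so a parser expecting Id followed by "'" never sees those two tokens. Separating the quote with a space avoids the longer match, which is consistent with x ' being accepted.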

Related:

@sergey-goncharov
Author

Thanks, Andreas, for a quick reply!

Exp. Exp ::= Id "'";
token Id letter (letter | digit | '_')*;

That is roughly how I started. I just minimized that example, since the problem persisted. I am following the basic steps from the tutorial, which suggests using Test, generated with bnfc -d -m, for basic testing.

@andreasabel
Member

Yes, unfortunately "'" isn't a proper operator character at the moment in the standard backend (Haskell).
You can use the following invocation to switch from the Haskell backend to the C backend:

bnfc --c -m GRAMMAR.cf

@jasper-e

jasper-e commented Sep 3, 2023

> Yes, unfortunately "'" isn't a proper operator character atm in the standard backend (Haskell). You can use this invocation instead, to use the C backend instead of the Haskell one.
>
> bnfc --c -m GRAMMAR.cf

My problem is that I combine the BNFC Haskell backend with the Haskell-based uuagc...
I would highly appreciate a command-line flag for the Haskell backend that omits Ident from the lexer, which I could put in my Makefile...
