Closing comments handling #9

Open
const-volatile opened this issue Aug 1, 2017 · 0 comments
const-volatile commented Aug 1, 2017

I found a corner case in which the end of a comment is not handled correctly, e.g.
<!--x>Comment<!-->
The comment ends at the --> inside the second <!-->, but the parser treats everything after it as part of the comment as well.

The following snippet demonstrates the problem (output is empty instead of "HELLO WORLD"):

html::dom page;
page.append_partial_html("<!--x>Comment<!--><html><head><title>HELLO WORLD</title></head><body></body></html>");
std::cout << page["title"].to_plain_text() << std::endl;

According to the HTML5 specification, the comment should be parsed as follows:

Input        State
(start)      Data state
<!           Markup declaration open state
--           Comment start state
x            Comment state
>Comment<!   Comment state (append the current input character to the comment token's data)
-            Comment end dash state
-            Comment end state
>            Data state
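
The steps above can be sketched as a small standalone state machine. This is only an illustration of the HTML5 comment states, not avhtml's actual code: `CommentState`, `CommentResult`, and `scan_comment` are hypothetical names, and end-of-file and NUL handling are left out for brevity.

```cpp
#include <cstddef>
#include <string>

// Hypothetical names for illustration; none of these exist in avhtml.
enum class CommentState { Start, StartDash, Comment, EndDash, End, EndBang, Done };

struct CommentResult {
    std::string data;  // contents of the comment token
    std::size_t next;  // index of the first character after the comment
};

// Scans one comment. `pos` must point just past the opening "<!--".
// Simplified sketch of the HTML5 comment states (no EOF/NUL handling).
CommentResult scan_comment(const std::string& in, std::size_t pos) {
    CommentState st = CommentState::Start;
    std::string data;
    while (pos < in.size() && st != CommentState::Done) {
        const char c = in[pos++];
        switch (st) {
        case CommentState::Start:
            if (c == '-') st = CommentState::StartDash;
            else if (c == '>') st = CommentState::Done;  // "<!-->" is an empty comment
            else { data += c; st = CommentState::Comment; }
            break;
        case CommentState::StartDash:
            if (c == '-') st = CommentState::End;
            else if (c == '>') st = CommentState::Done;  // "<!--->"
            else { data += '-'; data += c; st = CommentState::Comment; }
            break;
        case CommentState::Comment:
            // Only '-' is special here; '<' and '!' are ordinary comment data.
            if (c == '-') st = CommentState::EndDash;
            else data += c;
            break;
        case CommentState::EndDash:
            if (c == '-') st = CommentState::End;
            else { data += '-'; data += c; st = CommentState::Comment; }
            break;
        case CommentState::End:
            if (c == '>') st = CommentState::Done;
            else if (c == '!') st = CommentState::EndBang;
            else if (c == '-') data += '-';
            else { data += "--"; data += c; st = CommentState::Comment; }
            break;
        case CommentState::EndBang:
            if (c == '-') { data += "--!"; st = CommentState::EndDash; }
            else if (c == '>') st = CommentState::Done;
            else { data += "--!"; data += c; st = CommentState::Comment; }
            break;
        case CommentState::Done:
            break;
        }
    }
    return {data, pos};
}
```

Calling `scan_comment(input, 4)` on the string from the snippet above yields the comment data `x>Comment<!`, and `next` points at the `<html>` tag, so the rest of the document is tokenized normally.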

The current implementation is in the comment state (state = 12) while >Comment is being parsed, but when it encounters the <! characters it switches to state = 10 instead of appending them to the comment:

case '<':
  {
    c = getc();
    if (c == '!') {
      pre_state = state;
      state = 10;
    } else {
      content += '<';
      content += c;
    }
  }
  break;
const-volatile added a commit to const-volatile/avhtml that referenced this issue Aug 1, 2017
avplayer#9

Treating '<' and '<!' in state=12 (comment state) like any other token ("Append the current input character to the comment token's data.", See also: https://www.w3.org/TR/html5/syntax.html#comment-state).