Backslash / soft break / line break elements #7

Open
1 of 3 tasks
mortenpi opened this issue May 17, 2022 · 1 comment

mortenpi commented May 17, 2022

CommonMark has implemented dedicated inlines for these characters. But do we actually need them here?

# struct Backslash <: AbstractInline end
# struct SoftBreak <: AbstractInline end
# struct LineBreak <: AbstractInline end

  • Implement LineBreak (b45a1e5, 2f77d57)
  • Implement or discard SoftBreak
  • Implement or discard Backslash

mortenpi commented

A few notes for this:

  • Backslash: the Markdown standard library also seems to parse the backslashes, although they just end up being separate text nodes with \ content, so we can convert those:

    julia> md"""
           foo\\bar
           """.content
    1-element Vector{Any}:
     Markdown.Paragraph(Any["foo", "\\", "bar"])
    

    cmark doesn't seem to have a backslash node, however:

    julia> CMark.parse_document("foo\\bar") |> CMark.typetree
    CMARK_NODE_DOCUMENT => {
    	CMARK_NODE_PARAGRAPH => {
    		CMARK_NODE_TEXT
    	}
    }
    
  • LineBreak: I had forgotten that the Markdown standard library also has Markdown.LineBreak, and so does cmark, so we should definitely implement this.

  • SoftBreak: the Markdown standard library doesn't implement this and just converts it to a space instead:

    julia> md"""
           foo
           bar
           """.content
    1-element Vector{Any}:
     Markdown.Paragraph(Any["foo bar"])
    

    So when converting to and from the standard library representation, we'll lose some information. cmark, however, does have a node for this.

I wonder how to handle Text() elements that contain newlines and backslashes, however. For backslashes, we can end up with literal backslashes in the text anyway, but the cases where one resembles an escape sequence could be problematic. Similarly, newlines in the text could be a problem.

One option would be to disallow these characters when constructing Text(). Otherwise, consumers have to assume that the nodes can contain weird backslashes and newlines.
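
A minimal sketch of the "disallow" option, to make the trade-off concrete. Everything here is an assumption for illustration: the module name `StrictTextSketch`, the single `text` field, and the choice of `ArgumentError` are hypothetical and not taken from the package.

```julia
module StrictTextSketch

# Hypothetical stand-in for the package's Text inline; not the real definition.
struct Text
    text::String
    function Text(s::AbstractString)
        # Reject the two characters discussed above, so that consumers never
        # see a Text node containing a backslash or a newline.
        if any(c -> c == '\\' || c == '\n', s)
            throw(ArgumentError("Text may not contain backslashes or newlines: $(repr(s))"))
        end
        return new(String(s))
    end
end

end # module
```

With this, `StrictTextSketch.Text("foo")` constructs normally, while `StrictTextSketch.Text("foo\nbar")` throws. The cost is that literal backslashes can no longer be represented in a text node at all, which may be too strict.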

However, this would be annoying when constructing trees programmatically (as opposed to parsing Markdown). To work around that, we could provide a helper, say an inlinetext function, that does some simple inline parsing into LineBreak etc. nodes (e.g. inlinetext("foo\\\nbar") -> [Text("foo"), LineBreak(), Text("bar")]).
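
A rough sketch of what such an `inlinetext` helper could look like. The node types below are hypothetical stand-ins wrapped in a module called `InlineTextSketch`, not the package's actual definitions. Two behaviours are assumptions on my part: a bare newline becomes a `SoftBreak`, and a backslash that is not followed by a newline stays in the text as a literal character. Other hard-break forms, such as trailing spaces before a newline, are not handled.

```julia
module InlineTextSketch

abstract type AbstractInline end

# Hypothetical stand-ins for the package's inline node types.
struct Text <: AbstractInline
    text::String
end
struct LineBreak <: AbstractInline end
struct SoftBreak <: AbstractInline end

# Split `s` into Text / LineBreak / SoftBreak nodes.
function inlinetext(s::AbstractString)
    nodes = AbstractInline[]
    buf = IOBuffer()
    # Emit the buffered characters as a Text node, if there are any.
    function flush!()
        str = String(take!(buf))
        isempty(str) || push!(nodes, Text(str))
        return nothing
    end
    chars = collect(s)
    i = 1
    while i <= length(chars)
        c = chars[i]
        if c == '\\' && i < length(chars) && chars[i + 1] == '\n'
            # Backslash followed by a newline: hard line break.
            flush!()
            push!(nodes, LineBreak())
            i += 2
        elseif c == '\n'
            # Bare newline: soft break (an assumption, see above).
            flush!()
            push!(nodes, SoftBreak())
            i += 1
        else
            # Anything else, including a lone backslash, is literal text.
            write(buf, c)
            i += 1
        end
    end
    flush!()
    return nodes
end

end # module
```

Calling `InlineTextSketch.inlinetext("foo\\\nbar")` yields a `Text` node holding `foo`, a `LineBreak`, and a `Text` node holding `bar`, which matches the example above.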
