Specifying a header ID that starts with a number is ignored #26

Open
ArthurZey opened this issue Mar 30, 2021 · 5 comments

@ArthurZey

Please forgive me if this is intended behavior, perhaps following a spec somewhere, but it seems that kramdown will autogenerate IDs that start with a number, yet will not respect manually specified IDs that start with one. I don't see my use case disclaimed under Specifying a Header ID. I'm cross-posting from gettalong/kramdown#711, where @gettalong suggested that this was expected behavior using the kramdown binary, but I'm still not 100% sure why, particularly since automatically generated header IDs are allowed to start with a number, and since the behavior apparently differs when using the binary.

Autogeneration of ID for headings starting with numbers works

For example:

## 2021-03-30

yields

<h2 id="2021-03-30">2021-03-30</h2>

as expected. (The example remains true for header text that is purely numeric, such as ## 123.)

(Although, I didn't understand how to reconcile that with the HTML Converter documentation for "Automatic Generation of Header IDs", which suggested that I should expect a different result.)
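
For comparison, here is a minimal sketch of the steps as I understand them from the "Automatic Generation of Header IDs" documentation. It is a paraphrase of the documented steps, not kramdown's source; `documented_auto_id` is a hypothetical name, and the duplicate-ID suffix step is left out. Under this reading, a purely numeric heading would fall through to the `section` fallback rather than keep its digits.

```ruby
# Hypothetical helper: paraphrases the documented auto-ID steps.
# Assumption: this mirrors the documentation, not the actual kramdown code.
def documented_auto_id(text)
  id = text.gsub(/[^a-zA-Z0-9 -]/, "")  # keep only letters, digits, spaces, dashes
  id = id.sub(/\A[^a-zA-Z]+/, "")       # drop everything before the first letter
  id = id.gsub(/[^a-zA-Z0-9]/, "-")     # turn remaining non-alphanumerics into dashes
  id = id.downcase                      # lowercase the result
  id.empty? ? "section" : id            # fall back to "section" if nothing is left
end

puts documented_auto_id("2021-03-30")
puts documented_auto_id("2021-03-30: Foo")
```

That differs from the `2021-03-30` ID shown above, which is the mismatch I'm describing.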

Specifying a custom header ID works for headings starting with [a-z]

And

## Foo
{: #bar}

and

## Foo {#bar}

both yield

<h2 id="bar">Foo</h2>

also as expected.

Specifying a custom header ID starting with [0-9] does not work

So here's where I get the unexpected behavior:

## 2021-03-30: Foo {#2021-03-30}

yields

<h2 id="2021-03-30-foo-2021-03-30">2021-03-30: Foo {#2021-03-30}</h2>

and

## 2021-03-30: Foo
{: #2021-03-30}

yields

<h2 id="2021-03-30-foo">2021-03-30: Foo</h2>

Of course, I was expecting the following in both of the two examples immediately above:

<h2 id="2021-03-30">2021-03-30: Foo</h2>

I'm gathering that kramdown is ignoring manually specified header IDs when they start with numbers (since even adding [a-z] characters later in the ID doesn't help).

Context

I'm using GitHub Pages, and my Gemfile.lock file specifies the following under github-pages (213):

kramdown (= 2.3.0)
kramdown-parser-gfm (= 1.1.0)

And under jekyll (3.9.0):

kramdown (>= 1.17, < 3)

and then also

    kramdown (2.3.0)
      rexml
    kramdown-parser-gfm (1.1.0)
      kramdown (~> 2.0)

My _config.yml includes the following lines:

kramdown:
  smart_quotes: ["apos", "apos", "quot", "quot"]
  typographic_symbols: { hellip: ... , mdash: --- , ndash: -- , laquo: "<<" , raquo: ">>" , laquo_space: "<< " , raquo_space: " >>" }
  auto_id_stripping: true

I noticed that I didn't have markdown: kramdown specified, but adding it didn't change the behavior, so I'm thinking that it may become implicit when a kramdown block is present.

FWIW, I've also confirmed the same behavior on this online kramdown editor/renderer, so it doesn't seem to be specific to a GitHub Pages or Jekyll implementation.

It's a lot easier to see effects immediately if you include a TOC block:

* TOC
{:toc}

@gettalong
Member

> I'm cross-posting from gettalong/kramdown#711, where @gettalong suggested that this was expected behavior using the kramdown binary, but I'm still not 100% sure why,

What I meant was: It is the expected behaviour when using the kramdown parser and the HTML converter. Since you are using the GFM parser and HTML converter, you might get different results.

@ArthurZey
Author

Indeed; I'm just saying I don't know why it's the expected behavior. What is the reason behind this behavior? It seems it must have been intentional. And then what's the reason for the difference in behavior between the binary and the GFM parser?

@gettalong
Member

The binary just uses whatever parser/converter pair you specify. If you specify GFM+HTML, you will get the same output as on Jekyll.

The difference in behaviour might be because of differences between the kramdown syntax and the GFM syntax. Also see

def generate_gfm_header_id(text)
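
The body of that method is not reproduced in this thread. As a rough sketch only, the following stdlib-only function is consistent with every ID reported in this issue; `gfm_style_header_id` is a hypothetical name, and the exact character class is an assumption rather than the kramdown-parser-gfm implementation. It also ignores duplicate-ID numbering.

```ruby
# Hypothetical helper, NOT the kramdown-parser-gfm source.
# Assumption: IDs come from lowercasing the heading text, dropping
# everything except word characters, dashes, spaces, and tabs, and then
# turning the remaining whitespace into dashes.
def gfm_style_header_id(text)
  id = text.downcase
  id = id.gsub(/[^\p{Word}\- \t]/, "")  # strips ":", "{", "#", "}" and similar
  id.tr(" \t", "-")                     # spaces and tabs become dashes
end

puts gfm_style_header_id("2021-03-30")
puts gfm_style_header_id("2021-03-30: Foo")
puts gfm_style_header_id("2021-03-30: Foo {#2021-03-30}")
```

If this sketch is close to the real behaviour, the third case suggests that the rejected `{#2021-03-30}` is simply treated as heading text and folded into the generated ID.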

@ArthurZey
Author

When I say "reason", I'm asking about the human decision to have this behavior, not what code causes it to manifest this way, though I will definitely look more carefully through the file you linked to; thank you!

Why did a human being decide that an ID can't start with a number?

@gettalong
Member

This is due to the way HTML4 specified the ID attribute. Also of interest: gettalong/kramdown@90954bc
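
For context, HTML 4 defines ID tokens as beginning with a letter ([A-Za-z]), followed by any number of letters, digits, hyphens, underscores, colons, and periods. A minimal sketch of that rule follows; `html4_id?` is a hypothetical helper written for illustration and is not part of kramdown.

```ruby
# HTML 4 ID token rule: a letter first, then letters, digits,
# "-", "_", ":" or ".". The helper name is hypothetical.
HTML4_ID = /\A[A-Za-z][A-Za-z0-9\-_:.]*\z/

def html4_id?(id)
  HTML4_ID.match?(id)
end

puts html4_id?("bar")
puts html4_id?("2021-03-30")
```

Under this rule `bar` is a valid ID and `2021-03-30` is not, which lines up with the custom IDs that were and were not honoured earlier in this issue.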
