Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should kramdown treat dd elements like other list entries? #482

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

ccorn
Copy link
Contributor

@ccorn ccorn commented Jan 1, 2018

Not sure whether this is intended behavior. It certainly irritates me:

$ kramdown <<EOF
$ item
$ : text
$ 
$   text
$ EOF
<dl>
  <dt>item</dt>
  <dd>text

    <p>text</p>
  </dd>
</dl>

A multi-paragraph definition does not wrap <p> around the first paragraph. As a consequence, the first paragraph does not get the margin-bottom that <p> is often styled with, as e.g. in the minima theme used in a default Jekyll setup. Which is how I noticed the issue.

Fortunately, the documentation specifies a way to explicitly request paragraph wrapping, by preceding the definition with a blank line:

$ kramdown <<EOF
$ item
$ 
$ : test
$ 
$   test
$ EOF
<dl>
  <dt>item</dt>
  <dd>
    <p>test</p>

    <p>test</p>
  </dd>
</dl>

But for other lists that handling is automatic. To achieve the same for definition lists, I added the following:

--- a/lib/kramdown/parser/kramdown/list.rb
+++ b/lib/kramdown/parser/kramdown/list.rb
@@ -201,6 +201,7 @@ module Kramdown
           elsif @src.check(EOB_MARKER)
             break
           elsif (result = @src.scan(content_re)) || (!last_is_blank && (result = @src.scan(lazy_re)))
+            item.options[:first_as_para] ||= last_is_blank
             result.sub!(/^(\t+)/) { " "*($1 ? 4*$1.length : 0) }
             result.sub!(indent_re, '')
             item.value << result

If I understand it correctly, the patched elsif branch is entered only when a dd entry is continued with a non-blank source line.

This patch only changes the output in cases where it previously contained a mixture of non-wrapped inline text adjacent to block elements, which supposedly should be avoided. The explicit wrapping form is still possible, but only necessary for single-paragraph items with this patch.

The kramdown documentation features 84 places where this change has an effect. I have made a diff of the resulting htmldoc.

Unfortunately, the htmldoc CSS uses margin-top for <p> which makes multi-paragraph definitions look bad: The definition term is spaced closer to the preceding text than to its own definition entry. But that is due to the unconditional style rule. Here is a quick fix:

--- a/doc/_design.scss
+++ b/doc/_design.scss
@@ -157,6 +157,10 @@ body {
         font-weight: bold;
     }
 
+    dd > p:first-child {
+        margin-top: 0;
+    }
+
     a {
         color: $link-color;
         text-decoration: underline;

But note that this style change also applies to explitly paragraphed definition lists. Compare the explicitly paragraphed example in the documentation with the output after the style change. I consider this an improvement.

Yet I cannot rule out the possibility that I am trying to fix something which is not broken. Perhaps the previous behavior has all been intended to provide users with a way to mix inline with block content in <dd> elements? If so, please point me to some rationale. Thanks.

@ccorn
Copy link
Contributor Author

ccorn commented Jan 1, 2018

It should be noted that this change breaks previous CSS-based workarounds. If users never put empty lines before definitions and put the following kludge in their (S)CSS:

/* Fix vertical spacing in <dd>text<p>More text</p></dd> */
dd > p:first-child { margin-top: $parskip; }

Then they would have to remove that now (or change $parskip to 0 explicitly) because that rule also applies when there is no text before the first <p> child.

@ccorn
Copy link
Contributor Author

ccorn commented Jan 1, 2018

I find that this behavior is actually specified (source). However, my impression is that this behavior might be the reason for style rules like the margin-top of blocks and countermeasures like negative margin-bottoms of headers. Apparently, the CSS has been made such that authors can avoid putting empty lines before their definitions. When playing with the CSS, I find that one can get along nicely without margin-top and negative margin-bottom just by using the proposed parser change. This would also render the CSS quickfix obsolete.
But since it is documented, the proposed (one-line) parser change should probably be made optional.

@ccorn
Copy link
Contributor Author

ccorn commented Jan 2, 2018

I have introduced a boolean option :auto_dd_para for the proposed parser change so that you can experiment a bit. Since it is off by default and not activated for the webgen-based documentation, I have reverted the CSS quickfix. Nevertheless, to get rid of the following webgen warning, I have replaced a color expression in doc/_design.scss.

DEPRECATION WARNING on line 166 of /_design.scss:
The operation `#1666A3 div 2` is deprecated and will be an error in future versions.
Consider using Sass's color functions instead.
http://sass-lang.com/documentation/Sass/Script/Functions.html#other_color_functions

By the way, the option handling framework is really convenient. No duplicated descriptions! Thanks for that.

I'd like your input on whether the proposed option semantics should, or should not, alternatively focus on that line and semantic criteria like it.children.size rather than on lexical criteria as currently.

@ccorn
Copy link
Contributor Author

ccorn commented Jan 2, 2018

By the way, I get 2838 test runs without MathjaxNode tests whereas the CI only does 2818 test runs with MathjaxNode tests included. The numbers are for Ruby 2.4. No obvious clue what the CI skips.

@gettalong gettalong self-assigned this Jan 5, 2018
@gettalong
Copy link
Owner

Thanks for all the input - however, I need some time to think through all of it.

Without the new option, as is the default, the first paragraph in
a multi-paragraph dd entry is not marked as a :p block unless
explicitly requested with an empty line before the dd entry.

With the new option, having more than one paragraph in the dd
entry causes the first paragraph to be p-marked as well.
The explicit syntax with a preceding empty line remains available,
but is only necessary for single-paragraph items with this option.
@ccorn
Copy link
Contributor Author

ccorn commented Jan 7, 2018

Merged changes from master, then rebased and squashed.

This version also handles EOB-separated paragraphs inside :dd elements.

Also undoes the SCSS color manipulation (fixed in separate branch)
@ccorn
Copy link
Contributor Author

ccorn commented Jan 8, 2018

Changed :auto_dd_para criterion to a semantic version (several block children) instead of the lexical one (contains blank line followed by something else). This also handles examples like

item
: text
  ^
  text

While I was at it, I noticed that a :dd item's .options[:first_as_para] is only conditionally cleaned up. I have changed that to clean it up unconditionally.

This needs review. I have assumed that the children (created by parse_blocks) are all nontrivial blocks, so it.children.size > 1 tells me all that is needed. Maybe that is wrong.
(Actually, counting initial blank blocks is OK with me, since these can reasonably be interpreted as request for paragraph wrappings. That's the idea behind :auto_dd_para.)

I have verified that the generated htmldoc matches that of master except for the additional documentation. No changes to the tag layout because the kramdown documentation does not use :auto_dd_para.

I have also removed the color function change to doc/_design.scss. Firstly because it was unrelated (goes to a separate PR), secondly because it was wrong (the resulting color changed to black with that change).

@ccorn
Copy link
Contributor Author

ccorn commented Jan 8, 2018

Added more tests for the :auto_dd_para option. Take a look at test/testcases/block/13_definition_list/auto_dd_para.text to get a glimpse of the current semantics:

  • Plain para is wrapped in <p> regardless of the state of the :auto_dd_para option.
  • para-iff-auto is wrapped in <p> if and only if the :auto_dd_para option is set.
  • inline is never wrapped in <p>.

You might notice that an item8 is missing. That was an interesting case causing a test failure in the text-to-kramdown-to-html path. That's probably a bug worth a separate issue.

@gettalong
Copy link
Owner

I have looked at your changes and I agree that such an option might be useful. However, I'm always cautious when the parser is involved since the generated AST should be the same for all kramdown documents regardless of set options (except the for the few parser options there are). And this option would generate a different AST.

@ccorn
Copy link
Contributor Author

ccorn commented Feb 3, 2018

Yes, it is a parser option affecting the AST. It deliberately changes the grammar in order to get rid of (presumably unintented) mixed block/inline sequences.
It does not affect how output converters process the AST. Therefore, identical ASTs produce the same output as before.

@ccorn
Copy link
Contributor Author

ccorn commented Feb 3, 2018

I'll have to update the wording in the specification to more closely match the implementation. The current (semantic) implementation is said to take effect when there are several blocks. In more visual terms, it actually takes effect when there are block boundaries before the end, which I think is the most intuitive or least surprising behavior, and perhaps the only one that thoroughly gets rid of mixed block/inline sequences. If you can think of any counterintuitive syntax cases enabled by the current implementation, I'd like to hear about that.

@ccorn
Copy link
Contributor Author

ccorn commented Feb 5, 2018

Just noticed the following:

$ kramdown <<EOF
$ * text
$   ^
$   text
$ EOF
<ul>
  <li>text
    <p>text</p>
  </li>
</ul>

That is, other list types actually use the lexical criterion (need blank line) instead of the semantic one (contains block boundaries).
For uniformity, :auto_dd_para should then switch back to the lexical criterion.
But if the primary purpose is to eliminate inline/block sequences, then all list types should be switched to the semantic criterion. Then :auto_dd_para should be :auto_para, and the parser would have to be patched at several places. (Well... two places.)
An alternative implementation of :auto_para would post-process the AST and remove the :transparent option from paragraphs that are followed by block elements. That is thorough but adds overhead.

@gettalong
Copy link
Owner

I have time this weekend, will take a closer look and then decide. New release will also probably be done.

Thanks for your patience!

@ccorn
Copy link
Contributor Author

ccorn commented Mar 8, 2018

No need to hurry this one. This auto-para branch still needs work. The current state solves most inline+block issues with dd elements, but as explained above, there are still cases left with other list types, so the option introduced here is subject to change. Work here will also need coordination with #486. I have worked on neither one recently. You can close this PR if you do not like it lingering around, I'll reopen it when it looks complete again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants