Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Links with nested HTML or \n confuse the markdown converter #53

Open
sorcio opened this issue May 7, 2022 · 1 comment
Open

Links with nested HTML or \n confuse the markdown converter #53

sorcio opened this issue May 7, 2022 · 1 comment

Comments

@sorcio
Copy link
Contributor

sorcio commented May 7, 2022

A couple cases that generate the wrong markup.

1. nested html elements in link

<a href="http://example.com/"><span>foo</span></a> <a href="http://example.com/">bar</a> foo bar

gets converted to

[foo](http://example.com/) bar foo [bar](http://example.com/)

when I would expect

[foo](http://example.com/) [bar](http://example.com/) foo bar

What happens: the walker enters the a and then enters the span again, finding the same text twice. But the proper label text was already consumed, so it consumes until it finds another match. In some cases (like above) it even linkifies the wrong text.

2. line breaks in link

<a href="https://example.com/">foo
bar</a>

(notice the line break) gets converted to

foo bar

when I would expect

[foo bar](https://example.com/)

What happens: the node textContent is 'foo\nbar' which doesn't match any text in the plaintext.

2b. <br> in link

Another similar (but possibly harder to fix) case is

<a href="https://example.com/">foo<br>bar</a>

which like 2 doesn't create a link:

foo
bar

In general, since the code can't rely on the browser to properly deal with HTML content, some of these corner cases will probably keep popping up. But this happened in the real world (try copying and pasting the first news entry in the deprecated section here) and it seemed to be significant enough to report.

I will follow up with a PR that suggests potential fixes for 1 and 2 (but not 2b), but this is not my field so it might be far from good.

@imjohnbo
Copy link
Contributor

Thanks for fixing these edge cases, @sorcio ❇️! Note that plaintext paste (Ctrl/Cmd + Shift + V, improvements coming) can always be used if future edge cases are encountered.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants