v3.0.0 Release
Full Changelog: v2.2.0...v3.0.0
⬆️ UPGRADE: Drop support for Python 3.7
Also add testing for Python 3.11
⬆️ UPGRADE: Update from upstream markdown-it 12.2.0 to 13.0.0
A key change is the addition of a new Token type, text_special, which is used to represent HTML entities and backslash escaped characters.
This ensures that (core) typographic transformation rules are not incorrectly applied to these texts.
The final core rule is now the new text_join rule, which joins adjacent text/text_special tokens, and so no text_special tokens should be present in the final token stream.
Any custom typographic rules should be inserted before text_join.
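Why the ordering matters can be shown with a small standalone sketch. This is not markdown-it-py's implementation: the Tok class and the replace_copyright and text_join functions below are hypothetical stand-ins, and the token list is an assumed tokenisation of the source `(c) \(c)`, where the escaped parenthesis becomes a text_special token.

```python
from dataclasses import dataclass


@dataclass
class Tok:
    """Hypothetical stand-in for a token; only the two fields needed here."""
    type: str
    content: str


def replace_copyright(tokens):
    """Toy typographic rule: turn '(c)' into the copyright sign in plain text only."""
    return [
        Tok(t.type, t.content.replace("(c)", "\u00a9")) if t.type == "text" else t
        for t in tokens
    ]


def text_join(tokens):
    """Toy final rule: relabel text_special as text and merge adjacent text tokens."""
    joined = []
    for tok in tokens:
        if tok.type == "text_special":
            tok = Tok("text", tok.content)
        if joined and joined[-1].type == "text" and tok.type == "text":
            joined[-1] = Tok("text", joined[-1].content + tok.content)
        else:
            joined.append(tok)
    return joined


def make_tokens():
    """Assumed tokenisation of the source `(c) \\(c)`."""
    return [Tok("text", "(c) "), Tok("text_special", "("), Tok("text", "c)")]


# Typographic rule before the join: the escaped "(c)" is left alone.
right = text_join(replace_copyright(make_tokens()))

# Typographic rule after the join: the escape is lost and both are replaced.
wrong = replace_copyright(text_join(make_tokens()))
```

In this sketch, `right` holds a single text token in which only the unescaped `(c)` was replaced, while `wrong` has both occurrences replaced, which is the behaviour the text_special token is meant to prevent.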
A new linkify rule has also been added to the inline chain, which will linkify full URLs (e.g. https://example.com), and fixes a collision between emphasis and the linkifier (so http://example.org/foo._bar_-_baz is now a single link, not emphasized).
Emails and fuzzy links are not affected by this.
- ♻️ Refactor backslash escape logic, add text_special #276
- ♻️ Parse entities to text_special token #280
- ♻️ Refactor: Add linkifier rule to inline chain for full links #279
- ‼️ Remove (p) => § replacement in typographer #281
- ‼️ Remove unused silent arg in ParserBlock.tokenize #284
- 🐛 FIX: numeric character reference passing #272
- 🐛 Fix: tab preventing paragraph continuation in lists #274
- 👌 Improve nested emphasis parsing #273
- 👌 fix possible ReDOS in newline rule #275
- 👌 Improve performance of skipSpaces/skipChars #271
- 👌 Show text of text_special in tree.pretty #282
♻️ REFACTOR: Replace most character code use with strings
The use of StateBase.srcCharCode is deprecated (with backward-compatibility), and all core uses are replaced by StateBase.src.
Converting source string characters to integers representing their Unicode code points is prevalent in the upstream JavaScript implementation, where it improves performance.
However, it is unnecessary in Python and leads to harder-to-read code and performance degradation (during the conversion in the StateBase initialisation).
See #270, thanks to @hukkinj1.
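For authors of custom rules, the migration mostly amounts to comparing characters directly instead of comparing integer code points. A minimal sketch, using a plain string in place of a real StateBase object; the helper names are hypothetical, and the tuple of code points is only an assumption about what the up-front conversion roughly involves:

```python
src = "*emphasis*"

# Old style needs every character converted to an integer up front
# (assumed shape of the deprecated data, for illustration only).
src_char_code = tuple(ord(c) for c in src)


def is_marker_by_code(char_codes, pos):
    """Old style: compare against an integer code point (0x2A is '*')."""
    return char_codes[pos] == 0x2A


def is_marker_by_str(source, pos):
    """New style: compare the character itself."""
    return source[pos] == "*"


# Both checks agree at every position; the string form needs no conversion.
agree = all(
    is_marker_by_code(src_char_code, i) == is_marker_by_str(src, i)
    for i in range(len(src))
)
```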
♻️ Centralise indented code block tests
For CommonMark, the presence of indented code blocks prevents any other block element from having an indent of 4 or more spaces.
Certain Markdown flavors and derivatives, such as mdx and djot, disable these code blocks though, since it is more common to use code fences and/or arbitrary indenting is desirable.
Previously, disabling code blocks did not remove the indent limitation, since most block elements had the 3 space limitation hard-coded.
This change centralises the logic of applying this limitation (in StateBlock.is_code_block), and only applies it when indented code blocks are enabled.
This allows for e.g.
<div>
  <div>

    I can indent as much as I want here.

  </div>
</div>
See #260
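The centralised check can be sketched as a single predicate that every block rule consults. This is an illustrative standalone function, not the library's code: the parameter names and the free-function form are assumptions, since the text only names StateBlock.is_code_block without showing its signature.

```python
def is_code_block(indent, block_indent, code_enabled):
    """Return True if a line should be treated as an indented code block.

    indent        -- leading spaces on the line (hypothetical parameter)
    block_indent  -- indent required by the enclosing block (hypothetical)
    code_enabled  -- whether the indented code block rule is active
    """
    return code_enabled and (indent - block_indent) >= 4


def may_start_other_block(indent, block_indent, code_enabled):
    """Other block rules bail out only when the line counts as indented code."""
    return not is_code_block(indent, block_indent, code_enabled)


# With indented code enabled, 4+ spaces of relative indent blocks other rules.
limited = may_start_other_block(4, 0, code_enabled=True)

# With indented code disabled, the same line may start any block element.
unlimited = may_start_other_block(4, 0, code_enabled=False)
```

Keeping the flag inside one predicate means disabling indented code blocks lifts the limit everywhere at once, rather than each rule carrying its own hard-coded 3-space check.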
🔧 Maintenance changes
Strict type annotation checking has been applied to the whole code base,
ruff is now used for linting,
and fuzzing tests have been added to the CI, to integrate with Google OSS-Fuzz testing, thanks to @DavidKorczynski.