Releases: jgm/pandoc
pandoc 2.14.0.1
-
Commonmark reader: Fix regression in 2.14 with YAML metadata block parsing, which could cause the document body to be omitted after metadata (#7339).
-
HTML reader: fix column width regression in 2.14 (#7334). Column widths specified with a style attribute were off by a factor of 100.
-
Markdown reader: in `rebasePaths`, check for both Windows and Posix absolute paths. Previously Windows pandoc was treating `/foo/bar.jpg` as non-absolute.
-
Text.Pandoc.Logging: In rendering `LoadedResource`, use relative paths.
-
Docx writer: fix regression on captions (#7328). The “Table Caption” style was no longer getting applied. (It was overwritten by “Compact.”)
-
Use commonmark-extensions 0.2.1.2
pandoc 2.14
-
Change reader types, allowing better tracking of source positions [API change]. Previously, when multiple file arguments were provided, pandoc simply concatenated them and passed the contents to the readers, which took a Text argument. As a result, the readers had no way of knowing which file was the source of any particular bit of text. This meant that we couldn’t report accurate source positions on errors or include accurate source positions as attributes in the AST. More seriously, it meant that we couldn’t resolve resource paths relative to the files containing them (see e.g. #5501, #6632, #6384, #3752).
-
Add `rebase_relative_paths` extension (#3752). When enabled, this extension rewrites relative image and link paths by prepending the (relative) directory of the containing file. This behavior is useful when your input sources are split into multiple files, across several directories, with files referring to images stored in the same directory. The extension can be enabled for all markdown and commonmark-based formats.
-
Add Text.Pandoc.Sources (exported module), with a `Sources` type and a `ToSources` class. A `Sources` wraps a list of `(SourcePos, Text)` pairs [API change]. A parsec `Stream` instance is provided for `Sources`. The module also exports versions of parsec’s `satisfy` and other Char parsers that track source positions accurately from a `Sources` stream (or any instance of the new `UpdateSourcePos` class).
-
Text.Pandoc.Parsing
- Export the modified Char parsers defined in Text.Pandoc.Sources instead of the ones parsec provides. Modified parsers to use a `Sources` as stream [API change].
- Improve include file functions [API change]. Remove old `insertIncludedFileF`. Give `insertIncludedFile` a more general type, allowing it to be used where `insertIncludedFileF` was.
- Add parameter to the `citeKey` parser from Text.Pandoc.Parsing, which controls whether the `@{..}` syntax is allowed [API change].
-
Text.Pandoc.Error: Modified the constructor `PandocParsecError` to take a `Sources` rather than a `Text` as first argument, so parse error locations can be accurately reported.
-
Fix source position reporting for YAML bibliographies (#7273).
-
Issue error message when reader or writer format is malformed (#7231). Previously we exited with an error status but (due to a bug) no message.
-
Smarter smart quotes (#7216, #2103). Treat a leading `"` with no closing `"` as a left curly quote. This supports the practice, in fiction, of continuing paragraphs quoting the same speaker without an end quote. It also helps with quotes that break over lines in line blocks.
-
Markdown reader:
- Use MetaInlines not MetaBlocks for multimarkdown metadata fields. This gives better results in converting to e.g. pandoc markdown.
- Implement curly-brace syntax for Markdown citation keys (#6026). The change provides a way to use citation keys that contain special characters not usable with the standard citation key syntax. Example: `@{foo_bar{x}'}` for the key `foo_bar{x}`. It also allows separating citation keys from immediately following text, e.g. `@{foo}A`.
-
RST reader:
- Seek include files in the directory of the file containing the include directive, as RST requires (#6632).
- Use `insertIncludedFile` from Text.Pandoc.Parsing instead of reproducing much of its code.
-
Org reader: Resolve org includes relative to the directory containing the file containing the INCLUDE directive (#5501).
-
ODT reader: Treat tabs as spaces (#7185, niszet).
-
Docx reader:
-
LaTeX reader:
-
ConTeXt writer: improve ordered lists (#5016, Denis Maier). Change ordered list from itemize to enumerate. Add new itemgroup for ordered lists. Remove manual insertion of width attributes. Use tabular figures in ordered list enumerators.
-
HTML reader:
- Don’t fail on unmatched closing “script” tag (Albert Krewinkel, #7282).
- Keep h1 tags as normal headers (#2293, Albert Krewinkel). The tags `<title>` and `<h1 class="title">` often contain the same information, so the latter was dropped from the document. However, as this can lead to loss of information, the heading is now always retained. Use `--shift-heading-level-by=-1` to turn the `<h1>` into the document title, or a filter to restore the previous behavior.
- Handle relative lengths (e.g. `2*`) in HTML column widths (#4063). See https://www.w3.org/TR/html4/types.html#h-6.6.
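To illustrate what a relative length such as `2*` means, here is a minimal Python sketch that turns a list of HTML4 relative lengths into fractions of the table width. It is not pandoc's code: the function name `relative_widths` is hypothetical, and dividing the width in proportion to the weights is an assumption based on the HTML4 description of relative lengths, not on the changelog.

```python
from fractions import Fraction


def relative_widths(specs):
    """Convert HTML4 relative lengths such as '2*' into fractions of the total.

    Hypothetical helper for illustration only; it handles the case where
    every column uses a relative length.
    """
    weights = []
    for spec in specs:
        spec = spec.strip()
        if not spec.endswith("*"):
            raise ValueError(f"not a relative length: {spec!r}")
        number = spec[:-1]
        # In HTML4, a bare '*' is equivalent to '1*'.
        weights.append(int(number) if number else 1)
    total = sum(weights)
    if total == 0:
        raise ValueError("relative lengths must not all be zero")
    return [Fraction(weight, total) for weight in weights]


# Three columns: the middle one gets twice the share of the others.
print([float(w) for w in relative_widths(["1*", "2*", "*"])])
```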
-
DocBook/JATS readers:
-
DocBook reader: ensure that first and last names are separated (#6541).
-
Jira reader (Albert Krewinkel, #7218):
- Support “smart” links: `[alias|https://example.com|smart-card]` syntax.
- Allow spaces and most unicode characters in attachment links.
- No longer require a newline character after `{noformat}`.
- Only allow URI path segment characters in bare links.
- The `file:` schema is no longer allowed in bare links; these rarely make sense.
-
Plain writer: handle superscript unicode minus (#7276).
-
LaTeX writer:
- Better handling of line breaks in simple tables (#7272). Now we also handle the case where they’re embedded in other elements, e.g. spans.
- For beamer output, support `exampleblock` and `alertblock` (#7278). A block will be rendered as an `exampleblock` if the heading has class `example` and an `alertblock` if it has class `alert`.
- Separate successive quote chars with thin space (#6958, Albert Krewinkel). Successive quote characters are separated with a thin space to improve readability and to prevent unwanted ligatures. Detection of these quotes had sometimes failed if the second quote was nested in a span element.
-
EPUB Writer: Fix belongs-to-collection XML id choice (#7267, nuew). The epub writer previously used the same XML id for both the book identifier and the epub collection. This causes an error on epubcheck.
-
BibTeX/BibLaTeX writer: Handle `annote` field (#7266).
-
ZimWiki writer: allow links and emphasis in headers (#6605, Albert Krewinkel).
-
ConTeXt writer:
- Support blank lines in line blocks (#6564, Albert Krewinkel, thanks to @denismaier).
- Use span identifiers as reference anchors (#7246, Albert Krewinkel).
-
HTML writer:
- Keep attributes from code nested below `pre` tag (#7221, Albert Krewinkel). If a code block is defined with `<pre><code class="language-x">…</code></pre>`, where the `<pre>` element has no attributes, then the attributes from the `<code>` element are used instead. Any leading `language-` prefix in the code’s `class` attribute is dropped to improve syntax highlighting.
- Ensure headings only have valid attribs in HTML4 (#5944, Albert Krewinkel).
- Parse `<header>` as a Div (Albert Krewinkel).
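The attribute rule for code nested below `pre` can be sketched in Python as follows. This is only an illustration of the behavior described above, not pandoc's implementation. The function name `code_block_attributes` and the use of plain dictionaries for attributes are assumptions made for this example.

```python
def code_block_attributes(pre_attrs, code_attrs):
    """Pick the attributes for a code block written as <pre><code>...</code></pre>.

    Hypothetical helper: attributes are modelled as plain dicts.
    """
    # If the <pre> element carries attributes of its own, keep them.
    if pre_attrs:
        return dict(pre_attrs)
    # Otherwise fall back to the attributes of the nested <code> element.
    attrs = dict(code_attrs)
    prefix = "language-"
    if "class" in attrs:
        classes = attrs["class"].split()
        # Drop a leading "language-" prefix so that "language-x" becomes "x".
        classes = [c[len(prefix):] if c.startswith(prefix) else c for c in classes]
        attrs["class"] = " ".join(classes)
    return attrs


print(code_block_attributes({}, {"class": "language-haskell"}))
```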
-
Org writer:
- Inline latex envs need newlines (#7252, tecosaur). As specified in https://orgmode.org/manual/LaTeX-fragments.html, an inline LaTeX block must start on a new line.
- Use LaTeX style maths delimiters (#7196, tecosaur).
-
JATS writer (Albert Krewinkel):
- Use either styled-content or named-content for spans (#7211). If the element has a content-type attribute, or at least one class, then that value is used as `content-type` and the span is put inside a `<named-content>` element. Otherwise a `<styled-content>` element is used instead.
- Reduce unnecessary use of `<p>` elements for wrapping (#7227). The `<p>` element is used for wrapping in cases where the contents would otherwise not be allowed in a certain context. Unnecessary wrapping is avoided, especially around quotes (`<disp-quote>` elements).
- Convert spans to `<named-content>` elements (#7211). Spans with attributes are converted to `<named-content>` elements instead of being wrapped with `<milestone-start/>` and `<milestone-end>` elements. Milestone elements are not allowed in documents using the articleauthoring tag set, so this change ensures the creation of valid documents.
- Add footnote number as label in backmatter (#7210). Footnotes in the backmatter are given the footnote’s number as a label. The articleauthoring output is unaffected by this change, as footnotes are placed inline there.
- Escape disallowed chars in identifiers. XML identifiers must start with an underscore or letter, and can contain only a limited set of punctuation characters. Any IDs not adhering to these rules are rewritten by writing the offending characters as `Uxxxx`, where `xxxx` is the character’s hex code.
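The identifier escaping described in the last item can be sketched in Python. This is a rough illustration rather than the JATS writer's code. The function name `escape_xml_id` is hypothetical. The set of allowed punctuation (`_`, `-`, `.`) and the four-digit uppercase hex format are assumptions, since the changelog does not spell them out.

```python
def escape_xml_id(identifier):
    """Rewrite characters that are not allowed in an XML identifier as Uxxxx.

    Hypothetical helper; the allowed character set below is an assumption.
    """
    def valid_start(ch):
        # Identifiers must start with an underscore or a letter.
        return ch == "_" or ch.isalpha()

    def valid_rest(ch):
        # Later positions also allow digits and a small set of punctuation.
        return valid_start(ch) or ch.isdigit() or ch in "-."

    escaped = []
    for index, ch in enumerate(identifier):
        allowed = valid_start(ch) if index == 0 else valid_rest(ch)
        # Offending characters are replaced by 'U' plus their hex code point.
        escaped.append(ch if allowed else f"U{ord(ch):04X}")
    return "".join(escaped)


print(escape_xml_id("1st section"))
```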
-
Jira writer: use `{color}` when span has a color attribute (Albert Krewinkel, tarleb/jira-wiki-markup#10).
-
Docx writer:
- Autoset table width if no column has an explicit width (Albert Krewinkel).
- Extract Table handling into separate module (Albert Krewinkel).
- Support colspans and rowspans in tables (Albert Krewinkel, #6315).
- Support multirow table headers (Albert Krewinkel).
- Improve integration ...
pandoc 2.13
-
Support `yaml_metadata_block` extension for `commonmark`, `gfm` (#6537). This support is a bit more limited than with pandoc’s `markdown`. The YAML block must be the first thing in the input, and the leaf nodes are parsed in isolation from the rest of the document. So, for example, you can’t use reference links if the references are defined later in the document.
-
Fix fallback to default partials when custom templates are used. If the directory containing a template does not contain the partial, it should be sought in the default templates, but this was not working properly (#7164).
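The intended lookup order can be sketched in Python: try the directory of the custom template first, then the default templates. This is a simplified illustration, not pandoc's implementation. The function name `find_partial` and the directory arguments are hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_partial(name: str, template_dir: str, default_dir: str) -> Optional[Path]:
    """Locate a partial, preferring the custom template's directory.

    Hypothetical helper: `template_dir` is the directory containing the
    custom template and `default_dir` holds the default templates.
    """
    for directory in (Path(template_dir), Path(default_dir)):
        candidate = directory / name
        if candidate.is_file():
            return candidate
    # Neither location has the partial.
    return None
```

A caller would report an error only when `find_partial` returns `None`, that is, when the partial is missing from both locations.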
-
Handle `nocite` better with `--biblatex` and `--natbib` (#4585). Previously the nocite metadata field was ignored with these formats. Now it populates a `nocite-ids` template variable and causes a `\nocite` command to be issued.
-
Text.Pandoc.Citeproc: apply `fixLinks` correctly (#7130). This is code that incorporates a prefix like `https://doi.org/` into a following link when appropriate.
-
Text.Pandoc.Shared:
- Remove `backslashEscapes`, `escapeStringUsing` [API change]. Replace these inefficient association list lookups with more efficient escaping functions in the writers that used them (for a 10-25% performance boost in org, haddock, rtf, texinfo writers).
- Remove `ToString`, `ToText` typeclasses [API change]. These were needed for the transition from String to Text, but they are no longer used and may clash with other things.
- Simplify `compactDL`.
-
Text.Pandoc.Parsing:
- Change type of `readWithM` so that it is no longer polymorphic [API change]. The `ToText` class has been removed, and now that we’ve completed the transition to Text we no longer need this to operate on Strings.
- Remove `F` type synonym [API change]. Muse and Org were defining their own `F` anyway.
-
Text.Pandoc.Readers.Metadata:
- Export `yamlMetaBlock` [API change].
- Make `yamlBsToMeta`, `yamlBsToRefs` polymorphic on the parser state [API change].
-
Markdown reader: Fix regression with `tex_math_backslash` (#7155).
-
MediaWiki reader: Allow block-level content in notes (ref) (#7145).
-
Jira reader (Albert Krewinkel):
- Fixed parsing of autolinks (i.e., of bare URLs in the text). Previously an autolink would take up the rest of a line, as spaces were allowed characters in these items.
- Emoji character sequences no longer cause parsing failures. This was due to missing backtracking when emoji parsing fails.
- Mark divs created from panels with class “panel”.
-
RST reader: fix logic for ending comments (#7134). Previously comments sometimes got extended too far.
-
DocBook writer: include Header attributes as XML attributes on section (Erik Rask). Attributes with key names that are not allowed as XML attributes are dropped, as are attributes with invalid values and `xml:id` (DocBook 5) and `id` (DocBook 4).
-
Docx writer:
- Make `nsid` in `abstractNum` deterministic. Previously we assigned a random number, but we don’t need random values, so now we just assign a value based on the list marker.
- Use integral values for `w:tblW` (#7141).
-
Jira writer (Albert Krewinkel):
- Block quotes are only rendered as `bq.` if they do not contain a linebreak.
- Improve div/panel handling. Include div attributes in panels, always render divs with class `panel` as panels, and avoid nesting of panels.
-
HTML writer: Add warnings on duplicate attribute values. This prevents emitting invalid HTML. Ultimately it would be good to prevent this in the types themselves, but this is better for now.
-
Org writer: Prevent unintended creation of ordered list items (#7132, Albert Krewinkel). Adjust line wrapping if default wrapping would cause a line to be read as an ordered list item.
-
JATS templates: support ‘equal-contrib’ attrib for authors (Albert Krewinkel). Authors who contributed equally to a paper may be marked with `equal-contrib`.
-
reveal.js template: replace JS comment with HTML (#7154, Florian Kohrt).
-
Text.Pandoc.Logging: Add `DuplicateAttribute` constructor to `LogMessage`. [API change]
-
Use `-j4` for linux release build. This speeds up the build dramatically on arm.
-
cabal.project: remove ghcoptions. Move flags to top level, so they can be set differently on the command line.
-
Require latest texmath, skylighting, citeproc, jira-wiki-markup. (The latest skylighting fixes a bad bug with Haskell syntax highlighting.) Narrow version bounds for texmath, skylighting, and citeproc, since the test output depends on them.
-
Use doclayout 0.3.0.2. This significantly reduces the time and memory needed to compile pandoc.
-
Use `foldl'` instead of `foldl` everywhere.
-
Update bounds for random (#7156, Alexey Kuleshevich).
-
Remove uses of some partial functions.
-
Don’t bake in a larger stack size for the executable.
-
Test improvements:
- Use `getExecutablePath` from base, avoiding the dependency on `executable-path`.
- Factor out `setupEnvironment` in Helpers, to avoid code duplication.
- Fix finding of data files by setting the `pandoc_datadir` environment variable when we shell out to pandoc. This avoids the need to use `--data-dir` for the tests, which caused problems finding `pandoc.lua` when compiling without the `embed_data_files` flag (#7163).
-
Benchmark improvements:
- Build `+RTS -A8m -RTS` into default ghc-options for benchmark. This is necessary to get accurate benchmark results; otherwise we are largely measuring garbage collecting, some not related to the current benchmark.
- Allow specifying BASELINE file in ‘make bench’ for comparison (otherwise the latest benchmark is chosen by default).
- Force `readFile` in benchmarks early (Bodigrim).
-
CONTRIBUTING: suggest using a `cabal.project.local` file (#7153, Albert Krewinkel).
-
Add ghcid-test to Makefile. This loads the test suite in ghcid.
pandoc 2.12
-
`--resource-path` now accumulates if specified multiple times (#6152). Resource paths specified later on the command line are prepended to those specified earlier. Thus, `--resource-path foo --resource-path bar:baz` is equivalent to `--resource-path bar:baz:foo`. (The previous behavior was for the last `--resource-path` to replace all the rest.) `resource-path` in defaults files behaves the same way: it will be prepended to the resource path set by earlier command line options or defaults files. This change facilitates the use of multiple defaults files: each can specify a directory containing resources it refers to without clobbering the resource paths set by the others.
-
Allow defaults files to refer to the home directory, the user data directory, and the directory containing the defaults file itself (#5871, #5982, #5977). In fields that expect file paths (and only in these fields), `${VARIABLE}` will expand to the value of the environment variable `VARIABLE` (and in particular `${HOME}` will expand to the path of the home directory). A warning will be raised for undefined variables. `${USERDATA}` will expand to the path of the user data directory in force when the defaults file is being processed. `${.}` will expand to the directory containing the defaults file. (This allows defaults files to be placed in a directory containing resources they make use of.)
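The two path-related changes above can be sketched in Python. This is a rough model of the described behavior, not pandoc's code. The function names, the `:` separator default, and the choice to substitute an empty string for undefined variables are assumptions made for the example.

```python
import posixpath
import re
import warnings


def accumulate_resource_path(specs, sep=":"):
    """Combine repeated resource-path values; later values are prepended."""
    combined = []
    for spec in specs:
        combined = spec.split(sep) + combined
    return combined


_VARIABLE = re.compile(r"\$\{([^}]+)\}")


def expand_defaults_path(path, defaults_file, user_data_dir, env):
    """Expand ${VARIABLE}, ${USERDATA} and ${.} in a defaults-file path field."""
    def replace(match):
        name = match.group(1)
        if name == ".":
            # Directory containing the defaults file itself.
            return posixpath.dirname(defaults_file) or "."
        if name == "USERDATA":
            return user_data_dir
        if name in env:
            return env[name]
        # Assumption: warn and substitute nothing for undefined variables.
        warnings.warn(f"undefined variable {name} in defaults file")
        return ""
    return _VARIABLE.sub(replace, path)


print(accumulate_resource_path(["foo", "bar:baz"]))
print(expand_defaults_path("${.}/images", "/proj/conf/defaults.yaml",
                           "/home/u/.pandoc", {"HOME": "/home/u"}))
```

The first call reproduces the example from the notes: `foo` followed by `bar:baz` yields the search order `bar`, `baz`, `foo`.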
-
When downloading content from URL arguments, be sensitive to the character encoding (#5600). We can properly handle UTF-8 and latin1 (ISO-8859-1); for others we raise an error. Fall back to latin1 if no charset is given in the mime type and UTF-8 decoding fails.
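The decoding rules in this item can be sketched in Python. It is an illustration of the stated policy only, not pandoc's code. The function name `decode_body` and the exact charset spellings it accepts are assumptions.

```python
def decode_body(body, charset=None):
    """Decode downloaded bytes following the charset policy described above.

    Hypothetical helper: `charset` is the value taken from the mime type,
    or None if the mime type did not specify one.
    """
    if charset is not None:
        name = charset.strip().lower()
        if name in ("utf-8", "utf8"):
            return body.decode("utf-8")
        if name in ("latin1", "iso-8859-1"):
            return body.decode("latin-1")
        # Any other declared encoding is reported as an error.
        raise ValueError(f"unsupported charset: {charset}")
    try:
        return body.decode("utf-8")
    except UnicodeDecodeError:
        # No charset given and UTF-8 failed: fall back to latin1.
        return body.decode("latin-1")
```

Decoding as latin1 cannot fail, because every byte maps to a code point. That makes it a safe last resort, although text in some other encoding will come out wrong.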
-
Allow abbreviations that don’t end in a period to be specified using `--abbreviations` (#7124).
-
Add new unexported module Text.Pandoc.XML.Light, as well as Text.Pandoc.XML.Light.Types, Text.Pandoc.XML.Light.Proc, Text.Pandoc.XML.Light.Output. (Closes #6001, #6565, #7091).
This module exports definitions of `Element` and `Content` that are isomorphic to xml-light’s, but with Text instead of String. This allows us to keep most of the code in existing readers that use xml-light, but avoid lots of unnecessary allocation.
We also add versions of the functions from xml-light’s Text.XML.Light.Output and Text.XML.Light.Proc that operate on our modified XML types, and functions that convert xml-light types to our types (since some of our dependencies, like texmath, use xml-light).
We export functions that use xml-conduit’s parser to produce an `Element` or `[Content]`. This allows existing pandoc code to use a better parser without much modification.
The new parser is used in all places where xml-light’s parser was previously used. Benchmarks show a significant performance improvement in parsing XML-based formats (with docbook, opml, jats, and docx almost twice as fast, odt and fb2 more than twice as fast).
In addition, the new parser gives us better error reporting than xml-light. We report XML errors, when possible, using the new `PandocXMLError` constructor in `PandocError`.
These changes revealed the need for some changes in the tests. The docbook-reader.docbook test lacked definitions for the entities it used; these have been added. And the docx golden tests have been updated, because the new parser does not preserve the order of attributes.
-
DocBook reader:
- Avoid expensive tree normalization step, as it is not necessary with the new XML parser.
- Support `informalfigure` (#7079) (Nils Carlson).
-
Docx reader:
- Use Map instead of list for Namespaces. This gives a speedup of about 5-10%. With this and the XML parsing changes, the docx reader is now about twice as fast as in the previous release.
-
HTML reader:
- Small performance tweaks.
- Also, remove exported class `NamedTag(..)` [API change]. This was just intended to smooth over the transition from String to Text and is no longer needed.
- As a result, the functions `isInlineTag` and `isBlockTag` are no longer polymorphic; they apply to a `Tag Text` [API change].
- Do a lookahead to find the right parser to use. This takes benchmarks from 34ms to 23ms, with less allocation.
- Fix bad handling of empty `src` attribute in `iframe` (#7099). If `src` is empty, we simply skip the `iframe`. If `src` is invalid or cannot be fetched, we issue a warning and skip instead of failing with an error.
-
JATS reader:
- Avoid tree normalization, which is no longer necessary given the new XML parser.
-
LaTeX reader:
- Don’t export `tokenize`, `untokenize` [API change]. These are internal implementation details, which were only exported for testing. They don’t belong in the public API.
- Improved efficiency of the parser. With these changes the reader is almost twice as fast as in the last release in our benchmarks.
- Code cleanup, removing some unnecessary things.
- Rewrite `withRaw` so it doesn’t rely on fragile assumptions about token positions (which break when macros are expanded) (#7092). This requires the addition of `sEnableWithRaw` and `sRawTokens` in `LaTeXState`, and a new combinator `disablingWithRaw` to disable collecting of raw tokens in certain contexts. Add `parseFromToks` to Text.Pandoc.Readers.LaTeX.Parsing. Fix parsing of single character tokens so it doesn’t mess up the new raw token collecting. These changes slightly increase allocations and have a small performance impact.
- Handle some bibtex/biblatex-specific commands that used to be dealt with in pandoc-citeproc (#7049).
- Optimize `satisfyTok`, avoiding unnecessary macro expansion steps. Benchmarks after this change show 2/3 of the run time and 2/3 of the allocation of the Feb. 10 benchmarks.
- Removed `sExpanded` in state. This isn’t actually needed and checking it doesn’t change anything.
- Improve `braced'`. Remove the parameter, have it parse the opening brace, and make it more efficient.
- Factor out pieces of the LaTeX reader to make the module smaller. This reduces memory demands when compiling. Created Text.Pandoc.Readers.{LaTeX,Math,Citation,Table,Macro,Inline}. Changed Text.Pandoc.Readers.LaTeX.SIunitx to export a command map instead of individual commands.
- Handle table cells containing `&` in `\verb` (#7129).
-
Make Text.Pandoc.Readers.LaTeX.Types an unexported module [API change].
-
Markdown reader:
- Improved handling of mmd link attributes in references (#7080). Previously they only worked for links that had titles.
- Improved efficiency of the parser (benchmarks show a 15% speedup).
-
OPML reader:
- Avoid tree normalization, which is no longer necessary with the new XML parser.
-
ODT reader:
- Finer-grained errors on parse failure (#7091).
- Give more information if the zip container can’t be unpacked.
-
Org reader:
- Support `task_lists` extension (Albert Krewinkel, #6336).
- Fix bug in org-ref citation parsing (Albert Krewinkel, #7101). The org-ref syntax allows listing multiple citations separated by commas. Previously commas were accepted as part of the citation id, so all citation lists were parsed as one single citation.
-
RST reader:
- Use `getTimestamp` instead of `getCurrentTime` to fetch timestamp. Setting `SOURCE_DATE_EPOCH` will allow reproducible builds.
- Fix handling of header in CSV tables (#7064). The interpretation of this line is not affected by the delim option.
-
Jira reader:
-
Text.Pandoc.Shared
- Remove formerly exported functions that are no longer used in the code base: `splitByIndices`, `splitStringByIndicies`, `substitute`, and `underlineSpan` (which had been deprecated in April 2020) [API change].
- Export `handleTaskListItem` (Albert Krewinkel) [API change].
- Change `defaultUserDataDirs` to `defaultUserDataDir` [API change]. We determine what is the default user data directory by seeing whether the XDG directory and/or legacy directory exist.
-
BibTeX writer:
- Use doclayout and doctemplate. This change allows bibtex/biblatex output to wrap as other formats do, depending on the settings of `--wrap` and `--columns` (#7068).
-
CSL JSON writer:
- Output `[]` if no references in input, instead of raising a PandocAppError as before.
-
Docx writer:
- Use `getTimestamp` instead of `getCurrentTime` for timestamp. Setting `SOURCE_DATE_EPOCH` will allow reproducible builds.
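The idea behind consulting `SOURCE_DATE_EPOCH` can be sketched in Python. This is an analogue for illustration, not the Haskell `getTimestamp` itself. The function name `get_timestamp`, its `env` parameter, and the fallback to the current time on a malformed value are assumptions.

```python
import os
from datetime import datetime, timezone


def get_timestamp(env=None):
    """Return a UTC timestamp, honoring SOURCE_DATE_EPOCH when it is set.

    Hypothetical helper: `env` defaults to the process environment and is
    a parameter only to make the behavior easy to demonstrate.
    """
    env = os.environ if env is None else env
    epoch = env.get("SOURCE_DATE_EPOCH")
    if epoch is not None:
        try:
            # The variable holds seconds since the Unix epoch.
            return datetime.fromtimestamp(int(epoch), tz=timezone.utc)
        except ValueError:
            pass  # Assumption: ignore a malformed value and use the clock.
    return datetime.now(timezone.utc)


print(get_timestamp({"SOURCE_DATE_EPOCH": "0"}).isoformat())
```

With the variable fixed, two runs over the same input embed the same timestamp, which is what makes the output reproducible.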
-
EPUB writer:
- Use `getTimestamp` instead of `getCurrentTime` for timestamp. Setting `SOURCE_DATE_EPOCH` will allow reproducible builds (#7093). This does not suffice to fully enable reproducible builds in EPUB, since a unique id is still being generated for each build.
- Support `belongs-to-collection` metadata (#7063) (Nick Berendsen).
-
JATS writer:
- Escape special chars in reference elements (Albert Krewinkel). Prevents the generation of invalid markup if a citation element contains an ampersand or another character with a special meaning in XML.
-
Jira writer:
- Use Span identifiers as anchors (Albert Krewinkel).
- Use `{noformat}` instead of `{code}` for unknown languages (Albert Krewinkel). Code blocks which are not marked as a language supported by Jira are rendered as preformatted text via `{noformat}` blocks.
-
LaTeX writer:
- Adjust hypertargets to beginnings of paragraphs (#7078). Use `\vadjust pre` so that the h...
pandoc 2.11.4
-
Add `biblatex`, `bibtex` as output formats (closes #7040).
-
Recognize more extensions as markdown by default (#7034): `mkdn`, `mkd`, `mdwn`, `mdown`, `Rmd`.
-
Implement defaults file inheritance (#6924, David Martschenko). Allow defaults files to inherit options from other defaults files by specifying them with the following syntax: `defaults: [list of defaults files or single defaults file]`.
-
Fix infinite HTTP requests when writing epubs from URL source (#7013). Due to a bug in code added to avoid overwriting the cover image if it had the form `fileX.YYY`, pandoc made an endless sequence of HTTP requests when writing epub with input from a URL.
-
Org reader:
- Allow multiple pipe chars in todo sequences (Albert Krewinkel, #7014). Additional pipe chars, used to separate “action” state from “no further action” states, are ignored. E.g., for the following sequence, both `DONE` and `FINISHED` are states with no further action required: `#+TODO: UNFINISHED | DONE | FINISHED`.
- Restructure output of captioned code blocks (Albert Krewinkel, #6977). The Div wrapper of code blocks with captions now has the class “captioned-content”. The caption itself is added as a Plain block inside a Div of class “caption”. This makes it easier to write filters which match on captioned code blocks. Existing filters will need to be updated.
- Mark verbatim code with class `verbatim` (Dimitri Sabadie, #6998).
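The todo-sequence rule from the first item can be sketched in Python. This is a simplified illustration, not the Org reader's parser. The function name `parse_todo_sequence` is hypothetical, and fast-access markers such as `(t)` are not handled. The no-pipe case follows Org mode's documented convention that the last keyword is the done state; the changelog entry itself does not cover it.

```python
def parse_todo_sequence(line):
    """Split an Org '#+TODO:' line into action states and done states.

    Hypothetical helper for illustration only.
    """
    prefix = "#+TODO:"
    if not line.startswith(prefix):
        raise ValueError("not a #+TODO line")
    segments = [segment.split() for segment in line[len(prefix):].split("|")]
    if len(segments) == 1:
        # No pipe: by Org convention the last keyword is the done state.
        keywords = segments[0]
        return keywords[:-1], keywords[-1:]
    # The first pipe separates action states from done states;
    # any additional pipes are simply ignored.
    action = segments[0]
    done = [keyword for segment in segments[1:] for keyword in segment]
    return action, done


print(parse_todo_sequence("#+TODO: UNFINISHED | DONE | FINISHED"))
```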
-
LaTeX reader:
- Handle `filecontents` environment (#7003).
- Put contents of unknown environments in a Div when `raw_tex` is not enabled (#6997). (When `raw_tex` is enabled, the whole environment is parsed as a raw block.) The class name is the name of the environment. Previously, we just included the contents without the surrounding Div, but having a record of the environment’s boundaries and name can be useful.
-
Mediawiki reader:
- Allow space around strong/emph delimiters (#6993).
-
New module Text.Pandoc.Writers.BibTeX, exporting writeBibTeX and writeBibLaTeX. [API change]
-
LaTeX writer:
- Revert table line height increase in 2.11.3 (#6996). In 2.11.3 we started adding `\addlinespace`, which produced less dense tables. This wasn’t an intentional change; I misunderstood a comment in the discussion leading up to the change. This commit restores the earlier default table appearance. Note that if you want a less dense table, you can use something like `\def\arraystretch{1.5}` in your header.
-
EPUB writer:
- Adjust internal links to identifiers defined in raw HTML sections after splitting into chapters (#7000).
- Recognize `Format "html4"`, `Format "html5"` as raw HTML.
- Adjust internal links to images, links, and tables after splitting into chapters. Previously we only did this for Div and Span and Header elements (see #7000).
-
Ms writer:
- Don’t justify text inside table cells.
-
JATS writer:
-
Markdown writer:
- Cleaned up raw formats. We now react appropriately to `gfm`, `commonmark`, and `commonmark_x` as raw formats.
-
RST writer:
- Fix bug with dropped content from inside spans with a class in some cases (#7039).
-
Docx writer:
- Handle table header using styles (#7008). Instead of hard-coding the border and header cell vertical alignment, we now let this be determined by the Table style, making use of Word’s “conditional formatting” for the table’s first row. For headerless tables, we use the tblLook element to tell Word not to apply conditional first-row formatting.
-
Commonmark writer:
- Implement start number on ordered lists (#7009). Previously they always started at 1, but according to the spec the start number is respected.
-
HTML writer:
- Fix implicit_figure at end of footnotes (#7006).
-
ConTeXt template: Remove `\setupthinrules` from default template. The width parameter this used is not actually supported, and the command didn’t do anything.
-
Text.Pandoc.Extensions:
- Add `Ext_element_citations` constructor (Albert Krewinkel).
-
Text.Pandoc.Citeproc.BibTeX: New unexported function `writeBibtexString`.
-
Text.Pandoc.Citeproc:
- Use finer grained imports (Albert Krewinkel).
- Factor out and export `getStyle` [API change].
- Export `getReferences` [API change, #7106].
- Factor out getLang.
-
Text.Pandoc.Parsing: modify `gridTableWith'` for headerless tables. If the table lacks a header, the header row should be an empty list. Previously we got a list of empty cells, which caused an empty header to be emitted instead of no header. In LaTeX/PDF output that meant we got a double top line with space between.
-
ImageSize: use `viewBox` for SVG if no length, width attributes (#7045). This change allows pandoc to extract size information from more SVGs.
-
Add simple default.nix.
-
Use commonmark 0.1.1.3.
-
Use citeproc 0.3.0.5.
-
Update default CSL to use latest chicago-author-date.csl.
-
CONTRIBUTING.md: add note on GNU xargs.
-
MANUAL.txt:
- Update description of `-L`/`--lua-filter`.
- Document use of citations in note styles (#6828).
pandoc 2.11.3.2
-
HTML reader: use `renderTags'` from Text.Pandoc.Shared (Albert Krewinkel). A side effect of this change is that empty `<col>` elements are written as self-closing tags in raw HTML blocks.
-
Asciidoc writer: Add support for writing nested tables (#6972, timo-a). Asciidoc supports one level of nesting. If deeper tables are to be written, they are omitted and a warning is issued.
-
Docx writer: fix nested tables with captions (#6983). Previously we got unreadable content, because docx seems to want a `<w:p>` element (even an empty one) at the end of every table cell.
-
Powerpoint writer: allow arbitrary OOXML in raw inline elements (Albert Krewinkel). The raw text is now included verbatim in the output. Previously it was parsed into XML elements, which prevented the inclusion of partial XML snippets.
-
LaTeX writer: support colspans and rowspans in tables (#6950, Albert Krewinkel). Note that the multirow package is needed for rowspans. It is included in the latex template under a variable, so that it won’t be used unless needed for a table.
-
HTML writer: don’t include p tags in CSL bibliography entries (#6966). Fixes a regression in 2.11.3.
-
Add `meta-description` variable to HTML templates (#6982). This is populated by the writer by stringifying the `description` field of metadata (Jerry Sky). The `description` meta tag will make the generated HTML documents more complete and SEO-friendly.
-
Citeproc: fix handling of empty URL variables (`DOI`, etc.). The `linkifyVariables` function was changing these to links which then got treated as non-empty by citeproc, leading to wrong results (e.g. ignoring nonempty URL when empty DOI is present). See jgm/citeproc#41.
-
Use citeproc 0.3.0.3. Fixes an issue in author-only citations when both an author and translator are present, and an issue with citation group delimiters.
-
Require texmath 0.12.1. This improves siunitx support in math, fixes bugs with `\*mod` family operators and arrays, and avoids italicizing symbols and operator names in docx output.
-
Ensure that the perl interpreter is used for filters with `.pl` extension (wuffi).
-
MANUAL: note that textarea content is never parsed as Markdown (Albert Krewinkel).
pandoc 2.11.3.1
-
Added some missing files to extra-source-files and data files, so they are included in the sdist tarball. Closes #6961. Cleaned up some extraneous data and test files, and added a CI check to ensure that the test and data files included in the sdist match what is in the git repository.
-
Use citeproc 0.3.0.1, which avoids removing nonbreaking space at the end of the `initialize-with` attribute. (Some journals require nonbreaking space after initials, and this makes that possible.)
pandoc 2.11.3
-
With `--bibliography` (or `bibliography` in metadata), a URL may now be provided, and pandoc will fetch the resource. In addition, if a file path is provided and it is not found relative to the working directory, the resource path will be searched (#6940).
-
Add `sourcepos` extension for `commonmark`, `gfm`, `commonmark_x` (#4565). With the `sourcepos` extension set, `data-pos` attributes are added to the AST by the commonmark reader. No other readers are affected. The `data-pos` attributes are put on elements that accept attributes; for other elements, an enclosing Div or Span is added to hold the attributes.
-
Change extensions for `commonmark_x`: replace `auto_identifiers` with `gfm_auto_identifiers` (#6863). `commonmark_x` never actually supported `auto_identifiers` (it didn’t do anything), because the underlying library implements gfm-style identifiers only. Attempts to add the `auto_identifiers` extension to `commonmark` will now fail with an error.
-
HTML reader:
- Split module into several submodules (Albert Krewinkel). Reducing module size should reduce memory use during compilation.
- Support advanced table features (Albert Krewinkel): block level content in captions, row and colspans, body headers, row head columns, footers, attributes.
- Disable round-trip testing for tables. Information for cell alignment in a column is not preserved during round-trips (Albert Krewinkel).
- Allow finer grained options for tag omission (Albert Krewinkel).
- Simplify list attribute handling (Albert Krewinkel).
- Pay attention to `lang` attributes on body element (#6938). These (as well as `lang` attributes on the html element) should update lang in metadata.
- Retain attribute prefixes and avoid duplicates (#6938). Previously we stripped attribute prefixes, reading `xml:lang` as `lang` for example. This resulted in two duplicate `lang` attributes when `xml:lang` and `lang` were both used. This commit causes the prefixes to be retained, and also avoids invalid duplicate attributes.
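The effect of retaining prefixes can be illustrated with a small Python sketch. This is not pandoc's Haskell code: the function names are hypothetical, and the "first occurrence wins" rule for repeated names is an assumption made for the example.

```python
def strip_prefix(name):
    """Old behaviour, shown for contrast: 'xml:lang' becomes 'lang'."""
    return name.split(":", 1)[-1]


def read_attributes(raw_attrs):
    """Keep attribute names as written (prefix included) and drop repeats.

    raw_attrs is a list of (name, value) pairs in document order.
    Keeping the first occurrence of a repeated name is an assumption.
    """
    seen = set()
    result = []
    for name, value in raw_attrs:
        if name in seen:
            continue  # skip an exact duplicate attribute name
        seen.add(name)
        result.append((name, value))
    return result


attrs = [("xml:lang", "en"), ("lang", "en")]
old_names = [strip_prefix(name) for name, _ in attrs]
new_names = [name for name, _ in read_attributes(attrs)]
```

With prefixes stripped, `old_names` contains `lang` twice; with prefixes retained, `new_names` holds two distinct names.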
-
Commonmark reader:
- Refactor `specFor`.
- Set input name to `""` to avoid clutter in sourcepos output.
-
Org reader:
- Parse `#+LANGUAGE` into `lang` metadata field (#6845, Albert Krewinkel).
- Preserve targets of spurious links (#6916, Albert Krewinkel). Links with (internal) targets that the reader doesn’t know about are converted into emphasized text. Information on the link target is now preserved by wrapping the text in a Span of class `spurious-link`, with an attribute `target` set to the link’s original target. This makes it possible to recover and fix broken or unknown links with filters.
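A filter that recovers such links could look roughly like the following Python sketch, which walks pandoc's JSON AST directly. The `resolve` callback and the `#` + target rule in the example are hypothetical; a real filter would supply its own mapping from targets to URLs.

```python
def fix_spurious_links(node, resolve):
    """Replace Spans of class 'spurious-link' with Links.

    node is a piece of pandoc's JSON AST (nested dicts and lists).
    resolve is a hypothetical callback mapping the original target to a
    URL, or to None when the link should be left as it is.
    """
    if isinstance(node, list):
        return [fix_spurious_links(child, resolve) for child in node]
    if isinstance(node, dict):
        if node.get("t") == "Span":
            (ident, classes, kvs), inlines = node["c"]
            target = dict(kvs).get("target")
            if "spurious-link" in classes and target is not None:
                url = resolve(target)
                if url is not None:
                    # Link content: [attr, inlines, [url, title]]
                    return {
                        "t": "Link",
                        "c": [["", [], []],
                              fix_spurious_links(inlines, resolve),
                              [url, ""]],
                    }
        return {key: fix_spurious_links(value, resolve)
                for key, value in node.items()}
    return node


span = {"t": "Span",
        "c": [["", ["spurious-link"], [["target", "intro"]]],
              [{"t": "Emph", "c": [{"t": "Str", "c": "Intro"}]}]]}
para = {"t": "Para", "c": [span]}
fixed = fix_spurious_links(para, lambda target: "#" + target)
```

Spans without the class, or whose target cannot be resolved, pass through unchanged.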
-
DocBook reader:
- Table text width support (#6791, Nils Carlson). Table width in relation to text width is not natively supported by docbook but is by the docbook `fo` stylesheets through an XML processing instruction, `<?dbfo table-width="50%"?>`.
-
LaTeX reader:
- Improve parsing of command options (#6869, #6873). In cases where we run into trouble parsing inlines until the closing `]`, e.g. quotes, we return a plain string with the option contents. Previously we mistakenly included the brackets in this string.
- Preserve center environment (#6852, Igor Pashev). The contents of the `center` environment are put in a `Div` with class `center`.
- Don’t parse `\rule` with width 0 as horizontal rule. These are sometimes used as spacers in LaTeX.
- Don’t apply theorem default styling to a figure inside (#6925). If we put an image in italics, then when rendering to Markdown we no longer get an implicit figure.
-
Dokuwiki reader:
- Handle unknown interwiki links better (#6932). DokuWiki lets the user define his own Interwiki links. Previously pandoc reacted to these by emitting a google search link, which is not helpful. Instead, we now just emit the full URL including the wikilink prefix, e.g. `faquk>FAQ-mathml`. This at least gives users the ability to modify the links using filters.
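As a rough illustration of what such a filter might do with the emitted target, the Python sketch below expands a `prefix>name` target using a user-supplied table. The table contents and the `example.org` URL are hypothetical; only the `faquk>FAQ-mathml` form comes from the note above.

```python
# Hypothetical user-defined interwiki table: prefix -> URL template.
INTERWIKI = {"faquk": "https://example.org/faq/{name}"}


def expand_interwiki(target, table=INTERWIKI):
    """Expand 'prefix>name' into a full URL when the prefix is known.

    Targets without a '>' separator, or with an unknown prefix, are
    returned unchanged.
    """
    prefix, sep, name = target.partition(">")
    if not sep or prefix not in table:
        return target
    return table[prefix].format(name=name)


expanded = expand_interwiki("faquk>FAQ-mathml")
```

A filter would apply `expand_interwiki` to the URL of each link and leave ordinary URLs alone.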
-
Markdown writer:
-
RST writer:
- Better image handling (#6948). An image alone in its paragraph (but not a figure) is now rendered as an independent image, with an `alt` attribute if a description is supplied. An inline image that is not alone in its paragraph will be rendered, as before, using a substitution. Such an image cannot have a “center”, “left”, or “right” alignment, so the classes `align-center`, `align-left`, or `align-right` are ignored. However, `align-top`, `align-middle`, `align-bottom` will generate a corresponding `align` attribute.
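The class handling for inline (substitution) images described above can be sketched in Python as follows. This models only the rule stated in the note, not the writer's actual code; the function name is hypothetical, and taking the first matching class is an assumption.

```python
# Alignments that an inline (substitution) image can carry.
INLINE_ALIGNMENTS = ("top", "middle", "bottom")


def inline_image_align(classes):
    """Return the value for the align attribute, or None if there is none.

    align-center, align-left and align-right are ignored for inline
    images; the first supported align-* class is used (an assumption).
    """
    prefix = "align-"
    for cls in classes:
        if cls.startswith(prefix) and cls[len(prefix):] in INLINE_ALIGNMENTS:
            return cls[len(prefix):]
    return None


ignored = inline_image_align(["align-center"])
kept = inline_image_align(["thumb", "align-middle"])
```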
-
Docx writer:
- Keep raw openxml strings verbatim (#6933, Albert Krewinkel).
- Use Content instead of Element. This allows us to inject raw OpenXML into the document without reparsing it into an Element, which is necessary if you want to inject an open tag or close tag.
- Fix bullets/lists indentation, so that the first level is slightly indented to the right instead of right on the margin (cholonam).
- Support bold and italic in “complex script” (#6911). Previously bold and italics didn’t work properly in RTL text. This commit causes the w:bCs and w:iCs attributes to be used, in addition to w:b and w:i, for bold and italics respectively.
-
ICML writer:
- Fix image bounding box for custom widths/heights (Mauro Bieg, #6936).
-
LaTeX writer:
- Improve table spacing (#6842, #6860). Remove the `\strut` that was added at the end of minipage environments in cells. Replace `\tabularnewline` with `\\ \addlinespace`.
- Improve calculation of column spacing (#6883).
- Extract table handling into separate module (Albert Krewinkel).
- Fix bug with nested `csl-` display Spans (#6921).
- Improve longtable output (#6883). Don’t create minipages for regular paragraphs. Put width and alignment information in the longtable column descriptors.
-
OpenDocument writer:
- Support for table width as a percentage of text width (#6792, Nils Carlson).
- Implement Div and Span ident support (#6755, Nils Carlson). Spans and Divs containing an ident in the Attr will become bookmarks or sections with idents in OpenDocument format.
- Add two extensions, `xrefs_name` and `xrefs_number` (#6774, Nils Carlson). Links to headings, figures and tables inside the document are substituted with cross-references that will use the name or caption of the referenced item for `xrefs_name` or the number for `xrefs_number`. For the `xrefs_number` to be useful, heading numbers must be enabled in the generated document and table and figure captions must be enabled using for example the `native_numbering` extension. In order for numbers and reference text to be updated, the generated document must be refreshed.
-
JATS writer:
- Support advanced table features (Albert Krewinkel).
- Support author affiliations (#6687, Albert Krewinkel).
-
Docbook writer:
- Use correct id attribute consistently (Jan Tojnar). DocBook5 should always use `xml:id` instead of `id`.
- Handle admonition titles better (Jan Tojnar). Docbook reader produces a `Div` with `title` class for `<title>` element within an “admonition” element. Markdown writer then turns this into a fenced div with `title` class attribute. Since fenced divs are block elements, their content is recognized as a paragraph by the Markdown reader. This is an issue for Docbook writer because it would produce an invalid DocBook document from such AST – the `<title>` element can only contain “inline” elements. Handle this special case separately by unwrapping the paragraph before creating the `<title>` element.
- Add XML namespaces to top-level elements (#6923, Jan Tojnar). Previously, we only added `xmlns` attributes to chapter elements, even when running with `--top-level-division=section`. These namespaces are now added to part and section elements too, when they are the selected top-level divisions. We do not need to add namespaces to documents produced with `--standalone` flag, since those will already have xmlns attribute on the root element in the template.
-
HTML writer:
- Fix handling of nested `csl-` display spans (#6921). Previously inner Spans used to represent CSL display attributes were not rendered as div tags as intended.
-
EPUB writer:
-
EPUB templates: use `preserveAspectRatio="xMidYMid"` for cover image (#6895, Shin Sang-jae). This change affects both the epub2 and the epub3 templates. It avoids distortion of the cover image by requiring that the aspect ratio be preserved.
-
LaTeX template:
- Include `csquotes` package if `csquotes` variable set.
- Put back `amssymb`. We need it for checkboxes in todo lists, and maybe for other things. In this location it seems compatible with the cases that prompted #6469 and PR #6762.
- Disable language-specific shorthands in babel (#6817, #6887). Babel defines “shorthands” for some languages, and these can produce unexpected results. For example, in Spanish, `1.22` gets rendered as `122`, and `et~al.` as `etal`. One would think that babel’s `shorthands=off` option (which we were using) would disable these, but it doesn’t. So we remove `shorthands=off` and add some code that redefines the shorthands macro. Eventually this will be fixed in babel, I hope, and we can revert to something simpler.
-
JATS template: allow array of persistent institute ...
pandoc 2.11.2
-
Default to using ATX (`##`-style) headings for Markdown output (#6662, Aner Lucero). Previously we used Setext (underlined) headings by default for levels 1–2.
-
Add option `--markdown-headings=atx|setext`, and deprecate `--atx-headers` (#6662, Aner Lucero).
-
Support `markdown-headings` in defaults files.
-
Fix corner case in YAML metadata parsing (#6823). Previously YAML metadata would sometimes not get recognized if a field ended with a newline followed by spaces.
-
`--self-contained`: increase coverage (#6854). Previously we only self-contained attributes for certain tag names (`img`, `embed`, `video`, `input`, `audio`, `source`, `track`, `section`). Now we self-contain any occurrence of `src`, `data-src`, `poster`, or `data-background-image`, on any tag; and also `href` on `link` tags.
-
Markdown reader:
- Fix detection of locators following in-text citations. Previously, if we had `@foo [p. 33; @bar]`, the `p. 33` would be incorrectly parsed as a prefix of `@bar` rather than a suffix of `@foo`.
- Improve period suppression algorithm for citations in notes in note citation styles (#6835).
- Don’t increment `stateNoteNumber` for example list references. This helps with #6836 (a bug in which example list references disturb calculation of citation note number and affect when `ibid` is triggered).
-
LaTeX reader:
- Move `getNextNumber` from Readers.LaTeX to Readers.LaTeX.Parsing.
- Fix negative numbers in siunitx commands. A change in pandoc 2.11 broke negative numbers, e.g. `\SI{-33}{\celsius}` or `\num{-3}`. This fixes the regression.
-
DocBook reader: drop period in formalpara title and put it in a div with class `formalpara-title`, so that people can reformat with filters (#6562).
-
Man reader: improve handling of `.IP` (#6858). We now better handle `.IP` when it is used with non-bullet, non-numbered lists, creating a definition list. We also skip blank lines like groff itself.
-
Bibtex reader: fall back on `en-US` if locale for LANG not found. This reproduces earlier pandoc-citeproc behavior (jgm/citeproc#26).
-
JATS writer:
- Wrap all tables (Albert Krewinkel). All `<table>` elements are put inside `<table-wrap>` elements, as the former are not valid as immediate child elements of `<body>`.
- Move Table handling to separate module (Albert Krewinkel). Adds two new unexported modules: Text.Pandoc.Writers.JATS.Types, Text.Pandoc.Writers.JATS.Table.
-
Org writer:
- Replace org `#+KEYWORDS` with `#+keywords` (TEC). As of ~2 years ago, lower case keywords became the standard (though they are handled case-insensitively, as always).
- Update org supported languages and identifiers according to the current list contained in https://orgmode.org/worg/org-contrib/babel/languages/index.html (TEC).
-
Only use `filterIpynbOutput` if input format is ipynb (#6841). Before this change content could go missing from divs with class `output`, even when non-ipynb was being converted.
-
When checking reader/writer name, check base name now that we permit extensions on formats other than markdown.
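To make the idea of a "base name" concrete, here is a minimal Python sketch that splits a format specification such as `gfm+smart-raw_html` into its base name and extension modifiers. It is an illustration rather than pandoc's parser: the function name is hypothetical, and it assumes format and extension names contain no `+` or `-` characters.

```python
import re


def split_format_spec(spec):
    """Split 'base+ext1-ext2' into (base, enabled, disabled).

    Assumes that neither the base name nor extension names contain
    '+' or '-'; this holds for names like 'gfm' or 'raw_html'.
    """
    base = re.match(r"[^+-]*", spec).group(0)
    modifiers = re.findall(r"([+-])([^+-]+)", spec[len(base):])
    enabled = [name for sign, name in modifiers if sign == "+"]
    disabled = [name for sign, name in modifiers if sign == "-"]
    return base, enabled, disabled


base, enabled, disabled = split_format_spec("gfm+smart-raw_html")
```

Checking `base` rather than the whole string against the list of known formats is what lets a specification with extensions be accepted for formats other than markdown.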
-
Text.Pandoc.PDF: Fix `changePathSeparators` for Windows (#6173). Previously a path beginning with a drive, like `C:\foo\bar`, was translated to `C:\/foo/bar`, which caused problems. With this fix, the backslashes are removed.
-
Text.Pandoc.Logging: Add `ATXHeadingInLHS` constructor to `LogMessage` [API change].
-
Fix error that is given when people specify `doc` output (#6834, gison93).
-
LaTeX template: add a `\break` after parbox in `CSLRightInline`. This should fix spacing problems between entries with numeric styles. Also fix number of params on `CSLReferences`.
-
reveal.js template: Put quotes around `controlsLayout`, `controlsBackArrows`, and `display`, since these require strings. Add `showSlideNumber`, `hashOneBasedIndex`, `pause`.
-
Use citeproc 0.2. This fixes a bug with title case around parentheses.
-
pandoc.cabal: remove ‘static’ flag. This isn’t really necessary and can be misleading (e.g. on macOS, where a fully static build isn’t possible). cabal’s new option `--enable-executable-static` does the same. On stack you can add something like this to the options for your executable in package.yaml: `ld-options: -static -pthread`
-
Remove obsolete bibutils flag setting in `linux/make_artifacts.sh`.
-
Manual:
- Correct `link-citation` -> `link-citations`.
- Add a sentence about `pagetitle` for HTML (#6843, Alex Toldaiev).
-
INSTALL.md: Remove references to `pandoc-citeproc` (#6857).
-
CONTRIBUTING: describe hlint and how it’s used (#6840, Albert Krewinkel).
pandoc 2.11.1.1
-
Citeproc: improve punctuation in in-text note citations (#6813). Previously in-text note citations inside a footnote would sometimes have the final period stripped, even if it was needed (e.g. on the end of ‘ibid’).
-
Use citeproc 0.1.1.1. This improves the decision about when to use `ibid` in cases where citations are used inside a footnote (#6813).
-
Support `nocase` spans for `csljson` output.
-
Require latest commonmark, commonmark-extensions. This fixes a bug with `autolink_bare_uris` and commonmark.
-
LaTeX reader: better handling of `\\` inside math in table cells (#6811).
-
DokuWiki writer: translate language names for code elements and improve whitespace (#6807).
-
MediaWiki writer: use `syntaxhighlight` tag instead of deprecated `source` for highlighted code (#6810). Also support `startFrom` attribute and `numberLines`.
-
Lint code in PRs and when committing to master (#6790, Albert Krewinkel).
-
doc/filters.md: describe technical details of filter invocations (#6815, Albert Krewinkel).