Does sanitize-html convert "messy" HTML (which you find across the web) into standard HTML? #641
Years ago I read that the Chrome browser could parse any HTML, even when you had self-closing tags on non-self-closable elements like Does This way, when converting HTML to PDF with pandoc (it successfully creates a malformed PDF with malformed HTML), I could first "sanitize" the HTML, to get standards-compliant HTML, and then render that to PDF. That would fix that problem, amongst other things. If not, what does
Replies: 1 comment
sanitize-html uses htmlparser2 to parse the document, so you're going to get htmlparser2's interpretation of the document, for good or ill (generally for good in my experience, I'm not casting shade).
I think that answers the question, but if you have further questions about it you can check into the htmlparser2 documentation.
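If it helps, here is a rough sketch of how you could check the behavior on your own input. The `messy` string is a made-up example, not taken from your documents, and the default import assumes `"esModuleInterop": true` in your `tsconfig.json`. Treat the output as htmlparser2's reading of the markup, which is not guaranteed to match what a browser would build from the same input.

```typescript
// Requires: npm install sanitize-html (plus @types/sanitize-html for type checking).
// Assumption: "esModuleInterop": true is set in tsconfig.json for the default import.
import sanitizeHtml from "sanitize-html";

// Hypothetical messy input: a self-closing <div/>, an unclosed <b> and <p>,
// and a <script> element that the default options do not allow.
const messy: string = '<div/><p>Hello <b>world<script>alert(1)</script>';

// With default options, sanitize-html keeps only its default allowed tags and
// re-serializes the structure that htmlparser2 reported while parsing.
const cleaned: string = sanitizeHtml(messy);

// Inspect the result to see how the parser interpreted the input.
console.log(cleaned);
```

Running this against a few of your real problem pages, and comparing the result with what pandoc accepts, is probably the quickest way to see whether the output is clean enough for your PDF step.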