A simple CLI wrapper around @mozilla/readability
and (optionally) DOMPurify. It reads an HTML document and writes the extracted article as JSON.
Made with ❤️ by spaceduck.com
Install:
$ npm install -g @spaceduckapp/readability-cli
Basic:
$ readability-cli path/to/input.html --out path/to/output.json
Using stdin / stdout:
$ cat path/to/input.html | readability-cli - > path/to/output.json
Show help:
$ readability-cli --help
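Batch conversion (a sketch, not part of the tool): the basic invocation above can be wrapped in a small shell function that converts every .html file in a directory. The function name convert_all and the READABILITY_BIN override are hypothetical names introduced for this example, and the error handling assumes the CLI exits non-zero on failure.

```shell
#!/bin/sh
# Sketch: convert every .html file in a directory to a .json file beside it.
# convert_all and READABILITY_BIN are hypothetical names for this example;
# only the `readability-cli <input> --out <output>` form comes from the usage above.
convert_all() {
  dir=$1
  bin=${READABILITY_BIN:-readability-cli}
  for input in "$dir"/*.html; do
    # Skip the literal pattern when the directory has no .html files.
    [ -e "$input" ] || continue
    output="${input%.html}.json"
    # Assumption: the CLI exits non-zero when it cannot process a file.
    "$bin" "$input" --out "$output" || echo "failed: $input" >&2
  done
}
```

After sourcing the function, run it as: convert_all path/to/pages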
Example (sanitized output, pretty-printed with jq):
$ curl --silent 'https://example.com/' | readability-cli - --sanitize | jq
{
"title": "Example Domain",
"byline": null,
"dir": null,
"lang": null,
"content": "<div id=\"readability-page-1\" class=\"page\"><div>\n \n <p>This domain is for use in illustrative examples in documents. You may use this\n domain in literature without prior coordination or asking for permission.</p>\n <p><a href=\"https://www.iana.org/domains/example\">More information...</a></p>\n</div></div>",
"textContent": "\n \n This domain is for use in illustrative examples in documents. You may use this\n domain in literature without prior coordination or asking for permission.\n More information...\n",
"length": 191,
"excerpt": "This domain is for use in illustrative examples in documents. You may use this\n domain in literature without prior coordination or asking for permission.",
"siteName": null,
"publishedTime": null
}