Usage

TARGET=~/src/getambassador.io PRODUCT=getambassadorio make -C ~/src/blc2 > blc.log

or

TARGET=~/src/telepresence.io PRODUCT=telepresenceio make -C ~/src/blc2 > blc.log

or

TARGET=~/src/other-thing.tld make -C ~/src/blc2 > blc.log

or

TARGET=~/src/getambassador.io PRODUCT=getambassadorio USER_AGENT="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36" make -C ~/src/blc2 > blc.log

Then tail blc.log for a summary, or grep ^Page blc.log for a list of pages with broken links.
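
If you would rather post-process the log from a script, the grep step can be approximated in Python. This is a minimal sketch that assumes only what is stated above (lines for pages with broken links start with "Page"); the helper names are hypothetical and not part of blc2.

```python
from typing import Iterable, List


def pages_with_broken_links(log_lines: Iterable[str]) -> List[str]:
    """Return the log lines that start with "Page" (same filter as `grep ^Page`)."""
    return [line.rstrip("\n") for line in log_lines if line.startswith("Page")]


def read_broken_pages(log_path: str = "blc.log") -> List[str]:
    """Apply the filter to a log file; the default path matches the commands above."""
    with open(log_path, encoding="utf-8") as log:
        return pages_with_broken_links(log)
```

Call read_broken_pages() from the directory that contains blc.log to get the same lines that grep would print.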

Settings

TARGET (no default; required to be set):
- It looks at HTML files in the ${TARGET}/public directory
- It obeys redirects in the ${TARGET}/netlify.toml file (if one exists)
PRODUCT (default=generic):
- Per-product link-check settings are defined in ${PRODUCT}_blc.py files.
USER_AGENT (default: github.com/datawire/getambassador.io-blc2; not required to be set):
- Specifies the User-Agent header value sent with each request. Setting a browser-like value helps avoid being blocked by the security filters of external sites.
PAGES_TO_CHECK (not required to be set):
- Specifies the
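
The required and default values listed above can be summarized in a short Python sketch. This is only an illustration of the documented behavior, not how the Makefile or the checker actually reads its settings; the Settings type and load_settings function are hypothetical names. PAGES_TO_CHECK is left out because its meaning is not described here.

```python
import os
from typing import Mapping, NamedTuple


class Settings(NamedTuple):
    target: str
    product: str
    user_agent: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Resolve TARGET, PRODUCT and USER_AGENT using the documented defaults."""
    target = env.get("TARGET")
    if not target:
        # TARGET has no default and must be set.
        raise ValueError("TARGET must be set")
    return Settings(
        target=target,
        product=env.get("PRODUCT", "generic"),
        user_agent=env.get("USER_AGENT", "github.com/datawire/getambassador.io-blc2"),
    )
```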

Why

Why this is better than other broken link checkers (at least better than https://github.com/stevenvachon/broken-link-checker):

It does a better job of implementing the low-level fundamentals:
- It uses caching to avoid fetching the same resource twice.
- It understands HTTP 429 / Retry-After to back off and try again later, and does this without blocking other pages from being checked.
- It checks that the #fragment exists in the linked page.
- It checks more than just HTML:
  - It understands many link types in HTML
  - It understands sourcemap v3 links in JavaScript
  - It understands Babel/WebPack(?) /*! For license information please see … */ links in JavaScript.
  - It understands url(…) references in CSS.
  - That said, it could do even better; search for "TODO" in blclib/checker.py.
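
To make two of the points above concrete, here is a minimal Python sketch of (a) turning a Retry-After header value into a delay and (b) checking that a #fragment has a matching anchor in an HTML page. It is not the code in blclib/checker.py; the function names are hypothetical, and the fragment check is simplified to id attributes and <a name="…"> anchors.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from html.parser import HTMLParser
from typing import Optional, Set
from urllib.parse import unquote


def retry_after_seconds(value: str, now: Optional[datetime] = None) -> Optional[float]:
    """Parse a Retry-After value (delay in seconds or an HTTP date) into seconds."""
    value = value.strip()
    if re.fullmatch(r"[0-9]+", value):
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # not a usable value; the caller picks its own back-off
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


class _AnchorCollector(HTMLParser):
    """Collect every id attribute and every <a name="..."> value."""

    def __init__(self) -> None:
        super().__init__()
        self.anchors: Set[str] = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value and (name == "id" or (tag == "a" and name == "name")):
                self.anchors.add(value)


def fragment_exists(html_text: str, fragment: str) -> bool:
    """Return True if the fragment is empty or names an anchor in the page."""
    if not fragment:
        return True
    collector = _AnchorCollector()
    collector.feed(html_text)
    collector.close()
    return unquote(fragment) in collector.anchors
```

A checker can use the delay to reschedule only the affected request, which is what lets other pages continue to be checked in the meantime.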
It does a better job of handling practical concerns:
- Given a directory of a static website, it can identify pages that are not linked to by any other page, flagging that as a problem.
- For Netlify sites, it understands Netlify's special files, so it can quickly run against a local directory of static files, rather than having to actually deploy to Netlify and check the pages with slow over-the-network requests.
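
The unlinked-page check can be pictured as a set difference over the site's link graph. The sketch below is an illustration under that assumption, not blc2's implementation; find_orphans and its roots parameter are hypothetical, and how blc2 treats entry pages such as the site index is not stated here.

```python
from typing import Dict, Iterable, Set


def find_orphans(links: Dict[str, Set[str]], roots: Iterable[str] = ()) -> Set[str]:
    """Return pages that no other page links to.

    links maps each page to the set of pages it links to.
    roots lists pages (for example the site index) that are never reported.
    """
    linked: Set[str] = set()
    for page, targets in links.items():
        # A page linking to itself does not count as being linked to.
        linked |= {target for target in targets if target != page}
    return set(links) - linked - set(roots)
```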
It does a better job of letting you address your own high-level business-logic needs:
- It is easy to extend with your own business logic and site-specific checks, such as
  - Extra validation on the HTML, like "pages must have canonical links" (example: generic_blc.py:handle_html_extra)
  - Extra checks on links to detect links that are semantically broken even if they're not technically broken (links that are "ugly") (example: getambassadorio_blc.py:product_ugly_check).
  - Special handling for sites that implement fragments via JavaScript (cough, GitHub, cough) (example: generic_blc.py:handle_link).
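
As an illustration of the kind of rule "pages must have canonical links" describes, here is a standalone Python sketch. It does not use the real hook signature of generic_blc.py:handle_html_extra, which is not shown in this README; canonical_links is a hypothetical helper.

```python
from html.parser import HTMLParser
from typing import List


class _CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> element."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel is a space-separated token list and is case-insensitive.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.hrefs.append(attr["href"])


def canonical_links(html_text: str) -> List[str]:
    """Return all canonical link targets declared in the page."""
    finder = _CanonicalFinder()
    finder.feed(html_text)
    finder.close()
    return finder.hrefs
```

A site-specific check could then report any page for which canonical_links() returns an empty list.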

In short:

- It is faster
- It checks links more thoroughly
- It gives you tools to address false positives
- It lets you add your own semantic checks

Dependencies

- GNU Make
- date
- python3
- node >=15.0.0
- yarn ^v1.3.2

Manual checking

This is a list of links that need to be checked manually.

The Comparably links were added because of Cloudflare's anti-bot page. For further information, read this article.
