There are many ways to set up continuous integration for your Python project. This is my personal flavour of doing things. Feel free to pick and choose the parts that you like.
This README includes some justification and references for the choices made in this setup.
pre-commit is an awesome framework which many Python projects use. It allows you to select 'hooks' for various formatters and linters you want to use.
Run pre-commit install after setting up your local environment to enable pre-commit to run all hooks whenever you do a git commit. The commit will be cancelled if not all hooks run successfully. To commit anyway, run git commit with --no-verify.
The following hooks have been selected for this CI setup:
- ruff: An extremely fast Python linter and formatter. Includes lints and formatting popularized by various other tools like black, flake8 and pyupgrade, all in one tool. Replaces all linting and autoformatting tools except for mypy. Install the VSCode or PyCharm extension for the best developer experience. Adjust settings in the pyproject.toml as desired.
- pre-commit-hooks: Some auto-formatting for non-Python files. Includes a formatter for JSON, a common format for config files. Remove the hook if you have no use for it.
- language-formatters: Formatters for TOML and YAML. Useful for keeping your pyproject.toml and your GitHub Actions workflows clean.
- mdformat: Almost all projects will include some documentation in Markdown format. This hook makes sure these files are formatted consistently.
- typos: A source code spell checker. While it does produce some false positives, it can be helpful. Address false positives by adding ignore patterns to the typos section of pyproject.toml.
- mypy: mypy is a static type checker for Python. One of the best things you can do for your code base is add type hints and be consistent with them. In this repo, mypy is configured with all strictness options enabled. Note that for mypy to work correctly as a pre-commit hook, you must define your main dependencies as additional_dependencies in the pre-commit hook. If you have many dependencies, it may be better to remove the mypy pre-commit hook and run mypy alongside your tests.
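To give a feel for what strict mypy settings ask of your code, here is a minimal sketch. The functions `mean` and `describe` are hypothetical and not part of this repo. Under strict settings every function needs full annotations, and a possible `None` return has to be handled explicitly before the value is used.

```python
from __future__ import annotations

from collections.abc import Sequence


def mean(values: Sequence[float]) -> float | None:
    """Return the arithmetic mean, or None for an empty sequence."""
    if not values:
        return None
    return sum(values) / len(values)


def describe(values: Sequence[float]) -> str:
    """Format the mean for display, handling the empty case explicitly."""
    result = mean(values)
    # A strict type checker rejects formatting `result` as a float
    # until the None case has been ruled out.
    if result is None:
        return "no data"
    return f"mean={result:.2f}"
```

Leaving out the `None` check, or dropping any of the annotations, is the kind of thing a strict configuration is meant to flag.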
pytest is without question the best Python testing framework out there. Tests written in this framework are much more readable than when using Python's built-in unittest framework.
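As a rough illustration of the readability difference, the sketch below tests the same hypothetical helper, `slugify` (not part of this repo), in both styles. pytest collects plain functions whose names start with `test_` and works with bare `assert` statements, while unittest needs a `TestCase` subclass and assertion methods.

```python
import unittest


def slugify(title: str) -> str:
    """Lower-case a title and join its words with hyphens."""
    return "-".join(title.lower().split())


# pytest style: plain functions and bare asserts.
def test_slugify_joins_words_with_hyphens() -> None:
    assert slugify("Hello World") == "hello-world"


def test_slugify_collapses_whitespace() -> None:
    assert slugify("  Many   spaces ") == "many-spaces"


# unittest style: the first check again, wrapped in a TestCase.
class SlugifyTest(unittest.TestCase):
    def test_joins_words_with_hyphens(self) -> None:
        self.assertEqual(slugify("Hello World"), "hello-world")
```

pytest can also collect and run the `unittest.TestCase` class, so existing unittest suites do not need to be rewritten all at once.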
pytest is extensible. I advise using pytest-mock for your mocking needs. pytest-spark is useful when you're working with pyspark.
Test coverage is calculated using the coverage package.
The Makefile is used in this repo as a collection of small useful scripts. Most notably:
- make fmt runs autoformatting and linting
- make test runs tests
- make coverage runs tests and generates a coverage report
Simply run make to get an overview of available commands.
Poetry is an amazing, modern tool for developing Python packages. See my Poetry guide for pointers on using Poetry effectively.
Note that the dependency specification for this repository contains two dependency groups:
- test: Includes all testing dependencies.
- lint: Includes all linting dependencies. This can be useful to help your IDE do autoformatting or show in-line linting errors.
Having these development dependencies in separate groups makes it easy to install only the required dependencies in the CI workflows.
GitHub Actions is GitHub's CI/CD offering. It allows you to enforce your linting checks and tests for new features, making sure your repo remains in good shape.
I included two separate workflows, one for linting and one for testing. Both workflows utilize caching to speed up subsequent runs, and define concurrency to save some more compute.
For open source repos, I recommend using the official pre-commit CI instead of the linting workflow in this repository. It has some nice bonuses, like keeping your pre-commit hooks up-to-date automatically.
The repo also includes a Dependabot configuration. This can help keep your Python dependencies and GitHub Actions up-to-date.
Because Dependabot can get a bit spammy with its pull requests, it's configured to skip patch versions and only open pull requests once a week.