docs: adds DBT concept documentation (#111)
* docs: adds DBT concept documentation and updates the DBT Extension how-to.
pomegranited authored Jan 25, 2024
1 parent 1d835ac commit 157c345
Showing 3 changed files with 113 additions and 12 deletions.
37 changes: 37 additions & 0 deletions docs/concepts/dbt.rst
@@ -0,0 +1,37 @@
.. _dbt:

data build tool (dbt)
*********************

dbt is an open source, command-line tool managed by `dbtlabs`_ for generating and maintaining data transformations.

dbt allows engineers to transform data by writing ``SELECT`` statements that reflect business logic, which dbt then
materializes into tables and views that can be queried efficiently.

dbt also allows engineers to modularize and re-use their transformation code using "packages" that can be shared across
projects or organizations.
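
As a rough illustration of package re-use, a dbt project declares the packages it depends on in a ``packages.yml``
file. The sketch below assumes the publicly available ``dbt_utils`` package from the dbt package hub; the package
choice and version range are illustrative only, and nothing here implies that Aspects itself uses this package.

.. code-block:: yaml

   # packages.yml (illustrative sketch; package and version range are assumptions)
   packages:
     - package: dbt-labs/dbt_utils
       version: [">=1.0.0", "<2.0.0"]

Running ``dbt deps`` downloads the packages listed in this file so that their models and macros can be used by the
project.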

dbt in Aspects
##############

Aspects uses the `aspects-dbt`_ package to define the transforms used by the Aspects project. This package creates and
manages macros and materialized views for data tables stored in :ref:`Clickhouse`, and provides some tests.

Operators may create and install their own dbt packages; see :ref:`dbt-extensions` for details.

`tutor-contrib-aspects`_ also provides a "do" command to proxy running `dbt commands`_ against your deployment; run
``tutor [dev|local] do dbt --help`` for details.

References
##########

* `dbtlabs`_: dbt documentation
* `dbt-core`_: core dbt package
* `aspects-dbt`_: Aspects dbt transforms
* `tutor-contrib-aspects`_: Aspects Tutor plugin

.. _aspects-dbt: https://github.com/openedx/aspects-dbt/#aspects-dbt
.. _dbtlabs: https://docs.getdbt.com/
.. _dbt-core: https://github.com/dbt-labs/dbt-core
.. _dbt commands: https://docs.getdbt.com/reference/dbt-commands
.. _tutor-contrib-aspects: https://github.com/openedx/tutor-contrib-aspects
1 change: 1 addition & 0 deletions docs/concepts/index.rst
@@ -9,6 +9,7 @@ Concepts
xAPI <xapi_concepts>
Tracking Logs <tracking_logs>
Clickhouse <clickhouse>
dbt <dbt>
Ralph <ralph>
Vector <vector>
Pipelines <pipelines>
87 changes: 75 additions & 12 deletions docs/how-tos/dbt_extensions.rst
@@ -1,14 +1,77 @@
.. _dbt-extensions:

DBT extensions
**************

To extend the DBT project, you can use the following Tutor variables:

- **DBT_REPOSITORY**: A git repository URL to clone and use as the DBT project.
- **DBT_BRANCH**: The branch to use when cloning the DBT project.
- **DBT_PROJECT_DIR**: The directory to use as the DBT project.
- **EXTRA_DBT_PACKAGES**: A list of python packages for the DBT project to install.
- **DBT_ENABLE_OVERRIDE**: This variable determines whether the DBT project override feature
should be enabled or not. When enabled, it allows you to make changes to the **dbt_project.yml**
and **packages.yml** files using the tutor patches: `dbt-packages` and `dbt-project`.
Extending dbt
*************

As noted in :ref:`dbt`, you can install your own custom dbt package to apply your own transforms to the event data
in Aspects.

**Step 1. Create your dbt package**

Create a new dbt package using `dbt init`_.

Update the generated ``dbt_project.yml`` to use the ``aspects`` profile:

.. code-block:: yaml

   # This setting configures which "profile" dbt uses for this project.
   profile: 'aspects'

See `Building dbt packages`_ for more details, and `Writing data tests`_ for how to validate your transformations.
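
For context, a minimal ``dbt_project.yml`` might look something like the sketch below. The project name and version
are hypothetical placeholders, and the file that ``dbt init`` generates for you will likely contain additional
settings; only the ``profile`` value comes from the step above.

.. code-block:: yaml

   # dbt_project.yml (minimal sketch; name and version are hypothetical)
   name: 'my_aspects_extension'
   version: '1.0.0'
   config-version: 2

   # Use the profile that Aspects provides, as described above.
   profile: 'aspects'

   # Directory containing this package's model SQL files.
   model-paths: ["models"]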

**Step 2. Link to aspects-dbt**

Aspects charts depend on the transforms in `aspects-dbt`_, so it's important that your dbt package also installs
the same version of `aspects-dbt`_ as your Aspects Tutor plugin.

To do this, add a ``packages.yml`` file to your dbt package at the top level, where:

* ``git`` URL matches the default value of ``DBT_REPOSITORY`` in `tutor-contrib-aspects plugin.py`_
* ``revision`` matches the default value of ``DBT_BRANCH`` in `tutor-contrib-aspects plugin.py`_

.. code-block:: yaml

   packages:
     - git: "https://github.com/openedx/aspects-dbt.git"
       revision: v2.2

**Step 3. Install and run your dbt package**

Update the following Tutor variables to use your package instead of the Aspects default.

- ``DBT_REPOSITORY``: A git repository URL to clone and use as the dbt project.

  Set this to the URL for your custom dbt package.

  Default: ``https://github.com/openedx/aspects-dbt``
- ``DBT_BRANCH``: The branch to use when cloning the dbt project.

  Set this to the hash/branch/tag of your custom dbt package that you wish to use.

  Default: varies between versions of Aspects.
- ``DBT_PROJECT_DIR``: The directory to use as the dbt project.

  Set this to the name of your dbt package repository.

  Default: ``aspects-dbt``
- ``EXTRA_DBT_PACKAGES``: Add any Python packages that your dbt project requires here.

  Default: ``[]``
- ``DBT_PROFILE_*``: variables used in the Aspects ``dbt/profiles.yml`` file, including several Clickhouse connection settings.

Once your package is configured in Tutor, you can run dbt commands directly on your deployment; run ``tutor [dev|local] do dbt --help`` for details.
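
To make the variables above concrete, the corresponding entries in a Tutor configuration file might look roughly like
the sketch below. The repository URL, branch, and directory name are hypothetical placeholders for your own package,
and the variable names are copied exactly as listed above; check your installed version of the Aspects plugin for the
names and defaults it actually expects.

.. code-block:: yaml

   # Tutor configuration sketch (all values are hypothetical placeholders)
   DBT_REPOSITORY: "https://github.com/example-org/my-aspects-dbt"
   DBT_BRANCH: "main"
   # Matches the name of the repository above.
   DBT_PROJECT_DIR: "my-aspects-dbt"
   # No extra Python packages are needed in this sketch.
   EXTRA_DBT_PACKAGES: []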

References
**********

* `Building dbt packages`_: dbt's guide to building packages
* `Writing data tests`_: dbt's guide to writing package tests
* `aspects-dbt`_: Aspects' dbt package
* `eduNEXT/dbt-aspects-unidigital`_: a custom dbt package running in a production Aspects deployment

.. _aspects-dbt: https://github.com/openedx/aspects-dbt
.. _dbt init: https://docs.getdbt.com/reference/commands/init
.. _eduNEXT/dbt-aspects-unidigital: https://github.com/eduNEXT/dbt-aspects-unidigital
.. _Building dbt packages: https://docs.getdbt.com/guides/building-packages
.. _Writing data tests: https://docs.getdbt.com/best-practices/writing-custom-generic-tests
.. _tutor-contrib-aspects plugin.py: https://github.com/openedx/tutor-contrib-aspects/blob/main/tutoraspects/plugin.py
