From 80409ad638a189fa2cd694441b90b1d33a2bb082 Mon Sep 17 00:00:00 2001
From: Jillian Vogel
Date: Tue, 23 Jan 2024 15:53:21 +1030
Subject: [PATCH 1/5] docs: adds DBT concept documentation and updates the DBT Extension how-to.

---
 docs/concepts/dbt.rst           | 39 +++++++++++++++++++++++++++++++++
 docs/concepts/index.rst         |  1 +
 docs/how-tos/dbt_extensions.rst | 30 +++++++++++++++----------
 3 files changed, 58 insertions(+), 12 deletions(-)
 create mode 100644 docs/concepts/dbt.rst

diff --git a/docs/concepts/dbt.rst b/docs/concepts/dbt.rst
new file mode 100644
index 0000000..35efa90
--- /dev/null
+++ b/docs/concepts/dbt.rst
@@ -0,0 +1,39 @@
+.. _dbt:
+
+data build tool (dbt)
+*********************
+
+``dbt`` is an open source, command-line tool managed by `dbtlabs`_ for generating and maintaining data transformations.
+
+``dbt`` allows engineers to transform data by writing ``SELECT`` statements that reflect business logic which ``dbt``
+materializes into tables and views that can be queried efficiently.
+
+``dbt`` also allows engineers to modularize and re-use their transformation code using "packages" that can be shared
+across projects or organizations.
+
+dbt in Aspects
+##############
+
+Aspects uses the `aspects-dbt`_ package to define the transforms used by the Aspects project. This package manages
+materialized views for data tables stored in `Clickhouse`_.
+
+Operators may create and install their own ``dbt`` packages; see `dbt extensions`_ for details.
+
+`tutor-contrib-aspects`_ also provides a "do" command to proxy running `dbt commands`_ against your deployment; run
+``tutor [dev|local] do dbt --help`` for details.
+
+References
+##########
+
+* `dbtlabs`_: ``dbt`` documentation
+* `dbt-core`_: core ``dbt`` package
+* `aspects-dbt`_: Aspects dbt transforms
+* `tutor-contrib-aspects`_: Aspects Tutor plugin
+
+.. _aspects-dbt: https://github.com/openedx/aspects-dbt/#aspects-dbt
+.. _clickhouse: clickhouse.html
+.. _dbtlabs: https://docs.getdbt.com/
+.. _dbt-core: https://github.com/dbt-labs/dbt-core
+.. _dbt commands: https://docs.getdbt.com/reference/dbt-commands
+.. _dbt extensions: ../how-tos/dbt_extensions.html
+.. _tutor-contrib-aspects: https://github.com/openedx/tutor-contrib-aspects
diff --git a/docs/concepts/index.rst b/docs/concepts/index.rst
index 5bb81c5..bd70e12 100644
--- a/docs/concepts/index.rst
+++ b/docs/concepts/index.rst
@@ -9,6 +9,7 @@ Concepts
    xAPI
    Tracking Logs
    Clickhouse
+   dbt
    Ralph
    Vector
    Pipelines
diff --git a/docs/how-tos/dbt_extensions.rst b/docs/how-tos/dbt_extensions.rst
index 69ea946..9588906 100644
--- a/docs/how-tos/dbt_extensions.rst
+++ b/docs/how-tos/dbt_extensions.rst
@@ -1,14 +1,20 @@
 .. _dbt-extensions:
 
-DBT extensions
-**************
-
-To extend the DBT project, you can use the following Tutor variables:
-
-- **DBT_REPOSITORY**: A git repository URL to clone and use as the DBT project.
-- **DBT_BRANCH**: The branch to use when cloning the DBT project.
-- **DBT_PROJECT_DIR**: The directory to use as the DBT project.
-- **EXTRA_DBT_PACKAGES**: A list of python packages for the DBT project to install.
-- **DBT_ENABLE_OVERRIDE**: This variable determines whether the DBT project override feature
-  should be enabled or not. When enabled, it allows you to make changes to the **dbt_project.yml**
-  and **packages.yml** files using the tutor patches: `dbt-packages` and `dbt-project`.
+Extending dbt
+*************
+
+As noted in `dbt concepts`_, you can install your own custom DBT packages to apply your own transforms to the event data
+in Aspects.
+
+To change which DBT packages are installed, use the following Tutor variables:
+
+- **EXTRA_DBT_PACKAGES**: A list of pip dbt packages for Aspects to install. Add your custom ddt packages here.
+- **DBT_REPOSITORY**: A git repository URL to clone and use as the main Aspects DBT project.
+- **DBT_BRANCH**: The branch to use when cloning ``DBT_REPOSITORY``.
+
+To change how the ``dbt`` packages are configured, use these Tutor variables:
+
+- **DBT_PROFILE_\***: variables used in the ``dbt/profiles.yml`` file, including several Clickhouse connection settings
+
+
+.. _dbt concepts: ../concepts/dbt.html

From 5b5e01d65afac941b7d3d71a56746f0ec5fbde7f Mon Sep 17 00:00:00 2001
From: Jillian Vogel
Date: Tue, 23 Jan 2024 20:24:42 +1030
Subject: [PATCH 2/5] fix: addresses review comments

---
 docs/concepts/dbt.rst           |  4 ++--
 docs/how-tos/dbt_extensions.rst | 10 +++++-----
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/docs/concepts/dbt.rst b/docs/concepts/dbt.rst
index 35efa90..f7c4803 100644
--- a/docs/concepts/dbt.rst
+++ b/docs/concepts/dbt.rst
@@ -14,8 +14,8 @@ across projects or organizations.
 dbt in Aspects
 ##############
 
-Aspects uses the `aspects-dbt`_ package to define the transforms used by the Aspects project. This package manages
-materialized views for data tables stored in `Clickhouse`_.
+Aspects uses the `aspects-dbt`_ package to define the transforms used by the Aspects project. This package creates and
+manages macros and materialized views for data tables stored in `Clickhouse`_, and provides some tests.
 
 Operators may create and install their own ``dbt`` packages; see `dbt extensions`_ for details.
 
diff --git a/docs/how-tos/dbt_extensions.rst b/docs/how-tos/dbt_extensions.rst
index 9588906..ac268d4 100644
--- a/docs/how-tos/dbt_extensions.rst
+++ b/docs/how-tos/dbt_extensions.rst
@@ -3,13 +3,13 @@
 Extending dbt
 *************
 
-As noted in `dbt concepts`_, you can install your own custom DBT packages to apply your own transforms to the event data
-in Aspects.
+As noted in `Concepts: dbt `_, you can install your own custom dbt packages to apply your own transforms
+to the event data in Aspects.
 
-To change which DBT packages are installed, use the following Tutor variables:
+To change which dbt packages are installed, use the following Tutor variables:
 
-- **EXTRA_DBT_PACKAGES**: A list of pip dbt packages for Aspects to install. Add your custom ddt packages here.
-- **DBT_REPOSITORY**: A git repository URL to clone and use as the main Aspects DBT project.
+- **EXTRA_DBT_PACKAGES**: A list of pip dbt packages for Aspects to install. Add your custom dbt packages here.
+- **DBT_REPOSITORY**: A git repository URL to clone and use as the main Aspects dbt project.
 - **DBT_BRANCH**: The branch to use when cloning ``DBT_REPOSITORY``.
 
 To change how the ``dbt`` packages are configured, use these Tutor variables:

From a058d53dbbda9d950df36b2c96a2f9b8926dcb02 Mon Sep 17 00:00:00 2001
From: Jillian Vogel
Date: Tue, 23 Jan 2024 23:46:16 +1030
Subject: [PATCH 3/5] fix: use :ref: for internal links

---
 docs/concepts/dbt.rst           | 6 ++----
 docs/how-tos/dbt_extensions.rst | 6 +-----
 2 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/docs/concepts/dbt.rst b/docs/concepts/dbt.rst
index f7c4803..7a83539 100644
--- a/docs/concepts/dbt.rst
+++ b/docs/concepts/dbt.rst
@@ -15,9 +15,9 @@ dbt in Aspects
 ##############
 
 Aspects uses the `aspects-dbt`_ package to define the transforms used by the Aspects project. This package creates and
-manages macros and materialized views for data tables stored in `Clickhouse`_, and provides some tests.
+manages macros and materialized views for data tables stored in :ref:`Clickhouse`, and provides some tests.
 
-Operators may create and install their own ``dbt`` packages; see `dbt extensions`_ for details.
+Operators may create and install their own ``dbt`` packages; see :ref:`dbt-extensions` for details.
 
 `tutor-contrib-aspects`_ also provides a "do" command to proxy running `dbt commands`_ against your deployment; run
 ``tutor [dev|local] do dbt --help`` for details.
@@ -31,9 +31,7 @@ References
 * `tutor-contrib-aspects`_: Aspects Tutor plugin
 
 .. _aspects-dbt: https://github.com/openedx/aspects-dbt/#aspects-dbt
-.. _clickhouse: clickhouse.html
 .. _dbtlabs: https://docs.getdbt.com/
 .. _dbt-core: https://github.com/dbt-labs/dbt-core
 .. _dbt commands: https://docs.getdbt.com/reference/dbt-commands
-.. _dbt extensions: ../how-tos/dbt_extensions.html
 .. _tutor-contrib-aspects: https://github.com/openedx/tutor-contrib-aspects
diff --git a/docs/how-tos/dbt_extensions.rst b/docs/how-tos/dbt_extensions.rst
index ac268d4..4a18545 100644
--- a/docs/how-tos/dbt_extensions.rst
+++ b/docs/how-tos/dbt_extensions.rst
@@ -3,8 +3,7 @@
 Extending dbt
 *************
 
-As noted in `Concepts: dbt `_, you can install your own custom dbt packages to apply your own transforms
-to the event data in Aspects.
+As noted in :ref:`dbt`, you can install your own custom dbt packages to apply your own transforms to the event data in Aspects.
 
 To change which dbt packages are installed, use the following Tutor variables:
 
@@ -15,6 +14,3 @@ To change which dbt packages are installed, use the following Tutor variables:
 To change how the ``dbt`` packages are configured, use these Tutor variables:
 
 - **DBT_PROFILE_\***: variables used in the ``dbt/profiles.yml`` file, including several Clickhouse connection settings
-
-
-.. _dbt concepts: ../concepts/dbt.html

From a628d90f41b73f1b8be694df858fbb79825ca5b0 Mon Sep 17 00:00:00 2001
From: Jillian Vogel
Date: Wed, 24 Jan 2024 00:53:10 +1030
Subject: [PATCH 4/5] docs: improve the "Extending dbt" how-to

---
 docs/how-tos/dbt_extensions.rst | 45 ++++++++++++++++++++++++++++-----
 1 file changed, 38 insertions(+), 7 deletions(-)

diff --git a/docs/how-tos/dbt_extensions.rst b/docs/how-tos/dbt_extensions.rst
index 4a18545..8c05f14 100644
--- a/docs/how-tos/dbt_extensions.rst
+++ b/docs/how-tos/dbt_extensions.rst
@@ -3,14 +3,45 @@
 Extending dbt
 *************
 
-As noted in :ref:`dbt`, you can install your own custom dbt packages to apply your own transforms to the event data in Aspects.
+As noted in :ref:`dbt`, you can install your own custom ``dbt`` package to apply your own transforms to the event data
+in Aspects.
 
-To change which dbt packages are installed, use the following Tutor variables:
+**Step 1. Create your dbt package**
 
-- **EXTRA_DBT_PACKAGES**: A list of pip dbt packages for Aspects to install. Add your custom dbt packages here.
-- **DBT_REPOSITORY**: A git repository URL to clone and use as the main Aspects dbt project.
-- **DBT_BRANCH**: The branch to use when cloning ``DBT_REPOSITORY``.
+See `Building dbt packages`_ for details.
 
-To change how the ``dbt`` packages are configured, use these Tutor variables:
+**Step 2. Link to aspects-dbt**
 
-- **DBT_PROFILE_\***: variables used in the ``dbt/profiles.yml`` file, including several Clickhouse connection settings
+Aspects charts depend on the transforms in `aspects-dbt`_, so it's important that your ``dbt`` package also installs
+`aspects-dbt`_.
+
+To do this, add a ``packages.yml`` file to your ``dbt`` package at the top level, with content like this:
+
+.. code-block:: yaml
+
+    packages:
+      - git: "https://github.com/openedx/aspects-dbt.git"
+        revision: v2.2
+
+**Step 3. Install and run your dbt package**
+
+Update the following Tutor variables to use your package instead of the Aspects default.
+
+- **DBT_REPOSITORY**: A git repository URL to clone and use as the ``dbt`` project.
+
+  Default: ``https://github.com/openedx/aspects-dbt``
+- **DBT_BRANCH**: The branch to use when cloning the ``dbt`` project.
+
+  Default: varies between versions of Aspects.
+- **DBT_PROJECT_DIR**: The directory to use as the ``dbt`` project.
+
+  Default: ``aspects-dbt``
+- **EXTRA_DBT_PACKAGES**: Add any python packages that your ``dbt`` project requires here.
+
+  Default: ``[]``
+- **DBT_PROFILE_\***: variables used in the Aspects ``dbt/profiles.yml`` file, including several Clickhouse connection settings.
+
+Once your package is configured in Tutor, you can run ``dbt`` commands directly on your deployment; run ``tutor [dev|local] do dbt --help`` for details.
+
+.. _aspects-dbt: https://github.com/openedx/aspects-dbt
+.. _Building dbt packages: https://docs.getdbt.com/guides/building-packages

From 98da80bebdce89b62e34e86d25ef0f221366c9ae Mon Sep 17 00:00:00 2001
From: Jillian Vogel
Date: Wed, 24 Jan 2024 04:40:45 +1030
Subject: [PATCH 5/5] docs: address review comments

* adds details about package profile, dependency versions
* changes formatting of pages for clarity
---
 docs/concepts/dbt.rst           | 14 ++++-----
 docs/how-tos/dbt_extensions.rst | 52 ++++++++++++++++++++++++++-------
 2 files changed, 48 insertions(+), 18 deletions(-)

diff --git a/docs/concepts/dbt.rst b/docs/concepts/dbt.rst
index 7a83539..a3be638 100644
--- a/docs/concepts/dbt.rst
+++ b/docs/concepts/dbt.rst
@@ -3,13 +3,13 @@
 data build tool (dbt)
 *********************
 
-``dbt`` is an open source, command-line tool managed by `dbtlabs`_ for generating and maintaining data transformations.
+dbt is an open source, command-line tool managed by `dbtlabs`_ for generating and maintaining data transformations.
 
-``dbt`` allows engineers to transform data by writing ``SELECT`` statements that reflect business logic which ``dbt``
+dbt allows engineers to transform data by writing ``SELECT`` statements that reflect business logic which dbt
 materializes into tables and views that can be queried efficiently.
 
-``dbt`` also allows engineers to modularize and re-use their transformation code using "packages" that can be shared
-across projects or organizations.
+dbt also allows engineers to modularize and re-use their transformation code using "packages" that can be shared across
+projects or organizations.
 
 dbt in Aspects
 ##############
@@ -17,7 +17,7 @@ dbt in Aspects
 Aspects uses the `aspects-dbt`_ package to define the transforms used by the Aspects project. This package creates and
 manages macros and materialized views for data tables stored in :ref:`Clickhouse`, and provides some tests.
 
-Operators may create and install their own ``dbt`` packages; see :ref:`dbt-extensions` for details.
+Operators may create and install their own dbt packages; see :ref:`dbt-extensions` for details.
 
 `tutor-contrib-aspects`_ also provides a "do" command to proxy running `dbt commands`_ against your deployment; run
 ``tutor [dev|local] do dbt --help`` for details.
@@ -25,8 +25,8 @@ Operators may create and install their own ``dbt`` packages; see :ref:`dbt-exten
 References
 ##########
 
-* `dbtlabs`_: ``dbt`` documentation
-* `dbt-core`_: core ``dbt`` package
+* `dbtlabs`_: dbt documentation
+* `dbt-core`_: core dbt package
 * `aspects-dbt`_: Aspects dbt transforms
 * `tutor-contrib-aspects`_: Aspects Tutor plugin
 
diff --git a/docs/how-tos/dbt_extensions.rst b/docs/how-tos/dbt_extensions.rst
index 8c05f14..f24e400 100644
--- a/docs/how-tos/dbt_extensions.rst
+++ b/docs/how-tos/dbt_extensions.rst
@@ -3,19 +3,31 @@
 Extending dbt
 *************
 
-As noted in :ref:`dbt`, you can install your own custom ``dbt`` package to apply your own transforms to the event data
+As noted in :ref:`dbt`, you can install your own custom dbt package to apply your own transforms to the event data
 in Aspects.
 
 **Step 1. Create your dbt package**
 
-See `Building dbt packages`_ for details.
+Create a new dbt package using `dbt init`_.
+
+Update the generated ``dbt_project.yml`` to use the ``aspects`` profile:
+
+.. code-block:: yaml
+
+    # This setting configures which "profile" dbt uses for this project.
+    profile: 'aspects'
+
+See `Building dbt packages`_ for more details, and `Writing data tests`_ for how to validate your transformations.
 
 **Step 2. Link to aspects-dbt**
 
-Aspects charts depend on the transforms in `aspects-dbt`_, so it's important that your ``dbt`` package also installs
-`aspects-dbt`_.
+Aspects charts depend on the transforms in `aspects-dbt`_, so it's important that your dbt package also installs
+the same version of `aspects-dbt`_ as your Aspects Tutor plugin.
+
+To do this, add a ``packages.yml`` file to your dbt package at the top level, where:
 
-To do this, add a ``packages.yml`` file to your ``dbt`` package at the top level, with content like this:
+* ``git`` url matches the default value of ``DBT_REPOSITORY`` in `tutor-contrib-aspects plugin.py`_
+* ``revision`` matches the default value of ``DBT_BRANCH`` in `tutor-contrib-aspects plugin.py`_
 
 .. code-block:: yaml
 
@@ -27,21 +39,39 @@ To do this, add a ``packages.yml`` file to your ``dbt`` package at the top level
 
 Update the following Tutor variables to use your package instead of the Aspects default.
 
-- **DBT_REPOSITORY**: A git repository URL to clone and use as the ``dbt`` project.
+- ``DBT_REPOSITORY``: A git repository URL to clone and use as the dbt project.
+
+  Set this to the URL for your custom dbt package.
 
   Default: ``https://github.com/openedx/aspects-dbt``
-- **DBT_BRANCH**: The branch to use when cloning the ``dbt`` project.
+- ``DBT_BRANCH``: The branch to use when cloning the dbt project.
+
+  Set this to the hash/branch/tag of your custom dbt package that you wish to use.
 
   Default: varies between versions of Aspects.
-- **DBT_PROJECT_DIR**: The directory to use as the ``dbt`` project.
+- ``DBT_PROJECT_DIR``: The directory to use as the dbt project.
+
+  Set this to the name of your dbt package repository.
 
   Default: ``aspects-dbt``
-- **EXTRA_DBT_PACKAGES**: Add any python packages that your ``dbt`` project requires here.
+- ``EXTRA_DBT_PACKAGES``: Add any python packages that your dbt project requires here.
 
   Default: ``[]``
-- **DBT_PROFILE_\***: variables used in the Aspects ``dbt/profiles.yml`` file, including several Clickhouse connection settings.
+- ``DBT_PROFILE_*``: variables used in the Aspects ``dbt/profiles.yml`` file, including several Clickhouse connection settings.
+
+Once your package is configured in Tutor, you can run dbt commands directly on your deployment; run ``tutor [dev|local] do dbt --help`` for details.
+
+References
+**********
 
-Once your package is configured in Tutor, you can run ``dbt`` commands directly on your deployment; run ``tutor [dev|local] do dbt --help`` for details.
+* `Building dbt packages`_: dbt's guide to building packages
+* `Writing data tests`_: dbt's guide to writing package tests
+* `aspects-dbt`_: Aspects' dbt package
+* `eduNEXT/dbt-aspects-unidigital`_: a custom dbt packages running in production Aspects
 
 .. _aspects-dbt: https://github.com/openedx/aspects-dbt
+.. _dbt init: https://docs.getdbt.com/reference/commands/init
+.. _eduNEXT/dbt-aspects-unidigital: https://github.com/eduNEXT/dbt-aspects-unidigital
 .. _Building dbt packages: https://docs.getdbt.com/guides/building-packages
+.. _Writing data tests: https://docs.getdbt.com/best-practices/writing-custom-generic-tests
+.. _tutor-contrib-aspects plugin.py: https://github.com/openedx/tutor-contrib-aspects/blob/main/tutoraspects/plugin.py