docs: adds ADR 12. Clickhouse vs dbt #112

Merged
merged 7 commits into from
Jan 25, 2024


@pomegranited (Contributor) commented Jan 24, 2024

Description

  1. Adds ADR 12. Clickhouse vs dbt
  2. Adjusts the titles on some of the referenced pages to make them more legible when used in context.

Supporting information

Part of #61

@openedx-webhooks added the "open-source-contribution" label (PR author is not from Axim or 2U) Jan 24, 2024
@openedx-webhooks

Thanks for the pull request, @pomegranited! Please note that it may take us up to several weeks or months to complete a review and merge your PR.

Feel free to add as much of the following information to the ticket as you can:

  • supporting documentation
  • Open edX discussion forum threads
  • timeline information ("this must be merged by XX date", and why that is)
  • partner information ("this is a course on edx.org")
  • any other information that can help Product understand the context for the PR

All technical communication about the code itself will be done via the GitHub pull request interface. As a reminder, our process documentation is here.

Please let us know once your PR is ready for our review and all tests are green.

@pomegranited pomegranited mentioned this pull request Jan 24, 2024
7 tasks
@pomegranited changed the title from "feat: adds ADR 12. Clickhouse vs dbt" to "docs: adds ADR 12. Clickhouse vs dbt" Jan 24, 2024
@bmtcril (Contributor) left a comment

Looks good, just a couple small things!


* materialized views
* partitions
* dictionaries

Contributor

Dictionaries aren't available in dbt yet, but we're hoping to contract with Rory to get them added in the next FC. Currently they're still in Alembic. I don't know if it matters for this doc, maybe a (pending) and link to the task?

Contributor Author

* dictionaries
* fields extracted from event JSON

Note that while we use dbt to manage these data transformations, the transformations themselves

Contributor

Suggested change
- Note that while we use dbt to manage these data transformations, the transformations themselves
+ Note that while we use dbt to manage the database schema that performs the transformations, the transformations themselves happen in the ClickHouse process. As data is inserted it is immediately transformed and stored in various query-efficient tables.

Contributor Author

Oops! I forgot to finish that thought :)

Have added your content as a "note", see: https://docsopenedxorg--112.org.readthedocs.build/projects/openedx-aspects/en/112/decisions/0012_clickhouse_dbt.html#decision

************

#. Contribute upstream to `dbt-clickhouse`_ where support for required features is missing.
#. Move transformations made by the "query" and "dataset" Aspects Superset assets to `aspects-dbt`_.

Contributor

This probably deserves a "where possible": there are several things we still have to do in Superset virtual datasets, because we have to use filters in specific ways for performance reasons. I think this is still the long-term goal, though, as ClickHouse's predicate pushdown capabilities get overhauled.

Base automatically changed from jill/concept-dbt to main January 25, 2024 02:56
@pomegranited pomegranited requested a review from bmtcril January 25, 2024 03:08
@bmtcril bmtcril merged commit 54d6dbb into main Jan 25, 2024
3 checks passed
@bmtcril bmtcril deleted the jill/adr-dbt branch January 25, 2024 13:34
@openedx-webhooks

@pomegranited 🎉 Your pull request was merged! Please take a moment to answer a two question survey so we can improve your experience in the future.

Labels
open-source-contribution PR author is not from Axim or 2U
4 participants