docs: adds ADR 12. Clickhouse vs dbt #112
Also updates the DBT Extension how-to, and adjusts the titles on some of the referenced pages to make them more legible when used in context.
Thanks for the pull request, @pomegranited! Please note that it may take us up to several weeks or months to complete a review and merge your PR. Feel free to add as much of the following information to the ticket as you can:
All technical communication about the code itself will be done via the GitHub pull request interface. As a reminder, our process documentation is here. Please let us know once your PR is ready for our review and all tests are green.
2c145af to 18cc1da
Looks good, just a couple small things!
* materialized views
* partitions
* dictionaries
Dictionaries aren't available in dbt yet, but we're hoping to contract with Rory to get them added in the next FC. Currently they're still in Alembic. I don't know if it matters for this doc; maybe add a "(pending)" and a link to the task?
* dictionaries
* fields extracted from event JSON

Note that while we use dbt to manage these data transformations, the transformations themselves
Suggested change:
- Note that while we use dbt to manage these data transformations, the transformations themselves
+ Note that while we use dbt to manage the database schema that performs the transformations, the transformations themselves happen in the ClickHouse process. As data is inserted it is immediately transformed and stored in various query-efficient tables.
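For readers unfamiliar with insert-time transformation, here is a minimal sketch of the idea in plain Python. It is a toy in-memory model, not ClickHouse or dbt code: the `ToyStore` class, the `extract_fields` function, and the event field names are all hypothetical and chosen only for illustration. The point is that the transformation is registered once as part of the schema, and then runs inside the store on every insert.

```python
import json


class ToyStore:
    """Hypothetical in-memory stand-in for a database that transforms rows on insert."""

    def __init__(self):
        self.raw_events = []  # landing table holding raw JSON strings
        self.views = []       # (transform, target_rows) pairs, defined as part of the schema

    def add_materialized_view(self, transform):
        # Schema management step: register the transformation once.
        target_rows = []
        self.views.append((transform, target_rows))
        return target_rows

    def insert(self, event_json):
        # The store itself applies every registered transformation at insert time.
        self.raw_events.append(event_json)
        for transform, target_rows in self.views:
            target_rows.append(transform(event_json))


def extract_fields(event_json):
    # Pull a few fields out of the event JSON into a flat, query-friendly row.
    event = json.loads(event_json)
    return {"actor_id": event["actor"]["id"], "verb": event["verb"]["id"]}


store = ToyStore()
events_flat = store.add_materialized_view(extract_fields)
store.insert('{"actor": {"id": "user-1"}, "verb": {"id": "completed"}}')
print(events_flat)
```

In this model, the code that calls `add_materialized_view` plays the role the suggestion assigns to dbt (managing the schema), while `insert` plays the role of the ClickHouse process doing the actual work.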
Oops! I forgot to finish that thought :)
Have added your content as a "note", see: https://docsopenedxorg--112.org.readthedocs.build/projects/openedx-aspects/en/112/decisions/0012_clickhouse_dbt.html#decision
************

#. Contribute upstream to `dbt-clickhouse`_ where support for required features is missing.
#. Move transformations made by the "query" and "dataset" Aspects Superset assets to `aspects-dbt`_.
This probably deserves a "where possible". There are several things we still have to do in Superset virtual datasets, because we have to use filters in specific ways for performance reasons. I think this is still the long-term goal, though, as ClickHouse's predicate pushdown capabilities get overhauled.
b54d4b6 to 98da80b
@pomegranited 🎉 Your pull request was merged! Please take a moment to answer a two-question survey so we can improve your experience in the future.
Description
Supporting information
Part of #61