diff --git a/apps/docs/docs/contribute/connect-data/bigquery/replication.md b/apps/docs/docs/contribute/connect-data/bigquery/replication.md
index 845501e3b..ae792f573 100644
--- a/apps/docs/docs/contribute/connect-data/bigquery/replication.md
+++ b/apps/docs/docs/contribute/connect-data/bigquery/replication.md
@@ -1,5 +1,5 @@
 ---
-title: 🏗️ Using a BigQuery Data Transfer Service
+title: Using BigQuery Data Transfer Service
 sidebar_position: 2
 ---
 
@@ -15,12 +15,66 @@ If you already maintain a public dataset in the US multi-region,
 you should simply make a dbt source as shown in [this guide](./index.md).
 
-## OSO Dataset Replication
+## Define the Dagster asset
 
-:::warning
-Coming soon... This section is a work in progress.
-To track progress, see this
-[GitHub issue](https://github.com/opensource-observer/oso/issues/1311).
-:::
+Create a new asset file in
+`warehouse/oso_dagster/assets/`.
+This file should invoke the BigQuery Data Transfer asset factory.
+For example, you can see this in action for
+[Lens data](https://github.com/opensource-observer/oso/blob/main/warehouse/oso_dagster/assets/lens.py).
+We make a copy of this data because the source dataset is not
+in the US multi-region, which is required by our dbt pipeline.
+
+```python
+# warehouse/oso_dagster/assets/lens.py
+from ..factories import (
+    create_bq_dts_asset,
+    BigQuerySourceConfig,
+    BqDtsAssetConfig,
+    SourceMode,
+    TimeInterval,
+)
+
+lens_data = create_bq_dts_asset(
+    BqDtsAssetConfig(
+        name="lens",
+        destination_project_id="opensource-observer",
+        destination_dataset_name="lens_v2_polygon",
+        source_config=BigQuerySourceConfig(
+            source_project_id="lens-public-data",
+            source_dataset_name="v2_polygon",
+            service_account=None
+        ),
+        copy_interval=TimeInterval.Weekly,
+        copy_mode=SourceMode.Overwrite,
+    ),
+)
+```
+
+For the latest documentation on configuration parameters,
+check out the comments in the
+[BigQuery Data Transfer factory](https://github.com/opensource-observer/oso/blob/main/warehouse/oso_dagster/factories/bq_dts.py).
+
+In order for our Dagster deployment to recognize this asset,
+you need to import it in
+`warehouse/oso_dagster/assets/__init__.py`.
+
+```python
+...
+from .lens import *
+...
+```
+
+For more details on defining Dagster assets,
+see the [Dagster tutorial](https://docs.dagster.io/tutorial).
+
+### BigQuery Data Transfer examples in OSO
+
+In the
+[OSO monorepo](https://github.com/opensource-observer/oso),
+you will find a few examples of using the BigQuery Data Transfer asset factory:
+
+- [Farcaster data](https://github.com/opensource-observer/oso/blob/main/warehouse/oso_dagster/assets/farcaster.py)
+- [Lens data](https://github.com/opensource-observer/oso/blob/main/warehouse/oso_dagster/assets/lens.py)
diff --git a/apps/docs/docs/contribute/connect-data/gcs.md b/apps/docs/docs/contribute/connect-data/gcs.md
index 6e26bfd28..d467e8f5e 100644
--- a/apps/docs/docs/contribute/connect-data/gcs.md
+++ b/apps/docs/docs/contribute/connect-data/gcs.md
@@ -42,7 +42,7 @@ For example, you can see this in action for
 from ..factories import (
     interval_gcs_import_asset,
     SourceMode,
-    Interval,
+    TimeInterval,
     IntervalGCSAsset,
 )
 
@@ -57,7 +57,7 @@ gitcoin_passport_scores = interval_gcs_import_asset(
         destination_table="passport_scores",
         raw_dataset_name="oso_raw_sources",
         clean_dataset_name="gitcoin",
-        interval=Interval.Daily,
+        interval=TimeInterval.Daily,
         mode=SourceMode.Overwrite,
         retention_days=10,
         format="PARQUET",
@@ -88,8 +88,8 @@ In the
 [OSO monorepo](https://github.com/opensource-observer/oso),
 you will find a few examples of using the GCS asset factory:
 
-- [Superchain data](https://github.com/opensource-observer/oso/blob/main/warehouse/oso_dagster/assets.py)
-- [Gitcoin Passport scores](https://github.com/opensource-observer/oso/blob/main/warehouse/oso_dagster/assets.py)
-- [OpenRank reputations on Farcaster](https://github.com/opensource-observer/oso/blob/main/warehouse/oso_dagster/assets.py)
+- [Superchain data](https://github.com/opensource-observer/oso/blob/main/warehouse/oso_dagster/assets/__init__.py)
+- [Gitcoin Passport scores](https://github.com/opensource-observer/oso/blob/main/warehouse/oso_dagster/assets/gitcoin.py)
+- [OpenRank reputations on Farcaster](https://github.com/opensource-observer/oso/blob/main/warehouse/oso_dagster/assets/karma3.py)