Cannot materialize view models shared via Delta Sharing #781

Open
shivonchain opened this issue Aug 30, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@shivonchain

Describe the bug

When materializing a view model via dbt run or dbt build while the view is shared via Delta Sharing, the materialization fails:

Runtime Error in model my_view_model (models/my_view_model.sql)
  [RequestId=xxxxx ErrorClass=INVALID_STATE] 
TABLE datacloud_prod_catalog.due_diligence_silver_views.my_view_model cannot be deleted 
because it is being shared via Delta Sharing. It is shared through the following shares: my_share. 
If you just want to update a shared view, please use ALTER VIEW instead.

To update the definition of a view that is shared over Delta Sharing, ALTER VIEW must be used instead of the CREATE OR REPLACE statement that dbt issues for view models. Ideally, this would be baked into the default view materialization in dbt-databricks.
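For illustration, a minimal sketch of the two statements, using the object names from the error above and a placeholder query body:

-- What dbt currently issues for a view model (rejected when the view is in a share):
CREATE OR REPLACE VIEW datacloud_prod_catalog.due_diligence_silver_views.my_view_model AS
SELECT 1 AS placeholder_column;

-- What Databricks asks for when the view is shared via Delta Sharing:
ALTER VIEW datacloud_prod_catalog.due_diligence_silver_views.my_view_model AS
SELECT 1 AS placeholder_column;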

An alternative I have considered is to use pre- and post-hooks to remove the view from the underlying share and add it back afterwards. However, this is not always feasible.
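For reference, a sketch of that hook-based workaround (object names taken from the error above; this assumes ALTER SHARE ... REMOVE/ADD VIEW is available in the workspace), which also shows why it is brittle: the view briefly drops out of the share, and the pre-hook fails if the view is not yet part of it.

-- models/my_view_model.sql
{{ config(
    materialized='view',
    pre_hook="ALTER SHARE my_share REMOVE VIEW datacloud_prod_catalog.due_diligence_silver_views.my_view_model",
    post_hook="ALTER SHARE my_share ADD VIEW datacloud_prod_catalog.due_diligence_silver_views.my_view_model"
) }}

select 1 as placeholder_column  -- stands in for the actual model SQL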

Steps To Reproduce

In as much detail as possible, please provide steps to reproduce the issue. Sample data that triggers the issue, example model code, etc. is all very helpful here.

  1. Create a view model in dbt.
  2. Share the view model with a data recipient via Delta Sharing.
  3. Run the view model using dbt run -s <model>.

Expected behavior

I expect the dbt view model to be materialized even if it is shared via Delta Sharing. The view materialization should check whether the view already exists; if it does, ALTER VIEW should be issued to Databricks instead of CREATE OR REPLACE to update the view definition.
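In adapter terms, the branching could look roughly like the following hypothetical macro (the macro name and signature are made up for illustration; load_relation is dbt's standard existence check):

{% macro create_or_alter_view(relation, compiled_sql) %}
  {#- Hypothetical helper, not the adapter's actual macro API -#}
  {%- if load_relation(relation) is not none -%}
    alter view {{ relation }} as {{ compiled_sql }}
  {%- else -%}
    create view {{ relation }} as {{ compiled_sql }}
  {%- endif -%}
{% endmacro %}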

Screenshots and log output

If applicable, add screenshots or log output to help explain your problem.

System information

The output of dbt --version:

Core:
  - installed: 1.8.2
  - latest:    1.8.6 - Update available!

  Your version of dbt-core is out of date!
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Plugins:
  - databricks: 1.8.1 - Update available!
  - spark:      1.8.0 - Up to date!

  At least one plugin is out of date or incompatible with dbt-core.
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

The operating system you're using: macOS, M3 Max

The output of python --version: Python 3.12.3

Additional context

Add any other context about the problem here.

shivonchain added the bug label Aug 30, 2024
@dataders
Contributor

dataders commented Sep 4, 2024

@benc-db for when you're back from vacation

@benc-db
Collaborator

benc-db commented Sep 12, 2024

This is an interesting edge case that will require some significant testing. The safest change would be to only use alter in place of 'create or replace' if the table is being delta-shared. Is there a tblproperty that indicates this?
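For concreteness, the kind of check I have in mind (assuming SHOW TBLPROPERTIES can be run against the shared view):

SHOW TBLPROPERTIES datacloud_prod_catalog.due_diligence_silver_views.my_view_model;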

@shivonchain
Author

@benc-db I don't believe so; the closest piece of metadata I found was by calling the REST API and grabbing the storage_location of the view:

databricks tables get $TABLE_FQN | jq .storage_location

For Delta-shared tables/views, the storage_location URI will begin with uc-deltasharing://.

I'd like to see if I can grab storage_location with SQL instead; that way, it could be incorporated into the materialization logic as you suggest.
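One candidate I'd try first (it's unclear whether the location is actually populated for views on the provider side, so treat this as a sketch):

DESCRIBE TABLE EXTENDED datacloud_prod_catalog.due_diligence_silver_views.my_view_model;
-- then look for a Location row in the detailed output, if one is present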

@henlue
Contributor

henlue commented Sep 25, 2024

My team is experiencing the same issue and would also appreciate a solution.

@shivonchain If I'm not mistaken, the storage_location can only be used on the recipient's side to check whether a view was received via Delta Sharing. To resolve the issue, we would need to know on the sharing side whether a view is part of a share.

The only way I have found to check whether a view is being delta-shared is via requests to the Shares API: databricks shares list and databricks shares get <share_name> --include-shared-data. The result is a list of all catalogs, schemas, and relations that are being delta-shared, so it would be possible to check whether the current view is among them.
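The same information is also exposed in SQL, which might be easier to call from a materialization than the REST API; a sketch:

-- list all shares, then list every object in a given share
SHOW SHARES;
SHOW ALL IN SHARE my_share;
-- the second result includes the shared objects' full names, which could be
-- compared against the relation the materialization is about to replace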

@benc-db would it maybe be better to try to execute the create or replace statement and, if it fails with the error above, retry it using an alter statement?
