Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Databricks-sqlalchemy-Use System tables to fetch metadata #451

Open
SomanathSankara opened this issue Oct 5, 2024 · 4 comments
Open

Databricks-sqlalchemy-Use System tables to fetch metadata #451

SomanathSankara opened this issue Oct 5, 2024 · 4 comments

Comments

@SomanathSankara
Copy link

Current Databricks Sql Alchemy uses describe command/show schema/table to get metadata
Can we use system tables to get metadata

@kravets-levko
Copy link
Contributor

Hi @SomanathSankara! Sorry, but I don't think I understand what the question is. Can you please elaborate it a bit?

@SomanathSankara
Copy link
Author

hi @kravets-levko correct my understanding.
Currently sqlalchemy (databricks ) package is using show table comand , show columns command to get table ,column details.
Cant we use UC system table(information schema) to fetch these details which will be faster

@susodapop
Copy link
Contributor

Adding some signal here: yes the system tables can be used to reflect the database schema. This would require Unity Catalog on the target instance, so it would be good to maintain a fallback to the existing SHOW TABLES behavior. But there's no technical limitation of SQLAlchemy that blocks this manner of implementation.

@bob-skowron
Copy link

bob-skowron commented Nov 27, 2024

Hi all. I was looking at potentially implementing this and submitting a PR. For some background, I'm running into an issue using alembic to manage migrations for Databricks and if the schema does not exist, alembic throws an error because it's passing through the error from SHOW TABLES FROM <invalid_schema>.

I had a couple comments/questions:

  1. information_schema only contains actual structures. It will not contain temporary tables or views. I see the get_view_names has a specific parameter for this, but do we see this as a limitation?
  2. If we do need the temporary tables/views (and thus have to use SHOW TABLES/VIEWS), is there any objection to modifying the code to catch any DDL errors and returning an empty array with a logged warning instead? Could also check if the requested schema is in the available schemas and return that way as well

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants