chore: Harmonize imports of sqlalchemy module, use sa where applicable #255

Draft: wants to merge 8 commits into main

amotl
Contributor

@amotl amotl commented Dec 16, 2023

About

The patch follows the convention of importing SQLAlchemy as `import sqlalchemy as sa`.

In this spirit, all references, even simple ones such as the SQLAlchemy base types `TEXT` or `BIGINT`, are written as `sa.TEXT`, `sa.BIGINT`, etc. This makes it easy to tell them apart when harmonizing type definitions coming from SA's built-in dialects vs. type definitions coming from 3rd-party dialects.
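
A minimal sketch of what the convention might look like in practice. The table and the `type_origin` helper are hypothetical and not taken from the patch; the PostgreSQL dialect only stands in here for any dialect-specific type source.

```python
import sqlalchemy as sa
from sqlalchemy.dialects import postgresql

# Generic SQLAlchemy types are always spelled with the `sa.` prefix,
# while dialect-specific types keep their dialect namespace.
metadata = sa.MetaData()
example = sa.Table(
    "example",  # hypothetical table, for illustration only
    metadata,
    sa.Column("id", sa.BIGINT, primary_key=True),
    sa.Column("name", sa.TEXT),
    sa.Column("payload", postgresql.JSONB),
    sa.Column("tags", postgresql.ARRAY(sa.TEXT)),
)


def type_origin(column: sa.Column) -> str:
    """Return the name of the module that defines the column's type."""
    return type(column.type).__module__
```

With the prefix in place, the origin of each type is visible at the call site, without looking up where a bare `TEXT` or `ARRAY` was imported from.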

Note

This patch is stacked on top of GH-254, and as such, can't be reviewed well, because it includes the other changes. The specific commit of interest is 50eb91e.

References

In PostgreSQL, it all boils down to the `jsonb[]` type, but arrays are
reflected as `sqlalchemy.dialects.postgresql.ARRAY` instead of
`sqlalchemy.dialects.postgresql.JSONB`.

In order to prepare for more advanced type mangling & validation, and to
better support databases pretending to be compatible with PostgreSQL,
the new test cases exercise arrays with different kinds of inner values,
because, on other databases, ARRAYs may need to have uniform content.

Along the way, it adds a `verify_schema` utility function in the
spirit of the `verify_data` function, refactored and generalized from
the `test_anyof` test case.

Dispose the SQLAlchemy engine object after use within test utility functions.

Within `BasePostgresSDKTests`, new database connections via SQLAlchemy
were not closed, so they kept filling up the connection pool,
eventually saturating it.

Dispose the SQLAlchemy engine object after use within
`PostgresConnector`.

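
A minimal sketch of the dispose-after-use pattern described above. The helper `fetch_scalar` is hypothetical and does not appear in the patch; it only illustrates releasing pooled connections once a short-lived engine is no longer needed.

```python
import sqlalchemy as sa


def fetch_scalar(url: str, statement: str):
    """Run a single statement and release all pooled connections afterwards."""
    engine = sa.create_engine(url)
    try:
        # The context manager returns the connection to the pool ...
        with engine.connect() as connection:
            return connection.execute(sa.text(statement)).scalar()
    finally:
        # ... and `dispose()` closes the pool's connections for good.
        engine.dispose()
```

For example, `fetch_scalar("sqlite://", "SELECT 1")` runs against an in-memory SQLite database and leaves no open connections behind.
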
By wrapping them into a container class `AssertionHelper`, it is easy
to parameterize them, and to provide them to the test functions using
a pytest fixture.

This way, they are reusable from database adapter implementations which
derive from PostgreSQL.

The motivation is that the metadata column prefix `_sdc` needs to be
adjusted for other database systems, because they reject such columns
as being reserved for system purposes. In the specific case of CrateDB,
it is enough to rename it to `__sdc`. Sad but true.
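
A rough sketch of how such a container class could be parameterized. Only the name `AssertionHelper` and the `_sdc` / `__sdc` prefixes come from the commit message; the field and method names below are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssertionHelper:
    """Illustrative container for parameterized test assertions."""

    # Assumed parameter: the prefix used for metadata columns.
    metadata_column_prefix: str = "_sdc"

    def metadata_column(self, name: str) -> str:
        """Build the full metadata column name for the configured prefix."""
        return f"{self.metadata_column_prefix}_{name}"

    def verify_metadata_columns(
        self, column_names, expected=("extracted_at", "deleted_at")
    ):
        """Assert that the expected metadata columns are present."""
        wanted = [self.metadata_column(name) for name in expected]
        missing = [name for name in wanted if name not in column_names]
        assert not missing, f"Missing metadata columns: {missing}"
```

A pytest fixture in a derived adapter's test suite could then return `AssertionHelper(metadata_column_prefix="__sdc")`, while the PostgreSQL tests keep the default.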
@amotl force-pushed the harmonize-sqlalchemy branch 2 times, most recently from 6af82ad to e358dd3, on December 19, 2023 22:05
@edgarrmondragon
Member

It is probably overkill to introduce Ruff just for this, but it can enforce this type of import convention. For example, the Singer SDK enforces the `import typing as t` alias and bans imports such as `from typing import Any`: https://github.com/meltano/sdk/blob/0192d5875345a9d8ab42e81d261084290bd456d0/pyproject.toml#L333-L337

@amotl
Contributor Author

amotl commented Dec 20, 2023

Sure, let's add Ruff on a subsequent iteration.

@edgarrmondragon
Member

@amotl I'm happy to review this if you're still interested in getting it across the finish line 🙂

@amotl
Contributor Author

amotl commented Jan 18, 2024

Hi again. It looks like the patch needs a rebase. If it is indeed independent of the other ones, I will give it a refresh in order to bring it in. My general work around Singer/Meltano stalled a bit due to other obligations, but I am looking forward to getting back to it. Thanks for the reminder.

@amotl
Contributor Author

amotl commented Jan 18, 2024

Ah right. Many of the patches are stacked upon each other, as outlined within the note on the original post. In this case, it is GH-254, which would need to be integrated beforehand. In turn, this one is stacked upon GH-250. So, I probably need to pick up the torch over there, by responding to @sebastianswms's review, and finally wrapping that up.

NB: I have not been able to submit the PRs properly. While I could edit the target branch, I am not able to point them to the proper branches within the repository, because I am not a member of it, so I can't push them directly. On repositories where I am a member, it is easier to designate stacked PRs by just editing the target branch, so it becomes clear where the patch should go. This way, the detail view also reflects the right diff, not including commits from other branches, as you can observe on this one.

Effectively, the relevant commit is just 1f92935, but well, it depends on the others, so there is not much I could do right now. It will most probably not apply to main.
