fix: custom query row validation failing when SQL contains upper cased columns #994

nj1973 · 2023-09-19T17:15:14Z

We found this issue on Oracle, where columns default to upper case, but it can easily be reproduced on other engines by writing SQL with upper-cased columns. The column case in the actual SQL engine was irrelevant; only the case used in the SQL we pass in mattered.

The problem was only seen for custom query validation using --comparison-fields. When using --hash, the calculated fields resulted in lower-case final column names.

I added tests to protect from regressions for Oracle, SQL Server, Teradata vs BigQuery. The fix is global so it should work on all engines, but I was reluctant to add too many tests because the integration tests are already pretty slow.

The fix was very minor: in data_validation/clients.py I convert the Ibis schema to have lower-case columns:

-    return client.sql(query)
+    iq = client.sql(query)
+    # Normalise all columns in the query to lower case.
+    # https://github.com/GoogleCloudPlatform/professional-services-data-validator/issues/992
+    iq = iq.relabel(dict(zip(iq.columns, [_.lower() for _ in iq.columns])))
+    return iq
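To make the mapping in the diff concrete, here is a minimal plain-Python sketch of the dictionary that the changed line builds before handing it to `relabel`. It does not touch Ibis or any database. The helper name `lowercase_mapping` and the sample column names are hypothetical and are not part of the PR.

```python
def lowercase_mapping(columns):
    """Build an {original_name: lower_cased_name} mapping.

    Hypothetical helper: it mirrors the expression used in the diff,
    dict(zip(iq.columns, [_.lower() for _ in iq.columns])).
    """
    return dict(zip(columns, [c.lower() for c in columns]))


# Sample column names as they might come back from a query written
# with upper- or mixed-case projections (illustrative values only).
query_columns = ["ID", "Customer_Name", "created_at"]

mapping = lowercase_mapping(query_columns)
print(mapping)
# {'ID': 'id', 'Customer_Name': 'customer_name', 'created_at': 'created_at'}
```

Columns that are already lower case map to themselves, so the rename leaves them unchanged.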

nj1973 · 2023-09-19T17:15:26Z

/gcbrun

nehanene15

LGTM!

nj1973 added 4 commits September 18, 2023 11:15

fix: Lowercase column names from custom query cursor description to match table equivalent which come back from Ibis in lower case

963b370

Revert "fix: Lowercase column names from custom query cursor description to match table equivalent which come back from Ibis in lower case" This reverts commit 963b370.

0d19878

fix: Lowercase column names from custom query schema to match table equivalent which comes back from Ibis in lower case

0ea2e53

tests: Add tests of custom-queries with mixed case in query projections

95f0fe7

nj1973 requested a review from a team as a code owner September 19, 2023 17:15

nj1973 linked an issue Sep 19, 2023 that may be closed by this pull request

custom query row validation has inconsistent column case to table validation #992

Closed

nehanene15 approved these changes Sep 20, 2023

nj1973 merged commit a9fed41 into develop Sep 20, 2023
5 checks passed

nj1973 deleted the fix/custom-query-row-column-case branch September 20, 2023 14:45

release-please bot mentioned this pull request Sep 20, 2023

chore(develop): release 4.2.0 #955

Merged
