fix: custom query row validation failing when SQL contains upper cased columns #994
We found this issue on Oracle, where column names default to upper case, but it can easily be reproduced on other engines by writing SQL with upper-cased columns. The column case in the actual SQL engine is irrelevant; only the SQL we pass in matters.
The problem was only seen for custom query validation using `--comparison-fields`. When using `--hash`, the calculated fields resulted in lower case end column names.

I added tests to protect from regressions for Oracle, SQL Server, Teradata vs BigQuery. The fix is global so it should work on all engines, but I was reluctant to add too many tests seeing as integration tests are already pretty slow.
The fix was very minor: in `data_validation/clients.py` I convert the Ibis schema to have lower case columns:
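The change itself is not reproduced in this description. As a rough illustration of the idea only, here is a minimal sketch that lower-cases column names while leaving their types untouched. It uses plain Python lists in place of an Ibis schema, and the helper name `lower_case_schema` is hypothetical, not the function in `data_validation/clients.py`.

```python
def lower_case_schema(names, types):
    """Return (names, types) with every column name lower-cased.

    Hypothetical helper: plain lists stand in for an Ibis schema's
    column names and types. Types are passed through unchanged.
    """
    lowered = [name.lower() for name in names]
    # Guard against two columns that differ only by case, which would
    # collide after normalisation.
    if len(set(lowered)) != len(lowered):
        raise ValueError("column names collide once lower-cased")
    return lowered, list(types)


# Columns as they might come back from SQL written with upper-cased names.
names, types = lower_case_schema(["ID", "Col_A"], ["int64", "string"])
schema = dict(zip(names, types))
```

With the names normalised this way, lower-case field names passed to `--comparison-fields` can be matched against the query's columns regardless of the case used in the SQL.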