In issue #1337 we added support for customers to define a custom endpoint. This was tested by the requesting customer and confirmed as working. However, as they have continued their testing they have run into further issues.
Example (sanitised) command:
$ data-validation -v --log-level DEBUG validate column -sc bq_conn -tc bq_conn \
-tbls OWN.TAB=OWN.TAB --filters 'TS_COL > TIMESTAMP(CURRENT_DATE())'
...
11/26/2024 01:23:36 PM-DEBUG: Starting new HTTPS connection (1): bigquery-priv.p.googleapis.com:443
11/26/2024 01:23:36 PM-DEBUG: https://bigquery-priv.p.googleapis.com:443 "GET /bigquery/v2/projects/proj/datasets/OWN/tables/TAB?prettyPrint=false HTTP/11" 200 None
11/26/2024 01:23:36 PM-DEBUG: https://bigquery-priv.p.googleapis.com:443 "GET /bigquery/v2/projects/proj/datasets/OWN/tables/TAB?prettyPrint=false HTTP/11" 200 None
11/26/2024 01:23:36 PM-INFO: {'data_client': <ibis.backends.bigquery.Backend object at 0x7f4a5d49da90>, 'schema_name': 'OWN', 'table_name': 'TAB', 'source_query': None}
11/26/2024 01:23:36 PM-INFO: -- ** Source Query ** --
11/26/2024 01:23:36 PM-INFO: SELECT count(1) AS `count`
FROM `proj.OWN.TAB` t0
WHERE TS_COL > TIMESTAMP(CURRENT_DATE())
11/26/2024 01:23:36 PM-DEBUG: Converted retries value: 3 -> Retry(total=3, connect=None, read=None, redirect=None, status=None)
... same as above but for target connection ...
11/26/2024 01:24:03 PM-DEBUG: Retrying due to 503 failed to connect to all addresses; last error: UNKNOWN: ipv4:172.x.y.z:443: Failed to connect to remote host: Timeout occurred: FD Shutdown, sleeping 0.0s ...
11/26/2024 01:24:03 PM-DEBUG: Retrying due to 503 failed to connect to all addresses; last error: UNKNOWN: ipv4:142.x.y.z:443: Failed to connect to remote host: Timeout occurred: FD Shutdown, sleeping 0.1s ...
11/26/2024 01:24:03 PM-DEBUG: Retrying due to 503 failed to connect to all addresses; last error: UNKNOWN: ipv4:172.x.y.z:443: Failed to connect to remote host: Timeout occurred: FD Shutdown, sleeping 0.0s …
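In the log above we can see some successful interactions with the correct private endpoint, but DVT then starts trying to access blocked IPs when executing the query. By contrast, a simple query through the same connection works: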
$ data-validation -v --log-level DEBUG query --conn bq_conn --query 'SELECT count(1) AS `count` FROM `proj.O_TEST.T1` t0'
...
[(1,)]
As does this:
$ data-validation -v --log-level DEBUG query --conn bq_conn --query 'SELECT count(1) AS `count` FROM `proj.OWN.TAB` where TS_COL > TIMESTAMP(CURRENT_DATE()) '
...
[(349893,)]
The second query above is the same query that DVT generated in the failing command. The only difference appears to be that one is run via data-validation query and the other via data-validation validate column.
We are missing an override for the BigQuery Storage API endpoint.
data-validation query does not use the Storage API. With data-validation validate column, we pass through some Ibis to_arrow code, which sends us down a different path that interacts with the Storage API.
In Ibis v6 and upwards we have the option of passing in a BigQuery client and a BigQuery Storage API client, which has two benefits:
We would be able to easily fix this problem.
We could avoid the monkey patching we currently do for the BigQuery client, which ends up making two connections: the standard Ibis one and then our custom one, which overrides the original.
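For reference, here is a minimal sketch of what the Ibis v6+ approach might look like. It is not DVT code: the helper names (build_client_options, connect_with_private_endpoints) and the Storage API endpoint shown in the usage note are hypothetical, and it assumes both Google clients accept an api_endpoint entry in client_options and that ibis.bigquery.connect takes client and storage_client arguments as described above.

```python
from typing import Dict, Optional, Tuple


def build_client_options(
    api_endpoint: Optional[str],
    storage_api_endpoint: Optional[str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return client_options dicts for the BigQuery and Storage API clients.

    Hypothetical helper: an empty dict means "use the default public endpoint".
    """
    bq_options = {"api_endpoint": api_endpoint} if api_endpoint else {}
    storage_options = (
        {"api_endpoint": storage_api_endpoint} if storage_api_endpoint else {}
    )
    return bq_options, storage_options


def connect_with_private_endpoints(
    project_id: str,
    api_endpoint: Optional[str] = None,
    storage_api_endpoint: Optional[str] = None,
):
    """Build both Google clients explicitly and hand them to Ibis (v6+ assumed)."""
    # Imported lazily so the option-building helper can be used without
    # the Google Cloud libraries or Ibis installed.
    import ibis
    from google.cloud import bigquery, bigquery_storage

    bq_options, storage_options = build_client_options(
        api_endpoint, storage_api_endpoint
    )
    # Both clients fall back to application default credentials here.
    client = bigquery.Client(
        project=project_id, client_options=bq_options or None
    )
    storage_client = bigquery_storage.BigQueryReadClient(
        client_options=storage_options or None
    )
    # Assumption: the Ibis BigQuery backend accepts pre-built clients,
    # so no monkey patching or second connection is needed.
    return ibis.bigquery.connect(
        project_id=project_id,
        client=client,
        storage_client=storage_client,
    )
```

A call might look like connect_with_private_endpoints("proj", "https://bigquery-priv.p.googleapis.com", "bigquerystorage-priv.p.googleapis.com"), where the second endpoint name is only a guess at what the customer's private Storage API endpoint would be called.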
Upgrading Ibis is non-trivial, though, so I need to consider other options.