Request to add an option to store validation results in a file rather than on stdout #1357

Open
nj1973 opened this issue Nov 28, 2024 · 2 comments
Labels
priority: p1 (High priority. Fix may be included in the next release.)
type: feature request ('Nice-to-have' improvement, new feature or different behavior or design.)

Comments


nj1973 commented Nov 28, 2024

Some customers do not have access to BigQuery or choose not to use the service. When running DVT via a service like Cloud Run, they have no easy way to capture the validation results.

We could add a file-based alternative to --bq-result-handler, perhaps --file-result-handler or --text-result-handler, that writes results to a file path instead of stdout. The file path should support cloud storage URIs.

We should give some thought to the option value format.

Is it as simple as a path string, or should we also revamp how the format is supplied by accepting a JSON value? For example:

--file-result-handler='{"path": "gs://some-uri", "format": "csv", "mode": "overwrite"}'

And revamp --bq-result-handler in a similar way (perhaps deprecating --service-account while we are at it):

--bq-result-handler='{"project": "my-project", "table": "my_dataset.results_table", "service-account": null}'
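
As a rough sketch of how such an option value could be interpreted, the Python below accepts either a bare path string or a JSON object. The function name, the defaults and the sets of allowed formats and modes are assumptions made for illustration; none of them come from DVT.

```python
import json

# Hypothetical allowed values; the proposal does not define them.
ALLOWED_FORMATS = {"csv", "json"}
ALLOWED_MODES = {"overwrite", "append"}


def parse_file_result_handler(value: str) -> dict:
    """Parse an option value that is either a bare path or a JSON object."""
    value = value.strip()
    if value.startswith("{"):
        # JSON form: {"path": ..., "format": ..., "mode": ...}
        config = json.loads(value)
    else:
        # Simple form: the whole value is the output path.
        config = {"path": value}

    if not config.get("path"):
        raise ValueError("'path' is required")

    # Assumed defaults for attributes that were not supplied.
    config.setdefault("format", "csv")
    config.setdefault("mode", "overwrite")

    if config["format"] not in ALLOWED_FORMATS:
        raise ValueError(f"Unsupported format: {config['format']}")
    if config["mode"] not in ALLOWED_MODES:
        raise ValueError(f"Unsupported mode: {config['mode']}")
    return config
```

Accepting both forms would keep the simple case short (a plain path) while leaving room for extra attributes later.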
nj1973 added the type: feature request label Nov 28, 2024

nj1973 commented Nov 29, 2024

Noting that we also have issue #1275, which requests the ability to control the fields output to CSV. If we do go with a JSON-based file or text result handler option, then issue #1275 may be able to build on top of that by adding a columns attribute.
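
To illustrate how a columns attribute might restrict CSV output, here is a minimal sketch using only the standard library. The helper name and the column names in the usage note are placeholders, not an existing DVT interface.

```python
import csv
import io


def rows_to_csv(rows, columns):
    """Render dict rows as CSV text, keeping only the requested columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        extrasaction="ignore",  # silently drop keys not listed in columns
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

For example, calling rows_to_csv(rows, ["validation_name", "validation_status"]) would emit only those two fields for each row, in that order.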

sundar-mudupalli-work commented

Hi,

When jobs are run with Cloud Run and BigQuery is not used to capture output, the console output can be viewed using the gcloud CLI as follows:

gcloud beta run jobs logs read <job-name>
gcloud beta run jobs executions logs read <execution-id>

Each output line is printed along with its timestamp, so you may have to remove the timestamps to get clean output.
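
A small filter along these lines could strip the timestamps. It assumes each log line starts with a prefix shaped like "YYYY-MM-DD HH:MM:SS"; that format is an assumption and should be checked against the actual gcloud output before relying on it.

```python
import re

# Assumed prefix shape: date, a space or "T", time, optional fraction and "Z".
TIMESTAMP_PREFIX = re.compile(
    r"^\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}(?:\.\d+)?Z?\s+"
)


def strip_timestamp(line: str) -> str:
    """Remove a leading timestamp from one log line, if one is present."""
    return TIMESTAMP_PREFIX.sub("", line, count=1)


def clean(lines):
    """Yield each log line with its leading timestamp removed."""
    for line in lines:
        yield strip_timestamp(line)
```

The output of the gcloud command could be piped into a script that passes sys.stdin to clean() and writes the result to a file.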

If a specific customer has this need and cannot use the gcloud command, or it does not work for them, let us create an approach that does.

Sundar Mudupalli

helensilva14 added the priority: p1 label Dec 2, 2024
nj1973 self-assigned this Dec 3, 2024