system(command; args) operator #2048

Open
CosmicToast opened this issue May 27, 2024 · 4 comments

@CosmicToast

Please describe your feature request.
I wish I could use yq to transform keys arbitrarily using external filters.

Describe the solution you'd like
If we have data1.yml like:

country: Australia

And we run a command:

yq '.country = system("/usr/bin/echo"; "test")'

it could output

country: test

system could also pipe its input to the command as text via stdin.
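To make the proposed semantics concrete, here is a minimal Python sketch of how such an operator might behave. It is only a model of the request, not yq code: the `system` function, its argument names, and the choice to strip the trailing newline are all assumptions, and the Python interpreter stands in for `/usr/bin/echo` so the example runs anywhere.

```python
import subprocess
import sys


def system(command, args=(), stdin_text=""):
    """Hypothetical model of the proposed operator.

    Runs `command` with `args`, feeds `stdin_text` to it on stdin, and
    returns its stdout with the trailing newline removed (an assumption).
    """
    result = subprocess.run(
        [command, *args],
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,  # raise if the command exits with a non-zero status
    )
    return result.stdout.rstrip("\n")


# Mirrors `.country = system("/usr/bin/echo"; "test")` from the request,
# with the Python interpreter as a portable stand-in for /usr/bin/echo.
doc = {"country": "Australia"}
doc["country"] = system(sys.executable, ["-c", "print('test')"])
print(doc)
```

The stdin variant would pass the current value as `stdin_text` and let the external command transform it.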

Describe alternatives you've considered
Some tasks simply require additional processing, which this makes possible.
The alternatives involve doing said processing in a different language (such as JS), pulling in the relevant parsers (e.g. YAML), and running the modifications there.
It is possible to perform most of the edits in yq (or similar tool, such as jq), and then perform the final transformation steps in this separate tool.

Additional context
This is essentially the same request/proposal as jq#1614.
It allows for the same use-cases.
It appears that the jq PR implementing this (among others) is stalled, so this could be an additional reason to use yq over jq, when applicable.

I tried searching for this feature request prior to submission, but my searches either returned nothing comparable, or pages upon pages of (seemingly unrelated) results.
If I missed one, please do close this as a duplicate.

@mikefarah
Owner

I'm not convinced this would be a great idea, security-wise. You can get the same outcome by using environment variables and passing them to yq.

@CosmicToast
Author

That’s not really true; the purpose is to process yq inputs using external tools.
The closest alternative would be to print out the data to be processed, pipe it into a separate script, and then pipe the result into yet another process that accepts this data on top of the original data.
If you want some real world examples of what types of processing this enables, please let me know.

@mikefarah
Owner

Ah, I think I see what you mean: you could use yq to actually perform operations, kind of like you'd do with find in bash with the exec arg. There was another discussion about passing user-controlled expressions to yq (#1961). I did say it was not safe to do so, but it still wouldn't surprise me if people are doing that. Adding this would increase the vulnerability exposure. It could be done with an extra flag passed in, like that jq thread was proposing. Interested to hear other thoughts as well...

@CosmicToast
Author

Sorry for the delay. Here are some example use-cases as encountered in practice (by me personally) in the last couple of weeks (I've been using the patched jq with yq just handling conversion, as I've already submitted a jq patch).

Data Sanitization

Imagine you're working with computer-generated data (for instance, OCR / ASR / etc).
You need to transform the output of the computer into something that's going to be used for EDI.
You can do that with jq/yq as-is, but you also need to do some post-processing.
For example, the machine output might contain odd spaces due to recognition faults, or you might want to turn "thirty three" into "33".
You have a tool to perform this processing, but it's not written in jq/yq.

With system, you can just pipe the data to process into the tool and get the data back out.
This is now just a step in the processing pipeline that calls out externally.

Without this, you'd need to canonicalize the order of the data, extract the parts you want to handle, process them separately, generate the replaced structure, merge the two parts, and then undo the canonicalization (if needed).
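As a rough illustration of the "just a step in the pipeline" idea, the Python sketch below pipes every string value in a document through an external filter in place, so no extract/merge round trip is needed. Everything here is hypothetical: `CLEANUP_TOOL` is a stand-in for the real post-processing tool (it only collapses whitespace), and `run_filter`, `map_strings`, and the sample record are made up for the example.

```python
import subprocess
import sys

# Hypothetical stand-in for the external cleanup tool: it reads text on
# stdin and prints it with runs of whitespace collapsed to single spaces.
CLEANUP_TOOL = [
    sys.executable,
    "-c",
    "import sys; print(' '.join(sys.stdin.read().split()))",
]


def run_filter(argv, text):
    """Pipe `text` into the external command and return its stdout."""
    result = subprocess.run(
        argv, input=text, capture_output=True, text=True, check=True
    )
    return result.stdout.rstrip("\n")


def map_strings(node, fn):
    """Apply `fn` to every string value, keeping structure and key order."""
    if isinstance(node, dict):
        return {key: map_strings(value, fn) for key, value in node.items()}
    if isinstance(node, list):
        return [map_strings(value, fn) for value in node]
    if isinstance(node, str):
        return fn(node)
    return node  # numbers, booleans, and None pass through unchanged


record = {"name": "ACME   Corp", "lines": [{"desc": "blue  widget ", "qty": 3}]}
cleaned = map_strings(record, lambda s: run_filter(CLEANUP_TOOL, s))
print(cleaned)
```

This spawns one process per string, which is fine for a sketch but would likely need batching for large inputs.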

API Traversal

jq and yq are fairly good for consuming APIs, since those are typically JSON-native.
However, oftentimes multiple steps are needed.
Instead of dropping back to the shell regularly, it is often easier to simply integrate curl or similar.
This way it's possible to, for example, aggregate a paginated API, calculate the necessary submission, and then perform it.

Sanity Checking

Sometimes, data is entered incorrectly. Correcting it also obviously requires re-verification. It's much easier to write a small helper that checks validity and corrects it if needed, and then use jq/yq to essentially map against it.
