system(command; args) operator #2048

CosmicToast · 2024-05-27T11:11:49Z

Please describe your feature request.
I wish I could use yq to transform keys arbitrarily using external filters.

Describe the solution you'd like
If we have data1.yml like:

country: Australia

And we run a command:

yq '.country = system("/usr/bin/echo"; "test")'

it could output

country: test

system may also pipe inputs as text via stdin.
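
To make the proposed behaviour concrete, here is a minimal Python sketch of what such an operator could do. It is not yq code: the function name `system`, its signature, and the choice to strip the trailing newline are all assumptions made for illustration, and `echo` and `tr` are assumed to be on the `PATH`.

```python
import subprocess


def system(command, args=(), stdin_text=None):
    """Hypothetical model of the proposed operator: run `command` with
    `args`, optionally feed `stdin_text` on stdin, return stdout as text."""
    result = subprocess.run(
        [command, *args],   # argument list, so no shell is involved
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,         # raise CalledProcessError on a non-zero exit
    )
    # Assumption: drop the trailing newline that most tools print.
    return result.stdout.rstrip("\n")


doc = {"country": "Australia"}

# Rough equivalent of: .country = system("echo"; "test")
doc["country"] = system("echo", ["test"])
print(doc)

# Piping a value through an external filter via stdin.
doc["country"] = system("tr", ["a-z", "A-Z"], stdin_text="Australia")
print(doc)
```

Passing the command and its arguments as a list, rather than as one shell string, keeps argument values from being re-parsed by a shell.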

Describe alternatives you've considered
Some tasks simply require additional processing, which this makes possible.
The alternatives involve doing said processing in a different language (such as JS), pulling in the required parsers (e.g. YAML), and running the modifications there.
It is possible to perform most of the edits in yq (or a similar tool, such as jq), and then perform the final transformation steps in this separate tool.

Additional context
This is essentially the same request/proposal as jq#1614.
It allows for the same use-cases.
It appears that the jq PR implementing this (among others) is stalled, so this could be an additional reason to use yq over jq, when applicable.

I tried searching for this feature request prior to submission, but my searches either returned nothing comparable, or pages upon pages of (seemingly unrelated) results.
If I missed one, please do close this as a duplicate.

mikefarah · 2024-06-15T06:16:46Z

I'm not convinced this would be a great idea, security wise. You can get the same outcome by using environment variables and passing that to yq.

CosmicToast · 2024-06-15T07:23:56Z

That's not really true: the purpose is to process yq inputs using external tools.
The closest alternative would be to print the data to be processed, pipe it into a separate script, and then pipe that script's output into yet another process that merges the result back into the original data.
If you want some real world examples of what types of processing this enables, please let me know.

mikefarah · 2024-06-16T00:53:05Z

Ah, I think I see what you mean: you could use yq to actually perform operations, kind of like you'd do with find's -exec argument in bash. There was another discussion about passing user-controlled expressions to yq (#1961). I did say it was not safe to do so, but it still wouldn't surprise me if people are doing that. Adding this would increase the vulnerability exposure. It could be done behind an extra flag, like that jq thread was proposing. Interested to hear other thoughts as well...
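
The "extra flag" idea can be sketched as an explicit opt-in gate. This is a hypothetical Python illustration only: `make_system`, `allow_system`, and `SystemDisabledError` are made-up names, and nothing here reflects an existing yq or jq flag.

```python
import subprocess


class SystemDisabledError(RuntimeError):
    """Raised when an expression tries to run a command without opt-in."""


def make_system(allow_system=False):
    """Return a `system` function that only executes when explicitly enabled."""

    def system(command, args=()):
        if not allow_system:
            raise SystemDisabledError(
                "external commands are disabled; opt in explicitly to enable them"
            )
        result = subprocess.run(
            [command, *args], capture_output=True, text=True, check=True
        )
        return result.stdout.rstrip("\n")

    return system


safe = make_system()                      # default: expressions cannot run commands
trusted = make_system(allow_system=True)  # caller has explicitly opted in

try:
    safe("echo", ["test"])
except SystemDisabledError as err:
    print("refused:", err)

print(trusted("echo", ["test"]))
```

With the default off, an untrusted expression that reaches the operator fails loudly instead of executing anything.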

CosmicToast · 2024-06-25T11:06:59Z

Sorry for the delay. Here are some example use-cases I've personally encountered in practice over the last couple of weeks. (I've been using the patched jq, with yq just handling conversion, as I've already submitted a jq patch.)

Data Sanitization

Imagine you're working with computer-generated data (for instance, OCR / ASR / etc).
You need to transform the output of the computer into something that's going to be used for EDI.
You can do that with jq/yq as-is, but you also need to do some post-processing.
For example, the machine might have some odd spaces due to recognition faults, or you might want to turn "thirty three" into "33".
You have a tool to perform this processing, but it's not written in jq/yq.

With system, you can just pipe the data to process into the tool and get the data back out.
This is now just a step in the processing pipeline that calls out externally.

Without this, you'd need to canonicalize the order of the data, extract the parts you want to process, process them separately, generate the replacement structure, merge the two parts, and then undo the canonicalization (if needed).
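
By contrast, when the external call is just a mapping step, structure and key order are preserved without any extract-and-merge bookkeeping. The following Python sketch illustrates that shape under stated assumptions: `run_filter` and `map_strings` are hypothetical helpers, the sample record is invented, and `tr -s ' '` merely stands in for a real cleanup tool.

```python
import subprocess


def run_filter(command, args, text):
    """Pipe `text` through an external program and return its stdout."""
    result = subprocess.run(
        [command, *args], input=text, capture_output=True, text=True, check=True
    )
    return result.stdout.rstrip("\n")


def map_strings(node, fn):
    """Apply `fn` to every string leaf, preserving structure and key order."""
    if isinstance(node, dict):
        return {key: map_strings(value, fn) for key, value in node.items()}
    if isinstance(node, list):
        return [map_strings(item, fn) for item in node]
    if isinstance(node, str):
        return fn(node)
    return node  # numbers, booleans, and None pass through unchanged


# Invented stand-in for machine-generated data with stray runs of spaces.
record = {
    "name": "ACME   Pty  Ltd",
    "lines": [{"item": "blue    widget", "qty": 33}],
}

# `tr -s ' '` squeezes repeated spaces into one.
cleaned = map_strings(record, lambda s: run_filter("tr", ["-s", " "], s))
print(cleaned)
```

This spawns one process per string, which is simple but slow on large documents; batching the values into a single call would be a natural refinement.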

API Traversal

jq and yq are fairly good for consuming APIs, since those are typically JSON-native.
However, oftentimes multiple steps are needed.
Instead of dropping back to the shell regularly, it is often easier to simply integrate curl or similar.
This way it's possible to, for example, aggregate a paginated API, calculate the necessary submission, and then perform it.

Sanity Checking

Sometimes, data is entered incorrectly. Correcting it also obviously requires re-verification. It's much easier to write a small helper that checks validity and corrects it if needed, and then use jq/yq to essentially map against it.

CosmicToast added enhancement v4 labels May 27, 2024
