YAML anchors, aliases and references #285

Open
electriquo opened this issue Jan 5, 2023 · 19 comments
Labels
enhancement New feature or request

Comments

@electriquo

Is your feature request related to a problem? Please describe.
YAML supports anchors, aliases and references, but dasel is unable to read them.

$ echo '
foo: &foo
  one: 1
bar:
  <<: *foo
  two: 2
' | yq -y
foo:
  one: 1
bar:
  one: 1
  two: 2

$ echo '
foo: &foo
  one: 1
bar:
  <<: *foo
  two: 2
' | dasel -r yaml

Describe the solution you'd like
Dasel should be able to render the YAML correctly, resolving anchors and aliases as yq does above.

@electriquo added the enhancement label on Jan 5, 2023
@TomWright
Owner

TomWright commented Jan 24, 2023

In my work on ordered maps I am writing custom logic around yaml encoding/decoding.

I expect I'll be able to resolve this issue when reading, but writing could produce strange results: most likely the alias/merge would be removed and the data duplicated.

The reason is that dasel, outside of encoding/decoding, is unaware of these concepts.

I think I will purposely disallow this for now until I can come up with a proper solution to keep context throughout the lifecycle.

@electriquo
Author

electriquo commented Jan 25, 2023

@TomWright What do you think of solving the issue partially, i.e. implementing only the reading?
This would let users enjoy most of dasel's features.

@TomWright
Owner

Perhaps behind a flag. I don't want people to accidentally remove all aliases from their files.

@electriquo
Author

electriquo commented Jan 25, 2023

> I don't want people to accidentally remove all aliases from their files

I wish dasel would preserve the aliases, anchors, references, order and comments from the input in its output.
Currently this is not the case, so it might be best to stick to the existing convention that dasel emits static/rendered YAML output.

$ echo '
foo: foo
baz: baz
bar: &bar
  dasel: rules
cheese:
  <<: *bar
# comment
' | dasel -r yaml
bar:
  dasel: rules
baz: baz
cheese:
  dasel: rules
foo: foo

With that being said, I don't think dasel should care, since dasel is mostly used to transform/manipulate the input through pipes (as in the snippet above, which is inspired by the docs): piping in and piping out.

@TomWright
Owner

The trouble comes because dasel decodes all input data into a generic format, before processing the data and encoding back into the desired format.

This is how the cross-format transformations are possible.

The issue is that the generic format doesn't know about aliases, references, etc. right now. The ideal solution is obviously to make it aware of them and handle them properly when re-encoding, but that isn't an easy task. That is what I plan to do, but it will take time.

A requirement for this is that all data goes through some more in-depth decoding so I can actually read that information. The plus here is that the same is required for ordered maps and so some of the work will be done.
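
To illustrate why the alias information gets lost in a decode, process, encode pipeline, here is a minimal Ruby sketch. It is not dasel's Go code: it assumes Ruby's bundled Psych and JSON libraries, and `decode_yaml` and `encode_json` are hypothetical helper names. Once the input is decoded into plain hashes, nothing downstream can tell that `bar` was built from an alias.

```ruby
require 'yaml'
require 'json'

# Decode YAML into plain Ruby hashes/arrays (the "generic format" of this sketch).
# aliases: true lets Psych resolve anchors, aliases and merge keys while loading.
def decode_yaml(text)
  YAML.safe_load(text, aliases: true)
end

# Encode the generic data as JSON; this step has no idea an alias ever existed.
def encode_json(data)
  JSON.generate(data)
end

SOURCE = <<~DOC
  foo: &foo
    one: 1
  bar:
    <<: *foo
    two: 2
DOC

generic = decode_yaml(SOURCE)
puts encode_json(generic)
# prints {"foo":{"one":1},"bar":{"one":1,"two":2}}
```

The merged data is duplicated under `bar`, which is the de-referencing behavior discussed in this thread.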

@electriquo
Author

electriquo commented Jan 26, 2023

> Perhaps behind a flag

So from what I see, there is no need to place it behind a flag, as it preserves the current behavior of dasel.

> The ideal solution is to obviously make it aware and deal with it properly when re-encoding but that isn't an easy task.

By the way, I am not aware of any tool that does this.
If dasel ever supports it, this functionality will set it apart from other tools.
Yet, I am unsure how much this functionality would be used.

@electriquo
Author

electriquo commented Feb 5, 2023

@TomWright docker-compose is written in Go and supports anchors, aliases and references. It also does not suffer from things like #278, #294, #245, #293, etc.

Maybe it is worth taking a look at docker-compose and learning from it.

@TomWright
Owner

I am in the middle of reworking the yaml processing dasel does. A good bunch of those issues are solved with the rework, and some of them just aren't yaml processing issues.

All of the number formatting issues are due to the default output formatting of number values in golang, but I am thinking of solutions for those.

The fact that dasel parses into generic types massively complicates things in comparison to something like docker-compose that will be unmarshaling into known structs with expected types.

@electriquo
Author

electriquo commented Feb 7, 2023

@TomWright I switched to a tool of my own which consumes structured files (json, yaml, csv, etc.) and lets you manipulate them easily. It reads the structured files, converts them to json and then applies a given manipulation (selection/transformation). I did not have to come up with any dedicated language for the manipulation, as I used the built-in syntax and commands of the programming language that I used.

I can elaborate more on that, but maybe dasel can benefit from the same approach. I have a feeling it will simplify things.

@TomWright
Owner

@electriquo If you have code to share, I'd be more than happy to take a look.

@electriquo
Author

@TomWright The code is not polished enough to share with the public, but let me share a short snippet:

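$ cat grabber.rb
#!/usr/bin/env ruby

require 'yaml'

file = ARGV[0]
query = ARGV[1]

data = File.read(file)
# aliases: true lets Psych resolve anchors, aliases and merge keys
parsed = YAML.safe_load(data, aliases: true)
# Evaluate the query against the parsed data (eval is unsafe on untrusted input)
result = eval("parsed#{query}")
puts(result.inspect)

$ grabber.rb test.yaml '.keys().map {|e| e.upcase}'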

@pmeier
Contributor

pmeier commented Jun 10, 2024

Any update on this @TomWright? I second @electriquo in that being able to read yaml files that use anchors would already solve 90%+ of the use cases.

For me, I just use dasel for all the config files rather than having to memorize a bunch of tools. Most of the time, I just read data from the files and pass it along to some other program. But the fact that dasel cannot deal with anchors or the like in yaml files means that I still need to have something like yq available.

@TomWright
Owner

Honestly, I'd forgotten about this, but thank you for the reminder. I'll see what I can do.

@pmeier
Contributor

pmeier commented Jun 27, 2024

@TomWright any way I can help with this? If you can point me to the right parts of the source, I could try to send a PR.

@TomWright
Owner

I do have a WIP locally but found the decoder/encoder may need some restructuring. I'm more than happy to accept PRs on the subject.

The code that needs to change is around these areas:

  • Decoder - We need to handle different node types and likely store the node type in Value metadata so we can access it when encoding again.
  • Encoder - May need substantial changes to write according to certain metadata on the value (e.g. is it a reference type or not). We also only pass in an interface{} value to the encoder at this time. This will likely need to accept a Value instead so it has access to metadata.
  • Metadata can be added to a value here
  • Metadata can be read from a value here
  • The actual YAML package we use under the hood is go-yaml/yaml.
  • Some reference code for that package can be found here, however it does look substantially different.

If you've got any questions, please let me know. Apologies that I haven't got this done yet; I've had a lot going on.
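
As a rough illustration of the decoder idea in the list above (recording the node type as metadata on each decoded value), here is a minimal Ruby sketch. It is not dasel's Go code and does not use go-yaml: it walks the parse tree from Ruby's bundled Psych library, and the `Value` struct, its `metadata` keys (`:anchor`, `:alias_of`) and `decode_node` are hypothetical names chosen for this example. Scalars are kept as raw strings and mapping keys are assumed to be scalars.

```ruby
require 'yaml'

# Hypothetical wrapper: decoded data plus metadata about where it came from.
Value = Struct.new(:data, :metadata)

# Walk a Psych parse-tree node and build Value objects, remembering which
# values carried an anchor and which ones were written as an alias.
def decode_node(node, anchors = {})
  case node
  when Psych::Nodes::Alias
    target = anchors.fetch(node.anchor) # anchor must appear before its alias
    Value.new(target.data, { alias_of: node.anchor })
  when Psych::Nodes::Mapping
    value = Value.new({}, { anchor: node.anchor })
    anchors[node.anchor] = value if node.anchor
    # Mapping children are a flat list: key, value, key, value, ...
    node.children.each_slice(2) do |key, child|
      value.data[key.value] = decode_node(child, anchors)
    end
    value
  when Psych::Nodes::Sequence
    value = Value.new([], { anchor: node.anchor })
    anchors[node.anchor] = value if node.anchor
    node.children.each { |child| value.data << decode_node(child, anchors) }
    value
  when Psych::Nodes::Scalar
    value = Value.new(node.value, { anchor: node.anchor })
    anchors[node.anchor] = value if node.anchor
    value
  else
    raise ArgumentError, "unsupported node: #{node.class}"
  end
end

doc = YAML.parse("foo: &foo\n  one: 1\nbar:\n  <<: *foo\n  two: 2\n")
root = decode_node(doc.root)
puts root.data["bar"].data["<<"].metadata.inspect
```

With metadata like this available, an encoder could in principle decide to write `*foo` again instead of duplicating the data, which is the part that remains hard.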

@pmeier
Contributor

pmeier commented Jun 27, 2024

Thanks for the update! Some questions:

  • Do we need to change the encoder for the first version? IIUC, this is for writing YAML with anchors. Are you ok with only handling the read case first?
  • Do we need a new node type or could it be sufficient to track nodes that have an anchor like foo: &foo in a map and just do a look-up and insert whenever we encounter a reference like <<: *foo?

So for the first version I would propose dasel -f foo.yaml -w yaml converting

foo: &foo
  bar: 1
  baz: "baz"

spam:
  ham: "eggs"
  <<: *foo

into

foo:
  bar: 1
  baz: "baz"

spam:
  ham: "eggs"
  foo:
    bar: 1
    baz: "baz"

Basically just resolving the anchors instead of keeping track of them.

That would enable the read use case as I can then do something like

$ dasel -f foo.yaml '.spam.foo' -w json
{
  "bar": 1,
  "baz": "baz"
}
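
The look-up idea (track anchored nodes in a map, insert them when a reference is encountered) can be sketched in a few lines of Ruby. This is only an illustration, not dasel's Go implementation: it walks the parse tree from Ruby's bundled Psych library, `resolve` and `resolve_mapping` are hypothetical names, and scalars are left as raw strings. One assumption to flag: `<<` is treated as the usual YAML merge key, so merged keys land directly in the mapping (as in the yq output at the top of this issue) rather than under a nested `foo` key.

```ruby
require 'yaml'

# Resolve a Psych parse-tree node into plain Ruby data, replacing each alias
# with the value that was recorded for its anchor.
def resolve(node, anchors = {})
  return anchors.fetch(node.anchor) if node.is_a?(Psych::Nodes::Alias)

  value =
    case node
    when Psych::Nodes::Mapping  then resolve_mapping(node, anchors)
    when Psych::Nodes::Sequence then node.children.map { |child| resolve(child, anchors) }
    when Psych::Nodes::Scalar   then node.value # raw string, no type conversion
    else raise ArgumentError, "unsupported node: #{node.class}"
    end
  anchors[node.anchor] = value if node.anchor
  value
end

# Build a hash from a mapping node; `<<` entries are merged in, and keys
# written explicitly in the mapping take precedence over merged ones.
def resolve_mapping(node, anchors)
  own = {}
  merged = {}
  node.children.each_slice(2) do |key, child|
    if key.is_a?(Psych::Nodes::Scalar) && key.value == '<<'
      sources = resolve(child, anchors)
      sources = [sources] unless sources.is_a?(Array)
      # Earlier merge sources win over later ones.
      sources.each { |source| merged = source.merge(merged) }
    else
      own[key.value] = resolve(child, anchors)
    end
  end
  merged.merge(own)
end

doc = YAML.parse("foo: &foo\n  bar: 1\n  baz: \"baz\"\n\nspam:\n  ham: \"eggs\"\n  <<: *foo\n")
p resolve(doc.root)["spam"]
```

Here `spam` ends up with the keys `bar`, `baz` and `ham` side by side; no alias survives in the result, so nothing outside the decoder needs to know about anchors.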

@TomWright
Owner

Good question. Decoding only is a good first step.

We'll need to handle the new node type within the decoder, but all of the values end up being written to a reflect value, so nothing outside of the decoder needs to be aware of it.

@TomWright
Owner

As of v2.8.0 dasel supports this feature when reading. Note that as of now, writes will de-reference the aliases.

I'm releasing as-is to unblock read use-cases - these never worked before anyway because of unhandled yaml tags so I'm not worried about breaking changes.

Issue will remain open as writes are not yet handled.

@pmeier
Contributor

pmeier commented Jul 1, 2024

Just for completeness: my original patch was buggy. The fix, and thus the proper reading behavior, is only available from v2.8.1 onward.
