Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace Reader#attribute_nodes with a rough equivalent #3102

Open
flavorjones opened this issue Jan 18, 2024 · 3 comments
Open

Replace Reader#attribute_nodes with a rough equivalent #3102

flavorjones opened this issue Jan 18, 2024 · 3 comments
Milestone

Comments

@flavorjones
Copy link
Member

Context

Issue #2598 describes at length an incompatibility between Ruby's compacting garbage collector and libxml2's xmlTextReaderExpand which re-uses C structs previously returned. The solution chosen and implemented in #2599 was to deprecate Reader#attribute_nodes. This method was removed entirely in Nokogiri v1.16.0.

In #2599 I acknowledged that there may be use cases the needed the semantics of #attribute_nodes (e.g., returning all attributes) and for which #attribute_hash may not suffice (because the non-namespaced hash keys used may collide). I intentionally deferred introducing a replacement until/unless we had a real-world use case that could drive out the API design.

In #2599 (comment), @ccutrer brought a real-world use case: https://github.com/instructure/ratom/blob/v0.10.11/lib/atom/xml/parser.rb#L103

So let's do this!

API design

I would like to play with a few options for retrieving the namespaced attributes from a Reader cursor.

Some ideas:

  1. Mimic Node#attribute_nodes with a new Reader#attribute_nodes and return an Array<XML::Attr>. This might be a challenging implementation because we'll be re-complicating the XML::Attr lifecycle again, but it's worth attempting to see if something simple like an ephemeral XML::Document would work.
  2. Mirror Reader#attribute_hash with Reader#namespaced_attribute_hash which returns a Hash<String => String> in which the hash key is the attribute name including the namespace prefix if any, and the hash value is the attribute value.

@ccutrer I'm curious what you think the API might look like?

Cleanup

While we're here, we should make sure to delete the additional GC lifecycle handling in xml_node.c for attributes. I deleted the #attribute_nodes method but didn't go further than that.

@ccutrer
Copy link
Contributor

ccutrer commented Jan 18, 2024

#3 is fine with me, as long as I also have any easy way to get from the namespace prefix to the namespace's href. Though maybe the key should be a pair of strings? So I don't have to split on ":" immediately to get back to the prefix itself? Or maybe not. Gotta consider attributes without a namespace. The ratom code actually re-combines it and joins with ":" for one of its branches.

@flavorjones
Copy link
Member Author

@ccutrer I'm not sure I understand - you say #3 but I only listed two ideas?

@ccutrer
Copy link
Contributor

ccutrer commented Jan 22, 2024

It was meant as a "or an alternative you didn't mention." Sorry for any confusion

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants