Missing AttributeNode prefix and namespace URI #108

Open
fgateuil opened this issue Jun 29, 2023 · 2 comments

@fgateuil

Hi,

I'm trying to find attribute values within an XML document, but the returned data seems wrong.

Description

When I query an XML document for a namespaced attribute (for instance //@xlink:href), the returned xmlquery.Node has an empty prefix and an empty namespace URI.

Steps to reproduce

package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/xmlquery"
)

func main() {
	xml := `<?xml version="1.0"?>
<root xmlns:xlink="http://www.w3.org/1999/xlink">
	<node xlink:href="http://www.github.com">Some text...</node>
</root>`

	root, _ := xmlquery.Parse(strings.NewReader(xml))
	node, _ := xmlquery.Query(root, "//@xlink:href")
	fmt.Println("NamespaceURI:", node.NamespaceURI)
	fmt.Println("Prefix:", node.Prefix)
	fmt.Println("Data:", node.Data)
}

Expected result

NamespaceURI: http://www.w3.org/1999/xlink
Prefix: xlink
Data: href

Actual result

NamespaceURI:
Prefix:
Data: href

Solution proposal

In github.com/antchfx/xmlquery/query.go#getCurrentNode:

func getCurrentNode(it *xpath.NodeIterator) *Node {
	n := it.Current().(*NodeNavigator)
	if n.NodeType() == xpath.AttributeNode {
		childNode := &Node{
			Type: TextNode,
			Data: n.Value(),
		}
		return &Node{
			Parent:       n.curr,
			Type:         AttributeNode,
			// START MODIFICATION
			NamespaceURI: n.NamespaceURL(),
			Prefix:       n.Prefix(),
			// END MODIFICATION
			Data:         n.LocalName(),
			FirstChild:   childNode,
			LastChild:    childNode,
		}
	}
	return n.curr
}

Additional information

If it turns out that I simply misused the library, what is the correct way to do this?
My main use case is as follows:

  • find all the @xlink:href attributes in the document;
  • reset the attribute value to another value.
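
As a point of comparison for the "find" half of this use case, here is a minimal sketch that uses only the standard library's encoding/xml rather than xmlquery. It is a swapped-in technique, not the library's API: `findAttrs` and `attrHit` are hypothetical names introduced for illustration. It relies on `Decoder.Token` resolving prefixes, so attributes are matched by namespace URI plus local name instead of by prefix.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// attrHit (hypothetical helper type) records one matching attribute and
// the local name of the element that carries it.
type attrHit struct {
	Element string
	Value   string
}

// findAttrs (hypothetical helper) scans doc and returns every attribute
// whose resolved namespace URI and local name match. With Decoder.Token,
// Name.Space holds the namespace URI, not the prefix.
func findAttrs(doc, nsURI, local string) ([]attrHit, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	var hits []attrHit
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return hits, nil
		}
		if err != nil {
			return nil, err
		}
		start, ok := tok.(xml.StartElement)
		if !ok {
			continue // only start tags carry attributes
		}
		for _, a := range start.Attr {
			if a.Name.Space == nsURI && a.Name.Local == local {
				hits = append(hits, attrHit{Element: start.Name.Local, Value: a.Value})
			}
		}
	}
}

func main() {
	doc := `<?xml version="1.0"?>
<root xmlns:xlink="http://www.w3.org/1999/xlink">
	<node xlink:href="http://www.github.com">Some text...</node>
</root>`

	hits, err := findAttrs(doc, "http://www.w3.org/1999/xlink", "href")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, h := range hits {
		fmt.Printf("%s: %s\n", h.Element, h.Value)
	}
}
```

This sketch only reads the document; it does not cover the "reset" half, because writing the modified tokens back out with encoding/xml would need extra care around namespace declarations.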

zhengchun added the bug label Jun 30, 2023

@zhengchun
Contributor

The library currently fails to consider the attribute node's prefix and namespace URL.

You can use the code below to find the parent node and then iterate over all of its attributes.

	node, _ := xmlquery.Query(root, "//node[@xlink:href]")
	for _, attr := range node.Attr {
		fmt.Println("NamespaceURI:", attr.NamespaceURI)
		fmt.Println("Prefix:", attr.Name.Space)
		fmt.Println("Data:", attr.Name.Local)
	}

@fgateuil
Author

fgateuil commented Jul 2, 2023

Fair enough, but with that approach I must first parse the XPath "//node[@xlink:href]" to extract the prefix (xlink) and the local name (href), and then loop over all the attributes to find the ones that match. That is not very efficient.

Anyway, thanks for your help @zhengchun: much appreciated.
