Nokogiri adds extra meta http-equiv="Content-Type" to xhtml?
#3302
nokogiri (1.16.7 arm64-darwin)

If I have an XML document that has an XHTML doctype, Nokogiri adds a meta http-equiv="Content-Type" tag when serializing it. This meta tag isn't findable inside the visible Nokogiri in-memory structure, but appears on serialization.

Is this expected? Is there any way to turn this off or remove it? Thanks for any advice!

require 'nokogiri'
input = <<~EOS
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
</head>
<body>
</body>
</html>
EOS
noko = Nokogiri::XML(input)
puts noko.to_xml

Output:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
<body>
</body>
</html>
Replies: 1 comment 2 replies
@jrochkind This behavior is from libxml2, originally added in GNOME/libxml2@d5c2f92d (in 2002!).
As far as I can tell, this is here because of the Character Encoding section of the XHTML1 spec, which to my reading is a bit ambiguous.
There's no way to turn this off, unfortunately.
Can I ask why this is an issue for you? I'd love to more deeply understand what you're trying to do.
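
If the tag really has to go, one possible workaround is to post-process the serialized string. The sketch below is an assumption for illustration only: `strip_content_type_meta` and `CONTENT_TYPE_META` are hypothetical names, not part of Nokogiri or libxml2, and the pattern is written against the exact output shown in the question. It would also remove a Content-Type meta tag that you had put in the document yourself.

```ruby
# Hypothetical helper, not a Nokogiri or libxml2 API.
# Matches a self-closing Content-Type meta tag like the one in the
# serialized output from the question.
CONTENT_TYPE_META = %r{<meta\s+http-equiv="Content-Type"[^>]*/>}i

# Returns a copy of the serialized XML with the first matching
# meta tag removed; the input string is left unmodified.
def strip_content_type_meta(xml_string)
  xml_string.sub(CONTENT_TYPE_META, "")
end

# Fragment copied from the serialized output in the question.
serialized = <<~EOS
  <head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
  </head>
EOS

puts strip_content_type_meta(serialized)
```

In practice you would pass it the result of `noko.to_xml`. Because this works on the string rather than the document, it is fragile if the serializer ever formats the tag differently.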