Looking for wellformed check #119

Open
MatthD opened this issue Sep 7, 2023 · 1 comment

Comments

MatthD commented Sep 7, 2023

Hello,
I am the creator of node-libxml, and I would like to base my library on yours instead of the C implementation.
I am having difficulty performing a well-formedness check.

main.rs

use libxml::{parser::{Parser, XmlParseError}, tree::Document};

fn main() {
    let parser = Parser::default();
    // parse_file returns Result<Document, XmlParseError>
    let xml_file = parser.parse_file("tests/data/test-not-wellformed.xml");
    // Pass the Result itself; the original passed the root element, which does not type-check.
    dbg!(is_wellformed(xml_file));
}

// Reports true whenever the parser handed back a Document.
fn is_wellformed(doc: Result<Document, XmlParseError>) -> bool {
    match doc {
        Err(_error) => false,
        Ok(_doc) => true,
    }
}

tests/data/test-not-wellformed.xml

<!DOCTYPE article PUBLIC "my doctype of doom" "mydoctype.dtd">
<xpath>
    <to>
        <my>
            <infos>trezaq</infos>
    </to>
</xpath>

This returns true, but it should return false because the document is not well-formed.

Furthermore, I would need DTD and XSD validation and path parsing, but I suppose I will need other libraries for that, or will have to contribute to yours ;)

dginev (Member) commented Sep 7, 2023

Hi @MatthD ,

I see you wrote:

I would like to based my lib on your's instead of the C implementation.

Just a word of warning: the current rust-libxml crate depends on having the C headers installed, and is a thin Rust wrapper over them.

It is not a full Rust reimplementation of libxml2. I am tracking the c2rust project's port of libxml2 to see if one day we could indeed be fully Rust-native, but that is some ways away still.


As to checking XML well-formedness, it is possible that we have not yet fleshed that out in the wrapper layer. There is a dedicated method in Parser called is_well_formed_html which seems to be doing a decent job at this for HTML, and we may want to extend/generalize Parser to also support XML well-formedness checks.

Similarly, the Parser parse_file method currently returns Ok whenever it manages to obtain a Document from the underlying libxml2 layer, even if the input is malformed. So the error check in your snippet will only catch cases where no document could be constructed.
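
To make concrete what a well-formedness check would have to reject in the sample file, here is a minimal sketch in plain Rust. It uses only the standard library and is not the rust-libxml API. The helper name `tags_balanced` is hypothetical, and the check is deliberately naive: it only verifies that start and end tags nest properly and assumes no `>` inside attribute values, comments, or CDATA sections, all of which a real parser such as libxml2 handles.

```rust
/// Naive check that start and end tags nest properly.
/// Hypothetical helper for illustration only; not part of rust-libxml.
fn tags_balanced(xml: &str) -> bool {
    // Names of the elements that are currently open, innermost last.
    let mut stack: Vec<&str> = Vec::new();
    let mut rest = xml;

    while let Some(open) = rest.find('<') {
        let after = &rest[open + 1..];
        let close = match after.find('>') {
            Some(i) => i,
            None => return false, // a tag that is never terminated
        };
        let tag = &after[..close];
        rest = &after[close + 1..];

        if tag.starts_with('!') || tag.starts_with('?') {
            // Skip DOCTYPE declarations and processing instructions.
            continue;
        }
        if let Some(name) = tag.strip_prefix('/') {
            // End tag: it must match the most recently opened element.
            if stack.pop() != Some(name.trim()) {
                return false;
            }
        } else if !tag.ends_with('/') {
            // Start tag: remember the element name (the text before any attribute).
            let name = tag.split_whitespace().next().unwrap_or("");
            stack.push(name);
        }
        // Self-closing tags such as <br/> need no bookkeeping.
    }

    // Every opened element must have been closed.
    stack.is_empty()
}
```

Called on the contents of tests/data/test-not-wellformed.xml, this returns false, because `</to>` arrives while `<my>` is still open. Adding the missing `</my>` makes it return true. A wrapper-level check would presumably surface the same verdict from libxml2 itself rather than reimplementing it.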
