Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Root hash mismatch when compared to other libraries. #182

Open
azteca1998 opened this issue Feb 2, 2023 · 1 comment
Open

Root hash mismatch when compared to other libraries. #182

azteca1998 opened this issue Feb 2, 2023 · 1 comment

Comments

@azteca1998
Copy link

Hello,

We're developing a patricia merkle tree and would like to compare our implementation with yours (both for testing and benchmarking). However we've found that the computed hash from your library doesn't match the ones computed from other libraries (including ours).

Hashing mismatch example.
//! # Basic example.
//!
//! Target hash extracted from here:
//!   https://github.com/ethereum/tests/blob/develop/TrieTests/trietest.json#L97-L104
//!
//! ## Dependencies
//!
//! ```toml
//! [dependencies]
//! cita_trie = "4.0.0"
//! hasher = "0.1.4"
//! hex-literal = "0.3.4"
//! memory-db = "0.31.0"
//! reference-trie = "0.26.0"
//! sha3 = "0.10.6"
//! trie-db = "0.24.0"
//!
//! [dependencies.patricia-merkle-tree]
//! git = "https://github.com/lambdaclass/merkle_patricia_tree"
//! ```

use cita_trie::{MemoryDB, PatriciaTrie, Trie};
use hasher::HasherKeccak;
use hex_literal::hex;
use patricia_merkle_tree::PatriciaMerkleTree;
use sha3::Keccak256;
use std::sync::Arc;

fn main() {
    const DATA: &[(&[u8], &[u8])] = &[(b"abc", b"123"), (b"abcd", b"abcd"), (b"abc", b"abc")];
    const HASH: [u8; 32] = hex!("7a320748f780ad9ad5b0837302075ce0eeba6c26e3d8562c67ccc0f1b273298a");

    println!("Expected   : {HASH:02x?}");
    println!("Our    hash: {:02x?}", run_ours(DATA.iter().copied()));
    println!("Cita   hash: {:02x?}", run_cita(DATA.iter().copied()));
    println!("Parity hash: {:02x?}", run_parity(DATA.iter().copied()));
}

fn run_ours<'a>(data: impl Iterator<Item = (&'a [u8], &'a [u8])>) -> [u8; 32] {
    let mut trie = PatriciaMerkleTree::<_, _, Keccak256>::new();

    data.for_each(|(p, v)| {
        trie.insert(p, v);
    });

    trie.compute_hash().as_slice().try_into().unwrap()
}

fn run_cita<'a>(data: impl Iterator<Item = (&'a [u8], &'a [u8])>) -> [u8; 32] {
    let mem_db = Arc::new(MemoryDB::new(true));
    let hasher = Arc::new(HasherKeccak::new());

    let mut trie = PatriciaTrie::new(mem_db, hasher);

    data.for_each(|(p, v)| {
        trie.insert(p.to_vec(), v.to_vec()).unwrap();
    });

    trie.root().unwrap().try_into().unwrap()
}

fn run_parity<'a>(data: impl Iterator<Item = (&'a [u8], &'a [u8])>) -> [u8; 32] {
    use memory_db::{HashKey, MemoryDB};
    use reference_trie::ExtensionLayout;
    use trie_db::{NodeCodec, TrieDBMutBuilder, TrieHash, TrieLayout, TrieMut};

    let mut mem_db =
        MemoryDB::<_, HashKey<_>, _>::new(<ExtensionLayout as TrieLayout>::Codec::empty_node());
    let mut root = <TrieHash<ExtensionLayout>>::default();

    let mut trie = TrieDBMutBuilder::<ExtensionLayout>::new(&mut mem_db, &mut root).build();

    data.for_each(|(p, v)| {
        trie.insert(p, v).unwrap();
    });

    trie.commit();
    *trie.root()
}

After some investigation we've found that although the hashing algorithm is the same (given the same inputs will produce the same outputs), the inputs to the hashing function are different. As per the ethereum wiki on patricia merkle tries we've used RLP encoding on our implementation, and also found references to it in yours.

When trying to decode (using RLP) the data passed to your hash function we've found that it can't be RLP (or at least not the RLP on the ethereum wiki) since it didn't make sense. For example, attempting to decode an RLP list with a few exabytes of data.

Do you use a specific (non-ethereum RLP) encoding that fits your needs? Can it be changed to use ethereum's RLP?

Thanks.

@cheme
Copy link
Contributor

cheme commented Aug 3, 2023

Do you use a specific (non-ethereum RLP) encoding that fits your needs? Can it be changed to use ethereum's RLP?

yes codec is define by the https://github.com/paritytech/trie/blob/master/trie-db/src/node_codec.rs NodeCodec use by the trie layout.

The ethereum one (the rlp codec) is not in this crate but in the old parity-ethereum repository, and later in the open-ethereum repository, and aligned to an old version of the crate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants