MST and repository parsing APIs #167

Open
str4d opened this issue May 5, 2024 · 3 comments

Comments

Contributor

str4d commented May 5, 2024

As part of #118, and to enable consuming #commit events from the firehose, we need APIs for parsing and interacting with MSTs and repository CARs.

Contributor Author

str4d commented May 5, 2024

I've written MST and repo parsing code myself for a CAR viewer project, on top of atrium-api and libipld (though I will be migrating it to ipld-core soon). I'd be happy to upstream it here if we can decide where it should go.

The current APIs I have are:

struct Repository<R: tokio::io::AsyncRead + tokio::io::AsyncSeek> { .. }

impl<R: AsyncRead + AsyncSeek + Unpin + Send> Repository<R> {
    async fn load(reader: R) -> Result<Self, _> { .. }

    fn did(&self) -> &Did { .. }
    fn keys<'a>(&'a mut self) -> impl Stream<Item = Result<String, _>> + 'a { .. }

    fn get_collection<'a, C: Collection + 'a>(
        &'a mut self,
    ) -> impl futures::Stream<Item = Result<(RecordKey, C::Record), _>> + 'a { .. }

    fn get_collection_reversed<'a, C: Collection + 'a>(
        &'a mut self,
    ) -> impl futures::Stream<Item = Result<(RecordKey, C::Record), _>> + 'a { .. }

    async fn get<C: Collection>(
        &mut self,
        rkey: &RecordKey,
    ) -> Result<Option<C::Record>, _> { .. }
}

mod mst {
    enum Located<E> {
        Entry(E),
        InSubtree(Cid),
    }

    struct Node { .. }

    impl Node {
        fn parse(bytes: &[u8]) -> Result<Option<Self>, _> { .. }
        fn get(&self, key: &[u8]) -> Option<Located<Cid>> { .. }

        fn entries_with_prefix<'a>(
            &'a self,
            prefix: &'a [u8],
        ) -> impl Iterator<Item = Located<(&[u8], Cid)>> + 'a { .. }

        fn reversed_entries_with_prefix<'a>(
            &'a self,
            prefix: &'a [u8],
        ) -> impl Iterator<Item = Located<(&[u8], Cid)>> + 'a { .. }
    }
}
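
To make the role of `Located` concrete, here is a minimal std-only sketch of how a lookup like `Node::get` could behave. It is not the implementation described above: the `Cid` newtype, the `NodeEntry` struct, the `left`/`right` field names, and the `sample_node` helper are all assumptions made for illustration. The layout assumed is that each node has an optional subtree to the left of its first entry and each entry has an optional subtree to its right.

```rust
// Hypothetical stand-in for a real CID type, so the sketch needs only std.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Cid(pub &'static str);

// Same shape as the `Located` enum in the proposal.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Located<E> {
    Entry(E),
    InSubtree(Cid),
}

// Assumed entry layout: a full key, the record's CID, and an optional
// subtree holding the keys that sort between this entry and the next one.
pub struct NodeEntry {
    pub key: Vec<u8>,
    pub value: Cid,
    pub right: Option<Cid>,
}

pub struct Node {
    // Subtree holding keys that sort before the first entry, if any.
    pub left: Option<Cid>,
    // Entries sorted by key.
    pub entries: Vec<NodeEntry>,
}

impl Node {
    // Either finds the key in this node, or names the subtree that would
    // have to be loaded next. `None` means the key cannot be in the tree.
    pub fn get(&self, key: &[u8]) -> Option<Located<Cid>> {
        match self.entries.binary_search_by(|e| e.key.as_slice().cmp(key)) {
            Ok(i) => Some(Located::Entry(self.entries[i].value.clone())),
            // Key sorts before every entry: only the left subtree can hold it.
            Err(0) => self.left.clone().map(Located::InSubtree),
            // Key sorts just after entry i - 1: check that entry's right subtree.
            Err(i) => self.entries[i - 1].right.clone().map(Located::InSubtree),
        }
    }
}

// Illustrative node with two entries and two subtrees.
pub fn sample_node() -> Node {
    Node {
        left: Some(Cid("left-subtree")),
        entries: vec![
            NodeEntry {
                key: b"app.bsky.feed.like/3k".to_vec(),
                value: Cid("like-record"),
                right: None,
            },
            NodeEntry {
                key: b"app.bsky.feed.post/3k".to_vec(),
                value: Cid("post-record"),
                right: Some(Cid("post-subtree")),
            },
        ],
    }
}
```

With this shape, a caller that gets `Located::InSubtree(cid)` loads that block and repeats the lookup there, so the node type itself never performs I/O.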

I went with async APIs because I'm reading a CAR file from disk. For firehose subscribers maybe sync APIs would be fine, but given that the crates in this repo already have async APIs, I figure this works fine as a starting point.
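
One way to read the async/sync question: since the `Node` methods are synchronous and return `Located::InSubtree` rather than fetching anything, only the loop that loads blocks has to choose between the two. The sketch below shows a synchronous version of that loop over an in-memory map. The `Cid` stand-in, the tuple-based `Node` layout, and the `lookup` and `sample_blocks` functions are hypothetical and are not part of the proposed API.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a real CID type, so the sketch needs only std.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Cid(pub &'static str);

pub enum Located<E> {
    Entry(E),
    InSubtree(Cid),
}

// Assumed node shape: entries are (key, record CID, subtree after this key),
// sorted by key, plus an optional subtree before the first entry.
pub struct Node {
    pub left: Option<Cid>,
    pub entries: Vec<(Vec<u8>, Cid, Option<Cid>)>,
}

impl Node {
    // Pure and synchronous: no I/O happens here.
    pub fn get(&self, key: &[u8]) -> Option<Located<Cid>> {
        // Index of the first entry whose key is >= the requested key.
        let i = self.entries.partition_point(|(k, _, _)| k.as_slice() < key);
        match self.entries.get(i) {
            Some((k, v, _)) if k.as_slice() == key => Some(Located::Entry(v.clone())),
            _ if i == 0 => self.left.clone().map(Located::InSubtree),
            _ => self.entries[i - 1].2.clone().map(Located::InSubtree),
        }
    }
}

// The only function that touches storage. An async variant would await a
// block read where this one does a map lookup. The loop terminates on real
// data because CIDs are content hashes, so nodes cannot form a cycle.
pub fn lookup(blocks: &HashMap<Cid, Node>, root: &Cid, key: &[u8]) -> Option<Cid> {
    let mut current = root.clone();
    loop {
        let node = blocks.get(&current)?;
        match node.get(key)? {
            Located::Entry(record) => return Some(record),
            Located::InSubtree(next) => current = next,
        }
    }
}

// Illustrative two-node tree: the root has one entry and a left child.
pub fn sample_blocks() -> HashMap<Cid, Node> {
    let mut blocks = HashMap::new();
    blocks.insert(
        Cid("root"),
        Node {
            left: Some(Cid("child")),
            entries: vec![(b"m".to_vec(), Cid("record-m"), None)],
        },
    );
    blocks.insert(
        Cid("child"),
        Node {
            left: None,
            entries: vec![(b"c".to_vec(), Cid("record-c"), None)],
        },
    );
    blocks
}
```

A firehose consumer that already holds all of a commit's blocks in memory could use a loop like this directly, while a disk-backed reader would swap the map lookup for an awaited read.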

Owner

sugyan commented May 7, 2024

Thanks for the suggestion!
I hadn't considered implementing this yet, but if you would like to add it, I'd be happy to merge it.

Would it be better to add it as a new package, like atrium-repo (named after @atproto/repo, the corresponding package in the original TypeScript implementation)?
Also, as you may have noticed, we are trying to add some new libraries to atrium-libs in #166, but this is still a draft. I am adding implementations bit by bit and may eventually split each one into its own package.

Contributor Author

str4d commented May 7, 2024

Sure, atrium-repo sounds good. I'll open a PR with the initial implementation.
