SDK sketches #551

matthiasgoergens · 2024-11-05T03:06:18Z

matthiasgoergens
Nov 5, 2024
Maintainer

SDK design

SDK stands for software development kit. It encompasses everything a developer needs to build and run a guest program. Specifically, that includes the build system, a standard library and other libraries, our runtime, the prover.

Let's sketch out a few different approaches to SDK design, concentrating on how guest code would look like.

Input / Output

Manual

This is what Miden does. The VM offers a few ways for guest programs to get input and produce output. The low level mechanism are fine, and not to different from what we'd implement.

But they never built anything higher level around this: the SDK expects your host program to just somehow come up with the right Merkle tree and tape.

By itself, this model is pretty much useless for anything but toy examples.

Client/server model

The client is the guest program, and the server is the host program. From the point of view of the author of the guest program, the host program is like a server that you can send RPC requests to.

This model is conceptually relatively simple, but it tends to split your program into two parts: the client part, which runs on the guest, and the server part, which runs on the host. That can be a bit awkward to keep in your head, and there's quite a bit of conceptual redundancy.

This style encourages the guest to prepare and send all the information necessary to identify a call to the server, instead of unlocking the full potential of non-determinism in the oracle model.

Risc0 uses this model.

Unconstrained / constrained modes

This model can combine host and guest source code. At a low level, your VM gets a special instruction to 'fork' execution. When execution reaches the 'fork', we pause the original execution, make a clone of all the state, and proceed with the clone. The clone runs outside of the constraints and its execution is not proven. The clone can do anything, including calling out to the network, reading from disk, etc. At the end, the clone can return a value to the original execution, which resumes and can use that value in its own proven computation.

As mentioned above, this model allows you to naturally mix host and guest code. Most guest developers don't even need to be aware of what's happening: they are just calling a function to read from the network, and all the mode switching is handled by the SDK.

In this model it's really easy for the host code to have access to the full state of the guest code, without explicitly passing any information with the 'fork' call: it's all just available.

As a downside, becasue the host-code is also written in Risc-V, you need to add support for all the crazy stuff you want to do in your Risc-V emulator (or at least allow it to call out to third party programs). This style also blows up the size of your guest ELF.

This model is inspired by SP1's 'unconstrained' mode. (But what they are actually doing is a more complicated mix of models.)

Multi-stage computation

From the guest program developers point of view, this model looks like the unconstrained/constrained model. But instead of embedding both code paths in the same Risc-V ELF, you use conditional compilation to generate two separate ELFs.

The first ELF has all the code, including the unconstrained parts. It executes everything, and keeps track of the results of the unconstrained parts on a 'tape'. The second ELF only has the constrained code, and whenever it hits a 'fork' instruction, it read consults the tape we prepared earlier.

In this model, the first ELF doesn't even have to run on Risc-V, it can run natively for all we care. The only requirement is that it can produce a tape that the second ELF can read. Running natively means that you can use all the libraries and tools that are available for your host platform, without our emulator having to know anything about this.

Specifically, you can run a full fledged debugger, without our emulator having to support it.

Building

Both Risc0 and SP1 ship their own version of the Rust compiler. (As far as I can tell.) Those versions are quite out of date by now. So I'd like to avoid that, if we can, and allow people to the stock Rust compiler. We don't want to be in the business of maintaining a fork of the Rust compiler.

Ideally, I'd also want to make it easy for people to run eg C and C++ programs, too. If we can make the stock Rust compiler work, it's more likely that interested parties can make other compilers work as well.

Both Risc0 and SP1 use their own cargo plugins. I think that's mostly because they use their own compilers. But we can investigate if there are other pressing reasons. (They also use these plugins for configuring cargo and the compiler; that's something we are doing right now with just normal cargo configuration files and command line arguments.)

lispc · 2024-11-05T03:19:21Z

lispc
Nov 5, 2024
Maintainer

@hero78119

0 replies

hero78119 · 2024-11-05T10:46:05Z

hero78119
Nov 5, 2024
Collaborator

Thanks for the detail writing and with a nice overview! it greatly help and instantiate a design discussion.

A minor question: I noticed in risczero part there is a bit contradictory

a. The client is the guest program, and the server is the host program.
b. the client part, which runs on the host, and the server part, which runs on the guest.

:)

As there are few design choices, I think we can kick of thoughts from

what (potential/prioritized) applications/guest program we aiming, then evaluate how to add features to SDK to support their development
then if there are few choice, see which one bring lowest cost in Ceno, as prover still keep pay-as-you-go.

I guess most complex guest program is how to read/write unconstrained data from external world, e.g. network/disk. Those data can even be just partial (with Merkle root as source of truth). For example, Where is waldo in risczero. We can refer to those design, and figure out how it work behind the scene.

3 replies

matthiasgoergens Nov 5, 2024
Maintainer Author

Oh, the contradiction you found was just a simple mistake on my part. No deep insight there I'm afraid. I fixed it. Thanks for noticing.

matthiasgoergens Nov 5, 2024
Maintainer Author

I guess most complex guest program is how to read/write unconstrained data from external world, e.g. network/disk.

Oh, that's not at all at the top of the complexity ladder. Almost any design you cook up for unconstrained input will work reasonably well.

matthiasgoergens Nov 5, 2024
Maintainer Author

Your Merkle root suggestion makes sense for committed / constrained input. It's overkill for unconstrained input.

Because unconstrained input is truly unconstrained, thus doesn't have a notion of being complete in the first place, and this doesn't need an accommodation for incomplete reading.

However you bring up Merkle trees. Those are a good segue to constrained in input. Eg you want to be able to ask for a pre-image to a particular digest. (Or sometimes you want to ask for a digest for a particular pre-image, or you have both and just want confirmation.)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SDK sketches #551

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Replies: 2 comments 3 replies

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

Select a reply

SDK sketches #551

matthiasgoergens Nov 5, 2024 Maintainer

SDK design

Input / Output

Manual

Client/server model

Unconstrained / constrained modes

Multi-stage computation

Building

Replies: 2 comments · 3 replies

lispc Nov 5, 2024 Maintainer

hero78119 Nov 5, 2024 Collaborator

matthiasgoergens Nov 5, 2024 Maintainer Author

matthiasgoergens Nov 5, 2024 Maintainer Author

matthiasgoergens Nov 5, 2024 Maintainer Author

matthiasgoergens
Nov 5, 2024
Maintainer

Replies: 2 comments 3 replies

lispc
Nov 5, 2024
Maintainer

hero78119
Nov 5, 2024
Collaborator

matthiasgoergens Nov 5, 2024
Maintainer Author

matthiasgoergens Nov 5, 2024
Maintainer Author

matthiasgoergens Nov 5, 2024
Maintainer Author