Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the ergonomics of awaiting predictions #204

Closed
zeke opened this issue Feb 16, 2024 · 8 comments
Closed

Improve the ergonomics of awaiting predictions #204

zeke opened this issue Feb 16, 2024 · 8 comments

Comments

@zeke
Copy link
Member

zeke commented Feb 16, 2024

The replicate.run method provides a nice way of running a prediction and getting its output with a single line of code:

const output = await replicate.run(identifier, options, progress);

That's nice if I only want the prediction output. But what if I want the whole prediction object? For example, maybe I want to know the prediction id, or see how long the prediction took to run. In this case, the best option is this:

let prediction = await replicate.predictions.create({ version, input })
prediction = await replicate.wait(prediction);

That works, but it's not ideal for a few reasons:

  • It's more code. No longer a glamorous one-liner.
  • The fact that replicate.prediction.create is itself an async function makes me think that maybe I'm awaiting its completion too, when in fact I'm actually just awaiting the empty shell of a prediction.
  • The wait call is on the replicate object, rather than on the prediction instance itself. This is confusing.
  • I can't make the prediction a const because I have to re-assign it when I call wait. It feels weird to create a thing and then overwrite it entirely, especially in the age of JavaScript constants.
  • I would probably always have to look at the docs to get this right.

How can we make this better? My off-the-cuff idea is a wait option on the replicate.predictions.create method:

const prediction = await replicate.predictions.create({ version, input, wait: true })

What do folks think of that? Open to ideas here.

cc @replicate/product @replicate/hackers

@mattt
Copy link
Contributor

mattt commented Feb 16, 2024

That's nice if I only want the prediction output. But what if I want the whole prediction object?

@zeke I'm pretty sure we don't document that anywhere, but replicate.run takes a progress callback that returns the prediction object after each polling response. You can use that to get the prediction ID, current status, and logs.

const output = await replicate.run(model, { input }, ({ id, status, logs }) => {
  console.log({ id, status, last_log_line: logs.split("\n").pop() });
});

How does that feel?

@zeke
Copy link
Member Author

zeke commented Feb 16, 2024

Good to know! That's definitely useful, but maybe not entirely intuitive. Imagine coming upon this code without having read the docs. Would you understand that that callback was a progress event?

Either way, we should document that!

@mattt
Copy link
Contributor

mattt commented Feb 19, 2024

Imagine coming upon this code without having read the docs. Would you understand that that callback was a progress event?

Not that code golf version, no. But there are more self-documenting ways to write it:

callback = (prediction) => {
   const last_log_line = prediction.logs.split("\n").pop() 
   console.log({id: prediction.id, log: last_log_line})
}
const output = await replicate.run(model, { input }, callback)

But more to your original point: I don't hate replicate.predictions.create(..., wait: true) — in fact, the Swift client has that pattern. We can definitely add that.

@zeke
Copy link
Member Author

zeke commented Feb 20, 2024

Yep that snippet is definitely more intuitive. Could probably take it a step further by naming callback to something suggestive of the fact that it's a repeated omitted event. As an old-school Node.js person, I think of the word "callback" meaning "the thing that gets called once when the function is done". Here's another take on it:

onProgress = (prediction) => {
   const last_log_line = prediction.logs.split("\n").pop() 
   console.log({id: prediction.id, log: last_log_line})
}
const output = await replicate.run(model, { input }, onProgress)

☝🏼 I know I'm being pedantic here, but the goal is to get to a code snippet for the docs that feels clear and unambiguous.


But more to your original point: I don't hate replicate.predictions.create(..., wait: true) — in fact, the Swift client has that pattern. We can definitely add that.

Nice! I thought I'd seen that somewhere. I forgot it was the Swift client.

@zeke
Copy link
Member Author

zeke commented Feb 20, 2024

A comment from @deepfates on replicate/create-replicate#39 (comment)

I like this from a usability standpoint but from a wording standpoint i feel weird about showcasing replicate.wait this way. Just seems to imply that it will take a long time, while .run implies something happening right away. No strong block though

☝🏼 Coming at this from a different angle: what if replicate.run had an option for giveMeTheWholePredictionInsteadOfJustTheOutput?

Maybe not a great idea, as I know that's unsavory from a good-programmer typed-languages perspective to have a function that returns differently shaped objects based on its inputs. But it could also be a nice way for people to migrate from the existing replicate.run thing we told them to write to the thing they actually want.

@mattt
Copy link
Contributor

mattt commented Feb 20, 2024

@zeke I think that gets to the underlying question: Do we want to make predictions more or less prominent in our system?

I'd argue that a prediction a means to an end. It's ephemeral. Like a train ticket, once you get to where you're going, you don't need it anymore (and ideally, you wouldn't need one at all). Except in the cases where you're still in transit or you didn't reach your intended destination, why hold onto it?

Why should someone care about a prediction object in the first place?

  • If it's for debugging, replicate.run failures throw an error. If that error contains a prediction object, that should be enough to go on. In this way, it's like a stack trace.
  • If it's for logs, the callback in run gives you that. (And also, we're likely extracting that into its own resource, because of how large they get for long-running work)
  • If it's for accounting, webhooks provide a good way to funnel everything into a single endpoint.
  • If it's for analytics, there are other solutions. For example, we could decorate replicate.run in such a way to wrap the output in a metrics object.

@onurkose
Copy link

onurkose commented May 15, 2024

Here is a better version of starting a prediction with run and not waiting until it's done:

    const predictionIdPromise = new Promise((resolve) => {
      const onProgress = (prediction: Prediction) => {
        resolve(prediction.id); // can be something like resolve({ id: prediction.id, ...more parameters to return });
      };

      replicate.run(
        identifier,
        { input },
        onProgress,
      );
    });

    const predictionId = await predictionIdPromise;

@mattt
Copy link
Contributor

mattt commented Sep 16, 2024

@onurkose FYI - replicate.run is a convenience method that combines replicate.predictions.create and replicate.wait. You can call replicate.predictions.create to create a prediction and not wait for it to finish, which is useful for "fire-and-forget" workflows, where you rely on webhooks to deliver results asynchronously.

@mattt mattt closed this as completed Sep 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants