Experimental alt-core Cog runtime implementation

replicate/cog-runtime

Cog Runtime

Caution

This project is experimental and should not be handled near open flame.

Alt-core Cog runtime implementation.

The original Cog aims to be both a great developer tool and an arguably great production runtime. Cog Runtime focuses solely on being a fantastic production runtime.

How is cog-runtime formed?

sequenceDiagram
    participant r8
    participant server as cog-server
    participant runner as coglet
    participant predictor as Predictor
    r8->>server: Boot
    activate server
    server->>runner: exec("python3", ...)
    activate runner
    runner<<->>predictor: setup()
    activate predictor
    runner--)server: SIGHUP (output)
    runner--)server: SIGUSR1 (ready)
    server->>runner: read("setup-result.json")
    r8->>server: GET /health-check
    r8->>server: POST /predictions
    server->>runner: write("request-{id}.json")
    runner--)server: SIGUSR2 (busy)
    runner<<->>predictor: predict()
    deactivate predictor
    runner--)server: SIGHUP (output)
    runner--)server: SIGUSR1 (ready)
    server->>runner: read("response-{id}.json")
    deactivate runner
    server->>r8: POST /webhook
    deactivate server

This sequence is simplified, but the rough idea is that the Replicate platform (r8) depends on cog-server to provide an HTTP API in front of a coglet that communicates via files and signals.
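The file half of that exchange can be sketched in a few lines of Python. This is only an illustration, not the actual cog-server or coglet code: the file names `request-{id}.json` and `response-{id}.json` come from the diagram, while the JSON shapes (`{"input": ...}`, `{"status": ..., "output": ...}`), the helper names, and the write-then-rename step are assumptions made for the example.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomic(path, payload):
    """Write JSON to a temp file, then rename it into place.

    The rename is an assumption of this sketch; it keeps a reader
    from ever seeing a half-written file.
    """
    path = Path(path)
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(payload, f)
    os.replace(tmp, path)


def submit_request(workdir, pred_id, inputs):
    """Server side: drop request-{id}.json for the runner to pick up."""
    # The {"input": ...} shape is hypothetical.
    write_json_atomic(Path(workdir) / f"request-{pred_id}.json", {"input": inputs})


def process_request(workdir, pred_id, predict):
    """Runner side: read the request, call predict, write response-{id}.json."""
    request = json.loads((Path(workdir) / f"request-{pred_id}.json").read_text())
    output = predict(**request["input"])
    # The {"status", "output"} shape is hypothetical.
    write_json_atomic(
        Path(workdir) / f"response-{pred_id}.json",
        {"status": "succeeded", "output": output},
    )


def read_response(workdir, pred_id):
    """Server side: read the response the runner wrote."""
    return json.loads((Path(workdir) / f"response-{pred_id}.json").read_text())
```

In the real system the two sides live in different processes and the signals tell the server when a response file is worth reading; here both sides are called in turn from one process to keep the example small.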

cog-server

Go-based HTTP server that knows how to spawn and communicate with coglet.

coglet

Python-based model runner with zero dependencies outside of the standard library. The same in-process API provided by Cog is available, e.g.:

from cog import BasePredictor, Input

class MyPredictor(BasePredictor):
    def predict(self, n: int = Input(description="how many", ge=1, le=100)) -> str:
        return "ok" * n

In addition to simple cases like the above, the runner is async by default and supports continuous batching.
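To give a feel for what "async by default" could mean for a predictor, here is a minimal standard-library sketch. It is not coglet's scheduler: `EchoPredictor`, `run_concurrently`, and the `max_concurrency` parameter are hypothetical names, and the sketch shows only several `predict()` calls in flight at once, not continuous batching.

```python
import asyncio


class EchoPredictor:
    """Hypothetical stand-in for a predictor with an async predict()."""

    async def predict(self, n: int) -> str:
        # Simulate I/O-bound or awaitable model work.
        await asyncio.sleep(0.01 * n)
        return "ok" * n


async def run_concurrently(predictor, inputs, max_concurrency=2):
    """Run predict() for each input, with at most max_concurrency in flight."""
    sem = asyncio.Semaphore(max_concurrency)

    async def one(kwargs):
        async with sem:
            return await predictor.predict(**kwargs)

    # gather() returns results in the order the inputs were given.
    return await asyncio.gather(*(one(kwargs) for kwargs in inputs))


results = asyncio.run(
    run_concurrently(EchoPredictor(), [{"n": 1}, {"n": 2}, {"n": 3}])
)
```

The semaphore caps how many predictions overlap; a real runner would presumably derive that limit from the model's configuration.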

Communication with the cog-server parent process is managed via input and output files and the following signals:

  • SIGUSR1: model is ready
  • SIGUSR2: model is busy
  • SIGHUP: output is available
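The signal half of the protocol can be sketched with Python's `signal` and `os` modules. The signal meanings are the ones listed above; the helper names `notify_parent` and `install_handlers` are hypothetical, and since cog-server is written in Go, the handler side is shown in Python purely for illustration. The sketch is Unix-only, and handlers must be installed from the main thread.

```python
import os
import signal

# Signal meanings taken from the list above.
STATUS_SIGNALS = {
    "ready": signal.SIGUSR1,
    "busy": signal.SIGUSR2,
    "output": signal.SIGHUP,
}


def notify_parent(status, pid=None):
    """Runner side: send the signal for `status` to the parent process.

    `pid` can be overridden, which is handy for testing.
    """
    target = os.getppid() if pid is None else pid
    os.kill(target, STATUS_SIGNALS[status])


def install_handlers(on_status):
    """Receiving side: call on_status(name) whenever one of the signals arrives."""
    names = {sig: name for name, sig in STATUS_SIGNALS.items()}
    for sig in names:
        signal.signal(sig, lambda signum, frame: on_status(names[signum]))
```

A runner following the diagram would call `notify_parent("busy")` before `predict()`, then `notify_parent("output")` and `notify_parent("ready")` once the response file is written. Signals carry no payload, which is why the actual data travels through files.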
