Real-time serving with an embedded model means distributed, event-at-a-time processing with millisecond latency and high throughput.
- What to optimize: latency and throughput
- End user: usually no direct interaction with the model
- Validation: offline, and online via A/B testing
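As a concrete illustration, below is a minimal sketch of event-at-a-time scoring with a model embedded in a Kafka Streams topology. The topic names (`input-events`, `predictions`) and the `Model` interface are hypothetical placeholders; any model that can be loaded into the JVM (e.g. a TensorFlow SavedModel or an ONNX graph) would slot in the same way.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class EmbeddedModelApp {

    /** Hypothetical stand-in for any model loaded into the JVM process. */
    interface Model {
        double predict(String features);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "embedded-model-scoring");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        // The model is loaded once and lives inside the stream processor:
        // no remote model server, no network hop per event.
        Model model = features -> 0.0; // placeholder for a real loaded model

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> events = builder.stream("input-events");

        // Event-at-a-time scoring: each record is mapped to a prediction
        // in-process, which is what keeps latency in the millisecond range.
        events
            .mapValues(value -> String.valueOf(model.predict(value)))
            .to("predictions");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```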
First, learn general MLOps concepts. Then learn more about real-time serving with embedded ML models:
- Machine Learning and Real-Time Analytics in Apache Kafka Applications
- Kafka Streams machine learning examples
- Streaming Machine Learning at Scale from 100,000 IoT Devices with HiveMQ, Apache Kafka and TensorFlow
- Streaming ML Model Deployment
This workshop is a work in progress (WIP).
It will cover a real-life use case of embedding a machine learning model into a streaming application, and troubleshooting it.
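The exact troubleshooting material is still WIP. As one plausible sketch of the kind of pattern involved, the snippet below guards the inference call and routes failed records to a dead-letter topic so one bad event cannot crash the stream thread. The `predict` helper and the topic names (including `predictions-dead-letter`) are assumptions for illustration, not the workshop's actual code.

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;

public class DeadLetterTopology {

    /** Hypothetical model call that may throw on malformed input. */
    static double predict(String features) {
        if (features == null || features.isEmpty()) {
            throw new IllegalArgumentException("empty feature vector");
        }
        return 0.0; // placeholder for a real model's output
    }

    static void buildTopology(StreamsBuilder builder) {
        KStream<String, String> events = builder.stream("input-events");

        // Guard the inference call: tag failures instead of throwing,
        // so the stream thread keeps processing subsequent events.
        KStream<String, String> scored = events.mapValues(value -> {
            try {
                return String.valueOf(predict(value));
            } catch (RuntimeException e) {
                return "ERROR:" + e.getMessage();
            }
        });

        // Route successes and failures to separate topics; the dead-letter
        // topic can be inspected or replayed while the app keeps running.
        scored.filter((key, value) -> !value.startsWith("ERROR:")).to("predictions");
        scored.filter((key, value) -> value.startsWith("ERROR:")).to("predictions-dead-letter");
    }
}
```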