Paper notes - Scheduling & Networked Systems

Materials

CCF list 2019: [Chinese] [International]

To read (learn)

The Tail at Scale
FA2 source code
(OSDI'20) Serving DNNs like Clockwork: Performance Predictability from the Bottom Up
(SC'20) BATCH: Machine Learning Inference Serving on Serverless Platforms with Adaptive Batching
(NSDI'17) Clipper: A Low-Latency Online Prediction Serving System
(ATC'21) INFaaS: Automated Model-less Inference Serving
(OSDI'20) AntMan: Dynamic Scaling on GPU Clusters for Deep Learning
(SoCC'21) Llama: A Heterogeneous & Serverless Framework for Auto-Tuning Video Analytics Pipelines
(OSDI'18) Gandiva: Introspective Cluster Scheduling for Deep Learning
(Middleware'17) Swayam: Distributed Autoscaling to Meet SLAs of Machine Learning Inference Services with Resource Efficiency
[partial] Borg; RAS; Traefik; Nginx; blog
[low priority] MapReduce; GFS; Bigtable
[low priority] Spanner (Google); B4; Dynamo
[book] Designing Data-Intensive Applications

2022.11

AntMan
BATCH
INFless
- Problem
- Insights
- Solution
- Other
MArk: Exploiting Cloud Services for Cost-Effective, SLO-Aware Machine Learning Inference Serving
- Problem
- Insights
- Solution
- Other
[Serverless] 🔖 Cloud Programming Simplified: A Berkeley View on Serverless Computing
- history of cloud computing
- motivations for serverless computing
- limitations of serverless
[Serverless] Evaluation of Production Serverless Computing Environments
- evaluates the performance of production serverless computing environments
- "serverless is powered by container technologies which have near zero start-up delay and deleting latency."
- "a container is deployed and terminated within a few milliseconds for the function invocation w/ pre warmup policy"
[Serverless] Serverless in the Wild: Characterizing and Optimizing the Serverless Workload at a Large Cloud Provider
- characterize the FaaS workload of Azure Functions
- propose a practical resource management policy to reduce the number of cold starts
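To make the cold-start metric concrete, here is a minimal sketch that counts cold starts for a single function under a fixed keep-alive window. This is not the policy proposed in the paper; it is a simplified baseline for illustration. The function name `count_cold_starts`, the arrival times, and the window values are all hypothetical, and the model assumes executions finish instantly so the window is measured from the previous arrival.

```python
def count_cold_starts(arrival_times, keep_alive):
    """Count cold starts for one function under a fixed keep-alive window.

    Simplifying assumption: an invocation is cold if it is the first one, or
    if it arrives more than `keep_alive` seconds after the previous arrival.
    """
    cold = 0
    last = None
    for t in sorted(arrival_times):
        if last is None or t - last > keep_alive:
            cold += 1
        last = t
    return cold


# Hypothetical invocation times in seconds (not taken from any real trace).
arrivals = [0, 5, 60, 75, 400, 1000]

for window in (10, 60, 600):
    print(f"keep-alive {window:>3}s -> {count_cold_starts(arrivals, window)} cold starts")
```

A longer window trades memory held by idle instances for fewer cold starts, which is the tension a resource management policy has to balance.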
[Serverless] 🔖 Xanadu: Mitigating cascading cold starts in serverless function chain deployments
- (pipeline) deploy resources for the execution chain before traffic bursts
- cascading cold-start overhead increases linearly with chain length
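The linear growth noted above can be illustrated with a small latency model. This is a sketch under stated assumptions, not Xanadu's algorithm: the chain is strictly sequential, every stage pays the same cold-start delay, and `chain_latency` plus all numbers are hypothetical.

```python
def chain_latency(exec_times_ms, cold_start_ms, prewarmed=False):
    """End-to-end latency (ms) of a sequential function chain.

    Assumption: without pre-deployment, each function's instance is only
    provisioned when the previous function invokes it, so every stage adds
    one cold start. With pre-deployment, only execution time remains.
    """
    overhead = 0 if prewarmed else cold_start_ms * len(exec_times_ms)
    return sum(exec_times_ms) + overhead


# Hypothetical numbers: 100 ms per function, 2000 ms per cold start.
for length in (1, 3, 5):
    stages = [100] * length
    cold = chain_latency(stages, 2000)
    warm = chain_latency(stages, 2000, prewarmed=True)
    print(f"chain length {length}: cold={cold} ms, pre-deployed={warm} ms")
```

In this model the gap between the two cases is exactly `cold_start_ms * length`, which is why deploying the whole chain ahead of a burst pays off more for longer chains.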

2022.10

InferLine: latency-aware provisioning and scaling for prediction serving pipelines
- Problem
- Insights
- Solution
- Other
[Spot instance] Tributary: spot-dancing for elastic services with latency SLOs
- Transient Instance (AWS Spot Instance)
- Trace: ClarkNet & WITS & ...
[Spot instance] Cocktail: A Multidimensional Optimization for Model Serving in Cloud
- Ensemble Learning
- Transient Instance
- "DeepAR-estimator"
- Trace: Wikipedia & Twitter

2022.09

Twine: A Unified Cluster Management System for Shared Infrastructure
Shard Manager: A Generic Shard Management Framework for Geo-distributed Applications
Autopilot: workload autoscaling at Google
Piccolo: ---

2022

Scrooge: A Cost-Effective Deep Learning Inference System
Nexus: A GPU Cluster Engine for Accelerating DNN-Based Video Analysis
AutoScale: Dynamic, Robust Capacity Management for Multi-Tier Data Centers

Template:

😏❤️🔖 Template: An example [GPU Scheduling]

zhixin612/papernotes-scheduling
