Commit c4d8144: add project ray

Yuduo Wu committed Dec 14, 2018
1 parent 6a00f48 commit c4d8144
Showing 4 changed files with 50 additions and 15 deletions.
README.md: 65 changes (50 additions, 15 deletions)
<p align="center"><img src="images/logo.png" width="100%"/></p>

<h1 align="center">
:tent: Awesome AI Infrastructures :tent:
</h1>

<p align="center">
:orange_book: List of real-world AI infrastructures (a.k.a., <strong>machine
recommendations and suggestions are welcome :tada:.

***

## Introduction

This list contains some popular actively-maintained AI infrastructures that
focus on one or more of the following topics:
inference frameworks. My learning goals are: understand the workflows and
principles of how to build (large-scale) systems that can enable machine
learning in production.

## Machine Learning Platforms

### [TFX](https://www.tensorflow.org/tfx/) - TensorFlow Extended ([Google](https://www.google.com/about/))


#### Architecture:

<p align="center"><img src="images/nvidia-rapids-arch.png" width="90%"/></p>
<p align="center"><img src="images/nvidia-rapids-arch.png" width="80%"/></p>

#### Components:

up for easy, fast, and scalable distributed training.

#### Architecture:

<p align="center"><img src="images/apple-alchemist-arch.png" width="90%"/></p>
<p align="center"><img src="images/apple-alchemist-arch.png" width="70%"/></p>

#### Components:

| [__h2o__](https://www.h2o.ai/products/h2o/) | [__h2o4gpu__](https://www.h2o.ai/products/h2o4gpu/) |

<p align="center"><img src="images/h2o-arch.png" width="90%"/></p>
<p align="center"><img src="images/h2o-arch.png" width="80%"/></p>

### Project Ray ([RISELab](https://rise.cs.berkeley.edu/projects/ray/))

> Ray is a high-performance distributed execution framework targeted at large-scale machine learning and reinforcement learning applications. It achieves scalability and fault tolerance by abstracting the control state of the system into a global control store and keeping all other components stateless. It uses a shared-memory distributed object store to handle large data efficiently, and a bottom-up hierarchical scheduling architecture to achieve low-latency, high-throughput scheduling. A lightweight API based on dynamic task graphs and actors expresses a wide range of applications flexibly.

| [__homepage__](https://ray.readthedocs.io/en/latest/) | [__github__](https://github.com/ray-project/ray) | [__blog__](https://ray-project.github.io/) | [__design overview__](https://ray.readthedocs.io/en/latest/internals-overview.html) | [__paper__](https://arxiv.org/abs/1712.05889) |

#### Architecture:

<p align="center"><img src="images/ucb-ray-arch.png" width="70%"/></p>

#### Components:

- **Tune**: Scalable hyper-parameter search.

- **RLlib**: Scalable reinforcement learning.

- **Distributed training**.
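
For a taste of the programming model, here is a minimal sketch of Ray's task and actor API (assuming `ray` is installed via `pip install ray`; the `square` and `Counter` names are illustrative):

```python
import ray

ray.init()  # start Ray on the local machine

# A remote task: calls return futures immediately and execute in parallel.
@ray.remote
def square(x):
    return x * x

# An actor: a stateful worker process addressed through a handle.
@ray.remote
class Counter:
    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count

futures = [square.remote(i) for i in range(4)]
print(ray.get(futures))  # [0, 1, 4, 9]

counter = Counter.remote()
print(ray.get(counter.increment.remote()))  # 1
```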

## Model Inference Deployment

### CoreML ([Apple](https://www.apple.com/))

| [__homepage__](https://developer.apple.com/machine-learning/) | [__documentation__](https://developer.apple.com/documentation/coreml) | [__resources__](https://developer.apple.com/machine-learning/build-run-models/) |

<p align="center"><img src="images/apple-coreml-arch.png" width="90%"/></p>
<p align="center"><img src="images/apple-coreml-arch.png" width="70%"/></p>

### TensorFlow Lite


#### Architecture:

<p align="center"><img src="images/oracle-graphpipe-arch.jpg" width="90%"/></p>
<p align="center"><img src="images/oracle-graphpipe-arch.jpg" width="60%"/></p>

#### Features:


- Efficient client implementations in Go, Python, and Java.
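
A minimal sketch of a client call with the Python implementation (assuming the `graphpipe` package is installed and a GraphPipe model server is already listening on localhost:9000):

```python
import numpy as np
from graphpipe import remote  # assumption: pip install graphpipe

# A dummy NCHW image batch; the shape must match the served model's input.
batch = np.random.rand(1, 3, 224, 224).astype(np.float32)

# Serializes the tensor via flatbuffers, POSTs it, and decodes the reply.
prediction = remote.execute("http://127.0.0.1:9000", batch)
print(prediction.shape)
```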

## Model Training / Inference Optimizations

### TensorFlow XLA (Accelerated Linear Algebra)

Expand Down Expand Up @@ -492,6 +508,25 @@ to any order.

- **Hardware Optimizations**: ONNX makes it easier for optimizations to reach more developers. Any tool that exports ONNX models can benefit from ONNX-compatible runtimes and libraries designed to maximize performance on some of the best hardware in the industry.
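
To make the interchange concrete, a minimal sketch (assuming PyTorch, torchvision, and the `onnx` package) that exports a model to ONNX and validates the resulting graph:

```python
import torch
import torchvision
import onnx

# Export a model to the ONNX format via tracing.
model = torchvision.models.resnet18(weights=None).eval()
dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy, "resnet18.onnx",
                  input_names=["input"], output_names=["logits"],
                  opset_version=11)

# Any ONNX-compatible runtime can now load the file; here we just check it.
onnx.checker.check_model(onnx.load("resnet18.onnx"))
```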

### Neural Network Distiller (Intel AI Lab)

> Distiller is an open-source Python package for neural network compression research. Network compression can reduce the footprint of a neural network, increase its inference speed, and save energy. Distiller provides a PyTorch environment for prototyping and analyzing compression algorithms, such as sparsity-inducing methods and low-precision arithmetic.

| [__homepage__](https://nervanasystems.github.io/distiller/index.html) | [__github__](https://github.com/NervanaSystems/distiller/) | [__documentation__](https://nervanasystems.github.io/distiller/usage/index.html) |

#### Workflow:

<p align="center"><img src="images/intel-distiller-arch.png" width="70%"/></p>

#### Components:

- A framework for integrating pruning, regularization and quantization algorithms.

- A set of tools for analyzing and evaluating compression performance.

- Example implementations of state-of-the-art compression algorithms.
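
To ground the idea, a toy PyTorch sketch of one-shot magnitude pruning, the simplest of the sparsity-inducing methods Distiller implements (illustrative only, not Distiller's own API):

```python
import torch
import torch.nn as nn

def magnitude_prune(model: nn.Module, sparsity: float) -> None:
    """Zero out the smallest-magnitude weights in each Linear/Conv2d layer."""
    for module in model.modules():
        if isinstance(module, (nn.Linear, nn.Conv2d)):
            w = module.weight.data
            k = int(w.numel() * sparsity)
            if k == 0:
                continue
            # The k-th smallest absolute value becomes the pruning threshold.
            threshold = w.abs().flatten().kthvalue(k).values
            w.mul_((w.abs() > threshold).float())

model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
magnitude_prune(model, sparsity=0.7)  # ~70% of the weights set to zero
```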

### AMC - AutoML for Model Compression engine

> We propose AutoML for Model Compression (AMC), which leverages [reinforcement learning](https://en.wikipedia.org/wiki/Reinforcement_learning) to provide the model compression policy. This learning-based compression policy outperforms conventional rule-based compression policies, achieving a higher compression ratio while better preserving accuracy and freeing human labor.
batch size during training
- Input pipeline optimization: dataset sharding and caching, prefetch, fused JPEG decoding and cropping, parallel data parsing (see the sketch after this list)
- Communication: 2D gradient summation
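
The input-pipeline bullet above maps directly onto tf.data; a hedged sketch (the file pattern and feature names are hypothetical) combining sharding, caching, parallel parsing, fused decode-and-crop, and prefetch:

```python
import tensorflow as tf

def parse_example(record):
    feats = tf.io.parse_single_example(record, {
        "image": tf.io.FixedLenFeature([], tf.string),
        "label": tf.io.FixedLenFeature([], tf.int64),
    })
    # Fused JPEG decoding and cropping in one kernel.
    image = tf.io.decode_and_crop_jpeg(feats["image"], [0, 0, 224, 224])
    return tf.cast(image, tf.float32), feats["label"]

def make_dataset(file_pattern, num_workers, worker_index, batch_size):
    ds = tf.data.Dataset.list_files(file_pattern, shuffle=True)
    ds = ds.shard(num_workers, worker_index)              # dataset sharding
    ds = ds.interleave(tf.data.TFRecordDataset, cycle_length=8,
                       num_parallel_calls=tf.data.AUTOTUNE)
    ds = ds.cache()                                       # cache raw records
    ds = ds.map(parse_example,                            # parallel data parsing
                num_parallel_calls=tf.data.AUTOTUNE)
    ds = ds.batch(batch_size, drop_remainder=True)
    return ds.prefetch(tf.data.AUTOTUNE)                  # overlap with training
```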

## AI Infrastructures / Machine Learning Systems Lectures

#### CSE 599W Systems for ML (University of Washington)


| [__link__](https://pooyanjamshidi.github.io/mls/) | [__github__](https://github.com/pooyanjamshidi/mls) | [__materials__](https://pooyanjamshidi.github.io/mls/lectures/) |

## AI Infrastructures / Machine Learning Systems Conferences

#### [SysML - Conference on Systems and Machine Learning @ Stanford](https://www.sysml.cc/)

Binary file added images/intel-distiller-arch.png
Binary file added images/logo.png
Binary file added images/ucb-ray-arch.png
