Welcome to Tiny-DeepSpeed, a minimal re-implementation of the DeepSpeed library. This project provides a simple, easy-to-understand codebase that helps learners and developers grasp the core functionality of DeepSpeed, a powerful library for accelerating deep learning training.
Give this repo a ⭐ if it helps you.
If you have any questions, feel free to reach out: open an issue or email me at [email protected].
This project is highly inspired by CoreScheduler, a high-performance scheduler for large model training.
Before you begin, ensure you have the following installed:
- Python 3.11
- PyTorch 2.3.1 (built with CUDA)
- Triton 2.3.1
Clone this repository to your local machine:
```bash
git clone https://github.com/liangyuwang/Tiny-DeepSpeed.git
cd Tiny-DeepSpeed
```
To run the Tiny-DeepSpeed demos, use the following commands (replace num_device with the number of devices on your machine):
```bash
# Single Device
python example/single_device/train.py

# DDP mode
torchrun --nproc_per_node num_device --nnodes 1 example/ddp/train.py

# Zero1 mode
torchrun --nproc_per_node num_device --nnodes 1 example/zero1/train.py

# Zero2 mode
torchrun --nproc_per_node num_device --nnodes 1 example/zero2/train.py
```
Each command launches a simple training loop built on the Tiny-DeepSpeed framework; a sketch of what such a loop looks like follows.
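For orientation, the distributed examples follow the standard torchrun + PyTorch pattern. The sketch below is a minimal stand-in using plain PyTorch DDP, not Tiny-DeepSpeed's actual API; the model, loss, and hyperparameters are placeholders:

```python
# Minimal sketch of the kind of loop example/ddp/train.py runs (single node
# assumed, launched via torchrun). Plain PyTorch DDP stands in for
# Tiny-DeepSpeed's own wrappers.
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")  # torchrun provides rank/world size
    rank = dist.get_rank()                   # equals local rank on one node
    torch.cuda.set_device(rank)

    model = DDP(torch.nn.Linear(1024, 1024).cuda(rank), device_ids=[rank])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(10):
        x = torch.randn(8, 1024, device=f"cuda:{rank}")
        loss = model(x).pow(2).mean()  # dummy objective
        loss.backward()                # DDP all-reduces gradients here
        optimizer.step()
        optimizer.zero_grad()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```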
Feel free to try our demo online in a Kaggle Notebook.
- Simplified Codebase: Stripped down to the essential components to facilitate learning and experimentation with DeepSpeed.
- Meta Device Model Initialization: Loads model parameters on a meta device, avoiding actual parameter initialization and reducing startup memory usage (see the first sketch after this list).
- Parameter Distribution via Cache Rank Map: Implements a cache rank map table that distributes model parameters across ranks. Each parameter is assigned a rank ID based on the number of participants, allowing efficient, targeted initialization (see the second sketch after this list).
- Scalability and Flexibility: Demonstrates basic principles of distributed training and parameter management that can be scaled up for more complex implementations.
- Educational Tool: Serves as a practical guide for those new to model optimization and distributed computing in machine learning.
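To make the meta-device idea concrete, here is a small self-contained PyTorch sketch (illustrative only, not Tiny-DeepSpeed's internal code). Constructing a module under `torch.device("meta")` records only shapes and dtypes; `to_empty()` later materializes uninitialized storage on the real device:

```python
import torch
import torch.nn as nn

# Build the model on the meta device: no parameter memory is allocated,
# only shapes and dtypes are recorded.
with torch.device("meta"):
    model = nn.Sequential(nn.Linear(4096, 4096), nn.GELU(), nn.Linear(4096, 4096))

print(model[0].weight.device)  # meta

# Materialize uninitialized storage on the target device; each rank can then
# initialize just the parameters it is responsible for.
model = model.to_empty(device="cuda")
```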
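The cache rank map itself can be pictured as a mapping from parameter name to owner rank. The helper below is a hypothetical round-robin illustration (`build_rank_map` is not a function from this repo):

```python
import torch.nn as nn

def build_rank_map(model: nn.Module, world_size: int) -> dict[str, int]:
    # Hypothetical helper: assign each parameter an owner rank round-robin,
    # so each parameter is initialized and managed by exactly one rank.
    return {name: i % world_size
            for i, (name, _) in enumerate(model.named_parameters())}

model = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 8))
print(build_rank_map(model, world_size=2))
# -> {'0.weight': 0, '0.bias': 1, '1.weight': 0, '1.bias': 1}
```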
- Single Device
- DDP
- Zero1
- Zero2
- Zero3
- AMP support
- Compute-communication overlap
- Meta initialization
- Multi-node support
- Communication bucketing