LLM, CUDA/Systems, Distributed Training
KAUST (King Abdullah University of Science and Technology)
Pinned
- Tiny-DeepSpeed: a minimalistic re-implementation of the DeepSpeed library (see the ZeRO-style sketch below)
- Tiny-Megatron: a minimalistic re-implementation of the Megatron library (Python; see the tensor-parallel sketch below)
- TheCoreTeam/core_scheduler: CoreScheduler, a high-performance scheduler for large model training
- Flash-Attention-Implementation: an implementation of Flash-Attention (both forward and backward) with PyTorch, CUDA, and Triton (Python; see the tiled-attention sketch below)
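Tiny-DeepSpeed's one-line description doesn't say which DeepSpeed pieces it re-implements. DeepSpeed is best known for ZeRO-style partitioning of optimizer state across data-parallel ranks, so here is a minimal single-process sketch of that idea. Everything in it (the `ShardedSGD` name, the round-robin sharding, the SGD-with-momentum choice) is a hypothetical illustration, not code from the repo.

```python
import torch


class ShardedSGD:
    """Sketch of ZeRO-1-style sharding (assumed, not from Tiny-DeepSpeed):
    each rank keeps optimizer state (here: momentum buffers) only for its
    own shard of the parameter list, cutting state memory by ~world_size."""

    def __init__(self, params, rank, world_size, lr=0.01, momentum=0.9):
        self.params = list(params)
        self.lr, self.momentum = lr, momentum
        # Round-robin assignment of parameters to ranks.
        self.owned = [i for i in range(len(self.params)) if i % world_size == rank]
        # Momentum buffers exist only for owned parameters.
        self.buf = {i: torch.zeros_like(self.params[i]) for i in self.owned}

    @torch.no_grad()
    def step(self):
        for i in self.owned:
            p = self.params[i]
            if p.grad is None:
                continue
            self.buf[i].mul_(self.momentum).add_(p.grad)
            p.add_(self.buf[i], alpha=-self.lr)
            # In a real multi-process run, the owner would then broadcast
            # the updated parameter to the other ranks.
```

In full ZeRO the gradients are also reduce-scattered before the step and parameters all-gathered after it; that communication is elided here.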
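Megatron's signature technique is tensor (model) parallelism, for example splitting a linear layer's weight along the output dimension so each rank computes a slice of the output. The sketch below simulates two "ranks" in one process to show the idea; the class name and constructor are hypothetical, not Tiny-Megatron's API, and the all-gather collective is elided.

```python
import torch
import torch.nn as nn


class ColumnParallelLinear(nn.Module):
    """Holds only a 1/world_size slice of the full weight, split along the
    output dimension; each rank computes its slice of the output."""

    def __init__(self, full_weight, rank, world_size):
        super().__init__()
        out_features = full_weight.shape[0]
        assert out_features % world_size == 0
        shard = out_features // world_size
        # This rank's rows of the full (out_features, in_features) weight.
        self.weight = nn.Parameter(full_weight[rank * shard:(rank + 1) * shard].clone())

    def forward(self, x):
        # Local partial output; a real implementation all-gathers these
        # slices (or feeds them straight into a row-parallel layer).
        return x @ self.weight.t()


# Two simulated "ranks": their outputs concatenate to exactly what a
# single full linear layer would produce.
full = torch.randn(8, 4)   # (out_features, in_features)
x = torch.randn(2, 4)
parts = [ColumnParallelLinear(full, r, 2)(x) for r in range(2)]
assert torch.allclose(torch.cat(parts, dim=-1), x @ full.t())
```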
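The Flash-Attention repo's description names the algorithm directly: attention computed tile by tile with an online softmax, so the full (seq, seq) score matrix is never materialized. Below is a plain-PyTorch sketch of that recurrence for intuition only; the repo's CUDA/Triton kernels fuse this into single passes, and none of its actual API is reproduced here.

```python
import torch


def tiled_attention(q, k, v, block=64):
    """Online-softmax attention over key/value tiles. Shapes: (seq, d).
    Keeps a running row max and row sum instead of the full score matrix."""
    seq, d = q.shape
    scale = d ** -0.5
    out = torch.zeros_like(q)
    row_max = torch.full((seq, 1), float("-inf"))
    row_sum = torch.zeros(seq, 1)
    for start in range(0, seq, block):
        kb = k[start:start + block]           # key tile, (b, d)
        vb = v[start:start + block]           # value tile, (b, d)
        s = (q @ kb.t()) * scale              # scores for this tile, (seq, b)
        new_max = torch.maximum(row_max, s.max(dim=-1, keepdim=True).values)
        # Rescale previous accumulators to the new running max.
        correction = torch.exp(row_max - new_max)
        p = torch.exp(s - new_max)
        out = out * correction + p @ vb
        row_sum = row_sum * correction + p.sum(dim=-1, keepdim=True)
        row_max = new_max
    return out / row_sum


# Sanity check against naive attention on small inputs.
q, k, v = (torch.randn(128, 16) for _ in range(3))
ref = torch.softmax((q @ k.t()) * 16 ** -0.5, dim=-1) @ v
assert torch.allclose(tiled_attention(q, k, v), ref, atol=1e-5)
```

The same recurrence run in reverse order underlies the backward pass, which the repo also implements.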