Full-stack infrastructure software, from PyTorch to GPUs, for the LLM era.
Decouple AI infrastructure from specific hardware vendors.
Virtualize all GPUs/NPUs in a cluster for higher utilization and failover.
Scale to thousands of GPUs/NPUs with automatic parallelization and optimization.
Support any multi-billion- or multi-trillion-parameter model for training and serving.
🚀 Designed to unlock the full potential of your AI infrastructure!
The ambre-models repository is designed to work with a cluster where the MoAI Platform is installed. To test these scripts, please contact us.
Run the training script to fine-tune the model. For example, to fine-tune the internlm2_5-20b-chat model:
bash finetuning_codes/scripts/train_internlm.py
The MoAI Platform also supports deploying inference servers for your model.
- Run the script to deploy the model:
bash inference_codes/scripts/change_model.sh
- Select the model number:
Checking agent server status...
Agent server is normal
┌───── Current Server Info ────┐
│ Model : internlm2_5-20b-chat │
│ LoRA : False                 │
│ Checkpoint :                 │
│ Server Status : NORMAL       │
└──────────────────────────────┘
========== Supported Model List ==========
1. Meta-Llama-3-70B-Instruct
2. internlm2_5-20b-chat
==========================================
Select Model Number [1-2/q/Q]:
- Check the server status:
bash inference_codes/scripts/check_server.sh
Example output:
2024-10-15 15:34:28.736 | INFO | __main__:check_server:38 - Checking agent server status...
2024-10-15 15:34:28.754 | INFO | __main__:check_server:41 - Agent server is normal
┌───── Current Server Info ────┐
│ Model : internlm2_5-20b-chat │
│ LoRA : False                 │
│ Checkpoint :                 │
│ Server Status : NORMAL       │
└──────────────────────────────┘
- Chat with your model locally or build a chat platform using the API:
bash inference_codes/scripts/chat.sh
Example chat:
[INFO] Type 'quit' to exit
Prompt: hello
================================================================================
Assistant: Hello! How can I assist you today?
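If you want to automate the deployment steps above, for example to wait for the server to come up after switching models, the status box printed by `check_server.sh` can be parsed from Python. The following is a minimal sketch, not part of the repository. The helper names (`parse_server_info`, `parse_model_list`, `run_check_server`, `wait_until_normal`) are hypothetical, and it assumes the script output keeps the box layout shown in the example output. `NORMAL` is the only status value shown here, so any other value is simply treated as "not ready".

```python
import re
import subprocess
import time
from typing import Callable, Dict

# One row of the status box, e.g. "│ Model : internlm2_5-20b-chat │".
ROW = re.compile(r"│\s*([^:│]+?)\s*:\s*([^│]*?)\s*│")
# One entry of the model menu, e.g. "2. internlm2_5-20b-chat".
MODEL = re.compile(r"(\d+)\.\s+(\S+)")


def parse_server_info(text: str) -> Dict[str, str]:
    """Turn the 'Current Server Info' box into a {field: value} dict."""
    return {m.group(1): m.group(2) for m in ROW.finditer(text)}


def parse_model_list(text: str) -> Dict[int, str]:
    """Extract {number: model name} from the 'Supported Model List' menu."""
    # Only look at the text after the menu header; empty if the header is absent.
    _, _, menu = text.partition("Supported Model List")
    return {int(num): name for num, name in MODEL.findall(menu)}


def run_check_server(script: str = "inference_codes/scripts/check_server.sh") -> str:
    """Run the status script and return everything it printed.

    Assumption: the output may go to stdout or stderr, so both are kept.
    """
    proc = subprocess.run(["bash", script], capture_output=True, text=True, check=True)
    return proc.stdout + proc.stderr


def wait_until_normal(fetch: Callable[[], str] = run_check_server,
                      retries: int = 10, delay: float = 5.0) -> Dict[str, str]:
    """Poll until 'Server Status' reads NORMAL, then return the parsed fields."""
    for attempt in range(retries):
        info = parse_server_info(fetch())
        if info.get("Server Status") == "NORMAL":
            return info
        if attempt < retries - 1:
            time.sleep(delay)  # wait before the next check
    raise TimeoutError(f"server not NORMAL after {retries} checks")
```

Calling `wait_until_normal()` from the repository root after running `change_model.sh` would block until the server reports `NORMAL` or the retry budget runs out. The `fetch` argument is injectable so the parsing logic can be exercised against captured output without a cluster.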
This repository supports training and serving of multi-billion- and multi-trillion-parameter models. The following models are currently available:
- Meta-Llama-3-70B-Instruct
- InternLM 2.5-20B Chat
Additional models will be added in future updates. Stay tuned for more!
Section | Description |
---|---|
Portal | Overview of technologies and company |
Documentation | Detailed explanation of technology and tutorial |
ModelHub | Chatbot using the MoAI Platform solution |