
# NeMo NLP/LLM Collection

The NeMo NLP/LLM Collection provides support for popular community large language models as well as NVIDIA's own LLM offerings. Built on Megatron Core, the collection is highly optimized, allowing NeMo users to train foundation models across thousands of GPUs and to fine-tune LLMs with techniques such as supervised fine-tuning (SFT) and parameter-efficient fine-tuning (PEFT). Through the Transformer Engine library, the collection supports FP8 workloads on Hopper H100 GPUs. The released models also support TensorRT-LLM export, which can accelerate inference by 2-3x depending on the model size. The models currently supported within the LLM collection are listed below, followed by illustrative fine-tuning and export sketches:

- BERT
- GPT-style models
- Falcon
- Code Llama 7B
- Mistral
- Mixtral
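
As a sketch of the PEFT workflow mentioned above: fine-tuning amounts to restoring a base checkpoint and attaching adapter weights before training. The example below assumes NeMo 1.x-style APIs (`MegatronGPTSFTModel`, `LoraPEFTConfig`) and a model config that already carries the LoRA tuning fields; exact class names, config fields, and trainer setup vary across releases, and real runs are typically driven by the Hydra configs shipped with NeMo rather than a hand-built trainer.

```python
import pytorch_lightning as pl

from nemo.collections.nlp.models.language_modeling.megatron_gpt_sft_model import (
    MegatronGPTSFTModel,
)
from nemo.collections.nlp.parts.nlp_overrides import NLPDDPStrategy
from nemo.collections.nlp.parts.peft_config import LoraPEFTConfig

# Minimal single-GPU trainer; production PEFT runs use the Hydra configs
# shipped with NeMo (e.g. megatron_gpt_peft_tuning.py) instead.
trainer = pl.Trainer(
    devices=1,
    accelerator="gpu",
    strategy=NLPDDPStrategy(),
    precision="bf16-mixed",
    max_steps=100,
)

# Restore a base GPT-style checkpoint; the path is a placeholder.
model = MegatronGPTSFTModel.restore_from("/path/to/base_model.nemo", trainer=trainer)

# Attach LoRA adapters so only the small adapter weights are trained
# (assumes model.cfg contains the peft.lora_tuning section).
model.add_adapter(LoraPEFTConfig(model.cfg))

trainer.fit(model)
```

For the export path, recent NeMo releases expose a `TensorRTLLM` wrapper under `nemo.export`. The sketch below is likewise illustrative: paths are placeholders, and argument names (for example, the keyword that caps generation length) can differ between NeMo versions.

```python
from nemo.export import TensorRTLLM

# Build TensorRT-LLM engines from a .nemo checkpoint; paths are placeholders.
exporter = TensorRTLLM(model_dir="/tmp/trtllm_engine")
exporter.export(
    nemo_checkpoint_path="/path/to/model.nemo",
    model_type="llama",  # e.g. "llama", "falcon", "gptnext"
    n_gpus=1,
)

# Quick generation against the exported engine; the keyword name for the
# output-length cap differs between NeMo versions.
output = exporter.forward(["The NeMo LLM collection"], max_output_token=32)
print(output)
```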

The NeMo documentation covers each supported model in detail to help you integrate and use them in your projects.