NeMo NLP/LLM Collection

The NeMo NLP/LLM Collection supports in-demand community large language models (LLMs) as well as NVIDIA's top LLM offerings. The collection is built on Megatron Core and is highly optimized, so NeMo users can train foundation models across thousands of GPUs and fine-tune LLMs with techniques such as supervised fine-tuning (SFT) and parameter-efficient fine-tuning (PEFT). Through the Transformer Engine library, the collection supports FP8 workloads on Hopper H100 GPUs. We also prioritize TensorRT-LLM (TRT-LLM) export for the released models, which can accelerate inference by 2-3x depending on the model size. The LLM collection currently supports the following models:

BERT
GPT-style models
Falcon
Code Llama 7B
Mistral
Mixtral

Our documentation covers each supported model in detail to help you integrate and use it in your projects.
