v0.0.11: SDXL, Llama v2 training and inference, Inf2-powered TGI
SDXL Export and Inference
The Optimum CLI now supports compiling components of the SDXL pipeline for inference on Neuron devices (inf2/trn1).
Below is an example of compiling SDXL models. You can compile either on an inf2 instance (inf2.8xlarge or larger recommended) or on a CPU-only instance (in that case, disable validation with `--disable-validation`):
```bash
optimum-cli export neuron --model stabilityai/stable-diffusion-xl-base-1.0 --task stable-diffusion-xl --batch_size 1 --height 1024 --width 1024 --auto_cast matmul --auto_cast_type bf16 sdxl_neuron/
```
Then run inference with the `NeuronStableDiffusionXLPipeline` class:
```python
from optimum.neuron import NeuronStableDiffusionXLPipeline

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
stable_diffusion_xl = NeuronStableDiffusionXLPipeline.from_pretrained(
    model_id="sdxl_neuron/", device_ids=[0, 1]
)
image = stable_diffusion_xl(prompt).images[0]
```
- Add sdxl exporter support by @JingyaHuang in #203
- Add Stable Diffusion XL inference support by @JingyaHuang in #212
Llama v1, v2 Inference
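Llama checkpoints can be exported and run through the `NeuronModelForCausalLM` class. A minimal sketch, assuming the usual optimum-neuron export arguments; the model id and compilation settings below are illustrative:

```python
# Minimal sketch, assuming the NeuronModelForCausalLM API; the model id and
# compilation arguments below are illustrative, not prescriptive.
from transformers import AutoTokenizer
from optimum.neuron import NeuronModelForCausalLM

# export=True compiles the checkpoint for Neuron cores on first load
model = NeuronModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    export=True,
    batch_size=1,
    num_cores=2,
    auto_cast_type="f16",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

inputs = tokenizer("What is deep learning?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```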
Llama v2 Training
- Llama v2 training support by @michaelbenayoun in #211
- Llama v1 training fix by @michaelbenayoun in #211
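For fine-tuning, a minimal sketch assuming the `NeuronTrainer`/`NeuronTrainingArguments` classes behave as drop-ins for their `transformers` counterparts; the dataset and hyperparameters are illustrative:

```python
# Minimal sketch, assuming NeuronTrainer is a drop-in for transformers.Trainer;
# dataset and hyperparameters are illustrative.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, DataCollatorForLanguageModeling
from optimum.neuron import NeuronTrainer, NeuronTrainingArguments

model_id = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id)

# Tokenize a small causal-LM dataset
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

training_args = NeuronTrainingArguments(
    output_dir="llama2-neuron",
    per_device_train_batch_size=1,
    bf16=True,
)
trainer = NeuronTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```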
TGI
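This release brings an Inf2-powered build of Text Generation Inference (TGI). Once a server is running, it speaks the standard TGI HTTP API; a minimal client sketch, where the host, port, and generation parameters are illustrative:

```python
# Minimal sketch of a client request against a running TGI endpoint;
# the host, port, and parameters below are illustrative.
import requests

response = requests.post(
    "http://localhost:8080/generate",
    json={
        "inputs": "What is deep learning?",
        "parameters": {"max_new_tokens": 64},
    },
)
print(response.json()["generated_text"])
```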
Major bugfixes
- `neuron_parallel_compile`, `ParallelLoader` and Zero-1 fixes for torchneuron 8+ by @michaelbenayoun in #200
- flan-t5 fix: `T5Parallelizer`, `NeuronCacheCallback` and `NeuronHash` refactors by @michaelbenayoun in #207
- Fix `optimum-cli` broken by the optimum 1.13.0 release by @JingyaHuang in #217
Other changes
- Bump Inference APIs to Neuron 2.13 by @JingyaHuang in #206
- Add a log for SD when applying optimized attention & lazy loading for pipelines by @JingyaHuang in #208
- Cancel concurrent CIs for inference by @JingyaHuang in #218
- fix(tgi): typer does not support Union types by @dacorvo in #219
- Bump neuron-cc version to 1.18.* by @JingyaHuang in #224
Full Changelog: v0.0.10...v0.0.11