v0.0.11: SDXL, Llama v2 training and inference, Inf2-powered TGI


SDXL Export and Inference

The Optimum CLI now supports compiling the components of the SDXL pipeline for inference on Neuron devices (inf2/trn1).

Below is an example of compiling SDXL models. You can compile either on an inf2 instance (inf2.8xlarge or larger recommended) or on a CPU-only instance (in which case, disable validation with --disable-validation):

optimum-cli export neuron --model stabilityai/stable-diffusion-xl-base-1.0 --task stable-diffusion-xl --batch_size 1 --height 1024 --width 1024 --auto_cast matmul --auto_cast_type bf16 sdxl_neuron/
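
For a CPU-only instance, the same export can be run with validation disabled, as mentioned above. This variant is a sketch; only the --disable-validation flag is added:

optimum-cli export neuron --model stabilityai/stable-diffusion-xl-base-1.0 --task stable-diffusion-xl --batch_size 1 --height 1024 --width 1024 --auto_cast matmul --auto_cast_type bf16 --disable-validation sdxl_neuron/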

You can then run inference with the NeuronStableDiffusionXLPipeline class:

from optimum.neuron import NeuronStableDiffusionXLPipeline

# Load the pre-compiled pipeline; device_ids=[0, 1] places it on two Neuron cores.
stable_diffusion_xl = NeuronStableDiffusionXLPipeline.from_pretrained(
    model_id="sdxl_neuron/", device_ids=[0, 1]
)

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = stable_diffusion_xl(prompt).images[0]
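
The pipeline returns standard PIL images, so the result can be saved directly; the filename below is just an illustrative example:

image.save("astronaut_jungle.png")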

Llama v1, v2 Inference

  • Add support for Llama inference through NeuronModelForCausalLM by @dacorvo in #223
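
As a quick sketch of what this enables (the checkpoint name and the compilation and generation parameters below are illustrative assumptions, not values taken from this release):

from optimum.neuron import NeuronModelForCausalLM
from transformers import AutoTokenizer

# Export and compile a Llama checkpoint for Neuron on the fly.
# batch_size, sequence_length and num_cores are example compilation settings.
model = NeuronModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    export=True,
    batch_size=1,
    sequence_length=2048,
    num_cores=2,
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# Run generation on the Neuron device.
inputs = tokenizer("What is deep learning?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])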

Llama v2 Training

TGI

Major bugfixes

Other changes

Full Changelog: v0.0.10...v0.0.11