SmoothQuant original conversion script

This script converts an OPT or Bloom 🤗 Transformers model to a "smoothed" version, as described in the paper "SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models".

$ python smoothquant.py --model facebook/opt-1.3b --save-path smoothed-models/facebook/opt-1.3b
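For readers unfamiliar with the technique, the following is a minimal sketch of the smoothing step from the SmoothQuant paper, not the code in smoothquant.py. For a linear layer Y = XW, each input channel j gets a scale s_j = max|X_j|^alpha / max|W_j|^(1 - alpha); activations are divided by s_j and the matching weight row is multiplied by s_j, so the output is unchanged while activation outliers shrink. The helper names (matmul, smoothing_scales, smooth) and the toy values are assumptions made for illustration, and the sketch uses plain Python lists rather than tensors.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def smoothing_scales(x, w, alpha=0.5):
    """Per-input-channel scales s_j = max|X_j|**alpha / max|W_j|**(1 - alpha).

    x has shape (tokens, in_features); w has shape (in_features, out_features).
    Assumes every channel has a non-zero activation and weight maximum.
    """
    channels = range(len(w))
    act_max = [max(abs(row[j]) for row in x) for j in channels]
    w_max = [max(abs(v) for v in w[j]) for j in channels]
    return [a ** alpha / m ** (1 - alpha) for a, m in zip(act_max, w_max)]


def smooth(x, w, alpha=0.5):
    """Return (x / s, s * w): mathematically the same product, flatter activations."""
    s = smoothing_scales(x, w, alpha)
    x_s = [[v / s[j] for j, v in enumerate(row)] for row in x]
    w_s = [[v * s[j] for v in w[j]] for j in range(len(w))]
    return x_s, w_s


# Toy input: channel 1 of the activations carries large outliers.
x = [[1.0, 100.0], [-2.0, 50.0]]
w = [[0.5, -1.0], [0.25, 0.5]]

x_s, w_s = smooth(x, w)
y, y_s = matmul(x, w), matmul(x_s, w_s)
```

Here y and y_s agree up to floating-point error, while the largest activation magnitude in x_s is much smaller than in x. In a real model the division by s is typically folded into the parameters of the preceding layer rather than applied at runtime.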

Note: because the script hard-codes assumptions about the model architecture, it only works for OPT models that apply the layer_norm before the attention (do_layer_norm_before=true in config.json). This covers all OPT models except facebook/opt-350m.
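The restriction in the note can be checked before running the conversion. The sketch below is a hypothetical pre-flight helper (is_supported is not part of smoothquant.py); it assumes the config.json of the checkpoint has already been loaded into a dictionary, and that a missing do_layer_norm_before key means the OPT default of true.

```python
import json
from pathlib import Path


def is_supported(config: dict) -> bool:
    """Return True if the model config matches what the note above describes.

    Hypothetical helper: Bloom is accepted as stated in the description,
    OPT only when the layer norm is applied before the attention.
    """
    model_type = config.get("model_type")
    if model_type == "bloom":
        return True
    if model_type == "opt":
        # Assumption: an absent key falls back to the OPT default (true).
        return bool(config.get("do_layer_norm_before", True))
    return False


def load_config(model_dir: str) -> dict:
    """Read config.json from a local checkpoint directory."""
    return json.loads(Path(model_dir, "config.json").read_text())


# Example with in-memory configs instead of files on disk.
pre_ln_opt = {"model_type": "opt", "do_layer_norm_before": True}
post_ln_opt = {"model_type": "opt", "do_layer_norm_before": False}
```

With these example dictionaries, is_supported(pre_ln_opt) is true and is_supported(post_ln_opt) is false, the latter being the facebook/opt-350m case.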