GPTQ causes poorly generated text #540
Comments
cc @HDCharles, could you help take a look? I think the issue might be in the custom code.
@MDK8888 I'd also suggest doing a proper eval as well. Maybe you can evaluate the wikitext perplexity and compare the results before and after quantization?
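The perplexity comparison suggested above boils down to exponentiating the mean per-token negative log-likelihood the model assigns to a held-out corpus such as wikitext. A minimal sketch (the function name and the example NLL values are illustrative, not from the thread):

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp(mean per-token negative log-likelihood).

    token_nlls: per-token NLL values in nats, e.g. the cross-entropy
    losses a language model produces on each wikitext token.
    """
    nlls = list(token_nlls)
    return math.exp(sum(nlls) / len(nlls))

# Hypothetical comparison: a small perplexity increase after quantization
# is expected; a large jump usually points to a quantization bug.
fp_ppl = perplexity([1.74, 1.70, 1.78])   # full-precision model
q_ppl = perplexity([1.80, 1.76, 1.84])    # quantized model
```

In practice you would collect the NLLs by running the model over wikitext with a sliding window and feeding the per-token losses into a function like this.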
I doubt an eval would help as a first step here; per @MDK8888 the outputs were just garbage. That said, @MDK8888, it's quite difficult and unrealistic for us to debug numerics issues in custom quantization implementations, so is there any way you can create a minimal repro using the existing GPTQ implementation here in ao?
Hey, I will try to create a minimal repro over the weekend using the existing GPTQ implementation in ao and share the results here. @jerryzh168 @msaroufim thanks for responding!
Hey, sorry for the late response! I tried working with the existing GPTQ implementation in ao, but I was getting a bit confused by the MultiInputs. I have the repository with my progress linked here: https://github.com/MDK8888/GPTQTest - I will keep working on this throughout the week to try and get it to work.
Hey, I was able to fix the issue in GPTFast 0.3.1. As it turns out, the linear layers in a lot of transformers actually have a bias :)
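The fix above points at a classic pitfall: GPTQ quantizes only the weight matrix, so if the quantized forward pass forgets to re-apply the layer's bias, the output distribution shifts and generation degrades into repeated tokens. A pure-Python sketch of the idea (function names and the symmetric per-row scheme are illustrative, not the thread's actual code):

```python
def quantize_rows(weight, n_bits=4):
    """Symmetric per-row (per-output-channel) quantization of a weight
    matrix given as a list of rows. Returns integer codes and per-row
    scales. The bias is deliberately NOT touched: it must be kept in
    floating point and re-applied in the forward pass.
    """
    qmax = 2 ** (n_bits - 1) - 1
    codes, scales = [], []
    for row in weight:
        scale = max(abs(v) for v in row) / qmax or 1.0
        codes.append([round(v / scale) for v in row])
        scales.append(scale)
    return codes, scales

def quantized_linear(x, codes, scales, bias):
    """y = dequant(W) @ x + b. Passing zeros for `bias` here reproduces
    the kind of output shift the fix above addressed."""
    return [
        sum(q * s * xi for q, xi in zip(row, x)) + b
        for row, s, b in zip(codes, scales, bias)
    ]

weight = [[1.0, -2.0], [0.5, 0.25]]
bias = [10.0, -1.0]
codes, scales = quantize_rows(weight)
y = quantized_linear([1.0, 1.0], codes, scales, bias)
```

The design point is that the bias contributes at full precision regardless of how aggressively the weights are quantized, so dropping it introduces a large, systematic error rather than ordinary quantization noise.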
Hey, I'm the creator of GPTFast, which scales the techniques outlined in gpt-fast to more models. I use a combination of AutoGPTQ and the GPTQ quantization methods here when I quantize, but the quality of the generated text after quantization is poor, often repeating a single token many times. My quantization method is linked here: https://github.com/MDK8888/GPTFast/blob/master/GPTFast/Core/Quantize/GPTQ/Quantizers/GPTQModelQuantizer.py.