Cannot use prefix tuning on quantized Codellama #2035
Comments
Same issue. Any progress here?
Thanks for reporting. Yes, this is a known issue: recent transformers versions introduced a kv-cache to some model architectures, and that change breaks prefix tuning. We have a long discussion in #869 which also mentions some workarounds. If it is an option for you, you could also try an older transformers version (e.g. 4.36.0 or older should work). At the moment, I'm still figuring out how we can best make these recent transformers changes compatible with prefix tuning, but unfortunately it's not an easy thing to fix.
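A rough sketch of the older-transformers workaround, assuming a pip-based setup; the runtime guard below is illustrative and not part of PEFT itself, with the version bound taken from the comment above:

```python
# Workaround sketch: pin transformers to a release that predates the kv-cache
# changes, e.g.  pip install "transformers<=4.36.0"
# The runtime check below is illustrative only.
from packaging import version

import transformers

if version.parse(transformers.__version__) > version.parse("4.36.0"):
    raise RuntimeError(
        f"transformers {transformers.__version__} may break prefix tuning on "
        "quantized models; see the workarounds discussed in #869."
    )
```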
Thanks for your quick reply, @BenjaminBossan. For example, on Qwen2-1.5B and alpaca-cleaned, prefix tuning yields ~10, while p-tuning yields ~1. Do you have any ideas about this phenomenon?
Sorry, I don't have a lot of practical experience with these prompt tuning methods; maybe others can give some advice. Since the difference is so large, I would not exclude the possibility that there is a bug. Do you see the training loss decrease? Did you try varying the hyperparameters? It could be worth a try to not use the workaround and instead check out an older transformers version. If you see much better scores there, it is very likely that there is a bug in the workaround.
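For a side-by-side check of the two methods, a minimal sketch; the model name and num_virtual_tokens are illustrative, not taken from the report above:

```python
# Compare prefix tuning and p-tuning on the same base model; everything below
# is illustrative -- adjust the model name, token count, and training loop as needed.
from transformers import AutoModelForCausalLM
from peft import PrefixTuningConfig, PromptEncoderConfig, TaskType, get_peft_model

base = "Qwen/Qwen2-1.5B"

configs = {
    "prefix-tuning": PrefixTuningConfig(task_type=TaskType.CAUSAL_LM, num_virtual_tokens=20),
    "p-tuning": PromptEncoderConfig(task_type=TaskType.CAUSAL_LM, num_virtual_tokens=20),
}

for name, cfg in configs.items():
    model = AutoModelForCausalLM.from_pretrained(base)
    peft_model = get_peft_model(model, cfg)
    print(name)
    peft_model.print_trainable_parameters()  # confirm only the prompt parameters are trainable
```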
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
System Info
I'm trying to use PEFT with quantized LLMs. Prompt tuning, LoRA, and IA3 all work, but when I use prefix tuning on 8-bit codellama-7b-hf, it reports the following error:
Who can help?
@BenjaminBossan @sayakpaul @tmm1
Information
Tasks
An officially supported task in the examples folder
Reproduction
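The original script is not included in the report; the following is a minimal sketch of the kind of setup described, where the 8-bit loading via bitsandbytes and the number of virtual tokens are assumptions:

```python
# Illustrative sketch only -- not the reporter's original script.
# Load CodeLlama-7b in 8-bit with bitsandbytes and attach prefix tuning.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PrefixTuningConfig, TaskType, get_peft_model

model_id = "codellama/CodeLlama-7b-hf"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
    torch_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

peft_config = PrefixTuningConfig(task_type=TaskType.CAUSAL_LM, num_virtual_tokens=30)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
# With recent transformers versions, the failure described above shows up once
# training/forward passes run with the prefix key-value cache.
```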
Expected behavior
I want to fine-tune 8-bit codellama-7b with prefix tuning.