modules_to_save Incorrect Overlap in Multiple LoRA Adapters #2206
Comments
Thanks a lot for reporting this. Indeed, the handling of modules_to_save across multiple adapters does not look right here.
No worries, glad to be of help. As far as I have tested, the model still uses the correct loaded layer; the only problem is redundancy in the loaded modules. I also dug a bit deeper and noticed that the problem originates from this function: Line 966 in 162d7e5
For an unknown reason, when using load_adapter (Line 969 in 162d7e5), the set is not updated to contain only the new layer; it still holds the old layer as well (which it shouldn't). For example, if I manually patch the script above, the problem is solved: ...
# Apply and save the second adapter
os.makedirs(adapter_2_path, exist_ok=True)
model_with_lora_2 = get_peft_model(base_model, lora_config_2, adapter_name="adapter_2")
model_with_lora_2.save_pretrained(adapter_2_path)
# Load a fresh base model and wrap it in PeftModel by loading the first adapter
base_model = AutoModelForCausalLM.from_pretrained("gpt2")
peft_model = PeftModel.from_pretrained(base_model, os.path.join(adapter_1_path, "adapter_1"), adapter_name="adapter_1")
peft_model.modules_to_save = {"wte"} # <----------- HERE manually changing the modules_to_save
# Load the second adapter into the PeftModel
peft_model.load_adapter(os.path.join(adapter_2_path, "adapter_2"), adapter_name="adapter_2")
...
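For reference, one way to check which adapter copies each wrapper actually holds after loading both adapters is to inspect the ModulesToSaveWrapper instances directly; a rough sketch (module and adapter names follow the script above):
# Sketch: list the adapters that received a copy inside each wrapper.
# The class-name check avoids relying on a specific import path.
for name, module in peft_model.named_modules():
    if module.__class__.__name__ == "ModulesToSaveWrapper":
        print(name, "->", list(module.modules_to_save.keys()))
# With the bug, the lm_head wrapper lists both adapter_1 and adapter_2;
# after the fix it should list only adapter_1.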
Resolves huggingface#2206 NOT READY TO MERGE YET. Tentative solution to that issue. The problem is that we keep a "global" modules_to_save on the model which contains all possible modules_to_save entries for every adapter. When the first adapter targets layer "foo" with modules_to_save and the second adapter targets "bar", then "foo" will get a copy of the original module for the second adapter as well, even though it is not needed. This does not change the result, but it is unnecessary and takes up memory, so it should be avoided. TODO: Tests.
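The effect can be illustrated with a small standalone sketch (this is only an illustration of the shared-set behavior, not PEFT's actual implementation):
# Conceptual sketch only -- not PEFT's internal code.
# A single set shared across adapters means each newly added adapter
# gets a copy for every module accumulated so far, not just its own.
shared_modules_to_save = set()
copies = {}

def add_adapter(adapter_name, modules_to_save):
    shared_modules_to_save.update(modules_to_save)
    for module_name in shared_modules_to_save:
        copies[(module_name, adapter_name)] = "copy of " + module_name

add_adapter("adapter_1", {"lm_head"})
add_adapter("adapter_2", {"wte"})
print(sorted(copies))
# [('lm_head', 'adapter_1'), ('lm_head', 'adapter_2'), ('wte', 'adapter_2')]
# The ('lm_head', 'adapter_2') copy is the redundant one described above.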
Okay, I managed to reproduce the error. My tentative fix is in #2220. Right now there is a CI issue, but it should hopefully resolve itself soon. Meanwhile, it would be great if you could check whether the fix makes sense to you. Just a side note: when adding multiple adapters, don't call get_peft_model more than once on the same model; add the additional adapters with add_adapter or load_adapter instead.
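A sketch of that recommended pattern, with hypothetical LoRA configs mirroring the ones discussed here (c_attn as the LoRA target is an assumption):
# Sketch of the recommended multi-adapter workflow; the LoRA settings
# below are assumptions, not the original configs.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

lora_config_1 = LoraConfig(target_modules=["c_attn"], modules_to_save=["lm_head"])
lora_config_2 = LoraConfig(target_modules=["c_attn"], modules_to_save=["wte"])

base_model = AutoModelForCausalLM.from_pretrained("gpt2")
# Wrap the base model once, creating the first adapter.
peft_model = get_peft_model(base_model, lora_config_1, adapter_name="adapter_1")
# Add further adapters on the existing PeftModel instead of calling
# get_peft_model a second time.
peft_model.add_adapter("adapter_2", lora_config_2)
# For adapters already saved to disk, peft_model.load_adapter(path, adapter_name=...) works as well.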
@saeid93 did you have the opportunity to test this fix on your original use case?
@BenjaminBossan sorry, I forgot to check this one. However, I just checked it after your message with a fresh installation of peft from your draft pull request, and the problem I had earlier no longer appears! Thank you for fixing this one. The wrapped modules now look like this:
ModulesToSaveWrapper(
  (original_module): Embedding(50257, 768)
  (modules_to_save): ModuleDict(
    (adapter_2): Embedding(50257, 768)
  )
)
ModulesToSaveWrapper(
  (original_module): Linear(in_features=768, out_features=50257, bias=False)
  (modules_to_save): ModuleDict(
    (adapter_1): Linear(in_features=768, out_features=50257, bias=False)
  )
)
Great, thanks for testing!
System Info
Python 3.11.9
transformers==4.40.2
peft==0.11.2
Who can help?
@BenjaminBossan
A bug occurs in the PEFT library when using multiple LoRA adapters, each with a unique modules_to_save configuration. The issue arises when the modules_to_save from the first LoRA adapter (e.g., adapter_1) is applied to subsequent adapters (e.g., adapter_2), rather than maintaining independent configurations. As a result, modules specified in modules_to_save for adapter_1 also appear in adapter_2, leading to unintended behavior and possibly affecting fine-tuning accuracy. This incorrect handling of modules_to_save causes duplicate entries where only the respective LoRA adapter's modules should be saved.
Reproduction
The following example code demonstrates this issue, displaying the model structure where adapter_2 contains modules meant only for adapter_1.
Example Code
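A minimal reproducer along these lines triggers the behavior; the exact LoRA hyperparameters and output paths below are assumptions, while the module and adapter names follow the fragments quoted above:
# Sketch of a reproducer; LoRA settings and paths are assumed, not the
# original script.
import os
from transformers import AutoModelForCausalLM
from peft import LoraConfig, PeftModel, get_peft_model

adapter_1_path = "adapter_1_out"
adapter_2_path = "adapter_2_out"

# adapter_1 keeps lm_head trainable, adapter_2 keeps wte trainable.
lora_config_1 = LoraConfig(target_modules=["c_attn"], modules_to_save=["lm_head"])
lora_config_2 = LoraConfig(target_modules=["c_attn"], modules_to_save=["wte"])

# Apply and save the first adapter.
base_model = AutoModelForCausalLM.from_pretrained("gpt2")
model_with_lora_1 = get_peft_model(base_model, lora_config_1, adapter_name="adapter_1")
os.makedirs(adapter_1_path, exist_ok=True)
model_with_lora_1.save_pretrained(adapter_1_path)

# Apply and save the second adapter on a fresh base model.
base_model = AutoModelForCausalLM.from_pretrained("gpt2")
model_with_lora_2 = get_peft_model(base_model, lora_config_2, adapter_name="adapter_2")
os.makedirs(adapter_2_path, exist_ok=True)
model_with_lora_2.save_pretrained(adapter_2_path)

# Load a fresh base model, load adapter_1, then load adapter_2 on top.
base_model = AutoModelForCausalLM.from_pretrained("gpt2")
peft_model = PeftModel.from_pretrained(
    base_model, os.path.join(adapter_1_path, "adapter_1"), adapter_name="adapter_1"
)
peft_model.load_adapter(os.path.join(adapter_2_path, "adapter_2"), adapter_name="adapter_2")
print(peft_model)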
The code output will be:
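With peft==0.11.2, the relevant part of the printed model looks roughly like the following (reconstructed from the description in this report rather than copied from a run; note the redundant adapter_2 copy under lm_head):
ModulesToSaveWrapper(
  (original_module): Embedding(50257, 768)
  (modules_to_save): ModuleDict(
    (adapter_2): Embedding(50257, 768)
  )
)
ModulesToSaveWrapper(
  (original_module): Linear(in_features=768, out_features=50257, bias=False)
  (modules_to_save): ModuleDict(
    (adapter_1): Linear(in_features=768, out_features=50257, bias=False)
    (adapter_2): Linear(in_features=768, out_features=50257, bias=False)
  )
)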
Expected behavior
As you can see, adapter_2 is also built for the "lm_head" module, which it shouldn't be; the expected output is shown below:
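Each wrapper should hold a copy only for the adapter that actually targets it, matching the structure posted above after the fix (again a reconstruction, not the original paste):
ModulesToSaveWrapper(
  (original_module): Embedding(50257, 768)
  (modules_to_save): ModuleDict(
    (adapter_2): Embedding(50257, 768)
  )
)
ModulesToSaveWrapper(
  (original_module): Linear(in_features=768, out_features=50257, bias=False)
  (modules_to_save): ModuleDict(
    (adapter_1): Linear(in_features=768, out_features=50257, bias=False)
  )
)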