Critical Merging Bug just started... #457

Open · David-AU-github opened this issue Nov 12, 2024 · 1 comment

@David-AU-github:

Confirming the exact same error: mergekit cannot find the "base_model", even when the path is local (absolute) on Windows.

The funny thing is that some merges work fine, no issue, whereas others fail for the reasons below. And of the merges I did in late Sept 2024, some now fail while others are fine.

Example: Llama 3 (L3) models merge fine, no issue.
Gemma models now break as noted below... but not all of them.

This works fine:

models:
  - model: G:/9B/gemma-2-9b-it-abliterated
    parameters:
      weight: .4
merge_method: dare_ties
base_model: G:/9B/gemma2-gutenberg-9B
tokenizer_source: union
dtype: bfloat16

BUT THIS DIES:

models:
  - model: G:/9B/Gemma-2-Ataraxy-9B
    parameters:
      weight: [1,1,.75,.5,.25,.25,.05,.01]
  - model: G:/9B/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      weight: [1,1,.75,.5,.25,.25,.05,.01]
  - model: G:/9B/gemma-2-Ifable-9B
    parameters:
      weight: [1,1,.75,.5,.25,.25,.05,.01]
merge_method: dare_ties
base_model: E:/Gemma-Dark-Writer3-mega-ab
dtype: bfloat16

But the exact SAME setup as above (3 models, base, dare_ties) works fine for a Llama 3/3.1 merge.

Other Gemma merges of the same type (3 models, base, dare_ties) that DID work (Sept 2024) now crash and burn.

Even if I change this line:
"base_model: E:/Gemma-Dark-Writer3-mega-ab"

it still dies, no matter what.
If I put in a bad location instead, it gives the normal "not found" error, so the path itself is being resolved.
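One way to rule out a plain path problem is to load the config with transformers directly, outside mergekit. This is a hypothetical sanity check, not part of the mergekit CLI, using the base-model path from the failing config above:

# Sanity-check sketch: confirm the base-model directory resolves and parses
# with transformers alone, independent of mergekit.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("E:/Gemma-Dark-Writer3-mega-ab")
print(cfg.model_type)  # a Gemma 2 checkpoint should report "gemma2"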

Likewise any "Gemma" merges like the one above that DID WORK fine, now crash and burn.
(specifically: dare_ties, 3 models + base model)

Please advise.

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Program Files\Python312\Scripts\mergekit-yaml.exe\__main__.py", line 7, in <module>
  File "C:\Program Files\Python312\Lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\site-packages\click\core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\mergekit3\mergekit\mergekit\options.py", line 82, in wrapper
    f(*args, **kwargs)
  File "F:\mergekit3\mergekit\mergekit\scripts\run_yaml.py", line 47, in main
    run_merge(
  File "F:\mergekit3\mergekit\mergekit\merge.py", line 96, in run_merge
    for _task, value in exec.run(quiet=options.quiet):
  File "F:\mergekit3\mergekit\mergekit\graph.py", line 197, in run
    res = task.execute(**arguments)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\mergekit3\mergekit\mergekit\merge_methods\generalized_task_arithmetic.py", line 126, in execute
    tvs, base = get_task_vectors(
                ^^^^^^^^^^^^^^^^^
  File "F:\mergekit3\mergekit\mergekit\merge_methods\generalized_task_arithmetic.py", line 201, in get_task_vectors
    base = tensors[base_model]
           ~~~~~~~^^^^^^^^^^^^
KeyError: ModelReference(model=ModelPath(path='G:/9B/gemma2-gutenberg-9B', revision=None), lora=None, override_architecture=None)

Originally posted by @David-AU-github in #446
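For context on that KeyError: the failing line in get_task_vectors looks up the base model in a dict keyed by ModelReference objects, so the lookup only succeeds on an exact key match. The sketch below uses simplified, hypothetical stand-in classes (the real mergekit ModelReference carries more fields, as the traceback shows) to illustrate how the base model's tensors being gathered under a different key, or not at all, produces exactly this error. It is an illustration of the failure mode, not a confirmed diagnosis of the bug.

# Simplified, hypothetical stand-ins for mergekit's ModelPath/ModelReference.
# A frozen dataclass is hashable, so it can key a dict, but lookup requires
# exact equality of every field.
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ModelPath:
    path: str
    revision: Optional[str] = None

@dataclass(frozen=True)
class ModelReference:
    model: ModelPath

# Tensors gathered for the models listed under `models:` ...
tensors = {ModelReference(ModelPath("G:/9B/gemma-2-9b-it-abliterated")): object()}

# ... but nothing was gathered under the base model's key, so the lookup
# fails exactly like `base = tensors[base_model]` in the traceback.
base_model = ModelReference(ModelPath("G:/9B/gemma2-gutenberg-9B"))
try:
    tensors[base_model]
except KeyError as e:
    print("KeyError:", e.args[0])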

@David-AU-github (Author):

Added issue:
It seems that even when using the "mergekit" workaround, mergekit is no longer creating "tokenizer.model" for Gemma models (previously it did).

RESULT: models can't be quanted from source in llama.cpp without this file; I used one from a previous mergekit merge instead (a sketch of that copy step follows).

I think something has changed upstream? Transformers?
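Doing that copy by hand looks roughly like this. Both directory names are hypothetical placeholders (not paths from this issue), and it assumes the older merge actually shipped a tokenizer.model:

# Workaround sketch: copy tokenizer.model from an older merge (or from the
# base model) into the new merge output so llama.cpp conversion can find it.
import shutil
from pathlib import Path

src = Path("E:/previous-gemma-merge/tokenizer.model")  # older merge that has the file
dst_dir = Path("E:/new-gemma-merge")                   # output dir of the new merge

if src.exists():
    shutil.copy2(src, dst_dir / "tokenizer.model")
else:
    raise FileNotFoundError(f"no tokenizer.model at {src}")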
