Models: Add Gemma-2-9b-it-GGUF #2803

Open · wants to merge 4 commits into base: main

Conversation

@ThiloteE (Collaborator) commented Aug 6, 2024

Describe your changes

Adds model support for Gemma-2-9b-it
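For context, the change itself is an entry in GPT4All's official model list (models.json). The sketch below is only illustrative: the field names are assumed to mirror existing entries in that file, and every concrete value (filename, filesize, md5sum, URL, quant) is a placeholder, not taken from this PR.

```json
{
  "name": "Gemma 2 9B Instruct",
  "filename": "gemma-2-9b-it.Q4_0.gguf",
  "filesize": "<size-in-bytes>",
  "md5sum": "<md5sum-of-the-gguf>",
  "parameters": "9 billion",
  "quant": "q4_0",
  "type": "gemma2",
  "description": "Strong mid-size instruct model with an 8k context window, trained mostly on English data (Gemma license).",
  "url": "https://huggingface.co/<repo>/resolve/main/gemma-2-9b-it.Q4_0.gguf",
  "promptTemplate": "<start_of_turn>user\n%1<end_of_turn>\n<start_of_turn>model\n%2<end_of_turn>\n",
  "systemPrompt": ""
}
```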

Description of Model

At the time of writing, the model shows strong benchmark results for its parameter size. It claims to support a context window of up to 8k tokens.

  • The model was apparently trained and finetuned mostly on English datasets
  • License: Gemma

Personal Impression:

For 9 billion parameters, the model produces reasonable output. I tested it with a 14k-character conversation and found no tokenizer issues and no severe repetition problems as far as I could discern. I have seen refusals when it was tasked with certain things, and it appears to be finetuned with a particular alignment. Its response quality makes it a good model, provided you can bear its alignment or your use case falls within the model's originally intended use cases. It will mainly appeal to English-speaking users.

Clayton reported that the model has a tendency to keep asking questions, even when instructed not to.

Critique:

  • The license is very restrictive.
  • Its context window of 8192 tokens is a little short compared to other state-of-the-art models with roughly similar architecture and within its parameter size range.
  • It only works on the CPU and CUDA backends.

Motivation for this pull-request

  • Other quants uploaded to Hugging Face that are accessible via GPT4All's search feature have tokenizer EOS issues.
  • To date, the model is rumoured to be one of the better models out there.
  • For its size, it ranks high on the Hugging Face Open LLM Leaderboard benchmark.
  • Made by Google, the model has a certain reputation.

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • I have added thorough documentation for my code.
  • I have tagged PR with relevant project labels. I acknowledge that a PR without labels may be dismissed.
  • If this PR addresses a bug, I have provided both a screenshot/video of the original bug and the working solution.

Signed-off-by: ThiloteE <[email protected]>
@ThiloteE ThiloteE added models models.json This requires a change to the official model list. labels Aug 6, 2024
@ThiloteE (Collaborator, Author) commented Aug 6, 2024

I am a little unsure whether the \n at the end of the chat template is really necessary. It is in the tokenizer_config.json, so it should be there by default. If anybody wants to run extensive tests, go ahead, but my 14k-character test was done without the trailing newline and it still worked (see the sketch below).
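To make the question concrete, here is a minimal sketch of the two template variants being compared, assuming GPT4All's promptTemplate convention where %1 stands for the user message and %2 for the model reply; the turn markers come from Gemma-2's documented chat format, and the key names other than promptTemplate are purely illustrative.

```json
{
  "promptTemplate": "<start_of_turn>user\n%1<end_of_turn>\n<start_of_turn>model\n%2<end_of_turn>\n",
  "promptTemplateWithoutTrailingNewline": "<start_of_turn>user\n%1<end_of_turn>\n<start_of_turn>model\n%2<end_of_turn>"
}
```

The first string matches the template in tokenizer_config.json; the second is the variant used in the 14k-character test above.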

Ready for review.

@ThiloteE ThiloteE marked this pull request as ready for review August 6, 2024 20:20
@ThiloteE (Collaborator, Author) commented Aug 6, 2024

(screenshot attached)

@ThiloteE ThiloteE changed the title Add support for Gemma-2-9b-it-GGUF Models: Add Gemma-2-9b-it-GGUF Sep 11, 2024
@ThiloteE (Collaborator, Author) commented:

This model is not supported on the Nomic Vulkan backend.

Labels: models · models.json (This requires a change to the official model list.)
2 participants