chat: Add Q8_0 quantization type #2919

Open
wants to merge 1 commit into
base: main

@vsrinivas vsrinivas commented Aug 28, 2024

Describe your changes

Add support for Q8_0 quantization type.
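For readers unfamiliar with the format, here is a minimal sketch of Q8_0 block quantization as defined in upstream ggml: weights are grouped into blocks of 32, and each block stores one fp16 scale plus 32 signed 8-bit codes. This is not code from this PR; the function names are hypothetical, and the rounding of exact ties differs slightly from the C reference.

```python
import struct

QK8_0 = 32  # weights per Q8_0 block in ggml's layout


def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize_block_q8_0(values):
    """Quantize 32 floats into (fp16 scale, list of 32 int8 codes).

    Hypothetical helper: mirrors the reference scheme (scale = amax / 127),
    but uses Python's round(), which rounds exact ties to even.
    """
    assert len(values) == QK8_0
    amax = max(abs(v) for v in values)       # largest magnitude in the block
    d = amax / 127.0                         # scale mapping amax to code 127
    inv = 1.0 / d if d else 0.0              # avoid division by zero
    qs = [round(v * inv) for v in values]    # codes in [-127, 127]
    return to_fp16(d), qs                    # the scale is stored as fp16


def dequantize_block_q8_0(d, qs):
    """Reconstruct approximate floats from a scale and its int8 codes."""
    return [d * q for q in qs]


block = [(i - 16) / 16 for i in range(QK8_0)]   # sample weights in [-1, 1)
d, qs = quantize_block_q8_0(block)
restored = dequantize_block_q8_0(d, qs)
max_err = max(abs(a - b) for a, b in zip(block, restored))
```

Each block occupies 2 bytes for the scale plus 32 bytes of codes, which works out to 8.5 bits per weight.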

Issue ticket number and link

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • I have added thorough documentation for my code.
  • I have tagged PR with relevant project labels. I acknowledge that a PR without labels may be dismissed.
  • If this PR addresses a bug, I have provided both a screenshot/video of the original bug and the working solution.

Demo

Steps to Reproduce

Notes

@vsrinivas
Author

Ping!

@ThiloteE
Collaborator

ThiloteE commented Sep 10, 2024

Have you tested all the supported backends with a supported model, such as llama-3-8b-instruct? Providing screenshots of a model that works with your PR would be good.

By the way, why only Q8 and not other quants, such as Q6_K, which is very close to Q8 in terms of perplexity, but much smaller in terms of GB?

Could you outline your rationale for this change too? What do you think will be the impact of this change on the userbase and on maintaining GPT4All?
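To put rough numbers on the size question, the sketch below estimates file sizes from the upstream ggml block layouts: Q8_0 uses 34 bytes per 32 weights, and Q6_K uses 210 bytes per 256 weights. The 8e9 parameter count is an assumed round figure for an 8B model, and the estimate ignores file metadata and any tensors stored in a different type, so real files will differ somewhat.

```python
# Bits per weight, derived from ggml's block layouts.
# Q8_0: 2-byte fp16 scale + 32 int8 codes, per 32 weights.
Q8_0_BPW = (2 + 32) * 8 / 32
# Q6_K: 128 bytes low bits + 64 bytes high bits + 16 sub-block scales
#       + 2-byte fp16 scale, per 256 weights.
Q6_K_BPW = (128 + 64 + 16 + 2) * 8 / 256


def approx_size_gb(n_params, bits_per_weight):
    """Rough size in GB (10^9 bytes), assuming every weight uses one type."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 8e9  # assumption: a round figure for an "8B" model

q8_gb = approx_size_gb(N_PARAMS, Q8_0_BPW)
q6_gb = approx_size_gb(N_PARAMS, Q6_K_BPW)
saving = 1 - q6_gb / q8_gb

print(f"Q8_0: {Q8_0_BPW} bpw, about {q8_gb:.2f} GB")
print(f"Q6_K: {Q6_K_BPW} bpw, about {q6_gb:.2f} GB")
print(f"Q6_K is about {saving:.0%} smaller")
```

Under these assumptions Q6_K comes out a little under a quarter smaller than Q8_0.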

Collaborator

@manyoso manyoso left a comment

This can't go in until all of our backends support this quant type. It would also need to be signed. But again, this really needs much more work, since the current Vulkan backend does not support this quant.

Labels
None yet
Projects
None yet
3 participants