Llama65B patch for int4 fp32 #1769

Draft: wants to merge 1 commit into base: main


Abhishek-Varma

This adds a WIP patch on top of @dan-garvey's patch for int4/fp32.

Besides the bug fixes, I have also changed the compilation order: the second Vicuna/Llama is now compiled before the first.

I have started a compilation run. If the IR gets generated, we should be able to apply the diff of this patch on top of @dan-garvey's patch.

Signed-off-by: Abhishek Varma <[email protected]>
@Abhishek-Varma

Once compilation succeeds, the hardcoded IR dumps will need to be removed before the diff of this patch goes in.
CC: @dan-garvey
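
One possible way to drop the hardcoded dumps without losing them for debugging is to make them opt-in. The snippet below is only a sketch under assumptions: `maybe_dump_ir`, the `LLAMA_DUMP_IR_DIR` environment variable, and the `.mlir` file extension are hypothetical names chosen for illustration and are not taken from this patch.

```python
import os
from pathlib import Path
from typing import Optional

# Hypothetical environment variable; not part of the actual patch.
DUMP_ENV_VAR = "LLAMA_DUMP_IR_DIR"


def maybe_dump_ir(
    ir_text: str, name: str, dump_dir: Optional[str] = None
) -> Optional[Path]:
    """Write IR text to <dump_dir>/<name>.mlir only if a dump directory is set.

    The directory comes from the explicit argument or, failing that, from the
    (hypothetical) LLAMA_DUMP_IR_DIR environment variable. Returns the written
    path, or None when dumping is disabled.
    """
    target = dump_dir or os.environ.get(DUMP_ENV_VAR)
    if not target:
        return None  # Dumping disabled: nothing is written.
    out_dir = Path(target)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{name}.mlir"  # Extension is an assumption.
    out_path.write_text(ir_text)
    return out_path
```

With something like this, the default path writes nothing, and a dump can still be requested per run by setting the directory explicitly.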
