[RFC] Intx Tensor Subclasses Quantization #439

Open
1 task done
vayuda opened this issue Jun 25, 2024 · 2 comments
Assignees
Labels
good first issue Good for newcomers

Comments

@vayuda
Collaborator

vayuda commented Jun 25, 2024

Objective:

Implement sub-byte unsigned integer quantization baselines for bit widths 1 through 7, so that users can experiment with low-bit quantization in PyTorch.
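To make the objective concrete, here is a minimal sketch of per-tensor affine quantization to an unsigned n-bit range. It is plain Python for illustration only and is not the torchao implementation; the function names `quantize_uintx` and `dequantize_uintx`, and the choice of a single scale and zero point for the whole input, are assumptions made for this example.

```python
def quantize_uintx(values, nbits):
    """Map floats to integers in [0, 2**nbits - 1] using one scale and zero point.

    Hypothetical helper for illustration; not a torchao API.
    """
    if not 1 <= nbits <= 7:
        raise ValueError("nbits must be between 1 and 7")
    qmax = (1 << nbits) - 1
    lo, hi = min(values), max(values)
    # Guard against a constant input, which would otherwise give a zero scale.
    scale = (hi - lo) / qmax if hi > lo else 1.0
    # The zero point is the integer that represents the real value 0.0.
    zero_point = max(0, min(qmax, round(-lo / scale)))
    quantized = [
        max(0, min(qmax, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize_uintx(quantized, scale, zero_point):
    """Recover approximate floats from the quantized integers."""
    return [(q - zero_point) * scale for q in quantized]


if __name__ == "__main__":
    weights = [-0.3, 0.0, 0.45, 1.2]
    q, scale, zp = quantize_uintx(weights, nbits=4)
    print(q, scale, zp)
    print(dequantize_uintx(q, scale, zp))
```

Lower bit widths leave fewer representable levels, so the reconstruction error grows as `nbits` shrinks; that trade-off is what the baselines are meant to let users measure.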

Tracker:

  • Create a UIntx Tensor Subclass per [RFC] torchao Contributor Guide #391
  • Integrate with existing quant API + AQT
  • Profile performance with Llama 2 and Llama 3, noting the metrics mentioned in The next tutorials #426
  • Add support for int_x as well
  • Integrate with existing uint dtypes
  • Add fused kernel for unpack + dequant
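Since no hardware dtype is narrower than a byte, a sub-byte tensor has to store several values per byte and unpack them before use. The sketch below shows one possible scheme in plain Python: values are written into a little-endian bitstream, and a combined routine unpacks and dequantizes in a single pass, which is the idea behind the fused kernel item. The bit layout and the names `pack_uintx`, `iter_uintx`, `unpack_uintx`, and `unpack_dequant` are assumptions for illustration; the issue does not specify the layout torchao uses.

```python
def pack_uintx(values, nbits):
    """Pack integers in [0, 2**nbits - 1] into bytes, lowest bits first."""
    if not 1 <= nbits <= 7:
        raise ValueError("nbits must be between 1 and 7")
    qmax = (1 << nbits) - 1
    buffer = 0   # bit accumulator
    filled = 0   # number of valid bits currently in the accumulator
    out = bytearray()
    for v in values:
        if not 0 <= v <= qmax:
            raise ValueError(f"{v} does not fit in {nbits} bits")
        buffer |= v << filled
        filled += nbits
        while filled >= 8:
            out.append(buffer & 0xFF)
            buffer >>= 8
            filled -= 8
    if filled:
        out.append(buffer & 0xFF)  # final partial byte, zero padded
    return bytes(out)


def iter_uintx(packed, nbits, count):
    """Yield `count` n-bit values from a buffer produced by pack_uintx."""
    mask = (1 << nbits) - 1
    buffer = 0
    filled = 0
    produced = 0
    for byte in packed:
        buffer |= byte << filled
        filled += 8
        while filled >= nbits and produced < count:
            yield buffer & mask
            buffer >>= nbits
            filled -= nbits
            produced += 1


def unpack_uintx(packed, nbits, count):
    """Unpack into a list of integers."""
    return list(iter_uintx(packed, nbits, count))


def unpack_dequant(packed, nbits, count, scale, zero_point):
    """Unpack and dequantize in one pass, without an intermediate integer list."""
    return [(q - zero_point) * scale for q in iter_uintx(packed, nbits, count)]


if __name__ == "__main__":
    packed = pack_uintx([1, 2, 3, 4, 5], nbits=3)
    print(len(packed), unpack_uintx(packed, nbits=3, count=5))
```

The explicit `count` argument is needed because the last byte may contain padding bits that would otherwise decode as extra values. A real kernel would do the same work on tensors and in parallel; this version only shows the data flow.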

Tasks

  1. CLA Signed
@vayuda vayuda self-assigned this Jun 25, 2024
@jerryzh168
Contributor

This is great, @vayuda. After the Intx Tensor subclass matures we can also merge it into PyTorch core, but we can keep it in torchao for a while to flesh out the extensibility stories (how to add a new op, layout, or implementation branch to these Tensors).

@HDCharles
Contributor

@vayuda, can you link the results/PRs for the items that are already checked off?

4 participants
@jerryzh168 @vayuda @HDCharles and others