Expose FakeQuantizeConfigs in QAT quantizers #1214

Merged
andrewor14 merged 1 commit into main from expose-qat-quantizer on Nov 4, 2024
@andrewor14 (Contributor) commented on Nov 1, 2024

Summary: This commit exposes the activation and weight FakeQuantizeConfigs in the existing QAT quantizers. These are helpful for implementing advanced functionality based on the quantization schemes represented by these quantizers, such as composing QAT + LoRA.

Test Plan:
python test/quantization/test_qat.py

>>> from torchao.quantization.qat.linear import Int8DynActInt4WeightQATQuantizer
>>> q = Int8DynActInt4WeightQATQuantizer()
>>> q.get_activation_fake_quantize_config()
FakeQuantizeConfig(dtype=torch.int8, granularity=PerToken(), mapping_type=<MappingType.ASYMMETRIC: 3>, scale_precision=torch.float32, zero_point_precision=torch.float32, zero_point_domain=<ZeroPointDomain.INT: 1>, is_dynamic=True, range_learning=False)
>>> q.get_weight_fake_quantize_config()
FakeQuantizeConfig(dtype=<TorchAODType.INT4: 4>, granularity=PerGroup(group_size=256), mapping_type=<MappingType.SYMMETRIC: 1>, scale_precision=torch.float32, zero_point_precision=torch.float32, zero_point_domain=<ZeroPointDomain.INT: 1>, is_dynamic=True, range_learning=False)
>>> from torchao.quantization.qat.linear import Int4WeightOnlyQATQuantizer
>>> q = Int4WeightOnlyQATQuantizer()
>>> q.get_activation_fake_quantize_config()
>>> q.get_weight_fake_quantize_config()
FakeQuantizeConfig(dtype=torch.uint4, granularity=PerGroup(group_size=256), mapping_type=<MappingType.ASYMMETRIC: 3>, scale_precision=torch.bfloat16, zero_point_precision=torch.bfloat16, zero_point_domain=<ZeroPointDomain.FLOAT: 2>, is_dynamic=True, range_learning=False)
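As a rough illustration of how downstream code (for example, a QAT + LoRA composition) might consume these two getters, here is a minimal, library-free sketch. The `Sketch*` classes and the `describe_scheme` and `is_weight_only` helpers are hypothetical stand-ins invented for this example; they are not torchao classes. Only the getter names and the `None` return for a weight-only quantizer's activation config are taken from the session above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchFakeQuantizeConfig:
    """Hypothetical stand-in for FakeQuantizeConfig (not the torchao class)."""
    dtype: str
    group_size: Optional[int]  # None stands for per-token granularity in this sketch
    symmetric: bool


class SketchInt8DynActInt4WeightQuantizer:
    """Hypothetical quantizer exposing the two getters named in the test plan."""

    def __init__(self, groupsize: int = 256) -> None:
        self.groupsize = groupsize

    def get_activation_fake_quantize_config(self) -> Optional[SketchFakeQuantizeConfig]:
        # int8, per-token, asymmetric activations
        return SketchFakeQuantizeConfig("int8", None, symmetric=False)

    def get_weight_fake_quantize_config(self) -> Optional[SketchFakeQuantizeConfig]:
        # int4, per-group, symmetric weights
        return SketchFakeQuantizeConfig("int4", self.groupsize, symmetric=True)


class SketchInt4WeightOnlyQuantizer:
    """Hypothetical weight-only quantizer: activations are not fake-quantized."""

    def __init__(self, groupsize: int = 256) -> None:
        self.groupsize = groupsize

    def get_activation_fake_quantize_config(self) -> Optional[SketchFakeQuantizeConfig]:
        return None  # mirrors the empty REPL output for the weight-only quantizer

    def get_weight_fake_quantize_config(self) -> Optional[SketchFakeQuantizeConfig]:
        # uint4, per-group, asymmetric weights
        return SketchFakeQuantizeConfig("uint4", self.groupsize, symmetric=False)


def describe_scheme(quantizer) -> dict:
    """Collect both configs; None marks a tensor that is not fake-quantized."""
    return {
        "activation": quantizer.get_activation_fake_quantize_config(),
        "weight": quantizer.get_weight_fake_quantize_config(),
    }


def is_weight_only(quantizer) -> bool:
    """True when only the weights carry a fake-quantize config."""
    scheme = describe_scheme(quantizer)
    return scheme["activation"] is None and scheme["weight"] is not None
```

A caller written this way has to handle a `None` activation config explicitly, which is the one behavioral difference between the two quantizers visible in the test plan.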

pytorch-bot (Bot) commented on Nov 1, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1214

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 504590d with merge base 59dab15:

BROKEN TRUNK - The following job failed but was also present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label on Nov 1, 2024
@andrewor14 requested a review from jerryzh168 on November 1, 2024, 20:35
Comment thread torchao/quantization/qat/linear.py Outdated
Comment thread torchao/quantization/qat/linear.py
@andrewor14 force-pushed the expose-qat-quantizer branch from b76385f to 504590d on November 1, 2024, 22:40
@andrewor14 merged commit 88d604f into main on Nov 4, 2024
Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.

3 participants