Bump Int4WeightOnlyConfig version to 2 #2949
Summary: In preparation for the version bump in #2949, added `version=1` for both `int4_weight_only` and `Int4WeightOnlyConfig`. Test Plan: regression tests with CI
Int4WeightOnlyConfig version to 2
  _int4_quant_code = """
  from torchao.quantization import Int4WeightOnlyConfig
- quant_config = Int4WeightOnlyConfig(group_size=128, packing_format="tile_packed_to_4d", int4_choose_qparams_algorithm="hqq", version=2)
+ quant_config = Int4WeightOnlyConfig(group_size=128, packing_format="tile_packed_to_4d", int4_choose_qparams_algorithm="hqq")
It's called int4_packing_format now, no?
yeah that's true, I have updated locally, will push change together with other things
Just found we also need to update `packing_format` to `int4_packing_format`. I have made the change locally and can push it before landing.
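A minimal sketch of the call with the renamed argument, based only on the review comments here. The helper names `int4_config_kwargs` and `build_int4_config` are hypothetical, and the sketch assumes a torchao release in which the rename to `int4_packing_format` has landed; the other values are copied from the diff.

```python
def int4_config_kwargs():
    """Keyword arguments from the diff, using the renamed packing-format argument."""
    return {
        "group_size": 128,
        # Renamed from `packing_format`, per the review comments.
        "int4_packing_format": "tile_packed_to_4d",
        "int4_choose_qparams_algorithm": "hqq",
    }


def build_int4_config():
    # Imported lazily so the kwargs helper can be inspected without torchao installed.
    from torchao.quantization import Int4WeightOnlyConfig

    # No explicit `version=` here: this PR makes version 2 the default.
    return Int4WeightOnlyConfig(**int4_config_kwargs())
```

Calling `build_int4_config()` requires torchao; `int4_config_kwargs()` alone is enough to check which argument names a call site passes.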
Should you import to fbcode to see if you break any internal tests?
@jerryzh168 has imported this pull request. If you are a Meta employee, you can view this in D81985661.

Looks like there are some conflicts in importing. I'll unlink and merge for now and rely on the diff train.
Summary:
Current Int4WeightOnlyConfig has version 1 and 2, and the default is 1. This PR changes the default to 2 and updates the call sites.
For Int4WeightOnlyConfig call sites that use the old configuration, we added an explicit `version=1`; we can migrate those call sites to version 2 separately.
For READMEs, we migrate the usage to version 2 directly.
Deprecation: TODO
Test Plan:
Regression tests:
python test/dtypes/test_affine_quantized.py
python test/quantization/test_quant_api.py
python test/quantization/quantize_/workflows/int4/test_int4_marlin_sparse_tensor.py
python test/quantization/quantize_/workflows/int4/test_int4_opaque_tensor.py
python test/quantization/quantize_/workflows/int4/test_int4_plain_int32_tensor.py
python test/quantization/quantize_/workflows/int4/test_int4_preshuffled_tensor.py
python test/quantization/quantize_/workflows/int4/test_int4_tensor.py
python test/quantization/quantize_/workflows/int4/test_int4_tile_packed_to_4d_tensor.py
Summary:
Current Int4WeightOnlyConfig has version 1 and 2, and the default is 1. This PR changes the default to 2 and updates the call sites.
For Int4WeightOnlyConfig call sites that use the old configuration, we added an explicit `version=1`; we can migrate those call sites to version 2 separately (note this is done in "Add version=1 for calls to int4 weight only config" #2958).
Deprecation Note:
We updated the implementation for the int4 Tensor, so this PR bumps the default version from 1 to 2 for these two configs.
Suggestion: upgrade torchao to 0.14 or later and generate the checkpoint again:
Or download the checkpoint again (please let us know if the checkpoint is not updated).
Please see #2948 for more details around the deprecation.
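To illustrate what the default bump means for call sites, here is a minimal sketch using a stand-in dataclass. `Int4ConfigSketch` is hypothetical and is not torchao's `Int4WeightOnlyConfig`; it only models the `version` field and its default as described in the summary.

```python
from dataclasses import dataclass


@dataclass
class Int4ConfigSketch:
    """Hypothetical stand-in that models only the version default, not torchao's config."""

    group_size: int = 128
    version: int = 2  # default after this PR; it was 1 before


# New call sites get version 2 without passing anything.
new_style = Int4ConfigSketch()

# Call sites that still rely on the old behavior pin the version explicitly.
old_style = Int4ConfigSketch(version=1)

print(new_style.version, old_style.version)
```

Pinning `version=1` at the old call sites keeps their behavior unchanged when the default moves, so they can be migrated to version 2 one at a time.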
Test Plan:
Regression tests:
python test/dtypes/test_affine_quantized.py
python test/quantization/test_quant_api.py
python test/quantization/quantize_/workflows/int4/test_int4_marlin_sparse_tensor.py
python test/quantization/quantize_/workflows/int4/test_int4_opaque_tensor.py
python test/quantization/quantize_/workflows/int4/test_int4_plain_int32_tensor.py
python test/quantization/quantize_/workflows/int4/test_int4_preshuffled_tensor.py
python test/quantization/quantize_/workflows/int4/test_int4_tensor.py
python test/quantization/quantize_/workflows/int4/test_int4_tile_packed_to_4d_tensor.py
python test/integration/test_load_and_run_checkpoint.py
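A small sketch for running the listed files in one go, assuming it is started from the repository root with torchao installed. The helpers `build_commands` and `run_all` are hypothetical conveniences, not part of the repository; the paths are copied from the test plan.

```python
import subprocess
import sys

# Test files copied from the test plan above.
TEST_FILES = [
    "test/dtypes/test_affine_quantized.py",
    "test/quantization/test_quant_api.py",
    "test/quantization/quantize_/workflows/int4/test_int4_marlin_sparse_tensor.py",
    "test/quantization/quantize_/workflows/int4/test_int4_opaque_tensor.py",
    "test/quantization/quantize_/workflows/int4/test_int4_plain_int32_tensor.py",
    "test/quantization/quantize_/workflows/int4/test_int4_preshuffled_tensor.py",
    "test/quantization/quantize_/workflows/int4/test_int4_tensor.py",
    "test/quantization/quantize_/workflows/int4/test_int4_tile_packed_to_4d_tensor.py",
    "test/integration/test_load_and_run_checkpoint.py",
]


def build_commands(files=TEST_FILES):
    """Return one `python <file>` command per test file, using the current interpreter."""
    return [[sys.executable, path] for path in files]


def run_all(files=TEST_FILES):
    """Run each file in turn and return the paths whose process exited non-zero."""
    failed = []
    for cmd in build_commands(files):
        if subprocess.run(cmd).returncode != 0:
            failed.append(cmd[1])
    return failed


# Usage, from the repository root:
# failed = run_all()
```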