Initial ParetoQ commit by andrewor14 · Pull Request #1876 · pytorch/ao

andrewor14 · 2025-03-12T20:14:05Z

This project contains the training code for ParetoQ, introduced in "ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization" (https://arxiv.org/abs/2502.02631). All code was written by @liuzechun and @zxdmike and migrated from https://github.com/facebookresearch/ParetoQ.

ParetoQ is the first unified framework that facilitates rigorous comparisons across 1-bit, 1.58-bit, 2-bit, 3-bit, and 4-bit quantization settings. By optimizing training schemes and refining quantization functions, ParetoQ surpasses all previous methods tailored to specific bit widths. Specifically, the 1.58-bit ParetoQ LLaMA-3 8B model narrows the performance gap to full precision by a relative 37.8% compared to the 1-bit Era’s 1.58-bit LLaMA-3 8B model, while using only 30% of the training tokens.
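For readers unfamiliar with the bit-width names above, here is a minimal sketch of where "1.58-bit" comes from: a weight restricted to three values carries log2(3) ≈ 1.58 bits. The `ternary_quantize` helper is a generic absmean ternary quantizer written only for illustration; it is an assumption and is not the quantization function used in this PR or in the paper, and both function names are hypothetical.

```python
import math


def bits_per_weight(num_levels: int) -> float:
    """Bits needed to encode a weight that takes one of `num_levels` values."""
    return math.log2(num_levels)


def ternary_quantize(weights: list[float]) -> tuple[list[int], float]:
    """Generic absmean ternary quantizer (illustrative only, NOT ParetoQ's).

    Returns integer codes in {-1, 0, +1} and the scale used, so that
    code * scale approximates the original weight.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    if scale == 0.0:
        return [0] * len(weights), 0.0
    # Round to the nearest integer, then clamp to the ternary range.
    codes = [max(-1, min(1, round(w / scale))) for w in weights]
    return codes, scale


# Number of representable levels for each setting named in the description.
levels = {"1-bit": 2, "1.58-bit": 3, "2-bit": 4, "3-bit": 8, "4-bit": 16}
for name, n in levels.items():
    print(f"{name}: {n} levels, {bits_per_weight(n):.2f} bits per weight")

codes, scale = ternary_quantize([0.9, -0.1, -1.2, 0.4])
print(codes, scale)
```

The actual quantizers and training recipes live in the migrated code and may differ substantially from this sketch.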

pytorch-bot · 2025-03-12T20:14:09Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1876

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 24191f4 with merge base 6726b0b:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vkuzo · 2025-03-28T19:59:29Z

should there be a test of some sort? Otherwise it's likely this will break soon without anyone knowing.

andrewor14 · 2025-04-09T16:06:18Z

> should there be a test of some sort? Otherwise it's likely this will break soon without anyone knowing.

added

facebook-github-bot added the CLA Signed label Mar 12, 2025

andrewor14 marked this pull request as draft March 12, 2025 20:14

andrewor14 force-pushed the paretoq branch from ade3706 to 39d0119 Compare March 12, 2025 21:01

andrewor14 added the topic: new feature label Mar 12, 2025

andrewor14 force-pushed the paretoq branch from 39d0119 to 35597f5 Compare March 13, 2025 20:30

andrewor14 marked this pull request as ready for review March 13, 2025 20:31

andrewor14 force-pushed the paretoq branch 2 times, most recently from 29400c6 to 77b1bcc Compare March 14, 2025 16:16

andrewor14 force-pushed the paretoq branch from 77b1bcc to 0c019f2 Compare March 28, 2025 16:02

andrewor14 force-pushed the paretoq branch 2 times, most recently from ca0fdaa to 87638de Compare April 9, 2025 16:05

andrewor14 requested review from drisspg, jainapurva and jerryzh168 April 9, 2025 16:06

andrewor14 force-pushed the paretoq branch from 87638de to 5b2a7d2 Compare April 9, 2025 16:13

jerryzh168 approved these changes Apr 9, 2025

andrewor14 force-pushed the paretoq branch from 5b2a7d2 to 24191f4 Compare April 9, 2025 17:13

andrewor14 merged commit 31f119e into main Apr 9, 2025
