This repo contains pre-trained model weights and the training/sampling code (PyTorch, torch>=2.1.0) used in
Exploiting Discriminative Codebook Prior for Autoregressive Image Generation
Longxiang Tang, Ruihang Chu, Xiang Wang, Yujin Han, Pingyu Wu, Chunming He, Yingya Zhang, Shiwei Zhang, Jiaya Jia
HKUST, Alibaba Tongyi Lab
In this work, we propose the Discriminative Codebook Prior Extractor (DCPE) as an alternative to k-means clustering for more effectively mining and utilizing the token-similarity information embedded in the codebook. DCPE replaces the commonly used centroid-based distance, which we find unsuitable and inaccurate for the token feature space, with a more reasonable instance-based distance. Using an agglomerative merging technique, it further addresses the disparity in token-space density by avoiding splitting high-density regions and by aggregating low-density ones.
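The idea above can be illustrated with off-the-shelf single-linkage agglomerative clustering, where the distance between two clusters is the minimum pairwise (instance-based) distance between their member tokens rather than the distance between centroids. This is only a minimal sketch under assumed settings (toy codebook, Euclidean metric, arbitrary threshold), not the paper's actual extractor:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Toy "codebook": 16 token embeddings of dimension 8 (illustrative only).
rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 8))

# Instance-based distance: with single linkage, cluster distance is the
# minimum pairwise distance between member tokens, not a centroid distance.
pairwise = pdist(codebook, metric="euclidean")
tree = linkage(pairwise, method="single")

# Agglomerative merging: cutting the tree at a distance threshold merges
# low-density regions bottom-up instead of splitting dense ones by
# centroid assignment (threshold value is an arbitrary assumption here).
labels = fcluster(tree, t=3.0, criterion="distance")
print(labels)  # one cluster id per codebook token
```

In contrast, k-means would assign each token to its nearest centroid, which can cut through a dense region whose tokens are mutually close but far from any centroid.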
Download the LlamaGen VQ-VAE model vq_ds16_c2i.pt.
Download the pretrained weights of our DCPE from ModelScope.
Put them into ./pretrained_models.
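The expected layout can be prepared as follows; the exact DCPE checkpoint filename depends on the ModelScope release, so the second path is a placeholder:

```shell
mkdir -p pretrained_models
# Place the downloaded files here, e.g.:
#   pretrained_models/vq_ds16_c2i.pt        # LlamaGen tokenizer weights
#   pretrained_models/<dcpe_checkpoint>.pt  # DCPE weights from ModelScope
ls pretrained_models
```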
See Getting Started for detailed instructions.
The majority of this project is licensed under the MIT License. Portions of the project are available under the separate licenses of the referenced projects, as detailed in the corresponding files.
@article{tang2025exploiting,
title={Exploiting Discriminative Codebook Prior for Autoregressive Image Generation},
author={Longxiang Tang and Ruihang Chu and Xiang Wang and Yujin Han and Pingyu Wu and Chunming He and Yingya Zhang and Shiwei Zhang and Jiaya Jia},
journal={arXiv preprint arXiv:2508.10719},
url={https://arxiv.org/abs/2508.10719},
year={2025},
}