---
license: mit
language:
- en
---
# Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis

<div align="center">

[Demo](https://opensource.bytedance.com/gmpt/t2i/invite)
[Project Page](https://foundationvision.github.io/infinity.project/)
[arXiv](https://arxiv.org/abs/2412.04431)
[Hugging Face](https://huggingface.co/FoundationVision/infinity)
[GitHub](https://github.com/FoundationVision/Infinity)

</div>
<p align="center" style="font-size: larger;">
<a href="https://arxiv.org/abs/2412.04431">Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis</a>
</p>

## Introduction
We present Infinity, a bitwise visual autoregressive model capable of generating high-resolution, photorealistic images. Infinity redefines visual autoregressive modeling under a bitwise token prediction framework with an infinite-vocabulary tokenizer and classifier and bitwise self-correction. By theoretically scaling the tokenizer vocabulary size to infinity while concurrently scaling the transformer size, our method unlocks substantially stronger scaling capabilities. Infinity sets a new record for autoregressive text-to-image models and outperforms top-tier diffusion models such as SD3-Medium and SDXL. Notably, Infinity surpasses SD3-Medium by improving the GenEval benchmark score from 0.62 to 0.73 and the ImageReward benchmark score from 0.87 to 0.96, achieving a win rate of 66%. Without extra optimization, Infinity generates a high-quality 1024×1024 image in 0.8 seconds, making it 2.6× faster than SD3-Medium and establishing it as the fastest text-to-image model.

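
To give a rough sense of why bitwise prediction keeps a very large vocabulary tractable, here is a toy sketch. It is not the Infinity implementation: the function names and the choice of 16 bits are illustrative assumptions. It only shows that a token from a vocabulary of size 2^d can be written as d binary labels, so a classifier could emit d binary outputs instead of one logit per vocabulary entry.

```python
def index_to_bits(index: int, num_bits: int) -> list[int]:
    """Write a token index as num_bits binary labels, most significant bit first."""
    if not 0 <= index < 2 ** num_bits:
        raise ValueError("index does not fit in num_bits bits")
    return [(index >> shift) & 1 for shift in range(num_bits - 1, -1, -1)]


def bits_to_index(bits: list[int]) -> int:
    """Inverse of index_to_bits: rebuild the token index from its bits."""
    index = 0
    for bit in bits:
        index = (index << 1) | bit
    return index


def classifier_outputs(num_bits: int) -> tuple[int, int]:
    """Return (index-wise outputs, bitwise outputs) for a 2**num_bits vocabulary."""
    return 2 ** num_bits, num_bits


NUM_BITS = 16  # illustrative assumption, not a value taken from this model card
token = 43981  # arbitrary example index that fits in 16 bits
bits = index_to_bits(token, NUM_BITS)
assert bits_to_index(bits) == token  # the bit labels round-trip to the same index

index_wise, bitwise = classifier_outputs(NUM_BITS)
print(f"vocabulary size: {index_wise}, bitwise outputs: {bitwise}")
```

With 32 bits the index-wise count would already exceed four billion outputs, while the bitwise count stays at 32. That gap is the intuition behind scaling the vocabulary without scaling the classifier to match.
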
## Note
This repo hosts Infinity's checkpoints. For more details, please refer to the [Infinity GitHub repository](https://github.com/FoundationVision/Infinity).

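
A minimal sketch of fetching the checkpoint files hosted here, assuming the `huggingface_hub` package is installed and that the repository ID is `FoundationVision/infinity` (inferred from this card's Hugging Face URL). The helper name `download_infinity_checkpoints` and the default `local_dir` are hypothetical; consult the GitHub repository for the files each model variant expects.

```python
from pathlib import Path
from typing import Optional

# Assumption: repository ID inferred from the Hugging Face URL of this model card.
REPO_ID = "FoundationVision/infinity"


def download_infinity_checkpoints(
    local_dir: str = "weights/infinity",  # hypothetical target folder
    allow_patterns: Optional[list] = None,
) -> Path:
    """Download files from the checkpoint repo and return the local folder."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=REPO_ID,
        local_dir=local_dir,
        allow_patterns=allow_patterns,  # None downloads every file in the repo
    )
    return Path(path)
```

Calling `download_infinity_checkpoints()` with no arguments pulls the whole repository, which may be large. Passing `allow_patterns` (for example a list of glob strings) restricts the download to matching files.
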
## Citation
If our work assists your research, feel free to give us a star ⭐ or cite us using:

```
@misc{han2024infinityscalingbitwiseautoregressive,
  title={Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis},
  author={Jian Han and Jinlai Liu and Yi Jiang and Bin Yan and Yuqi Zhang and Zehuan Yuan and Bingyue Peng and Xiaobing Liu},
  year={2024},
  eprint={2412.04431},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2412.04431},
}
```