## What is PanoLlama
- New Paradigm: A novel framework that redefines panoramic image generation as a next-token prediction task, demonstrating clear advantages over diffusion-based methods.
- Speed Up: A training-free autoregressive strategy built on the pre-trained LlamaGen architecture, achieving high-quality panorama generation at arbitrary sizes.
- Versatile Applications: Beyond text-to-panorama generation, it also supports multi-scale, multi-layout, and multi-guidance generation tasks.
- Comprehensive Evaluation: We evaluate our method across a range of baselines and metrics, ensuring the reliability of our experimental results.
For more details, please visit our paper page.
## Configuration

Set up the environment by installing the required packages:

```shell
pip install -r requirements.txt
```
## Pre-trained Models

Download the pre-trained models from LlamaGen and place them in the `/models` folder under the corresponding modules:
| module | model | params | tokens | weight |
|---|---|---|---|---|
| text encoder | FLAN-T5-XL | 3B | / | flan-t5-xl |
| image tokenizer | VQVAE | 72M | 16x16 | vq_ds16_t2i.pt |
| token generator | LlamaGen-XL | 775M | 32x32 | t2i_XL_stage2_512.pt |
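Before sampling, it can help to confirm the downloaded weights are actually in place. Below is a minimal Python sketch of such a check; it is a hypothetical helper, not part of the repo, and the assumption that each weight sits directly under `models/` with the names from the table above should be adjusted to your actual layout.

```python
from pathlib import Path

# Expected weight files/directories per module, taken from the table above.
# Placing them flat under models/ is an assumption about the layout.
EXPECTED = {
    "text encoder": "flan-t5-xl",                 # FLAN-T5-XL weights directory
    "image tokenizer": "vq_ds16_t2i.pt",          # VQVAE checkpoint
    "token generator": "t2i_XL_stage2_512.pt",    # LlamaGen-XL checkpoint
}

def missing_weights(models_dir="models"):
    """Return the expected weight names missing under models_dir."""
    root = Path(models_dir)
    return [name for name in EXPECTED.values() if not (root / name).exists()]

if __name__ == "__main__":
    missing = missing_weights()
    if missing:
        print("Missing weights:", ", ".join(missing))
    else:
        print("All pre-trained models found.")
```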
## Generation

We support panorama expansion in the vertical, horizontal, and both directions. Try the following command to generate a horizontal panorama:
```shell
python -m token_generator.sample \
    --seed -1 \
    --times 12 \
    --addit-cols 24 \
    --lam 1 \
    --gen-mode h \
    --n 1
```
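Since the same entry point handles the other expansion directions, you can sweep the modes programmatically. The sketch below builds the command lines only; the `v` (vertical) and `hv` (both) mode values are assumptions inferred from the `h` example above, so check the CLI's accepted values before running.

```python
# Sketch: build token_generator.sample command lines per expansion direction.
# Flag names mirror the horizontal example in the README; the "v" and "hv"
# gen-mode values are assumed, not confirmed by the repo docs.
def build_cmd(gen_mode, times=12, addit_cols=24, lam=1, n=1, seed=-1):
    """Return the argv list for one sampling run in the given direction."""
    return [
        "python", "-m", "token_generator.sample",
        "--seed", str(seed),
        "--times", str(times),
        "--addit-cols", str(addit_cols),
        "--lam", str(lam),
        "--gen-mode", gen_mode,
        "--n", str(n),
    ]

for mode in ("h", "v", "hv"):
    print(" ".join(build_cmd(mode)))
```

Each argv list can be passed to `subprocess.run` directly, which avoids shell-quoting issues when scripting batches of panoramas.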
## Citation

If you find our work helpful, please consider citing:
```bibtex
@article{zhou2024panollama,
  title={PanoLlama: Generating Endless and Coherent Panoramas with Next-Token-Prediction LLMs},
  author={Zhou, Teng and Zhang, Xiaoyu and Tang, Yongchuan},
  journal={arXiv preprint arXiv:2411.15867},
  year={2024}
}
```