This repo implements UniTok, a unified visual tokenizer well-suited for both generation and understanding tasks. It is compatiable with autoregressive generative models (e.g. LlamaGen), multimodal ...
Latent Diffusion Models (LDMs) inherently follow a coarse-to-fine generation process, where high-level semantic structure is generated slightly earlier than fine-grained texture. This indicates the ...