---
license: apache-2.0
pipeline_tag: text-to-video
---

# RepVideo: Rethinking Cross-Layer Representation for Video Generation

Chenyang Si¹†, Weichen Fan¹†, Zhengyao Lv², Ziqi Huang¹, Yu Qiao², Ziwei Liu¹✉

S-Lab, Nanyang Technological University¹, Shanghai Artificial Intelligence Laboratory²
†Equal contribution. ✉Corresponding author.

Paper | Project Page

---

![](https://img.shields.io/badge/RepVideo-v0.1-darkcyan)
[![Hits](https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fgithub.com%2FVchitect%2FRepVideo&count_bg=%23BDC4B7&title_bg=%2342C4A8&icon=octopusdeploy.svg&icon_color=%23E7E7E7&title=visitors&edge_flat=true)](https://hits.seeyoufarm.com)
[![Generic badge](https://img.shields.io/badge/Checkpoint-red.svg)](https://huggingface.co/Vchitect/RepVideo)
[![Hits](https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Farxiv.org%2Fpdf%2F2501.08994&count_bg=%2379C83D&title_bg=%23555555&icon=&icon_color=%23E7E7E7&title=Paper&edge_flat=false)](https://hits.seeyoufarm.com)
[![Hits](https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fgithub.com%2FVchitect%2FRepVid-Webpage&count_bg=%23BE4C4C&title_bg=%235E5D64&icon=&icon_color=%23E7E7E7&title=Page&edge_flat=false)](https://hits.seeyoufarm.com)

## :astonished: Gallery

## Installation

### 1. Create a conda environment and download models

```bash
conda create -n RepVid python==3.10
conda activate RepVid
pip install -r requirements.txt

mkdir ckpt
cd ckpt
mkdir t5-v1_1-xxl
wget https://huggingface.co/THUDM/CogVideoX-2b/resolve/main/text_encoder/config.json
wget https://huggingface.co/THUDM/CogVideoX-2b/resolve/main/text_encoder/model-00001-of-00002.safetensors
wget https://huggingface.co/THUDM/CogVideoX-2b/resolve/main/text_encoder/model-00002-of-00002.safetensors
wget https://huggingface.co/THUDM/CogVideoX-2b/resolve/main/text_encoder/model.safetensors.index.json
wget https://huggingface.co/THUDM/CogVideoX-2b/resolve/main/tokenizer/added_tokens.json
wget https://huggingface.co/THUDM/CogVideoX-2b/resolve/main/tokenizer/special_tokens_map.json
wget https://huggingface.co/THUDM/CogVideoX-2b/resolve/main/tokenizer/spiece.model
wget https://huggingface.co/THUDM/CogVideoX-2b/resolve/main/tokenizer/tokenizer_config.json

cd ../
mkdir vae
wget https://cloud.tsinghua.edu.cn/f/fdba7608a49c463ba754/?dl=1
mv 'index.html?dl=1' vae.zip
unzip vae.zip
```

## Inference

~~~bash
cd sat
bash run.sh
~~~

## BibTeX

```
@article{si2025RepVideo,
  title={RepVideo: Rethinking Cross-Layer Representation for Video Generation},
  author={Si, Chenyang and Fan, Weichen and Lv, Zhengyao and Huang, Ziqi and Qiao, Yu and Liu, Ziwei},
  journal={arXiv 2501.08994},
  year={2025}
}
```

## 🔑 License

This code is licensed under Apache-2.0. The framework is fully open for academic research and also allows free commercial usage.

## Disclaimer

We disclaim responsibility for user-generated content. The model was not trained to realistically represent people or events, so using it to generate such content is beyond the model's capabilities. Using the model to generate pornographic, violent, or gory content, or content that is demeaning or harmful to people or their environment, culture, religion, etc., is prohibited. Users are solely liable for their actions.
The project contributors are not legally affiliated with, nor accountable for, users' behaviors. Use the generative model responsibly, adhering to ethical and legal standards.