The full list can be found at [Google Scholar] (🧑‍💻 Co-first Author, 📮 Corresponding Author)
Preprint

50. EmoCAST: Emotional Talking Portrait via Emotive Text Description
Yiguo Jiang, Xiaodong Cun 📮, Yong Zhang, Yudian Zheng, Fan Tang, Chi-Man Pun 📮
Preprint, 2025.
49. VAU-R1: Advancing Video Anomaly Understanding via Reinforcement Fine-Tuning
Liyun Zhu, Qixiang Chen, Xi Shen, Xiaodong Cun 📮
Preprint, 2025.
48. Sci-Fi: Symmetric Constraint for Frame Inbetweening
Liuhan Chen, Xiaodong Cun 📮, Xiaoyu Li, Xianyi He, Shenghai Yuan, Jie Chen, Ying Shan, Li Yuan 📮
Preprint, 2025.
47. GSFixer: Improving 3D Gaussian Splatting with Reference-Guided Video Diffusion Priors
Xingyilang Yin, Qi Zhang, Jiahao Chang, Ying Feng, Qingnan Fan, Xi Yang, Chi-Man Pun 📮, Huaqi Zhang, Xiaodong Cun 📮
Preprint, 2025.
45. AnimateZero: Video Diffusion Models are Zero-Shot Image Animators
Jiwen Yu, Xiaodong Cun 📮, Chenyang Qi, Yong Zhang, Xintao Wang, Ying Shan, Jian Zhang
Preprint, 2024.
44. Learning Enriched Illuminants for Cross and Single Sensor Color Constancy
Xiaodong Cun 🧑‍💻, Zhendong Wang 🧑‍💻, Chi-Man Pun, Wengang Zhou, Jianzhuang Liu, Xu Jia, Houqiang Li
Preprint, 2021.

Year 2025

43. AB-Cache: Training-Free Acceleration of Diffusion Models via Adams-Bashforth Cached Feature Reuse.
Zichao Yu, Zhen Zou, Guojiang Shao, Chengwei Zhang, Shengze Xu, Jie Huang, Feng Zhao, Xiaodong Cun, Wenyi Zhang.
ACM Multimedia, 2025.
42. Mobius: Text to Seamless Looping Video Generation via Latent Shift
Xiuli Bi, Jianfei Yuan, Bo Liu, Yong Zhang, Xiaodong Cun 📮, Chi-Man Pun, Bin Xiao
SIGGRAPH (Conference Track), 2025.

41. DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation
Minghong Cai, Xiaodong Cun, Xiaoyu Li, Wenze Liu, Zhaoyang Zhang, Yong Zhang, Ying Shan, Xiangyu Yue
Computer Vision and Pattern Recognition (CVPR), 2025.

40. DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos.
Wenbo Hu, Xiangjun Gao, Xiaoyu Li, Sijie Zhao, Xiaodong Cun, Yong Zhang, Long Quan, Ying Shan.
Computer Vision and Pattern Recognition (CVPR Highlight), 2025.

Best Paper at PixFoundation workshop of CVPR 25.
39. DEIM: DETR with Improved Matching for Fast Convergence.
Shihua Huang, Zhichao Lu, Xiaodong Cun, Yongjun Yu, Xiao Zhou, Xi Shen.
Computer Vision and Pattern Recognition (CVPR), 2025.

38. CustomTTT: Motion and Appearance Customized Video Generation via Test-Time Training
Xiuli Bi, Jian Lu, Bo Liu, Xiaodong Cun 📮, Yong Zhang, Weisheng Li, Bin Xiao
AAAI Conference on Artificial Intelligence (AAAI), 2025.


Year 2024

36. CV-VAE: A Compatible Video VAE for Latent Generative Video Models.
Sijie Zhao, Yong Zhang, Xiaodong Cun, Shaoshu Yang, Muyao Niu, Xiaoyu Li, Wenbo Hu, Ying Shan.
NeurIPS, 2024.

35. Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance
Jinbo Xing, Menghan Xia, Yuxin Liu, Yuechen Zhang, Yong Zhang, Yingqing He, Hanyuan Liu, Haoxin Chen, Xiaodong Cun, Xintao Wang, Ying Shan, Tien-Tsin Wong
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2024.

34. Noise Calibration: Plug-and-play Content-Preserving Video Enhancement using Pre-trained Video Diffusion Models.
Qinyu Yang, Haoxin Chen, Yong Zhang, Menghan Xia, Xiaodong Cun, Zhixun Su, Ying Shan.
European Conference on Computer Vision (ECCV), 2024.

32. Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation.
Lanqing Guo, Yingqing He, Haoxin Chen, Menghan Xia, Xiaodong Cun, Yufei Wang, Siyu Huang, Yong Zhang, Xintao Wang, Qifeng Chen, Ying Shan, Bihan Wen.
European Conference on Computer Vision (ECCV), 2024.

31. VideoCrafter1 & VideoCrafter2
Haoxin Chen 🧑‍💻, Menghan Xia 🧑‍💻, Yong Zhang 🧑‍💻, Xiaodong Cun 🧑‍💻, Xintao Wang, Ying Shan
Computer Vision and Pattern Recognition (CVPR) & Technical report, 2024.

PaperDigest Most Influential Papers of ArXiv 24 (paperdigest.org).
30. EvalCrafter: Benchmarking and Evaluating Large Video Generation Models
Yaofang Liu 🧑‍💻, Xiaodong Cun 🧑‍💻, Xuebo Liu, Xintao Wang, Yong Zhang, Haoxin Chen, Yang Liu, Tieyong Zeng, Raymond Chan, Ying Shan
Computer Vision and Pattern Recognition (CVPR), 2024.

29. SmartEdit: Exploring Complex Instruction-based Image Editing with Large Language Models.
Yuzhou Huang, Liangbin Xie, Xintao Wang, Ziyang Yuan, Xiaodong Cun, Yixiao Ge, Jiantao Zhou, Chao Dong, Rui Huang, Ruimao Zhang, Ying Shan.
Computer Vision and Pattern Recognition (CVPR Highlight), 2024.

28. Make-Your-Anchor: A Diffusion-based 2D Avatar Generation Framework.
Ziyao Huang, Fan Tang, Yong Zhang, Xiaodong Cun, Juan Cao, Jintao Li, Tong-yee Lee.
Computer Vision and Pattern Recognition (CVPR), 2024.

27. Depth-aware Test-Time Training for Zero-shot Video Object Segmentation.
Weihuang Liu, Xi Shen, Haolun Li, Xiuli Bi, Bo Liu, Chi-Man Pun 📮, Xiaodong Cun 📮.
Computer Vision and Pattern Recognition (CVPR), 2024.

26. X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model
Lingmin Ran, Xiaodong Cun, Jia-Wei Liu, Rui Zhao, Song Zijie, Xintao Wang, Jussi Keppo, Mike Zheng Shou
Computer Vision and Pattern Recognition (CVPR), 2024.

25. Sketch Video Synthesis
Yudian Zheng, Xiaodong Cun 📮, Menghan Xia, Chi-Man Pun
Eurographics, 2024.


Year 2023

22. TaleCrafter: Interactive Story Visualization with Multiple Characters
Yuan Gong, Youxin Pang, Xiaodong Cun, Menghan Xia, Yingqing He, Haoxin Chen, Longyue Wang, Yong Zhang, Xintao Wang, Ying Shan and Yujiu Yang
SIGGRAPH Asia (Conference Track), 2023.

20. FateZero: Fusing Attentions for Zero-shot Text-based Video Editing
Chenyang Qi, Xiaodong Cun 📮, Yong Zhang, Chenyang Lei, Xintao Wang, Ying Shan, Qifeng Chen 📮
International Conference on Computer Vision (ICCV), 2023.

ICCV 23 Oral Presentation (2.3%)
PaperDigest Most Influential Papers of ICCV 23 (paperdigest.org).
19. ToonTalker: Cross-Domain Face Reenactment
Yuan Gong, Yong Zhang, Xiaodong Cun, Fei Yin, Yanbo Fan, Xuan Wang, Baoyuan Wu, and Yujiu Yang
International Conference on Computer Vision (ICCV), 2023.

18. LivelySpeaker: Towards Semantic-aware Co-Speech Gesture Generation
Yihao Zhi 🧑‍💻, Xiaodong Cun 🧑‍💻, Xuelin Chen, Xi Shen, Wen Guo, Shaoli Huang, Shenghua Gao
International Conference on Computer Vision (ICCV), 2023.

16. Explicit Visual Prompting for Low-Level Structure Segmentations
Weihuang Liu, Xi Shen, Chi-Man Pun 📮, Xiaodong Cun 📮
Computer Vision and Pattern Recognition (CVPR) & Journal Submission, 2023.

15. T2M-GPT: Generating Human Motion from Textual Descriptions with Discrete Representations
Jianrong Zhang 🧑‍💻, Yangsong Zhang 🧑‍💻, Xiaodong Cun, Shaoli Huang, Yong Zhang, Hongwei Zhao, Hongtao Lu, Xi Shen 📮
Computer Vision and Pattern Recognition (CVPR), 2023.

14. 3D GAN Inversion with Facial Symmetry Prior
Fei Yin, Yong Zhang, Xuan Wang, Tengfei Wang, Xiaoyu Li, Yuan Gong, Yanbo Fan, Xiaodong Cun, Ying Shan, Cengiz Oztireli, Yujiu Yang
Computer Vision and Pattern Recognition (CVPR), 2023.

13. DPE: Disentanglement of Pose and Expression for General Video Portrait Editing
Youxin Pang, Yong Zhang, Weize Quan, Yanbo Fan, Xiaodong Cun, Ying Shan, Dong-ming Yan
Computer Vision and Pattern Recognition (CVPR), 2023.

12. SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
Wenxuan Zhang 🧑‍💻, Xiaodong Cun 🧑‍💻, Xuan Wang, Yong Zhang, Xi Shen, Yu Guo, Ying Shan, Fei Wang
Computer Vision and Pattern Recognition (CVPR), 2023.

Top 10 papers you won't want to miss at CVPR 2023 (voxel51.com).
Top 10 most GitHub-starred CVPR papers (github.com).
10. CoordFill: Efficient High-Resolution Image Inpainting via Parameterized Coordinate Querying
Weihuang Liu, Xiaodong Cun 📮, Chi-Man Pun 📮, Menghan Xia, Yong Zhang, and Jue Wang
AAAI Conference on Artificial Intelligence (AAAI, Oral), 2023.


Year 2022

9. VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
Kun Cheng 🧑‍💻, Xiaodong Cun 🧑‍💻 📮, Yong Zhang, Menghan Xia, Fei Yin, Mingrui Zhu, Xuan Wang, Jue Wang, Nannan Wang
SIGGRAPH Asia (Conference Track), 2022.

Top 10 most GitHub-starred SIGGRAPH papers (github.com).
8. StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pretrained StyleGAN
Fei Yin, Yong Zhang, Xiaodong Cun, Mingdeng Cao, Yanbo Fan, Xuan Wang, Qingyan Bai, Baoyuan Wu, Jue Wang, Yujiu Yang
European Conference on Computer Vision (ECCV), 2022.

7. Spatial-Separated Curve Rendering Network for Efficient and High-Resolution Image Harmonization
Jingtang Liang 🧑‍💻, Xiaodong Cun 🧑‍💻, Chi-Man Pun, Jue Wang
European Conference on Computer Vision (ECCV), 2022.

6. Uformer: A General U-Shaped Transformer for Image Restoration
Zhendong Wang, Xiaodong Cun 📮, Jianmin Bao, Jianzhuang Liu, Wengang Zhou, Houqiang Li
Computer Vision and Pattern Recognition (CVPR), 2022.

PaperDigest Most Influential Papers of CVPR 22 (paperdigest.org).


Before 2021

4. Defocus Blur Detection via Depth Distillation
Xiaodong Cun, Chi-Man Pun
European Conference on Computer Vision (ECCV), 2020.

2. Towards Ghost-free Shadow Removal via Dual Hierarchical Aggregation Network and Shadow Matting GAN
Xiaodong Cun, Chi-Man Pun, Cheng Shi
AAAI Conference on Artificial Intelligence (AAAI), 2020.