[2021] We released OpenDILab, an open-source decision intelligence platform
[2020] 5 papers (1 oral) published at CVPR/ECCV 2020
[2020] My team won 2 championships at ActivityNet 2020: the Spatio-temporal Action Localization (AVA) track and the Trimmed Activity Recognition (Kinetics) track
[2020] My team won the championship of NIST FRVT 1:N, a 12-million-level commercial facial recognition benchmark held by the US government.
[2019] 7 papers (4 oral presentations) published at ICCV/CVPR/AAAI 2019
I am in the process of launching a startup venture. Previously, I was the Executive Director of Research and GM at SenseTime Group, where I spearheaded large-scale AIGC and multi-modal interactive models. There I led a team of approximately 100 top-tier researchers and developers, using over 4,000 GPUs to drive innovative technology and products. I hold a PhD from MMLab, CUHK, supervised by Prof. Xiaogang Wang, and have won multiple international AI competitions, along with a Google PhD Fellowship.
See full list at Google Scholar.
*equal contribution +corresponding author
▲Large model for Multi-modal generation, AIGC
MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines Dongzhi Jiang, Renrui Zhang, Ziyu Guo, Yanmin Wu, Jiayi Lei, Pengshuo Qiu, Pan Lu, Zehui Chen, Guanglu Song, Peng Gao, Yu Liu, Chunyuan Li, Hongsheng Li 2025 ICLR
SmartPretrain: Model-Agnostic and Dataset-Agnostic Representation Learning for Motion Prediction Yang Zhou, Hao Shao, Letian Wang, Steven L. Waslander, Hongsheng Li, Yu Liu 2025 ICLR
Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning Hao Shao, Shengju Qian, Han Xiao, Guanglu Song, Zhuofan Zong, Letian Wang, Yu Liu+, Hongsheng Li (Spotlight) 2024 NeurIPS
Phased Consistency Model Fu-Yun Wang, Zhaoyang Huang, Alexander William Bergman, Dazhong Shen, Peng Gao, Michael Lingelbach, Keqiang Sun, Weikang Bian, Guanglu Song, Yu Liu+, Hongsheng Li, Xiaogang Wang 2024 NeurIPS
CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching Dongzhi Jiang, Guanglu Song, Xiaoshi Wu, Renrui Zhang, Dazhong Shen, Zhuofan Zong, Yu Liu+, Hongsheng Li 2024 NeurIPS
Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models Bingqi Ma, Zhuofan Zong, Guanglu Song, Hongsheng Li, Yu Liu+ 2024 NeurIPS
FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis Linjiang Huang, Rongyao Fang, Aiping Zhang, Guanglu Song, Si Liu, Yu Liu+, Hongsheng Li 2024 ECCV
Deep Reward Supervisions for Tuning Text-to-Image Diffusion Models Xiaoshi Wu, Yiming Hao, Manyuan Zhang, Keqiang Sun, Zhaoyang Huang, Guanglu Song, Yu Liu+, Hongsheng Li 2024 ECCV
ZoLA: Zero-Shot Creative Long Animation Generation with Short Video Model Fu-Yun Wang, Zhaoyang Huang, Qiang Ma, Guanglu Song, Xudong Lu, Weikang Bian, Yijin Li, Yu Liu+, Hongsheng Li (Oral) 2024 ECCV
Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation Fu-Yun Wang, Xiaoshi Wu, Zhaoyang Huang, Xiaoyu Shi, Dazhong Shen, Guanglu Song, Yu Liu+, Hongsheng Li 2024 ECCV
Three Things We Need to Know About Transferring Stable Diffusion to Visual Dense Prediction Tasks Manyuan Zhang, Guanglu Song, Xiaoyu Shi, Yu Liu+, Hongsheng Li 2024 ECCV
Enhancing Vision-Language Model with Unmasked Token Alignment Jihao Liu, Jinliang Zheng, Boxiao Liu, Yu Liu+, Hongsheng Li 2024 TMLR
Rethinking the Spatial Inconsistency in Classifier-Free Diffusion Guidance Dazhong Shen, Guanglu Song, Zeyue Xue, Fu-Yun Wang, Yu Liu+ 2024 CVPR
EasyDrag: Efficient Point-based Manipulation on Diffusion Models Xingzhong Hou, Boxiao Liu, Yi Zhang, Jihao Liu, Yu Liu+, Haihang You 2024 CVPR
GLID: Pre-training a Generalist Encoder-Decoder Vision Model Jihao Liu, Jinliang Zheng, Yu Liu+, Hongsheng Li 2024 CVPR
RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths Zeyue Xue, Guanglu Song, Qiushan Guo, Boxiao Liu, Zhuofan Zong, Yu Liu+, Ping Luo 2023 NeurIPS
Temporal Enhanced Training of Multi-view 3D Object Detector via Historical Object Prediction Zhuofan Zong, Dongzhi Jiang, Guanglu Song, Zeyue Xue, Jingyong Su, Hongsheng Li, Yu Liu+ 2023 ICCV
Towards Better 3D Knowledge Transfer via Masked Image Modeling for Multi-view 3D Understanding Jihao Liu, Tai Wang, Boxiao Liu, Qihang Zhang, Yu Liu+, Hongsheng Li 2023 ICCV
SmartRefine: A Scenario-Adaptive Refinement Framework for Efficient Motion Prediction Yang Zhou, Hao Shao, Letian Wang, Steven L. Waslander, Hongsheng Li, Yu Liu+ 2024 CVPR
LMDrive: Closed-Loop End-to-End Driving with Large Language Models Hao Shao, Yuxuan Hu, Letian Wang, Guanglu Song, Steven L. Waslander, Yu Liu+, Hongsheng Li 2024 CVPR
A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning Yinmin Zhang, Jie Liu, Chuming Li, Yazhe Niu, Yaodong Yang, Yu Liu+, Wanli Ouyang 2024 AAAI
LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios Yazhe Niu, Yuan Pu, Zhenjie Yang, Xueyan Li, Tong Zhou, Jiyuan Ren, Shuai Hu, Hongsheng Li, Yu Liu+ (Spotlight) 2023 NeurIPS
Efficient Reinforcement Learning for Autonomous Driving with Parameterized Skills and Priors Letian Wang, Jie Liu, Hao Shao, Wenshuo Wang, Ruobing Chen, Yu Liu+, Steven L. Waslander 2023 RSS
ReasonNet: End-to-End Driving with Temporal and Global Reasoning Hao Shao, Letian Wang, Ruobing Chen, Steven L. Waslander, Hongsheng Li, Yu Liu+ (CARLA 2022 Champion) 2023 CVPR
GoBigger: A Scalable Platform for Cooperative-Competitive Multi-Agent Interactive Simulation Ming Zhang, Shenghan Zhang, Zhenjie Yang, Lekai Chen, Jinliang Zheng, Chao Yang, Chuming Li, Hang Zhou, Yazhe Niu, Yu Liu+ 2023 ICLR
Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization Junting Pan, Siyu Chen, Mike Zheng Shou, Yu Liu+, Jing Shao, Hongsheng Li 2021 CVPR
Discriminability Distillation in Group Representation Learning Manyuan Zhang, Guanglu Song, Hang Zhou, Yu Liu+ 2020 ECCV
Learning Where to Focus for Efficient Video Object Detection Zhengkai Jiang, Yu Liu+, Ceyuan Yang, Jihao Liu, Gao Peng, Qian Zhang, Shiming Xiang, Chunhong Pan 2020 ECCV
Search to Distill: Pearls are Everywhere but not the Eyes Yu Liu+, Xuhui Jia, Mingxing Tan, Raviteja Vemulapalli, Yukun Zhu, Bradley Green, Xiaogang Wang (Oral) 2020 CVPR
Revisiting the Sibling Head in Object Detector Guanglu Song, Yu Liu+, Xiaogang Wang (OpenImage 2019 Champion) 2020 CVPR
Rotate-and-Render: Unsupervised Photorealistic Face Rotation from Single-View Images Hang Zhou, Jihao Liu, Ziwei Liu, Yu Liu+, Xiaogang Wang 2020 CVPR
KPNet: Towards Minimal Face Detector Guanglu Song, Yu Liu+, Yuhang Zang, Xiaogang Wang, Biao Leng, Qingsheng Yuan (Oral) 2020 AAAI
Talking Face Generation by Adversarially Disentangled Audio-Visual Representation Hang Zhou, Yu Liu, Ziwei Liu, Ping Luo, Xiaogang Wang (Oral) 2019 AAAI
Zoom Out-and-In Network with Map Attention Decision for Region Proposal and Object Detection Hongyang Li, Yu Liu, W Ouyang, X Wang 2018 IJCV
Region-based Quality Estimation Network for Large-Scale Person Re-identification Guanglu Song, Biao Leng, Yu Liu, Congrui Hetang, Shaofan Cai 2018 AAAI
Learning Deep Features via Congenerous Cosine Loss for Person Recognition Yu Liu, Hongyang Li, Xiaogang Wang arXiv:1702.06890, 2017
Scale-Aware Face Detection Zekun Hao, Yu Liu, Hongwei Qin, Junjie Yan 2017 CVPR
POI: Multiple Object Tracking with High Performance Detection and Appearance Feature F Yu, W Li, Q Li, Y Liu, X Shi, J Yan (Top-1 Solution) 2016 ECCV workshop
Crafting GBD-Net for Object Detection X Zeng, W Ouyang, J Yan, H Li, T Xiao, K Wang, Y Liu, Y Zhou, B Yang, ... T-PAMI
3D object understanding with 3D Convolutional Neural Networks B Leng, Y Liu, K Yu, X Zhang, Z Xiong Information Sciences 366, 188-201, 2016
Honors and Awards
Won the 1st prize for scientific and technological progress, CAAI, 2024
Won the 1st place in CARLA Autonomous Driving Challenge 2022
Won the 1st place in ActivityNet 2020, AVA track
Won the 1st place in ActivityNet 2020, Kinetics track
Won the 1st place in NIST FRVT held by the US government in 2020, 2021 and 2022
Won the 1st place in ICCV19 Multi-Moments in Time (MIT) Challenge
Won the 1st place in Google OpenImage Object Detection Challenge 2019
Won the 1st place in Google OpenImage Instance Segmentation Challenge 2019
Google PhD Fellowship in 2019 (1/China, 50/world)
Won the 1st place in ICCV19 Lightweight Face Recognition Challenge
Won the 1st place in NIST-FRVT threshold-based 1:N track 2018
Won the 1st place in Multiple Object Tracking Challenge (MOT16) in 2016
Won the 1st place in detection track of ImageNet (ILSVRC) in 2016
Won the best undergraduate dissertation in 2016 (1/230)
IEEE-Microsoft Undergraduate Fellowship in 2016 (40/world)
The Outstanding Winner of Challenge Cup in 2015 (top 1/China)