Yu Liu's academic page
News
Job positions and PhD program openings are available for candidates interested in AIGC, large models, and RL.
[2024] My team has 19 papers published at NeurIPS/CVPR/TMLR/ICML/ECCV in 2024
[2024] Our AIGC product MiaoHua gained over 3,000,000 users, with DAU exceeding 530,000, within 9 days of launch.
[2024] We were granted the Wu Wenjun Award (吴文俊奖), First Prize for Scientific and Technological Progress
[2024] Our OpenDILab has reached 21,000+ stars on GitHub!
[2023] My team has 20 papers published at TPAMI/ICCV/ICLR/NeurIPS/IROS/AAAI in 2023
[2023] My team won the championship of the CARLA Autonomous Driving Challenge
[2022] My team has 9 papers published at ECCV/CoRL/NeurIPS/AAAI in 2022
[2022] We release DI-Star, an implementation of AlphaStar in PyTorch, beating pro players with 6000+ MMR.
[2021] My team won the best paper award at the ICCV21 MFR workshop
[2021] My team won 3 championships in the ICCV21 Masked Face Recognition Challenge
[2021] We release OpenDILab, an open-source decision intelligence platform
[2020] 5 papers, including 1 oral, published at CVPR/ECCV 2020
[2020] My team won 2 championships of ActivityNet on the Spatio-temporal Action Localization (AVA) track and the Trimmed Activity Recognition (Kinetics) track
[2020] My team won the championship of NIST FRVT 1:N, a 12-million-level commercial facial recognition benchmark held by the US government.
[2019] 7 papers, including 4 oral presentations, published at ICCV/CVPR/AAAI in 2019
[2019] I won 4 championships in 4 ICCV AI challenges: MMIT (solutions), OpenImage Instance Segmentation Challenge (solutions), OpenImage Object Detection Challenge (solutions), LFR 2019 (model and report)
[2019] I was granted the Google PhD Fellowship 2019.
[2018] 4 papers published at CVPR/ECCV/IJCV/AAAI in 2018.
[2017] 6 papers published at CVPR/ICCV/NIPS/AAAI/T-PAMI in 2017.
[2016] We won 1st place in the ECCV-MOT16 Challenge! code
[2016] We won 1st place in the ImageNet Challenge 2016! code
About me
I am the Executive Director of Research and GM at SenseTime Group, spearheading large-scale AIGC and multi-modal interactive models. I lead a team of approximately 100 top-tier researchers and developers, utilizing over 4,000 GPUs to drive innovative technology and products. I hold a PhD from MMLab, CUHK, supervised by Prof. Xiaogang Wang, and have won multiple international AI competitions, along with a Google PhD Fellowship.
Working Experience
Publications
See full list at Google Scholar.
*equal contribution + corresponding author
▼ Large model for Multi-modal generation, AIGC
Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning
Hao Shao, Shengju Qian, Han Xiao, Guanglu Song, Zhuofan Zong, Letian Wang, Yu Liu+ , Hongsheng Li
(Spotlight) 2024 NeurIPS
Instruction-Guided Visual Masking
Jinliang Zheng, Jianxiong Li, Sijie Cheng, Yinan Zheng, Jiaming Li, Jihao Liu, Yu Liu , Jingjing Liu, Xianyuan Zhan
2024 NeurIPS
MoVA: Adapting Mixture of Vision Experts to Multimodal Context
Zhuofan Zong, Bingqi Ma, Dazhong Shen, Guanglu Song, Hao Shao, Dongzhi Jiang, Hongsheng Li, Yu Liu+
2024 NeurIPS
Phased Consistency Model
Fu-Yun Wang, Zhaoyang Huang, Alexander William Bergman, Dazhong Shen, Peng Gao, Michael Lingelbach, Keqiang Sun, Weikang Bian, Guanglu Song, Yu Liu , Hongsheng Li, Xiaogang Wang
2024 NeurIPS
CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching
Dongzhi Jiang, Guanglu Song, Xiaoshi Wu, Renrui Zhang, Dazhong Shen, Zhuofan Zong, Yu Liu+ , Hongsheng Li
2024 NeurIPS
Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models
Bingqi Ma, Zhuofan Zong, Guanglu Song, Hongsheng Li, Yu Liu+
2024 NeurIPS
FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis
Linjiang Huang, Rongyao Fang, Aiping Zhang, Guanglu Song, Si Liu, Yu Liu , Hongsheng Li
2024 ECCV
Deep Reward Supervisions for Tuning Text-to-Image Diffusion Models
Xiaoshi Wu, Yiming Hao, Manyuan Zhang, Keqiang Sun, Zhaoyang Huang, Guanglu Song, Yu Liu , Hongsheng Li
2024 ECCV
ZoLA: Zero-Shot Creative Long Animation Generation with Short Video Model
Fu-Yun Wang, Zhaoyang Huang, Qiang Ma, Guanglu Song, Xudong Lu, Weikang Bian, Yijin Li, Yu Liu, Hongsheng Li
(Oral) 2024 ECCV
Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation
Fu-Yun Wang, Xiaoshi Wu, Zhaoyang Huang, Xiaoyu Shi, Dazhong Shen, Guanglu Song, Yu Liu, Hongsheng Li
2024 ECCV
Three Things We Need to Know About Transferring Stable Diffusion to Visual Dense Prediction Tasks
Manyuan Zhang, Guanglu Song, Xiaoyu Shi, Yu Liu , Hongsheng Li
2024 ECCV
Enhancing Vision-Language Model with Unmasked Token Alignment
Jihao Liu, Jinliang Zheng, Boxiao Liu, Yu Liu , Hongsheng Li
2024 TMLR
Rethinking the Spatial Inconsistency in Classifier-Free Diffusion Guidance
Dazhong Shen, Guanglu Song, Zeyue Xue, Fu-Yun Wang, Yu Liu
2024 CVPR
EasyDrag: Efficient Point-based Manipulation on Diffusion Models
Xingzhong Hou, Boxiao Liu, Yi Zhang, Jihao Liu, Yu Liu , Haihang You
2024 CVPR
GLID: Pre-training a Generalist Encoder-Decoder Vision Model
Jihao Liu, Jinliang Zheng, Yu Liu , Hongsheng Li
2024 CVPR
RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths
Zeyue Xue, Guanglu Song, Qiushan Guo, Boxiao Liu, Zhuofan Zong, Yu Liu+ , Ping Luo
2023 NeurIPS
Temporal Enhanced Training of Multi-view 3D Object Detector via Historical Object Prediction
Zhuofan Zong, Dongzhi Jiang, Guanglu Song, Zeyue Xue, Jingyong Su, Hongsheng Li, Yu Liu+
2023 ICCV
Towards Better 3D Knowledge Transfer via Masked Image Modeling for Multi-view 3D Understanding
Jihao Liu, Tai Wang, Boxiao Liu, Qihang Zhang, Yu Liu , Hongsheng Li
2023 ICCV
UniFormer: Unifying Convolution and Self-Attention for Visual Recognition
Kunchang Li, Yali Wang, Junhao Zhang, Peng Gao, Guanglu Song, Yu Liu , Hongsheng Li, Yu Qiao
2023 TPAMI
MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers
Jihao Liu, Xin Huang, Jinliang Zheng, Yu Liu+ , Hongsheng Li
2023 CVPR
▼ Large-scale Reinforcement Learning, Embodied AI
MaskMA: Towards Zero-Shot Multi-Agent Decision Making with Mask-Based Collaborative Learning
Jie Liu, Yinmin Zhang, Chuming Li, Zhiyuan You, Zhanhui Zhou, Chao Yang, Yaodong Yang, Yu Liu , Wanli Ouyang
2024 TMLR
DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning
Jianxiong Li, Jinliang Zheng, Yinan Zheng, Liyuan Mao, Xiao Hu, Sijie Cheng, Haoyi Niu, Jihao Liu, Yu Liu , Jingjing Liu, Ya-Qin Zhang, Xianyuan Zhan
2024 ICML
SmartRefine: A Scenario-Adaptive Refinement Framework for Efficient Motion Prediction
Yang Zhou, Hao Shao, Letian Wang, Steven L. Waslander, Hongsheng Li, Yu Liu
2024 CVPR
LMDrive: Closed-Loop End-to-End Driving with Large Language Models
Hao Shao, Yuxuan Hu, Letian Wang, Guanglu Song, Steven L. Waslander, Yu Liu , Hongsheng Li
2024 CVPR
Critic-Guided Decision Transformer for Offline Reinforcement Learning
Yuanfu Wang, Chao Yang, Ying Wen, Yu Liu , Yu Qiao
2024 AAAI
A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning
Yinmin Zhang, Jie Liu, Chuming Li, Yazhe Niu, Yaodong Yang, Yu Liu , Wanli Ouyang
2024 AAAI
LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios , Code
Yazhe Niu, Yuan Pu, Zhenjie Yang, Xueyan Li, Tong Zhou, Jiyuan Ren, Shuai Hu, Hongsheng Li, Yu Liu
(Spotlight) 2023 NeurIPS
Theoretically Guaranteed Policy Improvement Distilled from Model-Based Planning
Chuming Li, Ruonan Jia, Jie Liu, Yinmin Zhang, Yazhe Niu, Yaodong Yang, Yu Liu , Wanli Ouyang
2023 ECAI
Accelerating Reinforcement Learning for Autonomous Driving using Task-Agnostic and Ego-Centric Motion Skills
Tong Zhou, Letian Wang, Ruobing Chen, Wenshuo Wang, Yu Liu+
2023 IROS
Theoretically Guaranteed Policy Improvement Distilled from Model-Based Planning
Chuming Li, Ruonan Jia, Jiawei Yao, Jie Liu, Yinmin Zhang, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang
2023 IJCAI - PRL workshop
Efficient Reinforcement Learning for Autonomous Driving with Parameterized Skills and Priors
Letian Wang, Jie Liu, Hao Shao, Wenshuo Wang, Ruobing Chen, Yu Liu+ , Steven L. Waslander
2023 RSS
ReasonNet: End-to-End Driving with Temporal and Global Reasoning
Hao Shao, Letian Wang, Ruobing Chen, Steven L. Waslander, Hongsheng Li, Yu Liu+
(CARLA 2022 Champion) 2023 CVPR
GoBigger: A Scalable Platform for Cooperative-Competitive Multi-Agent Interactive Simulation , Project , Challenge
Ming Zhang, Shenghan Zhang, Zhenjie Yang, Lekai Chen, Jinliang Zheng, Chao Yang, Chuming Li, Hang Zhou, Yazhe Niu, Yu Liu+
2023 ICLR
Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency
Chuming Li, Jie Liu, Yinmin Zhang, Yuhong Wei, Yazhe Niu, Yaodong Yang, Yu Liu+ , Wanli Ouyang
2023 AAAI
Safety-Enhanced Autonomous Driving Using Interpretable Sensor Fusion Transformer
Hao Shao, Letian Wang, Ruobing Chen, Hongsheng Li, Yu Liu+
2022 CoRL
▼ Large-scale Optimization, Computer Vision
Teach-DETR: Better Training DETR with Teachers
Linjiang Huang, Kaixin Lu, Guanglu Song, Liang Wang, Si Liu, Yu Liu , Hongsheng Li
2023 TPAMI
DETRs with Collaborative Hybrid Assignments Training
Zhuofan Zong, Guanglu Song, Yu Liu+
2023 ICCV
Decoupled DETR: Spatially Disentangling Localization and Classification for Improved End-to-End Object Detection
Manyuan Zhang, Guanglu Song, Yu Liu , Hongsheng Li
2023 ICCV
Masked Autoencoders Are Stronger Knowledge Distillers for Object Detectors
Shanshan Lao, Guanglu Song, Boxiao Liu, Yu Liu , Yujiu Yang
2023 ICCV
UniKD: Universal Knowledge Distillation for Mimicking Homogeneous or Heterogeneous Object Detectors
Shanshan Lao, Guanglu Song, Boxiao Liu, Yu Liu , Yujiu Yang
2023 ICCV
Large-batch Optimization for Dense Visual Predictions , Code
Zeyue Xue, Jianming Liang, Guanglu Song, Zhuofan Zong, Liang Chen, Yu Liu+ , Ping Luo+
2022 NeurIPS
UniNet: Unified Architecture Search with Convolution, Transformer, and MLP
Jihao Liu, Xin Huang, Guanglu Song, Hongsheng Li+ , Yu Liu+
2022 ECCV
Self-slimmed Vision Transformer
Zhuofan Zong, Kunchang Li, Guanglu Song, Yali Wang, Yu Qiao, Biao Leng, Yu Liu+
2022 ECCV
TokenMix: Rethinking Image Mixing for Data Augmentation in Vision Transformers
Jihao Liu, Boxiao Liu, Hang Zhou, Hongsheng Li+ , Yu Liu+
2022 ECCV
Unifying Visual Perception by Dispersible Points Learning
Jianming Liang, Guanglu Song, Biao Leng, Yu Liu+
2022 ECCV
Towards Robust Face Recognition with Comprehensive Search
Manyuan Zhang, Guanglu Song, Yu Liu+ , Hongsheng Li
2022 ECCV
Rethinking Robust Representation Learning Under Fine-grained Noisy Faces
Bingqi Ma, Guanglu Song, Boxiao Liu, Yu Liu+
2022 ECCV
UniFormer: Unified Transformer for Efficient Spatial-Temporal Representation Learning , Code
Kunchang Li, Yali Wang, Peng Gao, Guanglu Song, Yu Liu , Hongsheng Li, Yu Qiao
2022 ICLR
Switchable K-class Hyperplanes for Noise-robust Representation Learning
Boxiao Liu, Guanglu Song, Manyuan Zhang, Haihang You, Yu Liu+
2021 ICCV
Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization
Junting Pan, Siyu Chen, Mike Zheng Shou, Yu Liu , Jing Shao, Hongsheng Li
2021 CVPR
Discriminability Distillation in Group Representation Learning
Manyuan Zhang, Guanglu Song, Hang Zhou, Yu Liu+
2020 ECCV
Learning Where to Focus for Efficient Video Object Detection
Zhengkai Jiang, Yu Liu+ , Ceyuan Yang, Jihao Liu, Gao Peng, Qian Zhang, Shiming Xiang, Chunhong Pan
2020 ECCV
Search to Distill: Pearls are Everywhere but not the Eyes
Yu Liu , Xuhui Jia, Mingxing Tan, Raviteja Vemulapalli, Yukun Zhu, Bradley Green, Xiaogang Wang
(Oral) 2020 CVPR
Revisiting the Sibling Head in Object Detector
Guanglu Song, Yu Liu+ , Xiaogang Wang
(OpenImage 2019 Champion) 2020 CVPR
Rotate-and-Render: Unsupervised Photorealistic Face Rotation from Single-View Images
Hang Zhou, Jihao Liu, Ziwei Liu, Yu Liu+ , Xiaogang Wang
2020 CVPR
KPNet: Towards Minimal Face Detector
Guanglu Song, Yu Liu+ , Yuhang Zang, Xiaogang Wang, Biao Leng, Qingsheng Yuan
(Oral) 2020 AAAI
Temporal Interlacing Network
Hao Shao, Shengju Qian, Yu Liu+
2020 AAAI
Differentiable Kernel Evolution
Yu Liu , Jihao Liu, Ailing Zeng, Xiaogang Wang
2019 IEEE ICCV
Towards Flops-constrained Face Recognition , Code
Yu Liu* , Guanglu Song*, Manyuan Zhang*, Jihao Liu*, Yucong Zhou, Junjie Yan
(Top-1 Solution) 2019 ICCV Lightweight Face Recognition Challenge & Workshop
Gradient Harmonized Single-stage Detector, Code
Buyu Li*, Yu Liu* , Xiaogang Wang
(Oral) 2019 AAAI
Conditional Adversarial Generative Flow for Controllable Image Synthesis
Rui Liu, Yu Liu , Xinyu Gong, Xiaogang Wang, Hongsheng Li
2019 CVPR
Exploring Disentangled Feature Representation Beyond Face Identification
Yu Liu , Fangyin Wei, Jing Shao, Lv Sheng, Junjie Yan, Xiaogang Wang
2018 CVPR
Transductive Centroid Projection for Semi-supervised Large-scale Recognition
Yu Liu , Guanglu Song, Jing Shao, Xiao Jin, Xiaogang Wang
2018 ECCV
Rethinking Feature Discrimination and Polymerization for Large-scale Recognition
Yu Liu , Hongyang Li, Xiaogang Wang
2017 NIPS deep learning workshop
Recurrent Scale Approximation for Object Detection in CNN , Code
Yu Liu , Hongyang Li, Junjie Yan et al.
2017 IEEE ICCV
Beyond Trade-off: Accelerate FCN-based Face Detector with Higher Accuracy
Guanglu Song*, Yu Liu* , Ming Jiang, Yujie Wang, Junjie Yan, Biao Leng
2018 CVPR
Quality Aware Network for Set to Set Recognition , Code
Yu Liu , Junjie Yan, Wanli Ouyang
2017 CVPR
Knowledge Distillation via Route Constrained Optimization
Xiao Jin, Baoyun Peng, Yichao Wu, Yu Liu , Jiaheng Liu, Ding Liang, Junjie Yan, Xiaolin Hu
(Oral) 2019 IEEE ICCV
Correlation Congruence for Knowledge Distillation
Baoyun Peng, Xiao Jin, Jiaheng Liu, Shunfeng Zhou, Yichao Wu, Yu Liu , Dongsheng Li, Zhaoning Zhang
2019 IEEE ICCV
Talking Face Generation by Adversarially Disentangled Audio-Visual Representation , Code
Hang Zhou, Yu Liu , Ziwei Liu, Ping Luo, Xiaogang Wang
(Oral) 2019 AAAI
Zoom Out-and-In Network with Map Attention Decision for Region Proposal and Object Detection , Code
Hongyang Li, Yu Liu , W Ouyang, X Wang
2018 IJCV
Region-based Quality Estimation Network for Large-Scale Person Re-identification
Guanglu Song, Biao Leng, Yu Liu , Congrui Hetang, Shaofan Cai
2018 AAAI
Learning Deep Features via Congenerous Cosine Loss for Person Recognition , Code
Yu Liu , Hongyang Li, Xiaogang Wang
arxiv:1702.06890, 2017
Scale-Aware Face Detection
Zekun Hao, Yu Liu , Hongwei Qin, Junjie Yan
2017 CVPR
POI: Multiple Object Tracking with High Performance Detection and Appearance Feature
F Yu, W Li, Q Li, Y Liu , X Shi, J Yan
(Top-1 Solution) 2016 ECCV workshop
Crafting GBD-Net for Object Detection
X Zeng, W Ouyang, J Yan, H Li, T Xiao, K Wang, Y Liu , Y Zhou, B Yang, ...
T-PAMI
3D object understanding with 3D Convolutional Neural Networks
B Leng, Y Liu , K Yu, X Zhang, Z Xiong
Information Sciences 366, 188-201, 2016
Honors and Awards
Won the 1st prize for scientific and technological progress, CAAI, 2024
Won 1st place in the CARLA Autonomous Driving Challenge 2022
Won 1st place in ActivityNet 2020, AVA track
Won 1st place in ActivityNet 2020, Kinetics track
Won 1st place in NIST FRVT, held by the US government, in 2020, 2021 and 2022
Won 1st place in the ICCV19 Multi-Moments in Time (MIT) Challenge
Won 1st place in the Google OpenImage Object Detection Challenge 2019
Won 1st place in the Google OpenImage Instance Segmentation Challenge 2019
Google PhD Fellowship in 2019 (1/China, 50/world)
Won 1st place in the ICCV19 Lightweight Face Recognition Challenge
Won 1st place in the NIST-FRVT threshold-based 1:N track 2018
Won 1st place in the Multiple Object Tracking Challenge (MOT16) in 2016
Won 1st place in the detection track of ImageNet (ILSVRC) in 2016
Won the best undergraduate dissertation award in 2016 (1/230)
IEEE-Microsoft Undergraduate Fellowship in 2016 (40/world)
The Outstanding Winner of Challenge Cup in 2015 (top 1/China)
Projects & Datasets
GitHub
OpenDILab, a generalized Decision Intelligence engine.
X-Temporal, a toolkit for easily implementing SOTA video understanding methods with PyTorch on multiple machines and GPUs
CaffeMex v2.3, a multi-GPU and memory-reduced MAT-Caffe on Linux and Windows
Labeled Pedestrians in the Wild, a large-scale pedestrian re-identification benchmark