Study

Title
Professor, Doctoral & Graduate Supervisor, School of Future Technology
leizhang@idea.edu.cn
Honor
Distinguished Scientist, International Digital Economy Academy (IDEA), Guangdong-Hong Kong-Macao Greater Bay Area
MEng: 1) Electronic Information
MS: 1) Intelligent Science and Technology
Ph.D.: 1) Electronic Information; 2) Intelligent Science and Technology
Zhang Lei is a Chair Scientist at the International Digital Economy Academy (IDEA) in the Guangdong-Hong Kong-Macao Greater Bay Area, where he is responsible for research in computer vision and robotics. He also serves as a visiting professor at South China University of Technology and the Hong Kong University of Science and Technology (Guangzhou). He previously served as a principal researcher at Microsoft Research Asia, at Microsoft Research at the company's headquarters, and in Microsoft product groups working on computer vision. Over many years he has led research teams in fundamental computer vision research and in applied research on large-scale image analysis, object detection, and vision-language multimodal understanding. The results of this research have been widely deployed in Microsoft Bing search and in the Cognitive Services cloud platform. He has published over 150 papers in computer vision and related fields and holds more than 60 granted U.S. patents. He was elected an IEEE Fellow in 2020 for his contributions to large-scale image recognition and multimedia information retrieval.
Computer vision, machine learning, etc.
Huang Y, Wang J, Zeng A, et al. DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025.
Yang J, Zeng A, Ren T, et al. ED-Pose++: Enhanced Explicit Box Detection for Conventional and Interactive Multi-Object Keypoint Detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025.
Zhang Y, Wu G, Chen L H, et al. HumanMM: Global Human Motion Recovery from Multi-shot Videos[C]//Proceedings of the Computer Vision and Pattern Recognition Conference. 2025: 1973-1983.
Huang S, Li F, Zhang H, et al. A Mutual Supervision Framework for Referring Expression Segmentation and Generation[J]. International Journal of Computer Vision, 2025, 133(6): 3597-3612.
Zhang Y, Lin J, Zeng A, et al. Motion-X++: A Large-Scale Multimodal 3D Whole-Body Human Motion Dataset[J]. arXiv preprint arXiv:2501.05098, 2025.
Qu J, Li H, Liu S, et al. TAPTRv3: Spatial and Temporal Context Foster Robust Tracking of Any Point in Long Video[J]. arXiv preprint arXiv:2411.18671, 2024.
Ren T, Chen Y, Jiang Q, et al. DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding[J]. arXiv preprint arXiv:2411.14347, 2024.
Liu S, Zeng Z, Ren T, et al. Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection[C]//European Conference on Computer Vision. Cham: Springer Nature Switzerland, 2024: 38-55.
Yang J, Zeng A, Zhang R, et al. X-Pose: Detecting Any Keypoints[C]//European Conference on Computer Vision. Cham: Springer Nature Switzerland, 2024: 249-268.
Ren T, Jiang Q, Liu S, et al. Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection[J]. arXiv preprint arXiv:2405.10300, 2024.