ZHANG Huaidong

Academic Title: Associate Professor

Research Interests: Computer vision, Embodied intelligence, Continual learning, Representation learning

Biography

Huaidong Zhang is an Associate Professor at the School of Future Technology, South China University of Technology (SCUT), and a member of the State Key Laboratory of Subtropical Building Science. He is also a member of the Committee on Visualization and Cognitive Computing of the China Society of Image and Graphics. His research focuses on visual perception and decision making, and on the theory of continual optimization of visual models.

He has led a National Natural Science Foundation of China (Youth Fund) project and a Guangdong Provincial Natural Science Foundation (General Program) project, and was selected for the Guangzhou 'Sailing Program' for Young Doctors. The products he has developed apply intelligent cognitive algorithms to high-precision vehicle and pedestrian detection, anomalous event detection, and traffic flow statistics. This work was recognized with the Second Prize of the Science and Technology Progress Award from the China Society of Image and Graphics in 2022. His research in visual perception and decision making, continual learning of visual models, and lightweight model design has produced results at the international frontier. In the past five years, he has published over forty papers, with representative works appearing in international conferences and journals such as CVPR, ECCV, AAAI, IJCAI, TPAMI, TIP, TMM, TVCG, and TCSVT.

His research group published six papers at CVPR, the international computer vision conference, from 2023 to 2025, all with students from the group as first authors. The group's current research directions include embodied intelligence, continual learning, and knowledge distillation.

Contact: huaidongz@scut.edu.cn

Education Background

2015-2020, Ph.D., Computer Science and Technology, SCUT, China

2011-2015, B.S., Computer Science and Technology, SCUT, China

Working Experience

2022-Present, Associate Professor (Pre-tenure), School of Future Technology, SCUT

2020-2022, Postdoctoral Researcher, The Hong Kong Polytechnic University

Courses Taught

3D Visual Intelligence Technology

High-Level Language Programming

High-Level Language Programming Practical Training

Summer Course: Artificial Intelligence and High-end Manufacturing

Projects

[1] National Natural Science Foundation of China (Youth Fund C), 'Research on Optimization of Visual Object Detection Model Based on Limited Labeled Data'. 2024.01.01-2026.12.31, 300,000 RMB, Principal Investigator, Ongoing.

[2] Guangdong Provincial Natural Science Foundation Project, 'Research on 3D Intelligent Modeling and Driving Technology for Representative Figures of Intangible Cultural Heritage'. 2024.01-2026.12, 150,000 RMB, Principal Investigator, Ongoing.

[3] Guangzhou Basic and Applied Research Project, 'Intelligent 3D Digital Human Reconstruction of Classical Lingnan Cantonese Opera Intangible Cultural Heritage Figures'. 2024.01-2025.12, 50,000 RMB, Principal Investigator, Ongoing.

Selected Publications

[1] Xie, X., Huang, Z., Xu, W., Xiao, P., Xu, X., & Zhang, H. (2025). Let's Chorus: Partner-aware Hybrid Song-Driven 3D Head Animation. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[2] Liu, S., Lv, J., Kang, J., Zhang, H., Liang, Z., & He, S. (2025). MODfinity: Unsupervised Domain Adaptation with Multimodal Information Flow Intertwining. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[3] Zheng, Y., Jiang, Z., He, S., Sun, Y., Dong, J., Zhang, H., & Du, Y. (2025). NexusGS: Sparse View Synthesis with Epipolar Depth Priors in 3D Gaussian Splatting. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[4] Li, X., Zhan, J., He, S., Xu, Y., Dong, J., Zhang, H., & Du, Y. (2025). PersonaMagic: Stage-Regulated High-Fidelity Face Customization with Tandem Equilibrium. Proceedings of the AAAI Conference on Artificial Intelligence.

[5] Zhong, Y., Yan, Z., Xie, Y., Wu, S., Zhang, H., Shu, L., & Zhou, P. (2025). MSSDA: multi-sub-source adaptation for diabetic foot neuropathy recognition. Proceedings of the AAAI Conference on Artificial Intelligence.

[6] Zhang, H., Xie, Y., Zhang, H., Xu, C., Luo, X., Chen, D., Xu, X., Zhang, H., Heng, P. A., & He, S. (2025). Unambiguous granularity distillation for asymmetric image retrieval. Neural Networks, 107303.

[7] Zhou, Y., Ye, D., Zhang, H., Xu, X., Sun, H., Xu, Y., Liu, X., & Zhou, Y. (2025). Recurrent Diffusion for 3D Point Cloud Generation from a Single Image. IEEE Transactions on Image Processing.

[8] Liu, B., Zheng, C., Xu, X., Xu, C., Zhang, H., & He, S. (2025). Rotation-Adaptive Point Cloud Domain Generalization Via Intricate Orientation Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9] Huang, Z., Xu, X., Xu, C., Zhang, H., Zheng, C., Qin, J., & He, S. (2024). Beat-It: Beat-Synchronized Multi-Condition 3D Dance Generation. European Conference on Computer Vision.

[10] Xiao, P., Xie, Y., Xu, X., Chen, W., & Zhang, H. (2024). Multi-person Pose Forecasting with Individual Interaction Perceptron and Prior Learning. European Conference on Computer Vision, 402-419.

[11] Yang, Z., Jiang, Z., Li, X., Zhou, H., Dong, J., Zhang, H., & Du, Y. (2024). D⁴-VTON: Dynamic Semantics Disentangling for Differential Diffusion Based Virtual Try-On. European Conference on Computer Vision, 36-52.

[12] Jiang, X., Zheng, C., Xu, X., Liu, B., Zheng, W., Zhang, H., & He, S. (2024). VrdONE: One-stage Video Visual Relation Detection. Proceedings of the 32nd ACM International Conference on Multimedia, 1437-1446.

[13] Yu, Y., Liu, B., Zheng, C., Xu, X., Zhang, H., & He, S. (2024). Beyond textual constraints: Learning novel diffusion conditions with fewer examples. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 7109-7118.

[14] Zhang, H., Huang, R., Xie, Y., & Zhang, H. (2024). Mask4Align: Aligned Entity Prompting with Color Masks for Multi-Entity Localization Problems. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 13373-13383.

[15] Xie, Y., Lin, Y., Cai, W., Xu, X., Zhang, H., Du, Y., & He, S. (2024). D3still: Decoupled differential distillation for asymmetric image retrieval. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 17181-17190.

[16] Zhang, W., Xu, C., Xu, X., Zhang, H., Zhao, R., & Qin, J. (2024). Exploiting Multi-View Clues for Context-Aware Unified Lumbar MRI Identification and Diagnosis. 2024 International Joint Conference on Neural Networks (IJCNN), 1-9.

[17] Cai, W., Xu, X., Xu, J., Zhang, H., Yang, H., Zhang, K., & He, S. (2024). Hierarchical damage correlations for old photo restoration. Information Fusion, 107, 102340.

[18] Cai, W., Zhang, H., Xu, X., Xu, C., Zhang, K., & He, S. (2024). Delving into Important Samples of Semi-Supervised Old Photo Restoration: A New Dataset and Method. IEEE Transactions on Multimedia, 26, 9866-9879.

[19] Xu, C., Xu, Y., Zhang, H., Xu, X., & He, S. (2024). DreamAnime: Learning Style-Identity Textual Disentanglement for Anime and Beyond. IEEE Transactions on Visualization and Computer Graphics, 1-12.

[20] Yang, H., Xu, X., Xu, C., Zhang, H., Qin, J., Wang, Y., Heng, P.-A., & He, S. (2024). G²Face: High-Fidelity Reversible Face Anonymization via Generative and Geometric Priors. IEEE Transactions on Information Forensics and Security, 19, 8773-8785.

[21] Zheng, C., Liu, B., Xu, X., Zhang, H., & He, S. (2024). Learning an interpretable stylized subspace for 3D-aware animatable artforms. IEEE Transactions on Visualization and Computer Graphics, 1-13.

[22] Zhou, Y., Qian, J., Zhang, H., Xu, X., Sun, H., Zeng, F., & Zhou, Y. (2024). Adaptive multi-text union for stable text-to-image synthesis learning. Pattern Recognition, 152, 110438.

[23] Zhou, Y., Sun, H., Zhang, H., Xu, X., Ye, D., Zhou, Y., Liu, X., et al. (2024). GaFL: Geometric-aware Feature Learning for universal 3D models recognition. Pattern Recognition, 149, 110214.

[24] Xu, C., Xu, X., Zhao, N., Cai, W., Zhang, H., Li, C., & Liu, X. (2023). Panel-page-aware comic genre understanding. IEEE Transactions on Image Processing, 32, 2636-2648.

[25] Cai, W., Zhang, H., Xu, X., He, S., Zhang, K., & Qin, J. (2023). Contextual-assisted scratched photo restoration. IEEE Transactions on Circuits and Systems for Video Technology, 33(10), 5458-5469.

[26] Zhou, Y., Dang, Z., Zhang, H., Xu, X., Qin, J., Li, W., Zeng, F., & Liu, X. (2023). EFSCNN: Encoded feature sphere convolution neural network for fast non-rigid 3D models classification and retrieval. Computer Vision and Image Understanding, 233, 103724.

[27] Huang, X., Zhou, N., Huang, J., Zhang, H., Pedrycz, W., & Choi, K.-S. (2023). Center transfer for supervised domain adaptation. Applied Intelligence, 53(15), 18277-18293.

[28] Xie, Y., Zhang, H., Xu, X., Zhu, J., & He, S. (2023). Towards a smaller student: Capacity dynamic distillation for efficient image retrieval. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 16006-16015.

[29] Zheng, C., Liu, B., Zhang, H., Xu, X., & He, S. (2023). Where is my spot? Few-shot image generation via latent subspace optimization. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 3272-3281.

[30] Xiao, W., Xu, C., Zhang, H., & Xu, X. (2022). Spatial-Aware GAN for Instance-Guided Cross-Spectral Face Hallucination. CAAI International Conference on Artificial Intelligence, 93-105.

[31] Xu, S., Zhang, H., Xu, X., Hu, X., Xu, Y., Dai, L., Choi, K.-S., & Heng, P.-A. (2022). Representative feature alignment for adaptive object detection. IEEE Transactions on Circuits and Systems for Video Technology, 33(2), 689-700.

[32] Xu, X., Chen, J., Zhang, H., & Han, G. (2022). SA-DPNet: Structure-aware dual pyramid network for salient object detection. Pattern Recognition, 127, 108624.

[33] Xu, X., Chen, J., Zhang, H., & Ng, W. W. (2021). D4Net: De-deformation defect detection network for non-rigid products with large patterns. Information Sciences, 547, 763-776.

[34] Xu, Y., Yan, M., Xu, C., Zhang, H., Liu, Y., & Xu, X. (2021). Adaptive selecting and learning network and a new benchmark for imbalanced fine-grained ship classification. IEEE Access, 9, 58116-58126.

[35] Yan, X., Zhang, H., Xu, X., Hu, X., & Heng, P.-A. (2021). Learning semantic context from normal samples for unsupervised anomaly detection. Proceedings of the AAAI Conference on Artificial Intelligence, 35(4), 3110-3118.

[36] Ye, C., Zhang, H., Xu, X., Cai, W., Qin, J., & Choi, K.-S. (2021). Object Detection in Densely Packed Scenes via Semi-Supervised Learning with Dual Consistency. Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 1245-1251.

[37] Zhang, H., Han, C., Zhang, X., Du, Y., Xu, X., Han, G., Qin, J., & He, S. (2021). Fast scene labeling via structural inference. Neurocomputing, 442, 317-326.

[38] Xu, X., Chen, J., Zhang, H., & Han, G. (2020). Dual pyramid network for salient object detection. Neurocomputing, 375, 113-123.

[39] Zhang, H., Xu, X., Han, G., & He, S. (2020). Context-aware and scale-insensitive temporal repetition counting. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 670-678.

[40] Mo, Y., Han, G., Zhang, H., Xu, X., & Qu, W. (2019). Highlight-assisted nighttime vehicle detection using a multi-level fusion network and label hierarchy. Neurocomputing, 355, 13-23.

[41] Xu, X., He, H., Zhang, H., Xu, Y., & He, S. (2019). Unsupervised domain adaptation via importance sampling. IEEE Transactions on Circuits and Systems for Video Technology, 30(12), 4688-4699.

[42] Xu, X., Xie, M., Miao, P., Qu, W., Xiao, W., Zhang, H., Liu, X., & Wong, T.-T. (2019). Perceptual-aware sketch simplification based on integrated VGG layers. IEEE Transactions on Visualization and Computer Graphics, 27(1), 178-189.

[43] Zhang, H., Xu, X., He, H., He, S., Han, G., Qin, J., & Wu, D. (2019). Fast user-guided single image reflection removal via edge-aware cascaded networks. IEEE Transactions on Multimedia, 22(8), 2012-2023.

[44] Xu, X., Zhang, H., Han, G., Kwan, K. C., Pang, W.-M., Fang, J., & Zhao, G. (2016). A Two-Phase Space Resection Model for Accurate Topographic Reconstruction from Lunar Imagery with Pushbroom Scanners. Sensors, 16(4), 507.