Huaidong Zhang (张怀东)
Title: Associate Professor

Biography

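Huaidong Zhang is an Associate Professor at the School of Future Technology, South China University of Technology, a member of the State Key Laboratory of Subtropical Building and Urban Science, and a committee member of the Visualization and Cognitive Computing Technical Committee of the China Graphics Society. His research addresses the theory of visual perception and decision-making and the continual optimization of vision models. He is principal investigator of a National Natural Science Foundation of China Young Scientists Fund project and a Guangdong Natural Science Foundation General Program project, and was selected for the Guangzhou Young Doctoral "Qihang" (Set Sail) Program. Products he has developed implement high-accuracy intelligent cognition algorithms, including pedestrian and vehicle detection, abnormal event detection, and traffic flow statistics; this work received the Second Prize of the Science and Technology Progress Award from the China Society of Image and Graphics in 2022. He has obtained internationally leading-edge results in visual perception and decision-making, continual learning for vision models, and model lightweighting. Over the past five years he has published more than forty papers, with representative work appearing in international conferences and journals including CVPR, ECCV, AAAI, IJCAI, TPAMI, TIP, TMM, TVCG, and TCSVT.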

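From 2023 to 2025, the research group published 6 papers at CVPR, the international computer vision conference, all with students from the group as first authors. The group's current research directions include embodied intelligence, continual learning, and knowledge distillation. Students interested in joining the group are welcome to get in touch by email: huaidongz@scut.edu.cn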

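Education

2011-2015, Bachelor's degree, South China University of Technology, Computer Science and Technology

2015-2020, Ph.D., South China University of Technology, Computer Science and Technology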

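Work Experience

2022-present, Tenure-Track Associate Professor, School of Future Technology, South China University of Technology

2020-2022, Postdoctoral Fellow, The Hong Kong Polytechnic University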

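Research Projects

(1) National Young Scientists Fund Project (Category C) [formerly the Young Scientists Fund Project], Research on Optimizing Visual Object Detection Models with Limited Labeled Data, 2024-01-01 to 2026-12-31, RMB 300,000, ongoing, principal investigator

(2) Guangdong Natural Science Foundation Project, Research on 3D Intelligent Modeling and Driving Technology for Representative Figures of Intangible Cultural Heritage, 2024-01 to 2026-12, RMB 150,000, ongoing, principal investigator

(3) Guangzhou Basic and Applied Research Special Project, Intelligent 3D Digital Human Reconstruction of Classic Intangible Cultural Heritage Characters from Lingnan Cantonese Opera, 2024-01 to 2025-12, RMB 50,000, ongoing, principal investigator

(4) National Natural Science Foundation of China General Program Project, Visual Attention-Driven Image Editing and Completion, 2020-01-01 to 2023-12-31, RMB 580,000, completed, participant

(5) National Key R&D Program Project, ********, 2023-07 to 2025-06, RMB 2,000,000, ongoing, participant

(6) Guangdong International Science and Technology Cooperation Program Project, Research on Image Understanding Based on Cross-Domain Transfer Learning and Its Applications, 2022-11 to 2024-10, RMB 500,000, completed, participant

(7) Guangdong Key-Area R&D Program Project, Key Technologies for Human-Computer Interactive Precise Surgical Planning and Real-Time Guidance Software, 2020-01 to 2022-12, RMB 20,000,000, completed, participant

(8) Guangdong Key-Area R&D Program Project, Research and Demonstration of a Smart Highway System Based on the Cloud-Network-Node-Terminal Model, 2020-01 to 2022-12, RMB 8,000,000, completed, participant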

Courses Taught

3D Visual Intelligence Technology; High-Level Language Programming; High-Level Language Programming Practicum; summer course: Artificial Intelligence and High-end Manufacture

Representative Publications

[1] Xie, X., Huang, Z., Xu, W., Xiao, P., Xu, X., & Zhang, H. (2025). Let's Chorus: Partner-aware Hybrid Song-Driven 3D Head Animation. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[2] Liu, S., Lv, J., Kang, J., Zhang, H., Liang, Z., & He, S. (2025). MODfinity: Unsupervised Domain Adaptation with Multimodal Information Flow Intertwining. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[3] Zheng, Y., Jiang, Z., He, S., Sun, Y., Dong, J., Zhang, H., & Du, Y. (2025). NexusGS: Sparse View Synthesis with Epipolar Depth Priors in 3D Gaussian Splatting. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[4] Li, X., Zhan, J., He, S., Xu, Y., Dong, J., Zhang, H., & Du, Y. (2025). PersonaMagic: Stage-Regulated High-Fidelity Face Customization with Tandem Equilibrium. Proceedings of the AAAI Conference on Artificial Intelligence.

[5] Zhong, Y., Yan, Z., Xie, Y., Wu, S., Zhang, H., Shu, L., & Zhou, P. (2025). MSSDA: multi-sub-source adaptation for diabetic foot neuropathy recognition. Proceedings of the AAAI Conference on Artificial Intelligence.

[6] Zhang, H., Xie, Y., Zhang, H., Xu, C., Luo, X., Chen, D., Xu, X., Zhang, H., Heng, P. A., & He, S. (2025). Unambiguous granularity distillation for asymmetric image retrieval. Neural Networks, 107303.

[7] Zhou, Y., Ye, D., Zhang, H., Xu, X., Sun, H., Xu, Y., Liu, X., & Zhou, Y. (2025). Recurrent Diffusion for 3D Point Cloud Generation from a Single Image. IEEE Transactions on Image Processing.

[8] Liu, B., Zheng, C., Xu, X., Xu, C., Zhang, H., & He, S. (2025). Rotation-Adaptive Point Cloud Domain Generalization Via Intricate Orientation Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9] Huang, Z., Xu, X., Xu, C., Zhang, H., Zheng, C., Qin, J., & He, S. (2024). Beat-It: Beat-Synchronized Multi-Condition 3D Dance Generation. European Conference on Computer Vision.

[10] Xiao, P., Xie, Y., Xu, X., Chen, W., & Zhang, H. (2024). Multi-person Pose Forecasting with Individual Interaction Perceptron and Prior Learning. European Conference on Computer Vision, 402–419.

[11] Yang, Z., Jiang, Z., Li, X., Zhou, H., Dong, J., Zhang, H., & Du, Y. (2024). D⁴-VTON: Dynamic Semantics Disentangling for Differential Diffusion Based Virtual Try-On. European Conference on Computer Vision, 36–52.

[12] Jiang, X., Zheng, C., Xu, X., Liu, B., Zheng, W., Zhang, H., & He, S. (2024). VrdONE: One-stage Video Visual Relation Detection. Proceedings of the 32nd ACM International Conference on Multimedia, 1437–1446.

[13] Yu, Y., Liu, B., Zheng, C., Xu, X., Zhang, H., & He, S. (2024). Beyond textual constraints: Learning novel diffusion conditions with fewer examples. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 7109–7118.

[14] Zhang, H., Huang, R., Xie, Y., & Zhang, H. (2024). Mask4Align: Aligned Entity Prompting with Color Masks for Multi-Entity Localization Problems. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 13373–13383.

[15] Xie, Y., Lin, Y., Cai, W., Xu, X., Zhang, H., Du, Y., & He, S. (2024). D3still: Decoupled differential distillation for asymmetric image retrieval. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 17181–17190.

[16] Zhang, W., Xu, C., Xu, X., Zhang, H., Zhao, R., & Qin, J. (2024). Exploiting Multi-View Clues for Context-Aware Unified Lumbar MRI Identification and Diagnosis. 2024 International Joint Conference on Neural Networks (IJCNN), 1–9.

[17] Cai, W., Xu, X., Xu, J., Zhang, H., Yang, H., Zhang, K., & He, S. (2024). Hierarchical damage correlations for old photo restoration. Information Fusion, 107, 102340.

[18] Cai, W., Zhang, H., Xu, X., Xu, C., Zhang, K., & He, S. (2024). Delving into Important Samples of Semi-Supervised Old Photo Restoration: A New Dataset and Method. IEEE Transactions on Multimedia, 26, 9866–9879.

[19] Xu, C., Xu, Y., Zhang, H., Xu, X., & He, S. (2024). DreamAnime: Learning Style-Identity Textual Disentanglement for Anime and Beyond. IEEE Transactions on Visualization and Computer Graphics, 1–12.

[20] Yang, H., Xu, X., Xu, C., Zhang, H., Qin, J., Wang, Y., Heng, P.-A., & He, S. (2024). G²Face: High-Fidelity Reversible Face Anonymization via Generative and Geometric Priors. IEEE Transactions on Information Forensics and Security, 19, 8773–8785.

[21] Zheng, C., Liu, B., Xu, X., Zhang, H., & He, S. (2024). Learning an interpretable stylized subspace for 3D-aware animatable artforms. IEEE Transactions on Visualization and Computer Graphics, 1–13.

[22] Zhou, Y., Qian, J., Zhang, H., Xu, X., Sun, H., Zeng, F., & Zhou, Y. (2024). Adaptive multi-text union for stable text-to-image synthesis learning. Pattern Recognition, 152, 110438.

[23] Zhou, Y., Sun, H., Zhang, H., Xu, X., Ye, D., Zhou, Y., Liu, X., et al. (2024). GaFL: Geometric-aware Feature Learning for universal 3D models recognition. Pattern Recognition, 149, 110214.

[24] Xu, C., Xu, X., Zhao, N., Cai, W., Zhang, H., Li, C., & Liu, X. (2023). Panel-page-aware comic genre understanding. IEEE Transactions on Image Processing, 32, 2636–2648.

[25] Cai, W., Zhang, H., Xu, X., He, S., Zhang, K., & Qin, J. (2023). Contextual-assisted scratched photo restoration. IEEE Transactions on Circuits and Systems for Video Technology, 33(10), 5458–5469.

[26] Zhou, Y., Dang, Z., Zhang, H., Xu, X., Qin, J., Li, W., Zeng, F., & Liu, X. (2023). EFSCNN: Encoded feature sphere convolution neural network for fast non-rigid 3D models classification and retrieval. Computer Vision and Image Understanding, 233, 103724.

[27] Huang, X., Zhou, N., Huang, J., Zhang, H., Pedrycz, W., & Choi, K.-S. (2023). Center transfer for supervised domain adaptation. Applied Intelligence, 53(15), 18277–18293.

[28] Xie, Y., Zhang, H., Xu, X., Zhu, J., & He, S. (2023). Towards a smaller student: Capacity dynamic distillation for efficient image retrieval. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 16006–16015.

[29] Zheng, C., Liu, B., Zhang, H., Xu, X., & He, S. (2023). Where is my spot? Few-shot image generation via latent subspace optimization. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 3272–3281.

[30] Xiao, W., Xu, C., Zhang, H., & Xu, X. (2022). Spatial-Aware GAN for Instance-Guided Cross-Spectral Face Hallucination. CAAI International Conference on Artificial Intelligence, 93–105.

[31] Xu, S., Zhang, H., Xu, X., Hu, X., Xu, Y., Dai, L., Choi, K.-S., & Heng, P.-A. (2022). Representative feature alignment for adaptive object detection. IEEE Transactions on Circuits and Systems for Video Technology, 33(2), 689–700.

[32] Xu, X., Chen, J., Zhang, H., & Han, G. (2022). SA-DPNet: Structure-aware dual pyramid network for salient object detection. Pattern Recognition, 127, 108624.

[33] Xu, X., Chen, J., Zhang, H., & Ng, W. W. (2021). D4Net: De-deformation defect detection network for non-rigid products with large patterns. Information Sciences, 547, 763–776.

[34] Xu, Y., Yan, M., Xu, C., Zhang, H., Liu, Y., & Xu, X. (2021). Adaptive selecting and learning network and a new benchmark for imbalanced fine-grained ship classification. IEEE Access, 9, 58116–58126.

[35] Yan, X., Zhang, H., Xu, X., Hu, X., & Heng, P.-A. (2021). Learning semantic context from normal samples for unsupervised anomaly detection. Proceedings of the AAAI Conference on Artificial Intelligence, 35(4), 3110–3118.

[36] Ye, C., Zhang, H., Xu, X., Cai, W., Qin, J., & Choi, K.-S. (2021). Object Detection in Densely Packed Scenes via Semi-Supervised Learning with Dual Consistency. Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 1245–1251.

[37] Zhang, H., Han, C., Zhang, X., Du, Y., Xu, X., Han, G., Qin, J., & He, S. (2021). Fast scene labeling via structural inference. Neurocomputing, 442, 317–326.

[38] Xu, X., Chen, J., Zhang, H., & Han, G. (2020). Dual pyramid network for salient object detection. Neurocomputing, 375, 113–123.

[39] Zhang, H., Xu, X., Han, G., & He, S. (2020). Context-aware and scale-insensitive temporal repetition counting. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 670–678.

[40] Mo, Y., Han, G., Zhang, H., Xu, X., & Qu, W. (2019). Highlight-assisted nighttime vehicle detection using a multi-level fusion network and label hierarchy. Neurocomputing, 355, 13–23.

[41] Xu, X., He, H., Zhang, H., Xu, Y., & He, S. (2019). Unsupervised domain adaptation via importance sampling. IEEE Transactions on Circuits and Systems for Video Technology, 30(12), 4688–4699.

[42] Xu, X., Xie, M., Miao, P., Qu, W., Xiao, W., Zhang, H., Liu, X., & Wong, T.-T. (2019). Perceptual-aware sketch simplification based on integrated VGG layers. IEEE Transactions on Visualization and Computer Graphics, 27(1), 178–189.

[43] Zhang, H., Xu, X., He, H., He, S., Han, G., Qin, J., & Wu, D. (2019). Fast user-guided single image reflection removal via edge-aware cascaded networks. IEEE Transactions on Multimedia, 22(8), 2012–2023.

[44] Xu, X., Zhang, H., Han, G., Kwan, K. C., Pang, W.-M., Fang, J., & Zhao, G. (2016). A Two-Phase Space Resection Model for Accurate Topographic Reconstruction from Lunar Imagery with Pushbroom Scanners. Sensors, 16(4), 507.