Professional Title / Position: Professor


Research Direction: Heterogeneous Computing Parallel Acceleration & Performance Optimization, Software Testing & Reliability Assurance, Software Architecture Design & Optimization


Team: Cyberspace Security and Software Reliability Assurance


Tel: 13802918651


Email: lul@scut.edu.cn


 Biography:

Lu Lu is a member of the Software Engineering Technical Committee of the China Computer Federation. He serves as a Professor and Ph.D. Supervisor in the School of Computer Science and Engineering at South China University of Technology (SCUT), and also holds a dual appointment as a Professor at Peng Cheng National Laboratory in Shenzhen. His research focuses on heterogeneous computing parallel acceleration and performance optimization, software testing and reliability assurance, and software architecture design and optimization. He has published over 50 papers indexed by major citation databases (SCI, EI, ISTP), and has applied for or been granted more than 20 patents and software copyrights. He has led over 80 research projects, including the National Key R&D Program of China, grants from the National Natural Science Foundation of China (NSFC), major science and technology initiatives of Guangdong Province and Guangzhou City, as well as enterprise-sponsored development projects. As the primary contributor, he has received the Huawei Technology Best Collaboration Award, two Second Prizes of the Guangdong Provincial Science and Technology Progress Award, one Third Prize of the Guangdong Provincial Science and Technology Progress Award, and one Second Prize from the Guangdong Computer Academy. He has also been named a Huawei Ascend MVP.


 Education:

 Ph.D., Xi'an Jiaotong University (1997 - 1999)

 Postdoctoral Fellow, City University of Hong Kong (2000 - 2001)


 Work Experience:

2001 - 2003: Senior Systems Analyst, Crédit Agricole Corporate and Investment Bank (Hong Kong)

2003 - Present: School of Computer Science and Engineering, South China University of Technology



 Courses:

  • Software Testing and Quality Control

  • Data Structures

  • Software Architecture Design


 Projects:

  1. 2023 Server RAID Card Project, Key R&D Program, Ministry of Industry and Information Technology (MIIT), 2023-2026, Ongoing.

  2. Collaboration on Ascend Template Library & AscendNPU IR Performance Optimization, Commissioned Project by Huawei, 2024-2026, Ongoing.

  3. ACTLASS Collaboration Project, Commissioned Project by ByteDance, 2024-2026, Ongoing.

  4. Enhancing Expression Capability for Ascend MLIR, Commissioned Project by Huawei, 2024-2026, Ongoing.

  5. Research on GEMM Optimization Algorithms for Large Language Models, General Program, Guangdong Provincial Natural Science Foundation, 2023-2026, Ongoing.

  6. Software Development for Intelligent Recognition, Data Analysis, and Financial Business Models, Commissioned Project by YinzhiJie Joint Laboratory, 2023-2025, Completed.

  7. Technical Collaboration on Ascend Performance Enhancement Library, Commissioned Project by Huawei, 2022-2023, Completed.

  8. BLAS Library Performance Optimization, Commissioned Project by Huawei, 2022-2023, Completed.

  9. Optimization for Irregular Expression Computation Scenarios in BiSheng Compiler, Commissioned Project by Huawei, 2022-2023, Completed.


 Publications:

  • Representative Publications:

  1. Zhang, Yu, Lu Lu, Zhanyu Yang, Zhihong Liang, and Siliang Suo. "A load-balanced acceleration method for small and irregular batch matrix multiplication on GPU." Journal of Systems Architecture 160 (2025): 103341.

  2. Zhang, Yu, Lu Lu, Zhanyu Yang, Zhihong Liang, and Siliang Suo. "LE-GEMM: A lightweight emulation-based GEMM with precision refinement on GPU." Journal of Systems Architecture 160 (2025): 103336.

  3. Yang, Zhanyu, Lu Lu, and Quanyi Zou. "Ensemble Kernel-Mapping-Based Ranking Support Vector Machine for Software Defect Prediction." IEEE Transactions on Reliability (2024).

  4. Guo, Yijie, Lu Lu, and Songxiang Zhu. "Novel accelerated methods for convolution neural network with matrix core." The Journal of Supercomputing 79, no. 17 (2023): 19547-19573.

  5. Wang, Ruimin, Zhiwei Yang, Hao Xu, and Lu Lu. "A high-performance batched matrix multiplication framework for GPUs under unbalanced input distribution." The Journal of Supercomputing 78, no. 2 (2022): 1741-1758.

  6. Yang, Zhiwei, Lu Lu, and Ruimin Wang. "A batched GEMM optimization framework for deep learning." The Journal of Supercomputing 78, no. 11 (2022): 13393-13408.

  7. Hu, Yichang, Lu Lu, and Cuixu Li. "Memory-accelerated parallel method for multidimensional fast Fourier implementation on GPU." The Journal of Supercomputing 78, no. 16 (2022): 18189-18208.

  • Technology Transfer & Team Contributions:

  1. HPL-GPU: Optimized the HPL benchmark for AMD platforms, achieving a 20% overall performance increase. Contributed over 30,000 lines of source code, accounting for more than 95% of the total project (https://github.com/reger-men/HPL_GPU/graphs/contributors).

  2. Developed a single-node, continuous multi-core, multi-task acceleration operator for Peng Cheng National Laboratory, boosting computational throughput from an initial 244 TFLOPS to 315 TFLOPS, a 29% overall performance improvement.

  3. Open-sourced the Huawei Ascend Platform Template Library and the Long Sequence Project (https://gitee.com/ascend/catlass and https://gitee.com/ascend/cann-var-sequence-gemm).

  4. The internet data collection and user behavior analysis platform developed by the team has been successfully deployed in over 20 enterprises, including Elite, Tasty Fresh, Claudio, and individual software developers (http://www.i-test.com.cn).



Awards:

  • Huawei Technology Best Collaboration Award (2024/2025)

  • Second Prize, Guangdong Provincial Science and Technology Progress Award (2019)

  • Third Prize, Guangdong Provincial Science and Technology Progress Award (2016)

  • Second Prize, Guangdong Computer Academy (2012)

  • Second Prize, Guangdong Provincial Science and Technology Progress Award (2010)