Research Groups of Professors Kui Jia, Lianwen Jin and Shuangping Huang of Our School Publish Papers at CVPR 2023

Recently, the research groups of Professor Kui Jia, Professor Lianwen Jin and Professor Shuangping Huang from the School of Electronics and Information of South China University of Technology had papers accepted by CVPR 2023, a top conference in the field of computer vision.

 

CVPR is recognized by academia and industry as the world's top conference in the field of computer vision and pattern recognition. It was first held in Washington, DC, USA in 1983 and attracts more than 3,000 participants from academia and industry around the world every year. Most of the papers selected for CVPR each year come from leading technology companies, universities and research institutes in the field of artificial intelligence, and represent the leading level in computer vision. In 2023, 9,155 papers were submitted, of which about 26 percent were accepted.

 

Three papers from Professor Kui Jia's research group were accepted: "A Robust and Efficient Neural Implicit Surface Learning of Indoor Scenes with Iterative Intertwined Regularization", "On the Utility of Synthetic Data for Bare Supervised Learning and Downstream Domain Adaptation", and "Generative Scene Synthesis via Incremental View Inpainting using RGBD Diffusion Models". The first paper studies and verifies key technologies for multi-view indoor scene reconstruction, and proposes an alternating optimization strategy that removes noise from the results of traditional multi-view reconstruction methods in order to obtain better neural implicit surface reconstructions. In the second paper, the group conducted the first comprehensive study of image classification based on large-scale synthetic data and proposed two new datasets: SynSL (12.8M), a large-scale synthetic dataset, and S2RDA, a large-scale synthetic-to-real domain adaptation dataset. The group also tested, evaluated and validated the datasets through supervised learning and downstream transfer, providing many new and valuable insights into one of the most important basic research problems in computer vision, namely generalization to out-of-distribution and real data. The work is considered an important step in research on synthetic data. The third paper explores key technologies for large-scale 3D scene reconstruction and completion based on deep learning. It proposes an incremental scheme that completes novel views of a 3D scene one after another, using a completion network built on a probabilistic diffusion model. This lays a technical foundation for future developments in virtual and augmented reality and in 3D model creation and editing.

 

Two papers from Professor Lianwen Jin's research group were accepted: "M6Doc: A Large-Scale Multi-Format, Multi-Type, Multi-Layout, Multi-Language, Multi-Annotation Category Dataset for Modern Document Layout Analysis" and "Towards Robust Tampered Text Detection in Document Image: New Dataset and New Solution". The first paper introduces M6Doc, a large-scale and diverse dataset for document layout analysis, together with TransDLANet, a new Transformer-based document layout analysis method. TransDLANet achieves state-of-the-art performance of 64.5% mAP on M6Doc and reaches a level comparable to cutting-edge methods on other datasets. The second paper proposes Selective Tamper Generation (STG), which automatically synthesizes tampered samples from tamper-free image materials, effectively simulating the tampering process and efficiently generating a large number of diverse tampered samples. The group also developed a new method, Document Tamper Detector (DTD), which can effectively detect tampering in document images even when it leaves no visual traces. The proposed method is relatively robust to image compression, generalizes well, and achieves the best results on multiple document image tamper detection datasets.

 

Two papers from Professor Shuangping Huang's research group were accepted: "Perception and Semantic Aware Regularization for Sequential Confidence Calibration" and "Disentangling Writer and Character Styles for Handwriting Generation". The first paper proposes a confidence calibration method for sequence recognition models based on perception and semantics. By mining sequences that are perceptually and semantically similar to the target sequence and combining them with label smoothing to regularize training, the method addresses the overconfidence that is common in deep sequence recognition models. This supports trustworthy AI and improves the reliability of such models in security-critical applications. The second paper proposes a few-shot method for generating stylized handwritten text. It introduces two complementary contrastive learning tasks that guide a dual-branch style encoder to model both the overall and the detailed styles of handwritten text, in order to address the problems of diverse handwriting styles and high integration costs. In practice, the writer only needs to provide a small number of reference samples for the method to copy the writing style and automatically generate a large amount of stylized text. This can improve the efficiency of font designers, can also be applied to writing robots, and has great commercial prospects. (Text by Xiangyu Li)