Professor Jin Lianwen's Team Won Multiple Championships in the ICPR-Chart-Infographics 2020 International Academic Competition for Chart Information Extraction
time: 2021-01-21

The International Conference on Pattern Recognition (ICPR) opened recently, and the results of its academic competitions were announced at the same time. In the international competition on chart information extraction, the deep learning and visual computing team led by Professor Jin Lianwen of our school competed jointly with Lenovo Research Institute and Shanghai Hehe Information Technology Co., Ltd., a well-known company in the OCR field. The joint team won 11 championships across the 14 sub-tasks on all tracks.

The competition was organized by the University at Buffalo, the State University of New York, and the well-known software company Adobe. It attracted participating teams from around the world, including Deep Blue Technology, Xinhua Zhiyun, the Indian Institute of Technology Ropar, and the University at Buffalo.

Figure 1 Examples of various types of chart data

As a widely used tool for communication and presentation, charts represent data intuitively and visually, making the data easier to understand. The organizers' survey paper, “Davila K., Setlur S., Doermann D., Kota B.U., and Govindaraju V. Chart Mining: A Survey of Methods for Automated Chart Analysis”, was published in 2020 in IEEE TPAMI, a top journal in the field of artificial intelligence (Zone 1 in the Chinese Academy of Sciences journal ranking, SCI impact factor 17.73). It provides a comprehensive overview of chart data mining techniques, including automatic chart extraction, processing of multi-panel charts, automatic chart classification, automatic data extraction, and applications of chart data mining, and it summarizes the main trends in the field. The automatic extraction of chart content has thus become a challenging and important research problem in recent years, with high value for both academic research and practical applications.

This ICPR competition was launched specifically for the task of automatically extracting chart information. To tackle this complex problem, the competition split it into six component tasks and one end-to-end task. The end-to-end task must integrate all of the techniques, which makes it the most difficult and most practical task and a measure of a team's overall strength. For each task, the organizers set up two separate test sets, a synthetic-data test set and a real-data test set, and evaluated each one independently, giving two sub-tasks per task.

Specifically, the tasks of this competition are:

Task 1: Classification of chart types, dividing the charts into horizontal/vertical bar charts, line charts, scatter charts, etc.;

Task 2: Detection and recognition of text blocks in the chart image;

Task 3: Classification of text attributes, including chart title, axis scale value, legend label, etc.;

Task 4: Axis analysis, locating the tick marks on the coordinate axes and matching them with the text of their tick values;

Task 5: Legend analysis, locating the legend entries and matching them with the text of their legend labels;

Task 6: Data extraction, recovering the original tabular data behind the chart;

Task 7: End-to-end chart information extraction, taking a chart image as input and outputting the chart's original tabular data.
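
The count of 14 sub-tasks follows from pairing each of the seven tasks with the two test sets. The short Python sketch below only illustrates that arithmetic; the task names paraphrase the list above, and the identifiers TASKS, TEST_SETS and SUB_TASKS are illustrative assumptions, not the organizers' official naming.

```python
from itertools import product

# Task names paraphrase the list above; they are illustrative labels,
# not the organizers' official identifiers.
TASKS = {
    1: "chart type classification",
    2: "text detection and recognition",
    3: "text attribute classification",
    4: "axis analysis",
    5: "legend analysis",
    6: "data extraction",
    7: "end-to-end extraction",
}

# Each task is evaluated separately on a synthetic and a real test set.
TEST_SETS = ("synthetic", "real")

# One sub-task per (task, test set) pair.
SUB_TASKS = list(product(TASKS, TEST_SETS))

for task_id, test_set in SUB_TASKS:
    print(f"Task {task_id} ({TASKS[task_id]}), {test_set} test set")

print(f"Total sub-tasks: {len(SUB_TASKS)}")
```

Seven tasks times two test sets gives the 14 sub-tasks, of which the joint team won 11.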

The main student members from our school who took part in the ICPR-Chart-Infographics 2020 international competition are listed below (all are students in the School of Electronics and Information):

Task 1 Participants: Huang Yuhao (Master's student), Chen Bangdong (Master's student)

Task 2 Participants: Wu Sihang (Master's student), Xie Canyu (Master's student), Liao Qianying (Master's student)

Task 3 Participants: Tang Guozhi (Master's student), Wang Jiapeng (Master's student), Li Hongliang (Undergraduate student)

Task 4 Participants: Ma Weihong (Master's student), Zhang Hesuo (Master's student), Liao Qianying (Master's student)

Task 5 Participants: Zhang Hesuo (Master's student), Ma Weihong (Master's student)

Task 6 Participants: Zhang Hesuo (Master's student), Ma Weihong (Master's student)

Task 7 Participants: Ma Weihong (Master's student), Zhang Hesuo (Master's student), Tang Guozhi (Master's student), Wang Jiapeng (Master's student), Wu Sihang (Master's student), Huang Yuhao (Master's student), Xie Canyu (Master's student)