Talk Title: Deep Learning Human Actions
Time: Friday, December 30, 2016, 10:30 AM
Venue: Conference Room 632, Building 30
Speaker: Ling Shao (邵岭)
Host: Xu Xiangmin (徐向民)
Abstract: In this talk, I will cover two of my main research areas: human action recognition and deep learning. Action recognition has been an active research topic in computer vision due to its various applications in human-machine interaction, robotics, video surveillance and visual big data search. I will first review some related work on handcrafted features, feature/deep learning and attribute learning. Then I will introduce our recent multi-task system that can jointly solve three main problems: (1) Where in the video do the actions occur? (2) What categories do the actions belong to? and (3) How are these actions performed? This multi-task learning framework is built on a state-of-the-art 3D deep convolutional neural network (3D-CNN). Specifically, in the training phase, action localization, classification and attribute learning are jointly optimized via the proposed deep architecture. Once model training is completed, given a new test video, we can describe each individual action in the video simultaneously in terms of where the action occurs, what the action is and how the action is performed. To train the deep network, we also introduce a new large-scale aligned action dataset, NASA, with 200K well-labeled video clips. Finally, I will present the results of detailed action parsing on challenging, realistic datasets that were either collected by us or are publicly available. Some initial results on zero-shot learning via the obtained action attributes will also be discussed.
Speaker Biography:
Ling Shao is a Chair Professor with the School of Computing Sciences at the University of East Anglia, Norwich, UK. He received the B.Eng. degree in Electronic and Information Engineering from the University of Science and Technology of China (USTC), the M.Sc. degree in Medical Image Analysis and the Ph.D. (D.Phil.) degree in Computer Vision at the Robotics Research Group from the University of Oxford. Previously, he was a Professor (2014-2016) with Northumbria University, a Senior Lecturer (2009-2014) with the Department of Electronic and Electrical Engineering at the University of Sheffield and a Senior Scientist (2005-2009) with Philips Research, The Netherlands. His research interests include Computer Vision, Image/Video Processing, Pattern Recognition and Machine Learning. He has authored/co-authored over 200 papers in refereed journals/conferences such as IEEE TPAMI, TIP, TNNLS, IJCV, ICCV, CVPR, ECCV, IJCAI and ACM MM, and holds over 10 EU/US patents.
Ling Shao is an Associate Editor of IEEE Transactions on Image Processing, IEEE Transactions on Neural Networks and Learning Systems, IEEE Transactions on Circuits and Systems for Video Technology, and several other journals. He has edited three books and several special issues for journals such as TNNLS and PR. He has organized a number of international workshops with top conferences including ICCV, ECCV and ACM Multimedia. He will be the General Chair for BMVC 2018. He is/was an Area Chair for ICPR'16, BMVC'14/15/16, WACV'14 and ICME'15 and has been serving as a Program Committee member for many international conferences, including ICCV, CVPR, ECCV, BMVC, and ACM MM, and as a reviewer for many leading journals. He is a Fellow of the British Computer Society, a Fellow of the IET, a Senior Member of the IEEE, and a Life Member of the ACM. (Homepage: http://lshao.staff.shef.ac.uk/)