2024-05-31
Talk Title: Human-centric Visual Perception, Understanding, and Generation
Speaker: Dr. Ailing Zeng (曾爱玲)
Time: 3:30–5:30 p.m., Sunday, June 2, 2024
Venue: Lecture Hall 216, Yifu Science Building (逸夫科学馆)
Host: Prof. Delu Zeng (曾德炉, School of Electronic and Information Engineering)
All faculty and students are welcome!
Abstract:
Capturing and understanding expressive human motions from arbitrary videos are fundamental tasks in Computer Vision, Human-Computer Interaction, and Controllable Generation. Unlike high-cost wearable motion-capture devices designed for professional users, we develop a series of markerless motion-capture techniques available to every user, taking only an image or a video as input, which also makes motion-paired data scalable, low-cost, and diverse. In this talk, I will focus on how to build large-scale human-centric data and benchmarks, including:
(1) Automatically annotating multimodal data (motion, images, video, text, audio, etc.) from Internet data;
(2) Understanding human motions from videos via LLMs;
(3) Controllable 2D-to-4D human-centric generation.
Biography:
Dr. Ailing Zeng (曾爱玲) is a senior research scientist at Tencent. Previously, she worked at the International Digital Economy Academy, where she led the human-centric perception, understanding, and generation research team. She obtained her Ph.D. from the Chinese University of Hong Kong. Her research aims to build multi-modal human-like intelligent agents on scalable big data, especially Large Motion Models that capture, understand, interact with, and generate the motion of humans, animals, and the world. She has published over thirty top-tier conference papers at CVPR, ICCV, NeurIPS, etc., and one of her first-author papers on long-term time-series forecasting was selected as one of the Top-3 Influential AAAI 2023 Papers. Her research has been transferred to or used in practical products, such as DW-Pose in ControlNet and ComfyUI for controllable generation, and SmoothNet in AnyVision for surveillance applications.