I'm an Assistant Professor at the College of Artificial Intelligence, Tsinghua University.
I currently lead the MEOW (Modeling Egocentric Omni World) Lab, which is committed to the following research agenda: designing human-centered AI that sees through your eyes, learns your skills, and understands your intentions -- building next-generation human-centered intelligent systems that see what you see, learn what you can do, and understand what you think.
Previously, I was a Research Scientist at Meta GenAI, focusing primarily on egocentric vision and generative AI models. I completed my Ph.D. in Robotics at Georgia Tech, advised by Prof. James Rehg. I also work closely with Prof. Yin Li from the University of Wisconsin–Madison. I was fortunate to collaborate with Prof. Siyu Tang and Prof. Michael Black during my visit to ETH Zurich and the Max Planck Institute. I enjoyed a wonderful internship at Facebook Reality Labs, where I worked with Dr. Chao Li, Dr. Lingni Ma, Dr. Kiran Somasundaram, and Prof. Kristen Grauman on egocentric action recognition and localization. I am honored to have received several awards, including Best Paper Candidate at CVPR 2022 and ECCV 2024, and the BMVC Best Student Paper Award. As a primary contributor, I have helped construct several widely recognized egocentric video datasets, including
Ego4D,
Ego-Exo4D,
EGTEA Gaze+, and the
Behavior Vision Suite. I have also designed multiple models that will be deployed in the next-generation smart glasses developed by
Meta Reality Labs. During my time at
Meta GenAI, I was deeply involved in the training and evaluation of large-scale generative multimodal models, including
EMU,
Llama3, and
Llama4 (multimodal components only).
*This background image of Jaime Lannister charging alone at Daenerys and her dragon reflects what it often takes to do science: you must be willing to stand as the lone warrior.