Miao Liu - Academic Website

I'm an incoming Assistant Professor at Tsinghua University, College of Artificial Intelligence. Previously, I was a Research Scientist at Meta Reality Labs and Meta GenAI, primarily focusing on first-person vision and generative AI models (such as Llama3, Llama4, and EMU). I completed my Ph.D. in Robotics at Georgia Tech, advised by Prof. James Rehg. I also work closely with Prof. Yin Li from the University of Wisconsin–Madison. I was fortunate to collaborate with Prof. Siyu Tang and Prof. Michael Black during my visit to ETH Zurich and the Max Planck Institute. I enjoyed a wonderful internship at Facebook Reality Labs, where I worked with Dr. Chao Li, Dr. Lingni Ma, Dr. Kiran Somasundaram, and Prof. Kristen Grauman on egocentric action recognition and localization. I am honored to have received several awards, including Best Paper Candidate at CVPR 2022 and ECCV 2024, and the BMVC Best Student Paper Award. Before joining Georgia Tech, I earned my Master’s degree from Carnegie Mellon University and my Bachelor’s degree from Beihang University.

As a primary contributor, I have helped construct several widely recognized egocentric video datasets, including Ego4D, Ego-Exo4D, EGTEA Gaze+, and the Behavior Vision Suite, which have been broadly adopted in both academia and industry. I have also proposed multiple algorithms for egocentric action recognition and anticipation, some of which will be deployed in the next-generation smart glasses developed by Meta Reality Labs. During my time at Meta GenAI, I was deeply involved in the training and evaluation of large-scale generative multimodal models, including EMU, Llama3, and Llama4 (multimodal components only).

*This image of Jaime Lannister charging alone at Daenerys and her dragon reveals what it often takes to do science—you must be willing to stand as the lonely warrior.

Our lab is committed to the following research agenda:

Designing AI that sees through your eyes, learns your skills, and understands your intentions.

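-- Building next-generation human-centered intelligent systems that “see what you see, learn what you know, and understand what you think.”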

Our research is dedicated to Bridging Minds and Machines: we leverage egocentric vision and generative AI to build AI systems that understand and anticipate human behavior and intentions, and thereby assist people in their daily lives. Our key research directions include:

Human Skill Transfer: Facilitating skill transfer between humans and from humans to robots through augmented reality, enabling efficient and natural human-AI collaboration.
Personalized AI Systems: Building generative AI models that continuously evolve based on user interaction history and preferences, capable of understanding context and adapting to individual users.
AI Agents with Theory of Mind: Developing proactive AI agents that model users’ intentions and cognitive load, leading to more intuitive and seamless human-AI interaction.

My group is always looking for talented students to join us on this journey. For students from Mainland China, please see the note here. For international students, please contact me directly via email.

News

Jun. 2025: Received the Egocentric Vision (EgoVis) 2023/2024 Distinguished Paper Awards.
Feb. 2025: Three papers accepted to CVPR 2025.
Oct. 2024: Our LEGO paper has been nominated as one of the 15 award candidates at ECCV 2024.
Jul. 2024: Two corresponding-author papers accepted to ECCV 2024 (1 Poster, 1 Oral).
Jun. 2024: Received the Egocentric Vision (EgoVis) 2022/2023 Distinguished Paper Awards.
Feb. 2024: Three papers accepted to CVPR 2024 (1 Poster, 1 Highlight, 1 Oral).
Nov. 2023: One paper accepted to IEEE TPAMI.
Nov. 2023: One paper accepted to IJCV.
Jun. 2023: One paper accepted to Findings of ACL 2023.
Nov. 2022: Our paper on Egocentric Gaze Estimation won the Best Student Paper Prize at BMVC 2022!
Sep. 2022: One paper accepted to BMVC 2022 for spotlight presentation!
Aug. 2022: I started my new journey at Meta Reality Labs.
Jul. 2022: Two papers accepted at ECCV 2022.
Jun. 2022: I successfully defended my thesis!
Apr. 2022: Technical talk at Meta AI Research.
Mar. 2022: Technical talk at Amazon.
Mar. 2022: Our Ego4D paper was accepted to CVPR 2022 for oral presentation, Best Paper Finalist.
Feb. 2022: Technical talk at Apple.
Oct. 2021: Our Ego4D project has launched! Check out the arXiv paper.
Oct. 2021: One paper accepted to 3DV 2021.
Jul. 2021: I passed my thesis proposal.

Publications

Zeyi Huang*, Yuyang Ji*, Xiaofang Wang, Nikhil Mehta, Tong Xiao, Donghyun Lee, Sigmund Vanvalkenburgh, Shengxin Zha, Bolin Lai, Licheng Yu, Ning Zhang, Yong Jae Lee†, Miao Liu†. Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs, accepted by Computer Vision and Pattern Recognition Conference (CVPR) 2025 [arXiv]
Bolin Lai, Felix Juefei-Xu, Miao Liu, Xiaoliang Dai, Nikhil Mehta, Chenguang Zhu, Zeyi Huang, James M. Rehg, Sangmin Lee, Ning Zhang, Tong Xiao. Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation, accepted by Computer Vision and Pattern Recognition Conference (CVPR) 2025 [arXiv]
Shiyu Zhao, Zhenting Wang, Felix Juefei-Xu, Xide Xia, Miao Liu, Xiaofang Wang, Mingfu Liang, Ning Zhang, Dimitris N. Metaxas, Licheng Yu. Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction, accepted by Computer Vision and Pattern Recognition Conference (CVPR) 2025 [arXiv]
Bolin Lai, Xiaoliang Dai, Lawrence Chen, Guan Pang, James M. Rehg, Miao Liu. LEGO: Learning EGOcentric Action Frame Generation via Visual Instruction Tuning, accepted by European Conference on Computer Vision (ECCV) 2024 (Oral, Best Paper Award Candidate 15/8585). [arXiv]
Bolin Lai, Fiona Ryan, Wenqi Jia, Miao Liu†, James M. Rehg†. Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation, accepted by European Conference on Computer Vision (ECCV) 2024. [arXiv] †: Co-corresponding Author
Yunhao Ge*, Yihe Tang*, Jiashu Xu*, Cem Gokmen*, Chengshu Li, Wensi Ai, Benjamin Jose Martinez, Arman Aydin, Mona Anvari, Ayush K Chakravarthy, Hong-Xing Yu, Josiah Wong, Sanjana Srivastava, Sharon Lee, Shengxin Zha, Laurent Itti, Yunzhu Li, Roberto Martín-Martín, Miao Liu, Pengchuan Zhang, Ruohan Zhang, Li Fei-Fei, Jiajun Wu. BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation, accepted by Computer Vision and Pattern Recognition Conference (CVPR) 2024 (Spotlight). [arXiv] *: Equal Contribution
With Kristen Grauman, et al. Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives, accepted by Computer Vision and Pattern Recognition Conference (CVPR) 2024 (Oral). [arXiv]
Wenqi Jia, Miao Liu, Hao Jiang, Ishwarya Ananthabhotla, James Rehg, Vamsi Krishna Ithapu, Ruohan Gao. The Audio-Visual Conversational Graph: From an Egocentric-Exocentric Perspective, accepted by Computer Vision and Pattern Recognition Conference (CVPR) 2024. [arXiv]
Bolin Lai, Miao Liu†, Fiona Ryan, James M. Rehg. In the Eye of Transformer: Global–Local Correlation for Egocentric Gaze Estimation and Beyond, accepted by International Journal of Computer Vision (IJCV). [arXiv] †: Student Mentor
Bolin Lai*, Hongxin Zhang*, Miao Liu*, Aryan Pariani*, Fiona Ryan, Wenqi Jia, Shirley Anugrah Hayati, James M. Rehg, Diyi Yang. Werewolf Among Us: Multimodal Resources for Modeling Persuasion Behaviors in Social Deduction Games, accepted by the Association for Computational Linguistics (ACL) 2023 (Findings). [arXiv] *: Equal Contribution
Bolin Lai, Miao Liu†, Fiona Ryan, James M. Rehg. In the Eye of Transformer: Global-Local Correlation for Egocentric Gaze Estimation, accepted by British Machine Vision Conference (BMVC) 2022 (Spotlight, Best Student Paper Prize). [arXiv] †: Student Mentor, Co-corresponding Author
Wenqi Jia*, Miao Liu*, James M. Rehg. Generative Adversarial Network for Future Hand Segmentation from Egocentric Video, accepted by European Conference on Computer Vision (ECCV) 2022. [arXiv] *: Equal Contribution
Miao Liu, Lingni Ma, Kiran Somasundaram, Yin Li, Kristen Grauman, James M. Rehg, Chao Li. Egocentric Activity Recognition and Localization on a 3D Map, accepted by European Conference on Computer Vision (ECCV) 2022. [arXiv]
With Kristen Grauman, et al. Ego4D: Around the World in 3,000 Hours of Egocentric Video, accepted by Computer Vision and Pattern Recognition Conference (CVPR) 2022 (Oral, Best Paper Finalist, 33/8161). [arXiv] Key driver for the Social Benchmark and Forecasting Benchmark
Miao Liu, Dexin Yang, Yan Zhang, Zhaopeng Cui, James M. Rehg, and Siyu Tang. 4D Human Body Capture from Egocentric Video via 3D Scene Grounding, accepted by International Conference on 3D Vision (3DV) 2021. [arXiv] [project page]
Yin Li, Miao Liu, and James M. Rehg. In the Eye of the Beholder: Gaze and Actions in First Person Video, accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2021. [arXiv]
Miao Liu, Xin Chen, Yun Zhang, Yin Li, and James M. Rehg. Attention Distillation for Learning Video Representations, accepted by British Machine Vision Conference (BMVC) 2020 (Oral, acceptance rate 5.0%). [pdf] [project page]
Yun Zhang*, Shibo Zhang*, Miao Liu, Elyse Daly, Samuel Battalio, Santosh Kumar, Bonnie Spring, James M. Rehg, Nabil Alshurafa. SyncWISE: Window Induced Shift Estimation for Synchronization of Video and Accelerometry from Wearable Sensors, accepted by Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. (IMWUT/UbiComp) 2020. [pdf] *: Equal Contribution
Miao Liu, Siyu Tang, Yin Li, and James M. Rehg. Forecasting Human Object Interaction: Joint Prediction of Motor Attention and Actions in First Person Vision, accepted by European Conference on Computer Vision (ECCV) 2020 (Oral, acceptance rate 2.0%). [pdf] [project page]
Yin Li, Miao Liu, and James M. Rehg. In the Eye of Beholder: Joint Learning of Gaze and Actions in First Person Video, accepted by European Conference on Computer Vision (ECCV) 2018. [pdf]

Teaching

Students

Contact

lmaptx4869@gmail.com