Jiageng Mao  

Ph.D. Student

Geometry, Vision, and Learning Lab
Department of Computer Science
University of Southern California
Los Angeles, USA

Email: jiagengm [at] usc [dot] edu


I am currently a PhD student in Computer Science at the University of Southern California, advised by Professor Yue Wang.

My research encompasses the broad areas of Computer Vision, Robotics, and Artificial Intelligence. My objective is to equip robots with open-world capabilities for perceiving, reasoning, and planning. Specifically, my research focuses on the following aspects:

1) Foundation models for robot perception and planning;

2) Data-driven end-to-end solutions for autonomous vehicles;

3) Open-world 3D object and scene understanding.

Recent Updates

Selected Publications [Google Scholar]

A Language Agent for Autonomous Driving

Jiageng Mao*, Junjie Ye*, Yuxi Qian, Marco Pavone, Yue Wang.
arXiv preprint arXiv:2311.10813. [Project Page]

We transform the traditional perception-prediction-planning framework by introducing Large Language Models (LLMs) as an agent for autonomous driving.

GPT-Driver: Learning to Drive with GPT

Jiageng Mao, Yuxi Qian, Junjie Ye, Hang Zhao, Yue Wang.
Neural Information Processing Systems Workshop (NeurIPSW), 2023. [Project Page] [Code]

The first attempt to leverage Large Language Models (LLMs) such as GPT to solve the motion planning problem in autonomous driving.

3D Object Detection for Autonomous Driving: A Comprehensive Survey

Jiageng Mao, Shaoshuai Shi, Xiaogang Wang, Hongsheng Li.
International Journal of Computer Vision (IJCV), 2023. [Code]

A 55-page survey that comprehensively discusses all aspects of 3D object detection in the context of autonomous driving.

CLIP2: Contrastive Language-Image-Point Pretraining from Real-World Point Cloud Data

Yihan Zeng*, Chenhan Jiang*, Jiageng Mao, Jianhua Han, Chaoqiang Ye, Qingqiu Huang, Dit-Yan Yeung, Zhen Yang, Xiaodan Liang, Hang Xu.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.

The first attempt to lift the vision-language foundation model CLIP to 3D space, leveraging real-world image and point cloud data.

Point2Seq: Detecting 3D Objects as Sequences

Yujing Xue*, Jiageng Mao*, Minzhe Niu, Hang Xu, Michael Bi Mi, Wei Zhang, Xiaogang Wang, Xinchao Wang.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022. [Code]

The first attempt to represent 3D objects as words and leverage the paradigm of language models to solve the 3D object detection problem.

Voxel Transformer for 3D Object Detection

Jiageng Mao*, Yujing Xue*, Minzhe Niu, Haoyue Bai, Jiashi Feng, Xiaodan Liang, Hang Xu, Chunjing Xu.
International Conference on Computer Vision (ICCV), 2021. [Code]

The first Transformer-based framework for voxel data processing.
Selected into Stanford CS348n. [Link]

One Million Scenes for Autonomous Driving: ONCE Dataset

Jiageng Mao*, Minzhe Niu*, Chenhan Jiang, Hanxue Liang, Jingheng Chen, Xiaodan Liang, Yamin Li, Chaoqiang Ye, Wei Zhang, Zhenguo Li, Jie Yu, Chunjing Xu, Hang Xu.
Neural Information Processing Systems (NeurIPS), 2021. [Code]

The first large-scale real-world autonomous driving dataset focusing on data-efficient learning for driving applications.

Pyramid R-CNN: Towards Better Performance and Adaptability for 3D Object Detection

Jiageng Mao, Minzhe Niu, Haoyue Bai, Xiaodan Liang, Hang Xu, Chunjing Xu.
International Conference on Computer Vision (ICCV), 2021. [Code]

A high-performance 3D object detector for autonomous vehicles.
Ranked 1st on the Waymo Open Dataset LiDAR detection leaderboard (2021.3).

GRNet: Gridding Residual Network for Dense Point Cloud Completion

Haozhe Xie, Hongxun Yao, Shangchen Zhou, Jiageng Mao, Shengping Zhang, Wenxiu Sun.
European Conference on Computer Vision (ECCV), 2020. [Code]

The first attempt to leverage convolutional frameworks to solve the dense point cloud completion problem.

Interpolated Convolutional Networks for 3D Point Cloud Understanding

Jiageng Mao, Xiaogang Wang, Hongsheng Li.
International Conference on Computer Vision (ICCV), 2019.

The first attempt to tackle irregular point cloud data with discrete convolutional kernels. Selected as an oral presentation (top 4%).

Honors and Awards

Professional Activities

Other Experiences


ENGG 2030 Signal and Systems, Fall 2019, 2020, 2021
ENGG 1130 Multivariable Calculus, Spring 2019, 2020, 2021