
Yutian Chen
Ph.D. in Computer Science
MSc. Robotics (Dropped)
BSc. Computer Science, Minor in Mathematical Science
About Me
I am an M.S. Robotics student at Carnegie Mellon University, advised by Prof. Sebastian Scherer in the AirLab. My research focuses on enabling machines to understand and interact with physical reality through robust geometric and semantic perception. I am broadly interested in visual geometry, spatial reasoning, and the development of scalable algorithms that bridge perception and action for autonomous systems.
Recent Updates
Below are highlights of my recent work. For a full list, please see here.
- MAC-VO: Metric-Aware Covariance for Learning-based Stereo Visual Odometry · ICRA 2025
ICRA 2025 Best Conference Paper Award
Best Paper Award on Robot Perception
Projects
PyPose is a library for robot learning with physics-based optimization. It supports efficient automatic differentiation on Lie groups and Lie algebras.
I created the C0 program visualizer, a virtual machine that executes C-language programs in the browser and provides real-time memory visualization.
Open Source Notes
I believe knowledge is most impactful when shared freely. By open-sourcing my notes, from high school AP courses to advanced university topics, I aim to make knowledge more accessible to everyone.
Experience
Sep 2022 – Now
Spatial AI & Visual-Inertial SLAM
The AirLab, Robotics Institute, Carnegie Mellon University
- MAC-VO: Metric-Aware Covariance for Learning-based Stereo Visual Odometry · ICRA 2025
ICRA 2025 Best Conference Paper Award
Best Paper Award on Robot Perception
- AirIMU: Learning Uncertainty Propagation for Inertial Odometry · arXiv Preprint
- PyPose v0.6: The Imperative Programming Interface for Robotics · IROS Workshop 2023
Working with Professor Sebastian Scherer, I aimed to build a robust and accurate visual-inertial SLAM system using a data-driven approach. I developed MAC-VO, an award-winning state-of-the-art visual odometry system that generalizes across diverse environments (even the lunar surface 🌕!).
Jay Patrikar and I proposed Confidence-Guided Token Merging (Co-Me), a training-free acceleration method for visual geometric transformers that identifies and merges low-confidence tokens to reduce computation while preserving spatial fidelity. By leveraging a distilled confidence predictor, Co-Me delivers substantial speedups across models such as VGGT (up to 11.3x) and MapAnything (up to 7.8x), enabling real-time 3D perception.
Working with Professor Chuang Gan, I developed a data pipeline for city-scale 3D scene reconstruction from real-world satellite and street-view images, for use in a multi-agent simulator.
Working with Professors Rita Singh and Bhiksha Raj, I built a detector for LLM-generated content called "LLM-Sentinel". It reaches 98% accuracy on the test dataset and outperforms existing content detectors from OpenAI and ZeroGPT. I also collected the OpenLLMText dataset, which contains 30k human-written texts from OpenWebText and their corresponding versions rephrased by various LLMs such as GPT-3.5, LLaMA, and PaLM.
- Myocardial Segmentation of Cardiac MRI Sequences With Temporal Consistency for Coronary Artery Disease Diagnosis · Frontiers in Cardiovascular Medicine 2022
Mentored by Professors Yiyu Shi and Xiaowei Xu, I proposed an encoder-decoder architecture for semantic segmentation of cardiac MRI sequences. By introducing a temporal constraint on the segmentation results, the model improved accuracy by 2% on the ACDC dataset compared to the baseline model.



