Yutian Chen

Carnegie Mellon University, Aug 2021 - May 2027

MSc. Robotics

BSc. Computer Science, Minor in Mathematical Science

About Me

I am an M.S. Robotics student at Carnegie Mellon University, advised by Prof. Sebastian Scherer in the AirLab. My research focuses on enabling machines to understand and interact with physical reality through robust geometric and semantic perception. I am broadly interested in visual geometry, spatial reasoning, and the development of scalable algorithms that bridge perception and action for autonomous systems.

Recent Research

Below are highlights of my recent work. For a full list, please see here.

Projects

PyPose

PyPose is a library for robot learning with physics-based optimization. It supports efficient automatic differentiation on Lie groups and Lie algebras. I am an active contributor to the PyPose project.

C0 Visualizer

I designed and implemented the C0 program visualizer in Summer 2022. It is a virtual machine that executes the C0 language (a safe subset of C) and provides visualization and debugging tools for educational purposes.

Skills

Robotics

Visual-Inertial SLAM, Computer Vision, Geometric Vision, ROS2, C++

Deep Learning & Artificial Intelligence

Natural Language Processing, Semantic Segmentation, PyTorch, CUDA, TensorRT, Python, Triton

Miscellaneous

React, HTML, CSS, TypeScript, LaTeX, Blender

Open Source Notes

I believe knowledge is most impactful when shared freely. By open-sourcing my notes, which range from high school AP courses to advanced university topics, I aim to make knowledge more accessible to everyone.

Experience

    Spatial AI & Visual-Inertial SLAM
    Sep 2022-Now

      Working with Professor Sebastian Scherer, I aim to build a robust and accurate visual-inertial SLAM system using a data-driven approach. I developed MAC-VO, an award-winning visual odometry system that outperforms state-of-the-art visual odometry methods such as DPVO by 30% on relative translation error (RTE) and relative rotation error (ROE) across multiple public datasets. I also deployed MAC-VO as a ROS2 node on an Orin-AGX aboard a real drone and sped up the system by 4x with TensorRT.

    1. MAC-VO: Metric-Aware Covariance for Learning-based Stereo Visual Odometry

      ICRA 2025 Best Conference Paper Award

      Best Paper Award on Robot Perception

    ViT Inference Acceleration
    Jun 2025-Aug 2025

      Together with Jay Patrikar, we proposed Confidence-Guided Token Merging (Co-Me), a training-free acceleration method for visual geometric transformers that identifies and merges low-confidence tokens to reduce computation while preserving spatial fidelity. By leveraging a distilled confidence predictor, Co-Me delivers substantial speedups on models such as VGGT (up to 11.3x) and MapAnything (up to 7.8x), enabling real-time 3D perception.

    Embodied AI Simulator

    MIT-IBM Watson AI Lab
    Apr 2024-Jan 2025

      Working with Professor Chuang Gan, I developed a data pipeline for city-scale 3D scene reconstruction from real-world satellite and street-view images, for use in a multi-agent simulator.

    Generated Text Detection
    Mar 2023-Sep 2023

      Working with Professors Rita Singh and Bhiksha Raj, I built an LLM-generated content detector called "LLM-Sentinel". It reaches 98% accuracy on the test dataset and outperforms existing content detectors by OpenAI and ZeroGPT. I also collected the OpenLLMText dataset, which contains 30k human-written texts from OpenWebText and their corresponding versions rephrased by various LLMs such as GPT-3.5, LLaMA, and PaLM.

    Medical Image Segmentation

    Guangdong Cardiovascular Institute
    Dec 2018-Jan 2020

      Mentored by Professor Yiyu Shi and Xiaowei Xu, I proposed an encoder-decoder architecture for semantic segmentation of cardiac MRI sequences. By introducing a temporal constraint on the segmentation results, the model improved accuracy by 2% on the ACDC dataset compared to the baseline model.

Courses

  1. 16-833 Localization and Mapping
  2. 16-385 Computer Vision
  3. 15-451 Algorithm Design & Analysis
  4. 15-418 Parallel Computer Architecture and Programming
  5. 11-777 Multi-Modal Machine Learning
  6. 10-708 Probabilistic Graphical Models
  7. 11-785 Intro to Deep Learning