Wenli Xiao

Research

Core Contributor

CaP-X: Benchmarking and Improving Coding Agents for Robot Manipulation

ICML 2026

CaP-X introduces CaP-Agent0, a training-free agentic framework that enables off-the-shelf LLMs to perform robotic manipulation via code generation, and CaP-Bench, a comprehensive evaluation suite of 100+ tasks. Frontier models achieve 30%+ zero-shot success on unseen tasks and 18% on perturbed tasks (vs. 0% for VLA models), and a 7B model improves from 20% to 72% in simulation with 84% real-world transfer.

Letian Fu*, Justin Yu*, Karim El-Refai*, Ethan Kou, Haoru Xue, Huang Huang, Wenli Xiao, Guanzhi Wang, Fei-Fei Li, Guanya Shi, Jiajun Wu, Shankar Sastry, Yuke Zhu, Ken Goldberg, Linxi "Jim" Fan

arXiv Website Code

Co-Lead

Self-Improving Vision-Language-Action Models with Data Generation via Residual RL

ICLR 2026

PLD (Probe, Learn, Distill) is a plug-and-play recipe for Vision-Language-Action (VLA) post-training. It is model-agnostic, supports both autoregressive and diffusion architectures, and can push success rates to 99%.

Wenli Xiao*, Haotian Lin*, Andy Peng, Haoru Xue, Tairan He, Yuqi Xie, Fengyuan Hu, Jimmy Wu, Zhengyi Luo, Linxi "Jim" Fan†, Guanya Shi, Yuke Zhu†

Website Twitter

Co-Lead

ENPIRE: Agentic Robot Policy Self-Improvement in the Real World

In submission

Physical Autoresearch on a real-world robot fleet. ENPIRE lets coding agents autonomously improve robot manipulation policies through a closed-loop physical feedback system (automatic environment reset and verification, parallel robot rollouts, and evolutionary refinement), reaching a 99% success rate on challenging dexterous manipulation tasks.

Wenli Xiao*, Jia Xie*, Tonghe Zhang*, Haotian Lin*, Letian "Max" Fu, Haoru Xue, Jalen Lu, Yi Yang, Cunxi Dai, Zi Wang, Jimmy Wu, Guanzhi Wang, S. Shankar Sastry, Ken Goldberg, Linxi "Jim" Fan‡, Yuke Zhu‡, Guanya Shi‡

Website Twitter Media

Wenli Xiao

I am currently a Research Intern at Physical Intelligence.

I'm a final-year PhD student at CMU Robotics, advised by Prof. Guanya Shi. My vision is to build scalable, generally intelligent robots in the real world. I am broadly interested in dexterous manipulation and humanoid robots.

I spent two wonderful years (2024-2026) interning at NVIDIA GEAR Lab, working on foundation model post-training and Physical Autoresearch with Dr. Jim Fan and Prof. Yuke Zhu.

HOVER: Versatile Neural Whole-Body Controller for Humanoid Robots

ICRA 2025

TL;DR: HOVER is a 1.5M-parameter neural network that controls the body of a humanoid robot. Humans rely on a great deal of subconscious processing to walk, maintain balance, and maneuver their arms and legs into desired positions. We capture this "subconsciousness" in HOVER, a single model that learns to coordinate the motors of a humanoid robot to support locomotion and manipulation.

News

Research Projects

ENPIRE: Agentic Robot Policy Self-Improvement in the Real World

Self-Improving Vision-Language-Action Models with Data Generation via Residual RL

CaP-X: Benchmarking and Improving Coding Agents for Robot Manipulation

HOVER: Versatile Neural Whole-Body Controller for Humanoid Robots

AnyCar to Anywhere: Learning Universal Dynamics Model

Background