Wenli Xiao

Research

Core Contributor

CaP-X: Benchmarking and Improving Coding Agents for Robot Manipulation

ICML 2026

CaP-X introduces CaP-Agent0, a training-free agentic framework enabling off-the-shelf LLMs to perform robotic manipulation via code generation, and CaP-Bench, a comprehensive evaluation suite of 100+ tasks. Frontier models achieve 30%+ zero-shot success on unseen tasks and 18% on perturbed tasks (vs. 0% for VLA models); a 7B model improves from 20% to 72% in simulation, with 84% real-world transfer.

Letian Fu*, Justin Yu*, Karim El-Refai*, Ethan Kou, Haoru Xue, Huang Huang, Wenli Xiao, Guanzhi Wang, Fei-Fei Li, Guanya Shi, Jiajun Wu, Shankar Sastry, Yuke Zhu, Ken Goldberg, Linxi "Jim" Fan

Co-Lead

Self-Improving Vision-Language-Action Models with Data Generation via Residual RL

ICLR 2026

PLD (Probe, Learn, Distill) is a plug-and-play recipe for Vision-Language-Action (VLA) post-training. It is model-agnostic, supporting both autoregressive and diffusion architectures, and can push success rates to 99%.

Wenli Xiao*, Haotian Lin*, Andy Peng, Haoru Xue, Tairan He, Yuqi Xie, Fengyuan Hu, Jimmy Wu, Zhengyi Luo, Linxi "Jim" Fan†, Guanya Shi, Yuke Zhu†

Co-Lead

ENPIRE: Agentic Robot Policy Self-Improvement in the Real World

In submission

ENPIRE brings physical autoresearch to a real-world robot fleet. It lets coding agents autonomously improve robot manipulation policies through a closed-loop physical feedback system (automatic environment reset and verification, parallel robot rollouts, and evolutionary refinement), reaching a 99% success rate on challenging dexterous manipulation tasks.

Wenli Xiao*, Jia Xie*, Tonghe Zhang*, Haotian Lin*, Letian "Max" Fu, Haoru Xue, Jalen Lu, Yi Yang, Cunxi Dai, Zi Wang, Jimmy Wu, Guanzhi Wang, S. Shankar Sastry, Ken Goldberg, Linxi "Jim" Fan‡, Yuke Zhu‡, Guanya Shi‡
