Hi, I’m Yuxiao Yang (杨宇骁).
I am a first-year Ph.D. student in Computer Science at the University of North Carolina at Chapel Hill, where I am fortunate to be advised by Prof. Weitong Zhang.
I am broadly interested in reinforcement learning and large language models.
Before joining UNC, I earned my bachelor’s degree in Computer Science from the John Hopcroft Class at Shanghai Jiao Tong University.
📝 Publications
- Return-to-Go Is More Than a Number: Q-Guided Alignment for Return-Conditioned Supervised Learning (ICML 2026)
Yuxiao Yang, Weitong Zhang
Paper · Code
- Provable and Practical In-Context Policy Optimization for Self-Improvement (ICLR 2026)
Tianrun Yu*, Yuxiao Yang*, Zhaoyang Wang, Kaixiang Zhao, Porter Jenkins, Xuchao Zhang, Chetan Bansal, Huaxiu Yao, Weitong Zhang
Paper · Code
* indicates equal contribution.