Hi, I’m Yuxiao Yang (杨宇骁).

I am a first-year Ph.D. student in Computer Science at the University of North Carolina at Chapel Hill, where I am fortunate to be advised by Prof. Weitong Zhang.

I am broadly interested in reinforcement learning and large language models.

Before joining UNC, I earned my bachelor’s degree in Computer Science from the John Hopcroft Class at Shanghai Jiao Tong University.

📝 Publications

Return-to-Go Is More Than a Number: Q-Guided Alignment for Return-Conditioned Supervised Learning (ICML 2026)
Yuxiao Yang, Weitong Zhang
Paper · Code

Provable and Practical In-Context Policy Optimization for Self-Improvement (ICLR 2026)
Tianrun Yu*, Yuxiao Yang*, Zhaoyang Wang, Kaixiang Zhao, Porter Jenkins, Xuchao Zhang, Chetan Bansal, Huaxiu Yao, Weitong Zhang
Paper · Code

* indicates equal contribution.