Hi, I’m Yuxiao Yang (杨宇骁).

I am a first-year Ph.D. student in Computer Science at the University of North Carolina at Chapel Hill, where I am fortunate to be advised by Prof. Weitong Zhang.

I am broadly interested in reinforcement learning and large language models.

Before joining UNC, I earned my bachelor’s degree in Computer Science from the John Hopcroft Class at Shanghai Jiao Tong University.

📝 Publications

  • Return-to-Go Is More Than a Number: Q-Guided Alignment for Return-Conditioned Supervised Learning (ICML 2026)
    Yuxiao Yang, Weitong Zhang
    Paper · Code
  • Provable and Practical In-Context Policy Optimization for Self-Improvement (ICLR 2026)
    Tianrun Yu*, Yuxiao Yang*, Zhaoyang Wang, Kaixiang Zhao, Porter Jenkins, Xuchao Zhang, Chetan Bansal, Huaxiu Yao, Weitong Zhang
    Paper · Code

* indicates equal contribution.