🧑🏻‍💻 About Me

I am Haoyang Zou (邹皓阳), a senior undergraduate student majoring in Artificial Intelligence at Fudan University. I am currently conducting research under the supervision of Prof. Pengfei Liu and will continue working with him during my upcoming Ph.D. program at Shanghai Jiao Tong University (SJTU).

Research Interests

My research focuses on:

  • Large Language Model Reasoning: Enhancing the reasoning capabilities of large language models
  • Reinforcement Learning for LLMs: Using RL to improve language model capabilities in complex environments beyond traditional benchmarks

I am passionate about building capable and reliable AI systems that can advance complex reasoning and accelerate innovation in artificial intelligence.

🔥 News

  • [2024.09] I will start my Ph.D. at Shanghai Jiao Tong University in September 2025.

📖 Selected Works | Full

ToRL: Scaling Tool-Integrated RL
Xuefeng Li*, Haoyang Zou*, Pengfei Liu
2025, Tech Report.
Paper / Code / Models
An RL framework that lets LLMs learn tool use autonomously, starting directly from base models.

Generative AI Act II: Test Time Scaling Drives Cognition Engineering
Shijie Xia, Yiwei Qin, Xuefeng Li, Yan Ma, Run-Ze Fan, Steffi Chern, Haoyang Zou, Fan Zhou, Xiangkun Hu, Jiahe Jin, Yanheng He, Yixin Ye, Yixiu Liu, Pengfei Liu
2025, Preprint
PDF / Code
A survey of test-time scaling.

LIMR: Less is More for RL Scaling
Xuefeng Li*, Haoyang Zou*, Pengfei Liu
2025, Tech Report.
Paper / Code / Models & Dataset
A data selection method that achieves comparable RL performance with fewer, strategically chosen samples.

O1 Replication Journey--Part 3: Inference-time Scaling for Medical Reasoning
Zhongzhen Huang*, Gui Geng*, Shengyi Hua*, Zhen Huang*, Haoyang Zou*, Shaoting Zhang, Pengfei Liu, Xiaofan Zhang
2025, Tech Report.
Paper / Code / Dataset
An inference-time scaling approach for medical reasoning tasks.

O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?
Zhen Huang*, Haoyang Zou*, Xuefeng Li*, Yixiu Liu*, Yuxiang Zheng*, Ethan Chern*, Shijie Xia*, Yiwei Qin, Weizhe Yuan, Pengfei Liu
2024, Tech Report.
Paper / Code
A distillation approach that surpasses o1-preview performance.

O1 Replication Journey: A Strategic Progress Report--Part 1
Yiwei Qin*, Xuefeng Li*, Haoyang Zou*, Yixiu Liu*, Shijie Xia*, Zhen Huang, Yixin Ye, Weizhe Yuan, Hector Liu, Yuanzhi Li, Pengfei Liu
2024, Tech Report.
Paper / Code
An early exploration of O1 replication that documents our trial-and-error attempts.

PC Agent: While You Sleep, AI Works -- A Cognitive Journey into Digital World
Yanheng He*, Jiahe Jin*, Shijie Xia, Jiadi Su, Runze Fan, Haoyang Zou, Xiangkun Hu, Pengfei Liu
2024, Tech Report.
Paper / Code / Models & Dataset
A GUI agent that learns from human cognitive processes to perform computer work.

OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI
Zhen Huang, Zengzhi Wang, Shijie Xia, Xuefeng Li, Haoyang Zou, Ruijie Xu, Run-Ze Fan, Lyumanshan Ye, Ethan Chern, Yixin Ye, Yikai Zhang, Yuqing Yang, Ting Wu, Binjie Wang, Shichao Sun, Yang Xiao, Yiyuan Li, Fan Zhou, Steffi Chern, Yiwei Qin, Yan Ma, Jiadi Su, Yixiu Liu, Yuxiang Zheng, Shaoting Zhang, Dahua Lin, Yu Qiao, Pengfei Liu
2024, NeurIPS 2024 (DB track).
PDF / Code / Dataset / Project Page
A challenging multi-modal Olympic-competition benchmark for LLMs and LVMs.

Reformatted Alignment
Run-Ze Fan, Xuefeng Li, Haoyang Zou, Junlong Li, Shwai He, Ethan Chern, Jiewen Hu, Pengfei Liu
2024, Tech Report.
Paper / Code / Dataset / Project Page
A data-quality method that reformats responses to improve LLM alignment.

Generative AI for Math: Abel
Ethan Chern*, Haoyang Zou*, Xuefeng Li*, Jiewen Hu*, Kehua Feng, Junlong Li, Pengfei Liu
2024, Project.
Project Page / Code / Models
A mathematical reasoning model that achieved then-state-of-the-art performance on the GSM8K and MATH benchmarks.

Experiences