Haoyang Zou | Shanghai Jiao Tong University

About Me

I work on models that can think, act, and hold up once they leave controlled environments and enter real work.

I am a first-year Ph.D. student at Shanghai Jiao Tong University, advised by Prof. Pengfei Liu, and a research intern at ByteDance Seed, where I work closely with Yujia Qin. My research centers on reasoning, reinforcement learning, and agents, with a focus on training foundation models for capable and reliable agent behavior. Before that, I received my B.S. in Artificial Intelligence from Fudan University.

Research Interests

Reasoning and Reinforcement Learning for Agents: Studying how agents can plan, adapt, and remain reliable in complex environments.
Coding Agents in Practice: Designing coding agents that optimize for real-world usefulness rather than benchmark performance.
Self-Improving Agent Systems: Exploring systems whose capability grows through use and interaction.

Selected Work

Seed 2.0

ByteDance Seed Team (Core Contributor)

Product Launch, Feb 2026

Homepage Blog GitHub

As a core contributor, I am responsible for the coding agent capabilities of Seed 2.0, a multimodal foundation model series built for strong agentic performance.

Seed 1.8

ByteDance Seed Team (Core Contributor)

Product Launch, Dec 2025

Blog GitHub

As a core contributor, I am responsible for the coding agent capabilities of Seed 1.8, a generalized agentic model unifying search, code, and GUI abilities in one system.

UI-TARS-2: Advancing GUI Agent with Multi-Turn Reinforcement Learning

Haoming Wang, Haoyang Zou, Huatong Song, Jiazhan Feng, Junjie Fang, et al. (alphabetical order)

Tech Report, Sep 2025

Paper Code

A GUI-centered agent trained with multi-turn RL and a data flywheel, reaching SOTA on OSWorld, Mind2Web, and AndroidWorld, and supporting the Doubao Mobile Assistant.

ToRL: Scaling Tool-Integrated RL

Xuefeng Li*, Haoyang Zou*, Pengfei Liu

Preprint, Mar 2025

Paper Code Models

A tool-integrated RL framework that lets base models acquire tool use through training rather than prompting alone.

LIMR: Less is More for RL Scaling

Xuefeng Li*, Haoyang Zou*, Pengfei Liu

Preprint, Feb 2025

Paper Code Models Dataset

A data selection method showing that better-chosen trajectories can match larger RL runs at a fraction of the scale.

O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?

Zhen Huang*, Haoyang Zou*, Xuefeng Li*, Yixiu Liu*, Yuxiang Zheng*, Ethan Chern*, Shijie Xia*, Yiwei Qin, Weizhe Yuan, Pengfei Liu

Preprint, Nov 2024

Paper Code

A distillation study showing that simple pipelines can push open models past o1-preview on targeted reasoning evaluations.

O1 Replication Journey: A Strategic Progress Report -- Part 1

Yiwei Qin*, Xuefeng Li*, Haoyang Zou*, Yixiu Liu*, Shijie Xia*, Zhen Huang, Yixin Ye, Weizhe Yuan, Hector Liu, Yuanzhi Li, Pengfei Liu

Preprint, Oct 2024

Paper Code

An early progress report documenting what did and did not work while replicating o1-style reasoning behavior.

Generative AI for Math: Abel

Ethan Chern*, Haoyang Zou*, Xuefeng Li*, Jiewen Hu*, Kehua Feng, Junlong Li, Pengfei Liu

Project, 2024

Project Code Models

A mathematical reasoning model that reached state-of-the-art results on GSM8K and MATH at the time.

Generative AI Act II: Test Time Scaling Drives Cognition Engineering

Shijie Xia, Yiwei Qin, Xuefeng Li, Yan Ma, Run-Ze Fan, Steffi Chern, Haoyang Zou, Fan Zhou, Xiangkun Hu, Jiahe Jin, Yanheng He, Yixin Ye, Yixiu Liu, Pengfei Liu

Preprint, Apr 2025

PDF Code

A survey framing test-time scaling as an emerging engineering discipline for reasoning systems.

O1 Replication Journey -- Part 3: Inference-time Scaling for Medical Reasoning

Zhongzhen Huang*, Gui Geng*, Shengyi Hua*, Zhen Huang*, Haoyang Zou*, Shaoting Zhang, Pengfei Liu, Xiaofan Zhang

Preprint, Jan 2025

Paper Code Dataset

An inference-time scaling study for medical reasoning, focused on where extra computation actually improves clinical problem solving.

PC Agent: While You Sleep, AI Works -- A Cognitive Journey into Digital World

Yanheng He*, Jiahe Jin*, Shijie Xia, Jiadi Su, Runze Fan, Haoyang Zou, Xiangkun Hu, Pengfei Liu

Preprint, Dec 2024

Paper Code Models

A GUI agent project inspired by human cognitive workflows for completing computer tasks end to end.

OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI

Zhen Huang, Zengzhi Wang, Shijie Xia, Xuefeng Li, Haoyang Zou, Ruijie Xu, Run-Ze Fan, Lyumanshan Ye, Ethan Chern, Yixin Ye, ..., Pengfei Liu

NeurIPS 2024 Datasets & Benchmarks

PDF Code Dataset Project

A multimodal benchmark designed to test cross-discipline reasoning at olympiad difficulty.

Reformatted Alignment

Run-Ze Fan, Xuefeng Li, Haoyang Zou, Junlong Li, Shwai He, Ethan Chern, Jiewen Hu, Pengfei Liu

Preprint, Feb 2024

Paper Code Dataset Project

A data-centric alignment method that improves model behavior by reformatting responses rather than only scaling data or compute.

Haoyang Zou 邹皓阳
