Feng Xiang - Homepage

About Me

I am Feng Xiang, a master's student in Computer Science at Wuhan University and a Research Intern at Alibaba Group.

My research focuses on multimodal large language models and agentic reinforcement learning (including search and memory). Beyond building stronger model reasoning, I am particularly interested in trustworthy reasoning: enabling models to "know what they know and know what they don't", and to proactively gather clearly sourced evidence that forms verifiable reasoning chains. I believe this is essential for deploying LLM/MLLM agents in real-world, high-stakes scenarios. I am also exploring unified multimodal models that incorporate action modalities.

🤝 I am always looking for research collaborations. If you are interested in sharing GPU resources or discussing ideas, feel free to reach out!

Experience

Alibaba Group

Research Intern

Focusing on trustworthy reasoning for document intelligence.

2026.01 - Present

Wuhan University

M.S. in Computer Science and Technology, School of Computer Science

Weighted average: 92.71/100; rank: 16/207, Top 8%

2024.09 - Present

Lanzhou University

B.S. in Computer Science and Technology, School of Information Science and Engineering

Weighted average: 88.44/100; rank: 7/113, Top 6%

2020.09 - 2024.06

News

2026.06 New preprint: DocScope, benchmarking verifiable reasoning for trustworthy long-document understanding. [arXiv]

2026.05 Omni-I2C accepted at ACL 2026. [arXiv]

2026.01 AnesSuite accepted at ICLR 2026. [arXiv]

2026.01 Joined Alibaba Group as a Research Intern.

2025.10 New preprint: Adaptive Decoding via hierarchical neural information gradients. [arXiv]

2025.09 Received First-class Academic Scholarship, Wuhan University.

2025.08 New preprint: REX-RAG, reasoning exploration with policy correction in RAG. [arXiv]

2024.10 Received Second-class Freshman Scholarship, Wuhan University.

2024.09 National Third Prize, China Graduate Mathematical Modeling Competition.

2024.09 Started M.S. at Wuhan University.

2024.06 Outstanding Undergraduate Thesis, Lanzhou University.

2023.06 National Third Prize, Lanqiao Cup Python Programming, Group A.

2022.09 National Second Prize, Contemporary Undergraduate Mathematical Contest in Modeling.

Publications

DocScope: Benchmarking Verifiable Reasoning for Trustworthy Long-Document Understanding

X. Feng, J. Zhou, Z. Huang, K. Wang, S. Ye, J. Hu, Z. Chen, Y. Luo, J. Zhang

arXiv preprint arXiv:2605.08888, 2026

A benchmark that evaluates whether MLLMs can produce trustworthy, verifiable reasoning traces over long, visually rich documents via a four-stage evaluation protocol.

Paper | Code

Omni-I2C: A Holistic Benchmark for High-Fidelity Image-to-Code Generation

J. Zhou, C. Zhang, X. Feng, Q. Zhang, H. Qiu, L. He, D. Ye, X. Gao, J. Zhang

ACL 2026

A comprehensive benchmark of 1,080 samples across 5 code types and 45 figure types for evaluating LMMs on converting complex digital graphics into executable code.

Paper | Code

AnesSuite: A Comprehensive Benchmark and Dataset Suite for Anesthesiology Reasoning in LLMs

X. Feng, W. Jiang, Z. Wang, Y. Luo, P. Xu, B. Yu, H. Jin, J. Zhang

International Conference on Learning Representations (ICLR), 2026

The first comprehensive dataset suite for anesthesiology reasoning, covering benchmark, training data (CPT/SFT/RLVR), and Morpheus baseline reasoning models.

Paper | Code

REX-RAG: Reasoning Exploration with Policy Correction in Retrieval-Augmented Generation

W. Jiang, X. Feng, Z. Wang, Y. Luo, P. Xu, Z. Chen, B. Du, J. Zhang

arXiv preprint arXiv:2508.08149, 2025

Addresses dead-end exploration in RL-trained RAG agents through mixed sampling with exploratory prompts and a policy correction mechanism to reduce distribution shift.

Paper | Code

Adaptive Decoding via Hierarchical Neural Information Gradients in Mouse Visual Tasks

J. Feng, X. Feng

arXiv preprint arXiv:2510.09451, 2025

Proposes a hierarchical neural-information gradient framework to decode visual task representations from mouse brain activity across cortical regions.

Paper

Decoding Mouse Visual Tasks via Hierarchical Neural-Information Gradients

J. Feng, X. Feng, Y. Luo, J. Li

Mathematics 14(1), 31, 2025

Studies hierarchical information gradients across mouse visual cortex to understand how neural data flows support visual task decoding.

Orthogonal-moment-based Attraction Measurement with Ocular Hints in Video-watching Task

M. Yang, X. Feng, R. Ma, X. Li, C. Mao

IEEE Transactions on Computational Social Systems 10(3), 900-909, 2023

Combines orthogonal moments with eye-tracking signals to measure viewer attraction levels during video-watching tasks.

Projects

Academic Service

Reviewer: Annual Conference on Neural Information Processing Systems (NeurIPS), IEEE Transactions on Multimedia (TMM), IEEE International Conference on Multimedia and Expo (ICME).

Memberships: IEEE Student Member, IEEE Geoscience and Remote Sensing Society (GRSS) Member.