About Me
I am Feng Xiang, a master's student in Computer Science at Wuhan University and a Research Intern at Alibaba Group.
My research focuses on multimodal large language models and agentic reinforcement learning (including search and memory). Beyond strengthening model reasoning, I am particularly interested in trustworthy reasoning: enabling models to "know what they know and know what they don't," and to proactively gather clearly sourced evidence to form verifiable reasoning chains. I believe this is essential for deploying LLM/MLLM agents in real-world, high-stakes scenarios. I am also exploring unified multimodal models that incorporate action modalities.
🤝 I am always looking for research collaborations. If you are interested in sharing GPU resources or discussing ideas, feel free to reach out!
Experience
Alibaba Group
Research Intern
Focusing on trustworthy reasoning for document intelligence.
2026.01 - Present
Wuhan University
M.S. in Computer Science and Technology, School of Computer Science
Weighted average: 92.71/100; rank: 16/207, Top 8%
2024.09 - Present
Lanzhou University
B.S. in Computer Science and Technology, School of Information Science and Engineering
Weighted average: 88.44/100; rank: 7/113, Top 6%
2020.09 - 2024.06
News
Publications
- DocScope: Benchmarking Verifiable Reasoning for Trustworthy Long-Document Understanding
  arXiv preprint arXiv:2605.08888, 2026
  A benchmark that evaluates whether MLLMs can produce trustworthy, verifiable reasoning traces over long, visually rich documents via a four-stage evaluation protocol.
- AnesSuite: A Comprehensive Benchmark and Dataset Suite for Anesthesiology Reasoning in LLMs
  International Conference on Learning Representations (ICLR), 2026
  The first comprehensive dataset suite for anesthesiology reasoning, covering a benchmark, training data (CPT/SFT/RLVR), and the Morpheus baseline reasoning models.
- REX-RAG: Reasoning Exploration with Policy Correction in Retrieval-Augmented Generation
  arXiv preprint arXiv:2508.08149, 2025
  Addresses dead-end exploration in RL-trained RAG agents through mixed sampling with exploratory prompts and a policy correction mechanism to reduce distribution shift.
- Adaptive Decoding via Hierarchical Neural Information Gradients in Mouse Visual Tasks
  arXiv preprint arXiv:2510.09451, 2025
  Proposes a hierarchical neural-information gradient framework to decode visual task representations from mouse brain activity across cortical regions.
- Decoding Mouse Visual Tasks via Hierarchical Neural-Information Gradients
  Mathematics 14(1), 31, 2025
  Studies hierarchical information gradients across mouse visual cortex to understand how neural data flows support visual task decoding.
- Orthogonal-moment-based Attraction Measurement with Ocular Hints in Video-watching Task
  IEEE Transactions on Computational Social Systems 10(3), 900-909, 2023
  Combines orthogonal moments with eye-tracking signals to measure viewer attraction levels during video-watching tasks.