DocScope: Benchmarking Verifiable Reasoning for
Trustworthy Long-Document Understanding

Xiang Feng¹, Jiawei Zhou¹, Zhangfeng Huang², Kewei Wang³,
Shanshan Ye⁴, Jinxin Hu², Zulong Chen²,†, Yong Luo¹,†, Jing Zhang¹,†,‡

¹ School of Computer Science, National Engineering Research Center for Multimedia Software
and Hubei Key Laboratory of Multimedia and Network Communication Engineering,
Wuhan University, China
² Alibaba Group, Hangzhou, China
³ Department of Electronic Engineering and Information Science,
University of Science and Technology of China, China
⁴ Department of Machine Learning, Mohamed bin Zayed University of Artificial Intelligence, United Arab Emirates

† Corresponding author, ‡ Project leader

Overview

DocScope is a benchmark for evaluating trustworthy long-document understanding. It tests whether multimodal large language models can produce verifiable reasoning trajectories over complete PDF documents. Each trajectory includes evidence pages, grounded evidence regions, factual statements, and a final answer.

Figure 1. Overview of DocScope.

Citation

@article{feng2026docscopebenchmarkingverifiablereasoning,
  title={DocScope: Benchmarking Verifiable Reasoning for Trustworthy Long-Document Understanding},
  author={Xiang Feng and Jiawei Zhou and Zhangfeng Huang and Kewei Wang and Shanshan Ye and Jinxin Hu and Zulong Chen and Yong Luo and Jing Zhang},
  journal={arXiv preprint arXiv:2605.08888},
  year={2026},
}
