@yongtenglei yongtenglei commented Oct 16, 2025

What problem does this PR solve?

Add MinerU parser. #3945, #8092.

Set MINERU_EXECUTABLE to the path of the MinerU executable; it defaults to mineru.

Set MINERU_DELETE_OUTPUT=0 to preserve MinerU's output. The default is 1, which deletes the temporary output.

Set MINERU_OUTPUT_DIR to choose the MinerU output directory (uses the temporary directory if unset).
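
For illustration, the three variables above could be resolved roughly as follows. This is a minimal sketch and not the code merged in this PR: `MinerUSettings`, `load_mineru_settings`, and `prepare_output_dir` are hypothetical names, and treating only the literal value `0` as "keep the output" is an assumption.

```python
import os
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class MinerUSettings:
    """Hypothetical container for the three settings; not a class from the PR."""
    executable: str
    delete_output: bool
    output_dir: Optional[Path]


def load_mineru_settings(env: Mapping[str, str] = os.environ) -> MinerUSettings:
    # MINERU_EXECUTABLE falls back to "mineru", which is then looked up on PATH.
    executable = env.get("MINERU_EXECUTABLE", "mineru")
    # Assumption: only the literal "0" disables deletion; anything else keeps the default of 1.
    delete_output = env.get("MINERU_DELETE_OUTPUT", "1").strip() != "0"
    # An unset or empty MINERU_OUTPUT_DIR means "use a temporary directory".
    raw_dir = env.get("MINERU_OUTPUT_DIR", "").strip()
    output_dir = Path(raw_dir) if raw_dir else None
    return MinerUSettings(executable, delete_output, output_dir)


def prepare_output_dir(settings: MinerUSettings) -> Path:
    """Return the directory MinerU output would be written to, creating it if needed."""
    if settings.output_dir is not None:
        settings.output_dir.mkdir(parents=True, exist_ok=True)
        return settings.output_dir
    return Path(tempfile.mkdtemp(prefix="mineru_"))
```

With no variables set, `load_mineru_settings({})` yields the executable `mineru`, deletion enabled, and no fixed output directory, so `prepare_output_dir` would create a fresh temporary directory.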

Type of change

  • New Feature (non-breaking change which adds functionality)

@dosubot (bot) added the labels size:L (This PR changes 100-499 lines, ignoring generated files.), 🌈 python (Pull requests that update Python code), and 💞 feature (Feature request, pull request that fulfills a new feature.) Oct 16, 2025
@KevinHuSh KevinHuSh added the ci Continue Integration label Oct 16, 2025
@KevinHuSh KevinHuSh merged commit 387baf8 into infiniflow:main Oct 17, 2025
2 checks passed
@JinHai-CN JinHai-CN mentioned this pull request Oct 21, 2025
41 tasks
cike8899 added a commit to cike8899/ragflow that referenced this pull request Dec 11, 2025
KevinHuSh pushed a commit that referenced this pull request Dec 11, 2025
### What problem does this PR solve?

Feat: Add mineru as a model manufacturer to the system. #10621

### Type of change


- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: balibabu <assassin_cike@163.com>
yngvarhuang pushed a commit to yngvarhuang/ragflow that referenced this pull request Dec 13, 2025
…support_encrypted_files_20251209

* commit '44dec89f1fd6eb98f2d8d5c0137074990a09a99c': (28 commits)
  Fix: aspose-slide issue. (infiniflow#11935)
  Fix: raptor don't have attribute chat (infiniflow#11936)
  Feat: Add GPT-5.2 & pro (infiniflow#11929)
  Refa: refactor metadata filter (infiniflow#11907)
  Feat: Displaying the file option in the webhook's request body infiniflow#10427 (infiniflow#11928)
  Fix: forget-reset password (infiniflow#11927)
  Feature/docs generator (infiniflow#11858)
  Fix: correct metadata update behavior (infiniflow#11919)
  Docs: How to use restful API to update or delete metadata (infiniflow#11912)
  Feat: Add box connector (infiniflow#11845)
  Feat: Flatten the request schema of the webhook infiniflow#10427 (infiniflow#11917)
  feat: Add Single Bucket Mode for MinIO/S3 (infiniflow#11416)
  Fix: tokenizer issue. (infiniflow#11902)
  Feat: Add mineru as a model manufacturer to the system. infiniflow#10621 (infiniflow#11903)
  Refa: treat MinerU as an OCR model 2 (infiniflow#11905)
  Add DeepseekV3.2 of Tongyi-Qianwen and remove unused code (infiniflow#11898)
  Fix:async issue and sensitive logging (infiniflow#11895)
  Added semi-automatic mode to the metadata filter (infiniflow#11886)
  Fix data_sync startup crash by properly invoking async main (infiniflow#11879)
  MinerU supports for the new backend vlm-mlx-engine (infiniflow#11864)
  ...

# Conflicts:
#	docker/.env
#	pyproject.toml