
Pull requests: datajuicer/data-juicer

Pull requests list

fix(checkpoint): resolve relative path issue in Ray checkpoint writer
#979 opened May 14, 2026 by cmgzn Collaborator
Replace bs4 stub with beautifulsoup4 in dependencies
#977 opened May 12, 2026 by justinwolfington
2 tasks
Add video_human_3d_pose_mapper.
Labels: dj:op, enhancement
#976 opened May 10, 2026 by Qirui-jiao Collaborator
Update num_proc handling for vllm and Ray mode
#973 opened May 1, 2026 by ArdalanM
Add normal map op, optimal flow op, and universal segmentation op for videos.
Labels: dj:multimodal, dj:op, enhancement
#970 opened Apr 27, 2026 by Qirui-jiao Collaborator
[WIP] feat(agent): training-ready data recipes, learnable-value mappers, cross-model similarity
Labels: agent, dj:op, dj:post-tuning
#969 opened Apr 20, 2026 by yxdyc Collaborator
[WIP] feat: add persistent custom operator registry
#968 opened Apr 15, 2026 by cmgzn Collaborator
Add face keypoints/animal pose ops & Extend ops for frame-sequence input
Labels: dj:op, enhancement
#966 opened Apr 14, 2026 by Qirui-jiao Collaborator
refactor: declarative schema for configuration
#963 opened Apr 8, 2026 by cmgzn Collaborator
better parallelism in partitioned ray executor
#945 opened Mar 17, 2026 by cyruszhang Collaborator Draft
[WIP] feat: Integrate ElasticJuicer Core Modules
#934 opened Mar 11, 2026 by fengrui-z Collaborator
1 of 4 tasks
Feat: update vla ops and add val pipeline demo
#931 opened Mar 6, 2026 by Cathy0908 Collaborator
[WIP] arXiv/PDF to Markdown mappers + dj-op one-shot runner
Labels: dj:op
#917 opened Feb 14, 2026 by yxdyc Collaborator
[WIP] Multi-branch executor
Labels: dj:core, enhancement
#916 opened Feb 13, 2026 by yxdyc Collaborator
[WIP] feat: Add combined_logical_filter operator with AND/OR support
Labels: dj:op
#914 opened Feb 13, 2026 by yxdyc Collaborator
Feat: Support paimon, iceberg, hudi, delta lake, hdfs data source.
#911 opened Feb 11, 2026 by Dludora Collaborator
[WIP] Feat: Add RayImageBTSMinhashDeduplicator
#897 opened Jan 29, 2026 by Dludora Collaborator
Depth seg new op
Labels: dj:op
#862 opened Dec 22, 2025 by archernsy
[NewOp] Add group_diversity_filter op
#745 opened Jul 22, 2025 by lingzhq Collaborator