Stars
[ICCV25 Highlight] The official implementation of the paper "LEGION: Learning to Ground and Explain for Synthetic Image Detection"
(NeurIPS 2025 🔥) Official implementation for "Efficient Multi-modal Large Language Models via Progressive Consistency Distillation"
TRivia: Self-supervised Fine-tuning of Vision-Language Models for Table Recognition
An official implementation of "CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning"
Code for "The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs"
[ICCV 2025] SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal domains, for both inference and training.
[NeurIPS 2025] IEAP: Image Editing As Programs with Diffusion Models
📚 Collection of token-level model compression resources.
AnywhereDoor is a multi-target backdoor attack tailored for object detection. Once implanted, it enables adversaries to specify different attack types (object vanishing, fabrication, or misclassifi…
A large scale camera-taken table detection and recognition dataset.
[EMNLP 2025 main 🔥] Code for "Stop Looking for Important Tokens in Multimodal Language Models: Duplication Matters More"
Official implementation for the paper "Towards Understanding How Knowledge Evolves in Large Vision-Language Models"
AL-Bench: A benchmark for automatic logging
MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding
Solve Visual Understanding with Reinforced VLMs
A Survey on Multimodal Retrieval-Augmented Generation
Use PEFT or full-parameter training for CPT/SFT/DPO/GRPO on 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …
MM-Eureka V0, also called R1-Multimodal-Journey; the latest version is in MM-Eureka
Minimal reproduction of DeepSeek R1-Zero
This project aims to collect the latest "call for reviewers" links from various top CS/ML/AI conferences/journals
Witness the aha moment of VLM with less than $3.
Fully open reproduction of DeepSeek-R1