Wooksu Shin wooksu

2 followers · 0 following

Lists (3)

🔮 Future ideas

✨ Inspiration

🚀 My stack

Stars

EvolvingLMMs-Lab / LLaVA-OneVision-2

Fully Open Framework for Democratized Multimodal Training

Python 1,103 75 Updated Jun 23, 2026

EvolvingLMMs-Lab / OneVision-Encoder

Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence

Python 375 20 Updated Jun 20, 2026

hiyouga / LlamaFactory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 72,410 8,860 Updated Jun 23, 2026

ocy1 / TRIO

Official implementation for "TRIO: Token Reduction via Inference-Objective Guidance for Efficient Vision-Language Models" https://arxiv.org/pdf/2602.04657

Python 77 7 Updated Jun 3, 2026

code-yeongyu / oh-my-openagent

omo/lazycodex: The coding agent for tokenmaxxers; the one and only agent harness for complex codebases. For your Codex, for your OpenCode

TypeScript 63,350 5,155 Updated Jun 23, 2026

TencentCloudADP / youtu-agent

A simple yet powerful agent framework that delivers with open-source models

Python 4,576 467 Updated Mar 21, 2026

nota-github / ERGO

ERGO (Efficient Reasoning & Guided Observation) is a large vision-language model trained with reinforcement learning on efficiency objectives. [ICLR'26]

Python 19 1 Updated Feb 25, 2026

Haochen-Wang409 / TreeVGR

[ICLR'26] Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology

Python 91 2 Updated Jan 26, 2026

zzzhhzzz / Ground-R1

Python 43 1 Updated Jul 14, 2025

GeeeekExplorer / nano-vllm

Nano vLLM

Python 14,153 2,246 Updated Apr 26, 2026

Theia-4869 / CDPruner

[NeurIPS 2025] Official code for paper: Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs.

Python 104 5 Updated Sep 20, 2025

UCSB-AI / GRIT

Official code for NeurIPS 2025 paper "GRIT: Teaching MLLMs to Think with Images"

Python 190 10 Updated Jan 16, 2026

Visual-Agent / DeepEyes

Python 1,239 78 Updated Nov 20, 2025

ByteDance-Seed / Bagel

Open-source unified multimodal model

Python 6,037 536 Updated May 4, 2026

Sun-Haoyuan23 / Awesome-RL-based-Reasoning-MLLMs

This repository provides valuable reference for researchers in the field of multimodality, please start your exploratory travel in RL-based Reasoning MLLMs!

1,426 62 Updated May 11, 2026

hiyouga / EasyR1

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 5,025 372 Updated Apr 6, 2026

facebookresearch / perception_models

State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!

Jupyter Notebook 2,304 156 Updated Apr 13, 2026

om-ai-lab / VLM-R1

Solve Visual Understanding with Reinforced VLMs

Python 5,988 381 Updated Mar 12, 2026

Liuziyu77 / Visual-RFT

Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'

Jupyter Notebook 2,250 108 Updated Oct 29, 2025

StarsfieldAI / R1-V

Witness the aha moment of VLM with less than $3.

Python 4,060 283 Updated May 19, 2025

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 83,632 18,363 Updated Jun 23, 2026

daixiangzi / Awesome-Token-Compress

A paper list of recent work on token compression for ViT and VLM

926 42 Updated Jun 16, 2026

Nota-NetsPresso / BK-SDM

A Compressed Stable Diffusion for Efficient Text-to-Image Generation [ECCV'24]

Python 318 20 Updated Jul 6, 2024

Nota-NetsPresso / nota-wav2lip

A 28× Compressed Wav2Lip for Efficient Talking Face Generation [ICCV'23 Demo] [MLSys'23 Workshop] [NVIDIA GTC'23]

Python 60 6 Updated Mar 8, 2024

Nota-NetsPresso / shortened-llm

Compressed LLMs for Efficient Text Generation [ICLR'24 Workshop]

Python 90 13 Updated Sep 13, 2024

Nota-NetsPresso / PyNetsPresso

The official NetsPresso Python package.

Jupyter Notebook 48 1 Updated Nov 20, 2025

Nota-NetsPresso / netspresso-trainer

A library for training, compressing and deploying computer vision models (including ViT) with edge devices

Python 75 11 Updated Sep 29, 2025

nota-github / AIC2023_Track1_Nota

Repository for 2023 AI City Challenge (Track1: Multi-Camera People Tracking)

Python 38 6 Updated Oct 7, 2024

huggingface / transformers

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 161,832 33,576 Updated Jun 23, 2026

cmpark0126 / pytorch-polynomial-lr-decay

Polynomial Learning Rate Decay Scheduler for PyTorch

Python 65 13 Updated Dec 25, 2021