View wangqinsi1's full-sized avatar

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.

Python 47,985 3,925 Updated Nov 6, 2025

Fully open reproduction of DeepSeek-R1

Python 25,614 2,401 Updated Sep 8, 2025

An LLM agent that conducts deep research (local and web) on any given topic and generates a long report with citations.

Python 24,053 3,178 Updated Oct 25, 2025

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 23,908 2,659 Updated Aug 12, 2024

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…

Python 10,917 947 Updated Nov 6, 2025

Solve Visual Understanding with Reinforced VLMs

Python 5,674 366 Updated Oct 21, 2025

Democratizing Reinforcement Learning for LLMs

Jupyter Notebook 4,679 440 Updated Nov 4, 2025

Witness the aha moment of VLM with less than $3.

Python 3,976 290 Updated May 19, 2025

One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks

Python 3,260 419 Updated Nov 6, 2025

[Up-to-date] Large Language Model Agent: A Survey on Methodology, Applications and Challenges

2,002 57 Updated Oct 10, 2025

[NeurIPS 2025] MMaDA - Open-Sourced Multimodal Large Diffusion Language Models

Python 1,471 71 Updated Oct 13, 2025

[ICLR 2025] Automated Design of Agentic Systems

Python 1,444 221 Updated Jan 28, 2025

[Survey] A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

1,286 78 Updated Oct 11, 2025

🐝 When Agent Meets RL and Prompt Optimization the First Time

Python 964 83 Updated Jan 3, 2025

A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning

Python 634 214 Updated Aug 30, 2021

Repository containing SoPs and other reference material for Graduate admission process.

280 31 Updated Mar 6, 2023

LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models

Python 153 9 Updated Sep 27, 2025
Python 147 11 Updated Feb 15, 2025

Game-RL: Synthesizing Multimodal Verifiable Game Data to Boost VLMs' General Reasoning

Python 103 2 Updated Oct 16, 2025

This is the official Python version of Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play.

Python 98 2 Updated Oct 21, 2025

ICML 2025 - Impossible Videos

Python 78 8 Updated Jul 23, 2025

This is the official Python version of Angles Don’t Lie: Unlocking Training-Efficient RL Through the Model’s Own Signals.

Python 77 9 Updated Sep 26, 2025

[NeurIPS'24] Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy

Python 72 5 Updated Jan 22, 2025

Official code implementation for the ICLR 2025 paper "Dobi-SVD: Differentiable SVD for LLM Compression and Some New Perspectives"

Python 47 6 Updated Oct 19, 2025
Python 41 1 Updated Nov 4, 2025

This is the official PyTorch implementation for NeurIPS 2023 MathNAS: If Blocks Have a Role in Mathematical Architecture Design.

Python 36 2 Updated Apr 10, 2024

[ICML 2025] Official repo for Stability-guided Adaptive Diffusion Acceleration. 🚀🌙 Accelerating off-the-shelf diffusion models with a unified stability criterion.

Python 32 4 Updated Jul 24, 2025

A GPU-based Incremental PCA implementation.

Python 30 6 Updated Feb 18, 2025

[NeurIPS'25] KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems

Python 29 4 Updated Nov 3, 2025
CSS 18 Updated Nov 5, 2025