[NeurIPS'25] KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems

Python 21 3 Updated Nov 3, 2025

[NeurIPS'25] KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems

Python 8 1 Updated Nov 1, 2025
Python 27 1 Updated Nov 4, 2025

Game-RL: Synthesizing Multimodal Verifiable Game Data to Boost VLMs' General Reasoning

Python 103 2 Updated Oct 16, 2025

🐝 When Agent Meets RL and Prompt Optimization the First Time

Python 963 82 Updated Jan 3, 2025

[Survey] A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

1,279 78 Updated Oct 11, 2025

This is the official Python version of Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play.

Python 97 2 Updated Oct 21, 2025

[Up-to-date] Large Language Model Agent: A Survey on Methodology, Applications and Challenges

1,998 56 Updated Oct 10, 2025

[ICML 2025] Official Repo for Stability-guided Adaptive Diffusion Acceleration. 🚀🌙 Accelerating off-the-shelf diffusion models with a unified stability criterion.

Python 32 4 Updated Jul 24, 2025

A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning

Python 634 214 Updated Aug 30, 2021

HippoMM: Hippocampal-inspired Multimodal Memory

Python 13 Updated May 22, 2025

One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks

Python 3,256 419 Updated Nov 3, 2025

ICML 2025 - Impossible Videos

Python 78 8 Updated Jul 23, 2025

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…

Python 10,897 946 Updated Nov 5, 2025

Solve Visual Understanding with Reinforced VLMs

Python 5,673 366 Updated Oct 21, 2025

This is the official Python version of Angles Don’t Lie: Unlocking Training-Efficient RL Through the Model’s Own Signals.

Python 77 9 Updated Sep 26, 2025

This is the official PyTorch implementation for 2025-ICML-CoreMatching: Co-adaptive Sparse Inference Framework for Comprehensive Acceleration of Vision Language Model

Python 12 2 Updated May 27, 2025

[NeurIPS 2025] MMaDA - Open-Sourced Multimodal Large Diffusion Language Models

Python 1,470 71 Updated Oct 13, 2025

Democratizing Reinforcement Learning for LLMs

Jupyter Notebook 4,669 440 Updated Nov 4, 2025

An LLM agent that conducts deep research (local and web) on any given topic and generates a long report with citations.

Python 24,036 3,174 Updated Oct 25, 2025

Witness the aha moment of VLM with less than $3.

Python 3,975 290 Updated May 19, 2025

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.

Python 47,929 3,918 Updated Nov 5, 2025

Fully open reproduction of DeepSeek-R1

Python 25,613 2,401 Updated Sep 8, 2025

Advanced implementation of DeepSeek-R1 featuring Group Relative Policy Optimization (GRPO) for mathematical reasoning AI. Integrates safe distillation, modular reward systems, and efficient LoRA fi…

Python 13 3 Updated Jan 29, 2025

Official code implementation for the ICLR 2025 accepted paper "Dobi-SVD: Differentiable SVD for LLM Compression and Some New Perspectives"

Python 47 6 Updated Oct 19, 2025

"Knock, knock!" "Who's there?" "Dobi."

HTML 17 1 Updated Aug 11, 2025

LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models

Python 153 9 Updated Sep 27, 2025

[NeurIPS'24] Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy

Python 72 5 Updated Jan 22, 2025

This is the official Python version of CoreInfer: Accelerating Large Language Model Inference with Semantics-Inspired Adaptive Sparse Activation.

Jupyter Notebook 17 2 Updated Oct 25, 2024

Repository for latent Bayesian Kernel Inference

Python 7 1 Updated Apr 1, 2025