- Tsinghua University
- Beijing, China
- https://shenzhi-wang.netlify.app/
- @ShenzhiWang_THU
- https://huggingface.co/shenzhi-wang
Stars
IMG: Calibrating Diffusion Models via Implicit Multimodal Guidance, ICCV 2025
Code for "Variational Reasoning for Language Models"
Expert Kit is an efficient foundation for Expert Parallelism (EP) for MoE model inference on heterogeneous hardware
Official Repository of Absolute Zero Reasoner
Trinity-RFT is a general-purpose, flexible, and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (LLMs).
QwQ is the reasoning model series developed by the Qwen team at Alibaba Cloud.
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
Official Repo for Open-Reasoner-Zero
New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos
A lightweight tool for evaluating LLMs in rule-based ways.
verl: Volcano Engine Reinforcement Learning for LLMs
Code for paper: Nullu: Mitigating Object Hallucinations in Large Vision-Language Models via HalluSpace Projection
Official repository of Uni-AdaFocus (TPAMI 2024).
This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use" (ACL 2025 Oral).
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
A flexible and efficient training framework for large-scale alignment tasks
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
The official source code for "Boosting LLM Agents with Recursive Contemplation for Effective Deception Handling" (ACL 2024, Findings)
Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using Llama mode…
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
fanshiqing / grouped_gemm
Forked from tgale96/grouped_gemm. PyTorch bindings for CUTLASS grouped GEMM.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal models, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
GLM-4 series: Open Multilingual Multimodal Chat LMs (open-source multilingual multimodal dialogue models)
🤗 Transformers: the model-definition framework for state-of-the-art machine learning in text, vision, audio, and multimodal models, for both inference and training.