Stars
[NAACL 2024] Self-Demos: Eliciting Out-of-Demonstration Generalizability in Large Language Models
VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications
Code and implementations for the ACL 2025 paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhiheng Xi et al.
SkyRL: A Modular Full-stack RL Library for LLMs
Search-R1: An efficient, scalable RL training framework for LLMs that interleave reasoning with search-engine calls, built on veRL
Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.
Train your Agent model via our easy and efficient framework
verl-agent is an extension of veRL for training LLM/VLM agents via RL; it is also the official code for the paper "Group-in-Group Policy Optimization for LLM Agent Training"
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
Code & Dataset for Paper: "Better Process Supervision with Bi-directional Rewarding Signals"
Recommends new arXiv papers matching your interests daily, based on your Zotero library.
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
Recipes to train reward models for RLHF.
Get your documents ready for gen AI
Awesome-Paper-list: Visualization meets LLM
[EMNLP 2025] Distill Visual Chart Reasoning Ability from LLMs to MLLMs
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
A Next-Generation Training Engine Built for Ultra-Large MoE Models
Use PEFT or full-parameter training for CPT/SFT/DPO/GRPO on 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
A generative speech model for daily dialogue.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
✨✨Latest Advances on Multimodal Large Language Models
[EMNLP'24] LongHeads: Multi-Head Attention is Secretly a Long Context Processor