Stars
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal tasks, for both inference and training.
verl: Volcano Engine Reinforcement Learning for LLMs
Build and run agents you can see, understand and trust.
ReMe: Memory Management Kit for Agents - Remember Me, Refine Me.
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL)
Letta is the platform for building stateful agents: AI with advanced memory that can learn and self-improve over time.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
The de facto GitHub star history graph.
AgentEvolver: Towards Efficient Self-Evolving Agent System
Retrieval and Retrieval-augmented LLMs
An easy-to-use Python framework to generate adversarial jailbreak prompts.
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
Official repo for GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
FlowLLM: Simplifying LLM-based HTTP/MCP Service Development
LLM-powered MCP server for building financial deep-research agents, integrating web search, Crawl4AI scraping, and entity extraction into composable analysis flows.
EvaLearn is a pioneering benchmark designed to evaluate large language models (LLMs) on their learning capability and efficiency in challenging tasks.
Qwen3 is the large language model series developed by the Qwen team, Alibaba Cloud.
DeepSeek Coder: Let the Code Write Itself
[Survey] A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
Official GitHub repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023]
Official Repo for Open-Reasoner-Zero
程序员延寿指南 | A programmer's guide to living longer
From Chain-of-Thought prompting to OpenAI o1 and DeepSeek-R1 🍓