Starred repositories
A high-throughput and memory-efficient inference and serving engine for LLMs
Repository for the Tetrad Project, www.phil.cmu.edu/tetrad.
A language agent gym with challenging scientific tasks
Framework enabling modular interchange of language agents, environments, and optimizers
A Python package for causal inference in quasi-experimental settings
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
verl: Volcano Engine Reinforcement Learning for LLMs
📖 This is a repository for organizing papers, codes, and other resources related to Latent Reasoning.
Democratizing Reinforcement Learning for LLMs
A curated list of Awesome-LLM-Ensemble papers for the survey "Harnessing Multiple Large Language Models: A Survey on LLM Ensemble"
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
Training Sparse Autoencoders on Language Models
Now, Stronger AI Pushes Frontiers, Stronger Our Shared Future.
Python package for Causal Discovery by learning the graphical structure of Bayesian networks. Structure Learning, Parameter Learning, Inferences, Sampling methods.
[TMLR 2025] Efficient Reasoning Models: A Survey
The official implementation of "ML-Master: Towards AI-for-AI via Integration of Exploration and Reasoning"
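Welcome to LLM-Dojo, an open-source place to learn about large models. It uses concise, readable code to build a model training framework (supporting mainstream models such as Qwen, Llama, GLM, and others), an RLHF framework (DPO/CPO/KTO/PPO), and other features. 👩🎓👨🎓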
Verify the precision of all Kimi K2 API vendors
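High-quality, stable API access to OpenAI, Gemini, Claude, and others, for enterprises and developers. An OpenAI API proxy that supports ChatGPT API calls, the official Anthropic Claude interface format, and the official Google Gemini interface format. Supports gpt-5 and sora. No OpenAI key, no purchased OpenAI account, and no US-dollar bank card are needed; just call the API directly. Stable and easy to use! 智增增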
A Comprehensive Survey on World Models for Embodied AI
[ICLR'25] OpenRCA: Can Large Language Models Locate the Root Cause of Software Failures?
[TMLR 2025] Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
Obsidian tars plugin that supports text generation based on tag suggestions, using services like DeepSeek, Claude, OpenAI, OpenRouter, SiliconFlow, Gemini, Ollama, Kimi, Doubao, Qwen, Zhipu, QianFa…
Official implementation of X-Master, a general-purpose tool-augmented reasoning agent.
Pretraining and inference code for a large-scale depth-recurrent language model
Official repo for paper: "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't"
Stanford NLP Python library for understanding and improving PyTorch models via interventions