llm
Pocket Flow: 100-line LLM framework. Let Agents build Agents!
A course on LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.
Unified interface for interacting with various LLMs: hundreds of models, caching, fallback mechanisms, and enhanced reliability.
Flexible and powerful framework for managing multiple AI agents and handling complex conversations
The simplest, fastest repository for training/finetuning small-sized VLMs.
Official inference framework for 1-bit LLMs
Prompts for our Grok chat assistant and the `@grok` bot on X.
A lightweight, powerful framework for multi-agent workflows
Demo of a customer service use case implemented with the OpenAI Agents SDK
Tencent Hunyuan A13B (Hunyuan-A13B for short), an innovative, open-source LLM built on a fine-grained MoE architecture.
Kimi K2 is the large language model series developed by the Moonshot AI team
🤗 smolagents: a barebones library for agents that think in code.
Build Real-Time Knowledge Graphs for AI Agents
A Python framework that emulates Grok Heavy functionality using intelligent multi-agent orchestration. Deploy 4 (or more) specialized AI agents in parallel to deliver comprehensive, multi-perspecti…
Simple & Scalable Pretraining for Neural Architecture Research
An open-source AI agent that lives in your terminal.
This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025
Hierarchical Reasoning Model Official Release
AI memory OS for LLM and Agent systems (moltbot, clawdbot, openclaw), enabling persistent Skill memory for cross-task skill reuse and evolution.
[ICLR2026] Test-Time Scaling with Reflective Generative Model
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
AlgoTune is a NeurIPS 2025 benchmark made up of 154 math, physics, and computer science problems. The goal is to write code that solves each problem and is faster than existing implementations.
Implement a reasoning LLM in PyTorch from scratch, step by step
Checkpoint-engine is a simple middleware to update model weights in LLM inference engines
⚡ Python-free Rust inference server — OpenAI-API compatible. GGUF + SafeTensors, hot model swap, auto-discovery, single binary. FREE now, FREE forever.
The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.