❇️ LLM
Distribute and run LLMs with a single file.
🤯 LobeHub is your Chief Agent Operator, organizing your agents into 7×24 operations by hiring, scheduling, and reporting on your entire AI team.
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
An Application Framework for AI Engineering
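Provides a practical interactive interface for LLMs such as GPT/GLM, specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons & function plugins, project analysis & self-explanation for Python, C++, and other projects, PDF/LaTeX paper translation & summarization, parallel queries to multiple LLM models, and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…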
Stop renting your intelligence. Own it with AnythingLLM. Everything you need for a powerful local-first agent experience
💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows
This is a book on the Phi family of SLMs for getting started with Phi models. Phi is a family of open-source AI models developed by Microsoft. Phi models are the most capable and cost-effective small langua…
[ACL 2024] An Easy-to-use Instruction Processing Framework for LLMs.
Efficient platform for inference and serving local LLMs, including an OpenAI-compatible API server.
An open-source RAG-based tool for chatting with your documents.
A guidance language for controlling large language models.
A curated list of papers related to constrained decoding of LLMs, along with their relevant code and resources.
Code accompanying "How I learned to start worrying about prompt formatting".
Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org
SGLang is a high-performance serving framework for large language models and multimodal models.
A high-throughput and memory-efficient inference and serving engine for LLMs
⚡ Pure-Rust WebGPU inference engine — OpenAI-API compatible, GGUF native, runs on any GPU. No Python. No llama.cpp. Single binary.
A fast inference library for running LLMs locally on modern consumer-class GPUs
LMCache: Supercharge Your LLM with the Fastest KV Cache Layer
Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, load balancing, and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthr…
Lightweight Image Video Action Generation Inference Framework
TextGrad: Automatic "Differentiation" via Text, using large language models to backpropagate textual gradients. Published in Nature.