Stars
Use any LLM (Large Language Model) for Deep Research. Supports an SSE API and an MCP server.
🧠 Roo Code Memory Bank: Seamless project context in Roo Code. No more repetition, just continuous development!
Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.
Experimental code: audio file preprocessing to optimize Whisper transcriptions and avoid hallucinated text
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
A fast inference library for running LLMs locally on modern consumer-class GPUs
[COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath
Run inference on the replit-3B code-instruct model using CPU
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
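A minimal sketch of the typical GPTQ quantization flow with this package (AutoGPTQ); the model name, calibration text, and config values are illustrative, and exact signatures may vary across versions:

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

pretrained_dir = "facebook/opt-125m"        # illustrative small model
quantized_dir = "opt-125m-4bit-gptq"        # where to save the quantized weights

tokenizer = AutoTokenizer.from_pretrained(pretrained_dir, use_fast=True)

# A handful of calibration examples drives the GPTQ weight quantization.
examples = [tokenizer("auto-gptq quantizes LLMs with a few calibration samples.")]

quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)

model = AutoGPTQForCausalLM.from_pretrained(pretrained_dir, quantize_config)
model.quantize(examples)                    # run GPTQ on the calibration data
model.save_quantized(quantized_dir)         # reload later with .from_quantized(...)
```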
Official Code for Paper: RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text
QLoRA: Efficient Finetuning of Quantized LLMs
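QLoRA fine-tunes LoRA adapters on top of a 4-bit (NF4) quantized base model; a minimal sketch using the transformers/bitsandbytes/peft stack, with an illustrative model name and LoRA hyperparameters:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the base model in 4-bit NF4 with double quantization (the QLoRA recipe).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b",                  # illustrative base model
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Attach trainable LoRA adapters; only these small matrices are updated.
lora_config = LoraConfig(
    r=64, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],    # illustrative target layers
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()          # adapters are a tiny fraction of the weights
```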
⛓️ Serve LangChain LLM apps and agents automagically with FastAPI. LLMOps
📋 A list of open LLMs available for commercial use.
Inference code and configs for the ReplitLM model family
A Bulletproof Way to Generate Structured JSON from Language Models
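The idea is to let the model fill in only the value slots of a JSON schema so the output always parses; a minimal sketch with jsonformer, using an illustrative model and schema:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from jsonformer import Jsonformer

model_name = "databricks/dolly-v2-3b"       # illustrative model
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Only the values are generated; keys, braces, and quotes come from the schema.
json_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number"},
        "is_student": {"type": "boolean"},
    },
}

jsonformer = Jsonformer(model, tokenizer, json_schema, "Generate a person's profile:")
print(jsonformer())                         # always valid JSON matching the schema
```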
Build, personalize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our discord community: https://discord.gg/T…
Faster Whisper transcription with CTranslate2
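A minimal sketch of a faster-whisper transcription call; the model size, device, and compute type are illustrative choices:

```python
from faster_whisper import WhisperModel

# CTranslate2 backend; int8 keeps memory use low on CPU.
model = WhisperModel("small", device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.wav", beam_size=5)
print(f"Detected language: {info.language} (p={info.language_probability:.2f})")
for segment in segments:                    # segments is a lazy generator
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```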
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Open Academic Research on Improving LLaMA to SOTA LLM