Stars
A high-performance API server that provides OpenAI-compatible endpoints for MLX models. Developed using Python and powered by the FastAPI framework, it provides an efficient, scalable, and user-fri…
Metal GPU implementation of the Qwen3 transformer model on macOS with complete Apple Silicon compute shader acceleration.
A workshop that teaches you how to build your own coding agent, similar to Roo Code, Cline, Amp, Cursor, Windsurf, or OpenCode.
A high-throughput and memory-efficient inference and serving engine for LLMs
ControlNet++: All-in-one ControlNet for image generation and editing!
Portable file server with accelerated resumable uploads, dedup, WebDAV, FTP, TFTP, zeroconf, media indexer, thumbnails++ all in one file, no deps
Run Orpheus 3B Locally With LM Studio
Self-host the powerful Chatterbox TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible), predefined voices, voice cloning, and large audiobook-scale…
Faster Whisper transcription with CTranslate2
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Whisper realtime streaming for long speech-to-text transcription and translation
No fortress, purely open ground. OpenManus is Coming.
A cert-manager webhook to perform DNS01 challenge through websupport DNS API
A simple screen-parsing tool toward a purely vision-based GUI agent
An open-source personal productivity platform built on Markdown, turbocharged with the scripting power of Lua
🤗 smolagents: a barebones library for agents that think in code.
Simple, unified interface to multiple Generative AI providers
Everything about the SmolLM and SmolVLM family of models
Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
Code for the paper "Language Models are Unsupervised Multitask Learners"
A collection of projects designed to help developers quickly get started with building deployable applications using the Claude API
Private & local AI personal knowledge management app for high entropy people.
Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
llama3 implementation one matrix multiplication at a time