Starred repositories
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
A feature-rich command-line audio/video downloader
Python tool for converting files and office documents to Markdown.
Make websites accessible for AI agents. Automate tasks online with ease.
💫 Toolkit to help you get started with Spec-Driven Development
Interact with your documents using the power of GPT, 100% privately, no data leaks
Get your documents ready for gen AI
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Build and share delightful machine learning apps, all in Python. Star to support our work!
Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthr…
We write your reusable computer vision tools.
[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"
Code and documentation to train Stanford's Alpaca models, and generate the data.
PageIndex: Document Index for Vectorless, Reasoning-based RAG
SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]
Train transformer language models with reinforcement learning.
Wan: Open and Advanced Large-Scale Video Generative Models
Specification and documentation for Agent Skills
Knowledge Engine for AI Agent Memory in 6 lines of code
AG-UI: the Agent-User Interaction Protocol. Bring Agents into Frontend Applications.
Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
A command-line productivity tool powered by AI large language models like GPT-5, will help you accomplish your tasks faster and more efficiently.
A framework for building realtime voice AI agents 🤖🎙️📹
A research prototype of a human-centered web agent
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available at https://plachtaa.github.io/vallex/