Stars
Stable Diffusion web UI
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal domains, for both inference and training.
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
One minute of voice data is enough to train a good TTS model (few-shot voice cloning).
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
aider is AI pair programming in your terminal
A generative speech model for daily dialogue.
The largest collection of PyTorch image encoders / backbones, including train, eval, inference, and export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
Instant voice cloning by MIT and MyShell. Audio foundation model.
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Python SDK and Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, load balancing, and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthr…
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
You like pytorch? You like micrograd? You love tinygrad! ❤️
DSPy: The framework for programming—not prompting—language models
A generative world for general-purpose robotics & embodied AI learning.
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
Fully open reproduction of DeepSeek-R1
Official inference repo for FLUX.1 models
Code for the paper "Language Models are Unsupervised Multitask Learners"
🤗 smolagents: a barebones library for agents that think in code.
Official inference framework for 1-bit LLMs