Stars
Stable Diffusion web UI
Robust Speech Recognition via Large-Scale Weak Supervision
Real-time face swap and one-click video deepfake with only a single image
A high-throughput and memory-efficient inference and serving engine for LLMs
Models and examples built with TensorFlow
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
Clone a voice in 5 seconds to generate arbitrary speech in real-time
The simplest, fastest repository for training/finetuning medium-sized GPTs.
CLI platform to experiment with codegen. Precursor to: https://lovable.dev
🐈 nanobot: The Ultra-Lightweight Personal AI Agent
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
You like pytorch? You like micrograd? You love tinygrad! ❤️
A visual, example-driven guide to Claude Code — from basic concepts to advanced agents, with copy-paste templates that bring immediate value.
A generative world for general-purpose robotics & embodied AI learning.
Deezer source separation library including pretrained models.
Industry-leading face manipulation platform
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
Faster Whisper transcription with CTranslate2
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
State-of-the-Art Text Embeddings
SQL databases in Python, designed for simplicity, compatibility, and robustness.
Stable Diffusion with Core ML on Apple Silicon
Wan: Open and Advanced Large-Scale Video Generative Models
Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
End-to-End Object Detection with Transformers