Starred repositories
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal tasks, for both inference and training.
Real-time face swap and one-click video deepfake with only a single image
The definitive Web UI for local AI, with powerful features and easy setup.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Instant voice cloning by MIT and MyShell. Audio foundation model.
You like pytorch? You like micrograd? You love tinygrad! ❤️
Convert PDF to markdown + JSON quickly with high accuracy
Generative Models by Stability AI
Official inference repo for FLUX.1 models
Code for the paper "Language Models are Unsupervised Multitask Learners"
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
OCR, layout analysis, reading order, table recognition in 90+ languages
SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]
so-vits-svc fork with realtime support, improved interface and more features.
Model parallel transformers in JAX and Haiku
The Enterprise-Grade Production-Ready Multi-Agent Orchestration Framework. Website: https://swarms.ai
Google Drive Public File Downloader when Curl/Wget Fails
This patch removes the restriction on the maximum number of simultaneous NVENC video encoding sessions imposed by Nvidia on consumer-grade GPUs.
Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts
A Python package to analyze and compare voices with deep learning
HAAS = Hierarchical Autonomous Agent Swarm - "Resistance is futile!"
Document (PDF, Word, PPTX ...) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured …
Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a ca…