Stars
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Unofficial vits2-TTS implementation in PyTorch
A Python package to build AI-powered real-time audio applications
Implementation of the paper "SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody"
Sophisticated, battery-conscious background-geolocation with motion-detection
Edit anything in images powered by segment-anything, ControlNet, StableDiffusion, etc. (ACM MM)
Magic Copy is a Chrome extension that uses Meta's Segment Anything Model to extract a foreground object from an image and copy it to the clipboard.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Transcription, forced alignment, and audio indexing with OpenAI's Whisper
🦘 Explore multimedia datasets at scale
ML pipeline orchestration and model deployments on Kubernetes.
Graph Neural Network Library for PyTorch
3D ResNets for Action Recognition (CVPR 2018)
Modified version of react-search-input (https://github.com/enkidevs/react-search-input) to work with react-native.
A collection of extensions and data-loaders for few-shot learning & meta-learning in PyTorch
A React component for playing a variety of URLs, including file paths, YouTube, Facebook, Twitch, SoundCloud, Streamable, Vimeo, Wistia and DailyMotion
Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
Make huge neural nets fit in memory
A Python module to decode video frames directly, using the FFmpeg C API.
Extract TVL1 optical flows in Python (multi-process and multi-server)
A web video player built for the HTML5 world using the React library.
A Python implementation of Counterfactual Regret Minimization for poker
Decoupled Neural Interfaces using Synthetic Gradients for PyTorch
Tensorlang, a differentiable programming language based on TensorFlow
A TensorFlow implementation of "Deep Convolutional Generative Adversarial Networks"
Unsupervised Word Segmentation for Neural Machine Translation and Text Generation