Stars
Robust Speech Recognition via Large-Scale Weak Supervision
Real-time face swap and one-click video deepfake with only a single image
A high-throughput and memory-efficient inference and serving engine for LLMs
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
1 min of voice data can also be used to train a good TTS model! (few-shot voice cloning)
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
A generative speech model for daily dialogue.
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
Instant voice cloning by MIT and MyShell. Audio foundation model.
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
Full reference of LinkedIn answers 2024 for skill assessments (aws-lambda, rest-api, javascript, react, git, html, jquery, mongodb, java, Go, python, machine-learning, power-point) linkedin excel t…
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
Graph Neural Network Library for PyTorch
The open source developer platform to build AI/LLM applications and models with confidence. Enhance your AI applications with end-to-end tracking, observability, and evaluations, all in one integra…
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
The official Python SDK for Model Context Protocol servers and clients
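Use ChatGPT to summarize the arXiv papers. Accelerates the whole research workflow: use ChatGPT for full-paper summarization, professional translation, polishing, peer review, and review responses.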
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Generate audiobooks from e-books, voice cloning & 1107+ languages!
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"