Stars
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
🦜🔗 The platform for reliable agents.
Robust Speech Recognition via Large-Scale Weak Supervision
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and can retrieve information dynamically to do so.
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Generative Models by Stability AI
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
Official inference framework for 1-bit LLMs
GUI for a Vocal Remover that uses Deep Neural Networks.
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
🚀 The fast, Pythonic way to build MCP servers and clients
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
The official Python SDK for Model Context Protocol servers and clients
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
A TTS model capable of generating ultra-realistic dialogue in one pass.
State-of-the-Art Text Embeddings
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, try Sync Labs.
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
Hackable and optimized Transformers building blocks, supporting a composable construction.
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…