Stars
A latent text-to-image diffusion model
An introductory LLM tutorial for developers: the Chinese edition of Andrew Ng's large language model course series
A High-Quality Real Time Upscaler for Anime Video
Instruct-tune LLaMA on consumer hardware
Dive into LLMs: a series of hands-on programming tutorials
📡 Simple and ready-to-use tutorials for TensorFlow
High-Resolution Image Synthesis with Latent Diffusion Models
PyTorch tutorials and fun projects including neural talk, neural style, poem writing, anime generation (from the book "Deep Learning Framework PyTorch: Introduction and Practice")
QLoRA: Efficient Finetuning of Quantized LLMs
Convert AI papers to GUIs, making it simple and convenient for everyone to use cutting-edge artificial intelligence technology.
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Zero-Shot Speech Editing and Text-to-Speech in the Wild
Inpaint anything using Segment Anything and inpainting models.
COCO API - Dataset @ http://cocodataset.org/
Tacotron 2 - PyTorch implementation with faster-than-realtime inference
Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.
Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) fo…
Convert TensorFlow, Keras, TensorFlow.js, and TFLite models to ONNX
From a nobody to a large language model (LLM) hero. Stay tuned for updates!
[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching
TensorFlow implementation
An implementation of WaveNet with fast generation
Recurrent Neural Network Tutorial, Part 2 - Implementing a RNN in Python and Theano
World's first open-source real-time end-to-end spoken dialogue model with personalized voice cloning.
🔦 A PyTorch implementation of Google Brain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition