timedomAIn
- Beijing
- seanweichat
Lists (28)
3d-rendering
unity or other 3D rendering related
AI_tricks
audio_framework
audio-generation
models for audio generation
bigData
blockchain
chatGPTxxx
dataset
DeepLearning—learning
dsp
game_framework
game_graphic
game_physics
image_generation
xxGAN, diffusion..
infra
interesting
large_model
MIR_ASR
ML_model deploy/optimization
music-generation
nlp
other_tools
server_dev
TTS_or_singing-sythesis
deep-learning papers for MIR, TTS for singing synthesis
ui_framework
vocoder
voice-conversion
webui
Stars
🔊 Text-Prompted Generative Audio Model
An introductory LLM tutorial for developers; the Chinese edition of Andrew Ng's large model course series
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
StableLM: Stability AI Language Models
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
High-Resolution Image Synthesis with Latent Diffusion Models
Chinese translation of wtfpython / ongoing 🚧... / my ability is limited, so help improving the translation is welcome
QLoRA: Efficient Finetuning of Quantized LLMs
Code release for NeRF (Neural Radiance Fields)
Official inference library for Mistral models
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Zero-Shot Speech Editing and Text-to-Speech in the Wild
Image restoration with neural networks but without learning.
Using Low-rank adaptation to quickly fine-tune diffusion models.
Tutorials on implementing a few sequence-to-sequence (seq2seq) models with PyTorch and TorchText.
Tacotron 2 - PyTorch implementation with faster-than-realtime inference
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
This repo contains the code for 1D tokenizer and generator
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
An implementation of WaveNet with fast generation
Using GPT to organize and access information, and generate questions. Long term goal is to make an agent-like research assistant.
PyTorch implementation of VQ-VAE by Aäron van den Oord et al.
How to use our public wav2vec2 dimensional emotion model