- University of Chinese Academy of Sciences
- Beijing, China
Starred repositories
Examples and guides for using the OpenAI API
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
🔊 Text-Prompted Generative Audio Model
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
An introductory LLM tutorial for developers; the Chinese edition of Andrew Ng's large language model course series
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, Tensorflow. Exercises and Solutions to accompany Sutton's Book and David Silver's course.
Hung-yi Lee's Deep Learning Tutorial (recommended by Prof. Hung-yi Lee 👍, the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
A hands-on introduction to video technology: image, video, codec (av1, vp9, h265) and more (ffmpeg encoding). Translations: 🇺🇸 🇨🇳 🇯🇵 🇮🇹 🇰🇷 🇷🇺 🇧🇷 🇪🇸
A multi-voice TTS system trained with an emphasis on quality
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
High-Resolution Image Synthesis with Latent Diffusion Models
Chinese-language reinforcement learning tutorial (the "Mushroom Book" 🍄). Read online: https://datawhalechina.github.io/easy-rl/
Open-source introductory deep learning book with hands-on case studies, based on the TensorFlow 2.0 framework.
Chinese translation of wtfpython / ongoing 🚧... / my ability is limited, so help improving the translation is welcome
Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
QLoRA: Efficient Finetuning of Quantized LLMs
Code release for NeRF (Neural Radiance Fields)
FaceChain is a deep-learning toolchain for generating your Digital-Twin.
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
My blogs and code for machine learning. http://cnblogs.com/pinard
Zero-Shot Speech Editing and Text-to-Speech in the Wild
OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image genera…
Accepted as [NeurIPS 2024] Spotlight Presentation Paper
JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
SimCLRv2 - Big Self-Supervised Models are Strong Semi-Supervised Learners