UESTC PhD, TJU Master's
Starred repositories
Curated list of project-based tutorials
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
"Hello Algo": a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart code. The Simplified and Traditional Chinese editions are updated in sync; English version in translation
Virtual whiteboard for sketching hand-drawn like diagrams
Robust Speech Recognition via Large-Scale Weak Supervision
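Chinese and English sensitive-word lists, language detection, Chinese and international mobile/landline number location and carrier lookup, gender inference from names, phone number extraction, ID card number extraction, email extraction, Chinese and Japanese personal-name databases, Chinese abbreviation database, character-decomposition dictionary, word sentiment values, stop words, reactionary word list, violence and terrorism word list, Traditional/Simplified Chinese conversion, English imitation of Chinese pronunciation, Wang Feng lyrics generator, occupation-name lexicon, synonym lexicon, antonym lexicon, negation-word lexicon, car-brand lexicon, car-parts lexicon, continuous English text segmentation, various Chinese word vectors, comprehensive company-name list, classical poetry lexicon, IT lexicon, finance lexicon, idiom lexicon, place-name lexicon, …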
A latent text-to-image diffusion model
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
Clone a voice in 5 seconds to generate arbitrary speech in real-time
The world's simplest facial recognition api for Python and the command line
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
A generative speech model for daily dialogue.
Deep Learning Book Chinese Translation
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Cross-platform, customizable ML solutions for live and streaming media.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
State-of-the-art 2D and 3D Face Analysis Project
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
Industry leading face manipulation platform
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
Code for the paper "Language Models are Unsupervised Multitask Learners"
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities