UESTC PhD, TJU Master's
Starred repositories
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Robust Speech Recognition via Large-Scale Weak Supervision
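Chinese and English sensitive words, language detection, Chinese and international mobile/landline number location and carrier lookup, gender inference from names, mobile number extraction, ID card number extraction, email extraction, Chinese and Japanese name databases, Chinese abbreviation database, character-decomposition dictionary, word sentiment values, stop words, reactionary word list, violence and terrorism word list, Traditional/Simplified Chinese conversion, English approximation of Chinese pronunciation, Wang Feng lyrics generator, job title lexicon, synonym lexicon, antonym lexicon, negation word lexicon, car brand lexicon, car parts lexicon, segmentation of unspaced English text, various Chinese word vectors, company name list, classical poetry corpus, IT lexicon, finance lexicon, idiom lexicon, place name lexicon, …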
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
Clone a voice in 5 seconds to generate arbitrary speech in real-time
The world's simplest facial recognition api for Python and the command line
The simplest, fastest repository for training/finetuning medium-sized GPTs.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
A generative speech model for daily dialogue.
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
State-of-the-art 2D and 3D Face Analysis Project
Industry leading face manipulation platform
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
Code for the paper "Language Models are Unsupervised Multitask Learners"
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
Faster Whisper transcription with CTranslate2
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
A very simple framework for state-of-the-art Natural Language Processing (NLP)