Stars
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal domains, for both inference and training.
Robust Speech Recognition via Large-Scale Weak Supervision
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
No fortress, purely open ground. OpenManus is Coming.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
ChatGLM-6B: An Open Bilingual Dialogue Language Model
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
🔊 Text-Prompted Generative Audio Model
A generative speech model for daily dialogue.
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
Open-Sora: Democratizing Efficient Video Production for All
"A Beginner's Guide to Open-Source LLMs": a tutorial tailor-made for Chinese users on quickly fine-tuning (full-parameter/LoRA) and deploying open-source large language models (LLMs) and multimodal large models (MLLMs), both domestic and international, in a Linux environment
Fully open reproduction of DeepSeek-R1
A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
verl: Volcano Engine Reinforcement Learning for LLMs
✨✨Latest Advances on Multimodal Large Language Models
Train transformer language models with reinforcement learning.
<⚡️> SuperAGI - A dev-first open source autonomous AI agent framework. Enabling developers to build, manage & run useful autonomous agents quickly and reliably.
[🔥updating ...] AI-powered automated quantitative trading bot (fully local deployment). AI-powered Quantitative Investment Research Platform. 📃 online docs: https://ufund-me.github.io/Qbot ✨ news: qbot-mini: https://github.com/Charmve/iQuant
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
A multi-voice TTS system trained with an emphasis on quality
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …
FlashMLA: Efficient Multi-head Latent Attention Kernels
Primarily documents knowledge and interview questions relevant to large language model (LLM) algorithm (application) engineers
The official GitHub page for the survey paper "A Survey of Large Language Models".
Foundational Models for State-of-the-Art Speech and Text Translation