Starred repositories
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Python library for audio and music analysis
A lightweight library for converting complex objects to and from simple Python datatypes.
Chinese LLaMA-2 & Alpaca-2 LLMs, phase two of the project, with 64K long-context models
Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).
Python library for processing Chinese text
pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and Qwen2.5 to error-correction scenarios and works out of the box.
A large-scale 7B pretraining language model developed by BaiChuan-Inc.
WebRTC and ORTC implementation for Python using asyncio
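text2vec, text to vector. A text vector representation tool that converts text into vector matrices. It implements text representation and text similarity models such as Word2Vec, RankBM25, Sentence-BERT, and CoSENT, and works out of the box.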
GeneFace: Generalized and High-Fidelity 3D Talking Face Synthesis; ICLR 2023; Official code
This library provides common speech features for ASR including MFCCs and filterbank energies.
A framework for detecting, highlighting and correcting grammatical errors on natural language text. Created by Prithiviraj Damodaran. Open to pull requests and other forms of collaboration.
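A Python interface built on the WeChat PC client that developers can call easily from Python. Supports WeChat bots, group management, and other powerful features. 3.9.10.19, x64, WeChat hook, WeChat API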
SEED-Story: Multimodal Long Story Generation with Large Language Model
An API wrapper for Discord written in Python.
Easy and efficient fine-tuning of LLMs (supports LLama, LLama2, LLama3, Qwen, Baichuan, GLM, Falcon). Efficient quantized training and deployment of large models.
A free Python grammar checker 📝✅
CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation
🐍 Minos is a framework which helps you create reactive microservices in Python
[ICML 2025] Official PyTorch implementation of LongVU
Toutiao Chinese news (text) classification dataset
Syntax-highlighting, declarative and composable pretty printer for Python 3.5+