Stars
The repo is finally unlocked. Enjoy the party! The fastest repo in history to surpass 100K stars ⭐. Join Discord: https://discord.gg/5TUQKqFWd Built in Rust using oh-my-codex.
Vietnamese TTS with instant voice cloning • On-device • Real-time CPU inference • 24 kHz audio quality • Converts text to Vietnamese speech
Native macOS app for Qwen3-TTS with custom voices, voice design, and voice cloning, 100% offline on Apple Silicon
A simple, pretty terminal tool that lets you search and download files from GitHub without leaving your CLI.
Go HTTP client with browser-identical TLS/HTTP2 fingerprinting. Bypass bot detection by perfectly mimicking Chrome, Firefox, and Safari at the cryptographic level (JA3/JA4, Akamai fingerprint, head…
Query system for the Chinese Proficiency Grading Standards for International Chinese Language Education (《国际中文教育中文水平等级标准》), New HSK Levels 2021
Fast inference engine for Transformer models
NiuTrans.SMT is an open-source statistical machine translation system developed by a joint team from NLP Lab. at Northeastern University and the NiuTrans Team. The NiuTrans system is fully develope…
Code and data for the COLING 2020 paper "Try to Substitute: An Unsupervised Chinese Word Sense Disambiguation Method Based on HowNet"
Core Data of HowNet and OpenHowNet Python API
Code for Chinese LIWC Lexicon Expansion via Hierarchical Classification of Word Embeddings with Sememe Attention (AAAI18)
A curated list of resources for Chinese natural language processing (NLP)
Code for NeurIPS 2019 - Glyce: Glyph-vectors for Chinese Character Representations
A collection of datasets for named entity recognition (NER), Chinese word segmentation (CWS), entity typing, relation extraction, and related tasks
Chinese and Chinese-English dictionaries (中文词典 / 中文詞典)
Chinese word segmentation, part-of-speech tagging, and medical named entity recognition from scratch.
Some useful small Chinese corpus datasets
Natural Language Processing for the next decade. Tokenization, Part-of-Speech Tagging, Named Entity Recognition, Syntactic & Semantic Dependency Parsing, Document Classification
Python binding for the curl-impersonate fork via cffi. An HTTP client that can impersonate browser TLS/JA3/HTTP2 fingerprints.
🎭 Intelligent browser header & fingerprint generator
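Chinese personal-name corpus and name generator. Covers Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, and English names. Useful for Chinese word segmentation and person-name entity recognition.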
Curating a collection of Mandarin Chinese vocabulary, idioms (成语), and characters (汉字). HSK 3.0, RSH, and other frequency lists.