Starred repositories (MXuer)

Open-source, self-hosted note-taking tool built for quick capture. Markdown-native, lightweight, and fully yours.

Go 60,714 4,465 Updated Jun 12, 2026
Python 501 35 Updated Jun 12, 2026

Interactive 3D cell architecture gallery built with React and Three.js

HTML 1,307 270 Updated Jun 6, 2026

A Next.js web application that integrates AI capabilities with draw.io diagrams. This app allows you to create, modify, and enhance diagrams through natural language commands and AI-assisted visual…

TypeScript 31,889 3,330 Updated Jun 12, 2026

Uses MMS for audio-transcript alignment

Python 10 Updated May 29, 2023

Dolphin is a multilingual, multitask ASR model jointly trained by DataoceanAI and Tsinghua University.

Python 756 65 Updated Jun 11, 2026

Python tool for converting files and office documents to Markdown.

Python 152,456 10,544 Updated May 26, 2026

High-Quality Voice Cloning TTS for 600+ Languages

Python 7,384 1,156 Updated Jun 11, 2026

Official code for "Habibi: Laying the Open-Source Foundation of Unified-Dialectal Arabic Speech Synthesis"

Python 326 34 Updated Mar 30, 2026

Flexible audio loudness meter in Python with implementation of ITU-R BS.1770-4 loudness algorithm

Python 773 60 Updated Jan 4, 2026

A high-quality rapid TTS voice cloning model that reaches speeds of 150x realtime.

Python 4,177 537 Updated Jun 5, 2026

End-to-end speech recognition large model: 31 languages, dialects, accents, lyrics, hotwords, timestamps, speaker diarization. Trained on tens of millions of hours.

Python 1,267 125 Updated Jun 12, 2026

Patterns and resources of low latency programming.

1,228 67 Updated Jul 30, 2025

Opencpop: A High-Quality Open Source Chinese Popular Song Database for Singing Voice Synthesis

235 11 Updated Dec 10, 2025

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Python 2,832 254 Updated Dec 30, 2025

We Speech Toolkit, LLM-based Speech Toolkit for Speech Understanding, Generation, and Interaction

Python 207 17 Updated Jun 11, 2026

SoulX-Podcast is an inference codebase by the Soul AI team for generating high-fidelity podcasts from text.

Python 3,435 440 Updated Dec 11, 2025

Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation.

Python 1,460 107 Updated Mar 16, 2026

The official implementation of CATT Arabic diacritization models.

Python 75 11 Updated Jul 18, 2025

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

Python 21,108 2,608 Updated Jun 12, 2026
Python 97 21 Updated Jul 21, 2025

Convert PDF to markdown + JSON quickly with high accuracy

Python 36,055 2,491 Updated Jun 6, 2026

A Survey of Spoken Dialogue Models (60 pages)

317 18 Updated Nov 28, 2024

GLM-4-Voice | End-to-end Chinese-English spoken dialogue model

Python 3,187 281 Updated Dec 5, 2024

A Python package to analyze and compare voices with deep learning

Python 3,267 484 Updated Oct 12, 2023

Front-end practice project

Vue 1 Updated Jan 18, 2024

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Jupyter Notebook 5,562 503 Updated Feb 23, 2026

A generative speech model for daily dialogue.

Python 39,441 4,241 Updated Apr 10, 2026