MaxMax2016


Computer Vision, Speech Separation, Speech Synthesis, LLMs

392 followers · 946 following

UESTC
ChengDu, China

Lists (14)


Stars

deedy5 / ddgs

DDGS | Dux Distributed Global Search. A metasearch library that aggregates results from diverse web search services

Python 2,148 214 Updated Dec 19, 2025

ace-step / ACE-Step-1.5

The most powerful local music generation model that outperforms most commercial alternatives

Python 4,510 462 Updated Feb 7, 2026

Qwen3-ASR is an open-source series of ASR models developed by the Qwen team at Alibaba Cloud, supporting stable multilingual speech/music/song recognition, language detection and timestamp prediction.

Python 1,347 105 Updated Jan 30, 2026

iflytek / DeepResearch

A deep research framework based on progressive search and cross-evaluation.

Python 52 16 Updated Oct 27, 2025

anomalyco / opencode

The open source coding agent.

TypeScript 99,692 9,497 Updated Feb 7, 2026

openclaw / openclaw

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 173,501 28,217 Updated Feb 7, 2026

QwenLM / Qwen3-TTS

Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice…

Python 7,103 874 Updated Feb 6, 2026

NVIDIA / personaplex

PersonaPlex code.

Python 4,798 696 Updated Jan 24, 2026

CarlWangChina / SaMoye-SVC

dog-can-sing-song

Python 51 5 Updated Jan 9, 2026

kyutai-labs / pocket-tts

A TTS that fits in your CPU (and pocket)

Python 3,066 343 Updated Feb 3, 2026

ysharma3501 / NovaSR

A lightning fast audio upsampler.

Python 703 58 Updated Feb 2, 2026

HKUDS / DeepTutor

"DeepTutor: AI-Powered Personalized Learning Assistant"

Python 10,103 1,345 Updated Feb 7, 2026

Simpleyyt / ai-manus

AI Manus is a general-purpose AI Agent system that supports running various tools and operations in a sandbox environment.

Python 1,427 349 Updated Dec 9, 2025

sarwarbeing-ai / Agentic_Design_Patterns

Agentic Design Patterns: A Hands-On Guide to Building Intelligent Systems by Antonio Gulli

Jupyter Notebook 9,336 1,684 Updated Sep 7, 2025

bfs18 / armel

poorman's ar-dit tts

Python 45 6 Updated Dec 31, 2025

ysharma3501 / MiraTTS

A high quality and fast TTS repository

Python 499 42 Updated Dec 22, 2025

cjpais / Handy

A free, open source, and extensible speech-to-text application that works completely offline.

TypeScript 14,354 971 Updated Feb 6, 2026

k2-fsa / Flow2GAN

Hybrid Flow Matching and GAN with Multi-Resolution Network for Few-Step High-Fidelity Audio Generation

Python 134 8 Updated Jan 21, 2026

amazon-far / TWIST2

[arXiv 2025] TWIST2: Scalable, Portable, and Holistic Humanoid Data Collection System

Python 643 60 Updated Dec 3, 2025

sh-lee-prml / PeriodWave

The official Implementation of PeriodWave and PeriodWave-Turbo

Python 217 17 Updated Apr 14, 2025

hugohe3 / ppt-master

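AI-driven SVG presentation generation system supporting multiple formats including PPT, Xiaohongshu, and WeChat Moments | 15 examples | 229 pages | generates editable PPT files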

Python 1,686 235 Updated Feb 7, 2026

sizigi / AliasingFreeNeuralAudioSynthesis

Python 45 5 Updated Dec 24, 2025

arcosoph / nanowakeword

A lightweight, open-source, and intelligent wake word detection engine. Train custom, high-accuracy models with minimal effort.

Python 41 7 Updated Feb 4, 2026

NexaAI / nexa-sdk

Run frontier LLMs and VLMs with day-0 model support across GPU, NPU, and CPU, with comprehensive runtime coverage for PC (Python/C++), mobile (Android & iOS), and Linux/IoT (Arm64 & x86 Docker). Su…

Kotlin 7,672 949 Updated Feb 6, 2026

ModelTC / LightX2V

Light Image Video Generation Inference Framework

Python 1,927 157 Updated Feb 6, 2026

Aratako / T5Gemma-TTS

Multilingual TTS model with voice cloning and duration control, based on T5Gemma encoder-decoder LLM

Python 271 29 Updated Dec 23, 2025

TEN-framework / ten-framework

Open-source framework for conversational voice AI agents

Python 9,847 1,172 Updated Feb 7, 2026

facebookresearch / dacvae

DACVAE

Python 191 15 Updated Dec 22, 2025

Marvis-Labs / marvis-tts

Python 346 21 Updated Aug 28, 2025

Choddeok / DiEmo-TTS

[INTERSPEECH 2025] The official implementation of DiEmo-TTS: Disentangled Emotion Representations via Self-Supervised Distillation for Cross-Speaker Emotion Transfer in Text-to-Speech

Python 16 3 Updated Sep 7, 2025


Lists (14)

Android

Audio-C++

Audio-Common

Audio-Diffusion

Audio-LLM

Automatic Drive

books

Image-Vision

LLM-GPT

tools

TTS-Acoustic

TTS-Front

TTS-Vocoder

Voice-Change
