Stars
Official Implementation for RAEv2: Improved Baselines with Representation Autoencoders
Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice…
High-Quality Voice Cloning TTS for 600+ Languages
Open Multi-Agent Interactive Classroom — Get an immersive, multi-agent learning experience in just one click
Hybrid Flow Matching and GAN with Multi-Resolution Network for Few-Step High-Fidelity Audio Generation
Object-oriented handling of audio data, with GPU-powered augmentations, and more.
NVIDIA Isaac Sim™ is an open-source application on NVIDIA Omniverse for developing, simulating, and testing AI-driven robots in realistic virtual environments.
Escape from Duckov multiplayer mod (by Mr.sans & InitLoader)
A Pocket-Sized MLLM for Ultra-Efficient Image and Video Understanding on Your Phone
SkiaSharp is a cross-platform 2D graphics API for .NET platforms based on Google's Skia Graphics Library. It provides a comprehensive 2D API that can be used across mobile, server and desktop model…
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
An extremely fast Python package and project manager, written in Rust.
FSA/FST algorithms, differentiable, with PyTorch compatibility.
A self-control web app based on CTDP theory
A demo project for starters to learn website coding
Multilingual large voice generation model with full-stack support for inference, training, and deployment.
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation (NeurIPS 2025)
Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching
An easy way to create your own knowledge base! Notemd enhances your Obsidian workflow by integrating with various Large Language Models (LLMs) to process your notes, automatically generate wiki-link…
This is the COST2100 channel model, a MATLAB implementation of a spatially consistent radio channel model for MIMO and Massive MIMO communication. Originally developed within COST 2100 (http://www.…
Discrete-time Signal Processing 3rd edition (Oppenheim)