The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…

Python 2,004 141 Updated Dec 19, 2025

Multilingual TTS model with voice cloning and duration control, based on T5Gemma encoder-decoder LLM

Python 196 23 Updated Dec 17, 2025

A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.

Python 3,089 252 Updated Dec 19, 2025

PyTorch implementation of ReaLchords, ReaLJam and GAPT: real-time music accompaniment systems with generative models trained via reinforcement learning

Python 6 3 Updated Nov 25, 2025

VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

Python 2,988 319 Updated Dec 15, 2025

GLM-ASR-Nano: A robust, open-source speech recognition model with 1.5B parameters

Python 577 51 Updated Dec 12, 2025

Kaldi-compatible online fbank extractor without external dependencies

C++ 135 34 Updated Oct 9, 2025

Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform

Python 335 15 Updated Dec 15, 2025

Open-Source Frontier Voice AI

Python 18,669 2,058 Updated Dec 17, 2025

My attempt to improve the speed of the Newton-Schulz algorithm, starting from the Dion implementation.

Python 24 2 Updated Dec 5, 2025

WebCodecs is a flexible web API for encoding and decoding audio and video.

HTML 1,199 167 Updated Dec 4, 2025

Foundational Model for Speech Recognition Tasks

Python 399 53 Updated Dec 5, 2025

Official inference repo for FLUX.2 models

Python 1,240 62 Updated Dec 1, 2025

The fastest way to create an HTML app

Jupyter Notebook 6,743 288 Updated Dec 16, 2025

Text-to-text alignment algorithm for speech recognition error analysis.

Python 22 1 Updated Dec 15, 2025

🎥 Python and OpenCV-based scene cut/transition detection program & library.

Python 4,403 473 Updated Dec 11, 2025

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Python 2,487 213 Updated Dec 16, 2025

QuantMind is an intelligent knowledge extraction and retrieval framework for quantitative finance.

Python 98 14 Updated Sep 25, 2025

Open-source reproducible benchmarks from Argmax

Jupyter Notebook 72 3 Updated Dec 19, 2025

A free, open source, and extensible speech-to-text application that works completely offline.

TypeScript 8,706 597 Updated Dec 19, 2025

Voice-to-text app for macOS to transcribe what you say to text almost instantly

Swift 2,887 348 Updated Dec 19, 2025

Official Implementations for Paper - HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives

Python 557 104 Updated Nov 26, 2025

DiffusionNFT: Online Diffusion Reinforcement with Forward Process

Python 501 15 Updated Sep 22, 2025

Data Pipeline, Models, and Benchmark for Omni-Captioner.

Python 105 Updated Oct 17, 2025

Official code for "DiaMoE-TTS: A Unified IPA-based Dialect TTS Framework with Mixture-of-Experts and Parameter-Efficient Zero-Shot Adaptation"

Python 205 18 Updated Nov 28, 2025

LongLive: Real-time Interactive Long Video Generation

Python 916 63 Updated Dec 4, 2025

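New Concept English (all four books): online text read-aloud, sentence-by-sentence click-to-read, Chinese-English parallel text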

JavaScript 2,165 411 Updated Nov 11, 2025

【Accepted by TPAMI】Human Motion Video Generation: A Survey (https://ieeexplore.ieee.org/document/11106267)

283 11 Updated Dec 19, 2025

Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference

Python 1,225 40 Updated Oct 26, 2025