lifeiteng

Feiteng lifeiteng

Full stack Algorithm Engineer

536 followers · 131 following

Stars

facebookresearch / sam-audio

The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…

Python 2,004 141 Updated Dec 19, 2025

Aratako / T5Gemma-TTS

Multilingual TTS model with voice cloning and duration control, based on T5Gemma encoder-decoder LLM

Python 196 23 Updated Dec 17, 2025

Blaizzy / mlx-audio

A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.

Python 3,089 252 Updated Dec 19, 2025

lukewys / realchords-pytorch

PyTorch implementation of ReaLchords, ReaLJam and GAPT: real-time music accompaniment systems with generative models trained via reinforcement learning

Python 6 3 Updated Nov 25, 2025

OpenBMB / VoxCPM

VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

Python 2,988 319 Updated Dec 15, 2025

zai-org / GLM-ASR

GLM-ASR-Nano: A robust, open-source speech recognition model with 1.5B parameters

Python 577 51 Updated Dec 12, 2025

csukuangfj / kaldi-native-fbank

Kaldi-compatible online fbank extractor without external dependencies

C++ 135 34 Updated Oct 9, 2025

Visionary-Laboratory / visionary

Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform

Python 335 15 Updated Dec 15, 2025

microsoft / VibeVoice

Open-Source Frontier Voice AI

Python 18,669 2,058 Updated Dec 17, 2025

thib-s / flash-newton-schulz

My attempt to improve the speed of the Newton-Schulz algorithm, starting from the dion implementation.

Python 24 2 Updated Dec 5, 2025

w3c / webcodecs

WebCodecs is a flexible web API for encoding and decoding audio and video.

HTML 1,199 167 Updated Dec 4, 2025

salute-developers / GigaAM

Foundational Model for Speech Recognition Tasks

Python 399 53 Updated Dec 5, 2025

deepseek-ai / DeepSeek-Math-V2

Python 1,488 117 Updated Dec 1, 2025

black-forest-labs / flux2

Official inference repo for FLUX.2 models

Python 1,240 62 Updated Dec 1, 2025

AnswerDotAI / fasthtml

The fastest way to create an HTML app

Jupyter Notebook 6,743 288 Updated Dec 16, 2025

corticph / error-align

Text-to-text alignment algorithm for speech recognition error analysis.

Python 22 1 Updated Dec 15, 2025

Breakthrough / PySceneDetect

🎥 Python and OpenCV-based scene cut/transition detection program & library.

Python 4,403 473 Updated Dec 11, 2025

facebookresearch / omnilingual-asr

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Python 2,487 213 Updated Dec 16, 2025

LLMQuant / quant-mind

QuantMind is an intelligent knowledge extraction and retrieval framework for quantitative finance.

Python 98 14 Updated Sep 25, 2025

argmaxinc / OpenBench

Open-source reproducible benchmarks from Argmax

Jupyter Notebook 72 3 Updated Dec 19, 2025

cjpais / Handy

A free, open source, and extensible speech-to-text application that works completely offline.

TypeScript 8,706 597 Updated Dec 19, 2025

Beingpax / VoiceInk

Voice-to-text app for macOS to transcribe what you say to text almost instantly

Swift 2,887 348 Updated Dec 19, 2025

yihao-meng / HoloCine

Official Implementations for Paper - HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives

Python 557 104 Updated Nov 26, 2025

NVlabs / DiffusionNFT

DiffusionNFT: Online Diffusion Reinforcement with Forward Process

Python 501 15 Updated Sep 22, 2025

ddlBoJack / Omni-Captioner

Data Pipeline, Models, and Benchmark for Omni-Captioner.

Python 105 Updated Oct 17, 2025

GiantAILab / DiaMoE-TTS

Official code for "DiaMoE-TTS: A Unified IPA-based Dialect TTS Framework with Mixture-of-Experts and Parameter-Efficient Zero-Shot Adaptation"

Python 205 18 Updated Nov 28, 2025

NVlabs / LongLive

LongLive: Real-time Interactive Long Video Generation

Python 916 63 Updated Dec 4, 2025

iChochy / NCE

New Concept English, all four books: online text read-aloud, sentence-by-sentence playback, and Chinese-English parallel text

JavaScript 2,165 411 Updated Nov 11, 2025

Winn1y / Awesome-Human-Motion-Video-Generation

【Accepted by TPAMI】Human Motion Video Generation: A Survey (https://ieeexplore.ieee.org/document/11106267)

283 11 Updated Dec 19, 2025

Tencent-Hunyuan / SRPO

Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference

Python 1,225 40 Updated Oct 26, 2025