Stars
This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaDiCodec: Text-aware Diffusion Speech Tokenizer for Speech Lan…
FlexiCodec: A Dynamic Neural Audio Codec for Low Frame Rates
Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing with Unified Representation
A Conversational Speech Generation Model
Official PyTorch Implementation of "Latent Denoising Makes Good Visual Tokenizers"
Official implementation of "Continuous Autoregressive Language Models"
Metrics for evaluating music and audio generative models, with a focus on long-form, full-band, and stereo generations.
This is the official repo for the paper "LongCat-Flash-Omni Technical Report"
A list of tools, papers and code related to Fake Audio Detection.
"AI-Trader: Can AI Beat the Market?" Live Trading Bench: https://ai4trade.ai
Official code of ConfTuner: Training Large Language Models to Express Their Confidence Verbally
Training, inference, and testing of the SAC speech codec model.
An All-in-One Speech, Sound, Music Codec with Single Nested Codebook
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model
This is the official repository for the UltraVoice100K dataset, providing code and dataset samples.
The contents of /mnt/skills in Claude's code interpreter environment
Post-training with Tinker
Official GitHub repository for paper "SAKURA: On the Multi-hop Reasoning of Large Audio-Language Models Based on Speech and Audio Information" (Interspeech 2025)
Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
MiMo-Audio: Audio Language Models are Few-Shot Learners
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
Lightweight coding agent that runs in your terminal