voidful

🎯

Focusing

Eric Lam voidful

👩‍🎓PhD@NTU Speech Lab. Formerly, Microsoft Research Intern.

385 followers · 322 following

Achievements

x3 x2

Highlights

Developer Program Member
Pro

Lists (1)

instruction dataset

8 repositories

Stars

inclusionAI / Ming-UniAudio

Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing with Unified Representation

Python 302 25 Updated Oct 28, 2025

SesameAILabs / csm

A Conversational Speech Generation Model

Python 14,257 1,426 Updated May 27, 2025

Jiawei-Yang / DeTok

Official PyTorch Implementation of "Latent Denoising Makes Good Visual Tokenizers"

Jupyter Notebook 152 4 Updated Oct 21, 2025

shaochenze / calm

Official implementation of "Continuous Autoregressive Language Models"

Python 294 42 Updated Nov 6, 2025

Stability-AI / stable-audio-metrics

Metrics for evaluating music and audio generative models – with a focus on long-form, full-band, and stereo generations.

Python 253 22 Updated Oct 31, 2025

voidful / tw_new_stocker

Python 1 Updated Nov 7, 2025

meituan-longcat / LongCat-Flash-Omni

This is the official repo for the paper "LongCat-Flash-Omni Technical Report"

Python 366 16 Updated Nov 4, 2025

john852517791 / awesome-fake-audio-detection

A list of tools, papers and code related to Fake Audio Detection.

190 10 Updated Oct 20, 2025

HKUDS / AI-Trader

"AI-Trader: Can AI Beat the Market?" Live Trading Bench: https://ai4trade.ai

Python 9,024 1,302 Updated Nov 6, 2025

wquguru / nof0

NOF0 - An open-source AI trading arena

Go 2,630 410 Updated Nov 6, 2025

ictnlp / LLaMA-Omni2

Python 243 26 Updated May 19, 2025

liushiliushi / ConfTuner

Official code of ConfTuner: Training Large Language Models to Express Their Confidence Verbally

Python 11 Updated Sep 26, 2025

Soul-AILab / SAC

Training, inference, and testing of the SAC speech codec model.

Python 84 6 Updated Nov 1, 2025

SWivid / AUV

An All-in-One Speech, Sound, Music Codec with Single Nested Codebook

Python 20 Updated Oct 11, 2025

PaddlePaddle / PaddleOCR

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

Python 62,867 9,262 Updated Nov 6, 2025

zhenye234 / xcodec

AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model

Python 259 18 Updated Oct 12, 2025

danjuan-77 / UltraVoice100K

This is the official repository for the UltraVoice100K dataset, providing code and dataset samples.

JavaScript 12 1 Updated Oct 26, 2025

karpathy / nanochat

The best ChatGPT that $100 can buy.

Python 35,956 4,161 Updated Nov 5, 2025

simonw / claude-skills

The contents of /mnt/skills in Claude's code interpreter environment

860 123 Updated Oct 16, 2025

SamsungSAILMontreal / TinyRecursiveModels

Python 5,457 775 Updated Oct 8, 2025

thinking-machines-lab / tinker-cookbook

Post-training with Tinker

Python 1,446 114 Updated Nov 5, 2025

ckyang1124 / SAKURA

Official GitHub repository for paper "SAKURA: On the Multi-hop Reasoning of Large Audio-Language Models Based on Speech and Audio Information" (Interspeech 2025)

Python 19 3 Updated Aug 14, 2025

QwenLM / Qwen3-Omni

Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 2,829 160 Updated Oct 9, 2025