Heng-Jui Chang (vectominist)

PhD Candidate @ MIT CSAIL. Speech Processing and Balloon Arts.

90 followers · 17 following

Massachusetts Institute of Technology
Cambridge, MA
people.csail.mit.edu/hengjui
@hjchang87

Stars

rentruewang / inversql

Create SQL that matches your selection (with explainable AI), not the other way around

Python 15 1 Updated Jun 17, 2026

ajd12342 / paraspeechcaps

Codebase for 'Scaling Rich Style-Prompted Text-to-Speech Datasets'

Python 161 11 Updated Mar 26, 2026

Audio-WestlakeU / RealMAN

A description of "RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization" [NeurIPS 2024]

Python 171 16 Updated Apr 29, 2025

nari-labs / dia

A TTS model capable of generating ultra-realistic dialogue in one pass.

Python 19,323 1,687 Updated Nov 19, 2025

sainnhe / gruvbox-material

Gruvbox with Material Palette

Vim Script 2,596 192 Updated Apr 15, 2026

ddlBoJack / emotion2vec

[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

Python 1,146 88 Updated Dec 23, 2024

k2-fsa / OmniVoice

High-Quality Voice Cloning TTS for 600+ Languages

Python 7,586 1,187 Updated Jun 11, 2026

OpenMOSS / MOSS-TTS

MOSS‑TTS Family is an open‑source speech and sound generation model family from MOSI.AI and the OpenMOSS team. It is designed for high‑fidelity, high‑expressiveness, and complex real‑world scenario…

Python 3,411 300 Updated Jun 18, 2026

microsoft / VibeVoice

Open-Source Frontier Voice AI

Python 49,476 5,516 Updated May 6, 2026

kan-bayashi / LibriTTSLabel

Alignment files of LibriTTS.

70 7 Updated Mar 16, 2020

CorentinJ / librispeech-alignments

Word alignments generated by the Montreal Forced Aligner for the Librispeech dataset

Python 182 24 Updated Mar 25, 2019

pyannote / pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Jupyter Notebook 10,140 1,072 Updated Jun 17, 2026

snakers4 / silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 9,365 786 Updated Mar 26, 2026

duoan / TorchCode

🔥 LeetCode for PyTorch — practice implementing softmax, attention, GPT-2 and more from scratch with instant auto-grading. Jupyter-based, self-hosted or try online.

Jupyter Notebook 4,191 360 Updated May 25, 2026