kdrkdrkdr/kdrkdrkdr

Hi 👋, I'm Gyeongmin Kim (kdr)

Software engineer working on speech processing, on-device model optimization, and real-time audio systems. Co-founder of El Nino, building Knoc — a real-time translation subtitle service.


📝 Publications

  • Extracting Voice Styles from Frozen TTS Models via Gradient-Based Inverse Optimization. Gyeongmin Kim. Preprint, Apr 2026. To be submitted to ICASSP 2027 / Interspeech 2027. [DOI] [supertonic.embed] [kokoro.embed]

💼 Experience

  • NC AI (Seongnam, South Korea) — Data Engineer, Contract (May 2026 – ): Data engineering and model optimization for multimodal AI training and inference.
  • Yonsei University Health System (YUHS) (Seoul, South Korea) — Research Engineer (Mar 2025 – Oct 2025): Led development of an NGS clinical report pipeline and a SICU false-alarm monitoring desktop app.
  • NCSOFT (Seongnam, South Korea) — Audio Data Engineer, Contract (Jul 2024 – Jan 2025): Built end-to-end audio post-processing automation for TTS data pipelines.
  • Axcellworks (Japan, Remote) — Speech Engineer, Freelance (Nov 2023 – Feb 2024): Improved the intelligibility of a real-time voice changer with a chunk-merging algorithm.
  • Taiyaki Studios (USA, Remote) — AI Engineer, Contract (Jan 2023 – Jul 2023): Built a complete TTS training toolkit and production inference pipeline.

🚀 Featured Projects

On-Device Speech Model Optimization (C / WebAssembly)

  • MossTTS-Nano.c — 100M TTS model rewritten in pure C. NEON/SSE SIMD, KV cache, pthread parallelism — 30× speedup (68s → 2.3s), 1.8× faster than PyTorch CPU, RTF 0.33.
  • DeepFilterNet3.c.wasm — Noise-reduction model in pure C/WASM. ~1 ms on MacBook M2, ~4 ms on Galaxy S23.
  • fastenhancer.c.wasm — Audio enhancement in pure C. 546× size reduction (183 KB), mobile RTF 0.28.
  • LILAC — Zero-shot real-time voice conversion from 3s audio. OpenVoice v2 ported from PyTorch to pure C with streaming HiFi-GAN decoder, RNNoise SIMD, 2-thread SPSC audio pipeline. RTF 0.7–0.8 on CPU.

Multilingual TTS & Voice Conversion

  • JA2ML-VITS — Multilingual TTS that produces speech in 19 languages from Japanese-only datasets.
  • JK-VITS — Korean/Japanese bilingual TTS.
  • RVC-VITS — Voice-conversion-based dataset augmentation and TTS training pipeline using RVC.
  • ProsekaTTS — Character TTS service. 2.1M+ visitors (Feb 2026).
  • ShirokoTTS — First-ever Blue Archive Shiroko TTS.

G2P Packages (PyPI)

  • g2pk3 — Korean/Japanese/English → Korean pronunciation.
  • ko2kana — Korean/English pronunciation → Katakana.

Japanese Translation Tools

  • novel-reader — Android app translating novels from 7 Japanese sites with a custom proper-noun dictionary system.
  • EhndWebTranslate — Async Japanese web page translator with real-time/document/novel modes.
  • UserDict4Papago — Proper-noun dictionary overlay for Papago KR-JP translation.

🌐 Open Source Contributions

🎓 Education

  • Hanyang University, Seoul, South Korea — B.S. in Computer Science (Mar 2023 – Present, Leave of Absence since Jul 2024)
