- Seoul, Republic of Korea
- https://www.linkedin.com/in/kdrkdrkdr
- https://elnino.kr
Stars
Ultrafast serverless GPU inference, sandboxes, and background jobs
C inference for Qwen3-ASR 0.6B and 1.7B transcription models
Two-stage Dereverberation Algorithm using DNN-supported multi-channel linear filtering and single-channel non-linear post-filtering
Real-time speech enhancement in the browser using pure C + WASM SIMD 128-bit.
Speed-optimized streaming neural speech enhancement network
Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation
AirLLM 70B inference on a single 4 GB GPU
Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice…
Pytorch implementation of Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation
Simultaneous speech-to-text model
Fast and local neural text-to-speech engine
Fast audio super-resolution from 16 kHz to 48 kHz.
MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba. Full multimodal LLM Android App: [MNN-LLM-Android](./apps/Android/MnnLlmChat/READ…
SQL Native Memory Layer for LLMs, AI Agents & Multi-Agent Systems
Sycamore is an LLM-powered search and analytics platform for unstructured data.
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
Optimized Whisper models for streaming and on-device use
An app that makes it easy to find the scales of chords for each guitar position
Team ORI's project for the 1st Hyupseongdae Hackathon Contest
Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages
A repository collecting common audio noise-reduction models, with Gradio demos showing how to use each one; very friendly for beginners.
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
Fine-tune the LLM component of the Spark-TTS model
TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation
Easy to use stem (e.g. instrumental/vocals) separation from CLI or as a python package, using a variety of amazing pre-trained models (primarily from UVR)