Kirak Kim kirak-kim

PhD student at AIRIS Lab, KAIST

20 followers · 31 following

Daejeon, South Korea
https://www.kirak.kim
@_kirak_kim

Stars

232 results for source starred repositories

facebookresearch / sam-audio

The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…

Python 2,399 177 Updated Dec 23, 2025

open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 9,558 772 Updated May 27, 2025

LOGUNIVPM / 1st-DAFx-Challenge

Official repository for the 1st DAFx Parameter Estimation Challenge

Python 29 1 Updated Nov 10, 2025

QwenLM / Qwen3-Omni

Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 3,157 193 Updated Oct 9, 2025

QwenLM / Qwen3-VL

Qwen3-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.

Jupyter Notebook 17,346 1,450 Updated Nov 28, 2025

ali-vosoughi / PromptReverb

HTML 2 Updated Oct 5, 2025

NKU-HLT / AuditEval

Python 7 Updated Oct 14, 2025

AaronZ345 / ISDrama

Dataset and evaluation code for ISDrama (ACM-MM 2025): Immersive Spatial Drama Generation through Multimodal Prompting

Python 236 Updated Aug 20, 2025

jmrplens / PyOctaveBand

[Python3] Octave-band and fractional octave-band filter for time-domain signals.

Python 83 19 Updated Jul 6, 2023

SonyResearch / LLM2Fx

Implementation of the paper "Can Large Language Models Predict Audio Effects Parameters from Natural Language?"

Python 23 2 Updated May 27, 2025

CompVis / zigma

A PyTorch implementation of the paper "ZigMa: A DiT-Style Mamba-based Diffusion Model" (ECCV 2024)

Python 341 23 Updated Mar 17, 2025

CuriseJia / NeurIPS2025-Spotlight-SSHS

[NeurIPS 2025 Spotlight] Seeing Sound, Hearing Sight: Uncovering Modality Bias and Conflict of AI models in Sound Localization

C# 4 1 Updated Nov 7, 2025

kennymckormick / pyskl

A toolbox for skeleton-based action recognition.

Python 1,185 213 Updated Mar 17, 2025

nikhilsinghmus / image2reverb

[ICCV 2021] Image2Reverb: Cross-Modal Reverb Impulse Response Synthesis.

Python 87 9 Updated Oct 12, 2021

SAGNIKMJR / few-shot-rir

Code and datasets for 'Few-Shot Audio-Visual Learning of Environment Acoustics' (NeurIPS 2022)

Python 23 5 Updated Nov 20, 2023

yifan123 / flow_grpo

[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL

Python 1,773 105 Updated Nov 4, 2025

liugangcode / Graph-DiT

The code for "Graph Diffusion Transformer for Multi-Conditional Molecular Generation"

Python 100 8 Updated Jun 3, 2025

audiolabs / webMUSHRA

A MUSHRA-compliant, Web Audio API-based experiment software

JavaScript 406 162 Updated Nov 21, 2025

csteinmetz1 / dasp-pytorch

Differentiable audio signal processors in PyTorch

Python 274 8 Updated Dec 4, 2023

OpenRLHF / OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & TIS & vLLM & Ray & Dynamic Sampling & Async Agentic RL)

Python 8,647 838 Updated Dec 18, 2025