Twitch VOD/Clip Downloader - Chat Download/Render/Replay

C# 3,762 334 Updated Jun 10, 2026

Talk to any LLM with hands-free voice interaction, voice interruption, and a Live2D talking face, running locally across platforms

Python 11,232 1,295 Updated May 15, 2026

VLA-GSE: Boosting Parameter Efficient Finetuning in VLA with Generalized and Specialized Experts

Python 18 1 Updated Apr 29, 2026

Official repository of LIBERO-plus, a generalized benchmark for in-depth robustness analysis of vision-language-action models.

Python 347 26 Updated Jan 21, 2026

Official repository of LIBERO-PRO, an evaluation extension of the original LIBERO benchmark

Python 267 21 Updated Mar 23, 2026

A first-of-its-kind acoustic simulation platform for audio-visual embodied AI research. It supports training and evaluating multiple tasks and applications.

Python 462 74 Updated Sep 29, 2023

[ICLR 2025] GRAM Official PyTorch repository

Python 133 8 Updated May 7, 2025

A single CLAUDE.md file to improve Claude Code behavior, derived from Andrej Karpathy's observations on LLM coding pitfalls.

174,511 17,790 Updated Apr 20, 2026

Isaac-GR00T for RoboCasa Benchmark

Jupyter Notebook 6 5 Updated Feb 17, 2026

Public release of the Sound Effect Foundation model by Sony AI.

Python 316 22 Updated May 21, 2026

VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model

Python 2,200 199 Updated Mar 19, 2026

OmniCodec: Low Frame Rate Universal Audio Codec with Semantic–Acoustic Disentanglement

Python 40 1 Updated Apr 17, 2026

[CVPR 2026] FLAC: Few-Shot Acoustic Synthesis with Flow Matching. FLAC enables RIR generation in novel scenes using only one-shot acoustic observation. The repository also provides AGREE, a joint e…

Python 18 4 Updated Mar 20, 2026

NORA-1.5: A Vision-Language-Action Model Trained using World Model- and Action-based Preference Rewards

Python 106 9 Updated Jan 11, 2026

Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success

Python 1,250 181 Updated Sep 9, 2025

DreamGen: Nvidia GEAR Lab's initiative to solve the robotics data problem using world models

Jupyter Notebook 569 56 Updated Oct 24, 2025

Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation

Python 4,650 361 Updated Jun 21, 2025

[RSS 2025] Learning to Act Anywhere with Task-centric Latent Actions

Python 1,090 66 Updated Nov 19, 2025

A structured reading list on Vision-Language-Action (VLA) models — from diffusion/flow matching foundations through state-of-the-art robot foundation model architectures to data scaling, RL fine-tu…

281 17 Updated Mar 21, 2026

🦾 A Dual-System VLA with System2 Thinking

Python 145 3 Updated Aug 21, 2025

Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities

Python 153 15 Updated Dec 5, 2024

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 2,078 165 Updated Apr 21, 2025

Embodied Chain of Thought: A robotic policy that reasons to solve the task.

Python 399 23 Updated Apr 5, 2025

ICASSP 2024 - Generative De-Quantization for Neural Speech Codec via Latent Diffusion.

Python 56 3 Updated Nov 16, 2025

A comprehensive list of resources on dual-system VLA models, including papers, code, and related websites.

117 4 Updated Nov 21, 2025

OpenHelix: An Open-source Dual-System VLA Model for Robotic Manipulation

Python 379 21 Updated Aug 27, 2025

PointWorld: Scaling 3D World Models for In-The-Wild Robotic Manipulation

417 9 Updated Mar 11, 2026

NVIDIA Isaac GR00T N1.7 - A Foundation Model for Generalist Robots.

Python 7,333 1,254 Updated Jun 12, 2026