smksyj

Starred repositories

UniRL is a Framework for Unified Multimodal Model Reinforcement Learning

Python 672 42 Updated Jun 22, 2026

Modern Scala 3 client for Redis and Valkey, native and multi-backend

Scala 29 1 Updated Jun 21, 2026

FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3

Python 251 23 Updated Apr 20, 2024

Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles.

Python 1,305 208 Updated Dec 7, 2025

A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.

Python 9,924 842 Updated Jun 12, 2026

Safe direct-style streaming, concurrency and resiliency for Scala on the JVM

Scala 516 35 Updated Jun 22, 2026

Deterministic, AI-driven development flows.

Scala 116 9 Updated Jun 16, 2026

Open-source speech AI models from KRAFTON, including Raon-Speech and Raon-SpeechChat for speech understanding, generation, and real-time full-duplex conversation.

Python 63 12 Updated Apr 7, 2026

Official implementation of AsymFlow, pi-Flow, GMFlow

Python 440 24 Updated Jun 13, 2026

Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation.

Python 1,464 107 Updated Mar 16, 2026

First foundation ASR built for the real world - 7 atomic acoustic conditions, 54 compound scenarios, 2.6M samples, and up to ~30% gains over SOTA where every other model falls apart. **You'll come …

Python 1,033 67 Updated Jun 2, 2026

Multilingual speech understanding: ASR + emotion recognition + audio event detection. 50+ languages, 15x faster than Whisper, non-autoregressive.

C 8,634 785 Updated Jun 22, 2026
Python 678 50 Updated Apr 29, 2026

A Pocket-Sized MLLM for Ultra-Efficient Image and Video Understanding on Your Phone

Python 25,672 2,008 Updated Jun 4, 2026

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 58,922 6,440 Updated Jun 20, 2026

Create stunning demos for free. Open-source, no subscriptions, no watermarks, and free for commercial use. An alternative to Screen Studio.

TypeScript 38,734 2,763 Updated Jun 17, 2026

Warp is an agentic development environment, born out of the terminal.

Rust 62,152 5,072 Updated Jun 22, 2026

High-Quality Voice Cloning TTS for 600+ Languages

Python 7,657 1,199 Updated Jun 11, 2026

A Java port of ratatui — build rich terminal UIs from Java

Java 256 13 Updated May 17, 2026

Simple, beautiful CLI output

Scala 345 12 Updated Jun 19, 2026

The first continuous diffusion language model that rivals discrete counterparts on standard language modeling benchmarks like LM1B and OpenWebText.

Python 78 2 Updated Jun 14, 2026

The open source coding agent.

TypeScript 177,081 21,619 Updated Jun 22, 2026

Hearth fire starter - incubator/dogfooding for Hearth-based macro libraries

Scala 59 6 Updated Jun 21, 2026

🪨 why use many token when few token do trick — Claude Code skill that cuts 65% of tokens by talking like caveman

JavaScript 75,531 4,270 Updated Jun 12, 2026

DMax: Aggressive Parallel Decoding for dLLMs

Python 126 7 Updated May 25, 2026

NumPy inspired Linear Algebra Library

Scala 12 1 Updated Jun 20, 2026

Code-first Protobuf and gRPC library for Scala

Scala 84 3 Updated Jun 22, 2026

Direct-style pure domain logic for Scala

Scala 50 7 Updated Jun 20, 2026
TypeScript 276 270 Updated Mar 31, 2026