
Starred repositories

Official Code for "Rethinking Diffusion Model in High Dimension"

HTML 26 Updated May 20, 2025

UniRL is a Framework for Unified Multimodal Model Reinforcement Learning

Python 698 43 Updated Jun 23, 2026

Modern Scala 3 client for Redis and Valkey, native and multi-backend

Scala 32 1 Updated Jun 23, 2026

FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3

Python 251 23 Updated Apr 20, 2024

Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles.

Python 1,308 208 Updated Dec 7, 2025

A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.

Python 9,929 843 Updated Jun 12, 2026

Safe direct-style streaming, concurrency and resiliency for Scala on the JVM

Scala 516 35 Updated Jun 23, 2026

Deterministic, AI-driven development flows.

Scala 116 9 Updated Jun 16, 2026

Open-source speech AI models from KRAFTON, including Raon-Speech and Raon-SpeechChat for speech understanding, generation, and real-time full-duplex conversation.

Python 65 12 Updated Apr 7, 2026

Official implementation of AsymFlow, pi-Flow, GMFlow

Python 441 24 Updated Jun 22, 2026

Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation.

Python 1,465 107 Updated Mar 16, 2026

First foundation ASR built for the real world - 7 atomic acoustic conditions, 54 compound scenarios, 2.6M samples, and up to ~30% gains over SOTA where every other model falls apart. **You'll come …

Python 1,035 67 Updated Jun 2, 2026

Multilingual speech understanding: ASR + emotion recognition + audio event detection. 50+ languages, 15x faster than Whisper, non-autoregressive.

C 8,647 786 Updated Jun 22, 2026
Python 678 50 Updated Apr 29, 2026

A Pocket-Sized MLLM for Ultra-Efficient Image and Video Understanding on Your Phone

Python 25,703 2,009 Updated Jun 4, 2026

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 58,978 6,442 Updated Jun 20, 2026

Create stunning demos for free. Open-source, no subscriptions, no watermarks, and free for commercial use. An alternative to Screen Studio.

TypeScript 38,765 2,772 Updated Jun 17, 2026

Warp is an agentic development environment, born out of the terminal.

Rust 62,237 5,082 Updated Jun 23, 2026

High-Quality Voice Cloning TTS for 600+ Languages

Python 7,681 1,203 Updated Jun 11, 2026

A Java port of ratatui — build rich terminal UIs from Java

Java 256 13 Updated May 17, 2026

Simple, beautiful CLI output

Scala 345 12 Updated Jun 19, 2026

The first continuous diffusion language model that rivals discrete counterparts on standard language modeling benchmarks like LM1B and OpenWebText.

Python 78 2 Updated Jun 14, 2026

The open source coding agent.

TypeScript 177,642 21,726 Updated Jun 23, 2026

Hearth fire starter - incubator/dogfooding for Hearth-based macro libraries

Scala 59 6 Updated Jun 22, 2026

🪨 why use many token when few token do trick — Claude Code skill that cuts 65% of tokens by talking like caveman

JavaScript 76,084 4,308 Updated Jun 12, 2026

DMax: Aggressive Parallel Decoding for dLLMs

Python 126 7 Updated May 25, 2026

NumPy inspired Linear Algebra Library

Scala 12 1 Updated Jun 20, 2026

Code-first Protobuf and gRPC library for Scala

Scala 84 3 Updated Jun 22, 2026

Direct-style pure domain logic for Scala

Scala 50 7 Updated Jun 22, 2026