Starred repositories
AI agent framework for plan-first development workflows with approval-based execution. Multi-language support (TypeScript, Python, Go, Rust) with automatic testing, code review, and validation buil…
[NeurIPS 2025] Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation
Kanidm: A simple, secure, and fast identity management platform
💫 Toolkit to help you get started with Spec-Driven Development
VibeVoice: Expressive, longform conversational speech synthesis. (Community fork)
C++17 port of Demucs v3 (hybrid) and v4 (hybrid transformer) models with ggml and Eigen3
adefossez / demucs
Forked from facebookresearch/demucs
Code for the paper Hybrid Spectrogram and Waveform Source Separation
🌊 The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms, coordinate autonomous workflows, and build conversational AI systems. Features enterprise-grade archite…
🗻 Log-structured, embeddable key-value storage engine written in Rust
Code for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004)
Reached #13 on Stanford's Terminal Bench leaderboard. Orchestrator, explorer & coder agents working together with intelligent context sharing.
Open dubbing is an AI dubbing system which uses machine learning models to automatically translate and synchronize audio dialogue into different languages.
Hierarchical Reasoning Model Official Release
A new schemaful, Postgres-compatible, high-performance database written from scratch in Rust. https://crates.io/crates/rasterizeddb_core
Fast and trainable tokenizer for natural languages relying on maximum entropy methods.
Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection
MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
A Python package to build AI-powered real-time audio applications
Simultaneous speech-to-text model
Whisper realtime streaming for long speech-to-text transcription and translation