- ElevenLabs
- Trondheim, Norway
- @iver56
Starred repositories
DFlash: Block Diffusion for Flash Speculative Decoding
Official PyTorch implementation of the fundamental frequency estimator described in "Robust and Lightweight F0 Estimation Through Mid-Level Fusion of DSP-Informed Features", ICASSP 2026.
Official implementation of Tuna-2: Pixel Embeddings Beat Vision Encoders for Unified Understanding and Generation
Adapters for external AAC and Opus decoders to be used with Symphonia
JAX codebase for Evolutionary Strategies at the Hyperscale
Python toolkit for high-quality time and pitch processing
A library for panning and zooming elements using CSS transforms 🔍
AI agents running research on single-GPU nanochat training automatically
A SOTA Industrial-Grade Voice Activity Detection & Audio Event Detection, supporting 100+ languages, outperforming Silero-VAD, TEN-VAD, FunASR-VAD and WebRTC-VAD
🌋 LavaSR: Fast speech restoration and enhancement
A toolkit for reproducible evaluation, diagnostic, and error analysis of speaker diarization systems
The most powerful local music generation model that outperforms almost all commercial alternatives, supporting Mac, AMD, Intel, and CUDA devices.
Variations of L1 SNR Loss function for training audio source separation machine learning models
On-device AI across mobile, embedded and edge for PyTorch
Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthr…
HeartMuLa Official Repo: The Most Powerful Open-Source Music Generation Model of 2026
Lightning-Fast, On-Device, Multilingual TTS — running natively via ONNX.
Pure Mojo tokenizer for LLM inference - BPE, tiktoken, HuggingFace compatible
Kanade is a single-layer disentangled speech tokenizer that extracts compact tokens suitable for both generative and discriminative modeling.
A highly compressive and high-quality neural audio codec for speech models.
0xSojalSec / airllm
Forked from lyogavin/airllm
Runs 405B LLMs on 8GB VRAM
[CVPR 2025] Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass