Starred repositories
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
Build applications that make decisions (chatbots, agents, simulations, etc.). Monitor, trace, persist, and execute on your own infrastructure.
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
Recipes to train reward models for RLHF.
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Robust recipes to align language models with human and AI preferences
Experiments on the impact of depth in transformers and SSMs.
An Open Source Toolkit For LLM Distillation
Various AI scripts. Mostly Stable Diffusion stuff.
Run inference for RWKV5 or RWKV6 models with the Qualcomm AI Engine Direct SDK
Repository for formalization of the Polynomial Freiman Ruzsa conjecture (and related results)
A MAD laboratory to improve AI architecture designs 🧪
Code Repository for CVPR 2023 Paper "PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360 degree"
Formatron empowers everyone to control the format of language models' output with minimal overhead.
On-device wake word detection powered by deep learning
This repo is an exploratory experiment to enable frozen pretrained RWKV language models to accept speech modality input. We followed the idea of SLAM_ASR and used the RWKV language model as the LLM…
Multilingual Voice Understanding Model
Multilingual large voice generation model with full-stack support for inference, training, and deployment.