xiaol

Starred repositories

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Jupyter Notebook 6,218 768 Updated Oct 23, 2024

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

Python 709 116 Updated Oct 22, 2024

Target Speaker Extraction Toolkit

Python 95 12 Updated Oct 12, 2024
Cuda 3 Updated Sep 28, 2024

The best OSS video generation models

Python 1,427 125 Updated Oct 28, 2024

Build applications that make decisions (chatbots, agents, simulations, etc...). Monitor, trace, persist, and execute on your own infrastructure.

Python 1,230 68 Updated Oct 30, 2024

Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton

Python 11 Updated Oct 27, 2024
Python 6,557 498 Updated Oct 14, 2024

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 2,493 164 Updated Sep 24, 2024

Code for Quiet-STaR

Python 624 86 Updated Aug 21, 2024

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…

TypeScript 50,067 7,158 Updated Oct 30, 2024

Recipes to train reward models for RLHF.

Python 773 65 Updated Sep 23, 2024
Python 548 66 Updated Oct 28, 2024

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 4,264 414 Updated Oct 21, 2024

Robust recipes to align language models with human and AI preferences

Python 4,629 403 Updated Oct 7, 2024
Python 3 1 Updated May 30, 2024

Experiments on the impact of depth in transformers and SSMs.

Python 14 3 Updated Oct 27, 2024

An Open Source Toolkit For LLM Distillation

Python 340 36 Updated Sep 17, 2024

Various AI scripts. Mostly Stable Diffusion stuff.

Python 3,187 324 Updated Oct 29, 2024

Run RWKV5 or RWKV6 inference with the Qualcomm AI Engine Direct SDK

C++ 35 3 Updated Oct 29, 2024

Repository for formalization of the Polynomial Freiman Ruzsa conjecture (and related results)

Lean 135 33 Updated Oct 29, 2024
Python 13 Updated Oct 29, 2024

A MAD laboratory to improve AI architecture designs 🧪

Python 92 6 Updated May 2, 2024

Code Repository for CVPR 2023 Paper "PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360 degree"

Python 1,914 237 Updated Feb 5, 2024

Formatron empowers everyone to control the format of language models' output with minimal overhead.

Python 149 6 Updated Oct 29, 2024

On-device wake word detection powered by deep learning

Python 3,752 498 Updated Oct 28, 2024

This repo is an exploratory experiment to enable frozen pretrained RWKV language models to accept speech modality input. We followed the idea of SLAM_ASR and used the RWKV language model as the LLM…

Python 31 3 Updated Oct 27, 2024

Multilingual Voice Understanding Model

Python 3,225 298 Updated Oct 18, 2024

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 5,934 635 Updated Oct 22, 2024

RAG system for RWKV

Python 34 2 Updated Oct 29, 2024