LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 3,138 223 Updated May 19, 2025

kyutai-labs / moshi

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 9,999 929 Updated Mar 4, 2026

xidongwu / AutoTrainOnce

Python 21 2 Updated Oct 1, 2024

kechunl / AdaCode

AdaCode is licensed under CC BY-NC-SA 4.0 https://creativecommons.org/licenses/by-nc-sa/4.0/

Jupyter Notebook 46 8 Updated Mar 22, 2024

zhshi0816 / Video-Frame-Interpolation-Transformer

Python 103 15 Updated Mar 29, 2022

sstzal / DiffTalk

[CVPR2023] The implementation for "DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation"

Python 472 42 Updated Jul 15, 2024

m-bain / webvid

Large-scale text-video dataset. 10 million captioned short videos.

Python 677 39 Updated Aug 14, 2024

AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI

Python 162,270 30,249 Updated Mar 2, 2026

ykk648 / AnimateDiff-I2V

Forked from guoyww/AnimateDiff

AnimateDiff I2V version.

Python 185 4 Updated Mar 1, 2024

CiaraStrawberry / Temporal-Image-AnimateDiff

Forked from guoyww/AnimateDiff

A retrain of AnimateDiff to be conditional on an init image

Python 35 1 Updated Oct 17, 2023

LEI LU GGGGxxxxxxxxr

Stars

PrimeIntellect-ai / prime-rl

RUCKBReasoning / OmniSQL

dhcode-cpp / X-R1

pokerllm / pokerbench

uoftcprg / phh-dataset

EleutherAI / lm-evaluation-harness

GGGGxxxxxxxxr / One-click-Tuning-and-Pruning-for-Customized-LLMs

ictnlp / LLaMA-Omni