This is an official PyTorch implementation of "Gesture2Vec: Clustering Gestures using Representation Learning Methods for Co-speech Gesture Generation" (IROS 2022).

Python 27 4 Updated Feb 9, 2024

TGuichoux / Gelina

Official implementation of Gelina

Python 30 3 Updated Apr 28, 2026

snap-research / SnapMoGen

SnapMoGen: Human Motion Generation from Expressive Texts [NeurIPS 2025]

Python 103 8 Updated Sep 26, 2025

SimWorld-AI / SimWorld

SimWorld: An Open-ended Realistic Simulator for Autonomous Agents in Physical and Social Worlds

Python 718 74 Updated Jun 17, 2026

mattpocock / skills

Skills for Real Engineers. Straight from my .claude directory.

Shell 140,914 12,206 Updated Jun 18, 2026

TeleHuman / TextOp

TextOp: Real-time Interactive Text-Driven Humanoid Robot Motion Generation and Control

Python 457 41 Updated Feb 7, 2026

TeoNikolov / genea_visualizer

Forked from jonepatr/genea_visualizer

This repository contains data pre-processing and visualization scripts used in GENEA Challenge 2022 and 2023. Check the repository's README.md file for instructions on how to use scripts yourself.

Python 28 6 Updated May 29, 2025

ictnlp / LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 3,140 223 Updated May 19, 2025

jingyaogong / minimind-o

🎙️ [Large model] A 0.1B all-modality Omni model trained from scratch, capable of listening, speaking, and seeing!

Python 1,948 223 Updated Jun 8, 2026

xcc-zach / xtalk

X-Talk is an open-source full-duplex cascaded spoken dialogue system framework enabling low-latency, interruptible, and human-like speech interaction with a lightweight, pure-Python, production-rea…

Python 214 27 Updated Jun 8, 2026

Ruiqi-Yan / Awesome-Full-Duplex-SDM

A curated list of full-duplex spoken dialogue models & benchmarks

101 7 Updated Jun 17, 2026

xzf-thu / Pask

Towards Self-Evolving Proactive AI with Perpetual Memory

Python 199 21 Updated Apr 17, 2026

bytedance / SALMONN

SALMONN family: A suite of advanced multi-modal LLMs

1,452 115 Updated May 26, 2026

jundot / omlx

LLM inference server with continuous batching & SSD caching for Apple Silicon — managed from the macOS menu bar

Python 16,959 1,436 Updated Jun 22, 2026

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 83,545 18,312 Updated Jun 22, 2026

A production-grade, multi-modal voice gateway providing real-time audio-to-audio interaction, read-aloud TTS, transcription, and model introspection. Built on vLLM-Omni architecture with Qwen3 models.

Python 2 Updated Jan 31, 2026

xiaotianfotos / run-qwen3-omni

Run Qwen3 Omni - A multimodal AI assistant demo

TypeScript 72 16 Updated Oct 16, 2025

Project-N-E-K-O / N.E.K.O

A catgirl who watches, reads, listens, and plays alongside you, powered by human-like memory and an embodied emotional engine. 🐱❤️ An AI catgirl who proactively comes to play with you.

Python 1,697 195 Updated Jun 22, 2026

openai / openai-realtime-agents

This is a simple demonstration of more advanced, agentic patterns built on top of the Realtime API.

TypeScript 6,908 1,094 Updated Jan 7, 2026

ultraworkers / claw-code

An agent-managed museum exhibit, built in Rust with Gajae-Code / LazyCodex — developed and maintained with no human intervention.

Rust 194,142 109,916 Updated Jun 8, 2026

livekit / livekit

End-to-end realtime stack for connecting humans and AI

Go 19,346 2,093 Updated Jun 22, 2026

hanxie-crypto / conversational-ai-livekit

A real-time conversational application built on Alibaba Cloud's TTS, LLM, and STT models

TypeScript 22 12 Updated Jun 4, 2024

georgetime1970 / Hysteria2

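🟢🌍 Latest for 2026: a very detailed, fast, and privacy-focused Hysteria2 one-click installation script that unlocks GPT and Netflix by default; 🛡️ includes a VPN security check guide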

Shell 63 8 Updated Apr 28, 2026

vllm-project / vllm-omni

A framework for efficient model inference with omni-modality models

Python 5,236 1,155 Updated Jun 22, 2026

QwenLM / Qwen3-Omni

Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 3,843 265 Updated Apr 23, 2026

Go2Heart / OmniStream

OmniStream: Mastering Perception, Reconstruction and Action in Continuous Streams

Python 107 2 Updated Mar 15, 2026
