fafancier

Starred repositories

Recommended paid proxy ("airport") services worth using in 2026

2,616 128 Updated May 16, 2026

Tools for the Embody 3D Dataset

Python 256 11 Updated Oct 30, 2025

Foundation Models and Data for Human-Human and Human-AI interactions.

Python 392 30 Updated Dec 13, 2025

[ICCV 2025] MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space

Python 282 17 Updated Oct 28, 2025

This is an official PyTorch implementation of "Gesture2Vec: Clustering Gestures using Representation Learning Methods for Co-speech Gesture Generation" (IROS 2022).

Python 27 4 Updated Feb 9, 2024

Official implementation of Gelina

Python 30 3 Updated Apr 28, 2026

SnapMoGen: Human Motion Generation from Expressive Texts [NeurIPS 2025]

Python 103 8 Updated Sep 26, 2025

SimWorld: An Open-ended Realistic Simulator for Autonomous Agents in Physical and Social Worlds

Python 718 74 Updated Jun 17, 2026

Skills for Real Engineers. Straight from my .claude directory.

Shell 140,914 12,206 Updated Jun 18, 2026

TextOp: Real-time Interactive Text-Driven Humanoid Robot Motion Generation and Control

Python 457 41 Updated Feb 7, 2026

This repository contains data pre-processing and visualization scripts used in GENEA Challenge 2022 and 2023. Check the repository's README.md file for instructions on how to use scripts yourself.

Python 28 6 Updated May 29, 2025

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 3,140 223 Updated May 19, 2025

🎙️ "Large model": a 0.1B omni-modal Omni model trained from scratch, capable of listening, speaking, and seeing!

Python 1,948 223 Updated Jun 8, 2026

X-Talk is an open-source full-duplex cascaded spoken dialogue system framework enabling low-latency, interruptible, and human-like speech interaction with a lightweight, pure-Python, production-rea…

Python 214 27 Updated Jun 8, 2026

A curated list of full-duplex spoken dialogue models & benchmarks

101 7 Updated Jun 17, 2026

Towards Self-Evolving Proactive AI with Perpetual Memory

Python 199 21 Updated Apr 17, 2026

SALMONN family: A suite of advanced multi-modal LLMs

1,452 115 Updated May 26, 2026

LLM inference server with continuous batching & SSD caching for Apple Silicon — managed from the macOS menu bar

Python 16,959 1,436 Updated Jun 22, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 83,545 18,312 Updated Jun 22, 2026

A production-grade, multi-modal voice gateway providing real-time audio-to-audio interaction, read-aloud TTS, transcription, and model introspection. Built on vLLM-Omni architecture with Qwen3 models.

Python 2 Updated Jan 31, 2026

Run Qwen3 Omni - A multimodal AI assistant demo

TypeScript 72 16 Updated Oct 16, 2025

A catgirl who watches, reads, listens, and plays alongside you, powered by human-like memory and an embodied emotional engine. 🐱❤️ An AI catgirl who takes the initiative to come and play with you.

Python 1,697 195 Updated Jun 22, 2026

This is a simple demonstration of more advanced, agentic patterns built on top of the Realtime API.

TypeScript 6,908 1,094 Updated Jan 7, 2026

An agent-managed museum exhibit, built in Rust with Gajae-Code / LazyCodex — developed and maintained with no human intervention.

Rust 194,142 109,916 Updated Jun 8, 2026

End-to-end realtime stack for connecting humans and AI

Go 19,346 2,093 Updated Jun 22, 2026

A real-time conversation application built on Alibaba Cloud's TTS, LLM, and STT models

TypeScript 22 12 Updated Jun 4, 2024

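🟢🌍 Latest 2026 Hysteria2 one-click install script: highly detailed, fast, and private; unlocks GPT and Netflix by default; 🛡️ includes a VPN security check guide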

Shell 63 8 Updated Apr 28, 2026

A framework for efficient model inference with omni-modality models

Python 5,236 1,155 Updated Jun 22, 2026

Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 3,843 265 Updated Apr 23, 2026

OmniStream: Mastering Perception, Reconstruction and Action in Continuous Streams

Python 107 2 Updated Mar 15, 2026