Yuki Avatar — AI-Driven Physical Avatar System for iPhone

English | 日本語

⚠️ This is a reference snapshot, not a maintained project. The purpose of this repository is to serve as a codebase that you can show to an AI assistant (Claude, ChatGPT, Codex, etc.) to help you build a similar system. It is not intended to be cloned and run as-is.

Screenshots

Idle	Voice Input	Thinking	Speaking

Character art: WhiteCUL立ち絵素材 by moiky — not included in this repository. You must supply your own images.

What Is This?

An iOS app that turns an iPhone (mounted on an Insta360 Flow Pro or any DockKit-compatible motorized stand) into a physical AI avatar with:

🎭 29 expressions — pre-composed character images swapped in real-time
🗣️ Text-to-speech with lip sync — audio streamed from a server, mouth animation synced to playback
💬 Speech bubbles — assistant and user speech displayed on screen
🎤 Voice input — on-device wake word detection + speech-to-text (iOS 26+ SpeechAnalyzer)
🤖 DockKit motor control — nod, shake, look around, custom velocity/rotation sequences
🌙 Idle scenes — sleeping, snacking, singing, thinking, bored (with burn-in protection)
📡 WebSocket control — all features controllable via JSON commands over LAN

Architecture Overview

┌─────────────┐     WebSocket (8765)     ┌──────────────┐
│  AI Backend  │ ◄──────────────────────► │  iPhone App  │
│  (any host)  │   JSON commands + audio  │  (this code) │
└─────────────┘                           └──────┬───────┘
                                                 │ DockKit API
                                          ┌──────▼───────┐
                                          │ Motorized     │
                                          │ Phone Stand   │
                                          └──────────────┘

The app is the display + motor controller. Your AI backend synthesizes speech (e.g., VOICEVOX, ElevenLabs), encodes it as base64 WAV, and sends it along with expression/motion commands via WebSocket.

File Overview

File	Role
`ContentView.swift`	Main UI — composites expression images, speech bubbles, status bar; routes WebSocket commands
`WebSocketServer.swift`	LAN WebSocket server (port 8765) using Network.framework
`ExpressionManager.swift`	Expression enum (29 types), auto-blink, JSON command parser
`DockKitManager.swift`	DockKit accessory control — gestures, orientation, velocity, motion presets
`AudioPlayerManager.swift`	Base64 WAV playback + lip-sync timer
`SpeechQueueManager.swift`	FIFO queue for sequential speech items with expression/motion per item
`VoiceInputManager.swift`	iOS 26 SpeechAnalyzer — wake word → capture → send to backend
`IdleSceneManager.swift`	Time/activity-based idle animations with screen dimming
`SpeechBubbleView.swift`	Styled speech bubble (assistant/user/previous/sending)
`StatusBarView.swift`	Connection status pills (WebSocket, DockKit, mic, audio)
`CameraManager.swift`	AVCaptureSession for camera preview (DockKit requires active camera)

See spec/ for detailed documentation:

architecture.md — System diagram and data flow
websocket-protocol.md — Full WebSocket API reference
host-scripts.md — Backend control scripts
voice-input.md — On-device voice input (iOS 26+)
dockkit-integration.md — Motorized stand control
idle-scenes.md — Idle animation system

Host Scripts (Backend Control)

The host-scripts/ directory contains the backend tools that control the app:

Script	Purpose
`yuki_cmd.sh`	Main command — TTS synthesis (VOICEVOX) + expression + motion, sent as one WebSocket message
`ws_send.py`	Raw WebSocket client for sending arbitrary JSON commands

# Quick start
export YUKI_IPHONE_IP=<your-iphone-ip>
./host-scripts/yuki_cmd.sh "こんにちは" --expression happy --motion smallNod

See spec/host-scripts.md for full documentation.

What's NOT Included

Character images — The original uses WhiteCUL (copyrighted). You must supply your own PNG images named to match the Expression enum cases in ExpressionManager.swift.
TTS backend — The app plays audio sent to it; you need your own speech synthesis pipeline.
AI orchestration — The WebSocket protocol is documented in spec/websocket-protocol.md. Build your own controller.

How to Use This Repository

Show this code to your AI assistant and describe what you want to build
The AI can understand the architecture from the code + specs and help you adapt it
Replace the expression images with your own character art
Build your own backend that speaks the WebSocket protocol
Optionally integrate DockKit for physical movement

Requirements

iOS 17.0+ (iOS 26+ for voice input features)
Xcode 26+
DockKit-compatible stand (optional, for motor control)
A TTS engine on your network (VOICEVOX, ElevenLabs, etc.)

Credits

Character art in screenshots: WhiteCUL立ち絵素材 by moiky — used under the original distribution terms. The character image files are not included in this repository.
WhiteCUL: A character from the VOICEVOX project

License

MIT — The code is free to use. Character art is NOT included and must be sourced separately. Screenshots contain WhiteCUL character art by moiky, used for demonstration purposes only.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
HelloWorld.xcodeproj		HelloWorld.xcodeproj
HelloWorld		HelloWorld
docs/screenshots		docs/screenshots
host-scripts		host-scripts
spec		spec
.gitignore		.gitignore
Info.plist		Info.plist
LICENSE		LICENSE
README.ja.md		README.ja.md
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Yuki Avatar — AI-Driven Physical Avatar System for iPhone

Screenshots

What Is This?

Architecture Overview

File Overview

Host Scripts (Backend Control)

What's NOT Included

How to Use This Repository

Requirements

Credits

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Yuki Avatar — AI-Driven Physical Avatar System for iPhone

Screenshots

What Is This?

Architecture Overview

File Overview

Host Scripts (Backend Control)

What's NOT Included

How to Use This Repository

Requirements

Credits

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages