Releases · uezo/ChatdollKit

🌏WebGL Updates

Support for Silero VAD, switching between front and rear cameras and proper aspect ratio handling, image file upload support, improved microphone input performance, and fixing the bug where lip sync didn't work while muted. Huge improvements 💪

Add WebGL support for Silero VAD #455
Add meta files for SileroVAD WebGL JS assets #458
Add HandlePlayingSamples callback for audio playback #456
Add WebGL camera device support and refactor SimpleCamera #462
Add WebGL file upload support to ImageButton #464
Optimize WebGL microphone data transfer with malloc #467

✨ UI Control Improvements

In addition to becoming more stylish, they can now be used with zero configuration — just place them on the scene’s Canvas.

Add max volume control and improve muting for AIAvatar #459
Add UI prefabs and scripts for chat controls #460
Add MessageWindow and container prefabs with improved configurability #463

🥁 Further Noise Resistance

Supports combining multiple types of VADs. For example, by combining Silero VAD, which can recognize only human voices even in noisy environments, with the built-in energy-based VAD, which only captures loud voices, the system can accurately pick up the user’s speech at event venues while partially filtering out surrounding voices and venue announcements.

Support multiple voice detection functions in speech listener #466
Remove SileroVADMicrophoneButton component and prefab #468

🍩 Other Updates

Mask tools when function calling is disabled #457
Make OnDestroy method virtual in SpeechListenerBase #461
Add payload support to AIAvatar dialog processing #465

Full Changelog: v0.8.14...v0.8.15

🎙️ Echo Cancelling Support

Add native microphone support for Android, iOS, and macOSX #440
Add experimental native microphone plugins🔌 #449
Add delay before muting character volume on recording #453
Add echo cancelling instructions to README #454

🗣️ Enhanced Stability for Hesitations and Pauses

Add request merging to prevent fragmented speech recognition #442
Add support for character volume control #444
Improve context sequence validation for ChatGPT #450
Fix recursive streaming to disable function calls #452

🧩 Platform Enhancement

Add notes for OpenAI-compatible API usage #437
Add support for Aivis Cloud API as a TTS service💠 #439
Add support for AIAvatarKit STT and TTS #446

🙏 Bug fix and small/internal changes

Add Android-specific model loading for SileroVAD #441
Refactor tool call handling and context updates for LLMs #443
Expose recording state and sample count for debugging #445
Fix bugs where conversation and build fails in WebGL #447
Update demo for v0.8.14 #448
Update for v0.8.14 #451

Full Changelog: v0.8.13...v0.8.14

@uezo

🥳 Silero VAD Support

ML-based voice-activity detection vastly improves turn-end accuracy in noisy settings, enabling smooth conversations outdoors or at events.

Add SileroVADProcessor for ONNX-based voice activity detection by @uezo in #431
Add SileroVAD microphone button UI and logic by @uezo in #434
Add preroll buffer to RecordingSession by @uezo in #430

🪄 TTS Pre-processing

Optional text pre-processing lets you fine-tune pronunciation (e.g., convert “OpenAI” to katakana) before synthesis.

Add PreprocessText hook to SpeechSynthesizerBase by @uezo in #432

🤝 Grok & Gemini Compatibility

Removes OpenAI-specific params from the OpenAI-style endpoint, so Grok, Gemini, and other API-compatible models work out of the box.

Add OpenAI compatible API option to ChatGPTService by @uezo in #433

🍩 Other changes

Support ad-hoc internal request with idle state recovery by @uezo in #428
Add support for language response from AIAvatarKit by @uezo in #429
Fix bug where WebGL build fails by @uezo in #435

🎂 Birthday Release

This update drops on my birthday - thanks for celebrating with me! 🥳🎉🎈🍰

Full Changelog: v0.8.12...v0.8.13

What's Changed

AIAvatarKit Streaming Improvements #427

Full Changelog: v0.8.11...v0.8.12

🤖 Support Server-side Agent Framework Collaboration

Offloads AI agent logic to the server—boosting front-end maintainability—while letting you plug in frameworks like AutoGen (and any other agent SDK) for unlimited capability expansion.

🌐 WebGL Improvements

Upgraded mic capture to modern AudioWorkletNode for lower latency and reliability; stabilized mute/unmute handling; improved error handling to immediately surface HTTP errors and prevent hangs; fixed API-key authorization in WebGL builds.

🍩 Other updates

By the way, this release was prepared while enjoying a picnic at Jonanjima Seaside Park 🏕️🌳✈️ — a wonderful spot in Tokyo for camping, BBQ, and watching airplanes up close. Highly recommended!

Full Changelog: 0.8.10...v0.8.11

@buchizo

🌎 Dynamic Multi-Language Switching

Support dynamic multi language speech synthesizing #414
Enable Multi-Language Support for Speech Recognition #415
Add support for SpeechGateway in SpeechSynthesizer #416

🔖 Long-Term Memory

Add ContextId for conversation-level identification #418
Add request info to the params of OnStreamingEnd #419
Add support for Long-Term Memory #420

🍩 Other Updates

Support IsAzure option of ChatGpt LLM by AITuber controller by @buchizo in #413
Enable echo cancellation and noise suppression for WebGL #417
Prevent WebGL build error #421
Improve error handling in HTTP access #422
Fix bug where setting Nijivoice duration fails #423
Small changes and update README for v0.8.10 #424

Thank you so much for your contribution, @buchizo san!🥰🥰🥰

Full Changelog: 0.8.9...0.8.10

✨ Support NijiVoice as a Speech Synthesizer

Add Voice Prefetch Mode functionality #408
Add support for NijiVoice as a speech synthesizer #409

🍩 Other changes

Improve dialog processing #410
Fix bug where DialogProcessor fails on before processing LLM stream #411
Update for v0.8.9 #412

Full Changelog: 0.8.8.1...0.8.9

💪Support Dify as a backend for AITuber

Seamlessly integrate with any LLM while empowering AITubers with agentic capabilities, blending advanced knowledge and functionality for highly efficient and scalable operations!

Enable clearing context in DifyService #405
Update AITuber demo v0.8.8.1 #406

Full Changelog: 0.8.8...0.8.8.1

🥰🥳Support Multiple AITuber Dialogue

Add client for socket communications #402
Added support for interactions between multiple AITubers #403

🍩Other updates

Fix WebGL build error in v0.8.7 #401
Update for v0.8.8 #404

Full Changelog: 0.8.7...0.8.8

✨More Features For AITuber and Update Demo✨

Support start/stop SocketServer from external components #395
Add dummy components for the use case that doesn't use microphone #396
Update demo for v0.8.7 #397
Update README and small changes for v0.8.7 #399

🐛 Bug fix

Fix bug where WebGL build fails #398

Full Changelog: 0.8.6...0.8.7

Releases: uezo/ChatdollKit

v0.8.15

🌏WebGL Updates

✨ UI Control Improvements

🥁 Further Noise Resistance

🍩 Other Updates

Uh oh!

v0.8.14

🎙️ Echo Cancelling Support

🗣️ Enhanced Stability for Hesitations and Pauses

🧩 Platform Enhancement

🙏 Bug fix and small/internal changes

Uh oh!

v0.8.13

🥳 Silero VAD Support

🪄 TTS Pre-processing

🤝 Grok & Gemini Compatibility

🍩 Other changes

🎂 Birthday Release

Contributors

Uh oh!

v0.8.12

What's Changed

Uh oh!

v0.8.11

🤖 Support Server-side Agent Framework Collaboration

🌐 WebGL Improvements

🍩 Other updates

Uh oh!

v0.8.10

🌎 Dynamic Multi-Language Switching

🔖 Long-Term Memory

🍩 Other Updates

Contributors

Uh oh!

v0.8.9

✨ Support NijiVoice as a Speech Synthesizer

🍩 Other changes

Uh oh!

v0.8.8.1

💪Support Dify as a backend for AITuber

Uh oh!

v0.8.8

🥰🥳Support Multiple AITuber Dialogue

🍩Other updates

Uh oh!

v0.8.7

✨More Features For AITuber and Update Demo✨

🐛 Bug fix

Uh oh!