Releases: uezo/ChatdollKit
v0.8.15
🌏 WebGL Updates
Adds Silero VAD support, switching between front and rear cameras with proper aspect ratio handling, image file upload, and faster microphone input, and fixes the bug where lip sync didn't work while muted. Huge improvements 💪
- Add WebGL support for Silero VAD #455
- Add meta files for SileroVAD WebGL JS assets #458
- Add HandlePlayingSamples callback for audio playback #456
- Add WebGL camera device support and refactor SimpleCamera #462
- Add WebGL file upload support to ImageButton #464
- Optimize WebGL microphone data transfer with malloc #467
✨ UI Control Improvements
The UI controls are now more stylish and work with zero configuration: just place them on the scene’s Canvas.
- Add max volume control and improve muting for AIAvatar #459
- Add UI prefabs and scripts for chat controls #460
- Add MessageWindow and container prefabs with improved configurability #463
🥁 Further Noise Resistance
Supports combining multiple types of VADs. For example, by combining Silero VAD, which can recognize only human voices even in noisy environments, with the built-in energy-based VAD, which only captures loud voices, the system can accurately pick up the user’s speech at event venues while partially filtering out surrounding voices and venue announcements.
- Support multiple voice detection functions in speech listener #466
- Remove SileroVADMicrophoneButton component and prefab #468
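The combination described above can be sketched outside Unity. The following Python snippet is only an illustration of the idea, not ChatdollKit's implementation: every name in it (`make_energy_detector`, `make_probability_detector`, `combine_all`) is hypothetical, the ML detector is represented by any callable that returns a speech probability, and requiring all detectors to agree is an assumption about how the combination behaves.

```python
import math
from typing import Callable, Sequence

# A detector receives one frame of float samples in [-1, 1]
# and returns True when it considers the frame to contain speech.
VoiceDetector = Callable[[Sequence[float]], bool]


def make_energy_detector(threshold_db: float) -> VoiceDetector:
    """Energy-based detector: True when the frame's RMS level exceeds threshold_db (dBFS)."""
    def detect(frame: Sequence[float]) -> bool:
        if not frame:
            return False
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        if rms == 0.0:
            return False  # pure silence; avoid log10(0)
        return 20.0 * math.log10(rms) > threshold_db
    return detect


def make_probability_detector(
    predict: Callable[[Sequence[float]], float], threshold: float = 0.5
) -> VoiceDetector:
    """Wraps a model that returns a speech probability (a stand-in for an ML-based VAD)."""
    def detect(frame: Sequence[float]) -> bool:
        return predict(frame) >= threshold
    return detect


def combine_all(detectors: Sequence[VoiceDetector]) -> VoiceDetector:
    """Assumption: a frame counts as speech only when every detector agrees."""
    if not detectors:
        raise ValueError("at least one detector is required")

    def detect(frame: Sequence[float]) -> bool:
        return all(d(frame) for d in detectors)
    return detect
```

With this shape, a quiet human voice in the background passes the ML detector but fails the energy threshold, and a loud non-speech sound passes the energy threshold but fails the ML detector, so neither triggers recording.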
🍩 Other Updates
- Mask tools when function calling is disabled #457
- Make OnDestroy method virtual in SpeechListenerBase #461
- Add payload support to AIAvatar dialog processing #465
Full Changelog: v0.8.14...v0.8.15
v0.8.14
🎙️ Echo Cancelling Support
- Add native microphone support for Android, iOS, and macOSX #440
- Add experimental native microphone plugins🔌 #449
- Add delay before muting character volume on recording #453
- Add echo cancelling instructions to README #454
🗣️ Enhanced Stability for Hesitations and Pauses
- Add request merging to prevent fragmented speech recognition #442
- Add support for character volume control #444
- Improve context sequence validation for ChatGPT #450
- Fix recursive streaming to disable function calls #452
🧩 Platform Enhancement
- Add notes for OpenAI-compatible API usage #437
- Add support for Aivis Cloud API as a TTS service💠 #439
- Add support for AIAvatarKit STT and TTS #446
🙏 Bug fix and small/internal changes
- Add Android-specific model loading for SileroVAD #441
- Refactor tool call handling and context updates for LLMs #443
- Expose recording state and sample count for debugging #445
- Fix bugs where conversation and build fail in WebGL #447
- Update demo for v0.8.14 #448
- Update for v0.8.14 #451
Full Changelog: v0.8.13...v0.8.14
v0.8.13
🥳 Silero VAD Support
ML-based voice-activity detection vastly improves turn-end accuracy in noisy settings, enabling smooth conversations outdoors or at events.
- Add SileroVADProcessor for ONNX-based voice activity detection by @uezo in #431
- Add SileroVAD microphone button UI and logic by @uezo in #434
- Add preroll buffer to RecordingSession by @uezo in #430
🪄 TTS Pre-processing
Optional text pre-processing lets you fine-tune pronunciation (e.g., convert “OpenAI” to katakana) before synthesis.
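As a rough illustration of what such a pre-processing step might look like, here is a minimal Python sketch. It is not the ChatdollKit API: `make_preprocessor` and `synthesize` are hypothetical names, the `tts` argument stands in for whatever synthesizer is used, and the katakana reading in the usage example is just one plausible replacement.

```python
import re
from typing import Callable, Dict


def make_preprocessor(replacements: Dict[str, str]) -> Callable[[str], str]:
    """Builds a function that rewrites known terms before text is sent to TTS."""
    if not replacements:
        return lambda text: text
    # Longest terms first, so a longer term wins over a shorter one it starts with.
    keys = sorted(replacements, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))

    def preprocess(text: str) -> str:
        return pattern.sub(lambda m: replacements[m.group(0)], text)
    return preprocess


def synthesize(text: str, preprocess: Callable[[str], str], tts: Callable[[str], bytes]) -> bytes:
    """Applies the pre-processor, then hands the result to the (hypothetical) TTS callable."""
    return tts(preprocess(text))
```

For example, `make_preprocessor({"OpenAI": "オープンエーアイ"})` returns a function that leaves all other text untouched and swaps only the listed term, so the synthesizer receives a spelling it can pronounce.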
🤝 Grok & Gemini Compatibility
Removes OpenAI-specific parameters from requests to OpenAI-compatible endpoints, so Grok, Gemini, and other API-compatible models work out of the box.
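The general idea can be sketched as a payload filter. This Python snippet is illustrative only: the release note does not say which parameters were removed, so the key names in `ASSUMED_PROVIDER_SPECIFIC_KEYS` are placeholders chosen for the example, and `build_chat_payload` is a hypothetical helper rather than part of ChatdollKit.

```python
from typing import Any, Dict, FrozenSet, Iterable, Optional

# Assumption: example key names only. The release note does not list
# which parameters were actually removed.
ASSUMED_PROVIDER_SPECIFIC_KEYS: FrozenSet[str] = frozenset(
    {"frequency_penalty", "presence_penalty", "logit_bias"}
)


def build_chat_payload(
    model: str,
    messages: Iterable[Dict[str, Any]],
    options: Optional[Dict[str, Any]] = None,
    drop_keys: FrozenSet[str] = ASSUMED_PROVIDER_SPECIFIC_KEYS,
) -> Dict[str, Any]:
    """Builds a chat-completions style request body, omitting keys in drop_keys and unset options."""
    payload: Dict[str, Any] = {"model": model, "messages": list(messages)}
    for key, value in (options or {}).items():
        if key in drop_keys or value is None:
            continue  # skip provider-specific or unset options
        payload[key] = value
    return payload
```

Sending only the fields every compatible backend is likely to accept keeps a single request builder usable across providers, at the cost of losing provider-specific tuning.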
🍩 Other changes
- Support ad-hoc internal request with idle state recovery by @uezo in #428
- Add support for language response from AIAvatarKit by @uezo in #429
- Fix bug where WebGL build fails by @uezo in #435
🎂 Birthday Release
This update drops on my birthday - thanks for celebrating with me! 🥳🎉🎈🍰
Full Changelog: v0.8.12...v0.8.13
v0.8.12
v0.8.11
🤖 Support Server-side Agent Framework Collaboration
Offloads AI agent logic to the server, which improves front-end maintainability, and lets you plug in frameworks like AutoGen (or any other agent SDK) to extend capabilities.
- Add support for AIAvatarKit as an AI agent backend service
- Support `inputs` parameter for Dify
- Fix bug where API Key authorization doesn't work in WebGL
- Allow null for SystemPromptParams for AIAvatarKit
🌐 WebGL Improvements
Upgraded mic capture to modern AudioWorkletNode for lower latency and reliability; stabilized mute/unmute handling; improved error handling to immediately surface HTTP errors and prevent hangs; fixed API-key authorization in WebGL builds.
- Switch WebGLMicrophone implementation from ScriptProcessor to AudioWorklet
- Prevent processing of muted user's speech after unmuting on WebGL
- Return HTTP errors immediately to avoid AI character hanging
🍩 Other updates
- Remove CommandRService
- Add option to include Wave header in SpeechGatewaySpeechSynthesizer
- Add channel info to ChatMemory integration extension
- Enable downsampling of microphone input for speech recognition
By the way, this release was prepared while enjoying a picnic at Jonanjima Seaside Park 🏕️🌳
Full Changelog: 0.8.10...v0.8.11
v0.8.10
🌎 Dynamic Multi-Language Switching
- Support dynamic multi language speech synthesizing #414
- Enable Multi-Language Support for Speech Recognition #415
- Add support for SpeechGateway in SpeechSynthesizer #416
🔖 Long-Term Memory
- Add ContextId for conversation-level identification #418
- Add request info to the params of `OnStreamingEnd` #419
- Add support for Long-Term Memory #420
🍩 Other Updates
- Support IsAzure option of ChatGpt LLM by AITuber controller by @buchizo in #413
- Enable echo cancellation and noise suppression for WebGL #417
- Prevent WebGL build error #421
- Improve error handling in HTTP access #422
- Fix bug where setting Nijivoice duration fails #423
- Small changes and update README for v0.8.10 #424
Thank you so much for your contribution, @buchizo san!🥰🥰🥰
Full Changelog: 0.8.9...0.8.10
v0.8.9
✨ Support NijiVoice as a Speech Synthesizer
🍩 Other changes
- Improve dialog processing #410
- Fix bug where DialogProcessor fails before processing LLM stream #411
- Update for v0.8.9 #412
Full Changelog: 0.8.8.1...0.8.9
v0.8.8.1
💪 Support Dify as a backend for AITuber
Seamlessly integrate with any LLM while empowering AITubers with agentic capabilities, blending advanced knowledge and functionality for highly efficient and scalable operations!
Full Changelog: 0.8.8...0.8.8.1
v0.8.8
v0.8.7
✨More Features For AITuber and Update Demo✨
- Support start/stop SocketServer from external components #395
- Add dummy components for the use case that doesn't use microphone #396
- Update demo for v0.8.7 #397
- Update README and small changes for v0.8.7 #399
🐛 Bug fix
- Fix bug where WebGL build fails #398
Full Changelog: 0.8.6...0.8.7