Skip to content

Releases: uezo/ChatdollKit

v0.8.15

21 Aug 15:13
7767c7d

Choose a tag to compare

🌏WebGL Updates

Support for Silero VAD, switching between front and rear cameras and proper aspect ratio handling, image file upload support, improved microphone input performance, and fixing the bug where lip sync didn't work while muted. Huge improvements 💪

  • Add WebGL support for Silero VAD #455
  • Add meta files for SileroVAD WebGL JS assets #458
  • Add HandlePlayingSamples callback for audio playback #456
  • Add WebGL camera device support and refactor SimpleCamera #462
  • Add WebGL file upload support to ImageButton #464
  • Optimize WebGL microphone data transfer with malloc #467

✨ UI Control Improvements

In addition to becoming more stylish, they can now be used with zero configuration — just place them on the scene’s Canvas.

  • Add max volume control and improve muting for AIAvatar #459
  • Add UI prefabs and scripts for chat controls #460
  • Add MessageWindow and container prefabs with improved configurability #463

🥁 Further Noise Resistance

Supports combining multiple types of VADs. For example, by combining Silero VAD, which can recognize only human voices even in noisy environments, with the built-in energy-based VAD, which only captures loud voices, the system can accurately pick up the user’s speech at event venues while partially filtering out surrounding voices and venue announcements.

  • Support multiple voice detection functions in speech listener #466
  • Remove SileroVADMicrophoneButton component and prefab #468

🍩 Other Updates

  • Mask tools when function calling is disabled #457
  • Make OnDestroy method virtual in SpeechListenerBase #461
  • Add payload support to AIAvatar dialog processing #465

Full Changelog: v0.8.14...v0.8.15

v0.8.14

12 Aug 12:42
9df78b6

Choose a tag to compare

🎙️ Echo Cancelling Support

  • Add native microphone support for Android, iOS, and macOSX #440
  • Add experimental native microphone plugins🔌 #449
  • Add delay before muting character volume on recording #453
  • Add echo cancelling instructions to README #454

🗣️ Enhanced Stability for Hesitations and Pauses

  • Add request merging to prevent fragmented speech recognition #442
  • Add support for character volume control #444
  • Improve context sequence validation for ChatGPT #450
  • Fix recursive streaming to disable function calls #452

🧩 Platform Enhancement

  • Add notes for OpenAI-compatible API usage #437
  • Add support for Aivis Cloud API as a TTS service💠 #439
  • Add support for AIAvatarKit STT and TTS #446

🙏 Bug fix and small/internal changes

  • Add Android-specific model loading for SileroVAD #441
  • Refactor tool call handling and context updates for LLMs #443
  • Expose recording state and sample count for debugging #445
  • Fix bugs where conversation and build fails in WebGL #447
  • Update demo for v0.8.14 #448
  • Update for v0.8.14 #451

Full Changelog: v0.8.13...v0.8.14

v0.8.13

18 Jul 14:59
b7f9e53

Choose a tag to compare

🥳 Silero VAD Support

ML-based voice-activity detection vastly improves turn-end accuracy in noisy settings, enabling smooth conversations outdoors or at events.

  • Add SileroVADProcessor for ONNX-based voice activity detection by @uezo in #431
  • Add SileroVAD microphone button UI and logic by @uezo in #434
  • Add preroll buffer to RecordingSession by @uezo in #430

🪄 TTS Pre-processing

Optional text pre-processing lets you fine-tune pronunciation (e.g., convert “OpenAI” to katakana) before synthesis.

  • Add PreprocessText hook to SpeechSynthesizerBase by @uezo in #432

🤝 Grok & Gemini Compatibility

Removes OpenAI-specific params from the OpenAI-style endpoint, so Grok, Gemini, and other API-compatible models work out of the box.

  • Add OpenAI compatible API option to ChatGPTService by @uezo in #433

🍩 Other changes

  • Support ad-hoc internal request with idle state recovery by @uezo in #428
  • Add support for language response from AIAvatarKit by @uezo in #429
  • Fix bug where WebGL build fails by @uezo in #435

🎂 Birthday Release

This update drops on my birthday - thanks for celebrating with me! 🥳🎉🎈🍰

Full Changelog: v0.8.12...v0.8.13

v0.8.12

15 May 23:52
bf23d41

Choose a tag to compare

What's Changed

  • AIAvatarKit Streaming Improvements #427

Full Changelog: v0.8.11...v0.8.12

v0.8.11

29 Apr 04:46
0f6440b

Choose a tag to compare

🤖 Support Server-side Agent Framework Collaboration

Offloads AI agent logic to the server—boosting front-end maintainability—while letting you plug in frameworks like AutoGen (and any other agent SDK) for unlimited capability expansion.

🌐 WebGL Improvements

Upgraded mic capture to modern AudioWorkletNode for lower latency and reliability; stabilized mute/unmute handling; improved error handling to immediately surface HTTP errors and prevent hangs; fixed API-key authorization in WebGL builds.

🍩 Other updates

By the way, this release was prepared while enjoying a picnic at Jonanjima Seaside Park 🏕️🌳✈️ — a wonderful spot in Tokyo for camping, BBQ, and watching airplanes up close. Highly recommended!

Full Changelog: 0.8.10...v0.8.11

v0.8.10

29 Mar 13:40
f48bb69

Choose a tag to compare

🌎 Dynamic Multi-Language Switching

  • Support dynamic multi language speech synthesizing #414
  • Enable Multi-Language Support for Speech Recognition #415
  • Add support for SpeechGateway in SpeechSynthesizer #416

🔖 Long-Term Memory

  • Add ContextId for conversation-level identification #418
  • Add request info to the params of OnStreamingEnd #419
  • Add support for Long-Term Memory #420

🍩 Other Updates

  • Support IsAzure option of ChatGpt LLM by AITuber controller by @buchizo in #413
  • Enable echo cancellation and noise suppression for WebGL #417
  • Prevent WebGL build error #421
  • Improve error handling in HTTP access #422
  • Fix bug where setting Nijivoice duration fails #423
  • Small changes and update README for v0.8.10 #424

Thank you so much for your contribution, @buchizo san!🥰🥰🥰

Full Changelog: 0.8.9...0.8.10

v0.8.9

13 Dec 16:40
31b34ab

Choose a tag to compare

✨ Support NijiVoice as a Speech Synthesizer

  • Add Voice Prefetch Mode functionality #408
  • Add support for NijiVoice as a speech synthesizer #409

🍩 Other changes

  • Improve dialog processing #410
  • Fix bug where DialogProcessor fails on before processing LLM stream #411
  • Update for v0.8.9 #412

Full Changelog: 0.8.8.1...0.8.9

v0.8.8.1

05 Dec 13:07

Choose a tag to compare

💪Support Dify as a backend for AITuber

Seamlessly integrate with any LLM while empowering AITubers with agentic capabilities, blending advanced knowledge and functionality for highly efficient and scalable operations!

  • Enable clearing context in DifyService #405
  • Update AITuber demo v0.8.8.1 #406

Full Changelog: 0.8.8...0.8.8.1

v0.8.8

03 Dec 13:20
3f01491

Choose a tag to compare

🥰🥳Support Multiple AITuber Dialogue

  • Add client for socket communications #402
  • Added support for interactions between multiple AITubers #403

🍩Other updates

  • Fix WebGL build error in v0.8.7 #401
  • Update for v0.8.8 #404

Full Changelog: 0.8.7...0.8.8

v0.8.7

29 Nov 14:36
4b02d68

Choose a tag to compare

✨More Features For AITuber and Update Demo✨

  • Support start/stop SocketServer from external components #395
  • Add dummy components for the use case that doesn't use microphone #396
  • Update demo for v0.8.7 #397
  • Update README and small changes for v0.8.7 #399

🐛 Bug fix

  • Fix bug where WebGL build fails #398

Full Changelog: 0.8.6...0.8.7