Organizations

@citruzdev @gdsc-tmu @triC-tmu

Python 6 Updated Jan 7, 2026

FlashCosyVoice: A lightweight vLLM implementation built from scratch for CosyVoice.

Python 250 25 Updated Feb 25, 2026

The agent that grows with you

Python 193,353 33,794 Updated Jun 14, 2026
Python 93 11 Updated Oct 23, 2024

Large-scale, Informative, and Diverse Multi-round Chat Data (and Models)

Python 2,864 139 Updated Mar 13, 2024

Reference implementation of an end-to-end voice agent built using the NVIDIA Nemotron models

TypeScript 56 24 Updated Apr 22, 2026

[EMNLP 2025 Findings] Code for "Distilling Many-Shot In-Context Learning into a Cheat Sheet"

Python 5 Updated Nov 21, 2025

List of open-source TTS, voice cloning, and music generation models

345 50 Updated Apr 17, 2026
Python 30 8 Updated Apr 27, 2026
Python 92 14 Updated May 14, 2026

Erasing concepts from neural representations with provable guarantees

Python 255 15 Updated Jan 27, 2025

Must-read Papers on LLM Agents.

3,047 183 Updated Jun 5, 2026

A package for NeuCodec: a 50hz, 0.8kbps, 24kHz audio codec.

Python 160 26 Updated Jan 27, 2026

MOSS‑TTS Family is an open‑source speech and sound generation model family from MOSI.AI and the OpenMOSS team. It is designed for high‑fidelity, high‑expressiveness, and complex real‑world scenario…

Python 3,330 286 Updated Jun 11, 2026

Whisper-Flow is a framework designed to enable real-time transcription of audio content using OpenAI’s Whisper model. Rather than processing entire files after upload (“batch mode”), Whisper-Flow a…

Python 768 111 Updated Apr 20, 2026

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Andr…

C++ 12,981 1,485 Updated Jun 12, 2026

An open-source wake word library for creating voice-enabled applications.

Python 176 33 Updated Jun 10, 2026

The AsyncAPI specification allows you to create machine-readable definitions of your asynchronous APIs.

JavaScript 5,211 384 Updated Jun 8, 2026

unclutter your .profile

Go 15,176 795 Updated Mar 31, 2026

Open Source framework for voice and multimodal conversational AI

Python 12,821 2,191 Updated Jun 14, 2026

Public release of the Sound Effect Foundation model by Sony AI.

Python 319 22 Updated May 21, 2026

Code for the blog "Neural audio codecs: how to get audio into LLMs"

Python 171 4 Updated Oct 20, 2025

✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

Python 384 27 Updated May 27, 2025

MiMo-Audio: Audio Language Models are Few-Shot Learners

Python 1,050 103 Updated Mar 3, 2026

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 28,984 6,527 Updated Jun 14, 2026

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 378,682 79,192 Updated Jun 14, 2026
Python 68 6 Updated Aug 16, 2023

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, …

Python 14,500 1,475 Updated Jun 13, 2026

Official inference code for SoulX-Singer: Towards High-Quality Zero-Shot Singing Voice Synthesis

Python 742 98 Updated May 29, 2026

AI Generated Music Player with 3D Carousel - Powered by ACE-Step 1.5, HonoX, Cloudflare Workers/R2/D1

TypeScript 5 Updated Feb 5, 2026