Atotti

Highlights

  • Pro

Organizations

@citruzdev @gdsc-tmu @triC-tmu

Python 7 Updated Jun 18, 2026
Python 6 Updated Jan 7, 2026

FlashCosyVoice: A lightweight vLLM implementation built from scratch for CosyVoice.

Python 250 25 Updated Feb 25, 2026

The agent that grows with you

Python 197,074 34,822 Updated Jun 19, 2026
Python 94 11 Updated Oct 23, 2024

Large-scale, Informative, and Diverse Multi-round Chat Data (and Models)

Python 2,865 139 Updated Mar 13, 2024

Reference implementation of an end-to-end voice agent built using the NVIDIA Nemotron models

TypeScript 56 25 Updated Apr 22, 2026

[EMNLP 2025 Findings] Code for "Distilling Many-Shot In-Context Learning into a Cheat Sheet"

Python 5 Updated Nov 21, 2025

List of open-source TTS, voice cloning, and music generation models

349 50 Updated Apr 17, 2026
Python 30 8 Updated Apr 27, 2026
Python 92 14 Updated May 14, 2026

Erasing concepts from neural representations with provable guarantees

Python 255 15 Updated Jan 27, 2025

Must-read Papers on LLM Agents.

3,051 182 Updated Jun 18, 2026

A package for NeuCodec: a 50 Hz, 0.8 kbps, 24 kHz audio codec.

Python 160 26 Updated Jan 27, 2026

MOSS‑TTS Family is an open‑source speech and sound generation model family from MOSI.AI and the OpenMOSS team. It is designed for high‑fidelity, high‑expressiveness, and complex real‑world scenario…

Python 3,403 297 Updated Jun 18, 2026

Whisper-Flow is a framework designed to enable real-time transcription of audio content using OpenAI’s Whisper model. Rather than processing entire files after upload (“batch mode”), Whisper-Flow a…

Python 773 112 Updated Apr 20, 2026

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Andr…

C++ 13,054 1,497 Updated Jun 18, 2026

An open-source wake word library for creating voice-enabled applications.

Python 179 35 Updated Jun 10, 2026

The AsyncAPI specification allows you to create machine-readable definitions of your asynchronous APIs.

JavaScript 5,215 384 Updated Jun 8, 2026

unclutter your .profile

Go 15,191 794 Updated Mar 31, 2026

Open Source framework for voice and multimodal conversational AI

Python 12,892 2,216 Updated Jun 19, 2026

Public release of the Sound Effect Foundation model by Sony AI.

Python 327 22 Updated May 21, 2026

Code for the blog "Neural audio codecs: how to get audio into LLMs"

Python 174 4 Updated Oct 20, 2025

✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

Python 385 27 Updated May 27, 2025

MiMo-Audio: Audio Language Models are Few-Shot Learners

Python 1,052 104 Updated Jun 17, 2026

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 29,169 6,606 Updated Jun 19, 2026

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 379,422 79,419 Updated Jun 19, 2026
Python 68 6 Updated Aug 16, 2023

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, …

Python 14,558 1,483 Updated Jun 18, 2026

Official inference code for SoulX-Singer: Towards High-Quality Zero-Shot Singing Voice Synthesis

Python 751 101 Updated May 29, 2026