- UFMT
- Cuiabá, Mato Grosso - Brazil
- https://www.fredso.com.br
- @fred_s0
Stars
A single CLAUDE.md file to improve Claude Code behavior, derived from Andrej Karpathy's observations on LLM coding pitfalls.
The repo is finally unlocked. Enjoy the party! The fastest repo in history to surpass 100K stars ⭐. Join the Discord: https://discord.gg/5TUQKqFWd. Built in Rust using oh-my-codex.
MR-RATE: A Vision-Language Foundation Model and Dataset for Magnetic Resonance Imaging
Hybrid Flow Matching and GAN with Multi-Resolution Network for Few-Step High-Fidelity Audio Generation
The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…
Multilingual TTS model with voice cloning and duration control, based on T5Gemma encoder-decoder LLM
GLM-TTS: Controllable & Emotion-Expressive Zero-shot TTS with Multi-Reward Reinforcement Learning
VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning
AI Edge Quantizer: flexible post-training quantization for LiteRT models.
Audio-to-Audio Schrödinger Bridges: a diffusion-based audio restoration model for bandwidth extension and inpainting.
Fast Streaming TTS with Orpheus + WebRTC (with FastRTC)
Open Source Text-To-Speech Portuguese Dataset
Spotify scraper to extract track information from Spotify and download MP3s with cover art.
Running any GGUF SLMs/LLMs locally, on-device in Android
Awesome speech/audio LLMs, representation learning, and codec models
SoTA open-source TTS
Fine-tune the LLM component of the Spark-TTS model.
stlohrey/dia-finetuning
Forked from nari-labs/dia. A TTS model capable of generating ultra-realistic dialogue in one pass.
An AI-powered interactive avatar engine using Live2D, LLM, ASR, TTS, and RVC. Ideal for VTubing, streaming, and virtual assistant applications.
A TTS model capable of generating ultra-realistic dialogue in one pass.
Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis