- Hong Kong
- UTC +08:00
- naozumi.me
- https://huggingface.co/Naozumi0512
Stars
🔊 Text-Prompted Generative Audio Model
A guidance language for controlling large language models.
Instruct-tune LLaMA on consumer hardware
A multi-voice TTS system trained with an emphasis on quality
AirLLM: 70B inference with a single 4GB GPU
An Open Source text-to-speech system built by inverting Whisper.
“让爷康康” ("Let Me Take a Look") is a mobile AI app that detects poor sitting posture and gives spoken reminders
VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design
Multilingual G2P in 100 languages
Speech Toolkit for Malaysian language, https://malaya-speech.readthedocs.io/
Measurement and processing of binaural impulse responses for personalized surround virtualization on headphones.
Software kit that uses deep learning to generate vocaloid music (melodies and lyrics)
Large-language Model Evaluation framework with Elo Leaderboard and A-B testing
Convert BART models to ONNX with quantization. 3X reduction in size and up to 3X boost in inference speed
German Alpaca Dataset (Cleaned + Translated)
Audio Super Resolution model to decompress mp3 into wav file with recovered frequencies built on a "ResUNet" style architecture
SovitsTokenizer: A low-bitrate audio tokenizer that converts speech into discrete tokens
Fine-Tune Wav2Vec2 Bert 2.0 for Jyutping Recognition
Fasiany / emotional-vits-FA
Forked from innnky/emotional-vits
Some new features and optimizations