ishine

speech asr/speech-recognition tts/text-to-speech vc/voice-conversion

125 followers · 162 following

gerzz.inc
shanghai
dubbing-ai.com

x-vits Public
Forked from reppy4620/x-vits

Python MIT License Updated Oct 29, 2024
viet-tts Public
Forked from dangvansam/viet-tts

VietTTS: An Open-Source Vietnamese Text to Speech

Python Apache License 2.0 Updated Oct 29, 2024
vec2wav2.0 Public
Forked from cantabile-kwok/vec2wav2.0

Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995

Python GNU General Public License v3.0 Updated Oct 29, 2024
F5-TTS Public
Forked from SWivid/F5-TTS

Official code for "A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python MIT License Updated Oct 29, 2024
CMOT Public
Forked from ictnlp/CMOT

Code for ACL 2023 main conference paper "CMOT: Cross-modal Mixup via Optimal Transport for Speech Translation"

Python Updated Oct 29, 2024
SESD Public
Forked from justinlovelace/SESD

Python MIT License Updated Oct 28, 2024
GLM-4-Voice Public
Forked from THUDM/GLM-4-Voice

GLM-4-Voice | End-to-end Chinese-English spoken dialogue model

Python Apache License 2.0 Updated Oct 28, 2024
seed-vc Public
Forked from Plachtaa/seed-vc

Zero-shot voice conversion with in-context learning

Python GNU General Public License v3.0 Updated Oct 28, 2024
AP-BWE Public
Forked from yxlu-0102/AP-BWE

Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction

Python MIT License Updated Oct 28, 2024
ggml Public
Forked from ggerganov/ggml

Tensor library for machine learning

C++ MIT License Updated Oct 27, 2024
GLM-4 Public
Forked from THUDM/GLM-4

GLM-4 series: Open Multilingual Multimodal Chat LMs

Python Apache License 2.0 Updated Oct 27, 2024
audiotools Public
Forked from descriptinc/audiotools

Object-oriented handling of audio data, with GPU-powered augmentations, and more.

Python MIT License Updated Oct 27, 2024
podcastfy Public
Forked from souzatharsis/podcastfy

An Open Source alternative to NotebookLM's podcast feature: Transforming Multimodal Content into Captivating Multilingual Audio Conversations with GenAI

Python Apache License 2.0 Updated Oct 27, 2024
libllm Public
Forked from ling0322/libllm

Efficient inference of large language models.

C++ MIT License Updated Oct 27, 2024
llama.cpp Public
Forked from ggerganov/llama.cpp

Port of Facebook's LLaMA model in C/C++

C++ MIT License Updated Oct 26, 2024
LlamaVoice Public
Forked from OpenT2S/LlamaVoice

LlamaVoice is a llama-based large voice generation model, providing inference and training ability.

Python 1 Updated Oct 26, 2024
streaming-sensevoice Public
Forked from pengzhendong/streaming-sensevoice

Pseudo Streaming SenseVoice with Hotwords

Python Apache License 2.0 Updated Oct 26, 2024
SenseVoice.cpp Public
Forked from lovemefan/SenseVoice.cpp

Port of Funasr's Sense-voice model in C/C++

C 1 MIT License Updated Oct 26, 2024
CosyVoice Public
Forked from FunAudioLLM/CosyVoice

LLM based TTS model, providing inference/training/deployment full-stack ability.

Python Apache License 2.0 Updated Oct 25, 2024
SubFix Public
Forked from cronrpc/SubFix

Web-based tool for efficient batch editing, precise subtitle correction, and flexible audio control.

Python Apache License 2.0 Updated Oct 25, 2024
noScribe Public
Forked from kaixxx/noScribe

Cutting-edge AI technology for automated audio transcription. A nice GUI for OpenAI's Whisper and pyannote (speaker identification)

Python GNU General Public License v3.0 Updated Oct 25, 2024
WhisperBiasing Public
Forked from BriansIDP/WhisperBiasing

Jupyter Notebook MIT License Updated Oct 25, 2024
SpeechT5 Public
Forked from microsoft/SpeechT5

SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing (ACL'2022)

Python MIT License Updated Oct 25, 2024
MeloTTS.cpp Public
Forked from apinge/MeloTTS.cpp

A lightweight pure C++ Text-to-Speech (TTS) pipeline with OpenVINO, supporting mixed English and Chinese languages.

C++ Apache License 2.0 Updated Oct 24, 2024
moonshine Public
Forked from usefulsensors/moonshine

Fast and accurate automatic speech recognition (ASR) for edge devices

Python MIT License Updated Oct 22, 2024
WaveFM Public
Forked from PKBHY/WaveFM

WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching

Python 2 Updated Oct 22, 2024
ReDimNet Public
Forked from IDRnD/ReDimNet

The official PyTorch implementation of the Interspeech 2024 paper "Reshape Dimensions Network for Speaker Recognition"

Python MIT License Updated Oct 22, 2024
asr-ctc-decoder-hotword Public
Forked from pengzhendong/asr-decoder

CTC decoder with hotwords for ASR.

Python Apache License 2.0 Updated Oct 21, 2024
AuxiliaryASR-NB Public
Forked from lemoi18/AuxiliaryASR-Conformer

Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment) For Norwegian

Python MIT License Updated Oct 21, 2024
Amphion Public
Forked from open-mmlab/Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 1 MIT License Updated Oct 20, 2024
