View ishine's full-sized avatar

  • x-vits Public

    Forked from reppy4620/x-vits
    Python MIT License Updated Oct 29, 2024
  • viet-tts Public

    Forked from dangvansam/viet-tts

    VietTTS: An Open-Source Vietnamese Text to Speech

    Python Apache License 2.0 Updated Oct 29, 2024
  • Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995

    Python GNU General Public License v3.0 Updated Oct 29, 2024
  • F5-TTS Public

    Forked from SWivid/F5-TTS

    Official code for "A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

    Python MIT License Updated Oct 29, 2024
  • CMOT Public

    Forked from ictnlp/CMOT

    Code for ACL 2023 main conference paper "CMOT: Cross-modal Mixup via Optimal Transport for Speech Translation"

    Python Updated Oct 29, 2024
  • SESD Public

    Forked from justinlovelace/SESD
    Python MIT License Updated Oct 28, 2024
  • GLM-4-Voice Public

    Forked from THUDM/GLM-4-Voice

    GLM-4-Voice | End-to-end Chinese-English spoken dialogue model

    Python Apache License 2.0 Updated Oct 28, 2024
  • seed-vc Public

    Forked from Plachtaa/seed-vc

    Zero-shot voice conversion with in-context learning

    Python GNU General Public License v3.0 Updated Oct 28, 2024
  • AP-BWE Public

    Forked from yxlu-0102/AP-BWE

    Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction

    Python MIT License Updated Oct 28, 2024
  • ggml Public

    Forked from ggerganov/ggml

    Tensor library for machine learning

    C++ MIT License Updated Oct 27, 2024
  • GLM-4 Public

    Forked from THUDM/GLM-4

    GLM-4 series: Open Multilingual Multimodal Chat LMs

    Python Apache License 2.0 Updated Oct 27, 2024
  • Object-oriented handling of audio data, with GPU-powered augmentations, and more.

    Python MIT License Updated Oct 27, 2024
  • An Open Source alternative to NotebookLM's podcast feature: Transforming Multimodal Content into Captivating Multilingual Audio Conversations with GenAI

    Python Apache License 2.0 Updated Oct 27, 2024
  • libllm Public

    Forked from ling0322/libllm

    Efficient inference of large language models.

    C++ MIT License Updated Oct 27, 2024
  • llama.cpp Public

    Forked from ggerganov/llama.cpp

    Port of Facebook's LLaMA model in C/C++

    C++ MIT License Updated Oct 26, 2024
  • LlamaVoice Public

    Forked from OpenT2S/LlamaVoice

    LlamaVoice is a llama-based large voice generation model, providing inference and training ability.

    Python 1 Updated Oct 26, 2024
  • Pseudo Streaming SenseVoice with Hotwords

    Python Apache License 2.0 Updated Oct 26, 2024
  • Port of FunASR's SenseVoice model in C/C++

    C 1 MIT License Updated Oct 26, 2024
  • CosyVoice Public

    Forked from FunAudioLLM/CosyVoice

    LLM based TTS model, providing inference/training/deployment full-stack ability.

    Python Apache License 2.0 Updated Oct 25, 2024
  • SubFix Public

    Forked from cronrpc/SubFix

    Web-based tool for efficient batch editing, precise subtitle correction, and flexible audio control.

    Python Apache License 2.0 Updated Oct 25, 2024
  • noScribe Public

    Forked from kaixxx/noScribe

    Cutting-edge AI technology for automated audio transcription. A nice GUI for OpenAI's Whisper and pyannote (speaker identification)

    Python GNU General Public License v3.0 Updated Oct 25, 2024
  • Jupyter Notebook MIT License Updated Oct 25, 2024
  • SpeechT5 Public

    Forked from microsoft/SpeechT5

    SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing (ACL'2022)

    Python MIT License Updated Oct 25, 2024
  • MeloTTS.cpp Public

    Forked from apinge/MeloTTS.cpp

    A lightweight pure C++ Text-to-Speech (TTS) pipeline with OpenVINO, supporting mixed English and Chinese languages.

    C++ Apache License 2.0 Updated Oct 24, 2024
  • Fast and accurate automatic speech recognition (ASR) for edge devices

    Python MIT License Updated Oct 22, 2024
  • WaveFM Public

    Forked from PKBHY/WaveFM

    WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching

    Python 2 Updated Oct 22, 2024
  • ReDimNet Public

    Forked from IDRnD/ReDimNet

    The official PyTorch implementation of the Interspeech 2024 paper "Reshape Dimensions Network for Speaker Recognition"

    Python MIT License Updated Oct 22, 2024
  • CTC decoder with hotwords for ASR.

    Python Apache License 2.0 Updated Oct 21, 2024
  • Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment) For Norwegian

    Python MIT License Updated Oct 21, 2024
  • Amphion Public

    Forked from open-mmlab/Amphion

    Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

    Python 1 MIT License Updated Oct 20, 2024