speech-translation

Here are 68 public repositories matching this topic...

NVIDIA-NeMo / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

machine-translation tts speech-synthesis neural-networks deeplearning speaker-recognition asr multimodal speech-translation large-language-models speaker-diariazation generative-ai

Updated Nov 10, 2025
Python

espnet / espnet

Star

End-to-End Speech Processing Toolkit

text-to-speech deep-learning chainer end-to-end machine-translation pytorch speech-synthesis speech-recognition kaldi voice-conversion speaker-diarization speech-separation speech-enhancement spoken-language-understanding speech-translation singing-voice-synthesis

Updated Nov 9, 2025
Python

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Updated Oct 20, 2025
Python

huggingface / speech-to-speech

Star

Speech To Speech: an effort for an open-sourced and modular GPT4-o

python machine-learning ai speech speech-synthesis assistant speech-to-text language-model speech-translation

Updated Apr 15, 2025
Python

microsoft / SpeechT5

Star

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

speech-synthesis speech-recognition speech-translation speech-pretraining speecht5 speech2c speechlm speechut speech-text-pretraining vatlm vallex

Updated Apr 24, 2024
Python

ictnlp / StreamSpeech

Star

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

Updated Jun 29, 2025
Python

double22a / speech_dataset

Star

The dataset of Speech Recognition

audio text-to-speech deep-neural-networks deep-learning speech tts speech-synthesis dataset wav speech-recognition automatic-speech-recognition speech-to-text voice-conversion asr speech-separation speech-enhancement speech-segmentation speech-translation speech-diarization

Updated Dec 26, 2024

Dadangdut33 / Speech-Translate

Star

A realtime speech transcription and translation application using Whisper OpenAI and free translation API. Interface made using Tkinter. Code written fully in Python.

python translate whisper tkinter-python speech-translation speech-transcription

Updated Jan 18, 2024
Python

echogarden-project / echogarden

Star

Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, speech recognition, forced alignment, speech translation, voice isolation, language detection and more.

text-to-speech command-line speech language-detection speech-synthesis speech-recognition node-js speech-to-text source-separation language-identification forced-alignment speech-translation speech-alignment voice-isolation

Updated Sep 1, 2025
TypeScript

kahne / SpeechTransProgress

Star

Tracking the progress in end-to-end speech translation

natural-language-processing machine-translation artificial-intelligence natural-language-generation speech-processing spoken-language-processing speech-translation spoken-language-translation

Updated Oct 25, 2023

bzhangGo / zero

Star

Zero -- A neural machine translation system

transformer neural-machine-translation average-attention-network aan speech-translation depth-scaled-initialization deep-transformer l0drop adaptive-feature-selection massively-multilingual-translation opus-100 fast-bidirectional-decoder

Updated May 8, 2023
Python

MooreThreads / MooER

Star

MooER: Moore-threads Open Omni model for speech-to-speech intERaction. MooER-omni includes a series of end-to-end speech interaction models along with training and inference code, covering but not limited to end-to-end speech interaction, end-to-end speech translation and speech recognition.

speech-recognition speech-to-text speech-translation speech-to-speech large-language-models chatgpt gpt-4o speech-interaction

Updated Jan 8, 2025
Python

zhangshaolei1998 / Awesome-Simultaneous-Translation

Star

Paper list of simultaneous translation / streaming translation, including text-to-text machine translation and speech-to-text translation.

nlp natural-language-processing streaming awesome paper machine-translation text-translation paperlist speech-translation simultaneous-translation simultaneous-machine-translation

Updated Jun 7, 2024

ictnlp / STEMM

Star

Code for ACL 2022 main conference paper "STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation".

machine-translation speech-to-text speech-translation

Updated Oct 25, 2023
Python

csikasote / bigc

Star

This repository contains the data resources for the LacunaFund supported project, Multimodal datasets for the Bemba Language of Zambia.

machine-translation speech-recognition zambia multimodal-learning speech-translation bemba-language image-grounded-conversations africa-language

Updated Jul 9, 2025

ictnlp / ComSpeech

Star

Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?".

text-to-speech machine-translation speech-translation non-autoregressive-translation speech-to-speech-translation zero-shot-speech-translation

Updated Jul 2, 2024
Python

JeffWang0325 / Microsoft-Azure-Cognitive-Services

Star

🖍️ This project combines multiple operations in Microsoft Azure Cognitive Services into one GUI, including QnA Maker, LUIS, Computer Vision, Custom Vision, Face, Form Recognizer, Text To Speech, Speech To Text and Speech Translation. It's very user-friendly for users to implement any operation mentioned above.

microsoft text-to-speech translation computer-vision azure speech-synthesis speech-recognition face face-recognition face-detection luis speech-to-text cognitive-services qna-maker qnamaker customvision luis-ai speech-translation formrecognizer

Updated Nov 2, 2021
C#

liamdugan / speech-to-speech

Star

Code for the INTERSPEECH 2023 paper "Learning When to Speak: Latency and Quality Trade-offs for Simultaneous Speech-to-Speech Translation with Offline Models"

speech speech-processing speech-translation speech-to-speech simultaneous-translation

Updated Jan 14, 2025
Python

ictnlp / DASpeech

Star

Code for NeurIPS 2023 paper "DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation".

machine-translation speech-translation speech-to-speech speech-to-speech-translation

Updated Jul 22, 2024
Python

ReneeYe / ConST

Star

code for paper "Cross-modal Contrastive Learning for Speech Translation" (NAACL 2022)

translation machine-translation pytorch transformer neural-machine-translation spoken-language-processing speec speech-translation contrastive-learning naacl2022

Updated May 25, 2022
Python

Improve this page

Add a description, image, and links to the speech-translation topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the speech-translation topic, visit your repo's landing page and select "manage topics."

Learn more

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

speech-translation

Here are 68 public repositories matching this topic...

NVIDIA-NeMo / NeMo

espnet / espnet

PaddlePaddle / PaddleSpeech

huggingface / speech-to-speech

microsoft / SpeechT5

ictnlp / StreamSpeech

double22a / speech_dataset

Dadangdut33 / Speech-Translate

echogarden-project / echogarden

kahne / SpeechTransProgress

bzhangGo / zero

MooreThreads / MooER

zhangshaolei1998 / Awesome-Simultaneous-Translation

ictnlp / STEMM

csikasote / bigc

ictnlp / ComSpeech

JeffWang0325 / Microsoft-Azure-Cognitive-Services

liamdugan / speech-to-speech

ictnlp / DASpeech

ReneeYe / ConST

Improve this page

Add this topic to your repo