speech-translation

Here are 68 public repositories matching this topic...

NVIDIA-NeMo / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

machine-translation tts speech-synthesis neural-networks deeplearning speaker-recognition asr multimodal speech-translation large-language-models speaker-diariazation generative-ai

Updated Nov 11, 2025
Python

PalabraAI / palabra-ai-python

Python SDK for Palabra AI's real-time speech-to-speech translation API. Break down language barriers and enable seamless communication across 25+ languages.

translation languages seamless speech-translation s2st

Updated Nov 10, 2025
Python

espnet / espnet

End-to-End Speech Processing Toolkit

text-to-speech deep-learning chainer end-to-end machine-translation pytorch speech-synthesis speech-recognition kaldi voice-conversion speaker-diarization speech-separation speech-enhancement spoken-language-understanding speech-translation singing-voice-synthesis

Updated Nov 10, 2025
Python

Sharan-Kumar-R / Talk2Translate

The application uses SpeechRecognition, GoogleTranslator, and gTTS to convert spoken English or Tamil into the opposite language, display the translated text, and play the audio output.

tts speech-recognition stt gtts bilingual googletrans speech-translation voice-translator deep-translator real-time-translation

Updated Nov 4, 2025
Python

hlt-mt / FBK-fairseq

Repository containing the open source code of works published at the FBK MT unit.

deep-learning pytorch speech-to-text subtitling gender-bias speech-translation simultaneous-translation

Updated Nov 1, 2025
Python

PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Updated Oct 20, 2025
Python

ymoslem / Model-Compression

Code for the papers: "Efficient Speech Translation through Model Compression and Knowledge Distillation" and "Iterative Layer Pruning for Efficient Translation Inference"

machine-translation model-compression speech-translation layer-pruning

Updated Sep 24, 2025
Jupyter Notebook

echogarden-project / echogarden

Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, speech recognition, forced alignment, speech translation, voice isolation, language detection and more.

text-to-speech command-line speech language-detection speech-synthesis speech-recognition node-js speech-to-text source-separation language-identification forced-alignment speech-translation speech-alignment voice-isolation

Updated Sep 1, 2025
TypeScript

chentuochao / Spatial-Speech-Translation

The official repo for the paper "Spatial Speech Translation: Translating Across Space With Binaural Hearables"

spatial-audio speech-separation speech-translation

Updated Aug 15, 2025
Python

ictnlp / StreamUni

StreamUni is a framework that efficiently enables unified Large Speech-Language Models to accomplish streaming speech translation in a cohesive manner.

speech-recognition speech-to-text speech-processing multimodal speech-translation simultaneous-translation large-language-models llms simultaneous-machine-translation multimodal-large-language-models streaming-generation phi4-multimodal speech-language-models speeech-llms

Updated Jul 14, 2025
Python

csikasote / bigc

This repository contains the data resources for the Lacuna Fund-supported project "Multimodal datasets for the Bemba Language of Zambia".

machine-translation speech-recognition zambia multimodal-learning speech-translation bemba-language image-grounded-conversations africa-language

Updated Jul 9, 2025

ictnlp / StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

Updated Jun 29, 2025
Python

VinAIResearch / PhoST

A High-Quality and Large-Scale Dataset for English-Vietnamese Speech Translation (INTERSPEECH 2022)

vietnamese machine-translation english speech-translation phost benchmark-dataset english-to-vietnamese

Updated Jun 5, 2025

othneildrew / open-whisperer

AI Video Translator and Subtitler

self-hosted speech-recognition video-captioning speech-translation audio-transcription whisper-api ai-tools video-translator multilingual-subtitles ai-transcriber ffmpeg-subtitles auto-subtitles

Updated May 31, 2025
TypeScript

mllpresearch / ESO-dataset

ESO speech dataset: an English-language speech corpus of the oncology domain for ASR training and benchmarking and MT benchmarking.

machine-translation automatic-speech-recognition oncology domain-adaptation speech-corpus speech-translation large-language-models llm

Updated May 30, 2025

mct10 / IWSLT2025_LowRes_ST

Code for GMU's submission to IWSLT 2025 Low-Resource Speech Translation Shared Task

machine-translation speech-translation

Updated May 29, 2025
Python

steventan0110 / STAR

Official Repository for our IWSLT 2025 paper "Streaming Sequence Transduction through Dynamic Compression"

speech-translation simultaneous-translation

Updated May 22, 2025
Python

The-Data-Dilemma / ParquetToHuggingFace

ParquetToHuggingFace processes raw audio data, converts it into Parquet files, and uploads them to Hugging Face. The README explains how to set up the environment, configure paths, and run the scripts to generate and upload the data.

data-science pandas python3 dataset speech-recognition data-analysis parquet automatic-speech-recognition speech-to-text parquet-generator healthcare-application audio-processing speech-data speech-translation huggingface audio-dataset huggingface-datasets

Updated May 16, 2025
Python

huggingface / speech-to-speech

Speech To Speech: an effort for an open-sourced and modular GPT-4o

python machine-learning ai speech speech-synthesis assistant speech-to-text language-model speech-translation

Updated Apr 15, 2025
Python

mahshid1378 / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

machine-translation tts speech-synthesis neural-networks deeplearning speaker-recognition asr multimodal speech-translation speaker-diariazation generative-ai large-langage-models

Updated Mar 28, 2025
Python
