Code from the paper "Towards Speech-to-Pictograms Translation" (Interspeech 2024)
Updated Jan 29, 2025 - Python
Client-server application for speech-to-text and translation using Google Cloud
Reduces the need for end-to-end Speech Translation data by leveraging Automatic Speech Recognition and Machine Translation data instead, using zero-shot multilingual text translation techniques.
Speech recognition, language detection, translation, and speech synthesis
SPEAR-ASR and SPEAR-WakeUp Software Development Kit in Java for Linux
AI Video Translator and Subtitler
The application uses SpeechRecognition, GoogleTranslator, and gTTS to convert spoken English or Tamil into the opposite language, display the translated text, and play the audio output.
Speech-To-Text is a C# desktop app that uses Azure Cognitive Services to convert and translate speech. You can copy or show the text on the screen, and choose the language of the speech or the translation.
Whisper Transcription Service
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
SPEAR-ASR and SPEAR-WakeUp Software Development Kit in Java for Windows
A database of challenging voice utterances collected by the Biometrics Vision and Computing (BVC) group.
SEGAUGMENT: Maximizing the Utility of Speech Translation Data with Segmentation-based Augmentations
A database of Afro Voice utterances, BVC-Afro-Voice data, collected by the Biometrics Vision and Computing (BVC) group.
Optimizing Rare Word Accuracy in Direct Speech Translation with a Retrieval-and-Demonstration Approach
Code for the paper "Does Joint Training Really Help Cascaded Speech Translation?" (EMNLP 2022)
Simultaneous Speech-to-Text and Speech Translation using Azure AI.
An NLP-powered tool for transcribing, summarizing, and indexing podcast content, with video-to-audio conversion and multilingual support.
Systems submitted to IWSLT 2022 by the MT-UPC group.
Code for the papers: "Efficient Speech Translation through Model Compression and Knowledge Distillation" and "Iterative Layer Pruning for Efficient Translation Inference"