Utility for string normalization
Updated Oct 20, 2022 - Python
Our source code for the paper "Transformer-based Joint Learning Approach for Text Normalization in Vietnamese ASR"
A modern, actively maintained contractions library. Expands English contractions (you're → you are) with improved performance, type safety, and features like bulk dictionary imports and JSON loading. Includes 100% test coverage, full type hints, and works with both pip and uv.
Small Python wrapper class for the CAB webservice.
Clipboard Translator is a lightweight desktop application built with PyQt5 that automatically translates text copied to the clipboard into Persian using the Google Translate API. The application features a modern and minimalistic UI, custom styling, and real-time text normalization and tokenization.
Cryptocurrency Market Analysis and Question Answering System
Text Normalization on tweets (Tweet Normalization)
Code, models, and data for "Exploiting Dialect Identification in Automatic Dialectal Text Normalization". ArabicNLP 2024, ACL.
Implementation of the text normalization paper by Choudhury et al.
Useful tools for Text To Speech applications: text normalization, automatic dataset construction, and evaluation metrics.
Python pipeline for enriching an Arabic (MSA) → Darija (MA) dataset from PDF books and YouTube transcripts: normalization, token-based segmentation, generation (OpenAI or rules), and JSON export. Application internship project at YaneCode Digital.
Text preprocessing and PII anonymisation for NLP/ML. ONNX NER ensemble, language detection, stopword removal. Built for statistical ML and language models.
Pipeline for automatic fine-tuning of Text to Speech models.
Digestion Algorithm in Hierarchical Symbolic Forests: A Fast Text Normalization Algorithm and Semantic Parsing Framework for Specific Scenarios and Lightweight Deployment
Text normalization for the Farsi (Persian) language.
Training Tacotron 2 Text-to-Speech (TTS)
📢 Tha (ថា) - A Khmer Text Normalization and Verbalization Toolkit
Command-line interface (CLI) and library to normalize English texts.
A simple tool to check if Unicode text files are Unicode-normalized
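As a rough illustration of the kind of check the last entry describes, the sketch below tests whether a string is in Unicode Normalization Form C using only Python's standard library. It is not taken from any of the listed repositories; the helper name `is_nfc` is hypothetical, and the choice of NFC rather than another normalization form is an assumption.

```python
import unicodedata


def is_nfc(text: str) -> bool:
    """Return True if `text` is already in Unicode Normalization Form C.

    Hypothetical helper for illustration; requires Python 3.8+ for
    unicodedata.is_normalized.
    """
    return unicodedata.is_normalized("NFC", text)


# "é" as one precomposed code point (U+00E9) is already NFC.
composed = "\u00e9"
# "e" followed by a combining acute accent (U+0301) is not NFC.
decomposed = "e\u0301"

print(is_nfc(composed))
print(is_nfc(decomposed))
# Normalizing the decomposed form yields the precomposed one.
print(unicodedata.normalize("NFC", decomposed) == composed)
```

A file-level checker could read each file as UTF-8 and apply the same test to its contents.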