Stars
Robust Speech Recognition via Large-Scale Weak Supervision
Open source annotation tool for machine learning practitioners.
A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
Jupyter Notebooks as Markdown Documents, Julia, Python or R scripts
A system for quickly generating training data with weak supervision
Learned string similarity for entity names using optimal transport.
Metaphone is a phonetic algorithm, published in 1990, for indexing words by their English pronunciation. It fundamentally improves on the Soundex algorithm by using information about va…
An End-to-End Evaluation Framework for Entity Resolution Systems
Efficient String Comparison Functions and Fuzzy String Matching
Materials for workshop on "Using bibliometric data in demographic research". A report here: https://iussp.org/en/using-bibliometric-data-demographic-research-0
Data on femicides in Latin America
Deduplicate data using fuzzy and deterministic matching rules.
Code that manages processing MSE scripts through Amazon SQS