📊 Collect company data from Dealroom effortlessly with Dealroom Scraper, providing insights on startups, funding, investors, and more for informed business decisions.
Updated Dec 18, 2025 - Python
Ricgraph - Research in context graph
A Python client for the People Data Labs API
Data search & enrichment library for Machine Learning → Easily find and add relevant features to your ML & AI pipeline from hundreds of public and premium external data sources, including open & commercial LLMs
AI-powered data sanitizer with schema detection, dedupe, outlier detection, and LLM enrichment.
Dealroom data extraction tool
Import, maintain and export tag metadata to/from audio files and a dynamically created SQLite table. Automates incremental tag cleanup, enrichment and standardisation for your digital audio library at scale using pre-scripted SQL queries and Polars, achieving quality and consistency in your metadata not possible with a tagger
Scientific data enrichment tool for Open WebUI - Chemistry and materials science integration with PubChem, ChEMBL, Materials Project, and RDKit
Intro to using DSPy with Kuzu to enrich the data within the Nobel Laureate mentorship network
Backend for an automated job data ingestion and enrichment platform. Integrates multiple data sources, processes job feeds, enriches listings with AI-powered services, and supports analytics, notifications, and geolocation features. Built with Python, BigQuery, and Upstash.
SmartFlow-Prep is the data preprocessing pipeline for the SmartFlow system. It collects, cleans, and enriches historical CitiBike trip data with contextual features such as weather, time, holidays, and station metadata. The processed output serves as the structured input for SmartFlow’s bike rebalancing model.
CrawlerBox is an automated analysis framework designed for parsing emails and crawling embedded web resources.
Python SDK for Trestle API - A comprehensive OSINT and data enrichment toolkit with support for phone validation, reverse lookups, and more.
Multilingual dataset of world cities with English and Arabic names, population, and country info, provided in JSON, CSV, SQL, and Excel formats. It adds enriched information on countries, states, and their capitals, with Arabic translations and city populations.
A high-performance Python tool for batch processing Brazilian postal codes (CEP) into complete addresses. Features parallel processing, multiple API sources, and flexible I/O formats. Perfect for data enrichment and address validation.
⚡ Fast ASN lookup for IP prefixes with multi-threading & caching
Data enrichment with AI for pandas DataFrame
Case Study for a data engineering job application at a company
🧩➜👤 TheDig enriches personal data from a full name and an email