Work in progress...
Updated Aug 7, 2023 (Python)
Validator for values or inputs, written in Kotlin
An online education system that aims to give students a better education and equal opportunity.
This repository contains a C program for calculating genetic distances between taxa. The program reads genetic distance data from an input CSV file, parses and validates it, and initializes internal data structures to store the distance values.
Python-based web scraping and data analysis toolkit with automated data collection, processing, and visualization capabilities. Demonstrates data engineering and automation skills.
Cleaned and validated 10,000 Indian PAN numbers using PostgreSQL. Real-world data cleaning project with PL/pgSQL functions, regex validation, and production-ready views. #SQL #DataAnalyst #PostgreSQL
A Python tool for exploring SQLite database files, enabling users to verify file validity and retrieve essential database information with a simple command-line interface.
AI Resource‑Allocation Configurator
Health Atlas: a multi-agent AI platform that autonomously validates, enriches, and prioritizes healthcare provider data. Built with FastAPI, React, and LangGraph, it delivers real-time, VLM-ready automation for accuracy, compliance, and scale.
This notebook analyzes the US Accidents database, a countrywide car accident dataset covering 49 states of the USA. The accident data were collected from February 2016 to March 2023 (7.7 million records).
Validate and impute your data with math expressions
Launch your bulk import feature in minutes 🚀
A Python library for automatic data profiling and validation
Streamline OpenAI requests with an intuitive API wrapper... Created at https://coslynx.com
Unified multi-format file comparison tool for text, JSON, CSV, XML, binary, and HDF5. Automation-friendly, extensible, battle-tested.
Developed an ETL data pipeline that ingests live data from London’s Traffic Information Management System, validates and sanitizes it using Pydantic, stores it in a PostgreSQL database, and exposes it through a FastAPI-based RESTful API.
Dataset management library for ML experiments: loaders for SciFact, FEVER, GSM8K, HumanEval, MMLU, TruthfulQA, HellaSwag; git-like versioning with lineage tracking; transformation pipelines; quality validation with schema checks and duplicate detection; GenStage streaming for large datasets. Built for reproducible AI research.
Automated document processing pipeline that extracts unstructured data from PDFs/Docs and performs intelligent schema validation using OpenAI GPT-4.