Open-source software for generating synthetic web-based user interface and content datasets.
A Unity-based simulator for RoboMaster that can be used to generate labeled datasets for deep learning.
Code to generate the Inv3D dataset from our paper "Inv3D: a high-resolution 3D invoice dataset for template-guided single-image document unwarping" (ICDAR 2023).
"Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases" by Jiarui Li and Ye Yuan and Zehua Zhang
The Font Image Generator App creates diverse character images using various fonts, aiding in dataset creation for machine learning and analysis.
Modular R web-scraping framework that crawls sitemaps, aggregates links by date range, and extracts target HTML fields using the paperboy package (German newspapers)
Synthetic HR data creator
Creating an index measuring the "rurality" of counties in the contiguous United States
Augmented Synthetic Data-set for Deep Learning in C++
Snippets for data set generation and analyses with ParlGov · 🗳️🧑🏻💻📊
A utility for generating VOC image annotations
UCL Geographical Information Systems (GIS) project, building up an open-source way of extracting the location of diplomatic outputs across the world. We argue that this new level of detail helps in the study of diplomatic interactions.
Easily create training or fine-tuning data for OpenAI's ChatGPT models by chatting with yourself, then export it to JSONL.
Benchmark datasets containing both normal background traffic and worm traffic
A collection of scripts and tools that tracks the availability of Helium Mobile Wi-Fi networks in the wild using the WiGLE dataset and the Helium API. Updates every 24 hours.
An AI scraper using crawl4ai and firecrawl.
Lack of alumni tracking and poor alumni interaction among students who have graduated from educational institutions across Odisha.
A Python command-line tool to create a geocoded dataset of Russian small and medium-sized enterprises (SMEs) from open data published by the Federal Tax Service
Dataset builder is an application that lets users visually build datasets of up to 5 dimensions through an intuitive MS-Paint-like interface and store them in a MySQL database for effective data wrangling. The application can also export the datasets to a CSV file.