preprocessing-data

Here are 50 public repositories matching this topic...

YaminiGhumre / salary_prj2

Linear_Regression_Practical_Salary

data linear-regression pandas r2-score preprocessing-data

Updated Dec 3, 2023
Python

thiwaK / preprocess-50k-tiles-sri-lanka

Preprocessing scripts for 1:50K tiles issued by the survey department, Sri Lanka

automation geospatial arcpy gdal-python preprocessing-data

Updated Sep 17, 2023
Python

EslamElbassel / MNIST-Dataset-Classification-with-KNN-using-centroid-preprocessing

MNIST is a dataset of handwritten digit images. Classification with KNN by extracting features using centroids.

machine-learning mnist-classification mnist-dataset knn-classification centroid preprocessing-data

Updated May 11, 2021
Python

nlqthinh / WeaviateAnime

Explore your favorite anime with this interactive search app! 🚀 This project leverages Weaviate for vector search and Gradio for a seamless user interface. Using embeddings from a custom anime dataset, you can perform quick and accurate similarity searches for anime titles.

python docker anime gradio weaviate preprocessing-data vectordb

Updated Feb 26, 2025
Python

AlRzRz / RAG-YT_BBCNews

RAG workflow with pinecone-db that enriches LLM context with a YT/BBC news dataset to provide more accurate insights as well as sources.

pandas python3 openai-api preprocessing-data rag-pipeline pinecone-db agentic-ai

Updated Oct 10, 2025
Python

lawl2 / object-detection-and-spatial-relation

Ananyagawade12 / ML

numpy sklearn regression pandas pca matplotlib bayes-classifier knn preprocessing-data

Updated Nov 16, 2024
Python

sorrychoe / pyBigKinds

BigKinds Data Analysis Toolkit for Python

python text-mining journalism newsdata preprocessing-data

Updated Oct 7, 2024
Python

alvaro-concha / animal-behavior-preprocessing

animal-behavior-preprocessing is a Python repository to preprocess animal behavior data. It works on the output spreadsheets from video-tracking of animal body parts with LEAP or DeepLabCut. It applies a Median Filter, an Ensemble Kalman Filter, transforms data to joint angles and computes their Morlet Wavelet Spectra.

pipeline data-engineering feature-extraction filtering cleaning-data preprocessing-data

Updated Dec 12, 2024
Python

ironymint / tl_preprocessor

Just add a few sample images and run transfer learning. It classifies tons of images like magic.

machine-learning preprocessing-data

Updated Jan 10, 2022
Python

boomalope / misc

Growing collection of scripts that manipulate text data.

ocr twitter jupyter-notebook memory-management ngrams parallel-processing pdftotext manual-annotations textual-analysis pdftoimage ocr-python tagging-tool scanned-image-pdfs preprocessing-data

Updated Jan 3, 2022
Python

ThalesGroup / Iliad-custom-to-OIM-transformer

Scripts to preprocess ocean data files from custom apps in order to export the data to Ocean Information Model.

ocean mobile-app citizen-science oim preprocessing-data

Updated Jul 28, 2025
Python

MaxBubblegum47 / Preprocessing

Preprocessing method for an Information Retrieval System

python algorithm algorithms python3 preprocessing unimore-informatica preprocessing-data

Updated Mar 13, 2021
Python

BadBoy0170 / training-data_BOT

Enterprise-grade training data curation bot for LLM fine-tuning using Decodo and Python automation. It provides an async, modular pipeline for document loading, preprocessing, task-specific data generation (Q&A, summarization, classification), quality evaluation, and dataset export, all through a unified API.

async data-curation training-data dataset-generator classification-internal preprocessing-data llm

Updated Nov 2, 2025
Python

MohammedSaim-Quadri / networksecurity

This project is an end-to-end MLOps pipeline for a network security system that detects phishing and malicious activities using machine learning. It automates data ingestion, preprocessing, model training, and deployment while leveraging AWS S3 for model storage and GitHub Actions for CI/CD. The system includes real-time monitoring and a web interface.

machine-learning deployment aws-s3 ml cybersecurity learning-by-doing network-security data-ingestion realtime-monitoring mlops phishing-detection github-actions end-to-end-pipeline end-to-end-project preprocessing-data model-training-and-evaluation

Updated Apr 15, 2025
Python

huiyi999 / concordiacrawler

Web crawler

crawler information-retrieval spider preprocessing-data

Updated Mar 1, 2022
Python

NM001007 / Suicidal_Ideation_Detection_Using_GAT_and_GCN

In this project, three different models based on GAT, GCN and SAGE have been implemented to examine their performance on two prominent social networking platforms, namely Twitter and Reddit.

python machine-learning twitter reddit graph machine-learning-algorithms dataset graphical-models datasets preprocessing gcn wordembeddings suicide-prevention suicide-data preprocessing-data suicidal suicidal-detection suicide-notes

Updated Jan 24, 2024
Python

amalrajan / chat-cleaner

A script for pre-processing WhatsApp exported chats, enhancing their quality and structure to optimize training for GPT-3 models.

bot chatbot preprocessing gpt-3 preprocessing-data

Updated Aug 4, 2023
Python

abdolazizsalimi / datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

python data-science data machine-learning dataset preprocessing preprocessing-data

Updated Mar 6, 2023
Python

AlejandroLara11 / MachineLearningCourse

Machine Learning Basics: From Setup to Clustering

python data-science machine-learning numpy scikit-learn plotly pandas seaborn data-analysis streamlit preprocessing-data

Updated Dec 9, 2024
Python
