# preprocessing-data

Here are 240 public repositories matching this topic...

Analyzing logistics data to optimize shipment efficiency, reduce delays, and enhance supply chain visibility using Power BI. Insights include top routes, delays, supplier trends, and peak shipments.

  • Updated Dec 14, 2025
  • Jupyter Notebook

Single-Cell RNA-seq Analysis of Bone Marrow Dataset Using Scanpy: This repository reproduces a complete scRNA-seq analysis pipeline using the Scanpy library on a modified bone marrow dataset (originally from CZI). The workflow includes preprocessing, normalization, clustering, marker-based annotation, and biological interpretation.

  • Updated Dec 9, 2025
  • Python

Enterprise-grade training data curation bot for LLM fine-tuning using Decodo and Python automation. It provides an async, modular pipeline for document loading, preprocessing, task-specific data generation (Q&A, summarization, classification), quality evaluation, and dataset export — all through a unified API.

  • Updated Nov 2, 2025
  • Python

Public Repository: Machine Learning & Data Mining project using the South African Heart Disease dataset. Applied PCA, Regularized Linear Regression, ANN, Logistic Regression, and Decision Trees with cross-validation for regression and classification. Includes feature scaling, EDA, and statistical tests.

  • Updated Oct 29, 2025
