preprocessing-data

Here are 50 public repositories matching this topic...

Single-Cell RNA-seq Analysis of Bone Marrow Dataset Using Scanpy: This repository reproduces a complete scRNA-seq analysis pipeline using the Scanpy library on a modified bone marrow dataset (originally from CZI). The workflow includes preprocessing, normalization, clustering, marker-based annotation, and biological interpretation.

  • Updated Dec 9, 2025
  • Python

Enterprise-grade training data curation bot for LLM fine-tuning using Decodo and Python automation. It provides an async, modular pipeline for document loading, preprocessing, task-specific data generation (Q&A, summarization, classification), quality evaluation, and dataset export, all through a unified API.

  • Updated Nov 2, 2025
  • Python

This project is an end-to-end MLOps pipeline for a network security system that detects phishing and malicious activity using machine learning. It automates data ingestion, preprocessing, model training, and deployment, using AWS S3 for model storage and GitHub Actions for CI/CD. The system includes real-time monitoring and a web interface.

  • Updated Apr 15, 2025
  • Python
