ImDB0oo1/Data_mining

Data Mining Project Repository

Welcome to the Data Mining Project Repository! This repository collects projects covering data preprocessing, exploratory data analysis (EDA), feature engineering, and machine learning applications. Each subproject addresses a specific task or dataset.


Table of Contents

  1. Overview
  2. Subprojects

Overview

The projects range from cleaning datasets and exploring correlations to building predictive models, and are intended as a practical resource for anyone learning data mining.


Subprojects

  • Focus: Exploratory Data Analysis (EDA) and Feature Engineering.
  • Highlights:
    • Correlation analysis and missing data visualization.
    • Feature transformations and dummy encoding.
    • Model-ready dataset preparation.
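
As a rough illustration of the dummy encoding and missing-data checks listed for the EDA and feature engineering subproject above, here is a minimal, dependency-free sketch. The helper names (`dummy_encode`, `missing_counts`), the record layout, and the sample values are assumptions made for this example; they are not taken from the subproject's code, which may well use a dataframe library instead.

```python
def dummy_encode(rows, column):
    """One-hot encode `column` in a list of dict records (returns new records)."""
    # Collect the distinct non-missing categories in a stable, sorted order.
    categories = sorted({row[column] for row in rows if row.get(column) is not None})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for cat in categories:
            # A missing value gets 0 in every indicator column.
            new_row[f"{column}_{cat}"] = int(row.get(column) == cat)
        encoded.append(new_row)
    return encoded


def missing_counts(rows):
    """Count missing (None) values per column."""
    columns = {key for row in rows for key in row}
    return {col: sum(row.get(col) is None for row in rows) for col in sorted(columns)}


# Hypothetical records, for illustration only.
records = [
    {"city": "Paris", "price": 10.0},
    {"city": "Rome", "price": None},
    {"city": None, "price": 7.5},
]
encoded = dummy_encode(records, "city")
print(encoded)
print(missing_counts(records))
```

Counting missing values per column before encoding makes it easier to decide whether a column should be imputed, dropped, or given its own "missing" indicator.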
  • Focus: EDA on US COVID-19 cases and deaths.
  • Highlights:
    • Time-series analysis with rolling averages.
    • Mortality rate and state-wise comparisons.
    • Visualizations of trends and distributions.
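
The rolling averages and mortality rate mentioned for the COVID-19 subproject above can be sketched as follows. This is only an illustration using the standard library: the function names, the 3-day window, and the numbers are made up for the example and do not come from the subproject or its dataset.

```python
from collections import deque


def rolling_mean(values, window):
    """Trailing moving average; positions without a full window yield None."""
    buf = deque(maxlen=window)
    total = 0.0
    out = []
    for v in values:
        if len(buf) == window:
            total -= buf[0]  # this value is evicted by the append below
        buf.append(v)
        total += v
        out.append(total / window if len(buf) == window else None)
    return out


def mortality_rate(deaths, cases):
    """Deaths per confirmed case; None when there are no cases."""
    return deaths / cases if cases else None


daily_cases = [10, 20, 30, 40, 50]  # made-up numbers
smoothed = rolling_mean(daily_cases, window=3)
print(smoothed)
print(mortality_rate(deaths=5, cases=200))
```

A trailing window is used here so that each smoothed value depends only on past observations; a centered window would be an equally reasonable choice for purely retrospective plots.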
  • Focus: Unsupervised Learning and Clustering Techniques.
  • Highlights:
    • Dimensionality reduction using PCA.
    • Multiple clustering methods (K-Means, Agglomerative Clustering, DBSCAN).
    • Cluster analysis and visualizations.
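
Of the clustering methods listed for the unsupervised learning subproject above, K-Means is the simplest to show in a few lines. The sketch below is a plain implementation of Lloyd's algorithm in standard-library Python, written for illustration only; the function name, parameters, and toy points are assumptions, and the subproject itself presumably relies on a library implementation. PCA, Agglomerative Clustering, and DBSCAN are not covered here.

```python
import random


def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's algorithm on tuples of floats; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k data points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids


# Two well-separated toy groups, for illustration only.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centroids = kmeans(points, k=2)
print(labels, centroids)
```

Which cluster receives index 0 depends on the random initialisation, so downstream analysis should compare cluster membership rather than raw label values.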
  • Focus: Data Cleaning and Classification.
  • Highlights:
    • Handling missing values and outliers.
    • Multi-class classification with models like Decision Trees, SVM, and Neural Networks.
    • Feature importance and model performance metrics.
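
For the data cleaning step of the classification subproject above, one common approach is median imputation for missing values followed by clipping outliers to Tukey's fences. The sketch below assumes that approach purely for illustration; the source does not say which imputation or outlier strategy the subproject uses, and the function names and sample values are hypothetical.

```python
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    med = statistics.median(v for v in values if v is not None)
    return [med if v is None else v for v in values]


def clip_iqr(values, factor=1.5):
    """Clip values to [Q1 - factor*IQR, Q3 + factor*IQR] (Tukey's fences)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - factor * iqr, q3 + factor * iqr
    return [min(max(v, low), high) for v in values]


raw = [1.0, 2.0, None, 3.0, 4.0, 100.0]  # made-up column with a gap and an outlier
filled = impute_median(raw)
cleaned = clip_iqr(filled)
print(filled)
print(cleaned)
```

Clipping keeps every row in the dataset, which matters when the outlying rows carry labels for rare classes; dropping them instead is a design choice that depends on the data.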
  • Focus: Fraud Detection in Transaction Data.
  • Highlights:
    • Feature engineering with balance differences and merchant identification.
    • Class imbalance handling with undersampling.
    • Models: Logistic Regression, Decision Trees, Random Forest, XGBoost.
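
The feature engineering and undersampling steps listed for the fraud detection subproject above might look roughly like the following. Everything specific here is an assumption made for the example: the field names (`amount`, `old_balance`, `new_balance`, `dest_id`, `is_fraud`), the convention that merchant IDs start with "M", and the 1:1 class ratio are hypothetical and not taken from the subproject.

```python
import random


def add_features(tx):
    """Return a copy of a transaction dict with two derived features."""
    out = dict(tx)
    # Balance difference: gap between the observed balance change and the amount.
    out["balance_diff"] = tx["old_balance"] - tx["new_balance"] - tx["amount"]
    # Merchant flag: assumes (hypothetically) that merchant IDs start with "M".
    out["is_merchant"] = int(tx["dest_id"].startswith("M"))
    return out


def undersample(rows, label_key, seed=0):
    """Randomly drop majority-class rows until both classes have equal size."""
    rng = random.Random(seed)
    pos = [r for r in rows if r[label_key] == 1]
    neg = [r for r in rows if r[label_key] == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    balanced = minority + rng.sample(majority, len(minority))
    rng.shuffle(balanced)
    return balanced


# Made-up transactions: 8 legitimate, 2 fraudulent.
transactions = [
    {"amount": 100.0, "old_balance": 500.0, "new_balance": 400.0,
     "dest_id": "M123", "is_fraud": 0}
    for _ in range(8)
] + [
    {"amount": 300.0, "old_balance": 300.0, "new_balance": 0.0,
     "dest_id": "C456", "is_fraud": 1}
    for _ in range(2)
]
features = [add_features(tx) for tx in transactions]
balanced = undersample(features, "is_fraud")
print(len(balanced), sum(r["is_fraud"] for r in balanced))
```

Undersampling should be applied to the training split only, so that evaluation metrics still reflect the real class imbalance.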
