Feature selection is used in nearly all data science pipelines. I have therefore created functions that perform a form of backward stepwise selection based on XGBoost classifier feature importance and a set of other input values, with the goal of returning the number of features to keep for a preferred AUC score.
Machine learning classification project using Random Forest, Logistic Regression, and Decision Trees on the Iris dataset. Includes model comparison, learning curves, feature importance analysis, and comprehensive evaluation metrics. Built with scikit-learn.
This repository contains a Decision Tree Regression model developed to predict house sale prices based on various predictor variables, aiming to provide accurate predictions and insights into regional differences in real estate values.
🤖 Predicts employee performance (High/Medium/Low) using Random Forest and Logistic Regression on 1,000 synthetic HR records. Includes EDA dashboard, feature importance, and live prediction demo. Built in Python and Jupyter Notebook.
Crime and Incarceration in the United States contains data on crimes committed and prisoner counts in all 50 states; the data is analyzed using various analytical methods.
Interactive and visual Principal Component Analysis (PCA) demos on MNIST and Fashion-MNIST datasets, featuring feature importance analysis, 2D/3D visualizations, and an interactive HTML PCA explainer.
This repository is submitted in partial fulfilment of the requirements for the module MSIN0114: Business Analytics Consulting Project/Dissertation at UCL School of Management.
Predict bank customer churn using interactive EDA and machine learning (Logistic Regression, Decision Tree, Random Forest, Gradient Boosting). Built with Python, Scikit-learn, and Plotly. Includes feature importance and actionable business insights.
Predict medical insurance claim amounts using regression modeling, interactive EDA, and feature interpretation. Built with Python, Scikit-learn, and Plotly. Final project for DevelopersHub Corporation.