Feature Engineering with Python
Updated Jan 11, 2026 - Jupyter Notebook
Content of the course "Regression and Statistical Models (52571)" at The Hebrew University of Jerusalem, in the Department of Statistics and Data Science.
The second task given to me while completing the BCG X Data Science microinternship: conduct feature engineering by selecting, manipulating and transforming raw data into features that can be used in a supervised learning model.
This file provides thorough practice in data preprocessing methods and techniques using several libraries.
There are a lot of things that need to be done to a dataset before we feed it to a machine learning model; these steps fall under data preprocessing. In this repository I have tried to explain them with some examples.
A bike-sharing company wants to understand the independent variables in its historical data and build a machine learning model of bike demand, so that it can plan its business strategy accordingly.
X Education Organization wants to identify whether a customer who registered an enquiry on its website is a potential customer, using past data to build a machine learning model.
Different types of regression
Predicts which customers are most likely to convert
This Python code shows how regression handles categorical variables using dummies. It fits a multiple regression model, shows the regression table, and performs residual analysis.
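As a rough illustration of what "using dummies" means here, the sketch below converts one categorical column into 0/1 indicator columns that a regression can consume. It is not taken from any of the listed repositories: the `dummy_encode` helper and the `season` data are hypothetical, and only the standard library is used.

```python
def dummy_encode(values, drop_first=True):
    """Turn a list of category labels into 0/1 dummy columns.

    With drop_first=True the first category (in sorted order) becomes the
    baseline and gets no column, which avoids perfect collinearity with
    the regression intercept.
    """
    categories = sorted(set(values))
    if drop_first:
        categories = categories[1:]
    # One column per remaining category: 1 where the row matches, else 0.
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}


# Hypothetical data, for illustration only.
season = ["spring", "summer", "fall", "summer", "winter"]
dummies = dummy_encode(season)
```

Rows belonging to the dropped baseline category ("fall" in this sketch) are zero in every dummy column, so each fitted coefficient is read as a difference from that baseline. In practice, `pandas.get_dummies` with `drop_first=True` does the same encoding on a DataFrame.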
Introduction to Machine Learning course - Spring 2021 - Supervised and Unsupervised Learning, KNN Classification Models, Naive-Bayes Classifier, Regression Analysis, K-Means and DBSCAN Clustering Analysis, Association Rules and PCA, Confusion Matrix, Normalization, Dummy Variables.
Predictive model that identifies the important factors (or features) affecting the demand for shared bikes
Sample programs with basic machine learning concepts
Scientific programming with the scikit-learn (sklearn) library
Working Examples of all algorithms with datasets
King County Real Estate Model
Build a model with machine learning to predict housing prices in Ames, Iowa. Top 11% in the Kaggle Housing Prices Competition.
The goal is to predict the miles per gallon (MPG) of cars from their other attributes