Content 1: Python Applications for Solving Linear Programming Problems
Problem 1: Use the pulp library in Python to solve the following linear model:
Maximize: $10x_1 + 5x_2$
Subject to:
$2x_1 + x_2 \le 8$
$x_1 + 2x_2 \le 6$
$x_1, x_2 \ge 0$
!pip install pulp
#(used to install pulp library)
import pulp
#(import pulp library into program)
problem = pulp.LpProblem('Maximize', pulp.LpMaximize)
#( pulp.LpProblem: Defines a new linear programming problem.
# 'Maximize': The name of the problem, which is optional but helps identify it.
# pulp.LpMaximize: Specifies that the objective of the problem is maximization.)
x1 = pulp.LpVariable('A', lowBound=0, cat='Continuous')
x2 = pulp.LpVariable('B', lowBound=0, cat='Continuous')
#( pulp.LpVariable: Creates decision variables for the problem.
# 'A' and 'B': Names of the variables.
# lowBound=0: Sets the lower bound of the variables to 0 (non-negative constraint).
# cat='Continuous': Specifies that the variables are continuous (real numbers))
problem += 10*x1 + 5*x2
#(Define the objective function)
problem += 2*x1 + x2 <= 8
problem += x1 + 2*x2 <= 6
#(Add constraints)
problem.solve()
#(Solves the optimization problem using PuLP's default solver, which is the bundled CBC solver in a standard installation.)
print(f'x1 = {pulp.value(x1)}')
print(f'x2 = {pulp.value(x2)}')
#(pulp.value(x1) and pulp.value(x2): Retrieve the optimal values of the decision variables x1
# and x2 after solving the problem.)
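As a cross-check on the solver output, the model as coded above is small enough to verify by enumerating the corner points of the feasible region. Note that the objective 10*x1 + 5*x2 equals 5*(2*x1 + x2), so the first constraint caps it at 40 and every point on that edge is optimal; the solver may therefore report either end of the edge. The sketch below uses only the standard library, and the helper names (`intersection`, `is_feasible`) are illustrative, not part of pulp.

```python
from fractions import Fraction
from itertools import combinations

# Constraints in the form a*x1 + b*x2 <= c, matching the PuLP model as coded.
# The last two rows encode x1 >= 0 and x2 >= 0 as -x1 <= 0 and -x2 <= 0.
constraints = [(2, 1, 8), (1, 2, 6), (-1, 0, 0), (0, -1, 0)]
objective = (10, 5)

def intersection(c1, c2):
    """Return the point where two constraint boundaries meet, or None if parallel."""
    a1, b1, r1 = c1
    a2, b2, r2 = c2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    # Cramer's rule, kept exact with Fraction to avoid rounding noise.
    x = Fraction(r1 * b2 - r2 * b1, det)
    y = Fraction(a1 * r2 - a2 * r1, det)
    return (x, y)

def is_feasible(point):
    """Check a candidate point against every constraint."""
    x, y = point
    return all(a * x + b * y <= c for a, b, c in constraints)

# Corner points are the feasible intersections of pairs of boundaries.
vertices = set()
for c1, c2 in combinations(constraints, 2):
    p = intersection(c1, c2)
    if p is not None and is_feasible(p):
        vertices.add(p)

values = {v: objective[0] * v[0] + objective[1] * v[1] for v in vertices}
best = max(values.values())
optimal_vertices = sorted(v for v, val in values.items() if val == best)

print("Best objective value:", best)
print("Optimal vertices:", [(float(x), float(y)) for x, y in optimal_vertices])
```

In the PuLP script itself, `pulp.LpStatus[problem.status]` and `pulp.value(problem.objective)` can be printed after `problem.solve()` to confirm that the solver reports an optimal status and the same objective value.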
Problem 2: Build the following linear model and proceed to solve it. A trading company
needs to import goods from two suppliers, A and B, with the following information:
Supplier A:
- Profit per unit of goods: $5
- Import cost per unit of goods: $20
Supplier B:
- Profit per unit of goods: $4
- Import cost per unit of goods: $15
Constraints:
- The total cost of importing goods must not exceed $300.
- The quantity of goods imported from each supplier must not be negative.
Goal:
- Maximize total profit from imports.
!pip install pulp
import pulp
problem2 = pulp.LpProblem('Maximize_Profit', pulp.LpMaximize)
x = pulp.LpVariable('A', lowBound=0, cat='Continuous')
y = pulp.LpVariable('B', lowBound=0, cat='Continuous')
problem2 += 5*x + 4*y
problem2 += 20*x + 15*y <= 300
problem2.solve()
print(f'The quantity of goods imported from supplier A = {pulp.value(x)}')
print(f'The quantity of goods imported from supplier B = {pulp.value(y)}')
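Because this model has a single budget constraint and non-negative variables, its corner points are simply "buy nothing", "spend everything on A", and "spend everything on B". The sketch below checks the last two by hand, assuming (as the code above does) that the costs and profits are per unit; the dictionary layout and function name are illustrative only.

```python
budget = 300
suppliers = {
    "A": {"profit": 5, "cost": 20},
    "B": {"profit": 4, "cost": 15},
}

def profit_if_all_budget_on(name):
    """Units bought and total profit when the whole budget goes to one supplier."""
    s = suppliers[name]
    units = budget / s["cost"]
    return units, units * s["profit"]

results = {name: profit_if_all_budget_on(name) for name in suppliers}
best_supplier = max(results, key=lambda name: results[name][1])

for name, (units, profit) in results.items():
    print(f"All budget on {name}: {units} units, profit ${profit}")
print("Best choice:", best_supplier)
```

Supplier B earns more profit per dollar of import cost (4/15 versus 5/20), so the solver is expected to return 0 units from A and 20 units from B, for a profit of $80. If goods can only be bought in whole units, the variables could be declared with `cat='Integer'` instead of `cat='Continuous'`.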
Content 2: Python Applications in Machine Learning and Statistics
Problem 1: Use the ydata-profiling library to generate a descriptive report on the
Ecommerce Customers dataset
!pip install ydata-profiling
#(Installs the ydata-profiling library, which is a tool designed to generate automated descriptive
# reports for datasets.)
import pandas as pd
from ydata_profiling import ProfileReport
#(ProfileReport: A class from the ydata-profiling library that creates a detailed profiling report
# for a given dataset)
df = pd.read_csv('/content/gdrive/MyDrive/Python Lesson/Ecommerce Customers')
#( Reads the dataset from a CSV file and stores it in a Pandas DataFrame called df)
profile = ProfileReport(df, title="Ecommerce Customers Report", explorative=True)
#( Purpose: Creates a profiling report for the DataFrame df and stores it in the profile
# variable.
# Parameters:
# df: The DataFrame to analyze.
# title: The title of the report, which appears at the top of the HTML file.
# explorative=True: Enables more detailed visualizations, such as correlation matrices
# and advanced statistics.)
profile.to_file("ecommerce_customers_report.html")
#(Saves the profiling report as an HTML file so you can view it in a web browser.)
Problem 2: Using python, apply a linear regression model to perform the following tasks:
- Perform dataframe splitting
  - Dataframe df_y only has the 'Yearly Amount Spent' column
  - Dataframe df_x has columns 'Avg. Session Length', 'Time on App', 'Time on
    Website', 'Length of Membership'
- Import train_test_split from sklearn.model_selection to separate the train and test sets.
Separate df_x, df_y into train and test sets with a ratio of 0.3 and 0.7. Name as X_train,
X_test, y_train, y_test
- Train the model and print out the coefficients of the independent variables and the R² score
of the fitted model
- Predict and compare with the test set to evaluate the model (e.g. MAE, MSE, etc.).
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
#(pandas: For data manipulation and analysis, used to structure and manipulate the dataset.
# LinearRegression: A machine learning model for performing linear regression.
# train_test_split: A function to split the dataset into training and testing sets.
# mean_absolute_error, mean_squared_error, and r2_score: Metrics to evaluate the
# performance of the regression model.)
df = pd.read_csv("/content/gdrive/MyDrive/Python Lesson/Ecommerce Customers")
#(Reads the dataset file into a Pandas DataFrame called df.)
df_y = df[['Yearly Amount Spent']]
df_x = df[['Avg. Session Length', 'Time on App', 'Time on Website', 'Length of Membership']]
#(df_y: Extracts the dependent variable (Yearly Amount Spent) for the regression model.
# df_x: Contains independent variables (Avg. Session Length, Time on App, Time on Website,
# and Length of Membership) that are used to predict df_y.)
X_train, X_test, y_train, y_test = train_test_split(df_x, df_y, test_size=0.3)
#(Splits df_x and df_y into training and testing sets.
# test_size=0.3: Reserves 30% of the data for testing and uses 70% for training.)
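To make the split step concrete, here is a minimal sketch of a shuffled 70/30 split written with the standard library only. It is a simplified stand-in, not scikit-learn's implementation: the function name, the `seed` parameter, and the rounding rule are assumptions made for illustration.

```python
import random

def simple_train_test_split(rows, test_size=0.3, seed=0):
    """Shuffle a copy of rows and cut it into (train, test)."""
    rng = random.Random(seed)          # fixed seed makes the split repeatable
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(test_size * len(shuffled))
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(10))                 # stand-in for 10 row indices
train, test = simple_train_test_split(rows, test_size=0.3, seed=0)
print("train size:", len(train), "test size:", len(test))
```

In the actual call, `train_test_split` draws a different random split on every run unless a `random_state` value is passed, so the printed metrics will vary slightly between runs.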
model = LinearRegression()
model.fit(X_train, y_train)
#(LinearRegression(): Creates a linear regression model.
# .fit(): Trains the model using the training dataset (X_train, y_train))
print('Intercept:', model.intercept_)
print('Coefficients:', model.coef_)
print('R² Score (Train):', model.score(X_train, y_train))
#(model.intercept_: The y-intercept of the regression equation.
# model.coef_: The coefficients for the independent variables, representing their impact on the
# dependent variable.
# model.score(X_train, y_train): Calculates the R² score on the training data, a measure of how
# well the model explains the variance in the data.)
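To show what the intercept and coefficient represent, the sketch below fits a one-variable least-squares line by hand. It is a simplified illustration with made-up numbers, not the Ecommerce Customers data, and the function name is hypothetical; `LinearRegression` solves the same least-squares problem for several variables at once.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Illustrative values only: y is exactly 1 + 2*x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_simple_linear_regression(xs, ys)
print("Intercept:", intercept)
print("Coefficient:", slope)
```

Each coefficient is read as the expected change in the dependent variable for a one-unit increase in that feature, holding the other features fixed.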
y_pred = model.predict(X_test)
#(y_pred: Predictions made by the model on the test dataset (X_test))
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
#(Evaluation Metrics)
print('Mean Absolute Error (MAE):', mae)
print('Mean Squared Error (MSE):', mse)
print('R² Score (Test):', r2)
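For reference, the three metrics printed above can be computed by hand as follows. This is a minimal sketch on made-up numbers; the function names are illustrative and are not the scikit-learn functions.

```python
def mae(y_true, y_pred):
    """Mean absolute error: average size of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error: average squared prediction error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """R squared: 1 minus residual sum of squares over total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Illustrative values only.
y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]
print("MAE:", mae(y_true, y_pred))
print("MSE:", mse(y_true, y_pred))
print("R²:", r2(y_true, y_pred))
```

MAE is in the same units as the target (dollars spent per year here), while MSE is in squared units and penalizes large errors more heavily. An R² on the test set that is close to the training R² suggests the model is not overfitting.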
Suggest another machine learning model and apply it to the Ecommerce Customers
dataset.
Selected model: Support Vector Regression (SVR)
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
# (Import libraries)
df = pd.read_csv("/content/gdrive/MyDrive/Python Lesson/Ecommerce Customers")
# (Load the dataset)
df_y = df[['Yearly Amount Spent']]
df_X = df[['Avg. Session Length', 'Time on App', 'Time on Website', 'Length of Membership']]
# (Split the DataFrame into independent (X) and dependent (y) variables)
X_train, X_test, y_train, y_test = train_test_split(df_X, df_y, test_size=0.3)
# (Split the dataset into train and test sets)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
#( Scale the features)
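The scaling step matters for SVR because the RBF kernel depends on distances between feature vectors, so features on larger scales would otherwise dominate. The sketch below shows the idea for a single feature using only the standard library: the mean and standard deviation are learned from the training values and then reused on the test values. The helper names and numbers are illustrative, not part of scikit-learn.

```python
from statistics import mean, pstdev

def fit_scaler(values):
    """Learn the mean and population standard deviation from training values."""
    return mean(values), pstdev(values)

def apply_scaler(values, mu, sigma):
    """Standardize values with previously learned statistics: z = (x - mu) / sigma."""
    return [(v - mu) / sigma for v in values]

# Illustrative values only.
train_values = [10.0, 20.0, 30.0]
test_values = [20.0, 40.0]

mu, sigma = fit_scaler(train_values)            # statistics come from train only
train_scaled = apply_scaler(train_values, mu, sigma)
test_scaled = apply_scaler(test_values, mu, sigma)

print("train scaled:", train_scaled)
print("test scaled:", test_scaled)
```

This mirrors why the script calls `fit_transform` on `X_train` but only `transform` on `X_test`: fitting the scaler on the test set would leak information about it into the model.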
svr_model = SVR(kernel='rbf')
svr_model.fit(X_train_scaled, y_train.values.ravel())
# (Train the SVR model)
y_pred = svr_model.predict(X_test_scaled)
#( Make predictions)
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
# (Evaluate the model)
print('Mean Absolute Error (MAE):', mae)
print('Mean Squared Error (MSE):', mse)
print('R² Score:', r2)
#( Print the results)