Car Fuel Efficiency Prediction

The document outlines a project to predict car fuel efficiency using Polynomial Regression based on engine size, utilizing a dataset from Kaggle. It details the steps of loading the dataset, visualizing relationships, implementing both Polynomial and Simple Linear Regression models, and evaluating their performance through Mean Squared Error and R² scores. The results indicate that Polynomial Regression (degree=3) outperforms Simple Linear Regression in predictive accuracy due to its ability to capture nonlinear relationships.

Uploaded by

mcanarender
Copyright: © All Rights Reserved

Predicting Car Fuel Efficiency

Objective: Use Polynomial Regression to predict car fuel efficiency based on engine size.
Dataset: https://www.kaggle.com/uciml/autompg-dataset
Tasks:
1. Load and explore the dataset.
2. Create scatter plots to visualize the relationships between engine size and fuel efficiency.
3. Implement Polynomial Regression (e.g., degree=3) to predict fuel efficiency.
4. Evaluate and compare the performance with a Simple Linear Regression model.
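
For a single feature (engine displacement), the degree-3 model in task 3 can be written as the cubic below. The symbols are assumptions introduced here for illustration: x is displacement, \hat{y} is the predicted mpg, and \beta_0 through \beta_3 are the fitted coefficients. The model stays linear in the coefficients, which is why an ordinary least-squares fit on the expanded features 1, x, x^2, x^3 is enough to estimate it.

```latex
\hat{y} = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3
```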

# Import necessary libraries


import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Step 1: Load and explore the dataset


file_path = "/content/drive/MyDrive/nkphd/auto-mpg.csv"
df = pd.read_csv(file_path)

# Display basic information about the dataset


print("Dataset Overview:")
print(df.head())
print("\nSummary Statistics:")
print(df.describe())

# Check for missing values


print("\nMissing Values:")
print(df.isnull().sum())

# Drop rows with missing values


df.dropna(inplace=True)

# Step 2: Scatter plot of engine size vs. fuel efficiency


plt.figure(figsize=(8, 6))
plt.scatter(df['displacement'], df['mpg'], color='blue', alpha=0.6)
plt.title("Engine Size vs. Fuel Efficiency")
plt.xlabel("Engine Size (Displacement)")
plt.ylabel("Fuel Efficiency (MPG)")
plt.grid()
plt.show()
# Step 3: Polynomial Regression
# Define features (engine size) and target (mpg)
X = df[['displacement']]
y = df['mpg']

# Split the data into training and testing sets


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create polynomial features of degree 3


poly = PolynomialFeatures(degree=3)
X_train_poly = poly.fit_transform(X_train)
X_test_poly = poly.transform(X_test)

# Train the polynomial regression model


poly_model = LinearRegression()
poly_model.fit(X_train_poly, y_train)

# Predict using the polynomial regression model


y_pred_poly = poly_model.predict(X_test_poly)

# Step 4: Simple Linear Regression


# Train the simple linear regression model
linear_model = LinearRegression()
linear_model.fit(X_train, y_train)

# Predict using the simple linear regression model


y_pred_linear = linear_model.predict(X_test)

# Evaluate the models


mse_poly = mean_squared_error(y_test, y_pred_poly)
r2_poly = r2_score(y_test, y_pred_poly)

mse_linear = mean_squared_error(y_test, y_pred_linear)
r2_linear = r2_score(y_test, y_pred_linear)

print("\nModel Performance:")
print(f"Polynomial Regression (degree=3) - MSE: {mse_poly:.2f}, R²: {r2_poly:.2f}")
print(f"Simple Linear Regression - MSE: {mse_linear:.2f}, R²: {r2_linear:.2f}")

# Visualize the Polynomial Regression fit


plt.figure(figsize=(8, 6))
plt.scatter(X, y, color='blue', alpha=0.6, label="Actual")
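# Sort by displacement so the fitted curves are drawn left to right;
# keeping a DataFrame preserves the feature name the models were fitted with
X_sorted = X.sort_values(by='displacement')
plt.plot(X_sorted['displacement'], poly_model.predict(poly.transform(X_sorted)), color='red', label="Polynomial Regression (degree=3)")
plt.plot(X_sorted['displacement'], linear_model.predict(X_sorted), color='green', linestyle='--', label="Simple Linear Regression")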
plt.title("Model Comparison")
plt.xlabel("Engine Size (Displacement)")
plt.ylabel("Fuel Efficiency (MPG)")
plt.legend()
plt.grid()
plt.show()
Performance Evaluation
The Mean Squared Error (MSE) and R² score are used to compare both models:

• Polynomial Regression (degree=3) provides a lower MSE and a higher R² score on the 20% test split, indicating a better fit and improved predictive accuracy.
• Simple Linear Regression, due to its linear nature, has a higher MSE and a lower R² score, meaning it cannot effectively capture the nonlinear relationship between engine size and fuel efficiency.
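
To make the two metrics concrete, here is a minimal sketch that computes MSE and R² by hand with NumPy. The four mpg values and predictions are made up for illustration and are not taken from the dataset; the helper names `mse` and `r2` are likewise hypothetical. The formulas are the standard definitions that `mean_squared_error` and `r2_score` implement.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def r2(y_true, y_pred):
    """R^2: 1 minus (residual sum of squares / total sum of squares)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical mpg values and predictions (illustration only)
y_true = [18.0, 25.0, 30.0, 15.0]
y_pred = [20.0, 24.0, 28.0, 16.0]

mse_value = mse(y_true, y_pred)
r2_value = r2(y_true, y_pred)
print(f"MSE: {mse_value:.2f}, R^2: {r2_value:.2f}")  # MSE: 2.50, R^2: 0.93
```

A lower MSE means smaller squared errors in mpg units squared, while an R² closer to 1 means the model explains more of the variance in the target.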

Comparison

• Linear Regression assumes a straight-line relationship, leading to underfitting in cases where the relationship is nonlinear.
• Polynomial Regression captures the curvature in the data, fitting more accurately but at the cost of increased model complexity.
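
The trade-off between fit and complexity can be sketched on synthetic data. Everything in this example is an assumption made for illustration: the data is generated, not loaded from the Auto MPG file, the 1/x shape and noise level are arbitrary, and `np.polyfit` stands in for the scikit-learn `PolynomialFeatures` plus `LinearRegression` combination so that the snippet depends only on NumPy.

```python
import numpy as np

# Synthetic stand-in for the displacement/mpg relationship (an assumption,
# not the Auto MPG data): mpg falls off roughly like 1/displacement.
rng = np.random.default_rng(42)
displacement = rng.uniform(70, 455, size=300)
mpg = 10 + 2500 / displacement + rng.normal(0, 2, size=300)

# Standardize x so that higher powers stay numerically well conditioned.
x = (displacement - displacement.mean()) / displacement.std()

# Manual 80/20 split, mirroring test_size=0.2 in the main script.
idx = rng.permutation(len(x))
train, test = idx[:240], idx[240:]

def fit_and_score(degree):
    """Fit a polynomial of the given degree on the training rows and
    return (train MSE, test MSE)."""
    coefs = np.polyfit(x[train], mpg[train], deg=degree)
    train_mse = np.mean((mpg[train] - np.polyval(coefs, x[train])) ** 2)
    test_mse = np.mean((mpg[test] - np.polyval(coefs, x[test])) ** 2)
    return float(train_mse), float(test_mse)

scores = {degree: fit_and_score(degree) for degree in (1, 3, 10)}
for degree, (train_mse, test_mse) in scores.items():
    print(f"degree={degree:2d}  train MSE={train_mse:6.2f}  test MSE={test_mse:6.2f}")
```

Training MSE can only stay the same or fall as the degree rises, because each higher-degree model contains the lower-degree ones. Test MSE carries no such guarantee, which is why the comparison should be made on held-out data.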

Conclusion

Polynomial Regression (degree=3) predicts fuel efficiency better than Simple Linear Regression. Its lower MSE and higher R² score on the test set indicate superior accuracy on this dataset.
