0% found this document useful (0 votes)
16 views2 pages

Task 8

The document outlines a procedure for building a predictive model using machine learning in R or Python, specifically employing the predict() function on new datasets. It includes steps for loading libraries, training data, and making predictions based on employee experience to estimate salaries. The example demonstrates a successful linear regression model that predicts salaries for new employees based on their years of experience.

Uploaded by

John Mesia Dhas
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
16 views2 pages

Task 8

The document outlines a procedure for building a predictive model using machine learning in R or Python, specifically employing the predict() function on new datasets. It includes steps for loading libraries, training data, and making predictions based on employee experience to estimate salaries. The example demonstrates a successful linear regression model that predicts salaries for new employees based on their years of experience.

Uploaded by

John Mesia Dhas
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 2

Task 8: Use the predict () function to make predictions from that model on new data.

The
new dataset must have all of the columns from the training data, but they can be in a
different order with different values
Tools: RStudio, Python

Aim
To build a predictive model using machine learning, train it on a given dataset, and use the
predict() function to make predictions on a new dataset with the same feature columns but
different values.

Procedure
1. Load the necessary libraries in R or Python.
2. Load the training dataset and preprocess it if required.
3. Train a regression or classification model using the training dataset.
4. Load the new dataset, ensuring it has the same columns as the training dataset.
5. Use the predict() function to generate predictions for the new dataset.
6. Evaluate the predictions (optional).

Sample Dataset
We will use a dataset that contains work experience (years) as an independent variable
(X) and salary (USD) as a dependent variable (Y).

Training Data (train.csv)

Employee Experience (Years) Salary (USD)


1 1 3000
2 2 3500
3 3 4000
4 4 4500
5 5 5000
6 6 5500

New Data (new_data.csv)

Employee Experience (Years)


7 2.5
8 4.5
9 5.5
Python Implementation using Scikit-Learn
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression

# Load the training dataset


train_data = pd.DataFrame({
'Experience': [1, 2, 3, 4, 5, 6],
'Salary': [3000, 3500, 4000, 4500, 5000, 5500]
})

# Split into X (features) and y (target)


X_train = train_data[['Experience']]
y_train = train_data['Salary']

# Train a linear regression model


model = LinearRegression()
model.fit(X_train, y_train)

# Load the new dataset (without salary)


new_data = pd.DataFrame({'Experience': [2.5, 4.5, 5.5]})

# Make predictions
predictions = model.predict(new_data)

# Display predictions
new_data['Predicted Salary'] = predictions
print(new_data)

Output:
Experience Predicted Salary
0 2.5 3750.0
1 4.5 4750.0
2 5.5 5250.0

Results
The model was successfully trained using a linear regression approach.

• Using the predict() function, we estimated salaries for new employees based on
their experience.
• The predictions suggest that a linear increase in salary occurs with increasing years of
experience.

You might also like