
Bachelor of Technology: Prediction of Used Car Prices Using Artificial Neural Networks and Machine Learning


An Industry Oriented Mini Project Report

On
PREDICTION OF USED CAR PRICES USING ARTIFICIAL NEURAL
NETWORKS AND MACHINE LEARNING
Submitted in partial fulfillment of the Academic
Requirement for the Award of Degree of

BACHELOR OF TECHNOLOGY
in
COMPUTER SCIENCE AND ENGINEERING
Submitted By

S.NISHANTH REDDY 20R01A0553

Under the esteemed guidance of

Mr. U. Veeresh
(Assistant Professor, Dept of CSE)

CMR INSTITUTE OF TECHNOLOGY


(UGC AUTONOMOUS)
Approved by AICTE, Permanent Affiliation to JNTUH, Accredited by NBA and NAAC

Kandlakoya(V), Medchal Dist – 501401


www.cmrithyderabad.edu.in
2023 - 24
CMR INSTITUTE OF TECHNOLOGY
(UGC AUTONOMOUS)

(Approved by AICTE, Affiliated to JNTU, Kukatpally, Hyderabad)


Kandlakoya, Medchal Road, Hyderabad

Department of Computer Science and Engineering

CERTIFICATE
This is to certify that the Mini Project entitled “Prediction of Used Car Prices Using Artificial
Neural Networks and Machine Learning” is being
Submitted by:

S. NISHANTH REDDY 20R01A0553

In partial fulfilment of the requirements for the award of the degree of B. Tech in CSE from
JNTUH, Hyderabad, and is a record of bona fide work carried out under our guidance and
supervision. The results in this project have been verified and are found to be satisfactory. The
results embodied in this work have not been submitted to any other University for the award
of any other degree or diploma.

Signature of Guide Signature of HOD

Mr. U. Veeresh Mr. A. Prakash


(Assistant Professor) (Head of Department)

Signature of External Examiner


ACKNOWLEDGEMENT

We are extremely grateful to Dr. M. Janga Reddy, Director, Dr. B. Satyanarayana,
Principal and Mr. A. Prakash, Head of Department, Department of Computer Science and
Engineering, CMR Institute of Technology for their inspiration and valuable guidance
during the entire duration of the project.

We are extremely thankful to our guide Mr. U. Veeresh, Assistant Professor, Department of
Computer Science and Engineering, CMR Institute of Technology for his constant guidance,
encouragement and moral support throughout the project.

We express our thanks to all staff members and friends for all the help and coordination
extended in bringing out this Project successfully in time.

Finally, we are very thankful to our parents and relatives who guided us directly or
indirectly towards the successful completion of the project.

S. NISHANTH REDDY 20R01A0553

ABSTRACT

The number of cars on Mauritian roads has been rising consistently by 5% during the last decade. In
2014, 173,954 cars were registered at the National Transport Authority. Thus, one Mauritian in
every six owns a car, most of which are second-hand reconditioned cars and used cars. The aim of this
study is to assess whether it is possible to predict the price of second-hand cars using artificial neural
networks. Thus, data for 200 cars from different sources was gathered and fed to four different machine
learning algorithms. We found that support vector machine regression produced slightly better results
than using a neural network or linear regression. However, some of the predicted values are quite far
away from the actual prices, especially for higher priced cars. Thus, more investigation with a larger
data set and more experimentation with different network types and structures are
required in order to obtain better predictions.

The manufacturer sets the price of a new car, with the government adding further
costs in the form of taxes. Customers purchasing a new car may thus be sure that
their investment will be worthwhile. However, due to rising new car prices and buyers' financial
inability to purchase them, used car sales are increasing globally. As a result, a used car price prediction
system that efficiently assesses the worth of a car using a range of factors is required. In the
current system, a dealer decides on a price at random and the buyer has
no knowledge of the car or its current worth. In reality, the seller also has no clear idea what the car is worth or
what price he should charge for it. To address this issue, we have devised a highly effective model.
Regression algorithms are employed because they produce a continuous value rather than a class
label as an output.

In this paper, we investigate the application of supervised machine learning techniques to predict the
price of used cars in Mauritius. The predictions are based on historical data collected from daily
newspapers. Different techniques like multiple linear regression analysis, k-nearest neighbours, Naïve
Bayes and decision trees have been used to make the predictions. The predictions are then evaluated
and compared in order to find those which provide the best performance. A seemingly easy problem
turned out to be very difficult to resolve with high accuracy. All four methods provided
comparable performance. In the future, we intend to use more sophisticated algorithms to make the
predictions.

TABLE OF CONTENTS
1. Introduction
2. Literature Survey
3. System Analysis
3.1 Existing System
3.2 Disadvantages of Existing System
3.3 Proposed System
3.4 Advantages of Proposed System
4. System Study
4.1 Feasibility Study
4.1.1 Economical Feasibility
4.1.2 Technical Feasibility
4.1.3 Social Feasibility
5. Hardware and Software Requirements
5.1 Hardware Requirements
5.1.1 Processor
5.1.2 RAM
5.1.3 Hard disk
5.2 Software Requirements
5.2.1 Operating System
5.2.2 Coding Language
6. Architecture
7. Modules
8. Diagrams
8.1 Data flow Diagrams
8.2 Class Diagrams
8.3 Sequence Diagrams
8.4 Use-case Diagrams
9. Implementation
9.1 Data set
9.2 Preprocessing
9.3 Validation
9.4 Model with best parameters
9.5 Prediction
10. Screen Shots
11. Testing
11.1 Types of Tests
11.1.1 Unit Testing
11.1.2 Integration Testing
11.1.3 Functional Testing
11.1.4 System Testing
11.2 User Training
11.3 Maintenance
11.4 Testing Strategy
12. Conclusion
13. References

LIST OF FIGURES

Figure No. Particulars

6.1 Architecture diagram
8.1 Data flow diagram
8.2 Class diagram
8.3 Sequence diagram
8.4 Use-case diagram
9.1 Dataset

1. INTRODUCTION

According to the data obtained from the National Transport Authority (2014), the number of cars
rose from 68,524 in 2003 to 173,954 in 2014, as shown in Figure 1. This is an increase of about
154%, so the 2014 figure is roughly 254% of the 2003 figure.
We can thus infer that the sale of second-hand imported (reconditioned) cars and second-hand used
cars has also increased, given that new cars represent only a very small percentage of the total
number of cars sold each year. Most individuals in Mauritius who buy new cars also want to know
about the resale value of their cars after some years so that they can sell them in the used car market.
Price prediction of second-hand cars depends on numerous factors. The most important ones are
manufacturing year, make, model, mileage, horsepower and country of origin. Some other factors are
type and amount of fuel per usage, the type of braking system, its acceleration, the interior style, its
physical state, volume of cylinders (measured in cubic centimeters), size of the car, number of doors,
weight of the car, consumer reviews, paint color and type, transmission type, whether it is a sports car,
sound system, cosmic wheels, power steering, air conditioner, GPS navigator, safety index etc. In the
Mauritian context, there are some special factors that are also usually considered such as who were the
previous owners and whether the car has had any serious accidents.

Thus, predicting the price of second-hand cars is a very laudable enterprise. In this paper, we will
assess whether neural networks can be used to accurately predict the price of second-hand cars. The
results will also be compared with other methods like linear regression and support vector regression.

This paper proceeds as follows. First, various works on neural networks and price prediction
are summarized. The methodology and data collection are then described, followed by
the results for price prediction of second-hand cars. Finally, we end the paper with a
conclusion and some ideas towards future works.

2. LITERATURE SURVEY

Various studies have been conducted in order to predict the price of used cars. Researchers regularly
anticipate product prices using past data. Pudaruth predicted the prices of used cars in Mauritius.
To predict the prices, he employed multiple linear regression, k-nearest
neighbors, Naïve Bayes, and decision tree techniques. When the prediction results from the various
techniques were compared, it was found that the prices from these methods were quite similar.
However, the decision tree technique and the Naïve Bayes approach were found to be incapable of
classifying and predicting numeric values. According to Pudaruth's research, the small sample size
does not give good prediction accuracy.

Kuiper, S. (2008) demonstrated a multivariate regression model that helps in classifying and predicting
values in numeric format. The study shows how to apply this multivariate regression model to forecast
the price of 2005 General Motors (GM) vehicles. The price prediction of cars does not require any
special knowledge, so the data available online is enough to predict prices. The author
also introduced variable selection techniques that helped in finding
which variables were more relevant for inclusion in the model.

In 2019, Pal et al. proposed a methodology for predicting used car prices using Random Forest.
The paper evaluated used car price prediction using a Kaggle data set, which gave an accuracy of 83.62%
for test data and 95% for training data. The most relevant features used for this prediction were price,
kilometer, brand, and vehicle type, identified by filtering out outliers and irrelevant features of the
data set. Being a sophisticated model, Random Forest provided good accuracy in comparison to prior
work using these data sets.

The goal of the system that Dholiya, M., et al. developed is to give the user a realistic estimate of
how much the vehicle might cost them. Based on the specifics of the automobile the user is looking
for, the system, which is a web application, may also offer the user a list of options for various car
kinds. It assists in providing the buyer or seller with useful information on which to base their decision.
This system makes predictions using the multiple linear regression algorithm, and this model was
trained using historical data that was obtained over an extended period of time. The raw data was
initially gathered using the KDD (Knowledge Discovery in Databases) process. Afterward, it
underwent preprocessing and cleaning in order to identify patterns that are valuable and then derive
some meaning from those patterns.

Richardson conducted his analysis under the presumption that automakers are more inclined to produce
cars that don't lose value quickly. He demonstrated, in particular, that hybrid cars are better equipped
to maintain their value than conventional vehicles by utilizing multiple regression analysis. This is
perhaps because there are increasing concerns about the environment and the climate, as well as
because hybrid cars use less gasoline. In this study, the significance of additional variables including age,
mileage, make, and MPG (miles per gallon) was also taken into account. All of his information was
gathered from several websites.

Listiani published another comparable study that uses Support Vector Machines (SVM) to
forecast lease car pricing. This study demonstrated that when a very large data set is available, SVM
is significantly more accurate at price prediction than multiple linear regression. SVM is also superior
at handling high dimensional data and steers clear of both under- and over-fitting problems. A genetic
algorithm is used to find the crucial features for SVM. However, the study does not demonstrate
why SVM is superior to basic multiple regression in terms of variance and mean standard deviation.

3. SYSTEM ANALYSIS

System analysis is a critical phase in the project management and development process. It
provides a solid foundation for project planning, design, development, and implementation,
helping to ensure that the final system meets the intended objectives and user needs. This
analysis aims to gather detailed information about the existing system and its disadvantages,
and about the proposed system and its advantages.

3.1 EXISTING SYSTEM

1. Ahangar et al. (2010) compared the use of neural networks with linear regression in
order to predict the stock prices of companies in Iran. They found that neural
networks had superior performance both in terms of accuracy and speed compared to
linear regression.
2. Pudaruth (2014) used four different supervised machine learning techniques, namely
KNN (k-Nearest Neighbor), Naïve Bayes, linear regression and decision trees. The best
result was obtained using KNN.
3. Bharambe and Dharmadhikari (2015) used artificial neural networks (ANN) to analyze
the stock market and predict market behavior.

3.2 DISADVANTAGES OF EXISTING SYSTEM

1. The existing methodology does not implement data preprocessing and labelling methods.

2. The existing system does not implement effective ML classifiers for predictions on the
datasets.

3.3 PROPOSED SYSTEM

In order to carry out this study, data were obtained from different car websites and from the
small adverts sections found in daily newspapers. Two hundred records were collected. A large
number of experiments were conducted in order to find the best network structure and the best
parameters for the neural network. We found that a neural network with 1 hidden layer and 2 nodes
produced the smallest mean absolute error among the various neural network structures that were
tried.

We also found that Support Vector Regression and a multilayer perceptron with
back-propagation produced slightly better predictions than linear regression, while the k-Nearest
Neighbor algorithm had the worst accuracy among these four approaches. All experiments were
performed using 10-fold cross-validation.
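
The evaluation procedure described above, 10-fold cross-validation scored by mean absolute error, can be sketched in plain Python. This is a minimal illustration under stated assumptions and does not reproduce the study's experiments: the 200 records are synthetic, only one feature (age) is used, and the function names (`cross_validated_mae`, `knn_predict`, `mean_predict`) are hypothetical. A simple k-nearest-neighbour regressor is compared against a baseline that always predicts the mean price.

```python
import random

def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def knn_predict(train_x, train_y, x, k=3):
    """Average the prices of the k training cars closest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k

def mean_predict(train_x, train_y, x):
    """Baseline: always predict the mean training price."""
    return sum(train_y) / len(train_y)

def cross_validated_mae(xs, ys, predict, folds=10):
    """Average the MAE over `folds` held-out folds."""
    indices = list(range(len(xs)))
    scores = []
    for f in range(folds):
        test_idx = indices[f::folds]              # every folds-th record
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        preds = [predict(train_x, train_y, xs[i]) for i in test_idx]
        scores.append(mean_absolute_error([ys[i] for i in test_idx], preds))
    return sum(scores) / folds

# Synthetic stand-in for 200 records: price falls with age, plus noise.
rng = random.Random(0)
ages = [rng.uniform(0, 15) for _ in range(200)]
prices = [20000 - 1000 * a + rng.gauss(0, 500) for a in ages]

mae_knn = cross_validated_mae(ages, prices, knn_predict)
mae_mean = cross_validated_mae(ages, prices, mean_predict)
print(f"10-fold MAE  kNN: {mae_knn:.0f}   mean baseline: {mae_mean:.0f}")
```

Each record is held out exactly once, so every prediction is made by a model that never saw that record during training.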

3.4 ADVANTAGES OF PROPOSED SYSTEM

1. The system uses linear regression and support vector regression, which are more effective in
terms of training and testing accuracy.

2. In this work, the system will assess whether neural networks can be used to accurately
predict the price of used cars.

4. SYSTEM STUDY

System Study is the process of collecting and interpreting facts, identifying problems, and
decomposing a system into its components. It is a problem-solving technique that improves the
system and ensures that all the components of the system work efficiently to accomplish their purpose.
For this project, we gathered used car records and divided them into training and testing sets. The training set
is used to train the machine learning models and, based on this training, the models predict the
price for the cars in the testing set.

4.1 FEASIBILITY STUDY

The feasibility of the project is analyzed in this phase and a business proposal is put forth with a very
general plan for the project and some cost estimates. During system analysis the feasibility study of
the proposed system is to be carried out. This is to ensure that the proposed system is not a burden to
the company. For feasibility analysis, some understanding of the major requirements for the system is
essential.

Three key considerations involved in the feasibility analysis are:

• Economical Feasibility
• Technical Feasibility
• Social Feasibility

4.1.1 ECONOMICAL FEASIBILITY

This study is carried out to check the economic impact that the system will have on the organization.
The amount of funds that the company can pour into the research and development of the system is
limited. The expenditures must be justified. Thus the developed system was kept well within the budget, and
this was achieved because most of the technologies used are freely available. Only the customized
products had to be purchased.

4.1.2 TECHNICAL FEASIBILITY

This study is carried out to check the technical feasibility, that is, the technical requirements of the
system. Any system developed must not place a high demand on the available technical resources,
as this would in turn lead to high demands being
placed on the client. The developed system must have modest requirements, as only minimal or no
changes are required for implementing this system.

4.1.3 SOCIAL FEASIBILITY

This aspect of the study checks the level of acceptance of the system by the user. This includes the process
of training the user to use the system efficiently. The user must not feel threatened by the system,
but must instead accept it as a necessity. The level of acceptance by the users solely depends on the methods
that are employed to educate the user about the system and to make him familiar with it. His level of
confidence must be raised so that he is also able to make some constructive criticism, which is
welcomed, as he is the final user of the system.

5. HARDWARE AND SOFTWARE REQUIREMENTS

The hardware and software requirements outlined for the proposed system are pivotal elements that
warrant careful consideration to ensure that the system functions well and performs
efficiently. The hardware and software requirements listed below are those
recommended for the successful working and execution of the proposed system.

5.1 HARDWARE REQUIREMENTS

The hardware requirements have been defined to strike a balance between
accessibility and computational power.

5.1.1 Processor

The processor provides the instructions and processing power the computer needs to do its work. The more powerful
and up to date the processor, the faster the computer can complete its tasks.
The processor used for this project is an Intel Core i5 / AMD Ryzen 5.

5.1.2 RAM

RAM provides the shorter-term memory the CPU needs to open files and move data around as it
responds to the tasks given to it by your apps. The RAM and the CPU work together
to ensure that your computer's performance fits your needs and you have a good
experience when using your device. The RAM used for this project is 8 GB minimum.

5.1.3 Hard disk

A hard drive is the hardware component that stores all of your digital content. Your documents,
pictures, music, videos, programs, application preferences, and operating system represent digital
content stored on a hard drive. A hard disk of at least 500 GB is recommended.
The hard disk space used for this project is 10 GB.

5.2 SOFTWARE REQUIREMENTS

A software requirements specification (SRS) is a comprehensive description of
the intended purpose and environment for software under development.

5.2.1 Operating system

The operating system (OS) manages all of the software and hardware on the computer. It performs
basic tasks such as file, memory and process management, handling input and output, and controlling
peripheral devices such as disk drives and printers. The operating system we have used is Windows 10.

5.2.2 Coding language

The programming language used for this project is Python. It brings an exceptional amount of power
and versatility to machine learning environments. The language's simple syntax simplifies data
validation and streamlines the scraping, processing, refining, cleaning, arranging and analyzing
processes, thereby making collaboration with other programmers less of an obstacle.
The version of Python we have used is 3.7.0.

Python is a high-level, interpreted, interactive and object-oriented scripting language. Python is
designed to be highly readable. It uses English words frequently whereas other languages use
punctuation, and it has fewer syntactic constructions than other languages.

1. Python is Interpreted: Python is processed at runtime by the interpreter. You do not need to compile
your program before executing it. This is similar to Perl and PHP.
2. Python is Interactive: You can actually sit at a Python prompt and interact with the interpreter directly
to write your programs.
3. Python is Object-Oriented: Python supports the object-oriented style or technique of programming
that encapsulates code within objects.

6. ARCHITECTURE

Architecture typically refers to the structural and organizational framework or design of a system,
software, building, or any complex project. The architecture provides a high-level overview of how the
project is structured and how its various components or elements interact with each other. It outlines
the fundamental design principles, components, and their relationships, and helps stakeholders
understand the project's overall framework.

Fig 6.1 Architecture Diagram

SERVICE PROVIDER:

In this module, the Service Provider has to log in using a valid user name and password. After a
successful login, he can perform operations such as:
1. Login
2. Train & Test Used Car Data Sets
3. View Trained Accuracy in Bar Chart
4. View Trained Accuracy Results
5. View Used Car Prices Type
6. Find Used Car Prices Type Ratio
7. Download Predicted Datasets
8. View Used Car Prices Type Ratio Results
9. View All Remote Users.

WEB SERVER:

A web server is a dedicated computer, located somewhere on the Internet, that is responsible for running websites.
The term also covers the specialized programs that deliver web pages as requested by
the user. The primary objective of any web server is to collect, process and provide web pages to the
users.

VIEW AND AUTHORIZE USERS:

In this module, the admin can view the list of all registered users. The admin can view the
users' details such as:
1. User name
2. Email
3. Address
The admin also authorizes the users.

REMOTE USER:

In this module, there are n users present. A user should register before doing any
operations. Once a user registers, their details are stored in the database. After successful
registration, the user has to log in using an authorized user name and password. Once login is successful, the user
can perform operations such as:

1. REGISTER AND LOGIN
2. PREDICT USED CAR PRICE TYPE
3. VIEW YOUR PROFILE.

WEB DATABASE:

A Web database is a database application designed to be managed and accessed through the Internet.
Website operators can manage this collection of data and present analytical results based on the data
in the Web database application.

7. MODULES

This project involves three main modules:

7.1 SERVICE PROVIDER

In this module, the Service Provider has to log in using a valid user name and password. After a
successful login, he can perform operations such as:
1. Login
2. Train & Test Used Car Data Sets
3. View Trained Accuracy in Bar Chart
4. View Trained Accuracy Results
5. View Used Car Prices Type
6. Find Used Car Prices Type Ratio
7. Download Predicted Datasets
8. View Used Car Prices Type Ratio Results
9. View All Remote Users.

7.2 VIEW AND AUTHORIZE USERS

In this module, the admin can view the list of all registered users. The admin can view the
users' details such as:
1. User name
2. Email
3. Address
The admin also authorizes the users.

7.3 REMOTE USER

In this module, there are n users present. A user should register before doing any
operations. Once a user registers, their details are stored in the database. After successful
registration, the user has to log in using an authorized user name and password. Once login is successful, the user
can perform operations such as:

1. REGISTER AND LOGIN
2. PREDICT USED CAR PRICE TYPE
3. VIEW YOUR PROFILE.

8. DIAGRAMS
8.1 DATA FLOW DIAGRAM

Fig 8.1 Data Flow Diagram

A Data Flow Diagram (DFD) is a visual representation that illustrates how data moves within a system.
In the context of used car price prediction, the primary
elements would include data sources, data processing components, and data destinations.

Data Flow Diagram Components:

The Data Flow Diagram has 4 components:

• Process: The transformation of input to output in a system takes place because of a process function.
The symbol of a process is a rectangle with rounded corners, an oval, a rectangle or a circle. The
process is named with a single word, a phrase or a short sentence that expresses its essence.
• Data Flow: Data flow describes the information transferred between different parts of the
system. The arrow is the symbol of data flow. A meaningful name should be given to the
flow to identify the information which is being moved. Data flow can also represent material
that is moved along with information. Material flows are modelled in systems that are not
merely informative. A given flow should only transfer a single type of information. The
direction of flow is represented by the arrow, which can also be bi-directional.
• Warehouse: The data is stored in the warehouse for later use. Two horizontal lines represent
the symbol of the store. The warehouse is not restricted to being a data file; it can
be anything, such as a folder with documents, an optical disc or a filing cabinet. The data warehouse
can be viewed independently of its implementation. When data flows from the warehouse it is
considered data reading, and when data flows to the warehouse it is called data entry or data
updating.
• Terminator: The terminator is an external entity that stands outside of the system and
communicates with the system. It can be, for example, an organization such as a bank, a group of
people such as customers, or a different department of the same organization that is not a part of
the modelled system. Modelled systems also communicate with
terminators.

8.2 CLASS DIAGRAMS

Fig 8.2 Class Diagram

A class diagram in the Unified Modelling Language (UML) is a type of static structure diagram that
describes the structure of a system by showing the system's classes, their attributes, operations (or
methods), and the relationships among the classes. It shows which class contains which information.

Class Descriptions:

1. Shows the static structure of classifiers in a system
2. Provides a basic notation for other structure diagrams prescribed by UML
3. Helpful for developers and other team members
4. Business Analysts can use class diagrams to model systems from a business perspective

8.3 SEQUENCE DIAGRAMS

Fig 8.3 Sequence Diagram

A sequence diagram in Unified Modelling Language (UML) is a kind of interaction diagram that shows
how processes operate with one another and in what order. It is a construct of a Message Sequence
Chart. Sequence diagrams are sometimes called event diagrams, event scenarios, and timing diagrams.

Sequence Description:

• Model high-level interaction between active objects in a system
• Model the interaction between object instances within a collaboration that realizes a use case
• Model the interaction between objects within a collaboration that realizes an operation
• Either model generic interactions (showing all possible paths through the interaction) or
specific instances of an interaction (showing just one path through the interaction)

8.4 USE-CASE DIAGRAM

Fig 8.4 Use-case Diagram

A use-case diagram is used to represent the dynamic behavior of a system. It encapsulates the system's
functionality by incorporating use cases, actors, and their relationships. It models the tasks, services,
and functions required by a system/subsystem of an application. It depicts the high-level functionality
of a system and also tells how the user handles a system.

Use-case Descriptions:

1. It gathers the system's needs.
2. It depicts the external view of the system.
3. It recognizes the internal as well as external factors that influence the system.
4. It represents the interaction between the actors.

9. IMPLEMENTATION

The implementation can use the algorithms described below:

Naïve Bayes:

1. Predicts by counting how often things happen.
2. Assumes things are independent unless told otherwise.
3. Similar to guessing by counting common occurrences.
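
The counting idea behind Naïve Bayes can be sketched as follows. This is a minimal illustration for categorical features with Laplace smoothing, not the project's implementation: the toy records, the "low"/"high" price types and the function names are all hypothetical, and `n_values=2` assumes each feature takes one of two values.

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count label frequencies and per-label feature-value frequencies."""
    label_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (label, feature index) -> value counts
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(label, j)][value] += 1
    return label_counts, value_counts

def predict_naive_bayes(model, row, n_values=2):
    """Pick the label with the highest smoothed product of probabilities."""
    label_counts, value_counts = model
    total = sum(label_counts.values())
    best_label, best_score = None, 0.0
    for label, count in label_counts.items():
        score = count / total             # prior probability of the label
        for j, value in enumerate(row):
            # Laplace smoothing avoids zero probabilities for unseen values.
            score *= (value_counts[(label, j)][value] + 1) / (count + n_values)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy records: (fuel, transmission) -> price type.
rows = [("petrol", "manual"), ("petrol", "manual"), ("diesel", "auto"),
        ("diesel", "auto"), ("petrol", "auto"), ("diesel", "manual")]
labels = ["low", "low", "high", "high", "high", "low"]

model = train_naive_bayes(rows, labels)
prediction = predict_naive_bayes(model, ("diesel", "auto"))
print(prediction)
```

Multiplying the per-feature probabilities is where the independence assumption enters: each feature contributes to the score without regard to the others.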

Random Forest:

1. Teams of trees vote to make choices and many trees help decide the most popular answer.
2. Like a group making decisions by voting.

Logistic Regression:

1. Predicts yes/no situations using clues and looks at factors to decide between choices.
2. Like choosing something based on different reasons.

Support Vector Machine (SVM):

1. Draws a line to separate different groups and finds the best line to keep things apart.
2. Think of drawing a clear line between two teams.

Decision Tree Classifier:

1. Makes decisions by asking questions about data.
2. Creates a tree to make choices step by step.
3. Each choice leads to more questions or a final answer.
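
The step-by-step questioning can be illustrated with a small hand-written tree. This sketch only shows how a finished tree is traversed; the tree is not learned from data, and the feature names, thresholds and price types are hypothetical.

```python
# Internal nodes ask a question about one feature; leaves hold the answer.
tree = {
    "feature": "age_years", "threshold": 5,
    "below": {"feature": "mileage_km", "threshold": 60000,
              "below": "high", "above": "medium"},
    "above": "low",
}

def classify(node, car):
    """Follow the questions from the root until a leaf is reached."""
    while isinstance(node, dict):
        branch = "below" if car[node["feature"]] <= node["threshold"] else "above"
        node = node[branch]
    return node

young_low_mileage = classify(tree, {"age_years": 3, "mileage_km": 40000})
young_high_mileage = classify(tree, {"age_years": 3, "mileage_km": 90000})
old_car = classify(tree, {"age_years": 9, "mileage_km": 40000})
print(young_low_mileage, young_high_mileage, old_car)
```

A learning algorithm would choose the features and thresholds automatically, but the prediction step remains this simple walk from root to leaf.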

K-Nearest Neighbors (KNN):

1. Classifies by asking neighbors for help.
2. Looks at close neighbors to decide a category.
3. Imagine borrowing ideas from nearby friends.
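
The neighbour vote can be sketched in a few lines. The example below is illustrative only: the points, labels and the `knn_classify` helper are hypothetical, and squared Euclidean distance is assumed as the measure of closeness.

```python
from collections import Counter

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def knn_classify(train_points, train_labels, query, k=3):
    """Return the most common label among the k nearest training points."""
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: squared_distance(pair[0], query),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical cars described as (age in years, mileage in 10,000 km).
points = [(1, 2), (2, 3), (3, 4), (9, 15), (10, 18), (12, 20)]
labels = ["high", "high", "high", "low", "low", "low"]

newer_car = knn_classify(points, labels, (2, 2))
older_car = knn_classify(points, labels, (11, 17))
print(newer_car, older_car)
```

In practice the features would usually be scaled first, since a feature with a large numeric range otherwise dominates the distance.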

Gradient Boosting:

1. It gives a prediction model in the form of an ensemble of weak prediction models, which are
typically decision trees.
2. A gradient-boosted trees model is built in a stage-wise fashion.
3. It generalizes the other methods
by allowing optimization of an arbitrary differentiable loss function.
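
The stage-wise construction can be sketched for the simplest case. The example below assumes squared-error loss, for which the negative gradient is just the residual, and uses one-split trees (stumps) on a single feature as the weak learners. The data and the function names are hypothetical, and the sketch is not the project's implementation.

```python
def fit_stump(xs, residuals):
    """Find the threshold split of xs that best fits the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:      # keep both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]

def fit_gradient_boosting(xs, ys, n_stages=50, learning_rate=0.1):
    """Add one stump per stage, each fitted to the current residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_stages):
        # For squared loss, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        predictions = [
            p + learning_rate * (left_mean if x <= threshold else right_mean)
            for x, p in zip(xs, predictions)
        ]
    return base, stumps, learning_rate

def predict(model, x):
    """Sum the base value and the scaled contribution of every stump."""
    base, stumps, learning_rate = model
    return base + sum(
        learning_rate * (left if x <= threshold else right)
        for threshold, left, right in stumps
    )

# Hypothetical data: car age in years -> price.
ages = [1, 2, 3, 4, 6, 8, 10, 12]
prices = [19000, 17500, 16000, 15000, 12000, 9500, 7000, 5500]

model = fit_gradient_boosting(ages, prices)
print(round(predict(model, 2)), round(predict(model, 10)))
```

Swapping in a different differentiable loss only changes how the residuals line is computed, which is the generalization the third point refers to.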

9.1 DATA SET
The dataset used in this project consists of used car records. As described in Section 3.3, two hundred
records were gathered from different car websites and from the small adverts sections of daily
newspapers.

9.2 PREPROCESSING
Once the dataset has been loaded, the next step is to preprocess the data in such a way that it is
suitable for training and testing.
Data preprocessing is a technique that is used to convert raw data into a clean dataset.
The data gathered from different sources is in a raw format which is not feasible for analysis.
For tabular used car records, preprocessing typically involves cleaning missing or inconsistent values,
encoding categorical attributes and normalizing numeric attributes.
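
For tabular records such as used car listings, turning raw data into a clean dataset could look like the sketch below. The three helpers and the sample columns are hypothetical illustrations of common steps (filling missing values, min-max normalization and one-hot encoding) and are not taken from the project's code.

```python
def fill_missing(values):
    """Replace None with the mean of the known values."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]

def min_max_scale(values):
    """Rescale numeric values to the range [0, 1]."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]

def one_hot(values):
    """Turn a categorical column into one 0/1 column per category."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]

# Hypothetical raw columns with one missing mileage.
mileage = fill_missing([40000, None, 120000, 80000])
fuel = ["petrol", "diesel", "petrol", "diesel"]

scaled_mileage = min_max_scale(mileage)
encoded_fuel = one_hot(fuel)
print(scaled_mileage)
print(encoded_fuel)
```

The one-hot columns follow the alphabetical order of the categories, so here the first column stands for diesel and the second for petrol.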

9.2.1 Training and Testing Data

The next step is to split our dataset into two parts: a training set and a test set. We will train our
machine learning models on the training set, i.e. the models will try to
learn any correlations in the training set, and then we will test the models on the test set
to examine how accurately they predict. A general rule of thumb is to assign 80% of the
dataset to the training set and the remaining 20% to the test set.
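
The 80/20 split can be sketched with the standard library alone. The `train_test_split` helper below is a hypothetical stand-in written for this illustration (it is not the scikit-learn function of the same name), and the integer list stands in for real car records.

```python
import random

def train_test_split(records, test_fraction=0.2, seed=42):
    """Shuffle a copy of the records and cut it into train and test parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps the split repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

records = list(range(200))                  # stand-in for 200 car records
train, test = train_test_split(records)
print(len(train), len(test))
```

Shuffling before cutting matters when the records are stored in some order, for example by date or by price, because an unshuffled cut would give the test set a different distribution from the training set.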

9.3 VALIDATION

In this crucial phase, the model is validated using the independent test
dataset to evaluate its performance. Several essential metrics are considered to assess
the model's efficacy in making accurate predictions and its ability to balance different aspects of
performance.
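
The text does not name the metrics used. As an assumption for illustration, the sketch below computes three metrics that are common for price regression: mean absolute error (MAE), root mean squared error (RMSE) and the coefficient of determination (R²). The prices are made-up values.

```python
import math

def regression_metrics(actual, predicted):
    """Return MAE, RMSE and R^2 for paired actual and predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                    # unexplained variation
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)   # total variation
    return {"mae": mae, "rmse": rmse, "r2": 1 - ss_res / ss_tot}

# Hypothetical test-set prices and model predictions.
actual = [10000, 12000, 15000, 20000]
predicted = [11000, 12000, 14000, 18000]

metrics = regression_metrics(actual, predicted)
print(metrics)
```

RMSE penalizes large errors more heavily than MAE, so a gap between the two hints at a few badly mispredicted cars, such as the higher priced cars mentioned in the abstract.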

9.4 PREDICTION

The final prediction is the result of the activation function applied to the output layer. A sigmoid
activation function is commonly used for binary classification tasks, providing a probability estimate.
A prediction is then made by comparing that probability with a chosen threshold. For example, if the
predicted probability is greater than 0.5, the model may classify the image as malignant; if it is less
than or equal to 0.5, the model may classify it as benign.
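
The thresholding rule described above can be written out directly. This is a minimal sketch: the function names and the raw output values passed in are illustrative, and the labels simply follow the wording of this section.

```python
import math


def sigmoid(z):
    """Map a raw output-layer value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def classify(z, threshold=0.5):
    """Return (label, probability) using the thresholding rule from the text."""
    probability = sigmoid(z)
    # Strictly greater than the threshold -> positive class; otherwise negative.
    label = "malignant" if probability > threshold else "benign"
    return label, probability


for raw_output in (-2.0, 0.0, 2.0):
    label, probability = classify(raw_output)
    print(f"{raw_output:+.1f} -> {probability:.3f} -> {label}")
```

A raw output of exactly 0 gives a probability of 0.5, which the rule assigns to the negative class because the comparison is strict.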

10. SCREENSHOTS

Register page:

Login page:

Main Interface of Remote User:

View your Profile:

Predicted Used Car Type:

View Used Car Prices Type:

Used Car Price Ratio:

Service Provider:

11. TESTING

The purpose of testing is to discover errors. Testing is the process of trying to discover every
conceivable fault or weakness in a work product. It provides a way to check the functionality of
components, sub-assemblies, assemblies and/or a finished product. It is the process of exercising
software with the intent of ensuring that the software system meets its requirements and user
expectations and does not fail in an unacceptable manner. There are various types of tests. Each test
type addresses a specific testing requirement.

11.1 TYPES OF TESTS

11.1.1 Unit testing


Unit testing involves the design of test cases that validate that the internal program logic is functioning
properly, and that program inputs produce valid outputs. All decision branches and internal code flow
should be validated. It is the testing of individual software units of the application. It is done after the
completion of an individual unit and before integration. This is structural testing that relies on knowledge
of the unit's construction and is invasive. Unit tests perform basic tests at component level and test a
specific business process, application, and/or system configuration. Unit tests ensure that each unique
path of a business process performs accurately to the documented specifications and contains clearly
defined inputs and expected results.
Unit testing is usually conducted as part of a combined code and unit test phase of the software
lifecycle, although it is not uncommon for coding and unit testing to be conducted as two distinct
phases.

Test strategy and approach: Field testing will be performed manually and functional tests will be written
in detail.

Test objectives:
1. All field entries must work properly.
2. Pages must be activated from the identified link.
3. The entry screen, messages and responses must not be delayed.

Features to be tested:
1. Verify that the entries are of the correct format.
2. No duplicate entries should be allowed.
3. All links should take the user to the correct page.
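
The first two features to be tested (correct format, no duplicates) could be automated along the following lines. This is a hedged sketch only: `UserRegistry`, the email field and the pattern are hypothetical stand-ins, since the report does not describe the registration form's actual fields or code.

```python
import re
import unittest

# Deliberately simple pattern, assumed for illustration only.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


class UserRegistry:
    """Hypothetical stand-in for the registration form's backend."""

    def __init__(self):
        self._emails = set()

    def register(self, email):
        if not EMAIL_PATTERN.match(email):
            raise ValueError("invalid email format")
        key = email.lower()  # treat addresses as case-insensitive
        if key in self._emails:
            raise ValueError("duplicate entry")
        self._emails.add(key)
        return True


class RegistrationTests(unittest.TestCase):
    def setUp(self):
        self.registry = UserRegistry()

    def test_accepts_well_formed_entry(self):
        self.assertTrue(self.registry.register("user@example.com"))

    def test_rejects_malformed_entry(self):
        with self.assertRaises(ValueError):
            self.registry.register("not-an-email")

    def test_rejects_duplicate_entry(self):
        self.registry.register("user@example.com")
        with self.assertRaises(ValueError):
            self.registry.register("USER@example.com")


# Run the three test cases programmatically and keep the result object.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(RegistrationTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test method builds a fresh registry in `setUp`, so the cases stay independent of one another.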

11.1.2 Integration testing


Integration tests are designed to test integrated software components to determine if they actually run as
one program. Testing is event driven and is more concerned with the basic outcome of screens or fields.
Integration tests demonstrate that although the components were individually satisfactory, as shown by
successful unit testing, the combination of components is correct and consistent. Integration testing is
specifically aimed at exposing the problems that arise from the combination of components.
Software integration testing is the incremental integration testing of two or more integrated software
components on a single platform to produce failures caused by interface defects. The task of the
integration test is to check that components or software applications (e.g. components in a software
system or, one step up, software applications at the company level) interact without error.

Test Results: All the test cases mentioned above passed successfully. No defects encountered.

11.1.3 Functional testing


Functional tests provide systematic demonstrations that functions tested are available as specified by
the business and technical requirements, system documentation, and user manuals.

Functional testing is centered on the following items:


1. Valid Input: identified classes of valid input must be accepted.
2. Invalid Input: identified classes of invalid input must be rejected.
3. Functions: identified functions must be exercised.
4. Output: identified classes of application outputs must be exercised.
5. Systems/Procedures: interfacing systems or procedures must be invoked.

Organization and preparation of functional tests is focused on requirements, key functions, or special
test cases. In addition, systematic coverage pertaining to identifying business process flows, data fields,
predefined processes, and successive processes must be considered for testing. Before functional testing
is complete, additional tests are identified and the effective value of current tests is determined.

11.1.4 System testing


System testing ensures that the entire integrated software system meets requirements. It tests a
configuration to ensure known and predictable results. An example of system testing is the
configuration-oriented system integration test. System testing is based on process descriptions and
flows, emphasizing pre-driven process links and integration points.
Once validated, software must be combined with other system elements (e.g. hardware, people,
database). System testing verifies that all the elements are proper and that overall system function
and performance are achieved. It also tests to find discrepancies between the system and its original
objective, current specifications and system documentation.

TESTING METHODOLOGIES
The following are the Testing Methodologies:
1. Unit Testing
2. Integration Testing
3. User Acceptance Testing
4. Output Testing
5. System Testing

Unit Testing
Unit testing focuses verification effort on the smallest unit of software design, that is, the module. Unit
testing exercises specific paths in a module’s control structure to ensure complete coverage and
maximum error detection. This test focuses on each module individually, ensuring that it functions
properly as a unit; hence the name Unit Testing. During this testing, each module is tested
individually and the module interfaces are verified for consistency with the design specification. All
important processing paths are tested for the expected results. All error handling paths are also tested.

Integration Testing
Integration testing addresses the issues associated with the dual problems of verification and program
construction. After the software has been integrated, a set of high-order tests is conducted. The main
objective in this testing process is to take unit-tested modules and build a program structure that has
been dictated by design.

The following are the types of Integration Testing:

1. Top Down Integration


This method is an incremental approach to the construction of the program structure. Modules are
integrated by moving downward through the control hierarchy, beginning with the main program
module. Modules subordinate to the main program module are incorporated into the structure in
either a depth-first or breadth-first manner. In this method, the software is tested from the main module,
and individual stubs are replaced as the test proceeds downwards.
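
A small sketch may help show what replacing a stub looks like in top-down integration. All names and numbers here are hypothetical: `main_module` stands for the top-level module, `price_model_stub` for a placeholder subordinate, and `real_price_model` for the module that later takes its place.

```python
def price_model_stub(features):
    """Stub: stands in for the prediction module that is not yet integrated."""
    return 5000.0  # fixed, canned answer


def real_price_model(features):
    """Hypothetical subordinate module that replaces the stub later."""
    return max(0.0, 20000.0 - 1500.0 * features["age_years"])


def main_module(features, price_model):
    """Top-level module under test; formats whatever the subordinate returns."""
    price = price_model(features)
    return f"Estimated price: {price:.2f}"


# Step 1: test the main module first, with the stub below it.
with_stub = main_module({"age_years": 4}, price_model_stub)
print(with_stub)  # Estimated price: 5000.00

# Step 2: replace the stub with the real module and re-run the same test.
with_real = main_module({"age_years": 4}, real_price_model)
print(with_real)  # Estimated price: 14000.00
```

Passing the subordinate module in as an argument is one simple way to make the swap possible without editing the main module.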

2. Bottom-up Integration
This method begins the construction and testing with the modules at the lowest level in the program
structure. Since the modules are integrated from the bottom up, processing required for modules
subordinate to a given level is always available and the need for stubs is eliminated. The bottom up
integration strategy may be implemented with the following steps:

• The low-level modules are combined into clusters that perform a specific software
sub-function.
• A driver (i.e. the control program for testing) is written to coordinate test case input and output.
• The cluster is tested.
• Drivers are removed and clusters are combined moving upward in the program structure.
The bottom-up approach tests each module individually and then each module is integrated
with a main module and tested for functionality.
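
The bottom-up steps above can be sketched as follows. The module names (`clean_mileage`, `car_age`, `build_features`) and the test cases are hypothetical examples of low-level modules, a cluster and a driver; they are not taken from the project's code.

```python
def clean_mileage(raw):
    """Low-level module: parse a mileage string such as '45,000 km'."""
    return int(raw.replace(",", "").replace("km", "").strip())


def car_age(year, current_year):
    """Low-level module: age of the car in years."""
    return current_year - year


def build_features(raw_mileage, year, current_year):
    """Cluster: combines the low-level modules into one sub-function."""
    return {
        "mileage_km": clean_mileage(raw_mileage),
        "age_years": car_age(year, current_year),
    }


def driver():
    """Test driver: feeds test-case inputs to the cluster and checks the outputs."""
    cases = [
        (("45,000 km", 2018, 2024), {"mileage_km": 45000, "age_years": 6}),
        (("120000", 2010, 2024), {"mileage_km": 120000, "age_years": 14}),
    ]
    failures = []
    for args, expected in cases:
        actual = build_features(*args)
        if actual != expected:
            failures.append((args, expected, actual))
    return failures


print(driver())  # [] means every case passed
```

Once the cluster passes, the driver is discarded and the cluster is called by the next module up in the hierarchy.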

User Acceptance Testing
User acceptance of a system is the key factor for the success of any system. The system under
consideration is tested for user acceptance by constantly keeping in touch with the prospective system
users during development and making changes wherever required. The system developed provides
a friendly user interface that can easily be understood even by a person who is new to the system. This
is a critical phase of any project and requires significant participation by the end user. It also ensures
that the system meets the functional requirements.

Test Results: All the test cases mentioned above passed successfully. No defects encountered.

Output Testing
After performing the validation testing, the next step is output testing of the proposed system, since no
system is useful if it does not produce the required output in the specified format. The outputs generated
or displayed by the system under consideration are tested by asking the users about the format they
require. The output format is considered in two ways: on screen and in printed format.

System Test
System testing ensures that the entire integrated software system meets requirements. It tests a
configuration to ensure known and predictable results. An example of system testing is the
configuration oriented system integration test. System testing is based on process descriptions and
flows, emphasizing pre-driven process links and integration points.

White Box Testing


White Box Testing is testing in which the software tester has knowledge of the inner
workings, structure and language of the software, or at least its purpose. It is used to test
areas that cannot be reached from a black box level.

Black Box Testing
Black Box Testing is testing the software without any knowledge of the inner workings, structure or
language of the module being tested. Black box tests, like most other kinds of tests, must be written from
a definitive source document, such as a specification or requirements document. It is testing in
which the software under test is treated as a black box: you cannot “see” into it. The test provides
inputs and responds to outputs without considering how the software works.

11.2 USER TRAINING

Whenever a new system is developed, user training is required to educate users about the working of the
system so that it can be put to efficient use by those for whom the system has been primarily designed.
For this purpose the normal working of the project was demonstrated to the prospective users. Its
working is easily understandable, and since the expected users are people who have good knowledge of
computers, the system is very easy to use.

11.3 MAINTENANCE

Maintenance covers a wide range of activities, including correcting code and design errors. To reduce the
need for maintenance in the long run, we defined the user’s requirements more accurately during the
process of system development. Based on these requirements, the system has been developed to
satisfy the users' needs to the largest possible extent. As technology develops, it may be possible to
add many more features based on future requirements. The coding and design are simple and
easy to understand, which will make maintenance easier.

11.4 TESTING STRATEGY

A strategy for system testing integrates system test cases and design techniques into a well-planned
series of steps that results in the successful construction of software. The testing strategy must
incorporate test planning, test case design, test execution, and the resultant data collection and
evaluation. A strategy for software testing must accommodate low-level tests that are necessary to
verify that a small source code segment has been correctly implemented, as well as high-level tests that
validate major system functions against user requirements.
Software testing is a critical element of software quality assurance and represents the ultimate review
of specification, design and coding. Testing represents an interesting anomaly for the software. Thus, a
series of tests is performed for the proposed system before the system is ready for user acceptance
testing.

12. CONCLUSION

Prediction of used car prices depends on numerous factors. The most important ones are
manufacturing year, make, model, mileage, horsepower and country of origin. Some other factors are
type and amount of fuel per usage, the type of braking system, its acceleration, the interior style, its
physical state, volume of cylinders (measured in cubic centimeters), size of the car, number of doors,
weight of the car, consumer reviews, paint color and type, transmission type, whether it is a sports car,
sound system, cosmic wheels, power steering, air conditioner, GPS navigator, safety index etc.
In the Mauritian context, there are some special factors that are also usually considered, such as who
the previous owners were and whether the car has had any serious accidents. Predicting used car prices
is a difficult task due to the large number of features and parameters that must be examined in order to
get reliable findings. The first and most important phase is data collection and preprocessing. The
model was then defined and built in order to implement algorithms and generate results.

After executing various regression algorithms on the model, it was concluded that the Decision Tree
Algorithm was the top performer, with the greatest r2 score of 0.95, implying that it provided the most
accurate predictions, as shown by the Original v/s Prediction line graph. Aside from having the highest
r2 score, the Decision Tree also had the lowest Mean Square Error (MSE) and Root Mean Square Error
(RMSE) scores, indicating that the errors in predictions were the lowest of all and that the results
obtained were very accurate.
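
For reference, the three scores named above can be computed as in the sketch below. The function name `regression_metrics` and the numbers are illustrative only and are not the project's results.

```python
import math


def regression_metrics(actual, predicted):
    """Compute MSE, RMSE and the r2 score for paired actual/predicted values."""
    n = len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # residual sum of squares
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    mse = ss_res / n
    return {"mse": mse, "rmse": math.sqrt(mse), "r2": 1.0 - ss_res / ss_tot}


# Hypothetical values, chosen only to keep the arithmetic easy to follow.
actual = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]
metrics = regression_metrics(actual, predicted)
print(metrics)  # mse = 0.5, rmse is about 0.707, r2 = 0.9
```

An r2 score close to 1 means the predictions explain most of the variance in the actual prices, while MSE and RMSE measure the size of the errors, with RMSE expressed in the same units as the price.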

The aim of this paper was to predict the price of second-hand reconditioned and second-hand used
cars in Mauritius. The car market has been increasing steadily by around 5% for the last ten years,
showing the high demand for cars by the Mauritian population. There are hundreds of car websites in
Mauritius, but none of them provides a facility to predict the price of used cars based on their
attributes. Our dataset of 200 records was used with the cross-validation technique with ten folds.
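
As an illustration of ten-fold cross-validation on 200 records, the sketch below generates the index splits: each fold holds out 20 records for testing and trains on the other 180. The helper `k_fold_indices` is a hypothetical name, and it omits shuffling for brevity.

```python
def k_fold_indices(n_records, n_folds=10):
    """Yield (train_indices, test_indices) for each fold, without shuffling."""
    indices = list(range(n_records))
    fold_size, remainder = divmod(n_records, n_folds)
    start = 0
    for fold in range(n_folds):
        # The first `remainder` folds take one extra record each.
        stop = start + fold_size + (1 if fold < remainder else 0)
        test = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, test
        start = stop


folds = list(k_fold_indices(200, 10))
for train_idx, test_idx in folds[:2]:
    print(len(train_idx), len(test_idx))  # 180 20
```

Every record is used for testing exactly once, and the reported score is typically the average over the ten folds.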

The car make, year manufactured, paint type, transmission type, engine capacity and mileage have
been used to predict the price of second-hand cars using four different machine learning algorithms.
The average residual value was reasonably low for all four approaches. Thus, we conclude that
predicting the price of second-hand cars is a risky but feasible enterprise. This system
will be very useful to car dealers and car owners who need to assess the value of their cars. In the
future, we intend to collect more data and more features and to use a larger variety of machine learning
algorithms to do the prediction.

