Bachelor of Technology: Prediction of Used Car Prices Using Artificial Neural Networks and Machine Learning
On
PREDICTION OF USED CAR PRICES USING ARTIFICIAL NEURAL
NETWORKS AND MACHINE LEARNING
Submitted in partial fulfillment of the Academic
Requirement for the Award of Degree of
BACHELOR OF TECHNOLOGY
in
COMPUTER SCIENCE AND ENGINEERING
Submitted By
Mr. U. Veeresh
(Assistant Professor, Dept of CSE)
CERTIFICATE
This is to certify that the Mini Project entitled “Cyber Security Awareness in Online
Education” is being
Submitted by:
In partial fulfilment of the requirement for the award of the degree of B. Tech in CSE to
JNTUH, Hyderabad. It is a record of bona fide work carried out under our guidance and
supervision. The results in this project have been verified and are found to be satisfactory. The
results embodied in this work have not been submitted to any other University for the award
of any other degree or diploma.
We express our thanks to all staff members and friends for all the help and coordination
extended in bringing out this Project successfully in time.
Finally, we are very much thankful to our parents and relatives who guided directly or
indirectly for successful completion of the project.
ABSTRACT
The number of cars on Mauritian roads has been rising consistently by 5% during the last decade. In
2014, 173,954 cars were registered at the National Transport Authority. Thus, one Mauritian in
every six owns a car, most of which are second-hand reconditioned cars and used cars. The aim of this
study is to assess whether it is possible to predict the price of second-hand cars using artificial neural
networks. Thus, data for 200 cars from different sources was gathered and fed to four different machine
learning algorithms. We found that support vector machine regression produced slightly better results
than using a neural network or linear regression. However, some of the predicted values are quite far
away from the actual prices, especially for higher priced cars. Thus, further investigation with a larger
data set and more experimentation with different network types and structures are
required in order to obtain better predictions.
The manufacturer sets the price of a new car in the industry, with the government incurring some
additional expenditures in the form of taxes. Customers purchasing a new car may thus be sure that
their investment will be worthwhile. However, due to rising new car prices and buyers' financial
inability to purchase them, used car sales are increasing globally. As a result, a used car price prediction
system that efficiently assesses the worth of a car from a range of factors is required. In the
existing practice, a dealer sets a price arbitrarily and the buyer has
no knowledge of the car or its current worth. In reality, the seller often has no clear idea of what the car is worth or
what price to charge for it. To address this issue, we propose a price prediction model.
Regression algorithms are employed because they produce a continuous value rather than a classified
value as an output.
In this paper, we investigate the application of supervised machine learning techniques to predict the
price of used cars in Mauritius. The predictions are based on historical data collected from daily
newspapers. Different techniques like multiple linear regression analysis, k-nearest neighbours, naïve
bayes and decision trees have been used to make the predictions. The predictions are then evaluated
and compared in order to find those which provide the best performances. A seemingly easy problem
turned out to be indeed very difficult to resolve with high accuracy. All the four methods provided
comparable performance. In the future, we intend to use more sophisticated algorithms to make the
predictions.
TABLE OF CONTENTS
1. Introduction 1
2. Literature Survey 2
3. System Analysis 4
3.1 Existing System 4
3.2 Disadvantages of Existing System 4
3.3 Proposed System 5
3.4 Advantages of Proposed System 5
4. System Study 6
4.1 Feasibility Study 6
4.1.1 Economical Feasibility 6
4.1.2 Technical Feasibility 7
4.1.3 Social feasibility 7
5. Hardware and Software requirements 8
5.1 Hardware Requirements 8
5.1.1 Processor 8
5.1.2 RAM 8
5.1.3 Hard disk 8
5.2 Software Requirements 9
5.2.1 Operating System 9
5.2.2 Coding Language 9
6. Architecture 10
7. Modules 13
8. Diagrams 15
8.1 Data flow Diagrams 15
8.2 Class Diagrams 17
8.3 Sequence Diagrams 19
8.4 Use-case Diagrams 21
9. Implementation 23
9.1 Data set 25
9.2 Preprocessing 25
9.3 Validation 26
9.4 Model with best parameters 26
9.5 Prediction 26
10. Screen Shots 27
11. Testing 31
11.1 Types of Tests 31
11.1.1 Unit Testing 31
11.1.2 Integration Testing 32
11.1.3 Functional Testing 32
11.1.4 System Testing 33
11.2 User Training 36
11.3 Maintenance 36
11.4 Testing Strategy 37
12. Conclusion 38
13. References 39
LIST OF FIGURES
1. INTRODUCTION
According to data obtained from the National Transport Authority (2014), the number of cars
increased by about 154% between 2003 (68,524) and 2014 (173,954), as shown in Figure 1.
We can thus infer that the sale of second-hand imported (reconditioned) cars and second-hand used
cars has also increased, given that new cars represent only a very small percentage of the total
number of cars sold each year. Most individuals in Mauritius who buy new cars also want to know
the resale value of their cars after some years so that they can sell them in the used car market.
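As a quick check of the growth figure, the two registration counts quoted above can be compared directly. The short sketch below separates the ratio between the two counts from the percentage increase; the variable names are illustrative only.

```python
# Registered cars in Mauritius, as quoted in the text.
cars_2003 = 68_524
cars_2014 = 173_954

# Size of the 2014 fleet expressed as a multiple of the 2003 fleet.
ratio = cars_2014 / cars_2003

# Growth relative to the 2003 figure, in percent.
pct_increase = (cars_2014 - cars_2003) / cars_2003 * 100

print(f"ratio: {ratio:.2f}x, increase: {pct_increase:.1f}%")
```

The 2014 fleet is about 2.54 times the 2003 fleet, which corresponds to an increase of roughly 154%.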
Price prediction of second-hand cars depends on numerous factors. The most important ones are
manufacturing year, make, model, mileage, horsepower and country of origin. Some other factors are
type and amount of fuel per usage, the type of braking system, its acceleration, the interior style, its
physical state, volume of cylinders (measured in cubic centimeters), size of the car, number of doors,
weight of the car, consumer reviews, paint color and type, transmission type, whether it is a sports car,
sound system, cosmic wheels, power steering, air conditioner, GPS navigator, safety index etc. In the
Mauritian context, there are some special factors that are also usually considered such as who were the
previous owners and whether the car has had any serious accidents.
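To make the list of factors more concrete, the sketch below shows one possible way of turning a car record into a numeric feature vector that a regression model could consume. The field names, the small list of makes and the reference year are hypothetical choices for illustration, not the encoding used in this project.

```python
from dataclasses import dataclass

@dataclass
class UsedCar:
    year: int            # manufacturing year
    make: str
    mileage_km: float
    horsepower: float
    transmission: str    # "manual" or "automatic"

# Hypothetical vocabulary of makes used for one-hot encoding.
MAKES = ["Toyota", "Nissan", "Honda"]

def to_features(car, current_year=2014):
    """Return [age, mileage, horsepower, is_automatic, one-hot make...]."""
    age = float(current_year - car.year)
    is_automatic = 1.0 if car.transmission == "automatic" else 0.0
    make_one_hot = [1.0 if car.make == m else 0.0 for m in MAKES]
    return [age, float(car.mileage_km), float(car.horsepower), is_automatic] + make_one_hot

example = UsedCar(year=2009, make="Nissan", mileage_km=85_000,
                  horsepower=110, transmission="automatic")
features = to_features(example)
print(features)
```

Categorical factors such as make or transmission type have no natural numeric order, which is why they are encoded as indicator values here rather than as arbitrary integers.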
Thus, predicting the price of second-hand cars is a worthwhile undertaking. In this report, we
assess whether neural networks can be used to accurately predict the price of second-hand cars. The
results are also compared with other methods such as linear regression and support vector regression.
The report proceeds as follows. First, various works on neural networks and price prediction
are summarized. The methodology and data collection are then described, followed by
the results for price prediction of second-hand cars. Finally, we end with a
conclusion and some ideas for future work.
2. LITERATURE SURVEY
Various studies have been conducted in order to predict the price of used cars. Researchers regularly
anticipate product prices using past data. Pudaruth predicted the prices of used cars in Mauritius
using multiple linear regression, k-nearest
neighbors, Naive Bayes, and decision tree techniques. When the prediction results from the various
techniques were compared, the predicted prices were found to be quite similar.
However, the decision tree technique and the Naive Bayes approach were unable to
predict numeric values directly, since they classify records into discrete classes. According to
Pudaruth's research, the small sample size limited the prediction accuracy.
Kuiper (2008) demonstrated a multivariate regression model for predicting numeric values
and showed how to apply it to forecast
the price of 2005 General Motors (GM) vehicles. Price prediction of cars does not require any
special knowledge, so the data available online is enough to predict prices. The author
also introduced variable selection techniques that helped identify
which variables were most relevant for inclusion.
In 2019, Pal et al. proposed a methodology for predicting used car prices using Random Forest.
The paper evaluated used-car price prediction on a Kaggle data set, which gave an accuracy of 83.62%
on the test data and 95% on the training data. The most relevant features used for this prediction were price,
kilometer, brand, and vehicle type, identified after filtering out outliers and irrelevant features of the
data set. Being a sophisticated model, Random Forest provided good accuracy in comparison to prior
work using these data sets.
The goal of the system that Dholiya, M., et al. developed is to give the user a realistic estimation of
how much the vehicle might cost them. Based on the specifics of the automobile the user is looking
for, the system, which is a web application, may also offer the user a list of options for various car
kinds. It assists in providing the buyer or seller with useful information on which to base their decision.
This system makes predictions using the multiple linear regression algorithm, and this model was
trained using historical data that was obtained over an extended period of time. The raw data was
initially gathered using the KDD (Knowledge Discovery in Databases) process. Afterward, it
underwent preprocessing and cleaning in order to identify patterns that are valuable and then derive
some meaning from those patterns.
Richardson conducted his analysis under the presumption that automakers are more inclined to produce
cars that don't lose value quickly. He demonstrated, in particular, that hybrid cars are better equipped
to maintain their value than conventional vehicles by utilizing multiple regression analysis. This is
perhaps because of increasing concerns about the environment and the climate, as well as
because hybrid cars use less gasoline. In this study, the significance of additional variables including age,
mileage, make, and MPG (miles per gallon) was also taken into account. All of his information was
gathered from several websites.
Listiani published a comparable study that uses Support Vector Machines (SVM) to
forecast lease car prices. This study demonstrated that when a very large data set is available, SVM
is significantly more accurate at price prediction than multiple linear regression. SVM is also better
at handling high-dimensional data and avoids both under-fitting and over-fitting problems. A genetic
algorithm is used to find important features for the SVM. However, the study does not demonstrate
why SVM is superior to basic multiple regression in terms of variance and mean standard deviation.
3. SYSTEM ANALYSIS
System analysis is a critical phase in the project management and development process. It
provides a solid foundation for project planning, design, development, and implementation,
helping to ensure that the final system meets the intended objectives and user needs. This
analysis aims to gather detailed information about the existing system and its disadvantages,
and about the proposed system and its advantages.
3.1 EXISTING SYSTEM
1. Ahangar et al. (2010) compared neural networks with linear regression in
order to predict the stock prices of companies in Iran. They found that neural
networks had superior performance in terms of both accuracy and speed compared to
linear regression.
2. Pudaruth (2014) used four different supervised machine learning techniques, namely
KNN (k-Nearest Neighbor), Naïve Bayes, linear regression and decision trees. The best
result was obtained using KNN.
3. Bharambe and Dharmadhikari (2015) used artificial neural networks (ANN) to analyze
the stock market and predict market behavior.
3.3 PROPOSED SYSTEM
In order to carry out this study, data have been obtained from different car websites and from the
small adverts sections found in daily newspapers. Two hundred records were collected. A large
number of experiments have been conducted in order to find the best network structure and the best
parameters for the neural network. We found that a neural network with 1 hidden layer and 2 nodes
produced the smallest mean absolute error among various neural network structures that were
experimented with.
However, we found that Support Vector Regression and a multilayer perceptron with back-propagation
produced slightly better predictions than linear regression, while the k-Nearest
Neighbor algorithm had the worst accuracy among these four approaches. All experiments were
performed with 10-fold cross-validation.
3.4 ADVANTAGES OF PROPOSED SYSTEM
1. Linear regression and support vector regression are included because they are effective in
terms of both training and testing accuracy.
2. In this work, the system assesses whether neural networks can be used to accurately
predict the price of used cars.
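The evaluation procedure described above (several models compared by mean absolute error under 10-fold cross-validation) can be sketched as follows. This is a minimal illustration under stated assumptions: the 200 records are synthetic, only a single feature (car age) is used, and the two models (one-feature least squares and k-nearest-neighbour regression) are simplified stand-ins written with the standard library, not the neural network or support vector regression used in the study.

```python
import random

def fit_linear(xs, ys):
    """Ordinary least squares for one feature; returns a predictor function."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def fit_knn(xs, ys, k=5):
    """k-nearest-neighbour regression: average the prices of the k closest cars."""
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

def cross_val_mae(fit, xs, ys, folds=10, seed=0):
    """Mean absolute error, averaged over `folds` held-out folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    fold_maes = []
    for f in range(folds):
        test_idx = idx[f::folds]              # every folds-th shuffled index
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        errors = [abs(model(xs[i]) - ys[i]) for i in test_idx]
        fold_maes.append(sum(errors) / len(errors))
    return sum(fold_maes) / folds

# Synthetic stand-in for the 200 collected records: price falls with age, plus noise.
rng = random.Random(42)
ages = [rng.uniform(0, 15) for _ in range(200)]
prices = [900_000 - 45_000 * a + rng.gauss(0, 40_000) for a in ages]

mae_linear = cross_val_mae(fit_linear, ages, prices)
mae_knn = cross_val_mae(fit_knn, ages, prices)
print(f"linear regression MAE: {mae_linear:,.0f}")
print(f"k-NN (k=5) MAE:        {mae_knn:,.0f}")
```

Each record is used for testing exactly once and for training in the other nine folds, so every model is scored on data it did not see during fitting.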
4. SYSTEM STUDY
System Study is the process of collecting and interpreting facts, identifying the problems, and
decomposition of a system into its components. It is a problem solving technique that improves the
system and ensures that all the components of the system work efficiently to accomplish their purpose.
For this project, we have gathered records of used cars from car websites and from the small adverts
sections of daily newspapers. The collected records are divided into training and testing sets: the
training set is used to train the machine learning models, and the trained models then predict the
price for each record in the testing set.
The feasibility of the project is analyzed in this phase and a business proposal is put forth with a very
general plan for the project and some cost estimates. During system analysis the feasibility study of
the proposed system is to be carried out. This is to ensure that the proposed system is not a burden to
the company. For feasibility analysis, some understanding of the major requirements for the system is
essential.
4.1.1 ECONOMICAL FEASIBILITY
This study is carried out to check the economic impact that the system will have on the organization.
The amount of funds that the company can pour into the research and development of the system is
limited. The expenditures must be justified. The developed system is well within the budget, and
this was achieved because most of the technologies used are freely available. Only the customized
products had to be purchased.
4.1.2 TECHNICAL FEASIBILITY
This study is carried out to check the technical feasibility, that is, the technical requirements of the
system. Any system developed must not place a high demand on the available technical resources, as this
would in turn place high demands on the client. The developed system must have modest
requirements, so that only minimal or no
changes are required for implementing it.
4.1.3 SOCIAL FEASIBILITY
This aspect of the study checks the level of acceptance of the system by the user. This includes the process
of training the user to use the system efficiently. The user must not feel threatened by the system,
but must instead accept it as a necessity. The level of acceptance by the users depends solely on the methods
that are employed to educate the user about the system and to make him familiar with it. His level of
confidence must be raised so that he is also able to make some constructive criticism, which is
welcomed, as he is the final user of the system.
5. HARDWARE AND SOFTWARE REQUIREMENTS
The hardware and software requirements outlined for the proposed system are pivotal elements that
warrant careful consideration to guarantee the system's functionality and performance.
The following hardware and software requirements are those
recommended for the successful working and execution of the proposed system.
5.1 HARDWARE REQUIREMENTS
The hardware requirements have been defined to strike a balance between
accessibility and computational power.
5.1.1 Processor
The processor provides the instructions and processing power the computer needs to do its work. The more powerful
and up to date the processor, the faster the computer can complete its tasks.
The processor used for this project is an Intel Core i5 or AMD Ryzen 5.
5.1.2 RAM
RAM provides the short-term memory the CPU needs to open files and move data around as it
responds to the tasks given to it by applications. RAM and the CPU work together
to ensure that the computer's performance fits the user's needs.
The RAM used for this project is 8 GB minimum.
5.1.3 Hard disk
A hard drive is the hardware component that stores all digital content. Documents,
pictures, music, videos, programs, application preferences, and the operating system are all digital
content stored on a hard drive. A hard disk of at least 500 GB is recommended.
The hard disk used for this project is 10 GB.
5.2 SOFTWARE REQUIREMENTS
5.2.1 Operating System
The operating system (OS) manages all of the software and hardware on the computer. It performs
basic tasks such as file, memory and process management, handling input and output, and controlling
peripheral devices such as disk drives and printers. The operating system we have used is Windows 10.
5.2.2 Coding Language
The programming language used for this project is Python. It brings power
and versatility to machine learning environments. The language's simple syntax simplifies data
validation and streamlines the scraping, processing, refining, cleaning, arranging and analyzing
processes, thereby making collaboration with other programmers less of an obstacle.
The version of Python we have used is 3.7.0.
1. Python is Interpreted: Python is processed at runtime by the interpreter. You do not need to compile
your program before executing it. This is similar to PERL and PHP.
2. Python is Interactive: You can sit at a Python prompt and interact with the interpreter directly
to write your programs.
3. Python is Object-Oriented: Python supports Object-Oriented style or technique of programming
that encapsulates code within objects.
6. ARCHITECTURE
Architecture typically refers to the structural and organizational framework or design of a system,
software, building, or any complex project. The architecture provides a high-level overview of how the
project is structured and how its various components or elements interact with each other. It outlines
the fundamental design principles, components, and their relationships, and helps stakeholders
understand the project's overall framework.
SERVICE PROVIDER:
In this module, the Service Provider has to log in using a valid user name and password. After a
successful login, the Service Provider can perform operations such as:
1. Login
2. Train & Test Used Car Data Sets
3. View Trained Accuracy in Bar Chart
4. View Trained Accuracy Results
5. View Used Car Prices Type
6. Find Used Car Prices Type Ratio
7. Download Predicted Datasets
8. View Used Car Prices Type Ratio Results
9. View All Remote Users.
WEB SERVER:
A web server is a dedicated computer, located somewhere on the Internet, that is responsible for
running websites. The server software is a specialized program that delivers web pages as requested by
the user. The primary objective of any web server is to collect, process and provide web pages to the
users.
In this module, the admin can view the list of all registered users. The admin can view the
user’s details, such as:
1. User name
2. Email
3. Address
4. Admin authorizes the users.
REMOTE USER:
In this module, there are n users. A user should register before performing any
operations. Once a user registers, their details are stored in the database. After successful
registration, the user has to log in using the authorized user name and password. Once login is
successful, the user can perform the operations available to remote users.
WEB DATABASE:
A Web database is a database application designed to be managed and accessed through the Internet.
Website operators can manage this collection of data and present analytical results based on the data
in the Web database application.
7. MODULES
This project involves some modules that can be organized into three main steps:
7.1 SERVICE PROVIDER
In this module, the Service Provider has to log in using a valid user name and password. After a
successful login, the Service Provider can perform operations such as:
1. Login
2. Train & Test Used Car Data Sets
3. View Trained Accuracy in Bar Chart
4. View Trained Accuracy Results
5. View Used Car Prices Type
6. Find Used Car Prices Type Ratio
7. Download Predicted Datasets
8. View Used Car Prices Type Ratio Results
9. View All Remote Users.
In this module, the admin can view the list of all registered users. The admin can view the
user’s details, such as:
1. User name
2. Email
3. Address
4. Admin authorizes the users.
7.3 REMOTE USER
In this module, there are n users. A user should register before performing any
operations. Once a user registers, their details are stored in the database. After successful
registration, the user has to log in using the authorized user name and password. Once login is
successful, the user can perform the operations available to remote users.
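The register, authorize and login flow described for remote users can be sketched with a small in-memory class. This is only an illustration of the control flow under stated assumptions: the class name, method names and sample user are hypothetical, a dictionary stands in for the web database, and the password is kept in plain text purely to keep the sketch short.

```python
class UserRegistry:
    """In-memory stand-in for the web database of remote users."""

    def __init__(self):
        self._users = {}

    def register(self, username, password, email, address):
        """Store a new user; the account stays unauthorized until the admin approves it."""
        if username in self._users:
            raise ValueError("user name already registered")
        self._users[username] = {
            "password": password,   # a real system would store a salted hash instead
            "email": email,
            "address": address,
            "authorized": False,
        }

    def authorize(self, username):
        """Admin action: mark a registered user as authorized."""
        self._users[username]["authorized"] = True

    def login(self, username, password):
        """Succeed only for registered, authorized users with a matching password."""
        user = self._users.get(username)
        return bool(user and user["authorized"] and user["password"] == password)

registry = UserRegistry()
registry.register("asha", "s3cret", "asha@example.com", "Hyderabad")
login_before = registry.login("asha", "s3cret")   # not yet authorized by the admin
registry.authorize("asha")
login_after = registry.login("asha", "s3cret")    # authorized, correct password
print(login_before, login_after)
```

Keeping authorization as a separate step from registration mirrors the module description, in which the admin authorizes users before they can log in.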
8. DIAGRAMS
8.1 DATA FLOW DIAGRAM
A Data Flow Diagram (DFD) is a visual representation that illustrates how data moves within a system.
In the context of used car price prediction, the primary
elements include data sources, data processing components, and data destinations.
Data Flow: Data flow describes the information transferring between different parts of the
systems. The arrow symbol is the symbol of data flow. A relatable name should be given to the
flow to determine the information which is being moved. Data flow also represents material
along with information that is being moved. Material shifts are modelled in systems that are not
merely informative. A given flow should only transfer a single type of information. The
direction of flow is represented by the arrow which can also be bi-directional.
Warehouse: The data is stored in the warehouse for later use. Two horizontal lines represent
the symbol of the store. The warehouse is not restricted to being a data file; it can
be anything, such as a folder with documents, an optical disc, or a filing cabinet. The data warehouse
can be viewed independently of its implementation. When data flows from the warehouse, it is
considered data reading, and when data flows to the warehouse, it is called data entry or data
updating.
Terminator: The Terminator is an external entity that stands outside of the system and
communicates with the system. It can be, for example, an organization such as a bank, a group of
people such as customers, or a different department of the same organization, which is not a part of
the modelled system. Modelled systems also communicate with the
terminator.
8.2 CLASS DIAGRAMS
A class diagram in the Unified Modelling Language (UML) is a type of static structure diagram that
describes the structure of a system by showing the system's classes, their attributes, operations (or
methods), and the relationships among the classes. It explains which class contains information.
Class Descriptions:
8.3 SEQUENCE DIAGRAMS
A sequence diagram in Unified Modelling Language (UML) is a kind of interaction diagram that shows
how processes operate with one another and in what order. It is a construct of a Message Sequence
Chart. Sequence diagrams are sometimes called event diagrams, event scenarios, and timing diagrams.
Sequence Description:
8.4 USE-CASE DIAGRAM
A use-case diagram is used to represent the dynamic behavior of a system. It encapsulates the system's
functionality by incorporating use cases, actors, and their relationships. It models the tasks, services,
and functions required by a system/subsystem of an application. It depicts the high-level functionality
of a system and also tells how the user handles a system.
Use-case Descriptions:
9. IMPLEMENTATION
Naïve Bayes:
Random Forest:
1. Teams of trees vote to make choices and many trees help decide the most popular answer.
2. Like a group making decisions by voting.
Logistic Regression:
1. Predicts yes/no situations using clues and looks at factors to decide between choices.
2. Like choosing something based on different reasons.
Support Vector Machine (SVM):
1. Draws a line to separate different groups and finds the best line to keep things apart.
2. Think of drawing a clear line between two teams.
K-Nearest Neighbors (KNN):
Gradient Boosting:
1. It gives a prediction model in the form of an ensemble of weak prediction models, which are
typically decision trees.
2. A gradient-boosted trees model is built in a stage-wise fashion.
3. It generalizes the other methods
by allowing optimization of an arbitrary differentiable loss function.
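The stage-wise construction described above can be illustrated with a small sketch. It assumes squared-error loss (for which the negative gradient is simply the residual), one numeric feature, and single-split decision stumps as the weak learners; the function names and the tiny price table are hypothetical and are not taken from the project code.

```python
def fit_stump(xs, residuals):
    """Best single-split regression stump on one feature (least squares).

    Requires at least two distinct values in xs.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def gradient_boost(xs, ys, n_stages=50, learning_rate=0.1):
    """Stage-wise additive model: each stump is fitted to the current residuals."""
    base = sum(ys) / len(ys)                 # stage 0: constant prediction
    stumps = []
    preds = [base] * len(xs)
    for _ in range(n_stages):
        # For squared-error loss, the negative gradient equals the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)

# Hypothetical data: car age in years and price in thousands.
ages = [1, 2, 3, 4, 5, 6, 7, 8]
prices = [800, 760, 700, 650, 560, 500, 430, 400]

model = gradient_boost(ages, prices)
mean_price = sum(prices) / len(prices)
mse_baseline = sum((y - mean_price) ** 2 for y in prices) / len(prices)
mse_boosted = sum((y - model(x)) ** 2 for x, y in zip(ages, prices)) / len(prices)
print(f"baseline MSE: {mse_baseline:.1f}, boosted MSE: {mse_boosted:.1f}")
```

Swapping in a different differentiable loss only changes how the residuals line is computed, which is the sense in which the method generalizes to arbitrary loss functions.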
9.1 DATA SET
The sample dataset is collected from the Kaggle website which includes Brain MRI images. This
dataset includes images of both patients with brain tumors(malignant) and those without
tumors(benign).
9.2 PREPROCESSING
Once the dataset has been loaded, the next step is to preprocess the data in such a way that it is
suitable for training and testing.
Data preprocessing is a technique used to convert raw data into a clean dataset.
The data gathered from different sources is in a raw format that is not suitable for analysis.
The preprocessing techniques used here are resizing the images, normalizing and augmenting the
images.
The next step is to split the dataset into two parts: a training set and a test set. We train our
machine learning models on the training set, i.e. the models try to
learn any correlations in the training set, and then we test the models on the test set
to examine how accurately they predict. A general rule of thumb is to assign 80% of the
dataset to the training set and the remaining 20% to the test set.
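The 80/20 split described above can be sketched with the standard library alone. The function name, the fixed seed and the placeholder records are assumptions made for illustration; the project code may perform the split differently.

```python
import random

def train_test_split(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and hold out `test_fraction` of them for testing."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)     # fixed seed makes the split repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Placeholder records: integer identifiers standing in for 200 rows of data.
records = list(range(200))
train_set, test_set = train_test_split(records)
print(len(train_set), len(test_set))
```

Shuffling before splitting avoids a test set that contains only the most recently collected records, and no record appears in both sets.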
9.3 VALIDATION
In this phase, the model is validated on the independent test
dataset to evaluate its performance. Several metrics are considered to assess
the model's efficacy in making accurate predictions and its ability to balance different aspects of
performance.
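The text does not name the metrics used. For a price regression task, mean absolute error (mentioned earlier in this report), root mean squared error and the coefficient of determination R² are common choices, and the sketch below assumes those three. The actual and predicted prices are made-up values in thousands.

```python
import math

def regression_metrics(actual, predicted):
    """Return MAE, RMSE and R^2 for paired actual and predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    squared_error_sum = sum(e * e for e in errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(squared_error_sum / n)
    mean_actual = sum(actual) / n
    total_variation = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1 - squared_error_sum / total_variation
    return {"mae": mae, "rmse": rmse, "r2": r2}

# Hypothetical held-out prices (in thousands) and model predictions.
actual = [500, 650, 800, 400]
predicted = [520, 600, 790, 430]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

MAE reports the typical size of an error in price units, RMSE penalizes large misses more heavily, and R² expresses how much of the variation in price the model explains.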
9.4 PREDICTION
The final prediction is a result of the activation function applied to the output layer. A sigmoid
activation function is commonly used for binary classification tasks, providing a probability estimate.
Depending on the threshold chosen, a prediction can be made. For example, if the predicted probability
is greater than 0.5, the model may classify the image as malignant; if it's less than or equal to 0.5, the
model may classify it as benign.
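The thresholding rule described above can be written out in a few lines. The function names and the sample output-layer values are illustrative assumptions; only the sigmoid function and the 0.5 threshold come from the text.

```python
import math

def sigmoid(z):
    """Map a raw output-layer value to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def classify(z, threshold=0.5):
    """Apply the decision rule: strictly above the threshold is the positive class."""
    return "malignant" if sigmoid(z) > threshold else "benign"

# Hypothetical raw outputs of the final layer.
labels = [classify(z) for z in (-2.0, 0.0, 1.5)]
print(labels)
```

A raw output of exactly 0 gives a probability of exactly 0.5, which falls on the "less than or equal to" side of the rule.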
10. SCREENSHOTS
Register page:
Login page:
Main Interface of Remote User:
Predicted Used Car Type:
Used Car Price Ratio:
Service Provider:
11. TESTING
The purpose of testing is to discover errors. Testing is the process of trying to discover every
conceivable fault or weakness in a work product. It provides a way to check the functionality of
components, sub assemblies, assemblies and/or a finished product. It is the process of exercising
software with the intent of ensuring that the software system meets its requirements and user
expectations and does not fail in an unacceptable manner. There are various types of tests. Each test
type addresses a specific testing requirement.
Test strategy and approach: Field testing will be performed manually and functional tests will be written
in detail.
Test objectives:
1. All field entries must work properly.
2. Pages must be activated from the identified link.
3. The entry screen, messages and responses must not be delayed.
Features to be tested:
1. Verify that the entries are of the correct format.
2. No duplicate entries should be allowed.
3. All links should take the user to the correct page.
Test Results: All the test cases mentioned above passed successfully. No defects encountered.
Organization and preparation of functional tests is focused on requirements, key functions, or special
test cases. In addition, systematic coverage pertaining to identifying business process flows, data fields,
predefined processes, and successive processes must be considered for testing. Before functional testing
is complete, additional tests are identified and the effective value of current tests is determined.
TESTING METHODOLOGIES
The following are the Testing Methodologies:
1. Unit Testing
2. Integration Testing
3. User Acceptance Testing
4. Output Testing
5. System Testing
Unit Testing
Unit testing focuses verification effort on the smallest unit of software design, that is, the module. Unit
testing exercises specific paths in a module’s control structure to ensure complete coverage and
maximum error detection. This test focuses on each module individually, ensuring that it functions
properly as a unit; hence the name unit testing. During this testing, each module is tested
individually and the module interfaces are verified for consistency with the design specification. All
important processing paths are tested for the expected results. All error handling paths are also tested.
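As an illustration of unit testing, the sketch below uses Python's built-in unittest module to check two of the features listed earlier: entries must have the correct format, and duplicate entries must be rejected. The helper functions, the simplified email pattern and the test data are hypothetical examples, not the project's actual validation code.

```python
import re
import unittest

# Deliberately simple pattern: something@something.something, with no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def is_valid_email(value):
    """Return True when the value looks like an email address."""
    return bool(EMAIL_PATTERN.match(value))

def add_entry(entries, username):
    """Add a user name to the set of entries, rejecting duplicates."""
    if username in entries:
        raise ValueError("duplicate entry")
    entries.add(username)

class FieldEntryTests(unittest.TestCase):
    def test_email_format(self):
        self.assertTrue(is_valid_email("user@example.com"))
        self.assertFalse(is_valid_email("not-an-email"))

    def test_duplicates_rejected(self):
        entries = set()
        add_entry(entries, "asha")
        with self.assertRaises(ValueError):
            add_entry(entries, "asha")

# Run the two tests programmatically and keep the result object.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(FieldEntryTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test exercises one small function in isolation, including its error handling path, which is the level of granularity that unit testing targets.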
Integration Testing
Integration testing addresses the issues associated with the dual problems of verification and program
construction. After the software has been integrated, a set of high-order tests is conducted. The main
objective in this testing process is to take unit-tested modules and build a program structure that has
been dictated by design.
2. Bottom-up Integration
This method begins the construction and testing with the modules at the lowest level in the program
structure. Since the modules are integrated from the bottom up, processing required for modules
subordinate to a given level is always available and the need for stubs is eliminated. The bottom up
integration strategy may be implemented with the following steps:
1. The low-level modules are combined into clusters that perform a specific software
sub-function.
2. A driver (i.e. the control program for testing) is written to coordinate test case input and output.
3. The cluster is tested.
4. Drivers are removed and clusters are combined, moving upward in the program structure.
The bottom-up approach tests each module individually, and then each module is integrated
with a main module and tested for functionality.
User Acceptance Testing
User acceptance of a system is the key factor for the success of any system. The system under
consideration is tested for user acceptance by constantly keeping in touch with the prospective system
users at the time of developing and making changes wherever required. The system developed provides
a friendly user interface that can easily be understood even by a person who is new to the system. This
is a critical phase of any project and requires significant participation by the end user. It also ensures
that the system meets the functional requirements.
Test Results: All the test cases mentioned above passed successfully. No defects encountered.
Output Testing
After performing the validation testing, the next step is output testing of the proposed system, since no
system can be useful if it does not produce the required output in the specified format. The outputs
generated or displayed by the system under consideration are tested by asking the users about the
format they require. The output format is considered in two ways: on screen and in printed
format.
System Test
System testing ensures that the entire integrated software system meets its requirements. It tests a
configuration to ensure known and predictable results. An example of system testing is the
configuration-oriented system integration test. System testing is based on process descriptions and
flows, emphasizing pre-driven process links and integration points.
Black Box Testing
Black box testing is testing the software without any knowledge of the inner workings, structure or
language of the module being tested. Black box tests, like most other kinds of tests, must be written from
a definitive source document, such as a specification or requirements document. It is testing in which
the software under test is treated as a black box: you cannot "see" into it. The test provides inputs and
checks the outputs without considering how the software works.
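As a minimal sketch of this idea in Python, the checks below exercise a prediction function only through its inputs and outputs. The function predict_price, its pricing formula, and the three requirements being checked are assumptions made for illustration; they stand in for the real system and its requirements document.

```python
def predict_price(year, mileage_km):
    """Hypothetical stand-in for the system under test (assumed behaviour)."""
    if mileage_km < 0:
        raise ValueError("mileage must be non-negative")
    if not 1980 <= year <= 2014:
        raise ValueError("year out of supported range")
    price = 900000.0 - 40000.0 * (2014 - year) - 1.5 * mileage_km
    return max(price, 20000.0)


def black_box_checks(predict):
    """Check `predict` against assumed requirements using inputs and outputs only."""
    results = {}

    # Assumed requirement 1: a valid input yields a positive numeric price.
    price = predict(2010, 60000)
    results["valid_input_gives_positive_number"] = (
        isinstance(price, float) and price > 0
    )

    # Assumed requirement 2: negative mileage is rejected.
    try:
        predict(2010, -1)
        results["negative_mileage_rejected"] = False
    except ValueError:
        results["negative_mileage_rejected"] = True

    # Assumed requirement 3: an unsupported year is rejected.
    try:
        predict(1900, 1000)
        results["unsupported_year_rejected"] = False
    except ValueError:
        results["unsupported_year_rejected"] = True

    return results


results = black_box_checks(predict_price)
for name, passed in results.items():
    print(name, "PASS" if passed else "FAIL")
```

Because black_box_checks never inspects the body of predict, the same checks could be run unchanged against any other implementation that claims to meet the same requirements.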
Whenever a new system is developed, user training is required to educate users about the working of the
system so that it can be put to efficient use by those for whom the system has been primarily designed.
For this purpose, the normal working of the project was demonstrated to the prospective users. Its
working is easily understandable, and since the expected users are people who have good knowledge of
computers, the use of this system is very easy.
11.3 MAINTENANCE
Maintenance covers a wide range of activities, including correcting code and design errors. To reduce the
need for maintenance in the long run, we defined the user's requirements more accurately during the
process of system development. Based on these requirements, the system has been developed to
satisfy the users' needs to the largest possible extent. As technology develops, it may be possible to
add many more features based on future requirements. The code and design are simple and
easy to understand, which will make maintenance easier.
A strategy for system testing integrates system test cases and design techniques into a well-planned
series of steps that results in the successful construction of software. The testing strategy must
incorporate test planning, test case design, test execution, and the resultant data collection and
evaluation. A strategy for software testing must accommodate low-level tests that are necessary to
verify that a small source code segment has been correctly implemented, as well as high-level tests
that validate major system functions against user requirements.
Software testing is a critical element of software quality assurance and represents the ultimate review
of specification, design and coding. Testing represents an interesting anomaly for the software. Thus, a
series of tests was performed on the proposed system before it was considered ready for user acceptance
testing.
12. CONCLUSION
The prediction of used car prices depends on numerous factors. The most important ones are
manufacturing year, make, model, mileage, horsepower and country of origin. Other factors include
fuel type and fuel consumption, the type of braking system, acceleration, the interior style,
physical condition, cylinder volume (measured in cubic centimeters), size of the car, number of doors,
weight of the car, consumer reviews, paint color and type, transmission type, whether it is a sports car,
sound system, cosmic wheels, power steering, air conditioner, GPS navigator, safety index, etc.
In the Mauritian context, some special factors are also usually considered, such as who
the previous owners were and whether the car has had any serious accidents.
Predicting used car prices is a difficult task because of the large number of features and parameters
that must be examined in order to get reliable findings. The first and most important phase is data
collection and preprocessing. The model was then defined and built in order to implement the
algorithms and generate results.
After running various regression algorithms on the data, it was concluded that the Decision Tree
algorithm was the top performer, with the highest R² score of 0.95, implying that it provided the most
accurate predictions, as shown by the Original vs. Prediction line graph. Besides having the highest
R² score, the Decision Tree also had the lowest Mean Square Error (MSE) and Root Mean Square Error
(RMSE), indicating that its prediction errors were the smallest of all the algorithms tested.
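For reference, the three metrics named above can be computed as in the following Python sketch, which uses the standard definitions of MSE, RMSE and R². The sample values are made up for illustration and are not results from this project.

```python
import math


def mse(actual, predicted):
    """Mean Square Error: average of the squared prediction errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error: square root of the MSE, in price units."""
    return math.sqrt(mse(actual, predicted))


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up prices, for illustration only.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

print("MSE: ", mse(actual, predicted))
print("RMSE:", rmse(actual, predicted))
print("R2:  ", r2_score(actual, predicted))
```

A lower MSE or RMSE and an R² closer to 1 both indicate predictions that lie closer to the actual prices, which is why the three metrics tend to rank models in the same order.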
The aim of this paper was to predict the price of second-hand reconditioned and second-hand used
cars in Mauritius. The car market has been growing steadily by around 5% per year for the last ten years,
showing the high demand for cars among the Mauritian population. There are hundreds of car websites in
Mauritius, but none of them provides a facility to predict the price of used cars based on their
attributes. Our dataset of 200 records was used with ten-fold cross-validation.
The car make, year of manufacture, paint type, transmission type, engine capacity and mileage were
used to predict the price of second-hand cars using four different machine learning algorithms.
The average residual value was reasonably low for all four approaches. Thus, we conclude that
predicting the price of second-hand cars is a very risky enterprise, but one that is feasible. This system
will be very useful to car dealers and car owners who need to assess the value of their cars. In the
future, we intend to collect more data and more features and to use a larger variety of machine learning
algorithms for the prediction.
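The ten-fold cross-validation over 200 records mentioned above can be sketched in Python as follows. This is a minimal illustration using only the standard library; the function name k_fold_indices and the fixed shuffle seed are assumptions, and the project itself may have used a library routine instead.

```python
import random


def k_fold_indices(n_records, n_folds, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)  # fixed seed: reproducible split
    # Deal the shuffled indices into n_folds nearly equal folds.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        test_idx = folds[i]
        train_idx = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train_idx, test_idx


splits = list(k_fold_indices(n_records=200, n_folds=10))
train_idx, test_idx = splits[0]
print(len(splits), len(train_idx), len(test_idx))
```

With 200 records and ten folds, each model is trained on 180 records and tested on the remaining 20, and every record is used for testing exactly once across the ten rounds.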
13. REFERENCES
[10] Rose, D. (2003) “Predicting Car Production using a Neural Network Technical Paper- Vetronics
(Inhouse)”. Thesis, U.S. Army Tank Automotive Research, Development and Engineering Center (TARDEC).
[11] LEXPRESS.MU ONLINE. 2014. [Online] Available at: http://www.lexpress.mu/ [Accessed
23 September 2014].
[12] LE DEFI MEDIA GROUP. 2014. [Online] Available at: http://www.defimedia.info/ [Accessed
23 September 2014].
[13] He, Q. (1999) “Neural Network and its Application in IR”. Thesis (BSc). University of Illinois.
[14] Cheng, B. and Titterington, D. M. (1994). “Neural Networks: A Review from a Statistical
Perspective”. Statistical Science, Vol. 9, pp. 2-54.
[15] Anyaeche, C. O. (2013). “Predicting Performance Measures using Linear Regression and Neural
Network: A Comparison”. African Journal of Engineering Research, Vol. 1, No. 3, pp. 84-89.