Road Accident Prediction Using Data Mining Techniques
Anurag S Tippa [MY.SC.U3BCA21133]
Thilakesh A [MY.SC.U3BCA21106]
Mahadev Prasad M [MY.SC.U3BCA21124]
Gowtham M [MY.SC.U3BCA21104]
Supervisor: Mrs. Keerthika K, Assistant Professor, Computer Science Engineering, School of Arts and Science.
ABSTRACT
This study draws on a variety of data sources, such as past accident reports, meteorological data, and traffic patterns, to anticipate road accidents using data mining techniques. It applies machine learning methods, including decision trees and neural networks, for feature selection and thorough analysis. To maintain data quality, the dataset is preprocessed to handle outliers and missing values. Advanced analytics and visualisation technologies help explain the complex interactions among contributing factors. Real-time and historical data are used to evaluate and validate the model's performance. By enabling proactive accident risk minimization, the proposed predictive model functions as an early warning system for law enforcement and motorists. The study supports overall road safety by assisting with policy formation, traffic management, and urban planning. Its purpose is to offer a useful tool for predicting and preventing accidents, ultimately reducing their frequency and severity, saving lives, and minimizing societal and economic impact.

INTRODUCTION
Road accidents are a global concern, causing loss of life, injuries, and economic costs. Data mining techniques, incorporating data analysis and machine learning, offer a powerful means to predict and prevent accidents. This approach involves analyzing historical accident data to identify patterns and risk factors.
In India, where road safety challenges are pronounced, the adoption of data mining for accident prediction is on the rise. Government agencies, including the Ministry of Road Transport and Highways, utilize predictive models to inform safety programs. Collaborative efforts involving universities, tech companies, and NGOs are actively contributing to road safety improvement.
The motivations in India stem from high accident rates, substantial economic impact, limited resources, urban congestion, and the need for infrastructure development. Predictive models aid in optimizing resource allocation, supporting efficient traffic management, informing infrastructure projects, and influencing policy changes. Overall, data-driven accident prediction contributes to public safety, reduces economic burdens, and fosters responsible road use through awareness campaigns.

RESULTS
Decision tree:
1. The study aimed to predict whether Karnataka districts would have over 750 total accidents using a decision tree classifier.
2. Features included total accidents, fatalities, severe injuries, minor injuries, and total injuries, with 'Vehicle to Vehicle Total Accidents' as the target variable.
3. The decision tree classified districts as 'False' for accidents ≤750 and 'True' for accidents >750, as visualized in the tree.
4. It listed districts with over 750 accidents, using this threshold as a baseline for identifying patterns leading to high accident rates.
5. The tree's visualization highlighted the key features and thresholds used for classification, aiding in understanding the factors contributing to higher accident rates.
6. This information helps in targeted interventions and preventive measures. The Decision Tree method had an accuracy of 51%, the lowest in the experiment.
Random Forest:
1. Used attributes: "Vehicle to vehicle total injured" and "Vehicle to animal persons minor injury."
2. Achieved a notable accuracy of 91%.
3. Demonstrated high efficacy in analyzing and predicting factors associated with road accidents.
4. Highlights Random Forest's robust performance and suitability for complex predictive tasks in road safety analysis.

DISCUSSION
Random Forest outperformed Decision Tree with 91% accuracy compared to Decision Tree's 61%.
- Random Forest's superior performance demonstrates its robust predictive capabilities.
- The findings highlight Random Forest as a preferred choice for tasks needing high precision and accuracy.
- The results underscore Decision Tree's limitations in capturing complex patterns and nuances within the dataset.

CONCLUSIONS
Conclusion:
- The study uses Decision Tree and Random Forest algorithms to analyze road accident data.
- Random Forest achieves 91% accuracy, outperforming Decision Tree.
- A comprehensive dataset from historical reports, meteorological data, and traffic patterns is utilized.
- Advanced data mining techniques ensure thorough analysis and precise predictions.
- The model serves as an early warning system, aiding accident prevention, policy-making, traffic management, and urban planning.
- Real-time and historical data integration enhances the model's value for law enforcement and motorists.
- Data mining techniques significantly improve road safety and reduce societal and economic impacts.
Future Scope:
- Further research can explore factors like vehicle interactions, injuries, and animal-related incidents.
- Predictive analytics can forecast accident likelihood and severity.
- The dataset can guide infrastructure development and urban planning by identifying high-risk areas.
- Insights can help policymakers create targeted safety regulations.
- Public awareness campaigns can promote safer driving practices based on common accident scenarios.

METHODS AND MATERIALS
Decision tree: Using the ID3 algorithm for traffic accident prediction requires stating the specific objectives, methods, and expected results of the study. The proposed action plan for ID3 is:
1) Vehicle crash dataset development
2) Data processing
3) Feature selection
4) Use of the ID3 algorithm
5) Tree construction
6) Model evaluation
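
As a rough illustration of steps 4 to 6 of the decision tree plan, the sketch below builds an ID3 tree by choosing, at each node, the feature with the highest information gain, and then scores it by accuracy. It is a minimal sketch, not the code used in the study: the feature names (`injuries`, `fatalities`), their binned values, and the four toy records are hypothetical, and the `True`/`False` labels only mimic the "more than 750 accidents" classes described in the results.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the records on one feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def id3(rows, labels, features):
    """Recursively build a tree as nested dicts; leaves are class labels."""
    if len(set(labels)) == 1:            # pure node: stop splitting
        return labels[0]
    if not features:                     # no feature left: majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(features, key=lambda f: information_gain(rows, labels, f))
    rest = [f for f in features if f != best]
    tree = {best: {}}
    for value in sorted({row[best] for row in rows}):
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        tree[best][value] = id3([r for r, _ in pairs], [l for _, l in pairs], rest)
    return tree

def predict(tree, row):
    """Follow branches until a leaf (a non-dict value) is reached."""
    while isinstance(tree, dict):
        feature = next(iter(tree))
        tree = tree[feature][row[feature]]
    return tree

# Hypothetical toy records: binned district-level features.
rows = [
    {"injuries": "high", "fatalities": "high"},
    {"injuries": "high", "fatalities": "low"},
    {"injuries": "low", "fatalities": "high"},
    {"injuries": "low", "fatalities": "low"},
]
labels = [True, True, False, False]      # True = more than 750 accidents

tree = id3(rows, labels, ["injuries", "fatalities"])
accuracy = sum(predict(tree, r) == l for r, l in zip(rows, labels)) / len(rows)
```

ID3 works on categorical features, so numeric counts would need to be binned first. This sketch also scores the tree on its own training records and raises a `KeyError` for feature values it has never seen; a real evaluation would use held-out districts.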
Random forest: Using the Random Forest algorithm for traffic accident prediction likewise requires stating the specific study objectives, methods, and expected results. The proposed work plan for Random Forest is:
1) Development of a complete data set
2) Feature selection and engineering
3) Implementation of the Random Forest algorithm
4) Hyperparameter tuning
5) Evaluation
6) Spatial and temporal analysis
7) Interpretability and feature importance
8) Comparison with the current model
9) Implementation and design

Figure 1. Decision Tree
Figure 2. Random forest tree
Figure 3. Comparison of Two Algorithms
Figure 4. Confusion Matrix

ACKNOWLEDGEMENT
We are grateful to Mrs. Keerthika K and Dr. Narendran S M for encouraging discussions about the topic. We acknowledge Amrita Vishwa Vidyapeetham, Mysuru Campus.

REFERENCE
[1] Indraja Smitha Tabitha Paul Chandini; (2018); A Road Accident Prediction Model Using Data Mining Techniques; The paper suggests the use of data mining technologies and algorithms for developing predictive models for road accidents in India and other regions.
[2] Mrs. Kavitha Bai A. S, Aishwarya Thankchan, E Si Krupa; (May 2020); Road accident analysis using data mining techniques; The paper discusses the results of using data mining techniques to analyse road accidents in India and also the accidents which might occur in the future.
[3] Liling Li, Sharad Shrestha, Gongzhu Hu; (2021); Analysis of road traffic fatal accidents using data mining techniques; Association rules among variables were discovered using the Apriori algorithm.

CONTACT
Name: ANURAG S TIPPA
Organization Name: AMRITA VISHWA VIDYAPEETHAM, MYSURU CAMPUS
E-mail: anuragtippa757@gmail.com
Phone no: 7892817515
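
The core of the Random Forest plan under Methods and Materials (bootstrap sampling, a random feature subset per tree, and majority voting) can be sketched as follows. This is a simplified illustration, not the study's implementation: each "tree" is only a depth-1 stump, the two feature names are adapted from the attributes listed in the results, and all numeric values, `n_trees`, and `seed` are hypothetical.

```python
import random
from collections import Counter

def majority(labels):
    """Most common label in a list."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(rows, labels, features):
    """Pick the (feature, threshold) split with the fewest training errors."""
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [l for r, l in zip(rows, labels) if r[f] <= t]
            right = [l for r, l in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            lp, rp = majority(left), majority(right)
            errors = sum(l != lp for l in left) + sum(l != rp for l in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, lp, rp)
    if best is None:                      # no usable split: constant prediction
        m = majority(labels)
        return (features[0], float("inf"), m, m)
    return best[1:]

def predict_stump(stump, row):
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label

def fit_forest(rows, labels, features, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a feature subset."""
    rng = random.Random(seed)
    k = max(1, int(len(features) ** 0.5))  # features considered per tree
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]   # sample with replacement
        subset = rng.sample(features, k)
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], subset))
    return forest

def predict_forest(forest, row):
    """Majority vote over all stumps."""
    return majority([predict_stump(s, row) for s in forest])

# Hypothetical district records; True = more than 750 accidents.
FEATURES = ["vehicle_to_vehicle_total_injured", "vehicle_to_animal_minor_injury"]
values = [(10, 1), (20, 2), (30, 3), (900, 40), (1000, 50), (1100, 60)]
rows = [dict(zip(FEATURES, v)) for v in values]
labels = [False, False, False, True, True, True]

forest = fit_forest(rows, labels, FEATURES)
low_risk = predict_forest(forest, rows[0])
high_risk = predict_forest(forest, rows[-1])
```

Averaging many trees trained on different samples and feature subsets is what makes the ensemble less sensitive to any single split than one decision tree. A full implementation would grow deeper trees and tune `n_trees` and the subset size, as in step 4 of the plan.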