Electronic Letters on Computer Vision and Image Analysis 0(0):1-7, 2000
BadODD: Bangladeshi Autonomous Driving Object Detection Dataset
Mirza Nihal Baig∗ , Rony Hajong∗ , Mahdi Murshed Patwary∗ , Mohammad Shahidur Rahman∗ ,
Husne Ara Chowdhury∗
∗ CSE, Shahjalal University of Science and Technology, Sylhet, Bangladesh
Abstract
We propose a comprehensive dataset for object detection in diverse driving environments across 9 districts in Bangladesh. The dataset, collected exclusively with smartphone cameras, provides a realistic representation of real-world scenarios, including day and night conditions. Most existing datasets lack classes suitable for autonomous navigation on Bangladeshi roads, making it difficult for researchers to develop models that can handle the intricacies of these road scenarios. To address this issue, we propose a new set of classes based on vehicle characteristics rather than local vehicle names. The dataset is intended to encourage the development of models that can handle the unique challenges of Bangladeshi road scenarios and support the effective deployment of autonomous vehicles. No online images were included, so the dataset reflects the real-world conditions faced by autonomous vehicles. Vehicle classification is challenging because of the diverse range of vehicles on Bangladeshi roads, including types found nowhere else in the world. The proposed classification system is scalable and can accommodate future vehicle types, making it a valuable resource for researchers in the autonomous vehicle sector.
Key Words: Computer Vision, Object Detection, Autonomous Vehicle, Dataset, Vehicle Classification.
1 Introduction
Autonomous navigation is rapidly advancing into an everyday technology. Many companies are integrating autonomous capabilities into their products across various regions and are working toward level 5 autonomy. Large-scale datasets have contributed significantly to this progress. However, in many parts of the world, the technology still has a long way to go. A key challenge is obtaining data diverse enough to account for extreme corner cases. Most algorithms developed for autonomous vehicles are not benchmarked on unstructured and congested scenes from different parts of the world. In addition, existing datasets do not account for vehicle classes common on the Indian subcontinent, which is a major obstacle to advancing autonomous technology there.
In this paper, we propose BadODD, a novel dataset for autonomous navigation on Bangladeshi roads that addresses the shortcomings of existing autonomous vehicle datasets. Our dataset consists of 9,825 images containing 78,943 objects across 13 classes, captured on Bangladeshi roads under various lighting conditions. Because the number of objects per image is high on Bangladeshi roads, the dataset covers the objects most commonly seen there.
Furthermore, we propose a new set of classes for classifying vehicles in Bangladesh that is scalable and can serve as a heuristic for autonomous cars. There is a diverse set of vehicles, many manufactured locally in unique shapes and customized to people's needs. These vehicles have no globally acknowledged names; only local names are used. This makes it difficult to define class names that align with existing datasets while still accounting for all vehicles. Our proposed classes solve this problem by classifying vehicles based on their characteristics rather than their local names, and the scheme scales to new vehicle types that may be manufactured locally or imported from other countries.
Another potential use of this new set of classes is that vehicle characteristics can support faster decision-making in autonomous cars. In existing datasets, every class carries the same weight for an autonomous vehicle. Because our proposed classes encode different characteristics, an autonomous vehicle can precompute decisions for each class, helping it make accurate decisions in real time. Each characteristic caters to a different part of the navigation decision-making process. This will also help in creating more scalable datasets in the future.
We provide a detailed analysis of the class distribution in the BadODD dataset. Some classes, such as Person, Autorickshaw, and Three Wheeler, appear heavily on the road, while others, such as Wheelchair, Train, and Construction Vehicle, are rarely seen. As a result, the dataset is imbalanced, reflecting the disproportionate ratio of vehicle types observed on the road. Furthermore, the unstructured nature of the road scenes and the congestion of objects pose challenges such as occlusion in object detection.
Our main contributions are the following:
• We release a 2D object detection dataset for autonomous driving on Bangladeshi roads.
• We propose a new set of scalable classes suitable for classifying Bangladeshi vehicles.
• We train and benchmark existing baseline models and provide a comparative analysis of them.
Feature | SODA10M | BDD100K | Waymo Open Dataset
Size | 10M unlabeled + 20K labeled images | 100K video frames | 1.28 TB
Types of Data | Images (front and rear cameras) | Images (various cameras) | LiDAR, Camera, Radar
Object Categories | 6 | 10 | 19
Scenarios | Diverse weather, time, location | Urban, rural, various weather, day/night | Highway, urban
Annotations | Bounding boxes | Bounding boxes | Bounding boxes, 3D object points
Challenges | Occlusions, scale variations, low resolution | Occlusions, night-time, diverse weather | Proprietary format, limited access
Focus | Self/semi-supervised learning | General 2D object detection | Autonomous driving research
Publicly Available? | Yes | Yes | Partially

Table 1: Comparison of 2D Object Detection Datasets for Autonomous Vehicles
2 Dataset
2.1 Data Collection
The data were collected from 9 districts in Bangladesh: Sylhet, Dhaka, Rajshahi, Mymensingh, Maowa, Chittagong, Sirajganj, Sherpur, and Khulna. These areas have different road scenarios and different types of vehicles in varying proportions. They include urban and rural areas, highways, and expressways with different types of traffic. We recorded from the front of the car: in some sessions the camera filmed through the windshield, while in others it had an unobstructed view. Both daytime and nighttime data were collected on different types of roads. To ensure real-world road scenarios, videos were recorded while the driver was driving. No online images were included, preserving authentic road scenery and the quality of the dataset.
2.2 Frame Selection
Our procedure for sampling images from the video sources was planned carefully to capture the dynamic quality of the different environments. Because traffic density varies across settings, we used a flexible frame-rate sampling strategy to produce a representative and varied dataset.
On highways and expressways, which typically carry lighter traffic, the frame rate was set to one frame per second. This decision followed a thorough manual review of the footage and aimed to eliminate repetition in less dynamic scenes while retaining crucial details.
In highly crowded urban regions with heavy traffic and pedestrian activity, however, we chose a lower rate of one frame every two seconds. This decision was motivated by the need to guarantee dataset diversity by recording a wider range of circumstances in high-density traffic zones.
The frame rates were largely determined by inspecting the video content. This allowed us to tailor the sampling plan to the unique features of each site, producing a dataset that faithfully captures the dynamics of both densely populated areas and areas with little traffic. Beyond improving the representativeness of the dataset, this adaptive sampling deepens our understanding of traffic patterns across different metropolitan environments and provides a strong basis for reliable analysis and model training.
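In code, this sampling strategy amounts to keeping every n-th frame of each video. The following is a minimal sketch rather than our actual pipeline: it assumes OpenCV-readable video files, and the file names and output layout are placeholders.

```python
import cv2
from pathlib import Path

def sample_frames(video_path, out_dir, frames_per_second):
    """Extract frames at a fixed rate (e.g., 1.0 for highways,
    0.5 for congested urban footage)."""
    cap = cv2.VideoCapture(str(video_path))
    native_fps = cap.get(cv2.CAP_PROP_FPS) or 30.0   # fall back if metadata is missing
    step = max(1, round(native_fps / frames_per_second))  # keep every `step`-th frame
    Path(out_dir).mkdir(parents=True, exist_ok=True)

    index = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % step == 0:
            cv2.imwrite(f"{out_dir}/frame_{saved:06d}.jpg", frame)
            saved += 1
        index += 1
    cap.release()
    return saved

# Highway/expressway footage: one frame per second.
# Congested urban footage: one frame every two seconds.
sample_frames("highway_clip.mp4", "frames/highway", frames_per_second=1.0)
sample_frames("urban_clip.mp4", "frames/urban", frames_per_second=0.5)
```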
2.3 Redefining Classes & Annotation
Most existing object detection datasets have similar classes that do not represent the vehicles on Bangladeshi roads. Because vehicle types are so diverse, labelling vehicles by their local or globally acknowledged names would inflate the number of classes, and every newly manufactured vehicle type would require a new class. Our approach of redefining labels based on vehicle characteristics solves this problem.
Class | Wheel Number | Driving Force | Size
Car | 4 wheels | Diesel, gas, fossil fuel | Medium
Three Wheeler | 3 wheels | Pedal | Small
Autorickshaw | 3 wheels | Gas, electric | Medium
Priority Vehicle | Any vehicle with a siren on top | Fossil fuel or pedal | Any size
Bus | 4 wheels | Diesel | Big
Truck | 4 or 8 wheels | Fossil fuel or electric | Medium and small
Cart Car | 2 or 3 wheels | Human- or animal-drawn (no pedal) | Small
Construction Vehicle | Construction-related vehicle | Diesel, fossil fuel | Medium, big
Train | Runs on railway track | Diesel | Big
Wheelchair | 2 wheels, with priority | Human support | Small
Motorbike | 2 wheels | Oil, electric | Medium
Bicycle | 2 wheels | Pedal | Small

Table 2: Proposed classes, defined by vehicle characteristics rather than local names
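In practice, annotators encounter vehicles under local names, which can be mapped onto the characteristic-based classes with a simple lookup table. The sketch below is purely illustrative: the local names shown (e.g., "cng", "thela gari") are hypothetical examples, not our annotation vocabulary.

```python
# Illustrative mapping from (hypothetical) local vehicle names to the
# characteristic-based classes proposed above. New local names can be
# added without changing the class set itself.
LOCAL_NAME_TO_CLASS = {
    "cng": "Autorickshaw",        # gas-driven three-wheeler (assumed local name)
    "rickshaw": "Three Wheeler",  # pedal-driven three-wheeler (assumed local name)
    "thela gari": "Cart Car",     # human-pushed cart (assumed local name)
}

def to_class(local_name: str) -> str:
    # Unknown local names go to a review queue instead of spawning a new class.
    return LOCAL_NAME_TO_CLASS.get(local_name.lower(), "NEEDS_REVIEW")

print(to_class("CNG"))  # -> "Autorickshaw"
```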
2.4 Statistical Analysis of Dataset
In this section, we present a statistical analysis of the dataset used for training, validating, and testing our object detection models. The dataset comprises 9,825 images distributed among 13 distinct classes. It was split into training, validation, and testing sets in a 60:20:20 ratio, ensuring that each split includes night images.
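One simple way to obtain such a split is to shuffle the day and night image pools separately and concatenate the corresponding 60:20:20 partitions. The sketch below is illustrative only: the badodd/images directory layout and the "night" filename convention are assumptions, not our actual file organization.

```python
import random
from pathlib import Path

def split_60_20_20(paths, seed=0):
    """Shuffle and split a list of paths into 60/20/20 partitions."""
    rng = random.Random(seed)
    paths = sorted(paths)
    rng.shuffle(paths)
    n = len(paths)
    return paths[: int(0.6 * n)], paths[int(0.6 * n): int(0.8 * n)], paths[int(0.8 * n):]

images = list(Path("badodd/images").glob("*.jpg"))        # hypothetical layout
night = [p for p in images if "night" in p.stem.lower()]  # assumed naming convention
night_set = set(night)
day = [p for p in images if p not in night_set]

# Split the day and night pools independently, then merge the matching
# parts, so every partition is guaranteed to contain night images.
train, val, test = (d + n for d, n in zip(split_60_20_20(day), split_60_20_20(night)))
print(len(train), len(val), len(test))
```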
Dataset Overview
Dataset | Number of Images
Training Set | 5,896
Validation Set | 1,964
Testing Set | 1,965

Table 3: Dataset Overview
Class Distribution
Class | Training Set | Validation Set | Testing Set
Person | 18,010 (38.22%) | 6,202 (38.69%) | 6,104 (38.64%)
Three-Wheeler | 5,710 (12.12%) | 1,929 (12.04%) | 1,961 (12.41%)
Motorbike | 3,749 (7.96%) | 1,218 (7.60%) | 1,254 (7.94%)
Auto Rickshaw | 10,614 (22.53%) | 3,638 (22.70%) | 3,489 (22.09%)
Car | 3,785 (8.03%) | 1,305 (8.14%) | 1,233 (7.81%)
Truck | 2,296 (4.87%) | 777 (4.85%) | 782 (4.95%)
Bus | 1,885 (4.00%) | 589 (3.67%) | 613 (3.88%)
Bicycle | 673 (1.43%) | 256 (1.60%) | 241 (1.53%)
Priority Vehicle | 229 (0.49%) | 70 (0.44%) | 64 (0.41%)
Cart Vehicle | 141 (0.30%) | 38 (0.24%) | 44 (0.28%)
Construction Vehicle | 23 (0.05%) | 6 (0.04%) | 11 (0.07%)
Wheelchair | 2 (0.00%) | 0 (0.00%) | 1 (0.01%)
Train | 1 (0.00%) | 0 (0.00%) | 0 (0.00%)

Table 4: Total Class Distribution in Each Split
3 Model
In this section, we describe the training procedure and architectural configurations for YOLOv5 and YOLOv8. Our primary objective is a comparative analysis of their hyperparameters and performance, measured by the mean Average Precision (mAP) metric on our object detection dataset.
3.1 YOLOv5
YOLOv5 is a widely adopted object detection model, valued for its fast inference and strong accuracy. Its deep neural network architecture strikes a practical balance between speed and effectiveness.
3.1.1 Hyperparameters
We trained the YOLOv5 model with the hyperparameters listed in Table 5.
Hyperparameter | Value
Batch Size | 64
Learning Rate | 0.001
Number of Epochs | 10
Optimizer | Adam
Weight Decay | 1 × 10^-4
Input Image Size | 416 × 416

Table 5: Hyperparameters employed for training YOLOv5
Figure 1: Sample Dataset
3.1.2 Results
After training, the YOLOv5 model achieved an mAP score of 0.6 on our object detection dataset.
3.2 YOLOv8
3.2.1 Hyperparameters
The YOLOv8 model was trained with the same hyperparameters as YOLOv5, listed in Table 6.

Hyperparameter | Value
Batch Size | 64
Learning Rate | 0.001
Number of Epochs | 10
Optimizer | Adam
Weight Decay | 1 × 10^-4
Input Image Size | 416 × 416

Table 6: Hyperparameters employed for training YOLOv8
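For reference, a training run with these hyperparameters could be expressed with the Ultralytics Python API as follows. This is a minimal sketch, not our exact setup: the starting weights yolov8n.pt and the dataset config badodd.yaml are assumptions.

```python
from ultralytics import YOLO

# Start from pretrained YOLOv8 weights (the model size is an assumption).
model = YOLO("yolov8n.pt")

# Train with the hyperparameters from Table 6; badodd.yaml is a
# hypothetical dataset config pointing at the train/val image lists.
model.train(
    data="badodd.yaml",
    epochs=10,
    batch=64,
    imgsz=416,
    optimizer="Adam",
    lr0=0.001,
    weight_decay=1e-4,
)
```

A YOLOv5 run can be configured analogously; its repository exposes the same hyperparameters through the train.py script.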
3.2.2 Results
After training, the YOLOv8 model achieved an mAP score of 0.7 on our object detection dataset.
3.3 Comparison
We compare the mAP results and training times of YOLOv5 and YOLOv8 in Table 7.
Model | mAP | Training Time
YOLOv5 | 0.6 | 1 hour
YOLOv8 | 0.7 | 1 hour

Table 7: Comparative analysis of YOLOv5 and YOLOv8
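A comparison like Table 7 can be reproduced by validating each trained checkpoint on the held-out split. The sketch below assumes both checkpoints were produced with the Ultralytics package; the checkpoint paths are placeholders.

```python
from ultralytics import YOLO

# Placeholder checkpoint paths; metrics.box.map is the COCO-style mAP50-95
# reported by Ultralytics (use metrics.box.map50 for mAP@0.5).
for name, ckpt in [("YOLOv5", "runs/v5/best.pt"), ("YOLOv8", "runs/v8/best.pt")]:
    metrics = YOLO(ckpt).val(data="badodd.yaml", imgsz=416)
    print(f"{name}: mAP50-95 = {metrics.box.map:.3f}")
```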
4 Conclusion
This work addresses the unique and complex task of object detection in diverse driving environments across 9 districts in Bangladesh. The dataset, collected exclusively with smartphone cameras, provides a comprehensive representation of real-world scenarios, encompassing both day and night conditions. Existing datasets available online do not include the vehicles found on Bangladeshi roads and are not suited to autonomous navigation in Bangladesh. Object detection in this setting requires robust algorithms capable of identifying and localizing objects in dynamic and challenging environments, with varying lighting conditions, different road types, and the unique characteristics of each district. The dataset was collected using smartphone cameras to ensure authenticity and to reflect the real-world conditions faced by autonomous vehicles. Classifying vehicles was challenging because many vehicle types are found nowhere else in the world and are known only by local names; a new set of classes based on characteristics rather than local vehicle names was proposed to overcome this problem. The dataset sets a new benchmark for developing better models for Bangladeshi roads and opens up many research opportunities in the autonomous vehicle sector.
Key highlights of our work:
• Object detection in diverse Bangladesh driving environments
• Exclusive dataset from smartphone cameras for real-world scenarios
• 9 districts covered, including day and night conditions
• Limitations of existing datasets addressed
• Challenges: robust algorithms for dynamic environments
• Data collection: smartphone cameras for authenticity
• New class system for scalable vehicle classification
• Benchmark for better models on Bangladesh roads
• Research opportunities for the autonomous vehicle sector.
References
[1] Jianhua Han, Xiwen Liang, Hang Xu, Kai Chen, Lanqing Hong, Jiageng Mao, Chaoqiang Ye, Wei Zhang, Zhenguo Li, Xiaodan Liang, Chunjing Xu, SODA10M: A Large-Scale 2D Self/Semi-Supervised Object Detection Dataset for Autonomous Driving, arXiv preprint arXiv:2106.11118, 2021.
[2] Fisher Yu, Haofeng Chen, Xin Wang, Wenqi Xian, Yingying Chen, Fangchen Liu, Vashisht Madhavan, Trevor Darrell, BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning, arXiv preprint arXiv:1805.04687, 2020.
[3] Pei Sun, Henrik Kretzschmar, Xerxes Dotiwalla, Aurelien Chouard, Vijaysai Patnaik, Paul Tsui, James Guo, Yin Zhou, Yuning Chai, Benjamin Caine, Vijay Vasudevan, Wei Han, Jiquan Ngiam, Hang Zhao, Aleksei Timofeev, Scott Ettinger, Maxim Krivokon, Amy Gao, Aditya Joshi, Sheng Zhao, Shuyang Cheng, Yu Zhang, Jonathon Shlens, Zhifeng Chen, Dragomir Anguelov, Scalability in Perception for Autonomous Driving: Waymo Open Dataset, arXiv preprint arXiv:1912.04838, 2020.