CSC 420 Final Project Report
Team EyeDK
Social Distancing Detection
Team Members: Evan Pan, Haoyan Jiang, Victor Zhang
Abstract
In this report, we explore how to build a social distance detection application and
compare various ways in which such an application can be improved. We wanted to
determine which state-of-the-art models are most robust and allow fast, real-time object
detection in the context of social distancing analysis. Additionally, we attempted to improve
the accuracy of distance measurement by investigating whether our camera can be calibrated
to remove distortion, whether our frames can be warped into a bird's-eye perspective, and
whether we can improve our model by fine-tuning parameters. These steps enable
measurement in real-world coordinates, and hence much more accurate distance calculations.
Among the computer vision methods for object detection, we surveyed both traditional
models and deep learning models, then narrowed our choice down to two: YOLOv5 and
Detectron2, owing to their distinguished performance on state-of-the-art object detection
tasks. We ran experiments in the form of a benchmark test to select the better of the two,
focusing on both accuracy and inference speed across three different datasets: JAAD, EPFL,
and our custom data. YOLOv5 runs significantly faster than Detectron2 on a GPU while
maintaining almost the same prediction accuracy. We therefore selected YOLOv5 as the
baseline model for this project and developed on top of its pretrained architecture.
Although our detector was not perfect, we were still able to obtain qualitatively accurate
results in our visualizer, and we list potential optimizations that could further improve them.
Given that social distancing has become an important part of staying healthy in our daily
lives, our application can serve as a starting point for building technologies that help us be
more aware of our surroundings.
Table of contents
1. Introduction
2. Problem statement
a. Problem breakdown
b. Metrics of interest
3. Literature Review
a. Object detection
b. Camera calibration
4. Methodology
5. Experiments
6. Review
7. Conclusions
8. Authors' Contributions
9. Bibliography
Introduction
Covid-19 [1] has changed many aspects of modern life. In particular, physical
interaction between people has changed significantly. Currently, the most effective way to
limit Covid-19 cases has been shown to be avoiding interaction with other people by staying
home [2]. However, for people who need to go out (for work, food, or groceries), social
distancing becomes the most effective means of disease prevention.
Social distancing is the practice of maintaining physical distance between people in order to
reduce the spread of contagious disease. Naturally, detection technologies have been
developed to track how well social distancing is followed. Common methods include
wearable devices [3], voluntarily downloaded apps [4], Wi-Fi usage analysis [5],
and image processing technologies. For our project, we look specifically at
image-based solutions.
There are various advantages to an image-based solution. First, it requires no
additional hardware beyond a camera, which is widely available since cell phone cameras and
security cameras are ubiquitous in the modern world. Second, images carry additional
information beyond location tracking: personal features, which can be
used to identify a person, and facial garments, which can indicate whether someone is
wearing a mask.
However, there are also challenges to an image-based approach. First, the objects
of interest (people) need to be identified and localized in the image. This is not a
trivial task, as hand-crafted feature engineering often fails to generalize, so a
learning-based approach such as R-CNN, YOLO, or Detectron2 [6] is required.
A second challenge is measuring the distance between the individuals detected in
the image; this is difficult because the image is a projective view, may be distorted, and
offers no reference for converting pixel counts into a distance metric. In this report, we
discuss the challenges of the various approaches and explain the techniques we felt were most
appropriate to address the problem of social distance detection.
Problem Statement
The specific task of social distancing monitoring can be broken down into two
sub-problems. The first is object detection, and the second is measuring the distance
between the detected objects (people).
Metrics of Interest - Object detection:
For our application, we not only need to detect whether an object is in the image, but we
also need to find the geometric location of objects in the image, since we are interested in
the distances between them. This can be framed as a bounding box problem, which
can be described more generally as a segmentation problem. The major challenge for this
sub-problem is handling occlusion, since we need to identify people even when they are
partially blocked by other people. Another potential challenge is detecting objects at
multiple scales.
The key objectives for this task are processing speed and detection performance.
Processing speed is important because a real-time algorithm allows real-time intervention,
letting the user enforce social distancing; it is measured in frames per second.
Performance refers to how accurately objects are detected and segmented; it is measured
with mAP (mean average precision), which is computed from the ratio of the number of
true positives to the total number of predictions [13].
In object detection, a prediction is considered a true positive when its IoU (Intersection over
Union) with the ground truth surpasses a certain threshold.
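As a minimal sketch of this criterion, the IoU between two axis-aligned boxes can be computed as follows; the (x1, y1, x2, y2) corner format and the 0.5 threshold are our own illustrative conventions:

```python
# Minimal sketch: IoU between two axis-aligned boxes, each given as
# (x1, y1, x2, y2) with (x1, y1) the top-left and (x2, y2) the
# bottom-right corner. The box format here is an assumption.
def iou(box_a, box_b):
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter)

# A prediction counts as a true positive when IoU exceeds a threshold,
# commonly 0.5.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)) >= 0.5)  # False (IoU = 25/175)
```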
Since most state-of-the-art object detection algorithms depend on very large neural
network backbones (e.g., ResNet) that typically require multiple GPUs to train, it is infeasible
for our team to train such models. Therefore, for this sub-problem, we focus our effort on
selecting the best model through a literature review, and on implementing our algorithm
using a pretrained model.
Metrics of Interest - Distance Measuring:
To recover 3-D distance from a 2-D image, various approaches can be used, ranging from
simply counting pixel distances to computing camera models from external parameters,
homography, and other methods. For our project, we wish to create a generalizable method
that does not depend on any external parameters.
Since there are no existing benchmarks for this application, we compare the results of
various approaches qualitatively on the same video. As with the object detection
sub-problem, we also consider processing time a key metric for our solution.
Literature Review
In this section, we first examine existing approaches for object detection, compare the
state-of-the-art models, and pick the model that best suits our objectives. We then
examine some current implementations of social distancing measurement to identify
their strengths and weaknesses. Lastly, we compare the distance measuring metrics of
these approaches.
Object detection
Though the idea of computer vision is nowadays closely tied to deep learning and neural
nets, object detection is a topic of interest that predates AlexNet [8] by a long time.
Traditional object detection algorithms like the Viola-Jones framework [9] or SIFT [10]
generally involve two steps: the first is to obtain feature vectors from an image, and
the second is to match those feature vectors against a feature vector from
a known library of objects [11]. The traditional methods work well and efficiently, and are
highly explainable. However, compared to modern deep learning methods trained on huge
datasets, the traditional methods are less accurate and versatile, as shown by a comparative
study [12].
(Accuracy comparison table from [12])
As the table shows, the Viola-Jones algorithm performed poorly compared
to the trained deep models. However, as the table below from the same study shows,
this method has significantly faster processing speed, which makes traditional models like
it more useful in certain applications.
(Processing speed comparison table from [12])
The current state-of-the-art object detection algorithms can be divided into two distinct
classes: two-stage detectors and one-stage detectors. They differ in the number of
passes the image makes through a neural network. Two-stage detectors are represented by
approaches such as the R-CNN (region-based convolutional neural network) family of
algorithms, while one-stage detectors are represented by YOLO (You Only Look Once) and
SSD (Single Shot Detector) [14].
Two-stage methods typically involve first generating unlabeled bounding boxes (called
region proposals), then using a learned model such as a support vector machine or
convolutional neural network to classify each region [15]. Due to the multi-step approach,
R-CNNs are typically slower. In the original model, the region proposals are generated
with a traditional computer vision algorithm called selective search, and the proposals are
then classified with a mixture of convolutional neural networks and support vector
machines [16].
(Figure from [15])
The initial approach has been improved over time. In the most recent iteration, Faster
R-CNN, the region proposal step is replaced with a convolutional network built on a ResNet
backbone [17], which significantly speeds up proposal generation and brings the overall
pipeline close to real time [18].
One-stage models, on the other hand, use a different approach in which only one pass
through a neural network is needed. For YOLO, the image is first divided into an S-by-S
grid; a convolutional neural network then generates bounding boxes and class predictions for
each cell in the grid, and the network integrates these results into the final prediction
[19].
(Figure from [19])
Unlike the R-CNN approach, since there is no need to run a neural network on many
separate proposals, the YOLO model is significantly faster. Its current iteration is the
YOLOv5 model released on July 24th [20]. Though it was not developed and trained by the
author of the original paper, it has an architecture similar to the original models, with
slightly improved precision and run time compared to the previous generations, as shown by
the author's comparison on the COCO dataset [7].
(Figure from [20])
Another example of a one-stage detector is the SSD (Single Shot Detector) model. This class
of models is similar to YOLO in that it also involves only one pass through a convolutional
neural network. However, it differs from YOLO in a couple of respects. First, after
obtaining the feature map from the backbone CNN, several convolutional layers are used to
obtain features at multiple scales, similar to the use of a Gaussian pyramid, and features
from all levels of the convolutional network are fed into the next layer. This provides richer
information to the subsequent layers, making the network more fine-grained. Second, at each
scale, SSD only generates bounding boxes with fixed aspect ratios, as opposed to boxes of
arbitrary shape. This significantly reduces the number of potential outputs, simplifying the
search [21].
(Figure from [21])
To choose the object detection model for our project, we first need to pick a model with
reasonable speed. From a comparative study [22] of these models' performance across
different papers, it can be seen that the SSD and YOLO models have significantly higher
FPS, while R-CNNs tend to be very slow. This is expected, since Faster R-CNN requires two
passes through two different neural networks in series, while SSD and YOLO require only
one. Since our goal is to develop a real-time algorithm, ideally without the need for a GPU,
an R-CNN-type model would be infeasible to implement.
(Figure from [22])
In terms of mean average precision, SSD and YOLO have similar performance on the older
PASCAL VOC dataset [23]; however, YOLO performs significantly better on the COCO
dataset, which has more classes and contains more images. For this reason, and because the
YOLO code is better documented on GitHub, we chose the most recent YOLOv5 for our
object detection.
(Figures from [22])
Camera Calibration - Undistorted Image
Before applying bird eye view transformation to our frames, automatic calibration is a well
known computer vision problem that can significantly increase the accuracy of distance
measurement by removing any distortion, and will also allow us to measure in real world
metrics instead of pixels.
For the best results, the camera should be calibrated for its intrinsic and extrinsic parameters.
We can use a camera calibration function to obtain the distortion coefficients, and then
correct for distortion using them.
Two major types of distortion exist in many modern cameras: tangential and radial
distortion [25]. Radial distortion makes straight lines in the 3-D world appear curved and
bulged out in the image, while tangential distortion makes some objects appear closer than
they are; it usually occurs when the lens is not parallel to the image plane.
(Example of radial distortion, where the chessboard lines are not aligned and bulge out
compared to the red lines.)
Radial and tangential distortion are modelled respectively as:
x_distorted = x(1 + k1*r^2 + k2*r^4 + k3*r^6), y_distorted = y(1 + k1*r^2 + k2*r^4 + k3*r^6)
x_distorted = x + [2*p1*x*y + p2*(r^2 + 2*x^2)], y_distorted = y + [p1*(r^2 + 2*y^2) + 2*p2*x*y]
where r^2 = x^2 + y^2, k1, k2, k3 are the radial distortion coefficients, and p1, p2 are the
tangential distortion coefficients [25].
Calibration is the process of determining these distortion coefficients along with the
camera's intrinsic and extrinsic parameters, which are specific to the camera.
OpenCV currently supports three patterns for calibration: the chessboard, the symmetric
circle grid, and the asymmetric circle grid [25].
Using many snapshots of the chosen pattern, and since we know the 3-D positions of the
chessboard corners, we can calibrate the camera from the differences between expected and
detected corner positions, and then use the resulting data to undistort images.
In other words, to estimate the camera parameters, we need the 3-D world points of a
calibration pattern (chessboards are often used) and their corresponding 2-D image points,
which can be found with OpenCV functions such as cv2.findChessboardCorners.
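A minimal sketch of this OpenCV workflow is shown below; the 9x6 inner-corner pattern and the file paths are assumptions for illustration:

```python
import glob
import cv2
import numpy as np

# Sketch of the OpenCV chessboard calibration workflow described above.
# The 9x6 inner-corner pattern and the image paths are assumptions.
pattern = (9, 6)

# 3-D world points of the chessboard corners: (0,0,0), (1,0,0), ...
# measured in chessboard-square units.
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_points, img_points = [], []
for path in glob.glob('calibration_images/*.jpg'):
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, pattern, None)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# Solve for the camera matrix and distortion coefficients ...
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)

# ... then undistort any frame from the same camera.
frame = cv2.imread('frame.jpg')
undistorted = cv2.undistort(frame, mtx, dist)
```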
(Example from [24], showing a chessboard before and after calibration, and then a bird's-eye
view of the image.)
In our project, we assume a static camera angle with relatively low distortion in the
frames, so the impact of projection and distortion is minimal when estimating distances
between bounding boxes. Ideally, however, we would still remove any distortion and then
compute object distances in a bird's-eye perspective.
Unfortunately, since we know nothing about the camera used to record the input video, we
cannot estimate the intrinsic parameters that are specific to that camera, and we are also
unable to obtain pictures of a calibration pattern, such as a chessboard, taken by that camera
in order to calibrate it with OpenCV.
Ideally, future versions of our social distancing detector would leverage a proper
camera calibration to map pixel distances to measurable units (e.g., metres). Camera
calibration is an important step toward improving social distance detection; nevertheless, we
were still able to get decent results thanks to the nature of the camera angle, and since we
only need an approximation of distance, its absence did not significantly affect our results.
Triangle Similarity (Potential alternative to camera calibration)
Triangle similarity estimates the camera's distance from a known object in order to derive a
perceived focal length [34]. In summary, when we have a marker with a known width and a
known distance from the camera, we can take a 2-D image of it with our camera and measure
its perceived width. The formula for the perceived focal length is F = (P x D) / W,
where W is the known width in real-world units, D is the distance, and P is the perceived
width in pixels. Triangle similarity can therefore be applied whenever the two parameters W
and D are known: computer vision algorithms can then measure the perceived width of the
object and hence derive the focal length. We explored this as an alternative to calibrating our
camera with chessboards in OpenCV, but unfortunately we were still unable to apply the
method, as we cannot determine the parameters W and D [34].
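To make the arithmetic concrete, here is a small sketch with made-up numbers (the marker width, distance, and pixel widths are all assumptions):

```python
# Sketch of triangle similarity with made-up numbers: a marker of known
# width W (metres) photographed at a known distance D (metres) appears
# P pixels wide in the image.
KNOWN_WIDTH_M = 0.5     # W: assumed marker width
KNOWN_DISTANCE_M = 3.0  # D: assumed distance of the reference photo
perceived_width_px = 180  # P: measured from the reference image

# Perceived focal length from the formula F = (P x D) / W.
focal_px = (perceived_width_px * KNOWN_DISTANCE_M) / KNOWN_WIDTH_M

# Once F is known, the same relation gives the distance to a new sighting
# of the marker: D' = (W x F) / P'.
new_width_px = 90
distance_m = (KNOWN_WIDTH_M * focal_px) / new_width_px
print(distance_m)  # 6.0: half the apparent width means twice the distance
```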
Methodology, Experiments and Reviews
Object Detection Model Performance Benchmark:
For the deep learning object detection model, we narrowed our scope to a choice between
Detectron2, which maintains top-class accuracy in object detection, and YOLOv5, which
offers the fastest speed for real-time object detection. Detectron2 bundles a variety of object
detection models such as Mask R-CNN and Faster R-CNN FPN, which score top accuracy
on open datasets such as COCO and COCO minival [28][29]. YOLOv5 is famous for its
amazing training and testing speed, as well as its small model size. Since both speed and
accuracy are primary concerns for this project, we tested both models on public datasets and
compared their performance quantitatively to get the best result.
Dataset:
The datasets we used are the Multi-camera Pedestrians Video from EPFL [30], the Joint
Attention in Autonomous Driving (JAAD) dataset, and some uncalibrated camera videos
denoted the 'custom dataset'. These datasets were selected deliberately, since our social
distance detection program will mainly be used on public-area pedestrian walkways and for
analyzing real-time camera footage. The EPFL dataset contains footage of multiple people
walking randomly, which tests the models' computational capability. JAAD contains footage
shot from cars; videos featuring various crosswalks and pedestrians were selected to test the
scalability of the models. Finally, the custom dataset increases the variability among the
datasets and tests the models' robustness.
Testing Environment:
The test is set up in the Colab notebook environment with the default GPU (a Tesla K80).
For Detectron2, since there are many model configurations, we use the Fast
R-CNN R50-FPN baseline from the RPN & Fast R-CNN model zoo section for top-class
prediction accuracy. For YOLOv5, we accept all predefined parameters, such as the number
of fully connected and CNN layers. We use pretrained YOLOv5l weights to maintain fast
GPU speed while achieving high AP (Average Precision).
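For reference, the pretrained YOLOv5l weights can be loaded via torch.hub as documented in the ultralytics repository [20]; this is a hedged sketch, and the frame path and confidence threshold are assumptions:

```python
import torch

# One way to load the pretrained YOLOv5l weights via torch.hub, following
# the ultralytics repository [20]; 'frame.jpg' and the 0.4 threshold are
# illustrative assumptions.
model = torch.hub.load('ultralytics/yolov5', 'yolov5l', pretrained=True)
model.conf = 0.4  # confidence threshold for detections

results = model('frame.jpg')  # run inference on a single frame
detections = results.xyxy[0]  # tensor of [x1, y1, x2, y2, conf, class]
people = detections[detections[:, 5] == 0]  # COCO class 0 is 'person'
```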
Inference time \ Dataset | JAAD video 0067 | EPFL 6p-c1 | EPFL 4p-c0 | Custom video
YOLOv5 | 0.013±0.002 s | 0.011±0.002 s | 0.011±0.001 s | 0.031±0.01 s
Detectron2 | 0.512±0.201 s | 0.332±0.521 s | 0.385±0.028 s | 0.529±0.511 s
Table 1: Object detection speed (inference time per frame, over 300 frames)
Inference errors \ Dataset | JAAD video 0067 | EPFL 6p-c1 | EPFL 4p-c0 | Custom video
YOLOv5 | 1/300 frames | 3/300 frames | 0/300 frames | 0/300 frames
Detectron2 | 0/300 frames | 2/300 frames | 0/300 frames | 0/300 frames
Table 2: Object detection errors per 300 frames
Analysis:
From Table 1, we can see that YOLOv5 is 30-50 times faster than Detectron2. The inference
time for each frame is about 0.01-0.03s for YOLOv5, compared to Detectron2 that is about
0.3-0.5s per frame. For custom video, the inference speed tend to increase, this may be
caused by the fact that custom videos are not properly calibrated and standardized.
From Table 2, we can see that the two models make very similar errors. These occur when
only part of a person's body enters the frame: Detectron2 detects a 'person' with about 0.6
confidence even when only an arm is in the frame, while YOLOv5 assigns lower confidence
in these edge cases, which are therefore ruled out by the threshold. Since what we care about
is the interaction BETWEEN people during a pandemic, these edge-case errors are
considered negligible.
Bird Eye View - Warped Image
A bird's-eye view warps the image into a top-down perspective of the scene. It allows us to
significantly increase the accuracy of distance measurement, since image-plane Euclidean
distance is sometimes misleading: two people who appear geometrically close in the image
may actually be geographically far apart. Some important assumptions are made about our
input video when working with the bird's-eye perspective. First, we assume our input video
has a fixed camera angle with respect to the road in the frames. We also assume, for best
results, that the surface is planar and free of interfering obstacles. The steps for creating a
top-down view of a captured scene by applying a homography are: resizing our image to an
appropriate size and manually selecting a parallelogram on the ground surface, applying
camera calibration to remove distortion from the frame, transforming the image into the
bird's-eye view, and lastly enlarging and cropping the region of interest [27].
Distance Measurement between bounding boxes
The first step involves manually selecting the four corners of the scene we want to warp into
the bird's-eye perspective, which defines our chosen plane. Using
cv2.getPerspectiveTransform, which takes the selected corner points as input, we can
determine the transformation matrix. This calculates the 3x3 perspective transform matrix
from the four corresponding points, which maps the original image coordinates to the
bird's-eye view coordinates [31].
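A minimal sketch of this step, with placeholder corner points and an assumed output size:

```python
import cv2
import numpy as np

# Sketch of the homography step; the four source corners below are
# placeholders for the manually selected ground-plane corners.
src = np.float32([[420, 250], [860, 250], [1100, 700], [180, 700]])
# Destination rectangle in the bird's-eye image (600x800 is assumed).
dst = np.float32([[0, 0], [600, 0], [600, 800], [0, 800]])

M = cv2.getPerspectiveTransform(src, dst)  # 3x3 transform matrix
frame = cv2.imread('frame.jpg')
bird_eye = cv2.warpPerspective(frame, M, (600, 800))
```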
We can then apply this transformation matrix to the bounding box of each person detected in
the first stage using the pretrained model discussed earlier. The transformation gives us
near-real-world ground-plane coordinates for each bounding box, which is significantly
more accurate than measuring distances between the original image points [26].
For each person detected, the top-left and bottom-right corner points of the bounding box
are returned. From these points, we computed the centroid of the box as their midpoint.
Using this result, we calculated the coordinates of the point at the bottom centre of the box,
which we determined best represents the position of a person.
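A sketch of this mapping, assuming (x1, y1, x2, y2) boxes from the detector and the matrix M from cv2.getPerspectiveTransform above:

```python
import cv2
import numpy as np

# Sketch: map each box's bottom-centre point into bird's-eye coordinates
# with the matrix M from cv2.getPerspectiveTransform. Boxes are assumed
# to be (x1, y1, x2, y2) tuples.
def bottom_centres(boxes, M):
    pts = np.float32([[(x1 + x2) / 2.0, y2] for x1, y1, x2, y2 in boxes])
    # cv2.perspectiveTransform expects input of shape (N, 1, 2).
    warped = cv2.perspectiveTransform(pts.reshape(-1, 1, 2), M)
    return warped.reshape(-1, 2)
```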
Bird Eye View Example:
Original Frame:
Then, by selecting the sidewalk and road as our bird's-eye view scene, we get the following warp:
Bounding box corners circled before warp:
Bounding box bottom centres after warp (we use the bottom centre as the reference point for
each bounding box):
Distance Measurement:
To calculate the distance between any two people, we found the Euclidean distance between
every pair of bottom-centre points and stored them. Next, we determined which pairs of
people violated the social distance threshold and passed those boxes to our visualizer, which
highlights them in red.
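A sketch of this check over the warped points (the threshold is in the warped units and is an assumption):

```python
from itertools import combinations
import math

# Sketch of the violation check: 'points' are bird's-eye bottom-centre
# coordinates, and 'threshold' is in the same (warped) units.
def find_violations(points, threshold):
    violations = []
    for (i, p), (j, q) in combinations(enumerate(points), 2):
        if math.dist(p, q) < threshold:
            violations.append((i, j))  # these two boxes get drawn in red
    return violations
```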
Social Distancing Demos
1) Distance measured with Euclidean Distance:
(Including social distance violation detection below)
2) Distance measured with Bird Eye coordinates
Overall, by using a static camera angle with a wide view of the scene and relatively low
distortion, together with bird's-eye warping and state-of-the-art object detection models, we
were able to build a decent social distance detector. After comparing the performance of
multiple models across standard video datasets such as JAAD and EPFL, YOLOv5 stood out
in our benchmark tests and was used as our object detection model while exploring the
various transformations that improve the accuracy of measuring the distance between two
people.
3) Distance detection with Heatmap
We also render heatmaps at the locations where people violate the social distancing rule,
powered by the Python library 'heatmappy'. Locations where violations occur accumulate
from blue to green to red, signifying increasing potential danger over elapsed time. This
heatmap functionality not only highlights dangerous areas during a pandemic, but also helps
detect poorly designed public facilities, informing suggestions for public disease
control [35].
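A small sketch following the heatmappy README (the frame path and points are illustrative assumptions):

```python
from heatmappy import Heatmapper
from PIL import Image

# Sketch following the heatmappy README; 'frame.jpg' is an assumption.
# 'points' would be the accumulated (x, y) pixel locations of violations.
points = [(320, 410), (325, 415), (600, 380)]

frame = Image.open('frame.jpg')
heatmapper = Heatmapper()  # defaults shade blue -> green -> red by density
heatmap_frame = heatmapper.heatmap_on_img(points, frame)
heatmap_frame.save('heatmap_frame.png')
```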
Conclusions
With the spread of Covid-19, social distancing has become very important for preventing the
spread of the virus and staying healthy. This project can be used as a starting point to
encourage people to maintain social distance and to be aware when walking into crowded
areas. Future iterations of social distance detection could improve our accuracy and results
in many ways. For example, YOLOv5 currently performs prediction with its default settings;
if we tuned it to detect fewer object classes and adapted the CNN layers to the video quality,
we could save inference time and achieve higher real-time processing speed. For the
bird's-eye view conversion, since we are using a homography, the method relies on four
predefined corners; in the future, a neural network could detect the ground plane in images
and transform to the bird's-eye view automatically. Moreover, as mentioned earlier, we were
not able to leverage camera calibration to remove distortion from our frames. In the future, it
would be good to improve our detector by correcting projection and distortion with a proper
calibration of the camera; doing so would lead to better results and measurable units for
distances. This project was intriguing to work on within the domain of computer vision, and
is also practical in serious times such as the Covid-19 pandemic.
Authors’ Contributions
Victor:
- Worked on the literature review exploring various ways to estimate and compute
camera calibration, including alternatives such as triangle similarity.
- Implemented and debugged various ways our model could leverage the bird's-eye
projection and integrated it into our code.
- Explored distance measurement between pairs of people.
Haoyan:
- Implemented the YOLOv5 and Detectron2 model baselines and initialized the object
detection pipeline in the early stage of the project.
- Ran the benchmark tests between YOLOv5 and Detectron2; collected literature on
object detection models and the bird's-eye view problem.
- Implemented the heatmap feature in videos.
Evan:
- Implemented the full pipeline of detection -> homography -> distance metrics -> display
on a local machine, integrating both Victor's and Haoyan's code.
- Defined the sub-problems and investigated performance metrics for object detection
models in common benchmarks.
- Performed the literature review on traditional and deep-learning-based object detection
models and selected the best model.
- Ported Haoyan's implementation of the models from Google Colab to a local machine.
- Edited the presentation video.
References
[1] P. Canada, "Coronavirus disease (COVID-19) outbreak updates, symptoms, prevention,
travel, preparation - Canada.ca", Canada.ca, 2020. [Online]. Available:
https://www.canada.ca/en/public-health/services/diseases/coronavirus-disease-covid-19.html.
[Accessed: 07- Aug- 2020].
[2]P. Canada, "Coronavirus disease (COVID-19): Prevention and risks - Canada.ca",
Canada.ca, 2020. [Online]. Available:
https://www.canada.ca/en/public-health/services/diseases/2019-novel-coronavirus-infection/p
revention-risks.html. [Accessed: 07- Aug- 2020].
[3]"UWB social distancing", Uwb-social-distancing.com, 2020. [Online]. Available:
https://www.uwb-social-distancing.com/?gclid=CjwKCAjw9vn4BRBaEiwAh0muDDfoyUu
vEQOgL_TSgWDK6RUes-uu1F4hWxUBkpSMDj2jmwDEC2Id3BoC8-AQAvD_BwE.
[Accessed: 07- Aug- 2020].
[4]"Social distancing app uses space to save lives", Esa.int, 2020. [Online]. Available:
https://www.esa.int/Applications/Telecommunications_Integrated_Applications/Social_dista
ncing_app_uses_space_to_save_lives. [Accessed: 07- Aug- 2020].
[5]2020. [Online]. Available: https://globalreachtech.com/,
https://www.itworldcanada.com/blog/need-technology-help-to-maintain-social-distancing/43
1981. [Accessed: 07- Aug- 2020].
[6]"5 Significant Object Detection Challenges and Solutions", Medium, 2020. [Online].
Available:
https://towardsdatascience.com/5-significant-object-detection-challenges-and-solutions-924c
b09de9dd. [Accessed: 07- Aug- 2020].
[7]"COCO - Common Objects in Context", Cocodataset.org, 2020. [Online]. Available:
https://cocodataset.org/#home. [Accessed: 07- Aug- 2020].
[8]"Open Images V6 - Description", Storage.googleapis.com, 2020. [Online]. Available:
https://storage.googleapis.com/openimages/web/factsfigures.html. [Accessed: 07- Aug-
2020].
[9]A. Krizhevsky, I. Sutskever and G. Hinton, "ImageNet classification with deep
convolutional neural networks", Communications of the ACM, vol. 60, no. 6, pp. 84-90,
2017. Available: 10.1145/3065386.
[10] P. Piccinini, A. Prati, and R. Cucchiara, “Real-time object detection and localization
with SIFT-based clustering,” 03-Jul-2012. [Online]. Available:
https://www.sciencedirect.com/science/article/pii/S0262885612000923. [Accessed:
07-Aug-2020].
[11] N. O’Mahony, S. Campbell, A. Carvalho, S. Harapanahalli, G. V. Hernandez, L.
Krpalkova, D. Riordan, and J. Walsh, “Deep Learning vs. Traditional Computer Vision,”
Advances in Intelligent Systems and Computing Advances in Computer Vision, pp. 128–144,
2019.
[12] L. T. Nguyen-Meidine, E. Granger, M. Kiran, and L.-A. Blais-Morin, “A comparison of
CNN-based face and head detectors for real-time video surveillance applications,” 2017
Seventh International Conference on Image Processing Theory, Tools and Applications
(IPTA), 2017.
[13] J. Hui, “mAP (mean Average Precision) for Object Detection,” Medium, 03-Apr-2019.
[Online]. Available:
https://medium.com/@jonathan_hui/map-mean-average-precision-for-object-detection-45c12
1a31173. [Accessed: 07-Aug-2020].
[14] Intellica.AI, “A Comparative Study of Custom Object Detection Algorithms,” Medium,
18-Dec-2019. [Online]. Available:
https://medium.com/@Intellica.AI/a-comparative-study-of-custom-object-detection-algorith
ms-9e7ddf6e765e. [Accessed: 07-Aug-2020].
[15] R. Gandhi, “R-CNN, Fast R-CNN, Faster R-CNN, YOLO - Object Detection
Algorithms,” Medium, 09-Jul-2018. [Online]. Available:
https://towardsdatascience.com/r-cnn-fast-r-cnn-faster-r-cnn-yolo-object-detection-algorithm
s-36d53571365e. [Accessed: 07-Aug-2020].
[16] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich Feature Hierarchies for
Accurate Object Detection and Semantic Segmentation,” 2014 IEEE Conference on
Computer Vision and Pattern Recognition, 2014.
[17] K. He, X. Zhang, S. Ren, and J. Sun, “Deep Residual Learning for Image Recognition,”
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[18] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards Real-Time Object
Detection with Region Proposal Networks,” IEEE Transactions on Pattern Analysis and
Machine Intelligence, vol. 39, no. 6, pp. 1137–1149, 2017.
[19] Intellica.AI, “A Comparative Study of Custom Object Detection Algorithms,” Medium,
18-Dec-2019. [Online]. Available:
https://medium.com/@Intellica.AI/a-comparative-study-of-custom-object-detection-algorith
ms-9e7ddf6e765e. [Accessed: 07-Aug-2020].
[20] Ultralytics, “ultralytics/yolov5,” GitHub, Jul-2020. [Online]. Available:
https://github.com/ultralytics/yolov5. [Accessed: 07-Aug-2020].
[21] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, “SSD:
Single Shot MultiBox Detector,” Computer Vision – ECCV 2016 Lecture Notes in Computer
Science, pp. 21–37, 2016.
[22] J. Hui, “Object detection: speed and accuracy comparison (Faster R-CNN, R-FCN, SSD,
FPN, RetinaNet and...,” Medium, 26-Mar-2019. [Online]. Available:
https://medium.com/@jonathan_hui/object-detection-speed-and-accuracy-comparison-faster-
r-cnn-r-fcn-ssd-and-yolo-5425656ae359. [Accessed: 07-Aug-2020].
[23]“Visual Object Classes Challenge 2012 (VOC2012),” The PASCAL Visual Object
Classes Challenge 2012 (VOC2012). [Online]. Available:
http://host.robots.ox.ac.uk/pascal/VOC/voc2012/. [Accessed: 07-Aug-2020].
[24] “Camera Calibration,” OpenCV. [Online]. Available:
https://docs.opencv.org/3.4/dc/dbb/tutorial_py_calibration.html. [Accessed: 07-Aug-2020].
[25] “Camera Calibration¶,” OpenCV. [Online]. Available:
https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_calib3d/py_calibration
/py_calibration.html. [Accessed: 07-Aug-2020].
[26] basileroth75, “basileroth75/covid-social-distancing-detection,” GitHub. [Online].
Available: https://github.com/basileroth75/covid-social-distancing-detection. [Accessed:
07-Aug-2020].
[27] “RidgeRun's Birds Eye View project research,” developer.ridgerun.com. [Online].
Available:
https://developer.ridgerun.com/wiki/index.php?title=Birds_Eye_View%2FIntroduction%2FR
esearch. [Accessed: 07-Aug-2020].
[28] “Papers with Code - COCO test-dev Benchmark (Object Detection),” The latest in
machine learning. [Online]. Available:
https://paperswithcode.com/sota/object-detection-on-coco. [Accessed: 07-Aug-2020].
[29]“Papers with Code - COCO minival Benchmark (Object Detection),” The latest in
machine learning. [Online]. Available:
https://paperswithcode.com/sota/object-detection-on-coco-minival. [Accessed:
07-Aug-2020].
[30] F. Fleuret, J. Berclaz, R. Lengagne, and P. Fua, "Multi-camera pedestrians video,"
CVLAB. [Online]. Available:
https://www.epfl.ch/labs/cvlab/data/data-pom-index-php/. [Accessed: 07-Aug-2020].
[31] ayushmankumar, “Perspective Transformation - Python OpenCV,” GeeksforGeeks,
09-Jul-2020. [Online]. Available:
https://www.geeksforgeeks.org/perspective-transformation-python-opencv/. [Accessed:
07-Aug-2020].
[32] A. Rosebrock, “4 Point OpenCV getPerspective Transform Example,” PyImageSearch,
18-Apr-2020. [Online]. Available:
https://www.pyimagesearch.com/2014/08/25/4-point-opencv-getperspective-transform-exam
ple/. [Accessed: 08-Aug-2020].
[33] “Camera Calibrator,” What Is Camera Calibration? - MATLAB & Simulink. [Online].
Available: https://www.mathworks.com/help/vision/ug/camera-calibration.html. [Accessed:
08-Aug-2020].
[34] “Find distance from camera to object using Python and OpenCV,” PyImageSearch,
18-Apr-2020. [Online]. Available:
https://www.pyimagesearch.com/2015/01/19/find-distance-camera-objectmarker-using-pytho
n-opencv/. [Accessed: 08-Aug-2020].
[35] A. Pai, “Build your Social Distancing Detection Tool using Deep Learning,” Analytics
Vidhya, 28-Jun-2020. [Online]. Available:
https://www.analyticsvidhya.com/blog/2020/05/social-distancing-detection-tool-deep-learnin.
[Accessed: 08-Aug-2020].