INTRODUCTION TO COMPUTER VISION
(Computer Vision and Robotics)
Contents
CHAPTER 1
1.1 Introduction
CHAPTER 1
COMPUTER VISION
1.1    Introduction
      This chapter describes the vision-based control strategies for a pick-and-place robotic application. The software implementation of these strategies is accomplished using MATLAB/Simulink from MathWorks. The vision algorithms identify the objects of interest and send their position and orientation data to the data-acquisition system and then to the microcontroller, which solves the inverse kinematics and commands the robot to pick these objects and place them at a target goal.
[Block diagram: PC → data-acquisition system → microcontroller]
[Figure: coordinate frames {B}, {W}, {T}, {S}, {G} and {C} of the robot, goal and camera]
{}^{B}T_{W} = {}^{B}T_{S} \, {}^{S}T_{G} \, ({}^{W}T_{G=T})^{-1}                 (1.1)
      where {}^{B}T_{S} and ({}^{W}T_{G=T})^{-1} are known from the physical dimensions of the robot. The robot variables are included in the robot matrix {}^{B}T_{W}. To get {}^{B}T_{W}, the matrix {}^{S}T_{G}, which gives the position and orientation of the goal relative to the frame {S}, must be determined as seen in Section 1.7.
      It is worth noting that (x, y)_image will be determined through the image-processing algorithms discussed in the following sections. However, determining (X, Y)_world of the object centroid in mm needs a good calibration process; see Section 1.5.
Fig. 1.3 Open-loop control block diagram (Camera → Video Processing → (x, y) in pixels → Camera Calibration → (X, Y) in mm → Robot Inverse Kinematics → robot variables d1, θ2, θ3, θ4, θ5 → Robot → end-effector position)

      Certain errors can arise in this open-loop control system from different sources, such as inverse-kinematics inaccuracy, robot precision and camera calibration.
image at that point. When (x), (y), and the amplitude values of (f) are all
finite, discrete quantities, we call the image a digital image. The field of
digital image processing refers to processing digital images by means of a
digital computer. Note that a digital image is composed of a finite number
of elements, each of which has a particular location and value. These
elements are referred to as picture elements, image elements, and pixels.
Pixel is the term most widely used to denote the elements of a digital
image.
Image Coordinates
      Assume that an image f(x, y) is sampled so that the resulting image has M rows and N columns; the image is then of size M × N. The values of
the coordinates are discrete quantities. For notational clarity and
convenience, we shall use integer values for these discrete coordinates.
The image origin is usually defined to be at (x, y) = (0, 0). The next
coordinate values along the first row of the image are (x, y) = (0, 1). The
notation (0, 1) is used to signify the second sample along the first row. It
does not mean that these are the actual values of physical coordinates
when the image was sampled.
    Fig. 1.6 shows this coordinate convention. Note that (x) ranges from
0 to (M–1) and (y) from 0 to (N–1) in integer increments.
      The coordinate system in Fig. 1.6 and the preceding discussion lead
to the following representation for a digitized image:
f(x, y) = \begin{bmatrix}
f(0,0) & f(0,1) & \cdots & f(0,N-1) \\
f(1,0) & f(1,1) & \cdots & f(1,N-1) \\
\vdots & \vdots & \ddots & \vdots \\
f(M-1,0) & f(M-1,1) & \cdots & f(M-1,N-1)
\end{bmatrix}
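      As a concrete illustration of this representation, a minimal MATLAB sketch is given below; the file name 'scene.png' is hypothetical, and the snippet assumes a colour image that is reduced to a single intensity plane. Note that MATLAB indexes matrices from 1, so the sample at (x, y) = (0, 0) in the convention above is f(1, 1) in MATLAB.

    % Read an image and treat it as an M-by-N matrix of intensity values
    f = rgb2gray(imread('scene.png'));   % hypothetical file, reduced to grayscale
    [M, N] = size(f);                    % M rows, N columns
    firstSample  = f(1, 1);              % sample at (x, y) = (0, 0)
    secondSample = f(1, 2);              % second sample along the first row, (x, y) = (0, 1)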
in front of the lens. Fig. 1.7 illustrates the pinhole camera model with the
image plane located in front of the lens.
  Fig. 1.7 Illustration of the pinhole camera model, image plane in front
            of the lens to simplify calculations [Courtesy of Maria
            Magnusson Seger].
     The point where the Z axis pierces the image plane is known as the
principal point and the Z axis as the principal axis. The origin of the
image coordinate system is chosen, for now, as the principal point and its
x- and y axes are aligned with the X and Y axes of the camera coordinate
system. All of this is illustrated in Fig. 1.8.
\begin{bmatrix} x \\ y \\ z \end{bmatrix} =
\begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}                 (1.2)

x_{image} = P \, X_{cam}                 (1.3)

P = \operatorname{diag}(f, f, 1) \, [\, I \mid 0 \,]                 (1.4)
      The camera matrix derived above assumes that the origin of the image coordinate system is at the principal point p. However, this is not usually the case in practice. If the coordinates of the principal point are (p_x, p_y) in the image coordinate system (see Fig. 1.9), the projection becomes:
\begin{bmatrix} x \\ y \\ z \end{bmatrix} =
\begin{bmatrix} f & 0 & p_x & 0 \\ 0 & f & p_y & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}                 (1.5)

K = \begin{bmatrix} f & 0 & p_x \\ 0 & f & p_y \\ 0 & 0 & 1 \end{bmatrix}                 (1.6)

P = K \, [\, I \mid 0 \,]                 (1.7)

x_{image} = K \, [\, I \mid 0 \,] \, X_{cam}                 (1.8)
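      As an illustration of (1.5)-(1.8), the following MATLAB sketch builds a calibration matrix and projects a camera-frame point to pixel coordinates; the focal length, principal point and test point are assumed values, not taken from the text.

    f  = 1000;  px = 320;  py = 240;        % assumed focal length and principal point
    K  = [f 0 px; 0 f py; 0 0 1];           % calibration matrix, equation (1.6)
    P  = K * [eye(3) zeros(3,1)];           % camera matrix P = K [I | 0], equation (1.7)
    Xcam   = [0.1; 0.05; 1.5; 1];           % assumed homogeneous camera-frame point [X Y Z 1]'
    ximage = P * Xcam;                      % homogeneous image point, equation (1.8)
    xy     = ximage(1:2) / ximage(3);       % pixel coordinates after dividing by the third component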
     The next step is to introduce the world coordinate system and relate
it to the camera coordinate system.
X_{world} = [\, X \;\; Y \;\; Z \;\; 1 \,]^{T}                 (1.9)
    Fig. 1.10 Camera geometry in a general world coordinate system.
Note that X_cam = 0 if X_world = C, i.e. the camera coordinate is zero at the camera center, as expected. From (1.9) we can write:
X_{cam} = \begin{bmatrix} R & -RC \\ 0^{T} & 1 \end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}
= \begin{bmatrix} R & t \\ 0^{T} & 1 \end{bmatrix} X_{world}                 (1.11)
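      A short MATLAB sketch of (1.11) follows; the rotation R, camera centre C and world point are assumed values used only to show how X_cam is formed.

    theta = deg2rad(10);                              % assumed rotation angle about Z
    R = [cos(theta) -sin(theta) 0;
         sin(theta)  cos(theta) 0;
         0           0          1];                   % assumed rotation of the camera frame
    C = [0.2; 0.1; 0.5];                              % assumed camera centre in world coordinates
    t = -R * C;                                       % translation term t = -RC, as in (1.11)
    Xworld = [0.3; 0.4; 0.0; 1];                      % assumed homogeneous world point
    Xcam   = [R t; 0 0 0 1] * Xworld;                 % point expressed in the camera frame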
1.4.3
true. In particular, if the numbers of pixels per unit distance in image coordinates are m_x and m_y in the x and y directions, respectively, then the calibration matrix becomes:
K = \begin{bmatrix} m_x & 0 & 0 \\ 0 & m_y & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} f & 0 & p_x \\ 0 & f & p_y \\ 0 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} \alpha_x & 0 & x_0 \\ 0 & \alpha_y & y_0 \\ 0 & 0 & 1 \end{bmatrix}                 (1.13)

where α_x = f m_x, α_y = f m_y, x_0 = m_x p_x and y_0 = m_y p_y. Allowing for a skew parameter s gives the general form:

K = \begin{bmatrix} \alpha_x & s & x_0 \\ 0 & \alpha_y & y_0 \\ 0 & 0 & 1 \end{bmatrix}                 (1.14)
{}^{C}T_{S} = \begin{bmatrix} {}^{C}R_{S} & {}^{C}t_{S} \\ 0^{T} & 1 \end{bmatrix}                 (1.15)

{}^{C}R_{S} = \begin{bmatrix} 0.9992 & 0.0027 & -0.0399 \\ -0.002 & 0.9999 & 0.0136 \\ 0.0399 & -0.0162 & 0.9991 \end{bmatrix}, \quad
{}^{C}t_{S} = \begin{bmatrix} -226.1425 \\ -166.5716 \\ 1.24 \times 10^{3} \end{bmatrix}                 (1.16)

k = \begin{bmatrix} 1.064 \times 10^{3} & 0 & 0 \\ -0.8118 & 1.0658 \times 10^{3} & 0 \\ 258.3899 & 295.171 & 1 \end{bmatrix}                 (1.17)
\begin{bmatrix} x_{image} \\ y_{image} \\ z_{image} \end{bmatrix} =
\begin{bmatrix} 1.064 \times 10^{3} & 0 & 0 \\ -0.8118 & 1.0658 \times 10^{3} & 0 \\ 258.3899 & 295.171 & 1 \end{bmatrix} \cdot
\begin{bmatrix} 0.9992 & 0.0027 & -0.0399 & -226.1425 \\ -0.002 & 0.9999 & 0.0136 & -166.5716 \\ 0.0399 & -0.0162 & 0.9991 & 1.24 \times 10^{3} \end{bmatrix}
\begin{bmatrix} X_{world} \\ Y_{world} \\ Z_{world} \\ 1 \end{bmatrix}                 (1.18)

\begin{bmatrix} x_{image} \\ y_{image} \\ z_{image} \end{bmatrix} =
\begin{bmatrix} 0.0106 & 0 & -0.0004 & -2.4079 \\ 0 & 0.0107 & 0.0002 & -1.7735 \\ 0.0026 & 0.0030 & 0 & -1.0636 \end{bmatrix}
\begin{bmatrix} X_{world} \\ Y_{world} \\ 0 \\ 1 \end{bmatrix}                 (1.19)
      where x_image and y_image are in pixels and are determined from the designed blob-analysis algorithms. So, after identifying the values of (x, y)_image in pixels, we can calculate (X, Y)_world in mm of the target objects from (1.20) and (1.21).
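      Equations (1.20) and (1.21) are not reproduced here, so the following is only a minimal MATLAB sketch of how such a pixel-to-millimetre conversion could be computed, assuming the planar-workspace case of (1.19) with Z_world = 0, so that columns 1, 2 and 4 of the 3 × 4 matrix form an invertible 3 × 3 mapping; the pixel values are hypothetical.

    M = [0.0106  0       -0.0004  -2.4079;
         0       0.0107   0.0002  -1.7735;
         0.0026  0.0030   0        -1.0636];   % 3-by-4 matrix of equation (1.19)
    H = M(:, [1 2 4]);                         % planar mapping, valid when Z_world = 0
    pixel = [320; 240; 1];                     % hypothetical (x, y)_image in homogeneous form
    sol = H \ pixel;                           % solve H * [X; Y; s] proportional to pixel
    Xworld_mm = sol(1) / sol(3);               % X_world in mm after normalising the scale
    Yworld_mm = sol(2) / sol(3);               % Y_world in mm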
      The proposed algorithm should guide the gripper to grasp the objects at their centroids, so the centroids of the objects should be obtained. The Blob Analysis block in the Simulink software is very similar to the "regionprops" function in MATLAB. They both measure a set of properties for each connected object in an image file. The properties include area, centroid, bounding box, major and minor axes, orientation and so on. The details of the proposed Simulink models will be explained in the next section. In the following sub-sections three different image-processing algorithms are discussed.
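      As an illustration of these blob-analysis properties, a minimal MATLAB sketch using regionprops is shown below; the file name and the small-area threshold are assumptions for the example.

    bw    = imbinarize(rgb2gray(imread('objects.png')));     % hypothetical scene, segmented to binary
    bw    = bwareaopen(bw, 50);                               % remove very small noise blobs (assumed size)
    stats = regionprops(bw, 'Area', 'Centroid', ...
                        'BoundingBox', 'Orientation');        % per-object properties
    for k = 1:numel(stats)
        c = stats(k).Centroid;                                % object centroid in pixels
        fprintf('Object %d: centroid (%.1f, %.1f), area %.0f\n', k, c(1), c(2), stats(k).Area);
    end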
      The method tracks and estimates the velocity of the robot arm only. It assumes all objects in the scene are rigid, so no shape changes are allowed. This assumption is often relaxed to local rigidity. It assures that optical flow actually captures real motions in a scene rather than expansions, contractions, deformations and/or shears of various scene objects.
Computation of differential optical flow is, essentially, a two-step procedure.
      The optical-flow methods try to calculate the motion between two image frames, which are taken at times (t) and (t + δt), at every voxel position. These methods are called differential since they are based on local Taylor-series approximations of the image signal; that is, they use partial derivatives with respect to the spatial and temporal coordinates.
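      As a rough MATLAB counterpart of the Simulink Optical Flow block used later, the sketch below estimates differential (Lucas-Kanade) optical flow between two consecutive frames; the video file name and noise threshold are assumptions.

    reader    = VideoReader('workspace.avi');            % hypothetical video of the scene
    opticFlow = opticalFlowLK('NoiseThreshold', 0.009);  % differential (Lucas-Kanade) estimator
    frame1 = rgb2gray(readFrame(reader));                % frame at time t
    estimateFlow(opticFlow, frame1);                     % initialise the estimator with the first frame
    frame2 = rgb2gray(readFrame(reader));                % frame at time t + dt
    flow   = estimateFlow(opticFlow, frame2);            % velocity field between the two frames
    speed  = flow.Magnitude;                             % per-pixel velocity magnitude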
techniques such as thresholding and median filtering are then sequentially applied to obtain labeled regions for statistical analysis.
      The idea of this project is derived from the tracking section of the demos listed on the MATLAB Computer Vision Toolbox website. The algorithm consists of a software simulation in Simulink.
      The Simulink model for this algorithm mainly consists of three parts, which are "Velocity Estimation (yellow block)", "Velocity Threshold Calculation (green block)" and "Blob Analysis (Centroid Determination) (red block)"; see Fig. 1.12.
      For the velocity estimation, the Optical Flow block (yellow block) from the Simulink built-in library is used. The Optical Flow block reads the image intensity values and estimates the velocity of object motion. The velocity estimation can be performed either between two images or between the current frame and the Nth frame back; see Fig. 1.12.
      After obtaining the velocity from the Optical Flow block, the velocity threshold must be calculated in order to determine the minimum velocity magnitude that corresponds to a moving object (green subsystem block; see Fig. 1.12).
      After that, the input velocity is compared with the mean velocity value using a Relational Operator block (gray block). If the input velocity is greater than the mean value, it is mapped to one, and to zero otherwise. The output of this comparison becomes a threshold intensity matrix that is passed to a Median Filter block (green block) and a Closing block (yellow block) to remove noise; see Fig. 1.13.
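      A minimal MATLAB sketch of this thresholding and clean-up stage is given below; it assumes 'speed' holds the per-pixel velocity magnitude from the optical-flow step, and the filter and structuring-element sizes are assumptions.

    vmean = mean(speed(:));                    % mean velocity magnitude over the frame
    mask  = speed > vmean;                     % one where velocity exceeds the mean, zero otherwise
    mask  = medfilt2(mask, [5 5]);             % median filter to remove isolated noise pixels
    mask  = imclose(mask, strel('disk', 3));   % morphological closing to fill small gaps
    stats = regionprops(mask, 'Centroid', 'BoundingBox');   % blob analysis of the moving regions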
be visualized as a 3-D matrix with the primary colors set out on the axes. The values for the primary colors vary from 0 to 1. Each color is coded with three values: one for red, one for green and one for blue. In this color space, an image imported on a computer is thus transformed into three matrices with a value per pixel for the corresponding primary color (Fig. 1.15).
colors. The color definition can be done with two 2-D images, but that is still very difficult.
      The Simulink model for this algorithm mainly consists of two parts, which are "Identifying RGB of target objects and Gripper Label" and "Boundary Box, Centroid Determination". For the RGB identification, the color-analyzer program "Camtasia Studio" is used to identify the RGB values of the objects, and a Simulink subsystem block called "RGB Filter" is built for the proposed RGB values; see Fig. 1.16 and Fig. 1.17.
      After obtaining the RGB values from the RGB Filter block, they are passed to the Blob Analysis block in order to obtain the bounding box, centroid and corresponding box area for each object; see Fig. 1.14.
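      A minimal MATLAB sketch of this RGB filtering followed by blob analysis is shown below; the target RGB value and the tolerance stand in for the values read with the colour analyzer and are assumptions.

    rgb    = im2double(imread('workspace.png'));     % hypothetical frame, channel values in 0..1
    target = [0.85 0.10 0.10];                       % assumed RGB of a red target object
    tol    = 0.15;                                   % assumed per-channel tolerance
    mask   = abs(rgb(:,:,1) - target(1)) < tol & ...
             abs(rgb(:,:,2) - target(2)) < tol & ...
             abs(rgb(:,:,3) - target(3)) < tol;      % pixels close to the target colour
    mask   = bwareaopen(mask, 100);                  % suppress small false detections
    stats  = regionprops(mask, 'Centroid', 'BoundingBox', 'Area');   % boxes, centroids and areas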
in which the current frame containing the moving object is subtracted from the background frame to detect the moving object. This method is simple and easy to realize, and it accurately extracts the characteristics of the target data, but it is sensitive to changes in the external environment, so it is applicable only when the background is known.
Fig. 1.19 Subsystem that takes the median over time of each pixel
      The absolute value of the difference between the whole picture and the background is taken to eliminate negatives. Then a threshold is established, so that anything above it is in the foreground and becomes white, and anything below it is in the background and becomes black; see Fig. 1.20.
Fig. 1.20 Subsystem that determines the threshold and blob analysis
Simulink can perform blob detection on the white objects and determine the points needed to draw a rectangle around them; see Fig. 1.14. Unfortunately, this software does not work perfectly due to lag: the system takes almost a full second to get a new frame and analyze it for background and object detection.
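      A minimal MATLAB sketch of this background-subtraction stage is given below; it assumes 'reader' is an open VideoReader, 'background' is the per-pixel median computed from earlier frames, and the threshold value is an assumption.

    frame     = rgb2gray(readFrame(reader));                 % current frame
    diffImage = abs(double(frame) - double(background));     % absolute difference eliminates negatives
    fgMask    = diffImage > 30;                               % assumed threshold: foreground becomes white
    fgMask    = medfilt2(fgMask, [5 5]);                      % clean isolated noise pixels
    stats     = regionprops(fgMask, 'BoundingBox');           % blobs of the detected moving objects
    if ~isempty(stats)
        frame = insertShape(frame, 'Rectangle', vertcat(stats.BoundingBox));   % draw rectangles
    end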