
Template Tracking and Visual Servoing for Alignment Tasks with Autonomous Underwater Vehicles
Mario Prats, Narcís Palomeras, Pere Ridao, Pedro J. Sanz

University Jaume I, Spain (e-mail: mprats@uji.es).
University of Girona, Spain (e-mail: narcispr@gmail.com).

Abstract: Alignment of underwater vehicles with respect to underwater structures is of utmost importance for many tasks performed in marine environments, especially for those requiring manipulation and, therefore, a specific and stable vehicle pose. These skills are crucial in the case of autonomous vehicles, where there is no human in the loop in charge of tele-operating the vehicle to a suitable position and attitude with respect to the target. In this paper we apply 2D visual servoing techniques based on template tracking to the problem of autonomously reaching and keeping a vehicle pose relative to a target in a typical underwater intervention scenario. Assuming that the object of interest has already been detected, we develop a method that autonomously aligns the vehicle relative to it and keeps the pose during the task. The method is validated, first in simulation, and then in a water tank with the Girona 500 vehicle.
Keywords: Marine systems, Autonomous Vehicles, Autonomous Control, Robot Vision,
Tracking applications.
1. INTRODUCTION
Exploration of the oceans and maintenance of underwater installations are attracting the interest of many companies and institutions all around the world, the former mainly because of the valuable resources that the ocean houses and its interest for scientists, the latter because of the need to operate underwater equipment typically found in intervention panels, observatories, etc. Remotely Operated Vehicles (ROVs) are currently the most widely used machines for these tasks. In a typical scenario, expert pilots remotely control the underwater vehicles from support vessels. However, due to the high costs and control problems involved in ROV-based missions, the trend is to advance towards more autonomous systems, i.e. Autonomous Underwater Vehicles (AUVs).
A typical task that needs to be automated in the transition
from ROVs to AUVs is to automatically reach and keep
a desired vehicle pose relative to a given target. Some
applications include, but are not limited to, station keeping
with respect to objects lying on the seafloor (e.g. for
grasping purposes), alignment with respect to underwater
intervention panels, homing and docking to a docking
station, etc.
In this paper, our approach to perform vision-based alignment tasks is presented (see Fig. 1). A template tracking algorithm feeds a visual servoing controller that guides an AUV towards a desired relative pose with respect to a target. The control signal generated by the vision controller is combined with an estimation of the actual vehicle velocity computed by fusing the readings of several vehicle sensors. The outputs are the control signals to be sent to the thrusters. Our approach is validated with the Girona 500 vehicle (Ribas et al., 2011) in the task of aligning with respect to a black box found on the floor of a water tank.

This research was partly supported by Spanish Ministry of Research and Innovation DPI2011-27977-C03 (TRITON Project), by the European Commission Seventh Framework Programme FP7/2007-2013 under Grant agreement 248497 (TRIDENT Project), by Foundation Caixa Castelló-Bancaixa PI.1B2011-17, and by Generalitat Valenciana ACOMP/2012/252.
This problem has already been addressed in other works, such as (Lots et al., 2001), where 2D visual servoing was used for visual station keeping with a Cartesian robot emulating underwater vehicle dynamics. In (van der Zwaan and Santos-Victor, 2001), a decoupled controller was presented in order to keep station with an ROV, using different vision-based controllers for the horizontal and the vertical motion. The main differences of our approach with respect to the aforementioned ones are that (i) a unique visual controller is considered for all the degrees of freedom, (ii) we are not limited to station keeping tasks, but also consider active alignment in translation and rotation, and (iii) the approach is validated on a real AUV.
The paper is organized as follows: Section 2 describes the tracking method; Section 3 outlines the visual servoing control algorithm; in Section 4 the module in charge of estimating the vehicle velocity from different sensors is described, whereas Section 5 provides details of the final controller running on the vehicle. Section 6 presents the simulation and water tank experiments, and Section 7 concludes the paper.

Fig. 1. A diagram with the different blocks presented in this paper.

2. TEMPLATE TRACKING

In a visual servoing context, vision is normally used in order to obtain an estimation of the position of a set of visual features, either in the image (image-based visual servoing) or in 3D space (position-based visual servoing). If working in 3D space, pose estimation algorithms are normally required in order to retrieve the 3D structure from image features. These algorithms (e.g. Dementhon (Dementhon and Davis, 1995)) normally require some a-priori knowledge of the dimensions of the real object, or of its distance to the camera. In the case of image-based visual servoing, however, no a-priori knowledge of the object is required, but 3D information about the pose of the target in camera coordinates is not available. For situations where the 3D pose of the object is required (e.g. manipulation), position-based visual servoing is normally preferred. On the contrary, for applications where pose information is less important, like visual alignment or inspection tasks, image-based visual servoing may be a better choice. This work is focused on visual alignment tasks of an AUV with respect to a target object. Therefore, an image-based visual servoing approach is adopted.

Both approaches require robust tracking of a set of visual features. In our approach, template-based tracking is adopted, since it is a robust and accurate tracking method, especially for situations where the target to track is planar or quasi-planar. More concretely, the Efficient Second Order Minimization method has been adopted (Benhimane and Malis, 2004). This method computes in real time the homography, H, that minimizes the sum of squared differences between a given template and its projection on the current image. It can deal with planar transformations including rotations and changes of scale, and is to some extent robust to illumination changes and partial occlusions that are frequent in underwater environments.

The template to track is initialized online, either automatically from an object recognition module (Prats et al., 2012), or manually by an operator. In the second case, the user just has to define a bounding box around the area to track. This is done by displaying the camera image and clicking on the corners of the bounding box. The tracker then initializes the template and, at each iteration, performs a local search for the homography that minimizes the difference between the template gradient and the image gradient on the patch defined by the homography.
The output of the tracking module is the pixel coordinates of the tracked template corners, i.e. c_i = [c^i_x  c^i_y], where i = 1...4. These corners are used as visual features in the visual servoing module that is described next. Fig. 2 shows an example where a template to track is initialized with a bounding box and then tracked in a video sequence.
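As a concrete illustration of the tracker output, the sketch below shows how the current corner coordinates (and hence the feature vector s) can be obtained by projecting the initial template corners through the homography estimated for the current frame. This is only a minimal NumPy sketch; the corner values and the assumption that the tracker exposes its homography as a 3x3 matrix are illustrative, and the ESM optimization itself is not reproduced here.

```python
import numpy as np

# Corners of the template in the first image (pixel coordinates), e.g. as
# clicked by the operator when the template is initialized.
template_corners = np.array([[310., 200.], [410., 200.],
                             [410., 260.], [310., 260.]])

def current_corners(H):
    """Project the initial template corners through the 3x3 homography H
    estimated by the tracker for the current frame."""
    pts_h = np.hstack([template_corners, np.ones((4, 1))])   # homogeneous
    proj = (H @ pts_h.T).T
    return proj[:, :2] / proj[:, 2:3]                         # de-homogenize

# Visual feature vector used by the servoing module:
# s = [c1x c1y c2x c2y c3x c3y c4x c4y]
s = current_corners(np.eye(3)).reshape(-1)
```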
Fig. 2. Template initialization and tracking in water tank conditions.

3. VISUAL SERVOING

3.1 Computing the reference

An image-based visual servoing approach is adopted in


order to guide the tracked template towards a desired
reference position, orientation and size in the image. The
reference position can be obtained in two ways: (i) from
a previous learning step, where the robot is placed at
the desired relative pose, and the corners of the tracked
template at that point are stored as reference position;
(ii) in the case the dimensions of the target are known
and assuming a calibrated camera, the 3D template is
projected on the image according to a desired 3D pose. For
instance, for a target with corner coordinates (expressed
in meters in a local frame {T}) C_i = [C^i_x  C^i_y  C^i_z] with i = 1...4, and a desired camera-target relative 3D pose represented by the homogeneous matrix ^C T_T, {C} being the camera frame, the corresponding 2D points of the template projected onto the image can be computed as:

\begin{bmatrix} c^{i}_{x} \\ c^{i}_{y} \\ 1 \end{bmatrix} = P \; {}^{C}T_{T} \begin{bmatrix} C^{i}_{x} \\ C^{i}_{y} \\ C^{i}_{z} \\ 1 \end{bmatrix}    (1)

where P is the perspective projection matrix built from the camera calibration parameters, i.e.:

P = \begin{bmatrix} p_x & 0 & u_c & 0 \\ 0 & p_y & v_c & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
where p_x and p_y are the meter-to-pixel ratios along the X and Y axes, and [u_c  v_c] is the principal point.
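As a hedged example of option (ii), the following sketch computes the desired feature vector s* by projecting the corners of a known target through a desired pose and the matrix P defined above. The box dimensions (0.16 x 0.30 m) and the desired pose ([0 0 1.15] m, 0 degrees of yaw) are taken from the experiments reported later in this paper; the intrinsic values p_x, p_y, u_c, v_c are placeholders, not the calibration of the real camera.

```python
import numpy as np

# Target corners in the object frame {T} (meters): the 0.16 x 0.30 m black
# box used in the experiments, centered at the origin in the z = 0 plane.
hx, hy = 0.16 / 2, 0.30 / 2
corners_T = np.array([[-hx, -hy, 0, 1], [hx, -hy, 0, 1],
                      [hx,  hy, 0, 1], [-hx,  hy, 0, 1]]).T   # one column per corner

# Desired camera-target pose {C}T{T}: template centered, 1.15 m in front, 0 yaw.
cTt = np.eye(4)
cTt[2, 3] = 1.15

# Perspective projection matrix P built from placeholder intrinsics.
px, py, uc, vc = 700.0, 700.0, 320.0, 240.0
P = np.array([[px, 0., uc, 0.],
              [0., py, vc, 0.],
              [0., 0., 1., 0.]])

proj = P @ cTt @ corners_T                     # 3x4 homogeneous pixel coordinates
s_star = (proj[:2] / proj[2]).T.reshape(-1)    # desired features s* = [c1x* c1y* ...]
```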
3.2 Visual Servoing control law
The visual servoing control law can be written as:

\nu_{req} = -\lambda \; {}^{V}W_{C} \; L_{s}^{+} \, (s - s^{*})    (2)

where ν_req is the velocity to be sent to the vehicle controller, λ is the gain of the control law, ^V W_C is the twist transformation matrix that transforms velocities from the camera frame {C} to the vehicle frame {V}, L_s^+ is the pseudo-inverse of the interaction matrix associated with the vector of current image features s, and s* is the vector of desired features. s and s* are built from the current and the desired positions of the template corners as:

s = [\,c^{1}_{x}\ \ c^{1}_{y}\ \ c^{2}_{x}\ \ c^{2}_{y}\ \ c^{3}_{x}\ \ c^{3}_{y}\ \ c^{4}_{x}\ \ c^{4}_{y}\,]^{T}, \qquad s^{*} = [\,c^{1*}_{x}\ \ c^{1*}_{y}\ \ \cdots\ \ c^{4*}_{x}\ \ c^{4*}_{y}\,]^{T}

The interaction matrix, L_s, is computed at each iteration from the current visual features, according to the general expression derived in (Hutchinson et al., 1996) for point features, but removing (for the specific application of this paper) the degrees of freedom corresponding to roll and pitch, since they are not actively controlled in the underwater vehicle:

L^{i}_{s} = \begin{bmatrix} -\frac{1}{Z_i} & 0 & \frac{c^{i}_{x}}{Z_i} & 0 & 0 & c^{i}_{y} \\ 0 & -\frac{1}{Z_i} & \frac{c^{i}_{y}}{Z_i} & 0 & 0 & -c^{i}_{x} \end{bmatrix}, \qquad L_{s} = \begin{bmatrix} L^{1}_{s} \\ L^{2}_{s} \\ L^{3}_{s} \\ L^{4}_{s} \end{bmatrix}

where Z_i is the depth of the corresponding point expressed in the camera frame.
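The sketch below assembles the stacked interaction matrix and evaluates the control law of Eq. (2). It is a simplified illustration, not the actual implementation: the feature coordinates are assumed to be expressed in normalized camera coordinates, the zeroed roll and pitch columns are dropped (which leaves the commanded components unchanged), and the gain, the depth estimate Z, and the twist transform W (camera to vehicle, restricted to the four controlled DoFs) are placeholders.

```python
import numpy as np

def interaction_row(cx, cy, Z):
    """Interaction matrix of one point feature, keeping only the columns
    associated with (vx, vy, vz, wz)."""
    return np.array([[-1.0 / Z, 0.0,      cx / Z,  cy],
                     [0.0,      -1.0 / Z, cy / Z, -cx]])

def velocity_request(s, s_star, Z, W, lam=0.3):
    """Image-based visual servoing law: nu_req = -lam * W * pinv(Ls) * (s - s*)."""
    Ls = np.vstack([interaction_row(s[2 * i], s[2 * i + 1], Z)
                    for i in range(4)])                       # 8x4 stacked matrix
    nu_cam = -lam * np.linalg.pinv(Ls) @ (s - s_star)         # [vx vy vz wz] in {C}
    return W @ nu_cam                                         # expressed in {V}

# Placeholder example: a small square of features displaced from its reference.
s = np.array([-0.1, -0.1, 0.1, -0.1, 0.1, 0.1, -0.1, 0.1])
nu_req = velocity_request(s, s + 0.05, Z=1.15, W=np.eye(4))
```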

4. NAVIGATION

Fig. 3. Navigation node implemented in the Girona 500 AUV.

The navigation node (see Fig. 3) determines the vehicle's position ([x y z]) and orientation ([φ θ ψ]), as well as its linear ([u v w]) and angular ([p q r]) velocities. These data are then used by the behaviors and also by the velocity controller. The vehicle's depth (z) is obtained from the pressure sensor. To compute it, the pressure value is converted to meters and transformed (rotated and translated) from the sensor frame {P} to the vehicle frame {V}. The vehicle's orientation and angular velocities are directly obtained from the attitude and heading reference system (AHRS). AHRS data is rotated from the AHRS frame {A} to {V} and then differentiated with respect to time to obtain the angular velocities. Since both depth and orientation are absolute measurements and their precision and rate are good enough, these data do not require further filtering. On the other hand, the x and y position is determined by dead-reckoning from the velocity measurements of the DVL, with occasional GPS fixes whenever the vehicle is at the surface. An extended Kalman filter (EKF) is the sensor fusion algorithm in charge of generating these estimates. The details of our particular implementation are provided next.
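Before detailing the filter, the fragment below sketches the depth preprocessing described above (pressure converted to meters, with the sensor offset compensated). The water density, atmospheric pressure, sensor offset and the z-down sign convention are assumptions of this sketch, not values taken from the paper.

```python
import numpy as np

RHO_G = 1025.0 * 9.81                      # seawater density [kg/m^3] * gravity [m/s^2]
P_ATM = 101325.0                           # atmospheric pressure [Pa]
p_P_in_V = np.array([0.0, 0.0, 0.3])       # pressure sensor position in {V} (placeholder)

def vehicle_depth(pressure_pa, R_world_vehicle):
    """Convert an absolute pressure reading to the depth of the vehicle origin
    (z positive downwards): pressure -> meters, then remove the vertical
    component of the sensor offset rotated into the world frame."""
    depth_sensor = (pressure_pa - P_ATM) / RHO_G
    offset_world = R_world_vehicle @ p_P_in_V
    return depth_sensor - offset_world[2]
```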
4.1 State vector

The information to be estimated by the filter is stored in a state vector. In the proposed filter the state vector contains:

x(k) = [\,x\ \ y\ \ u\ \ v\ \ w\,]^{T}    (3)

where x and y represent the vehicle's position and u, v, w are the linear velocities expressed in the vehicle coordinate frame {V}. As explained before, depth, attitude and angular velocities are directly obtained from the pressure sensor and the AHRS.

4.2 System model

A constant velocity kinematics model is used to determine how the vehicle state evolves from time k-1 to k during the prediction step of the Kalman filter:

"

uk1
vk1 t +
wk1
uk1 + nuk1 t
vk1 + nvk1 t
wk1 + nwk1 t

xk1
+ R(k k k )
yk1

"

xk
yk
uk =

vk
wk

nuk1
nvk1
nwk1

t2
2

(4)

where Δt is the time period, u = [φ θ ψ] is the control input determining the current vehicle orientation, and n = [n_u  n_v  n_w] is a vector of zero-mean white Gaussian acceleration noises whose covariance values, represented by the system noise matrix Q, have been set empirically:

Q = \begin{bmatrix} \sigma^{2}_{n_u} & 0 & 0 \\ 0 & \sigma^{2}_{n_v} & 0 \\ 0 & 0 & \sigma^{2}_{n_w} \end{bmatrix}    (5)
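A minimal sketch of the corresponding EKF prediction step is given below. The Jacobians with respect to the state and to the acceleration noise are our own linearization of the model above (the paper does not spell them out), and the ZYX Euler convention for R(φ, θ, ψ) is an assumption.

```python
import numpy as np

def rotation_zyx(phi, theta, psi):
    """Rotation from the vehicle frame {V} to the world frame, built from
    roll (phi), pitch (theta) and yaw (psi)."""
    cf, sf = np.cos(phi), np.sin(phi)
    ct, st = np.cos(theta), np.sin(theta)
    cp, sp = np.cos(psi), np.sin(psi)
    return np.array([[cp * ct, cp * st * sf - sp * cf, cp * st * cf + sp * sf],
                     [sp * ct, sp * st * sf + cp * cf, sp * st * cf - cp * sf],
                     [-st,     ct * sf,                ct * cf]])

def predict(x, P, Q, rpy, dt):
    """Constant-velocity prediction for the state x = [x y u v w]."""
    R = rotation_zyx(*rpy)
    pos = x[:2] + (R @ (x[2:] * dt))[:2]         # dead-reckoned x, y
    x_pred = np.hstack([pos, x[2:]])             # body velocities kept constant
    F = np.eye(5)                                # Jacobian w.r.t. the state
    F[:2, 2:] = R[:2, :] * dt
    G = np.zeros((5, 3))                         # Jacobian w.r.t. the noise n
    G[:2, :] = R[:2, :] * dt ** 2 / 2
    G[2:, :] = np.eye(3) * dt
    return x_pred, F @ P @ F.T + G @ Q @ G.T
```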
4.3 Measurements
The state vector is updated with the measurements from
two sensors: the GPS, which provides direct measurements
of x and y position, and the DVL, which provides information about linear velocities [u v w].
Whenever a GPS measurement is available, the vehicle's latitude and longitude are first converted to Universal Transverse Mercator (UTM) coordinates x_G and y_G, and then transformed from the GPS frame {G} to the vehicle frame {V}:

\begin{bmatrix} x_V \\ y_V \end{bmatrix} = {}^{V}T_{G} \begin{bmatrix} x_G \\ y_G \end{bmatrix}    (6)

Likewise, the DVL measurements need to be transformed from the DVL frame {D} to the vehicle frame {V} following the equation:
"

uV
vV
wV

= RD

"

uD
vD
wD

(V V pD ),

(7)

where ^V R_D is the rotation matrix between {D} and {V}, ω_V is the angular velocity ([p q r]) expressed in {V}, and ^V p_D is the position of {D} measured in {V}.
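The same transformation, written as a short NumPy sketch (the numerical values in the example call are placeholders only):

```python
import numpy as np

def dvl_to_vehicle(v_D, R_VD, omega_V, p_D_in_V):
    """Transform a DVL velocity reading from {D} to {V}, removing the velocity
    induced by the angular rate acting on the sensor offset (Eq. 7)."""
    return R_VD @ v_D - np.cross(omega_V, p_D_in_V)

v_V = dvl_to_vehicle(np.array([0.4, 0.0, 0.0]),    # measured [uD vD wD]
                     np.eye(3),                     # {D} assumed aligned with {V}
                     np.array([0.0, 0.0, 0.1]),     # [p q r] in {V}
                     np.array([0.0, 0.0, 0.5]))     # DVL position in {V}
```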
Given this, the measurement vector used in the Kalman filter update step is:

z = [\,x_V\ \ y_V\ \ u_V\ \ v_V\ \ w_V\,]^{T}    (8)

Then, the measurement model for the update step can be generally described as:

z(k) = H\, x(k|k-1) + m(k)    (9)

with

H = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}

and m(k) being a vector of zero-mean white Gaussian noises affecting the measurements. The associated covariance matrix R is:

R(k) = \begin{bmatrix} R_G & 0 \\ 0 & R_D \end{bmatrix}    (10)

The covariance values for the GPS sensor R_G and for the DVL sensor R_D have been set according to the specifications of the sensor manufacturers.
The general measurement model described here assumes, for simplicity, that both measurements arrive simultaneously. In the case of asynchronous operation of the sensors, the model needs to be modified accordingly.
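Since H is the identity, the update reduces to a plain linear Kalman update, as sketched below; the covariance values used to build R are placeholders, not the manufacturer figures used on the real vehicle.

```python
import numpy as np

def update(x_pred, P_pred, z, R):
    """Kalman update for z = H x + m with H = I (5x5); for asynchronous
    sensors only the rows corresponding to the available measurement
    would be used."""
    H = np.eye(5)
    y = z - H @ x_pred                       # innovation
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x = x_pred + K @ y
    P = (np.eye(5) - K @ H) @ P_pred
    return x, P

# R built as in Eq. (10) from (placeholder) GPS and DVL covariances.
R = np.diag([4.0, 4.0, 0.01, 0.01, 0.01])
```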

5. CONTROL STACK

The control stack module (see Fig. 4) is composed of several elements: a node that merges the velocity requests sent by the behaviors, a velocity controller, and a thruster allocator. All these nodes are detailed next.

Fig. 4. Control stack implemented in the Girona 500.

5.1 Merger

This node is used to combine all the velocity requests (ν_req = [u_req  v_req  w_req  r_req]) computed by the different behaviors into a single one. Each velocity request includes the desired velocity for each DoF, a boolean per DoF indicating whether that DoF is actuated or not, and a priority value. Requests are sorted by priority and, following this order, merged when possible.
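A minimal sketch of this merging policy is shown below: requests are sorted by priority and each DoF is taken from the highest-priority request that actually actuates it. The field names and the example values are illustrative only, not the actual message definitions used on the vehicle.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class VelocityRequest:
    velocity: List[float]     # desired [u, v, w, r]
    actuated: List[bool]      # which DoFs this behavior wants to control
    priority: int             # higher value wins

def merge(requests: List[VelocityRequest]) -> List[float]:
    merged = [0.0, 0.0, 0.0, 0.0]
    taken = [False] * 4
    for req in sorted(requests, key=lambda r: r.priority, reverse=True):
        for dof in range(4):
            if req.actuated[dof] and not taken[dof]:
                merged[dof] = req.velocity[dof]
                taken[dof] = True
    return merged

# Example: an alignment behavior controlling surge and yaw, plus a depth behavior.
nu_req = merge([VelocityRequest([0.2, 0.0, 0.0, 0.1], [True, False, False, True], 10),
                VelocityRequest([0.0, 0.0, 0.1, 0.0], [False, False, True, False], 5)])
```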

5.2 Velocity controller

The velocity controller takes the velocity request merged by the previous node (ν_req) and the velocity estimate computed by the navigation node (ν_V), and computes a force request (τ_V) according to the standard PID equations:

e = \nu_{req} - \nu_{V}    (11)

\tau_{V} = K_p\, e + K_d\, \dot{e} + K_i \int e \; dt    (12)

5.3 Thruster allocator

The thruster allocator node takes as input the force request computed by the velocity controller (τ_V) and computes a set-point for each thruster. A thruster allocation matrix (TAM) is used to split the force vector into a force per thruster. Because the AUV used in this paper, the Girona 500 (Ribas et al., 2011), has 5 thrusters that allow it to be actuated in 4 DoFs (i.e. surge, sway, heave and yaw), the following TAM is used:

\tau_{thruster} = \begin{bmatrix} 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \tau_{V}    (13)

Finally, because the relationship between the set-point given to a thruster and the force that it generates is non-linear, a polynomial function is used to translate each per-thruster force into the corresponding set-point.
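The fragment below sketches the last two stages together: a per-DoF PID on the velocity error (Eqs. 11-12) followed by the thruster allocation of Eq. (13). The gains are placeholders, and the -1 entry reflecting the differential yaw action of the second horizontal thruster is an assumption of this sketch.

```python
import numpy as np

class VelocityPID:
    """Per-DoF PID on the velocity error e = nu_req - nu_V."""
    def __init__(self, kp, kd, ki, n_dof=4):
        self.kp, self.kd, self.ki = kp, kd, ki
        self.integral = np.zeros(n_dof)
        self.prev_error = np.zeros(n_dof)

    def force(self, nu_req, nu_V, dt):
        e = nu_req - nu_V
        self.integral += e * dt
        de = (e - self.prev_error) / dt
        self.prev_error = e
        return self.kp * e + self.kd * de + self.ki * self.integral

# Thruster allocation matrix: 5 thrusters, 4 DoFs (surge, sway, heave, yaw).
TAM = np.array([[1, 0, 0,  1],
                [1, 0, 0, -1],
                [0, 0, 1,  0],
                [0, 0, 1,  0],
                [0, 1, 0,  0]])

pid = VelocityPID(kp=2.0, kd=0.1, ki=0.05)
tau_V = pid.force(np.array([0.2, 0.0, 0.0, 0.1]), np.zeros(4), dt=0.1)
tau_thruster = TAM @ tau_V            # one force value per thruster
```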

6. RESULTS

In this section, the simulated and real experiments validating our approach are detailed.

6.1 Simulation experiments

A simulation experiment has been carried out using the UWSim simulator (IRSLab, 2011). UWSim is a software tool for the visualization and simulation of underwater robotic missions. The software visualizes an underwater virtual scenario that can be configured using standard modeling software. Controllable underwater vehicles, surface vessels and robotic manipulators, as well as simulated sensors, can be added to the scene and accessed externally through ROS-based interfaces (Cousins et al., 2010). This allows the simulation and visualization tool to be easily integrated with existing control architectures, thus enabling hardware-in-the-loop (HIL) simulations. Indeed, the algorithms described in the previous sections are the same when running in simulation and on the real robot, because both the simulator and the real robot share the same software interface through ROS topics.
The simulator was loaded with the CIRS (University of Girona) water tank 3D model and with the Girona 500 model. A black box mockup was placed on the floor of the water tank. A virtual camera was added to the bottom part of the Girona 500 vehicle, looking towards the floor, and the vehicle was placed on top of the black box so that the longer side of the black box was parallel to the image horizontal, as shown in Fig. 5 (left) and in red in Fig. 6. The task was to align the vehicle with respect to the black box so that the longer side was in the direction of the image vertical, as shown in Fig. 5 (right) and in green in Fig. 6. These figures show, respectively, a sequence of the vehicle performing the alignment in the simulated scenario, and the trajectory of the visual features in the image, from their initial to their desired configuration. Fig. 7 shows the evolution of the velocity reference computed by the visual servoing control law, in agreement with the classical exponential convergence behavior of visual servoing systems.

Fig. 5. Visual alignment with respect to a black box in simulation. Top row: a general view of the vehicle motion. Bottom row: the virtual camera image with the current visual features (in green) and their desired position (in red).

Fig. 6. The trajectory followed by the visual features in the image during the simulation experiment.

Fig. 7. The velocity reference sent to the vehicle in the simulation experiment.

6.2 Real experiments


The real experiments were also carried out at the CIRS water tank with the Girona 500 vehicle (Ribas et al., 2011), as shown in Fig. 9. As in the simulation experiment, a black box mockup was placed at the bottom of the water tank. The dimensions of the black box were assumed to be known (0.16 x 0.30 meters) and were used to compute the reference position of the visual features by projecting a desired 3D pose into the image. The desired relative 3D pose between the object and the camera was set to [0 0 1.15] meters in translation (i.e. the template centered in the image at a distance of 1.15 meters) and 0 degrees in yaw (i.e. the longer side aligned with the image vertical). The initial positions of the visual features in the image were initialized by clicking on the corners. These clicks were also used to extract the template used by the tracking algorithm.
The vehicle behavior was as expected: it successfully aligned with respect to the target object after a period of some 80 seconds. Fig. 8 shows the view of the onboard camera during the whole alignment task, from the initial configuration to the desired one. The trajectory in the image of the visual features is plotted in Fig. 10, whereas Fig. 11 shows the evolution of the velocity reference sent to the vehicle.

Fig. 8. A sequence taken from the vehicle camera while performing the alignment task.

Fig. 9. The Girona 500 vehicle in a water tank, aligning with respect to a black box mockup.

Fig. 10. The trajectory followed by the visual features in the image during the real experiment with the Girona 500 vehicle.

Fig. 11. The velocity reference sent to the vehicle in the real experiment.
7. CONCLUSIONS


In this paper, our approach to the autonomous alignment of underwater vehicles with respect to a given object has been described. The alignment is performed visually, by continuously tracking the target and feeding a visual servoing control law that computes a vehicle velocity reference, taking into account the relative transformation between the camera and the rotation center of the vehicle. The reference velocity computed by the visual controller is compared with the actual vehicle velocity estimated from the AUV sensors, leading to the final velocity command that is then transformed into control references for the thrusters. Experiments performed in simulation and with a real AUV show the validity of our approach.
REFERENCES
Benhimane, S. and Malis, E. (2004). Real-time image-based tracking of planes using efficient second-order minimization. In IEEE/RSJ International Conference on Intelligent Robots and Systems, 943-948.
Cousins, S., Gerkey, B., and Conley, K. (2010). Sharing software with ROS [ROS topics]. IEEE Robotics & Automation Magazine, 17(2), 12-14.
Dementhon, D. and Davis, L. (1995). Model-based object pose in 25 lines of code. International Journal of Computer Vision, 15(1/2), 123-141.
Hutchinson, S., Hager, G., and Corke, P. (1996). A tutorial on visual servo control. IEEE Transactions on Robotics and Automation, 12(5), 651-670.
IRSLab (2011). UWSim: The UnderWater Simulator. http://www.irs.uji.es/uwsim.
Lots, J., Lane, D., Trucco, E., and Chaumette, F. (2001). A 2-D visual servoing for underwater vehicle station keeping. In IEEE International Conference on Robotics and Automation, 2767-2772. Seoul, South Korea.
Prats, M., García, J., Wirth, S., Ribas, D., Sanz, P., Ridao, P., Gracias, N., and Oliver, G. (2012). Multipurpose autonomous underwater intervention: A systems integration perspective. In 20th Mediterranean Conference on Control and Automation, in review.
Ribas, D., Ridao, P., Magí, L., Palomeras, N., and Carreras, M. (2011). The Girona 500, a multipurpose autonomous underwater vehicle. In Proceedings of Oceans IEEE. Santander, Spain.
van der Zwaan, S. and Santos-Victor, J. (2001). Real-time vision-based station keeping for underwater robots. In OCEANS 2001, MTS/IEEE Conference and Exhibition, volume 2, 1058-1065.
