0% found this document useful (0 votes)
198 views16 pages

Trajectory Prediction

This document discusses two models for predicting user trajectories in location-aware computing systems: a probability-based model and a learning-based model. It analyzes these two approaches and evaluates their performance through experiments. Trajectory prediction can enhance location awareness by providing information about a user's likely future locations, allowing systems to better prepare location-based services. The models are evaluated based on their ability to accurately predict trajectories and locations for mobile users.

Uploaded by

Alberto Castillo
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
198 views16 pages

Trajectory Prediction

This document discusses two models for predicting user trajectories in location-aware computing systems: a probability-based model and a learning-based model. It analyzes these two approaches and evaluates their performance through experiments. Trajectory prediction can enhance location awareness by providing information about a user's likely future locations, allowing systems to better prepare location-based services. The models are evaluated based on their ability to accurately predict trajectories and locations for mobile users.

Uploaded by

Alberto Castillo
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 16

Computers, Environment and Urban Systems

30 (2006) 741–756
www.elsevier.com/locate/compenvurbsys

Location awareness through trajectory prediction


Xiong Liu, Hassan A. Karimi *

Department of Information Science and Telecommunications, University of Pittsburgh,


Pittsburgh, PA 15260, USA

Received 15 July 2004; accepted in revised form 15 July 2005

Abstract

Location-aware computing is a type of ubiquitous computing that uses user’s location informa-
tion as an essential parameter for providing services and application-related optimization. Location
management plays an important role in location-aware computing because the provision of services
requires convenient access to dynamic location and location-dependent information. Many existing
location management strategies are passive since they rely on system capability to periodically record
current location information. In contrast, active strategies predict user movement through trajecto-
ries and locations. Trajectory prediction provides richer location and context information and facil-
itates the means for adapting to future locations. In this paper, we present two models for trajectory
prediction, namely probability-based model and learning-based model. We analyze these two models
and conduct experiments to test their performances in location-aware systems.
Ó 2006 Elsevier Ltd. All rights reserved.

Keywords: Location-aware computing; Location management; User movement; Trajectory prediction;


Probabilistic methods; Machine learning algorithms

1. Introduction

The emergence of widespread mobile devices and low-cost location sensing techniques,
coupled with expanding wireless coverage and more reliable wireless connections, has
spurred ubiquitous mobile computing. This trend has paved the way for Geospatial
Information Systems (GIS) to evolve from desktop-based systems utilized primarily by

*
Corresponding author. Tel.: +1 412 624 4449; fax: +1 412 624 2788.
E-mail addresses: xliu@mail.sis.pitt.edu (X. Liu), hkarimi@mail.sis.pitt.edu (H.A. Karimi).

0198-9715/$ - see front matter Ó 2006 Elsevier Ltd. All rights reserved.
doi:10.1016/j.compenvurbsys.2006.02.007
742 X. Liu, H.A. Karimi / Comput., Environ. and Urban Systems 30 (2006) 741–756

professionals to service-oriented mobile, ubiquitous GIS accessible to and usable by all


types of users. Today users carrying wirelessly connected mobile devices are able to request
and obtain geographic information services as they travel through the geographic space
and conduct the ordinary tasks of everyday life (Smyth, 2000).
Mobile computing can be divided, depending on information access methods, into two
categories: (1) conventional mobile computing that does not have the means for obtaining
and utilizing user’s current location in processing activities and (2) location-aware comput-
ing that is capable of obtaining and utilizing user’s current location information as one
of the essential parameters for providing services and application-related optimization
(Fritsch & Volz, 2003). Location-aware computing is expected to benefit customers, appli-
cation developers, service providers, and networks operators (Borriello & Deshpande,
2002). For example, coupling location-aware computing and GIS has brought about
Location-Based Services (LBS)—where user’s current location is used to compute and
process queries; a typical query processed by LBS is nearest points of interest while the
user is driving on a road.
Location-aware computing and ubiquitous GIS are closely related to the research area
of ‘‘context-aware computing’’. Dey (2001) defined context as ‘‘any information that can
be used to characterize the situation of an entity’’ such as a person, place or object relevant
to the interaction between a user and an application. In addition to the explicit input by
the user, further context elements could be obtained through the use of new sensors or
inference mechanisms (Zipf & Jost, 2006). Context constitutes an important aspect of
LBS as both the user and location are part of a context (Jiang & Yao, 2006). For example,
Zipf and Jost (2006) propose an ontology-based approach to model various types of
context (e.g., means of transportation) and user parameters (e.g., demographics), and
introduce several strategies to realize adaptive mobile GIS that use user and context infor-
mation; Li (2006) describes an immersive Virtual Reality (VR) approach for capturing
real-time context data on information transactions and individual behaviors in dynamic
LBS environments. In this paper, we focus on location, which is an aspect of context.
In GIS, locations are modeled in continuous or discrete georeferencing systems (Jiang
& Yao, 2006). For example, the universal reference system, known as World Geodetic Sys-
tem 1984, is a continuous georeferencing system which models location as latitude and
longitude, while a mailing address is a discrete georeference system.
Location-aware computing poses challenges regarding location and location-dependent
data management, including: data stream from a poll of mobile devices; varied and inter-
mittent wireless connections; limited bandwidth, power and device size; and location infor-
mation deployment for location-dependent applications (Tan & Wolfson, 2003). Various
strategies for location management have been proposed in the literature. For examples,
location tracking through paging in cellular mobile networks (Yeung & Yum, 1995);
updating and querying Moving Object Databases (MOD) for real-time mobile units track-
ing (Wolfson et al., 1999); and location sensors integration and location conflict resolution
when multiple sensors provide conflicting information (Indulska & Sutton, 2003). Many
existing location management strategies rely on system capability to periodically record
current location of the mobile unit into a database; such strategies are called passive
(Aljadhai & Znati, 2001; Liu, Bahl, & Chlamtac, 1998). In contrast, active strategies are
those that predict trajectories and locations well in advance of current location. Trajectory
prediction is concerned with determining future paths, and location prediction is con-
cerned with determining future locations on predicted trajectories using estimated travel
X. Liu, H.A. Karimi / Comput., Environ. and Urban Systems 30 (2006) 741–756 743

distances. Location prediction can be regarded as a supplementary step of trajectory pre-


diction (Karimi & Liu, 2003).
Trajectory prediction is part of location management as it helps improve the provision
of mobile services in many aspects such as query processing, caching and data dissemina-
tion. More importantly, it enhances location awareness. Current location-aware systems
utilize location information to determine such services as nearest features of interest from
a given location. The general assumption in such systems is that the area centered at the
current location of the user is where the services are needed. Although this assumption is
valid and used as the basis of many computing strategies, it constrains location awareness
to only current location information. However, trajectory prediction extends location
awareness to include future locations. Knowing future locations, the system is able to infer
activities well in advance and provide much richer context information for preparing ser-
vices, especially those with time-critical constraints. For example, knowing the precise
location of a driver five minutes in advance it is possible to check traffic information
and provide dynamic routing in case of traffic jam, or to check driving conditions and pro-
vide safety alerts in case of poor driving conditions. These value-added services will be
very helpful for mobile users in unfamiliar or unsafe driving environments.
Trajectory prediction is related to time geography and travel demand analysis in trans-
portation planning (Miller, 2003; Miller & Storm, 1996). Time geography studies how
people move through social spaces in their daily activities using space–time activity data.
Travel demand analysis is based on four components: (1) trip generation from specified
origins, (2) trip allocation from a given origin among destinations, (3) modal split for trips
between origin–destination pairs, and (4) route choice within each mode. While time geo-
graphy and travel demand analysis are mainly used for public transportation planning and
transportation policy development, trajectory prediction is concerned with real-time pre-
diction of future trajectories for personalized and time-critical tasks. Due to the differences
between the requirements in travel demand analysis and those in trajectory prediction,
conventional travel demand analysis models are not directly applicable to trajectory pre-
diction, thus requiring new techniques.
The rest of the paper is organized as follows: In Section 2, trajectory modeling is dis-
cussed. In Section 3, the computing infrastructure and trajectory data storage issues are
described. In Section 4, the trajectory prediction process is outlined. In Sections 5 and
6, the probability-based prediction and the learning-based prediction are presented in
details. In Section 7, experimentations and the performances of the two prediction
approaches are discussed. Finally, in Section 8, conclusions are given.

2. Trajectory modeling

Trajectory modeling is an abstraction of user movement required for trajectory predic-


tion. Depending on how a mobile unit’s location is represented, there are two trajectory
modeling methods: cell-based trajectory modeling and network-based trajectory modeling.
In cell-based modeling, the geographic area is divided into regular-shaped cells, usually
determined by the architecture of the cellular network, and the mobile unit’s location is
determined as the cell within which it is contained (Das & Sen, 1999; Kyriakakos, Had-
jiefthymiades, Frangiadakis, & Merakos, 2003; Shah & Nahrstedt, 2002). The trajectory
is expressed as a series of cells making the path of the mobile unit, and the problem of
trajectory prediction is to determine the next set of cells the mobile unit will follow.
744 X. Liu, H.A. Karimi / Comput., Environ. and Urban Systems 30 (2006) 741–756

Cell-based modeling has played an important role in mobile computing to improve perfor-
mance such as Quality of Service (QoS) routing in mobile ad hoc networks (Shah & Nahr-
stedt, 2002). However, this method does not consider the geometry and topology of the
actual path, and cannot precisely locate a mobile user because the radius of a typical cell
is 150 m and upward (a cell may have a radius of more than 30,000 m) (Zhao, 2000). Also,
it can only calculate the cell change probability based on the boundary through which the
user leaves the current cell, but cannot precisely estimate the travel distance and travel time
to a destination. Another problem with the cell-based modeling method is that a cell may
contain several roads while the provision of a service, e.g., routing, may require the exact
road on which the user is driving. Due to these limitations, the cell-based modeling method
is not suitable for certain applications where on-road services are demanded.
Network-based modeling determines the mobile unit’s location as a point constrained
to a network instead of a cell using such geopositioning technologies as the Global Posi-
tioning System (GPS) (Karimi & Liu, 2003). In outdoor environments, the network is a
road network where it contains actual roads obtained from digital maps; vertices represent
intersections and edges represent road segments. In addition, frequently traveled paths can
easily be incorporated into a road network database. GPS has become the dominant geo-
positioning technology, and it is anticipated that there will be ample GPS data generated
by mobile users over time. Time-tagged GPS data present a close estimation, depending on
GPS errors, of the user’s actual trajectories. An accuracy of better than 30 m is expected
from stand-alone GPS, while differential GPS would provide an accuracy of a few meters.
Due to GPS errors, map-matching is used to locate the whereabouts of the user on the
road. For example, Map Matched GPS (MMGPS) is an application that corrects raw
GPS data using history of previous position estimates and road geometry (Blewitt & Tay-
lor, 2002). Furthermore, odometer-derived MMGPS (OMMGPS) (Taylor et al., 2006)
uses height information obtained from Digital Terrain Models (DTM) to improve the
accuracy of GPS positions with poor satellite geometry. In network-based modeling, a tra-
jectory is represented as a series of interconnected road segments or intersections (Karimi
& Liu, 2003). The advantage of network-based modeling is that it utilizes both geometry
and topology of actual roads and paths to predict trajectories. The geometry (shape) and
topology (connectivity) of actual roads represent possibilities and constraints for user
movement. Also, speed limits of actual roads influence user’s driving speed and thus pro-
vide a reasonable estimation of travel speed.

3. Computing infrastructure

Of the different computing infrastructures possible for mobile computing environments,


the one with mobile units and fixed base stations is assumed in this paper. The mobiles are
linked to the base stations via wireless communications, and each mobile unit is equipped
with a GPS receiver. The infrastructure also contains different databases for trajectory pre-
diction. Available at the base stations are the road network database containing all the
roads within a given area, a map-matched trajectory database containing each mobile’s
historical trajectories, and other useful information such as user registration profiles.
Available at the mobile unit is a dynamic trajectory database containing a trajectory col-
lected during a travel.
We call previously completed trajectories long-term historical trajectories and real-time
trajectories short-term historical trajectories. A short-term historical trajectory is connected
X. Liu, H.A. Karimi / Comput., Environ. and Urban Systems 30 (2006) 741–756 745

to the future trajectory via the user’s current location. The collection of all users’ long-
term historical trajectories is useful for discovering group movement patterns or knowl-
edge. Short-term historical trajectories capture user’s activities immediately before current
location, and provide a real-time context for predicting future trajectories.

4. Trajectory prediction process

In this paper, we introduce a network-based modeling method for real-time trajectory


prediction in outdoor environments. We denote user’s current location at time tk as Lk
(where k is a time epoch), and the duration between the current time tk and a later time
tk+1 as the prediction period T. Given the current location (Lk, tk) and a prediction period
T, the problem of trajectory prediction is to predict a trajectory or path starting from cur-
rent location to a future location at time tk + T or tk+1. All locations are map-matched
against the road network and predicted trajectories are represented as a series of intercon-
nected road segments, or intersections.
The trajectory prediction process is accomplished through the following steps: (1) esti-
mate the geographic range or area within which user’s future activities will be performed,
(2) identify a list of candidate trajectories based on the topology of the estimated geo-
graphic range, and (3) select the most likely trajectory from the list of candidate trajecto-
ries. The details of these steps are discussed in Sections 4.1–4.3.

4.1. Dynamic window

Considering the geographic range of a service area (e.g., the Pittsburgh metropolitan), it
is inefficient to retrieve the entire road network for trajectory prediction since only a small
portion of this area may be where user’s activities are performed. We call this user’s activ-
ities range, i.e., the area that contains future activities, dynamic window. The center of
dynamic window is at user’s current location and its shape can be regular, such as a circle,
or irregular. For computational efficiency, we focus on circular dynamic windows. Its
radius is determined by user’s travel speed and the prediction period T. While T is prede-
termined by the system, user’s travel speed is an uncertain factor. When traveling on a
highway, the posted speed limit on the highway could be assumed as the speed of the user,
however when traveling on a local road, the speed could be estimated at less than the
posted speed limit as it is affected by traffic lights and traffic conditions. For simplicity,
we assume that the user’s speed is the same as the posted speed limit of the road on which
the user is driving, SpeedLimit(tk). Therefore, the radius R of dynamic window can be
determined by
R ¼ SpeedLimitðtk Þ  T ð1Þ
For example, if the user is driving on a 25 miles/h road and the prediction period time is
30 min, then the size of a dynamic window is 12.5 miles.
Fig. 1 shows a road network (solid lines) and a dynamic window (dashed circle). A road
network is modeled as a graph G = hV, Ei where V is the set of all intersections with size
jVj and E is the set of all road segments with size jEj. Each road segment e may have such
attributes as speed limit, length and road width. In practice, each road segment in the net-
work has a direction, a one-way road or a two-way road. If direction is not considered, G
is a non-directed (symmetric) graph. When direction is taken into account, G is a directed
746 X. Liu, H.A. Karimi / Comput., Environ. and Urban Systems 30 (2006) 741–756

v9 v11
ex2

e7 e10

ex3
v1 ex1 e1 v2 e2 v3 e3 v4

e8 e11
v5 ex6 e4 v6 ex4 e5 v7 e6 v8
ex5
e9 e12

v10 v12

Road Vertex v Road Segment e

User Location L Exit Point ex

Fig. 1. Road network and dynamic window.

(non-symmetric) graph. In this paper, we assume non-directed road networks throughout.


Because dynamic window is a clipping process, it clips the road network into a smaller
portion. Only the road network contained in dynamic window is used for trajectory pre-
diction. We consider all the vertices inside dynamic window and their immediate (neigh-
bor) vertices as the vertices of the sub-network. The sub-network is a new graph
G 0 = hV 0 , E 0 i where V 0 is the set of all retrieved vertices with size jV 0 j and E 0 is the set
of all retrieved edges with size jE 0 j.
Dynamic window represents the largest geographic area within which user’s activities
are expected during the prediction period T. We define the intersections between dynamic
widow and the road segments of the sub-network as exit points because they represent all
possible exit locations where the user may leave the window (see Fig. 1). Exit points may
be considered as destinations of predicted trajectories. The number of exit points can be
inferred from the topological information in G 0 . Suppose the number of vertices inside
a dynamic window is N, we define a new graph G00 = hV00 , E00 i where V00 is the set of vertices
(N) inside the window and E00 is the set of edges between those N vertices. For each vertex
u inside the window, we know the number of edges linked to it or the number of neighbor
vertices (num_neighbors(u)) by looking up the row of u in theP adjacency matrix of G 0 .
00
If we sum up the number of edges linked to each vertex in G as uðG00 Þ num neighborsðuÞ,
we get a duplicated count of those edges in G00 , which is exactly twice the number of edges
00 00
P j). Therefore, if we subtract the duplicated count of edges in G from
(2jE
uðG00 Þ num neighborsðuÞ, we get the number of edges that intersect with the window,
which is also the number of exit points (num_Exp):
X
num Exp ¼ num neighborsðuÞ  2jE00 j ð2Þ
uðG00 Þ

For the example shown in Fig. 1, v2 has four edges (e1, e2, e8, e7) linked to it and four neigh-
bor vertices (v1, v3, v6, v9). Similarly, v6 has four edges (e8, e4, e5, e9) and four neighbor ver-
tices (v2, v5, v7, v10). G00 is a graph including only v2 and v6, and there is only one edge e8 in
G00 . Therefore, the number of exit points is (4 + 4)  2 = 6, as shown in Fig. 1.
X. Liu, H.A. Karimi / Comput., Environ. and Urban Systems 30 (2006) 741–756 747

4.2. Candidate trajectories

The topology information in G 0 is used to generate candidate trajectories. We use a


graph search tree to identify candidate trajectories in a sub-network G 0 , where the root
is user’s current location L, the intermediate nodes are vertices in G 0 , the leaf nodes are
exit points, and the links between the nodes are road segments. To reduce the search space,
we assume the following rules:

1. User does not make a u-turn at an intersection.


2. User does not visit an intersection that has already been visited during a trip.

Fig. 2 depicts a search tree constructed for the dynamic window in Fig. 1. Clearly,
certain branches are pruned by the rules of the search tree.
Fig. 3 illustrates the TrajSearch algorithm that identifies candidate trajectories. The
algorithm takes the sub-network G 0 , the exit points set ExpSet and the current location
L. It first initializes the vertices in G 0 to an unprocessed status. Then it calls a Depth First
Search (DFS) algorithm to detect candidate trajectories. DFS first takes the nearest vertex
u0 of L based on the moving direction of the user. It then changes the status of u0 to ‘‘pro-
cessed’’ and finds all the children (neighbor vertices) of it. Then the children become par-
ents by calling DFS (see Line 6 in DFS), the recursive process continues until one exit
point is detected (see Line 5 in DFS). DFS then changes the status of the input vertex
to ‘‘stopped’’, meaning one trajectory has been detected. There is one loop in the algorithm
which goes through all neighbor vertices of a particular vertex u. All other operations per-
formed within the loop, such as changing status, have O(1) time complexity. The loop
P 
through all neighbors of all vertices in G 0 takes O 0 num neighborsðuÞ
uðG Þ ¼ Oð2jE0 jÞ.
Therefore, the overall time complexity is O(jE 0 j), where jE 0 j is number of edges in G 0 .

L
Pruned
e1

v1
v2

e2 e7
e8
ex2
ex3
P(Traj1) = 1/3 P(Traj5) = 1/3
v6
Pruned
e4 e9
e5 e8
ex6 ex5 v2
P(Traj2) = 1/9 ex4 P(Traj4) = 1/9
P(Traj3) = 1/9

Fig. 2. Trajectory search tree.


748 X. Liu, H.A. Karimi / Comput., Environ. and Urban Systems 30 (2006) 741–756

Fig. 3. TrajSearch algorithm.

4.3. Trajectory prediction

After all candidate trajectories are identified, the task of trajectory prediction is to select
the most likely one that a user will take. For the trajectory search tree shown in Fig. 2,
when the user reaches the nearest intersection of current location, which is v2, the system
is faced with the problem of deciding which road segment the user will take: e2, e8 or e7. If
the system decides that e8 is the road that the user will take, then the system is faced with
the second intersection v6 and again needs to decide which road segment the user will take:
e4, e5 or e9. The process of deciding which road segment the user will take at an intersec-
tion continues until an exit point or leaf node is reached.
To resolve uncertainty at intersections, there is a need to decision models that consider
all road choices and choose one. Developing such decision models is one main problem in
trajectory prediction. With an appropriate decision model, a trajectory using the trajectory
search tree, such as the one shown in Fig. 2, can be determined. We present two decision
models: a probability-based model and a learning-based model. Both models use long-
term and short-term historical trajectories for prediction. The difference is that the
probability-based model adopts probabilistic information for prediction while the learn-
ing-based model characterizes the features of long-term and short-term trajectories and
is based on machine learning algorithms such as classification for prediction. The details
of the two models are discussed in Sections 5 and 6, respectively.

5. Probability-based model

The probability-based model calculates the probability of taking each road segment at
each intersection. The probability parameters are stored in a conditional probability
X. Liu, H.A. Karimi / Comput., Environ. and Urban Systems 30 (2006) 741–756 749

matrix for each intersection. For an intersection v with n road segments, the probability
matrix is defined as
2 3
P ðR1 jR1 Þ P ðR2 jR1 Þ    P ðRn jR1 Þ
6 P ðR jR Þ P ðR jR Þ . . . P ðR jR Þ 7
6 1 2 2 2 n 2 7
Mðv; tk Þ ¼ 6 7 ð3Þ
4     5
P ðR1 jRn Þ P ðR2 jRn Þ    P ðRn jRn Þ

where tk is current time, and R1, R2, . . . , Rn are road segments that share v. P(R1jR1) is the
probability of taking R1 when the user is currently on R1, and P(R2jR1) is the probability
of taking R2 when the user is currently on R1, and so on. Long-term historical trajectories
are used to infer the number of times the user has traveled on each road segment and the
trajectory choice at each intersection. The data is then used to calculate the parameters of
the probability matrix. Therefore, M(v, tk) can be expressed as
2 3
N 1;1 =N 1 N 2;1 =N 1    N n;1 =N 1
6 N =N N =N    N =N 7
6 1;2 2 2;2 2 n;2 27
Mðv; tk Þ ¼ 6 7 ð4Þ
4     5
N 1;n =N n N 2;n =N n  N n;n =N n
where Ni is the number of times the user has traveled on road segment Ri. Nj,i is the num-
ber of times the user has taken road segment Rj when on road segment Ri. In other words,
the following should hold:
X n
N j;i ¼ N i ð5Þ
j¼1

We call intersection v where a road choice needs to be predicted as a decision point. When a
user is on a road segment Ri (1 6 i 6 n) connected to the decision point, the road segments
Rj (j 5 i) become road choices. We call Rj class labels and their probabilities P(RjjRi) class
priors. The probability-based model chooses the class label Rk with the largest prior
P(RkjRi). Therefore, the decision function of a class label ^y is
^y ¼ maxðP ðRk jRi ÞÞ ð1 6 k 6 nÞ ð6Þ
k

With the decision model and a trajectory search tree, we are able to predict the user’s road
choice intersection by intersection until an exit point is reached. For the trajectory search
tree shown in Fig. 2, the model must be applied twice at decision points v2 and v6 in order
to predict a complete trajectory within the dynamic window.

6. Learning-based model

The probability-based model utilizes long-term historical trajectories for probability


calculation and trajectory prediction. However, the prediction does not consider context
information such as time of the day, travel distance and travel direction. In contrast,
the learning-based model incorporates context information in trajectory prediction. One
advantage of the learning-based model is that mobile contexts can be quantitatively mea-
sured and mapped into a feature space for prediction.
750 X. Liu, H.A. Karimi / Comput., Environ. and Urban Systems 30 (2006) 741–756

6.1. Context feature characterization

Mobile contexts vary significantly and may change dynamically during usage. We
focus on mobile contexts that capture common characteristics of historical trajectories.
Common trajectory features include distance, movement direction, time, and travel time
(Schlieder & Werner, 2003). After feature characterization, both long-term and short-term
historical trajectories are mapped into data points in a N-dimensional feature space, where
N is the number of features. As in the probability-based model, the road segments Ri
(1 6 i 6 n) connected to the decision point P are called class labels. For all the long-term
historical trajectories that pass through P, the class label is known. While for short-term
historical trajectories, the label is unknown and needs to be predicted. The unknown label
represents the user’s road choice, which is immediately connected to the short-term histor-
ical trajectory via the decision point P.

6.2. Multi-way classification

A machine learning algorithm, called multi-way classification, is used to predict the


class label of a short-term trajectory. The idea is to divide the feature space into different
portions, or classes, using long-term historical trajectory data points with known labels.
Then a short-term historical trajectory data point, with an unknown label, will automat-
ically be assigned a label depending on which portion of the space the point is located. The
algorithm decomposes the classification problem into several binary classification tasks,
and then combines the results of the sub-tasks into a hypothesized solution to the original
problem. The number of binary classification tasks depends on the number of class label
an intersection or decision point is assigned. We use logistic regression model for the bin-
ary classification task.

6.2.1. Logistic regression model


For binary classification, the logistic regression model uses discriminant functions

g1 ðxÞ ¼ gðwT xÞ ð7Þ

and
g0 ðxÞ ¼ 1  gðwT xÞ ð8Þ
z
where x is the input with dimension d, w is the array of weights, and g(z) = 1/(1 + e ) is a
logistic function.
Let li = p(yi = 1jxi, w) = g(zi) = g(wTx). Then the likelihood of outputs is
Y n Y
n
li i ð1  li Þ1y i
y
LðD; wÞ ¼ P ðy ¼ y i jxi ; wÞ ¼ ð9Þ
i¼1 i¼1

The log-likelihood is
Yn
y 1y
X
n
lðD; wÞ ¼ log li i ð1  li Þ i ¼ y i log li þ ð1  y i Þ logð1  li Þ ð10Þ
i¼1 i¼1

Logistic regression model tries to find weights that maximize the likelihood of outputs.
The derivatives of the log-likelihood is
X. Liu, H.A. Karimi / Comput., Environ. and Urban Systems 30 (2006) 741–756 751

1 0 vs (1 or 2)

x1 1 vs (0 or 2)
.
.
xd 2 vs (0 or 1)

Fig. 4. Three-way classification.

o Xn
 lðD; wÞ ¼ xi;j ðy i  gðzi ÞÞ ð11Þ
owj i¼1

where D is the set of examples and yi is the true class label.


Let the derivatives be zeros, we get a non-linear function of wj. The gradient solution is
Xn
wðkÞ wðk1Þ þ aðkÞ ½y i  f ðwðk1Þ ; xi Þxi ð12Þ
i¼1

where a(k) is learning rate.


The decision function of predicted class label ^y is

1; g1 ðxÞ P g0 ðxÞ
^y ¼ ð13Þ
0; g1 ðxÞ < g0 ðxÞ

6.2.2. Multi-way classification


In the multi-way classification algorithm, a binary classification using logistic regression
model is performed on each pair of classes. Fig. 4 depicts the situation for a three-way clas-
sification, where the three class labels are denoted as ‘‘0’’, ‘‘1’’ and ‘‘2’’. Class ‘‘0’’ is dis-
tinguished from a combination of classes ‘‘1’’ and ‘‘2’’, class ‘‘1’’ is distinguished from a
combination of classes ‘‘0’’ and ‘‘2’’, and class ‘‘2’’ is distinguished from a combination
of classes ‘‘0’’ and ‘‘1’’.
This three-way classification generates three decision boundaries. Then the data point
for the short-term trajectory is mapped into the feature space and assigned a class label.
With the multi-way classification algorithm and a trajectory search tree, we are able to pre-
dict the user’s road choice intersection by intersection until an exit point is reached. For
the trajectory search tree shown in Fig. 2, the model must be applied twice for the decision
point v2 and the decision point v6 in order to predict a trajectory.

7. Experimentation

In this section, we present an experimentation that was conducted to test the perfor-
mances of the two prediction models. The focus of the experimentation was to measure
the prediction accuracy at one decision point. The accuracy of predicting a complete tra-
jectory within a dynamic window using a trajectory search tree can be calculated based on
the accuracy at each decision point, however, its discussion is beyond the scope of this
paper.
752 X. Liu, H.A. Karimi / Comput., Environ. and Urban Systems 30 (2006) 741–756

7.1. Trajectory dataset

Synthetic trajectory data sets are used for this experimentation. User’s movement can
be off-road or random walk. However, many studies have argued that mobile users, espe-
cially drivers, very often follow a network and use a fastest or shortest path to their des-
tination (Brinkhoff, 2002). Fig. 5 shows a portion of the road network of downtown
Pittsburgh area used in this experimentation. A total of 4624 fastest routes between every
possible pair of intersections were computed. Table 1 shows a list of sample trajectories,
where each trajectory is represented as a sequence of intersections or nodes.
Suppose the user is moving in the direction from ‘‘20’’ to ‘‘21’’, and ‘‘21’’ is the decision
point. To predict the user’s road choice at ‘‘21’’, only those trajectories that pass through
‘‘21’’ are selected. There are 552 such trajectories, as shown in Fig. 5. Because three road
segments are connected to ‘‘21’’, the user will have three road choices. We denote the road
segment 21 ! 25 as class ‘‘0’’, the road segment 21 ! 45 as class ‘‘1’’, and the road seg-
ment 21 ! 23 as class ‘‘2’’. Then the problem becomes: Which road segment, ‘‘0’’, ‘‘1’’
or ‘‘2’’, will the user take?

7.2. Trajectory feature characterization

For a decision point P with n road segments, we characterize three features: distance L,
movement direction h and time-of-day T. Distance here refers to the trajectory length
between the start point Si and the decision point P. For movement direction, we introduce
a reference vector which is a direction from P to the East and a direction vector from Si to
P. A polar coordinate is used for measuring movement direction which is the angle
between the reference vector and the direction vector, see Fig. 6(a). Fig. 6(b) shows an
example of calculating movement direction for the trajectory 0 ! 1 ! 22 ! 20 ! 21,
where the start point is ‘‘0’’ and the decision point is ‘‘21’’. The time-of-day feature refers

Fig. 5. Trajectories that pass through intersection ‘‘21’’.


X. Liu, H.A. Karimi / Comput., Environ. and Urban Systems 30 (2006) 741–756 753

Table 1
Sample simulated trajectory data
Trajectory ID Nodes
1 0 1 22 20 21 45 51
2 4 18 25 21 45 51 52
3 9 10 61 13 20 21 45 51
4 46 52 51 45 21 23 33
5 66 53 52 51 45 21 25 18

S1 direction vector
direction vector
P θ reference
vector East Reference
direction vector -θ vector
45
Si -θ East

(a) (b)

Fig. 6. Movement direction calculation.

to the time when mobile users pass through the decision point. It is generated as a Gaus-
sian distribution of time.
L, h and T are vectors of attribute values or inputs with different scales. They have to be
normalized before they are passed to prediction algorithms. The following equations are
used to calculate normalized inputs, L 0 , h 0 , and T 0 , where mean( ) is the mean of input
and std( ) is the standard deviation of input:
L  meanðLÞ
L0 ¼ ð14Þ
stdðLÞ
h  meanðhÞ
h0 ¼ ð15Þ
stdðhÞ
T  meanðT Þ
T0 ¼ ð16Þ
stdðT Þ
We divide the normalized trajectories into a training set with 407 trajectories and a test set
with 145 trajectories. The training set contains long-term historical trajectories whose class
labels are known to the prediction models, while the test set contains short-term historical
trajectories whose class labels need to be predicted by the models. Fig. 7 shows the 3D
view of the training set.

7.3. Analysis of results

We applied the probability-based model and the multi-way logistic regression classifica-
tion on the training set, and computed the misclassification errors for both the training set
754 X. Liu, H.A. Karimi / Comput., Environ. and Urban Systems 30 (2006) 741–756

2
dimension 3: time-of-day

-1

-2

-3

-4
2

1 3
2
dim 0 1
ensi
on 2 0
: dir
ecti tance
on -1 -1 1: dis
nsion
-2 dime
-2 -3

Fig. 7. Visualization of training set, where red ‘‘s’’ is class 0, green ‘‘’’ is class 1 and blue ‘‘n’’ is class 2. (For
interpretation of color in this figure legend, the reader is referred to the web version of this article.)

Table 2
Experimentation results
Technique Training error Test error
Random prediction 0.6682 0.6731
Probability-based prediction 0.4865 0.5517
Multi-way logistic regression 0.3145 0.3241

and test set. Also, we calculated misclassification errors of random prediction for compar-
ison (see Table 2). It can be seen that both the probability-based model and the multi-way
logistic regression have much better prediction performance than random prediction.
Also, it is shown that the multi-way logistic regression based on feature characterization
performs better than the probability-based model which does not consider trajectory
features.
It should be noted that the results are dependent on the trajectory dataset used. There
may exist such cases where the probability-based model outperforms the multi-way clas-
sification. However, the probability-based model only uses probabilistic information for
prediction without considering context information. On the contrary, the learning-based
model, such as the multi-way classification, is able to include any context information
through feature characterization and selection, and proves to be a much more flexible pre-
diction strategy.
X. Liu, H.A. Karimi / Comput., Environ. and Urban Systems 30 (2006) 741–756 755

8. Conclusions

Location-aware computing introduces new challenges and opportunities for location


management in mobile computing. Many existing location management strategies in loca-
tion-aware computing rely on system capability to periodically record current location
information. A major shortcoming of this strategy is that it constraints location awareness
to current location, hence impeding the possibility of inferring future activities. Trajectory
prediction offers richer location and context information and facilitates adaptation to
future locations. In this paper, we presented a process for trajectory prediction and dis-
cussed two prediction models: probability-based and learning-based. Both models use
long-term and short-term historical trajectories for prediction. The major difference
between the two models is that the probability-based model adopts probabilistic informa-
tion for prediction while the learning-based model characterizes trajectory features and
adopts machine learning algorithms for prediction. To realize the benefits of these models,
an experiment was conducted. The results of this experiment reveal that both models per-
form much better than random prediction. Furthermore, the results of the experiment
indicated that the learning-based model is more flexible than the probability-based model
because it supports more context information which is needed for better prediction.

References

Aljadhai, A. R., & Znati, T. (2001). Predictive mobility support for QoS provisioning in mobile wireless
environments. IEEE Journal on Selected Areas in Communications, 19, 1915–1930.
Blewitt, G., & Taylor, G. (2002). Mapping dilution of precision (MDOP) and map matched GPS. International
Journal of Geographical Information Science, 16(1), 55–67.
Borriello, G., & Deshpande N. (2002). Location-aware computing: Creating innovative and profitable
applications and services. Intel Developer Update Magazine.
Brinkhoff, T. (2002). A framework for generating network-based moving objects. GeoInformatica, 6(2), 153–180.
Das, S. K., & Sen, S. K. (1999). Adaptive location prediction strategies based on a hierarchical network model in
a cellular mobile environment. The Computer Journal, 42(6), 473–486.
Dey, A. K. (2001). Understanding and using context. Personal and Ubiquitous Computing, 5, 4–7.
Fritsch, D., & Volz, S. (2003). NEXUS – the mobile GIS-environment. In Joint first workshop on mobile future
and symposium on trends in communications (pp. 1–4).
Indulska, J., & Sutton, P. (2003). Location management in pervasive systems. In Proceedings of the Australasian
information security workshop conference on ACSW frontiers (Vol. 21, pp. 143–151).
Jiang, B., & Yao, X. (2006). Location-based services and GIS in perspective. Computers, Environment and Urban
Systems, this issue, doi:10.1016/j.compenvurbsys.2006.02.003.
Karimi, H. A., & Liu, X. (2003). A predictive location model for location-based services. In Proceedings of the
11th ACM international symposium on advances in geographic information systems (pp. 126–133). New Orleans,
LA, USA.
Kyriakakos, M., Hadjiefthymiades, S., Frangiadakis, N., & Merakos, L. (2003). Multi-user driven path
prediction algorithm for mobile computing. In 14th International workshop on database and expert systems
applications (DEXA’03) (pp. 191–195).
Li, C. (2006). User preferences, information transactions and location-based services: A study of urban pedestrian
wayfinding. Computers, Environment and Urban Systems, this issue, doi:10.1016/j.compenvurbsys.2006.02.008.
Liu, T., Bahl, P., & Chlamtac, I. (1998). Mobility modeling, location tracking, and trajectory prediction in
wireless ATM networks. Wireless access broadband networks [Special issue]. Proceedings of the IEEE Journal
on Special Areas in Communications, 16(6), 922–936.
Miller, H. J. (2003). What about people in geographic information science? Computers, Environment and Urban
Systems, 27(5), 447–453.
Miller, H. J., & Storm, J. D. (1996). Geographic information system design for network equilibrium-based travel
demand models. Transportation Research Part C: Emerging Technologies, 4(6), 373–389.
756 X. Liu, H.A. Karimi / Comput., Environ. and Urban Systems 30 (2006) 741–756

Schlieder, C., & Werner, A. (2003). Interpretation of intentional behavior in spatial partonomies. In C. Freksa,
W. Brauer, C. Habel, & K. F. Wender (Eds.), Spatial cognition III: Routes and navigation, human memory and
learning, spatial representation and spatial learning (pp. 401–414). Berlin: Springer.
Shah, S. H. & Nahrstedt, K. (2002). Predictive location-based QoS routing in mobile ad hoc networks. In
Proceedings of IEEE International Conference on Communications (ICC 2002) (pp. 1022–1027), New York.
Smyth, S. (2000). Mobile geographic information services: Turning GIS inside out. Microsoft Report.
Tan, K. L., & Wolfson, O. (2003). Guest editors’ introduction to MONET special issue on mobile and wireless
data management. Mobile Networks and Applications, 8(4), 315–316.
Taylor, G., Brunsdon, C., Li, J., Olden, A., Steup, D., & Winter, M. (2006). GPS accuracy estimation using map
matching techniques: applied to vehicle positioning and odometer calibration. Computers, Environment and
Urban Systems, this issue, doi:10.1016/j.compenvurbsys.2006.02.006.
Wolfson, O. et al. (1999). Updating and querying databases that track mobile units. Mobile data management
and applications [Special issue]. Distributed and Parallel Databases Journal, 7(3), 257–287.
Yeung, K., & Yum, T. S. (1995). Cell group decoupling analysis of a channel borrowing based dynamic channel
assignment strategy in linear radio systems. IEEE Transactions on Communications, 43(2–4), 1289–1292.
Zhao, Y. (2000). Mobile phone location determination and its impact on intelligent transportation systems. IEEE
Transaction on Intelligent Transportation Systems, 1(1), 55–64.
Zipf, A., & Jost, M. (2006). Implementing adaptive mobile GI services based on ontologies: Examples from
pedestrian navigation support. Computers, Environment and Urban Systems, this issue, doi:10.1016/
j.compenvurbsys.2006.02.005.

You might also like