performed (i.e., 0.3·1 + 0.4·1 + 0.9·(−1) = −0.2), where positive and negative votes are represented by 1 and −1, respectively. As shown above, the distance-based weighted voting method has two primary deficiencies: 1) while the distance between sensor nodes may be a factor in generating similar readings at nearby sensor nodes, it does not precisely capture what we really care about, namely the correlation between sensor readings; 2) while it is a good idea to inquire the opinions of neighbors, the trustworthiness of those neighbors is not considered.

Figure 1: An illustrative topology of a wireless sensor network.

Based on the above observation, in this paper we propose an innovative in-network voting scheme for determining faulty readings by taking into account both the correlation of readings between sensor nodes and the trustworthiness of a node. To obtain the pair-wise correlations of sensor readings, we construct a logical correlation network on top of the sensor network. The correlation network can be depicted by a set of vertices and edges, where each vertex represents a sensor node and an edge between two sensor nodes denotes their correlation (i.e., the similarity between their readings). If two nearby sensor nodes do not have any similarity in their readings, these two sensor nodes are not connected by an edge. Therefore, only sensor nodes that are connected by correlation edges participate in voting. The weighted voting method actually uses the correlation (modeled as a function of distance) between sensor nodes as weights. However, using the correlation alone may not correctly identify faulty readings, due to the domination problem discussed above. Thus, in the proposed algorithm, each sensor node is associated with a trustworthiness value (called SensorRank) that is used in voting. The SensorRank of a sensor node implicitly represents the number of 'references' (i.e., similar sensor nodes nearby) it has to support its opinions. A sensor node obtains a high SensorRank if it has many references. The number under each sensor node in Figure 1 is its SensorRank. In the figure, s_4 has a small SensorRank because the readings of s_4 are not very similar to those of its neighbors. By using SensorRank, our voting scheme takes the trustworthiness of each sensor into account. A vote with a small SensorRank has only a small impact on the final voting result. For example, in Figure 1, when s_1 inquires opinions from its neighbors (i.e., s_2, s_3 and s_4), the vote from s_4 has a small impact due to its lower SensorRank.

Our design consists of two parts: 1) an algorithm that calculates SensorRank for each sensor node; and 2) an algorithm that uses SensorRank to determine faulty readings. Specifically, we first obtain the correlations among sensor readings and then model the sensor network as a Markov chain to determine SensorRank. In light of SensorRank, the TrustVoting algorithm we developed is invoked as needed in operation to effectively determine faulty readings. A preliminary performance evaluation is conducted via simulation. Experimental results show that the proposed TrustVoting algorithm is able to effectively identify faulty readings and outperforms majority voting and distance weighted voting, two state-of-the-art voting schemes for in-network faulty reading detection in sensor networks.

A significant amount of research effort has been devoted to the issue of identifying faulty sensor readings [2, 5, 6, 9]. In [6], the authors explored the spatial correlation among sensors and proposed a distributed Bayesian algorithm for detecting faulty sensors. By assuming that faulty measurements are either much larger or much smaller than normal measurements, the authors in [2] use a statistical method to detect outlier measurements. Some variations of the weighted voting technique for detecting faulty sensors are proposed in [5] and [9]. In [5], the past performance of sensors is considered to enhance classical majority voting, and the coverage of the sensing range is considered in [9] for its weighted voting. To the best of our knowledge, prior works neither fully formulate the similarity of sensors nor provide the concept of SensorRank, let alone devise a filtering algorithm based on SensorRank. These features distinguish this work from others.

The rest of this paper is organized as follows. The notion of a correlation network is presented in Section 2. The SensorRank and TrustVoting algorithms are described in Section 3 and Section 4, respectively. A preliminary performance evaluation is conducted in Section 5. Finally, the conclusion and future work are discussed in Section 6.

2. CORRELATION NETWORK

As mentioned earlier, prior works only take the distance between sensor nodes into consideration when modeling the correlation of sensor readings. However, it is possible for the readings of two geographically close sensor nodes to be dramatically different. Thus, it is critical to truly capture the correlation of sensor readings rather than their distance.

Definition 1. Reading Vector: Assume that the overall readings of a sensor s_i consist of a series of readings in a sliding window ∆t. The readings of s_i can be expressed as b_i(t) = {x_i(t − ∆t + 1), x_i(t − ∆t + 2), ..., x_i(t)}, where x_i(t) is the reading sensed by s_i at time t.

Clearly, the readings of a sensor within a sliding window are represented as a reading vector. Therefore, we can define the similarity of two sensor nodes in terms of their reading vectors. Since a faulty reading may be very different from other normal readings in both trend and magnitude, we employ the Extended Jaccard similarity [7] as our similarity function. The Extended Jaccard similarity of two sensors s_i and s_j is denoted as corr_{i,j} and defined as follows:

    corr_{i,j} = (b_i(t) · b_j(t)) / (||b_i(t)||_2^2 + ||b_j(t)||_2^2 − b_i(t) · b_j(t)),

where ||b_i(t)||_2^2 = |x_i(t − ∆t + 1)|^2 + · · · + |x_i(t)|^2.

When the readings of two sensors share neither a similar trend nor similar magnitudes, the value of corr_{i,j} is close to 0. On the other hand, the value is 1 when the reading vectors of the two sensors are exactly the same.
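As a concrete sketch of the Extended Jaccard computation above (a minimal Python illustration; the function name and the sample reading vectors are ours, not from the paper):

```python
def extended_jaccard(b_i, b_j):
    """Extended Jaccard similarity of two reading vectors.

    Close to 0 when the vectors share neither trend nor magnitude,
    and exactly 1 when they are identical.
    """
    dot = sum(x * y for x, y in zip(b_i, b_j))
    norm_i = sum(x * x for x in b_i)
    norm_j = sum(y * y for y in b_j)
    denom = norm_i + norm_j - dot
    return dot / denom if denom else 0.0

# Two sensors with a similar trend and magnitude are highly correlated,
# while an erratic reading vector yields a much lower score.
b1 = [20.0, 21.0, 22.0]
b2 = [20.5, 21.5, 22.5]
b3 = [80.0, 10.0, 55.0]
# extended_jaccard(b1, b1) == 1.0
```

In this sketch a zero denominator (two all-zero vectors) is mapped to similarity 0, a boundary case the paper does not discuss.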
Assume that the communication range of a sensor node is denoted as R and the geographical distance of two sensor nodes is represented as dist(s_i, s_j). In light of the correlations among sensor nodes in the network, a correlation network is defined as follows:

Definition 2. Correlation Network: The correlation network is modeled as a graph G = (V, E), where V represents the sensor nodes in the deployment region and E = {(s_i, s_j) | s_i, s_j ∈ V, dist(s_i, s_j) < R and corr_{i,j} > 0}. The weight of an edge (s_i, s_j) is assigned to be corr_{i,j}.

Once the correlation network of sensors is constructed (and maintained), one can easily deduce the correlations among sensor nodes. Based on the correlation network, we shall further develop an algorithm that computes SensorRank for each sensor node in the network in terms of its correlation with its neighbors.

Algorithm 1 SensorRank
Input: a sensor s_i, and a threshold δ.
Output: rank_i for s_i.
1: rank_i^(0) = 1
2: for k = 1 to δ do
3:   for all s_j ∈ nei(s_i) do
4:     p_{i,j} = corr_{i,j} / Σ_{s_k ∈ nei(i)} corr_{i,k}
5:     send rank_i^(k−1) · p_{i,j} to s_j
6:   receive all rank_j^(k−1) · p_{j,i} from every s_j ∈ nei(i)
7:   rank_i^(k) = Σ_{s_j ∈ nei(i)} rank_j^(k−1) · p_{j,i}
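Algorithm 1 is specified per node in terms of message passing. Purely as an illustration (function names and the tiny correlation table are ours, not the paper's), the same δ synchronous rounds can be simulated centrally:

```python
def sensor_rank(corr, delta):
    """Simulate delta rounds of Algorithm 1 over all nodes at once.

    corr: dict mapping node -> {neighbor: correlation weight} (symmetric).
    Returns a dict node -> rank after delta iterations; ranks start at 1.
    """
    rank = {i: 1.0 for i in corr}
    for _ in range(delta):
        new_rank = {}
        for i in corr:
            # rank_i^(k) = sum over neighbors j of rank_j^(k-1) * p_{j,i},
            # where p_{j,i} = corr_{j,i} / sum_k corr_{j,k}.
            total = 0.0
            for j in corr[i]:
                p_ji = corr[j][i] / sum(corr[j].values())
                total += rank[j] * p_ji
            new_rank[i] = total
        rank = new_rank
    return rank

# A small symmetric correlation network: s1 -- s2 -- s3.
corr = {
    "s1": {"s2": 0.8},
    "s2": {"s1": 0.8, "s3": 0.4},
    "s3": {"s2": 0.4},
}
ranks = sensor_rank(corr, delta=2)
```

Because each row of transition probabilities sums to 1, the total rank mass is conserved across rounds, which gives a quick sanity check on any implementation.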
3. SENSORRANK

SensorRank represents the trustworthiness of sensor nodes. By our design, two requirements need to be met in deriving SensorRank for each sensor.

Requirement 1: If a sensor has a large number of neighbors with correlated readings, the opinion of this sensor is trustworthy and thus its vote deserves more weight.

Requirement 2: A sensor node with a lot of trustworthy neighbors is also trustworthy.

These two requirements ensure that 1) a sensor node with a large number of similar neighbors obtains a high rank; and 2) a sensor node with a large number of 'good references' obtains a high rank. Given the correlation network G = (V, E) derived previously, we determine SensorRank for each sensor so as to meet the above two requirements.

We model the correlation network as a Markov chain M, where each sensor s_i is viewed as state i, and the transition probability from state i (i.e., sensor s_i) to state j (i.e., sensor s_j) is denoted as p_{i,j} and formulated as p_{i,j} = corr_{i,j} / Σ_{k ∈ nei(i)} corr_{i,k}. For example, in Figure 2, p_{2,3} = 0.1/(0.4+0.1+0.7) = 0.083. Based on the above setting, we can formulate the SensorRank of s_i, denoted as rank_i, as follows:

    rank_i = Σ_{s_j ∈ nei(i)} rank_j · p_{j,i},

where nei(i) is the witness set of node i.

Figure 2: An example of SensorRank.

The computation of SensorRank can be viewed as a random walk over the correlation network. Several iterations are required until a steady state is achieved (i.e., the SensorRanks become stable). Specifically, rank_i^(k) is the value of SensorRank at the k-th iteration. At the beginning, the initial rank_i^(0) is set to 1. Note that rank_i^(0) can be set to any constant c; the results will then be c times the values generated when the initial SensorRank is set to 1. In the first round, each sensor node s_i updates its SensorRank to rank_i^(1) using the initial SensorRanks of its neighbors. At this point, each sensor node has considered its first-level neighbors in calculating its SensorRank. In the second round, each sensor node can indirectly obtain information from its second-level neighbors through its first-level neighbors, since those neighbors have explored their own first-level neighbors as well. Therefore, after the k-th round, sensor node s_i has explored its k-th-level neighbors and updated its SensorRank to rank_i^(k).

Consider the example in Figure 2. In the first round, s_3 obtains similarity information from its first-level neighbors {s_2, s_4, s_9, s_10, s_11}. Similarly, both s_2 and s_4 exchange information with their neighbors. In the second round, s_3 can obtain similarity information from its second-level neighbors {s_1, s_5}, since its first-level neighbors s_2 and s_4 explored s_1 and s_5 during the first round. If k is larger, the SensorRanks will be more accurate, since every sensor can explore more neighbors. In sensor networks, however, the computation cost grows as the number of iterations grows. Therefore, we limit k to a preset bound δ.

Given the correlation network in Figure 2, we now demonstrate how to calculate SensorRank. Initially, each sensor s_i sets its SensorRank rank_i^(0) to 1. Sensor s_i calculates the trust relation p_{i,j} to each corresponding neighbor s_j and sends rank_i^(0) · p_{i,j} to s_j. For example, s_3 sends rank_3^(0) · p_{3,1} = 1 · 0.5/3.0 = 0.167 to s_1, 0.033 to s_2, 0.2/3.0 = 0.067 to s_4, and so on. At the same time, s_3 receives SensorRank shares from its neighbors. For example, s_3 receives rank_2^(0) · p_{2,3} = 1 · 0.1/(0.4+0.1+0.7) = 0.083 from s_2. Upon receiving all the proportions of SensorRank from its neighbors, s_3 updates its SensorRank to rank_3^(1):

    rank_3^(1) = Σ_{i ∈ {1,2,4,9,10,11}} rank_i^(0) · p_{i,3}
               = 1·p_{1,3} + 1·p_{2,3} + 1·p_{4,3} + 1·p_{9,3} + 1·p_{10,3} + 1·p_{11,3}
               = 0.5/2.1 + 0.1/1.2 + 0.2/1.9 + 0.7/1.5 + 0.8/2.3 + 0.7/1.4
               = 1.74

After the first round, {rank_i^(1) | i = 1, 2, 4, 5} = {1.13, 0.59, 1.11, 1.33}. In the second round, sensors calculate the values of SensorRank using the values updated in the first round. For example, s_1 now sends rank_1^(1) · p_{1,3} = 1.13 · 0.5/2.1 = 0.269 to s_3. Similarly, when s_3 has received all the values from its neighbors, s_3 updates its SensorRank to rank_3^(2). Assume that δ = 2; each sensor then stops updating its SensorRank, and {rank_i^(2) | i = 1, 2, 3, 4} = {1.17, 0.68, 1.43, 1.05}. As expected, s_3 has the highest SensorRank, 1.43, since s_3 has many similar neighbors. Since s_1 has fewer similar neighbors than s_3, s_1 has a smaller SensorRank than s_3. The values of SensorRank at each iteration are listed in Table 1. From Table 1, s_3 has the largest SensorRank, since more nearby sensors have reading behaviors similar to that of s_3. This meets the requirements we set for the design of SensorRank as mentioned earlier.

Table 1: SensorRank values at each iteration.

          s_1   s_2   s_3   s_4   s_5   s_6   s_7   s_8   s_9   s_10  s_11
  k = 0   1     1     1     1     1     1     1     1     1     1     1
  k = 1   1.13  0.59  1.74  1.11  1.33  0.64  1.14  0.89  0.58  1.3   0.54
  k = 2   1.17  0.68  1.43  1.05  1.24  0.77  0.91  1.05  0.86  1.04  0.8

4. TRUSTVOTING ALGORITHM

Here we describe our design of the TrustVoting algorithm, which consists of two phases: a) a self-diagnosis phase; and b) a neighbor-diagnosis phase. In the self-diagnosis phase, each sensor verifies whether its current reading is unusual. Once the reading of a sensor passes the self-diagnosis phase, the sensor can directly report the reading. Otherwise, the sensor node consults its neighbors to further validate whether the current reading is faulty. If a reading is determined to be faulty, it will be filtered out. Sensor nodes generating faulty readings will not participate in voting, since these sensors are likely to contaminate the voting result. Note that TrustVoting is an in-network algorithm that is executed in a distributed manner. The execution order of algorithm TrustVoting has an impact on faulty reading detection; we will discuss this issue later.

Algorithm 2 TrustVoting
Input: a sensor s_i, its SensorRank rank_i, and a time interval t.
Output: whether the reading is faulty or not (i.e., faulty = true or false).
1: set faulty = false
2: broadcast rank_i to the neighbors
3: receive {rank_j | s_j ∈ nei(i)} from the neighbors
   /* set the timer by the priority sorted by SensorRank */
4: sort the SensorRank values received
5: x = rank_i's position in the sorted SensorRank values
6: n = the number of neighbors of sensor s_i
7: timer = x · (t/(n+1)) /* t is the time interval given */
8: when time == timer do
9:   faulty = Procedure Self-Diagnosis
10:  if faulty == true then
11:    faulty = Procedure Neighbor-Diagnosis
12: return faulty

4.1 Self-diagnosis Phase

When a set of sensor nodes is queried, each sensor in the queried set performs a self-diagnosis procedure to verify whether its current reading vector is faulty or not. Once the reading vector of a sensor node is determined to be normal, the sensor node does not need to enter the neighbor-diagnosis phase. To execute a self-diagnosis, each sensor s_i maintains only two reading vectors: i) the current reading vector at the current time t (denoted as b_i(t)); and ii) the last correct reading vector at a previous time t_p (expressed by b_i(t_p)). b_i(t_p) records a series of readings that occurred in the previous time and is used for checking whether the current reading behavior is faulty. If these two reading vectors are not similar, b_i(t) is viewed as an unusual reading vector. Once an unusual reading vector is detected at a sensor node, this sensor node will enter the neighbor-diagnosis phase to further decide whether the unusual reading behavior is faulty or not. Note that when b_i(t) is identified as a normal vector through the neighbor-diagnosis, b_i(t_p) is updated so as to reflect the current monitoring state.

4.2 Neighbor-diagnosis Phase

If a sensor node s_i sends b_i(t) to a neighbor s_j, s_j will compare b_i(t) with its own current reading vector b_j(t) and then give its vote with respect to b_i(t). From the votes of its neighbors, s_i has to determine whether b_i(t) is faulty. Notice that some votes come from sensors with high SensorRank. A sensor node with high SensorRank has more similar neighbors to consult and thus is more trustworthy. Therefore, the votes from neighbors with high SensorRank are more authoritative, whereas the votes from neighbors with low SensorRank should carry less weight.

When sensor s_i sends b_i(t) to all its neighbors for the neighbor-diagnosis, each neighbor should return its vote after determining whether b_i(t) is faulty. If a neighbor s_j considers b_i(t) not faulty by comparing the similarity of the two reading vectors (i.e., corr_{i,j} ≥ σ), s_j will send a positive vote, denoted vote_j(i), to s_i. Otherwise, the vote will be negative. In addition, the vote from s_j is weighted by its SensorRank:

    vote_j(i) =  rank_j,   if corr_{i,j} ≥ σ;
                −rank_j,   otherwise.

After collecting all the votes from the neighbors, s_i has two classes of votes: a positive class (b_i(t) is normal) and a negative class (b_i(t) is faulty). If the weight of the former is larger than the weight of the latter, most neighbors view b_i(t) as normal. Note that the weight of a vote represents how authoritative the vote is. It is possible that a neighbor s_j of s_i with a large SensorRank has a small correlation with s_i. In this case, these two sensor nodes may not provide good judgments for each other. Therefore, each vote (i.e., vote_j(i)) has to be multiplied by the corresponding correlation, corr_{i,j}. Thus, we use the following formula to determine whether the reading is faulty or not:

    dec_i = Σ_{s_j ∈ nei(i)} corr_{i,j} · vote_j(i)
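The weighted-vote decision of the neighbor-diagnosis phase can be sketched as follows (a minimal illustration with our own names; the per-neighbor data is passed in precomputed rather than exchanged as messages, and the `positive` flag stands for the paper's sim(b_i(t), b_j(t)) ≥ σ test):

```python
def neighbor_diagnosis(votes):
    """Decide whether a reading vector is faulty from weighted votes.

    votes: list of (corr_ij, rank_j, positive) tuples, one per neighbor;
    `positive` is True when the neighbor judges the reading similar.
    Returns True when the reading is judged faulty (dec_i < 0).
    """
    dec = 0.0
    for corr_ij, rank_j, positive in votes:
        vote = rank_j if positive else -rank_j  # vote weighted by SensorRank
        dec += corr_ij * vote                   # and by the correlation
    return dec < 0

# The dec_5 example from the paper's Figure 3: only s_4 votes positively.
votes_for_s5 = [
    (0.4, 1.17, False),  # s1
    (0.7, 1.05, True),   # s4
    (0.5, 0.77, False),  # s6
    (0.2, 0.91, False),  # s7
    (0.4, 1.05, False),  # s8
]
# dec_5 sums to about -0.72, so s5's reading is flagged as faulty
```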
Procedure Neighbor-Diagnosis
Input: a sensor s_i, its current reading behavior b_i(t), and a threshold σ.
Output: the variable faulty.
1: set dec_i = 0
2: broadcast b_i(t) to the neighbors
3: for all s_j ∈ nei(i) do
4:   if sim(b_i(t), b_j(t)) ≥ σ then
5:     vote_j(i) = rank_j
6:   else
7:     vote_j(i) = −rank_j
8:   dec_i = dec_i + corr_{i,j} · vote_j(i)
9: if dec_i ≥ 0 then
10:  return false
11: else
12:  return true

If the weight of the positive votes is more than the weight of the negative votes, dec_i will be positive, which means that s_i's reading is normal and the current reading can be reported. Otherwise, dec_i is negative, implying that the current reading of s_i is faulty. For the example in Figure 3, a region of sensors is queried (s_1, s_2, s_3, s_4 and s_5) and four faulty sensors (gray nodes) exist. The SensorRanks of the sensors are shown in square brackets in the nodes, and the correlations between sensors are shown on the edges. To facilitate the presentation of this example, a plus sign (minus sign) indicates that two sensor nodes have similar (dissimilar) current readings, and that they will give positive (negative) votes to each other when executing the neighbor-diagnosis. Consider sensor node s_5 as an example, where s_5 will receive the votes from its neighbors (i.e., s_1, s_4, s_6, s_7 and s_8). It can be obtained that dec_5 = (−1.17)·0.4 + 1.05·0.7 + (−0.77)·0.5 + (−0.91)·0.2 + (−1.05)·0.4 = −0.72. Therefore, the reading reported by sensor node s_5 is a faulty reading.

Figure 3: An example query for TrustVoting.

4.3 Execution Order of TrustVoting

Since the TrustVoting algorithm is a distributed algorithm, the sensors being queried will perform the self-diagnosis and neighbor-diagnosis procedures individually. Different execution orders produce different results for faulty detection. For example, consider two execution orders {s_1, s_2, s_3, s_4, s_5} and {s_5, s_1, s_2, s_3, s_4} in Figure 3. Assume that all the queried sensor nodes have to perform the neighbor-diagnosis. In the order {s_1, s_2, s_3, s_4, s_5}, when s_1 executes TrustVoting, s_2, s_4 and s_5 give negative votes, while s_3 and s_8 give positive votes. As such, dec_1 will be 0.16 and s_1 will be identified as normal. For s_2, since dec_2 = (−1.17)·0.4 + (−1.43)·0.1 + 1.05·0.7 = 0.124, s_2 is identified as normal. Following the same operations, we find that s_4 is also identified as normal. However, in the order {s_5, s_1, s_2, s_3, s_4}, the result is different. When s_5 executes TrustVoting first, s_5 is obviously identified as faulty because almost all its neighbors give s_5 negative votes. Therefore, s_5 does not vote for other sensors. Without the vote from s_5, s_4 is regarded as faulty, since (−1.17)·0.3 + 0.68·0.7 + (−1.43)·0.2 = −0.161. Table 2 shows the results under the two different execution orders. From Table 2, not all faulty readings reported by faulty sensors (i.e., s_2, s_4 and s_5) are detected, and different execution orders have an impact on the faulty reading detection.

Table 2: Faulty detection under different orders.

  Order                      Faulty      Not faulty
  s_1, s_2, s_3, s_4, s_5    s_5         s_1, s_2, s_3, s_4
  s_5, s_1, s_2, s_3, s_4    s_4, s_5    s_1, s_2, s_3

As such, how to determine an appropriate order in which to perform self-diagnosis and neighbor-diagnosis in algorithm TrustVoting will have an impact on the final result. Since algorithm TrustVoting is executed in a distributed manner, we use a timer to control the execution order of the self-diagnosis and neighbor-diagnosis procedures. Sensors with smaller timer values perform first. By exploiting SensorRank, we allow sensor nodes with higher SensorRank to perform self-diagnosis and neighbor-diagnosis as soon as possible. As pointed out earlier, sensor nodes with a high SensorRank are likely to have many similar neighbors, and thus whether their readings are faulty can be correctly identified. Once sensors reporting faulty readings are detected, these sensors do not get involved in the voting of other sensor nodes. Therefore, the domination problem can be alleviated, since faulty sensors with higher weights can be determined as early as possible.

Clearly, we could determine the order of executing the self-diagnosis and neighbor-diagnosis procedures according to SensorRank alone. However, some highly ranked sensor nodes may get their ranks from highly ranked neighbors while having few neighbors. Therefore, the number of neighbors should also be taken into consideration. In algorithm TrustVoting, a timer is set for each sensor in accordance with both its SensorRank and its number of neighbors. Specifically, assume that a time interval t is given in algorithm TrustVoting. Each sensor first broadcasts its SensorRank to its neighbors. Upon receiving the SensorRank values from its neighbors, each sensor sorts the SensorRank values in decreasing order and determines the position of its own SensorRank in the sorted list. Furthermore, a sensor knows the number of its neighbors from the SensorRank values received. Therefore, we set the timer to x · t/(n+1), where x is the position of the sensor in the sorted list, n is the number of neighbors, and t is the time interval given. The self-diagnosis and neighbor-diagnosis procedures of a sensor with a smaller timer value execute first.

Consider the illustrative example in Figure 3. The timer value for sensor s_3 should be 1 · t/7, since sensor node s_3 has 6 neighbors and its SensorRank is the highest among the SensorRank values collected (i.e., those of its 6 neighbors and sensor s_3 itself). Following
the same operation, we obtain the timer values 2·t/6, 3·t/3, 3·t/3, and 2·t/6 for s_1, s_2, s_4 and s_5, respectively. Assume that no sensor passes the self-diagnosis, so every sensor has to execute the neighbor-diagnosis procedure. According to the timers derived, s_3 performs first, and the reading of s_3 is identified as a normal reading by the neighbor-diagnosis. Then, both s_1 and s_5 execute the neighbor-diagnosis at the same time. The reading reported by s_1 (respectively, s_5) is determined to be normal (respectively, faulty). The executions of s_2 and s_4 are then performed. In particular, since s_5 is viewed as faulty, s_5 does not participate in the voting process of s_4. As a result, through the execution order derived, we can accurately detect the faulty readings reported by s_2, s_4 and s_5.

5. PERFORMANCE EVALUATION

5.1 Simulation Model

We simulate a synthetic environment in which sensors are deployed in a 500 by 500 region to monitor temperatures. The temperature reading range is [−25, 275]. Moreover, events with unusual readings are randomly generated in the monitored field. The model for generating events is the same as in [5, 6]. The faulty sensor rate (abbreviated as faulty rate) is the ratio of the number of faulty sensors to the total number of sensors deployed. Each sensor reports noisy readings according to the parameter noise_prob. A faulty sensor always reports faulty readings, and thus noise_prob is set to 1 for faulty sensors. On the other hand, a normal sensor is still likely to report noisy or faulty readings; thus, for normal sensors, we set noise_prob to 0.1. A noisy reading (referred to as a faulty reading) is randomly biased from the normal reading generated, with the amount of bias in the range [−50, 50]. A query is submitted to the wireless sensor network with a rectangular query region whose size varies from 80 by 80 up to 160 by 160. To evaluate the simulation results, two performance metrics are employed: the faulty detection rate and the false positive rate. Specifically, a query is issued to a query region B to obtain the current readings sensed by the sensors, where the set of these current readings is denoted as X_B. Assume that Y_B is the set of faulty readings in X_B. After executing the TrustVoting algorithm, we filter out a set of faulty readings, denoted as Y'_B, and obtain a subset of current readings X'_B ⊆ X_B without faulty readings. The faulty detection rate is defined as

    |Y'_B ∩ Y_B| / |Y_B|,

and the false positive rate is defined as

    |(Y'_B ∪ Y_B) − (Y'_B ∩ Y_B)| / |X_B|.

In other words, the faulty detection rate is the percentage of faulty readings correctly identified, and the false positive rate is the percentage of faulty readings (respectively, normal readings) that are identified as normal (respectively, faulty) readings. We implement the classical majority voting (denoted as MajorVoting) and the distance weighted voting (denoted as WeightVoting) for comparison.

The number of iterations for SensorRank is set to 3. Figure 4 shows the faulty detection rates of the three algorithms under various faulty rates. It can be seen that TrustVoting detects almost 90% of the faulty readings, while MajorVoting and WeightVoting can only identify 40% of them. Since the faulty readings in our fault model are only biased from normal readings, it is hard for MajorVoting and WeightVoting to identify faulty readings. By exploiting SensorRank, TrustVoting outperforms the other two voting algorithms. Figure 5 shows the false positive rates of the three algorithms. As the faulty rate increases, the false positive rates of the three algorithms tend to increase due to the larger number of faulty sensors (i.e., it is hard to correctly detect faulty sensors when their number is large).

Figure 4: Faulty detection rates of the three algorithms.

Figure 5: False positive rates of the three algorithms.
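The two evaluation metrics defined in Section 5.1 reduce to simple set computations. A minimal sketch (helper names and sample reading identifiers are ours; Y'_B is written `Y_filtered`):

```python
def faulty_detection_rate(filtered, faulty):
    """|filtered & faulty| / |faulty|: fraction of truly faulty readings caught."""
    return len(filtered & faulty) / len(faulty)

def false_positive_rate(filtered, faulty, all_readings):
    """|(filtered | faulty) - (filtered & faulty)| / |all_readings|:
    fraction of readings misclassified, counting both faulty readings
    missed and normal readings wrongly filtered out."""
    return len((filtered | faulty) - (filtered & faulty)) / len(all_readings)

X_B = {"r1", "r2", "r3", "r4", "r5"}   # readings returned in region B
Y_B = {"r1", "r2"}                     # truly faulty readings
Y_filtered = {"r2", "r3"}              # readings TrustVoting filtered out
# detection rate = 1/2; false positive rate = |{r1, r3}| / 5 = 2/5
```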
As can be seen in Figure 6, when δ increases, the faulty detection rate increases. This is because, with a larger number of iterations, SensorRank is able to incorporate more neighboring information, and TrustVoting is therefore able to identify faulty readings more precisely. Furthermore, as the number of iterations increases, the false positive rate declines. However, increasing the number of iterations for SensorRank incurs more message transmissions among sensors. In addition, from Figure 6 and Figure 7, it can be seen that after 3 iterations, the improvements in the faulty detection rate and the false positive rate are not very significant. Therefore, in the following experiments, we set the number of iterations for SensorRank to 3. Clearly, the number of iterations for SensorRank depends on the sensing data and can be determined empirically.

Without loss of generality, the number of iterations for SensorRank is set to 3 and the similarity threshold is set to 0.5. The experimental results are shown in Figure 8 and Figure 9.
Hence, experiments on TrustVoting with a varied number of neighbors are conducted. We set the number of iterations for SensorRank to 3, the similarity threshold to 0.5 and the length of the reading vectors to 5. Figure 10 and Figure 11 show the performance of TrustVoting. It can be seen in Figure 10 that the faulty detection rate increases with the number of neighbors. Moreover, in Figure 11, the false positive rate decreases as the number of neighbors increases. With a larger number of neighbors, both the faulty detection rate and the false positive rate are greatly improved, even for the case in which the faulty rate is high (i.e., 0.6 in this experiment). This agrees with our intuition that the more neighbors a sensor has, the better a voting scheme performs. The above experiments show that when the faulty rate is high, one should deploy a sufficient number of sensors so as to further improve the effectiveness of TrustVoting.

Figure 10: Faulty detection rates with the number of neighbors varied.

Figure 11: False positive rates with the number of neighbors varied.

6. CONCLUSIONS

With the presence of faulty readings, the accuracy of query results in wireless sensor networks may be greatly affected. In this paper, we first formulated the correlation among the readings of sensor nodes. Given the correlations among sensor nodes, a correlation network is built to facilitate the derivation of SensorRank for the sensor nodes in the network. In light of SensorRank, an in-network algorithm, TrustVoting, is developed to determine faulty readings. Performance evaluation shows that by exploiting SensorRank, algorithm TrustVoting is able to efficiently identify faulty readings and outperforms majority voting and distance weighted voting, two state-of-the-art approaches for in-network faulty reading detection.

Acknowledgement

Wen-Chih Peng was supported in part by the National Science Council, Project No. NSC 95-2211-E-009-61-MY3 and NSC 95-2221-E-009-026, Taiwan, Republic of China. Wang-Chien Lee was supported in part by the National Science Foundation under Grant no. IIS-0328881, IIS-0534343 and CNS-0626709.

7. REFERENCES

[1] T. Clouqueur, K. K. Saluja, and P. Ramanathan. Fault tolerance in collaborative sensor networks for target detection. IEEE Transactions on Computers, 53(3):320-333, 2004.
[2] M. Ding, D. Chen, K. Xian, and X. Cheng. Localized fault-tolerant event boundary detection in sensor networks. In Proc. of IEEE INFOCOM, March 2005.
[3] E. Elnahrawy and B. Nath. Online data cleaning in wireless sensor networks. In Proc. of International Conference on Embedded Networked Sensor Systems (SenSys), pages 294-295, 2003.
[4] F. Koushanfar, M. Potkonjak, and A. Sangiovanni-Vincentelli. Fault tolerance in wireless ad-hoc sensor networks. In Proc. of IEEE International Conference on Sensors, June 2002.
[5] M. D. Krasniewski, P. Varadharajan, B. Rabeler, S. Bagchi, and Y. C. Hu. TIBFIT: Trust index based fault tolerance for arbitrary data faults in sensor networks. In Proc. of International Conference on Dependable Systems and Networks, pages 672-681, 2005.
[6] B. Krishnamachari and S. Iyengar. Distributed Bayesian algorithms for fault-tolerant event region detection in wireless sensor networks. IEEE Transactions on Computers, 53(3):241-250, 2004.
[7] A. Strehl, J. Ghosh, and R. J. Mooney. Impact of similarity measures on web-page clustering. In Proc. of AAAI Workshop on AI for Web Search, pages 58-64, July 2000.
[8] S. Subramaniam, T. Palpanas, D. Papadopoulos, V. Kalogeraki, and D. Gunopulos. Online outlier detection in sensor data using non-parametric models. In Proc. of International Conference on Very Large Data Bases (VLDB), pages 187-198, 2006.
[9] T. Sun, L.-J. Chen, C.-C. Han, and M. Gerla. Reliable sensor networks for planet exploration. In Proc. of International Conference on Networking, Sensing and Control, pages 816-821, 2005.