Deep Learning-based Network Traffic Prediction for Secure Backbone Networks in Internet of Vehicles
XIAOJIE WANG, School of Communication and Information Engineering, Chongqing University of Posts
and Telecommunications, Chongqing 400065, China.
LAISEN NIE* , School of Electronics and Information, Northwestern Polytechnical University, Xi’an, 710072,
China.
ZHAOLONG NING* , School of Communication and Information Engineering, Chongqing University of
Posts and Telecommunications, Chongqing 400065, China.
LEI GUO, School of Communication and Information Engineering, Chongqing University of Posts and
Telecommunications, Chongqing 400065, China.
GUOYIN WANG, Chongqing Key Laboratory of Computational Intelligence, Chongqing University of
Posts and Telecommunications, Chongqing 400065, China.
XINBO GAO, Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing 400065, China.
NEERAJ KUMAR, Department of Computer Science and Engineering, Thapar Institute of Engineering and Technology, Patiala, India.
Internet of Vehicles (IoV), as a special application of Internet of Things (IoT), has been widely used for Intelligent Transportation System (ITS), which leads to complex and heterogeneous IoV backbone networks. Network traffic prediction techniques are crucial for efficient and secure network management, such as routing, network planning, and anomaly and intrusion detection. This paper studies the problem of end-to-end network traffic prediction in IoV backbone networks and proposes a deep learning-based method. The constructed system considers the spatio-temporal features of network traffic and can capture its long-range dependence. Furthermore, a threshold-based update mechanism built on Q-learning is put forward to improve the real-time performance of the designed method. The effectiveness of the proposed method is evaluated on a real network traffic data set.
Authors’ addresses: Xiaojie Wang, School of Communication and Information Engineering, Chongqing University of Posts and
Telecommunications, Chongqing 400065, China.; Laisen Nie, nielaisen@nwpu.edu.cn, School of Electronics and Information,
Northwestern Polytechnical University, Xi’an, 710072, China.; Zhaolong Ning, zhaolongning@gmail.com, School of Communi-
cation and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China.;
Lei Guo, School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications,
Chongqing 400065, China.; Guoyin Wang, Chongqing Key Laboratory of Computational Intelligence, Chongqing University
of Posts and Telecommunications, Chongqing 400065, China.; Xinbo Gao, Chongqing Key Laboratory of Image Cognition,
Chongqing University of Posts and Telecommunications, Chongqing 400065, China.; Neeraj Kumar, Department of Computer Science and Engineering, Thapar Institute of Engineering and Technology, Patiala, India.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee
provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and
the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored.
Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires
prior specific permission and/or a fee. Request permissions from permissions@acm.org.
© 2020 Association for Computing Machinery.
Additional Key Words and Phrases: Internet of vehicles, traffic prediction, network security, deep learning
1 INTRODUCTION
Cloud-based Internet of Vehicles (IoV), which integrates Internet of Things (IoT) and cloud computing, has been rapidly developed for Intelligent Transportation System (ITS) [6, 11, 16, 19, 23, 26, 29]. IoVs can provide reliable communications among vehicles for safety-related applications, as well as network access to infrastructures and pedestrians. IoV backbone networks carry the aggregated end-to-end network traffic generated by a large number of sensors and the safety monitoring systems of On-Board Units (OBUs) [9], as shown in Fig. 1. In this case, the IoV backbone network has become much more complex
and heterogeneous. As a crucial component of an IoV, the network security and management system is
necessary to provide reliable data transmissions for secure IoVs. Network management operations usually
collect network states at first, and then make decisions for network management according to the reported
data [5, 12, 28, 33]. End-to-end network traffic is a crucial network state, which is the basis for network
planning and predictive routing [2, 31].
With the development of IoVs, more and more data have been generated for implementing ITS, which raises challenges in network security. For instance, an attacker may intrude into an IoV and steal information related to users' privacy, leading to tremendous economic losses for users. Consequently, many network security applications, such as Intrusion Detection Systems (IDS), have been deployed in the IoV backbone network to protect it from various kinds of threats. Besides, anomaly detection techniques are leveraged to implement anomaly-based intrusion detection and to detect possible damages in networks [8, 13, 30]. In practice, both
intrusion and anomaly detection systems need end-to-end network traffic data as an input parameter to
carry out the corresponding functions. For instance, an IDS can identify the Distributed Denial of Service
attack (DDoS) by detecting the anomalous behaviors of end-to-end network traffic.
Thereby, precise network traffic prediction is useful for maintaining IoVs. A Traffic Matrix (TM), which
shows the size of network traffic among Origin-Destination (OD) nodes, is the mathematical representation
of network traffic [20]. In a TM, each row vector reveals the time-varying features of the corresponding
OD flow. A TM obeys manifold statistical features, e.g., spatial, temporal and spatio-temporal features. These intricate statistical features give rise to the main challenges in network traffic prediction. Initially, statistics-based methods, such as Gaussian and Poisson models, were proposed to predict network traffic. However, the statistical features of TM become much more complex as more services are supported in IoVs. Then, Machine Learning (ML)-based methods emerged for network traffic prediction, such as Principal Component Analysis (PCA) [24], Convolutional Neural Network (CNN) [32] and Long Short-Term Memory (LSTM) methods [22]. These methods capture several statistical features jointly to decrease the network traffic prediction error.
Nevertheless, they are not appropriate for predicting network traffic in IoVs. The corresponding main
challenges can be summarized as follows [7, 10, 25, 27]:
[Fig. 1. An IoV backbone network: the Internet, the network management system, a firewall and routers, connected to RSUs and 5G base stations through V2I links.]
∙ The statistical features of network traffic in secure IoVs obey a superposition of various distributions. Various kinds of data in a secure IoV lead to complex statistical features of network traffic, and these complex features result in unfaithful predictors when previous methods are applied.
∙ The ML-based methods cannot meet the demands of real-time performance. For a secure IoV, the topology changes frequently due to the quick movement of vehicles. Meanwhile, a secure IoV needs fast responses to any unlawful attacks. Hence, the real-time performance of network traffic prediction is crucial for secure IoVs. Nevertheless, ML-based methods usually extract the features of network traffic by collecting many data samples. Meanwhile, to guarantee prediction accuracy, deep architectures are often updated persistently. Training these deep architectures on massive data is time-consuming and expensive in network resources (e.g., computation and communication resources).
∙ It is significantly difficult for existing ML-based methods to extract the features of network traffic using few traffic samples. With limited communication resources, sampling and transmitting massive training data are uneconomical for secure IoVs.
Motivated by the above challenges, we investigate the network traffic prediction problem, and design a
network traffic prediction system for secure IoVs. The proposed system uses CNN and LSTM to optimize the
accuracy and real-time performance of network traffic prediction by extracting multiple statistical features
of TM jointly and implementing intermittent model updates. To extract the multiple statistical features
of TM jointly, a deep architecture integrating CNN and LSTM is built, in which CNN and LSTM are
employed to track the spatio-temporal and time-varying features of TM. LSTM is a particular paradigm of
Recurrent Neural Network (RNN), which is able to capture the relationship between several input elements
by the feedbacks within the same layer. Based on this advantage, we explore the time-varying features of
network traffic through LSTM. Moreover, the TM, consisting of all the OD flows, expresses spatial and spatio-temporal features caused by routing algorithms and network configurations. Hence, we take advantage of CNN to portray the spatio-temporal features of TM. To obtain the best predictors, each predictor is in practice calculated after a training process. However, each training of the designed deep architecture is expensive in terms of real-time performance. Thereby, we train the deep architecture intermittently, and propose a threshold-based mechanism to determine the time intervals between two training processes [17].
The following summarizes the main contributions of our work:
∙ To predict network traffic of a secure IoV backbone network accurately, we design a deep architecture
utilizing CNN and LSTM to trace the temporal and spatio-temporal features of TM jointly.
∙ To improve the real-time performance of the proposed method, a threshold-based mechanism is
proposed to update the deep architecture discretely. A Reinforcement Learning (RL) algorithm is
proposed to determine the threshold.
∙ We evaluate the designed scheme on a real network traffic data set sampled from the Abilene network and on our constructed testbed, which is leveraged to imitate the scenario of a secure IoV. According to the evaluation, our method can capture the long-term network traffic in secure IoVs.
The rest of this paper is organized as follows. The related work is reviewed in Section 2. Section 3 provides the system model of network traffic prediction. Section 4 presents the proposed method, and Section 5 gives the evaluation of the designed deep architecture. Finally, we conclude our work and discuss future directions in Section 6.
2 RELATED WORK
2.1 Shallow Learning-based Methods
Initially, researchers model OD traffic by simple statistical models, e.g., Gaussian and Poisson models,
to extract network traffic features. Then, the problem of network traffic prediction is converted into a
parameter estimation problem. However, simple models cannot model nonlinear distributions of network
traffic [1, 4, 21, 24].
ML techniques have been widely used in network traffic prediction, owing to their remarkable ability in modeling complex statistical distributions. Initially, various shallow learning-based approaches emerged to solve the network traffic prediction problem, such as shallow Neural Network (NN) and PCA. These methods try to fit the statistical features of network traffic, consisting of linear and nonlinear features. In-depth analysis shows that current network traffic yields multiple distributions, such as multi-fractal and heavy-tailed distributions. Capturing these statistical features can enhance the accuracy of network traffic prediction remarkably. However, traditional shallow learning approaches cannot track these features effectively. For instance, the PCA method deals with the problem of network traffic prediction by singular value decomposition, and neglects non-principal components with small singular values. Namely, it relies on the spatial correlation of TM. The NN method is mainly designed to adapt to the changes of TM with respect to the time interval.
To enhance the accuracy of network traffic prediction, hybrid model-based methods have emerged, in which researchers model network traffic with more than two models. In [1], the authors proposed two hybrid methods based on Fanout Estimation (FE) and Iterative Proportional Fitting (IPF), respectively. The IPF method is adapted to track the time-varying features of OD network traffic. By contrast, the FE method can extract the spatial features of TM. Both are combined with other state-of-the-art modeling techniques for network traffic feature extraction, i.e., tomogravity, entropy maximization and shallow NN. Besides, the authors in [18] proposed a network traffic feature extraction method using an artificial NN and a genetic algorithm. They first model OD flows by an autoregressive model with exogenous inputs, and then calculate the parameters (i.e., weights and biases) by the genetic algorithm.
3 SYSTEM MODEL
As mentioned before, TM reveals the volume of traffic that flows among all OD pairs. In this paper, we
denote TM by 𝑋 for a secure IoV backbone network. Its element is 𝑋𝑛,𝑡 , where 𝑛 is the index of an OD
flow and 𝑡 denotes time slot. Generally, 𝑡 is a time interval, and then 𝑋𝑛,𝑡 shows the average traffic within
time interval 𝑡. If the network is made up of 𝑁 nodes and 𝐾 links, then 𝑛 = 1, 2, 3, ..., 𝑁 2 . Moreover, if TM
contains 𝑇 time slots of network traffic in a secure IoV backbone network, then 𝑡 = 1, 2, 3, ..., 𝑇 . Link load
expresses the aggregation of OD network traffic, and it conforms to a linear relationship with respect to TM.
This relationship can be denoted by:
𝑌 = 𝑅𝑋, (1)
where $R$ is the so-called routing matrix. The routing matrix consists of 0s and 1s, assuming that each OD flow is transmitted along a single path of the IoV backbone network. The element of $R$ is denoted
by 𝑅𝑘,𝑛 . When 𝑅𝑘,𝑛 = 1, the 𝑛th OD flow passes link 𝑘 (𝑘 = 1, 2, ..., 𝐾), otherwise 𝑅𝑘,𝑛 = 0. Obviously,
the sizes of link load and routing matrices are 𝐾 × 𝑇 and 𝐾 × 𝑁 2 , respectively. TM estimation techniques
calculate TM 𝑋 via 𝑌 and 𝑅. Symbols 𝑌 and 𝑅 are available, because they can be obtained from the
simple network management protocol and routing configurations, respectively. In this paper, we utilize link
load and routing matrix to build a threshold for deep architecture update.
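As a minimal illustration of Eq. (1), the following NumPy sketch builds a toy routing matrix from a handful of flow paths and aggregates a traffic matrix into link loads; the topology, paths and variable names are illustrative assumptions rather than the configuration used in the paper.

```python
# Sketch: link loads as the linear aggregation Y = R X of OD traffic (Eq. (1)).
# The 3-node topology, paths and names below are illustrative assumptions.
import numpy as np

K, N, T = 3, 3, 4                      # links, nodes, time slots
paths = {                              # OD flow index -> links it traverses
    0: [0],        # flow 0 uses link 0
    1: [0, 1],     # flow 1 uses links 0 and 1
    2: [2],        # flow 2 uses link 2
}
R = np.zeros((K, N * N))               # routing matrix, K x N^2
for flow, links in paths.items():
    R[links, flow] = 1                 # R[k, n] = 1 iff flow n passes link k

X = np.random.rand(N * N, T)           # traffic matrix, N^2 x T
Y = R @ X                              # link loads, K x T
```

In practice, $R$ would be derived from the routing configurations and $Y$ from SNMP link counters, as described above.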
The problem of network traffic prediction in an IoV backbone network is to calculate $X_{n,t}$ according to the previous network traffic $(X_{n,t-1}, X_{n,t-2}, \ldots, X_{n,t-E})$ of each OD flow, which can be denoted by:
$$X_{n,t} = f\left(X_{n,t-1}, X_{n,t-2}, \ldots, X_{n,t-E}\right). \qquad (2)$$
Then this problem is defined as fitting function $f(\cdot)$ with respect to $X_{n,t-1}, X_{n,t-2}, \ldots, X_{n,t-E}$. Eq. (2) mainly shows the problem of network traffic prediction according to a sequence. Namely, fitting this function mainly considers the temporal features of OD flows. By expanding to other features, the network traffic prediction problem can be defined as:
$$X_{n,t} = f\left(X_{t-1}, X_{t-2}, \ldots, X_{t-E}\right), \qquad (3)$$
where vector 𝑋𝑡−𝑒 (𝑒 = 1, 2, ..., 𝐸) is a snapshot of network traffic over time slot 𝑡 − 𝑒. Symbol 𝐸 is the
length of prior network traffic. Different from the problem of network traffic prediction shown by Eq. (2), we
consider the spatial and spatio-temporal features of network traffic. In this case, the problem of network
traffic prediction is modeled by calculating a traffic element according to previous snapshots of network
traffic.
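For illustration, the sketch below shows one way to form prediction samples under the formulation of Eq. (3): each input is a window of $E$ previous TM snapshots and the target is the next $L_p$ values of a given OD flow. The indexing and helper names are assumptions for exposition, not the exact construction of Algorithm 2.

```python
# Sketch: build (input window, target) pairs for one OD flow from a TM.
import numpy as np

def make_samples(X, flow_n, E, L_p):
    # X: TM of shape (N^2, T); returns (inputs, targets) for OD flow `flow_n`.
    inputs, targets = [], []
    for m in range(X.shape[1] - E - L_p + 1):
        inputs.append(X[:, m:m + E])                     # E previous snapshots
        targets.append(X[flow_n, m + E:m + E + L_p])     # next L_p values of flow n
    return np.stack(inputs), np.stack(targets)
```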
4 OUR METHODOLOGY
Traditional prediction approaches usually calculate network traffic by extracting the time-varying features of
OD flows to fit the function in Eq. (2). Different from previous approaches, we combine CNN and LSTM,
and design a deep architecture to predict the network traffic of secure IoVs by means of extracting two
features, i.e., the spatio-temporal and temporal features. The designed deep architecture for network traffic
prediction is shown in Fig. 2. It contains 6 hidden layers, i.e., a convolutional layer, a subsampling layer, an
LSTM layer, two fully connected layers and a dropout layer. In our method, we first preprocess the network
traffic data set to normalize and centralize it, that is:
$$X_{n,t} = \frac{X_{n,t} - \mu_n}{\left|X_{n,t}\right|}, \qquad (4)$$
where 𝜇𝑛 is the average value of OD flow 𝑛, and can be computed by the prior of network traffic.
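A small sketch of the preprocessing in Eq. (4), assuming $\mu_n$ is estimated from the historical portion of the TM (variable names are illustrative):

```python
# Preprocessing sketch for Eq. (4): centre each OD flow by its prior mean mu_n
# and normalise element-wise (an epsilon guards against zero-valued elements).
import numpy as np

def preprocess(X_hist, X, eps=1e-9):
    mu = X_hist.mean(axis=1, keepdims=True)   # per-flow mean from prior traffic
    return (X - mu) / (np.abs(X) + eps)       # element-wise, as in Eq. (4)
```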
[Fig. 2. The designed deep architecture: input data, a convolutional layer (6@5 × 5), a subsampling layer (6@2 × 2), an LSTM layer, two fully connected layers with a dropout layer in between, and the output layer.]
In general, the output maps of a convolutional layer are computed by
$$a_j^{(l)} = \sigma\left(a_i^{(l-1)} * k_j^{(l)} + b_j^{(l)}\right), \qquad (5)$$
where $k_j^{(l)}$ and $b_j^{(l)}$ are the $j$th convolution kernel of the $l$th convolutional layer and its bias, respectively. Variable $a_i^{(l-1)}$ is the output map of the $(l-1)$th layer, and function $\sigma(\cdot)$ is the activation function.
As mentioned before, CNN is suitable for handling 2-dimensional data. Thereby, we utilize it to carry
out the TM-oriented spatio-temporal feature extraction. In our deep architecture shown in Fig. 2, the
convolutional layer, following the input layer, consists of 6 convolution kernels which learn 5 × 5 spatial
dimensions. To predict network traffic, the tanh function is used as the activation function on the convolutional
layer. Then, the output maps are:
$$a_j^{(1)} = \tanh\left(X^{(m)} * k_j^{(1)} + b_j^{(1)}\right), \quad j \in \{6\}, \qquad (6)$$
where 𝑋 (𝑚) is the training data set constructed by the prior of TM, and set {6} denotes the set {1, 2, ..., 6}.
Following the convolutional layer, a subsampling layer is built to carry out the average pooling with a factor
of 2. Then, the output maps of this average pooling are:
$$a_j^{(2)} = \mathrm{average}\left(a_j^{(1)}\right), \quad j \in \{6\}. \qquad (7)$$
The LSTM layer is developed from the Recurrent Neural Network (RNN), whose hidden state and output can be written as
$$\begin{cases} f_t = \sigma\left(U X_t + W f_{t-1} + b_f\right), \\ a_t = \sigma\left(V f_t + c_f\right), \end{cases} \qquad (8)$$
where $X_t$ is the input data, and $U$, $V$ and $W$ are the weights of the RNN. Variables $b_f$ and $c_f$ are biases, and $a_t$ is the final output map of the RNN. LSTM takes advantage of three gates to control the contents of the unit state. They are the forget gate, the input gate and the output gate, whose weights are denoted by $W^f$, $W^i$ and $W^o$, respectively, and they can be denoted by:
$$\begin{cases} g_t = \sigma\left(W^f\left[f_{t-1}; X_t\right] + b^f\right), \\ i_t = \sigma\left(W^i\left[f_{t-1}; X_t\right] + b^i\right), \\ o_t = \sigma\left(W^o\left[f_{t-1}; X_t\right] + b^o\right), \end{cases} \qquad (9)$$
where [·; ·] denotes the combination of two matrices. Variables 𝑏𝑓 , 𝑏𝑖 and 𝑏𝑜 are the biases for three gates,
respectively. Finally, the current cell state and the final output map of LSTM cell are:
$$\begin{cases} c_t = c_{t-1}\, g_t + i_t \tanh\left(W\left[f_{t-1}; X_t\right] + b^c\right), \\ a_t = o_t \tanh\left(c_t\right), \end{cases} \qquad (10)$$
where 𝑐𝑡−1 is the previous cell state.
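The following element-wise sketch walks through one LSTM step following Eqs. (9) and (10); the shapes and the weight layout acting on the concatenation $[f_{t-1}; X_t]$ are assumptions for illustration.

```python
# Sketch of one LSTM step following Eqs. (9)-(10); shapes are illustrative.
import numpy as np

def lstm_step(x_t, f_prev, c_prev, W_f, W_i, W_o, W_c, b_f, b_i, b_o, b_c):
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    z = np.concatenate([f_prev, x_t])                    # [f_{t-1}; X_t]
    g_t = sigmoid(W_f @ z + b_f)                         # forget gate
    i_t = sigmoid(W_i @ z + b_i)                         # input gate
    o_t = sigmoid(W_o @ z + b_o)                         # output gate
    c_t = c_prev * g_t + i_t * np.tanh(W_c @ z + b_c)    # cell state, Eq. (10)
    a_t = o_t * np.tanh(c_t)                             # output map
    return a_t, c_t
```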
In our method, we extract the spatio-temporal and time-varying features of TM by CNN and LSTM,
respectively. As shown in Fig. 2, the output maps of the subsampling layer are 6 matrices. To connect them to the LSTM layer, we unfold these matrices into a vector and use it as the input of the LSTM layer. After that, the
output maps of LSTM can be shown as follows:
$$\begin{cases} g_t = \sigma\left(W^f\left[f_{t-1}; a_j^{(2)}\right] + b^f\right), \\ i_t = \sigma\left(W^i\left[f_{t-1}; a_j^{(2)}\right] + b^i\right), \\ o_t = \sigma\left(W^o\left[f_{t-1}; a_j^{(2)}\right] + b^o\right), \\ c_t = c_{t-1}\, g_t + i_t \tanh\left(W\left[f_{t-1}; a_j^{(2)}\right] + b^c\right), \\ a^{(3)} = o_t \tanh\left(c_t\right). \end{cases} \qquad (11)$$
On the LSTM layer, there are $3\left(N^2-4\right)/2$ LSTM blocks for temporal feature extraction.
The rest of the proposed deep architecture contains two fully connected layers and one dropout layer
employed between the two fully connected layers. The first fully connected layer is made up of $3\left(N^2-4\right)/2$ neurons, and we also use the tanh function as the activation function for prediction. It can be denoted by:
$$a_i^{(4)} = \tanh\left(W_i^{(4)} a^{(3)} + b_i^{(4)}\right). \qquad (12)$$
The following is the dropout layer used to prevent the proposed deep architecture from overfitting. The
number of neurons on the dropout is the same as the first fully connected layer. It carries out a probabilistic
dormancy mechanism for each neuron of the first fully connected layer. In other words, the output maps of
the dropout layer can be denoted by:
$$a_i^{(5)} = d_i^{(5)} \tanh\left(W_i^{(4)} a^{(3)} + b_i^{(4)}\right), \qquad (13)$$
where $d_i^{(5)}$ obeys a Bernoulli distribution, $d_i^{(5)} \sim Bernoulli(p)$. The fully connected layer connecting with
the output layer determines the length of predicted sequence. The objective of the deep architecture is
to implement regression. Hence, to implement prediction, the tanh function is also utilized as activation
function on this layer. The length of predicted sequence by way of a forward propagation is equal to the
number of neurons on this fully connected layer.
To train the deep architecture, the backpropagation algorithm is applied to our approach. The loss
function used in the backpropagation algorithm is the Mean-Squared-Error (MSE) of prediction, which is
defined as:
$$loss_{MSE} = \frac{1}{I^{(L)}} \sum_{i=1}^{I^{(L)}} \left(h_i - X_{n,i}\right)^2, \qquad (14)$$
where $I^{(L)}$ is the number of neurons on the output layer, and $h_i$ denotes the outputs of the proposed deep architecture.
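A minimal PyTorch sketch of the described CNN+LSTM architecture is given below. The layer sizes follow the description above (6 kernels of size 5 × 5, average pooling by 2, $3(N^2-4)/2$ LSTM units, two fully connected layers with a dropout layer), while the class name, tensor layout and training details are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of the CNN+LSTM deep architecture; sizes follow the paper's
# description, names and layout are illustrative assumptions.
import torch
import torch.nn as nn

class CnnLstmPredictor(nn.Module):
    def __init__(self, n_flows: int, window: int, pred_len: int, p_drop: float = 0.5):
        super().__init__()
        self.conv = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)  # Eq. (6)
        self.pool = nn.AvgPool2d(kernel_size=2)                              # Eq. (7)
        flat = 6 * ((n_flows - 4) // 2) * ((window - 4) // 2)  # unfolded pooled maps
        hidden = 3 * (n_flows - 4) // 2                        # LSTM units, Eq. (11)
        self.lstm = nn.LSTM(input_size=flat, hidden_size=hidden, batch_first=True)
        self.fc1 = nn.Linear(hidden, hidden)                   # Eq. (12)
        self.drop = nn.Dropout(p=p_drop)                       # Eq. (13)
        self.fc2 = nn.Linear(hidden, pred_len)                 # output layer

    def forward(self, x):
        # x: (batch, n_flows, window) -- a window of E TM snapshots per sample.
        a1 = torch.tanh(self.conv(x.unsqueeze(1)))     # (batch, 6, n_flows-4, window-4)
        a2 = self.pool(a1)                             # average pooling by 2
        seq = a2.flatten(start_dim=1).unsqueeze(1)     # unfold the 6 maps into a vector
        a3, _ = self.lstm(seq)                         # LSTM layer
        a4 = torch.tanh(self.fc1(a3[:, -1]))           # fully connected + tanh
        a5 = self.drop(a4)                             # dropout
        return torch.tanh(self.fc2(a5))                # predicted sequence of length L_p

model = CnnLstmPredictor(n_flows=144, window=50, pred_len=4)
loss_fn = nn.MSELoss()                                 # Eq. (14)
```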
To decide when the deep architecture should be re-trained, we use the Normalized Mean Absolute Error (NMAE) between the measured link load and the link load reproduced from the predictor:
$$NMAE(t) = \frac{\sum_{k=1}^{K}\left|Y_{k,t} - \left(R\hat{X}\right)_{k,t}\right|}{\sum_{k=1}^{K}\left|Y_{k,t}\right|}, \qquad (15)$$
where $\hat{X}$ is the predictor of TM, and $Y$ is the corresponding aggregation traffic (i.e., link load). Compared with other metrics, NMAE can measure the spatial and temporal errors of network traffic prediction. Therefore, we use NMAE as the metric of the threshold-based mechanism. In our method, when the NMAE between a snapshot of link load and its predictor is larger than the threshold, we train the deep architecture. Otherwise, we predict the traffic elements over the next time interval using the previously trained deep architecture.
The threshold is computed by RL to obtain the optimal tradeoff between real-time performance and accuracy. RL has an excellent ability to deal with decision problems, in which it can compute an optimal policy with the maximum reward by implementing iterative actions. It leverages a sample of the environment to guide the following action from the current state to the next, and the relative reward is gained in the meantime. Furthermore, the environment can be updated in light of the obtained reward. The RL problem can be regarded as a Markov Decision Process (MDP) represented by ⟨𝑆, 𝐴, 𝑃 , 𝑅, 𝛾⟩. In detail, 𝑆 and 𝐴 are the state and action spaces, respectively. Symbol 𝑃 is the transition probability matrix, and 𝛾 is the discount factor. 𝑅 is the immediate reward from the current state to the next according to the given policy in the environment. To explore the optimal policy, many algorithms have been proposed. Recently, Q-learning, known as an off-policy learning algorithm, has been introduced into RL and widely developed. In the Q-learning algorithm,
when an agent with state $s$ ($s \in S$) moves to state $\tilde{s}$ ($\tilde{s} \in S$) after carrying out action $a$ ($a \in A$) according to the given policy $\pi(s, a)$, immediate reward $R(s, a)$ can be obtained. The reward with respect to current state $s$ by taking action $a$, denoted by $Q(s, a)$, is a weighted sum of the immediate reward $R(s, a)$ of moving to the next state and the maximum discounted reward attainable from that state:
$$Q(s, a) = R(s, a) + \gamma \max_{\tilde{a} \in \tilde{A}} Q\left(\tilde{s}, \tilde{a}\right), \qquad (16)$$
where $\tilde{a} \in \tilde{A}$, and $\tilde{A}$ is the action set under the next state $\tilde{s}$. The Q-learning algorithm can be regarded as an
optimization problem that finds an optimal policy to maximize the reward $Q(s, a)$. During each iteration of searching for the maximum reward, the reward is updated according to a weighted sum denoted by:
$$Q(s, a) \leftarrow (1 - \alpha)\, Q(s, a) + \alpha \left(R(s, a) + \gamma \max_{\tilde{a} \in \tilde{A}} Q\left(\tilde{s}, \tilde{a}\right)\right), \qquad (17)$$
where $\alpha \in [0, 1]$ is a constant step-size parameter. During each iteration, the sampling of the state can be chosen by way of exploration-only or exploitation-only approaches. The former prefers to sample over a large range of values and determines a state sample according to the uniform distribution. By contrast, the latter selects the state sample with the currently maximum cumulative reward.
The detailed process of computing the threshold based on Q-learning is shown in Algorithm 1. The state $s$ represents the error threshold. The state space is set according to previous prediction errors, i.e., it is
equal to the maximum NMAE among previous predictors. We define the number of updates with respect to
state 𝑠 as 𝑁 (𝑠). Similarly, the NMAE with respect to state 𝑠 can be denoted by 𝑁 𝑀 𝐴𝐸 (𝑠). They can be
obtained by counting the previous data set. Obviously, 𝑁 (𝑠) and 𝑁 𝑀 𝐴𝐸 (𝑠) are monotone decreasing and
increasing functions with respect to 𝑠, respectively. In practice, we hope that 𝑁 𝑀 𝐴𝐸 (𝑠) and 𝑁 (𝑠) are as
small as possible. Therefore, we define the initialized policy as follows:
$$\pi(s, a) = \frac{1}{NMAE(s)\, N(s)}. \qquad (18)$$
By Algorithm 1, we can gain optimal policy 𝜋 * (𝑠, 𝑎), and then set the threshold according to 𝜋 * (𝑠, 𝑎).
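The sketch below illustrates this threshold selection with tabular Q-learning in the spirit of Algorithm 1: states are candidate NMAE thresholds, the reward follows the initialized policy in Eq. (18), favouring both a small $NMAE(s)$ and a small number of updates $N(s)$, and the update rule follows Eq. (17). The action set (move the threshold down, keep it, move it up), the episode structure and the ε-greedy exploration are assumptions, since the full pseudo-code of Algorithm 1 is not reproduced here; the default $\alpha$ and $\gamma$ follow the values used later in the evaluation.

```python
# Sketch: tabular Q-learning over candidate thresholds (illustrative assumptions).
import numpy as np

def q_learning_threshold(thresholds, nmae_of, n_updates_of,
                         alpha=0.5, gamma=0.01, episodes=200, steps=50, eps=0.1):
    # thresholds: candidate values; nmae_of[s], n_updates_of[s]: NMAE(s) and N(s)
    # measured on the previous data set for each candidate state s.
    n = len(thresholds)
    q = np.zeros((n, 3))                           # actions: 0 = down, 1 = stay, 2 = up
    rng = np.random.default_rng(0)
    reward = lambda s: 1.0 / (nmae_of[s] * n_updates_of[s] + 1e-9)   # cf. Eq. (18)
    for _ in range(episodes):
        s = rng.integers(n)                        # exploration: random initial state
        for _ in range(steps):
            a = rng.integers(3) if rng.random() < eps else int(np.argmax(q[s]))
            s_next = int(np.clip(s + a - 1, 0, n - 1))
            # Weighted update of Q(s, a) as in Eq. (17).
            q[s, a] = (1 - alpha) * q[s, a] + alpha * (
                reward(s_next) + gamma * q[s_next].max())
            s = s_next
    return thresholds[int(np.argmax(q.max(axis=1)))]   # threshold of the best state
```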
Up to now, we have presented the proposed deep architecture to solve the traffic prediction problem and
the corresponding optimization method to improve the real-time performance. The details of the prediction method are shown in Algorithm 2. We first construct the training data set from the previous TM $X'$, which is known to us. $X^{(m)}$ is the $m$th sample, and it is an $N^2 \times E$ matrix. Symbol $X'_{*,(m:(m+E-1))}$ denotes
intercepting partial columns of $X'$, i.e., from column $m$ to column $m+E-1$. Similarly, $X'_{n,(m:(m+L_p-1))}$ means
intercepting partial columns (columns $m$ to $m+L_p-1$) from row $n$ of $X'$. Obviously, the output maps of the convolutional and subsampling layers are $\left(N^2-4\right) \times (E-4)$ and $\frac{N^2-4}{2} \times \frac{E-4}{2}$ matrices, respectively. The unfolding layer reshapes the 6 output maps of the subsampling layer into a sequence, which is a $\frac{3\left(N^2-4\right)(E-4)}{2} \times 1$ vector. Each output map $a_j^{(2)}$ is unfolded into a row vector consisting of the $\frac{N^2-4}{2}$ rows of $a_j^{(2)}$ in turn. Aiming at
each OD flow, we build the corresponding training data set for prediction. Then, we train the proposed deep
architecture by the backpropagation algorithm independently for each OD flow prediction. The predictor
can be obtained by forward propagation over the trained deep architecture. After predicting all OD flows, we update the prior TM $X'$ for the next prediction. Meanwhile, we can calculate the NMAE according to the link load, routing information and predictor $\hat{X}$. Besides, we compute the threshold by means of Algorithm 1. If the NMAE is greater than or equal to the threshold, we train the deep architecture with the updated prior TM for the next prediction. Otherwise, we implement the next prediction by forward propagation without re-training the proposed deep architecture. This method is able to decrease the number of training operations and optimize the real-time performance of prediction.
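Algorithm 2 can be summarized by the following online loop, sketched under the assumption that training, forward prediction and the threshold computation are available as separate routines (the helper names are illustrative):

```python
# Sketch of the threshold-based online prediction loop (Algorithm 2, paraphrased).
import numpy as np

def online_prediction(X_prior, R, link_loads, model, train, predict, threshold_fn):
    train(model, X_prior)                               # initial training
    predictions = []
    for t, y_t in enumerate(link_loads):                # one link-load snapshot per slot
        x_hat = predict(model, X_prior)                 # forward propagation only
        predictions.append(x_hat)
        X_prior = np.column_stack([X_prior[:, 1:], x_hat])   # update the prior TM
        nmae = np.abs(y_t - R @ x_hat).sum() / np.abs(y_t).sum()   # Eq. (15)
        if nmae >= threshold_fn():                      # re-train only when needed
            train(model, X_prior)
    return predictions
```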
5 NUMERICAL RESULTS
5.1 Network Traffic Data Set
To evaluate the designed approach, we apply it to the real data set sampled from the Abilene network and from our testbed [21]. The Abilene network includes 12 nodes, and these nodes are connected by 54 internal and external links. Hence, the number of OD flows is 144 in Abilene. We use one week of TM to evaluate
Fig. 3. Real network traffic and its predictor for large OD flow in Abilene.
the proposed approach. The network traffic is sampled at a 10-minute interval. Namely, the TM contains
2016 time slots. Besides, we construct a testbed, which consists of 1 RSU and 12 OBUs, to imitate the scenario of a secure IoV [15]. The topology of our testbed is built according to the open shortest path first algorithm, in which link weights are defined by the general urban path loss model [3]. Meanwhile, the distance between
two OBUs is random. Furthermore, the services provided by the testbed include video and radio. We also
collect end-to-end network traffic with 2016 time slots by using Wireshark. Moreover, two state-of-the-art
network traffic prediction approaches are leveraged for comparison, i.e., the PCA method and the Sparsity
Regularized Matrix Factorization (SRMF) method. The PCA method is a ML-based approach, and leverages
singular value decomposition to capture the spatial features of TM [24]. SRMF is a matrix interpolation
algorithm in fact. It is put forward to reconstruct the missing data during network traffic direct measurement
process. Besides, by setting special parameters, it is also an available network traffic prediction approach by
extracting the spatio-temporal features of TM [21].
In our evaluations, the first 2000 time slots of TM are used as the previous TM to build the training data set, and the rest is for testing. Furthermore, we set $M = 150$, $E = 50$, $L_p = 4$, $\alpha = 0.5$ and $\gamma = 0.01$ in our evaluations. To train the proposed deep architecture, we set the batch size equal to 2 and $p = 0.5$ for the Bernoulli distribution on the dropout layer. We set these parameters empirically to obtain the lowest prediction error. We use MATLAB R2019a to carry out all the evaluations. The evaluations are conducted on a 64-bit Windows 7 machine with an Intel Xeon W-2102 (2.9 GHz) CPU and 32 GB of RAM.
Fig. 4. Real network traffic and its predictor for small OD flow in Abilene.
Fig. 5. Real network traffic and its predictor for large OD flow in testbed.
Fig. 6. Real network traffic and its predictor for small OD flow in testbed.
[Fig. 7. SREs and TREs of CNN+LSTM, PCA and SRMF in Abilene: (a) Spatial Relative Error versus Flow ID; (b) Temporal Relative Error versus Time Slot Order.]
Figs. 3-6 plot the real network traffic and its predictors for large and small OD flows in Abilene and in our testbed. For the large flows, our approach can track the sharp damping faithfully. By contrast, SRMF has a low error, and PCA cannot track this trace at all. For small traffic flows, all approaches incur large prediction errors. In particular, SRMF produces negative predictors, and PCA shows a consistent predictor. Our approach can follow the profile of this traffic flow from time slots 6 to 13.
To provide a quantitative analysis of three approaches, we leverage the Spatial Relative Error (SRE) and
Temporal Relative Error (TRE) defined as:
$$\begin{cases} SRE(n) = \dfrac{\left\| \hat{X}_{n,t} - X_{n,t} \right\|_2}{\left\| X_{n,t} \right\|_2}, \\[2ex] TRE(t) = \dfrac{\left\| \hat{X}_{n,t} - X_{n,t} \right\|_2}{\left\| X_{n,t} \right\|_2}, \end{cases} \qquad (19)$$
where $X_{n,t}$ and $\hat{X}_{n,t}$ are the real network traffic and its predictor, respectively, and the norm runs over time slots $t$ for $SRE(n)$ and over OD flows $n$ for $TRE(t)$. Fig. 7 shows the SREs and TREs
of three approaches. The x-axis in Fig. 7(a) is the index of each OD flow, and it is arranged according to the
means of OD flows from the smallest to the largest ones. We can draw conclusions similar to those from Figs. 3 and 4. SRMF has the largest SREs, especially for small OD flows. The means of the SREs of CNN+LSTM, PCA and SRMF are 0.47, 1.03 and 22.01, respectively. Fig. 7(b) exhibits the TREs of the three approaches. Our method obtains the lowest TRE consistently, while SRMF shows the largest TRE among the three approaches. Fig. 8 displays the Cumulative Distribution Functions (CDF) of the SREs and TREs. From Fig. 8(a), for CNN+LSTM, PCA and SRMF, the SREs are less than 1.63, 1.80 and 15.38, respectively, for 90% of all the OD flows. Meanwhile, in Fig. 8(b), the TREs are less than 0.13, 0.25 and 0.39, respectively, for 80% of the time slots. Similarly, the means of the SREs of the three methods are 2.34, 6.63 and 4.92, as shown in Fig. 9. The TREs of our method exhibit some fluctuations, although they remain the lowest among the three approaches. These fluctuations of TRE can be exhibited clearly in Fig. 10(b).
[Fig. 8. CDFs of the SREs and TREs of CNN+LSTM, PCA and SRMF: (a) Spatial Relative Error; (b) Temporal Relative Error.]
[Fig. 9. SREs and TREs of CNN+LSTM, PCA and SRMF: (a) Spatial Relative Error versus Flow ID, from smallest to largest in mean; (b) Temporal Relative Error versus Time Slot Order.]
[Fig. 10. CDFs of the SREs and TREs of CNN+LSTM, PCA and SRMF: (a) Spatial Relative Error; (b) Temporal Relative Error.]
Moreover, we leverage the prediction bias and the relative sample Standard Deviation (SD) to evaluate
the availability, which are denoted by:
$$\begin{cases} bias(n) = \dfrac{1}{T} \displaystyle\sum_{t=1}^{T} \left(\hat{X}_{n,t} - X_{n,t}\right), \\[2ex] SD(n) = \sqrt{\dfrac{1}{T-1} \displaystyle\sum_{t=1}^{T} \left(error(n) - bias(n)\right)^2}, \end{cases} \qquad (20)$$
where $error(n) = \hat{X}_{n,t} - X_{n,t}$. Fig. 11 shows the biases and SDs of the three approaches. In Fig. 11(a), we sort
all OD flows in descending order according to their averages. From Fig. 11(a), we find that all the approaches show an under-estimate or over-estimate. In particular, CNN+LSTM has smaller biases than the others. Meanwhile, the biases decrease as the means of the OD flows decrease. The biases of SRMF are not consistently related to flow size, as they are in CNN+LSTM and PCA; it shows prominent under-estimates and over-estimates. For small OD flows, SRMF exhibits a remarkable under-estimate. A biased estimator may sometimes have a lower SD, so that its predictors are closer to the real values than those of an unbiased predictor. Thereby, we refer to the SD (or variance) of the bias for further evaluation. In Fig. 11(b), we find that CNN+LSTM and PCA have lower bias but higher SD. An approach with low bias and high SD tends to predict the long-term traffic [24]. Otherwise, when it has high bias and low SD, it is better suited to predicting the short-term traffic. Hence, CNN+LSTM and PCA are suitable for long-term traffic prediction, and SRMF for short-term traffic prediction. As mentioned before, SRMF is a matrix interpolation algorithm used to reconstruct the missing data in TM. Generally, the missing probability of each element is independent and identically distributed. To extend this matrix interpolation algorithm to predict network traffic, it assumes
[Fig. 11. Biases and SDs of CNN+LSTM, PCA and SRMF: (a) Bias versus Flow ID; (b) Bias versus SD in error.]
that the missing elements are a series of columns in TM. In this case, SRMF predicts an element from a snapshot of TM. Hence, it is good at predicting short-term traffic. On the contrary, PCA and CNN+LSTM predict a network traffic element by using a great number of snapshots of TM. For instance, the singular value decomposition is a matrix-oriented operation, and we use CNN as the first hidden layer and employ a matrix as the input of the deep architecture. As a result, PCA and our method are suitable for long-term traffic prediction. We can obtain a similar conclusion from the evaluations in Fig. 12.
Finally, we evaluate the performance improvement ratio for our approach versus PCA and SRMF. The
performance improvement ratio is expressed by:
$$PIR = \frac{\displaystyle\sum_{n=1}^{N^2} \sum_{t=1}^{T} \left| X_{n,t} - \hat{X}^{A}_{n,t} \right| - \displaystyle\sum_{n=1}^{N^2} \sum_{t=1}^{T} \left| X_{n,t} - \hat{X}^{B}_{n,t} \right|}{\displaystyle\sum_{n=1}^{N^2} \sum_{t=1}^{T} \left| X_{n,t} - \hat{X}^{A}_{n,t} \right|}, \qquad (21)$$
where $\hat{X}^{A}_{n,t}$ and $\hat{X}^{B}_{n,t}$ are the predictors obtained via algorithms A and B, respectively. According to Eq. (21), the performance improvement ratios of CNN+LSTM versus PCA and SRMF are 43.34% and 71.47% in Abilene. Moreover,
they are 60.93% and 64.54% in our testbed, respectively.
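For completeness, the evaluation metrics in Eqs. (19) and (21) can be computed with a few lines of NumPy, as sketched below (array shapes are illustrative; X and X_hat are $N^2 \times T$ matrices of real and predicted traffic):

```python
# Sketch of the evaluation metrics: SRE per flow, TRE per time slot (Eq. (19))
# and the performance improvement ratio of algorithm B over algorithm A (Eq. (21)).
import numpy as np

def sre(X, X_hat):
    # Spatial relative error of each OD flow (norm taken over time slots).
    return np.linalg.norm(X_hat - X, axis=1) / np.linalg.norm(X, axis=1)

def tre(X, X_hat):
    # Temporal relative error of each time slot (norm taken over OD flows).
    return np.linalg.norm(X_hat - X, axis=0) / np.linalg.norm(X, axis=0)

def pir(X, X_hat_a, X_hat_b):
    # Positive when algorithm B has a smaller absolute error than algorithm A.
    err_a = np.abs(X - X_hat_a).sum()
    err_b = np.abs(X - X_hat_b).sum()
    return (err_a - err_b) / err_a
```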
[Fig. 12. Biases and SDs of CNN+LSTM, PCA and SRMF in the testbed: (a) Bias versus Flow ID; (b) Bias versus SD in error.]
6 CONCLUSION
In this paper, we have designed a deep architecture for network traffic prediction in secure IoV backbone networks, which jointly extracts two features of TM, i.e., the temporal and spatio-temporal features. Convolutional and subsampling layers
are used for extracting the spatio-temporal features of TM. Meanwhile, an LSTM layer is leveraged to
capture the temporal features. To improve the real-time performance of the proposed deep architecture
for prediction, we propose a threshold-based deep architecture update policy. Furthermore, we propose
a Q-learning algorithm to gain this threshold, in which link loads and routing information are employed
to generate NMAE. In this case, the threshold can be calibrated to obtain an optimal tradeoff between
real-time performance and prediction error. The proposed approach is evaluated by real network traffic data
set from the Abilene backbone network and our constructed testbed. From these evaluations, the proposed
approach can track the trace of end-to-end network traffic precisely.
The 5G-enabled communication network, in which infrastructures are densely deployed, has been incorporated into IoVs to provide adequate communication resources. Hence, network traffic prediction approaches for large-scale and heterogeneous IoV backbone networks are necessary. In this case, the computational complexity of a prediction approach is significantly important for online network traffic prediction. Thereby, a lightweight network traffic prediction approach will be the main focus of our future work.
7 ACKNOWLEDGMENTS
This work was supported in part by the National Key R&D Program of China under Grant 2018YFE0206800,
in part by the National Natural Science Foundation of China under Grants 61936001, 61971084, 62001073
and 61701406, and in part by the Natural Science Foundation of Chongqing under Grants cstc2019jcyj-cxttX0002 and cstc2019jcyj-msxmX0208.
REFERENCES
[1] T. O. Adelani and A. S. Alfa. 2010. Hybrid Techniques for Large-Scale IP Traffic Matrix Estimation. In 2010 IEEE
International Conference on Communications. IEEE, 1–6.
[2] Nikolaos Athanasios Anagnostopoulos, Saad Ahmad, Tolga Arul, Daniel Steinmetzer, Matthias Hollick, and Stefan
Katzenbeisser. 2020. Low-Cost Security for Next-Generation IoT Networks. ACM Trans. Internet Technol. 20, 3,
Article 30 (Sept. 2020), 31 pages. https://doi.org/10.1145/3406280
[3] J. Andrusenko, R. L. Miller, J. A. Abrahamson, N. M. M. Emanuelli, R. S. Pattay, and R. M. Shuford. 2008. VHF
general urban path loss model for short range ground-to-ground communications. IEEE Transactions on Antennas and
Propagation 56, 10 (2008), 3302–3310.
[4] J. Contreras-Castillo, S. Zeadally, and J. A. Guerrero-Ibaez. 2018. Internet of Vehicles: Architecture, Protocols, and
Security. IEEE Internet of Things Journal 5, 5 (2018), 3701–3709.
[5] Giuseppe Faraci, Christian Grasso, and Giovanni Schembra. 2020. Fog in the Clouds: UAVs to Provide Edge Computing
to IoT Devices. ACM Trans. Internet Technol. 20, 3, Article 26 (Aug. 2020), 26 pages. https://doi.org/10.1145/3382756
[6] Qiang He, Xingwei Wang, Zhencheng Lei, Min Huang, Yuliang Cai, and Lianbo Ma. 2019. TIFIM: A Two-stage
Iterative Framework for Influence Maximization in Social Networks. Appl. Math. Comput. 354 (2019), 338 – 352.
https://doi.org/10.1016/j.amc.2019.02.056
[7] Z. Hu, Y. Qiao, and J. Luo. 2018. ATME: Accurate Traffic Matrix Estimation in Both Public and Private Datacenter
Networks. IEEE Transactions on Cloud Computing 6, 1 (2018), 60–73.
[8] B. Hussain, Q. Du, A. Imran, and M. A. Imran. 2020. Artificial Intelligence-Powered Mobile Edge Computing-Based
Anomaly Detection in Cellular Networks. IEEE Transactions on Industrial Informatics 16, 8 (2020), 4986–4996.
[9] H. Khelifi, S. Luo, B. Nour, H. Moungla, Y. Faheem, R. Hussain, and A. Ksentini. 2020. Named Data Networking in
Vehicular Ad Hoc Networks: State-of-the-Art and Challenges. IEEE Communications Surveys & Tutorials 22, 1 (2020),
320–351.
[10] T. Li, J. Yuan, and M. Torlak. 2018. Network Throughput Optimization for Random Access Narrowband Cognitive
Radio Internet of Things (NB-CR-IoT). IEEE Internet of Things Journal 5, 3 (June 2018), 1436–1448. https://doi.org/10.1109/JIOT.2017.2789217
[11] M. Liu, L. Liu, H. Song, Y. Hu, Y. Yi, and F. Gong. 2020. Signal Estimation in Underlay Cognitive Networks for
Industrial Internet of Things. IEEE Transactions on Industrial Informatics 16, 8 (2020), 5478–5488.
[12] X. Liu, L. Che, K. Gao, and Z. Li. 2020. Power System Intra-Interval Operational Security Under False Data Injection
Attacks. IEEE Transactions on Industrial Informatics 16, 8 (2020), 4997–5008.
[13] Marcin Luckner, Maciej Grzenda, Robert Kunicki, and Jaroslaw Legierski. 2020. IoT Architecture for Urban Data-
Centric Services and Applications. ACM Trans. Internet Technol. 20, 3, Article 29 (July 2020), 30 pages. https://doi.org/10.1145/3396850
[14] E. Maggiori, Y. Tarabalka, G. Charpiat, and P. Alliez. 2017. Convolutional Neural Networks for Large-Scale Remote-
Sensing Image Classification. IEEE Transactions on Geoscience and Remote Sensing 55, 2 (2017), 645–657. https://doi.org/10.1109/TGRS.2016.2612821
[15] L. Nie, Z. Ning, X. Wang, X. Hu, Y. Li, and J. Cheng. 2020. Data-Driven Intrusion Detection for Intelligent Internet
of Vehicles: A Deep Convolutional Neural Network-based Method. IEEE Transactions on Network Science and
Engineering (2020), 1–1.
[16] Z. Ning, K. Zhang, X. Wang, L. Guo, X. Hu, J. Huang, B. Hu, and R. Y. K. Kwok. 2020. Intelligent Edge Computing
in Internet of Vehicles: A Joint Computation Offloading and Caching Solution. IEEE Transactions on Intelligent
Transportation Systems (2020), 1–14.
[17] Z. Ning, K. Zhang, X. Wang, M. S. Obaidat, L. Guo, X. Hu, B. Hu, Y. Guo, B. Sadoun, and R. Y. K. Kwok. 2020.
Joint Computing and Caching in 5G-Envisioned Internet of Vehicles: A Deep Reinforcement Learning-Based Traffic
Control System. IEEE Transactions on Intelligent Transportation Systems (2020), 1–12.
[18] A. Omidvar and H. S. Shahhoseini. 2011. Intelligent IP traffic matrix estimation by neural network and genetic algorithm.
In 2011 IEEE 7th International Symposium on Intelligent Signal Processing. IEEE, 1–6.
[19] Carlo Puliafito, Enzo Mingozzi, Francesco Longo, Antonio Puliafito, and Omer Rana. 2019. Fog Computing for the
Internet of Things: A Survey. ACM Trans. Internet Technol. 19, 2, Article 18 (April 2019), 41 pages. https://doi.org/10.1145/3301443
[20] C. Qiu, Y. Zhang, Z. Feng, P. Zhang, and S. Cui. 2018. Spatio-Temporal Wireless Traffic Prediction With Recurrent
Neural Network. IEEE Wireless Communications Letters 7, 4 (Aug 2018), 554–557. https://doi.org/10.1109/LWC.2018.2795605
[21] M. Roughan, Y. Zhang, W. Willinger, and L. Qiu. 2012. Spatio-temporal compressive sensing and Internet traffic
matrices (extended version). IEEE Transactions on Networking 20, 3 (2012), 662–676.
[22] S. K. Singh and A. Jukan. 2018. Machine-learning-based prediction for resource (Re)allocation in optical data
center networks. IEEE/OSA Journal of Optical Communications and Networking 10, 10 (Oct 2018), D12–D28.
https://doi.org/10.1364/JOCN.10.000D12
[23] S. Sinha and C. S. R. Murthy. 2005. Information theoretic approach to traffic adaptive WDM networks. IEEE/ACM
Transactions on Networking 13, 4 (2005), 881–894.
[24] A. Soule, A. Lakhina, N. Taft, K. Papagiannaki, K. Salamatian, A. Nucci, M. Crovella, and C. Diot. 2005. Traffic
matrices: balancing measurements, inference and modeling. In Proceedings of SIGMETRICS 2005. IEEE, 362–373.
[25] D. A. Tedjopurnomo, Z. Bao, B. Zheng, F. Choudhury, and A. K. Qin. 2020. A Survey on Modern Deep Neural Network
for Traffic Prediction: Trends, Methods and Challenges. IEEE Transactions on Knowledge and Data Engineering
(2020), 1–1.
[26] D. Wang, J. Fan, Z. Xiao, H. Jiang, H. Chen, F. Zeng, and K. Li. 2019. Stop-and-Wait: Discover Aggregation Effect
Based on Private Car Trajectory Data. IEEE Transactions on Intelligent Transportation Systems 20, 10 (Oct 2019),
3623–3633. https://doi.org/10.1109/TITS.2018.2878253
[27] J. Wang, J. Tang, Z. Xu, Y. Wang, G. Xue, X. Zhang, and D. Yang. 2017. Spatiotemporal modeling and prediction
in cellular networks: A big data enabled deep learning approach. In IEEE INFOCOM 2017 - IEEE Conference on
Computer Communications. IEEE, 1–9.
[28] X. Wang, Z. Ning, S. Guo, and L. Wang. 2020. Imitation Learning Enabled Task Scheduling for Online Vehicular Edge
Computing. IEEE Transactions on Mobile Computing (2020), 1–1.
[29] J. Weng, J. Weng, Y. Zhang, W. Luo, and W. Lan. 2019. BENBI: Scalable and Dynamic Access Control on the
Northbound Interface of SDN-Based VANET. IEEE Transactions on Vehicular Technology 68, 1 (Jan 2019), 822–831.
https://doi.org/10.1109/TVT.2018.2880238
[30] D. Wu, Z. Jiang, X. Xie, X. Wei, W. Yu, and R. Li. 2020. LSTM Learning With Bayesian and Gaussian Processing for
Anomaly Detection in Industrial IoT. IEEE Transactions on Industrial Informatics 16, 8 (2020), 5244–5253.
[31] S. Yang, Y. Su, Y. Chang, and H. Hung. 2019. Short-Term Traffic Prediction for Edge Computing-Enhanced Autonomous
and Connected Cars. IEEE Transactions on Vehicular Technology 68, 4 (April 2019), 3140–3153. https://doi.org/10.1109/TVT.2019.2899125
[32] C. Zhang, H. Zhang, D. Yuan, and M. Zhang. 2018. Citywide Cellular Traffic Prediction Based on Densely Connected
Convolutional Neural Networks. IEEE Communications Letters 22, 8 (Aug 2018), 1656–1659. https://doi.org/10.1109/LCOMM.2018.2841832
[33] L. Zhang and Y. Liang. 2019. Joint Spectrum Sensing and Packet Error Rate Optimization in Cognitive IoT. IEEE
Internet of Things Journal 6, 5 (Oct 2019), 7816–7827. https://doi.org/10.1109/JIOT.2019.2907993
[34] Z. Zhao, W. Chen, X. Wu, P. C. Y. Chen, and J. Liu. 2017. LSTM network: a deep learning approach for short-term
traffic forecast. IET Intelligent Transport Systems 11, 2 (2017), 68–75.