Building on the resampling of raw AIS data during preprocessing, the extraction of maritime route patterns by trajectory clustering, and the attentional seq2seq time series prediction framework, we propose a deep neural network called VesNet, which jointly learns to classify the input historical vessel movement sequence into the appropriate route pattern and to forecast the future navigation sequence over a given period. The overall structure of VesNet is shown in Figure 7. It consists of three modules: the encoder, the decoder, and the MTL block. VesNet is an extended variational version of the attentional seq2seq model, in which the intermediate hidden vector supports both route pattern recognition and trajectory prediction. VesNet takes the historical vessel movement sequence as input and outputs the route pattern classification result and the future movement sequence, with a prediction horizon ranging from minutes to hours.
As shown in Figure 7, a vessel movement sequence \((a_1, a_2, \ldots , a_l)\) sampled at a uniform time interval is fed into the encoder. Each timestep input \(a_j\), where \(j=1,2,\ldots ,l\), is a four-dimensional vector comprising the vessel's latitude, longitude, velocity, and course. We apply min-max normalization for feature scaling, which maps each input feature into the range [0, 1]. After processing the input with an LSTM recurrent network, we keep the returned sequence of hidden states for the subsequent attention mechanism. Meanwhile, the last-timestep hidden state
\(h_l\), which contains the spatial and temporal characteristics of the input sequence, is concatenated with the latent representation of the departure port extracted from historical sequences. The merged hidden vector is reserved for later route pattern classification and vessel trajectory prediction. On the decoder side, another LSTM recurrent network of length \(h\) is deployed to predict the vessel's future movement sequence. We set \(h\) large enough to cover both short-term and long-term predictions. Stopping the inference at different timesteps lets VesNet predict over different time horizons. At each decoder timestep, the network operates on the previous timestep's state (
\(h^{\prime }_{i-1}, C^{\prime }_{i-1}\)) and output \(a^{\prime }_{i-1}\), where \(i=1,2,\ldots ,h\), to generate the current timestep hidden state \(h^{\prime }_i\). Note that \(h^{\prime }_0\) and \(C^{\prime }_0\) are initialized with the merged hidden vectors generated by the encoder. Then, using the attention mechanism elaborated in Equations (7)–(9), we use \(h^{\prime }_i\) to query the contextual sequence obtained by the encoder, resulting in the weighted sum vector
\(\mathrm{attn}_i\). Finally, the MTL block is responsible for route pattern classification and future movement sequence forecasting. On the one hand, a softmax function is applied to the merged hidden vector so that its output matches the one-hot encoding of the route pattern cluster labeled in Section 4.1. The classification result \(r\) is then embedded to match the dimension of the attentional context. On the other hand, we concatenate \(h^{\prime }_i\), \(\mathrm{attn}_i\), and \({\rm embedding}(r)\) to merge the hidden state of the current timestep, the queried historical context, and the route pattern information into a hybrid vector, which is used for single-timestep vessel movement prediction. Specifically, the hybrid vector passes through a Traj Output (TO) module, which consists of a dense layer, a ReLU layer, another dense layer, and a sigmoid layer connected in sequence. Iterating this process over the decoder timesteps produces the predicted vessel movement sequence. After inverting the min-max normalization, we obtain the final forecasts. The primary purpose of jointly learning route pattern classification and future movement sequence prediction is to constrain the predictions within a prior statistical distribution using the auxiliary information provided by the extracted route knowledge, which helps improve forecasting precision.
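To make the data flow through the MTL block concrete, the following NumPy sketch traces a single decoder timestep: route classification by softmax, attention over the encoder context, concatenation into the hybrid vector, the TO module, and the inverse min-max step. It is an illustrative sketch under several assumptions rather than the VesNet implementation: the weights are random instead of trained, the LSTM cells are omitted (the decoder state `h_dec` is simply given), a dense projection is assumed before the softmax, plain dot-product scoring stands in for the attention of Equations (7)–(9), and all layer sizes, feature bounds, and names (`mtl_step`, `params`, `lo`, `hi`) are hypothetical.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # shift for numerical stability
    return e / e.sum()

def minmax_scale(x, lo, hi):
    """Map each feature into [0, 1] given per-feature bounds."""
    return (x - lo) / (hi - lo)

def minmax_inverse(x_scaled, lo, hi):
    """Undo min-max scaling to recover physical units."""
    return x_scaled * (hi - lo) + lo

def dot_attention(query, context):
    """Assumed dot-product scoring; the paper's Eqs. (7)-(9) may differ."""
    weights = softmax(context @ query)  # one weight per encoder timestep
    return weights @ context            # weighted sum vector attn_i

def traj_output(hybrid, p):
    """TO module: dense -> ReLU -> dense -> sigmoid."""
    hidden = np.maximum(0.0, p["W1"] @ hybrid + p["b1"])
    logits = p["W2"] @ hidden + p["b2"]
    return 1.0 / (1.0 + np.exp(-logits))

def mtl_step(h_dec, context, merged, p):
    """One decoder timestep of the MTL block (untrained, illustrative)."""
    # Route classification from the merged hidden vector
    # (the dense projection before the softmax is an assumption).
    route_probs = softmax(p["Wc"] @ merged + p["bc"])
    r = int(np.argmax(route_probs))
    attn = dot_attention(h_dec, context)
    # Hybrid vector: current hidden state, attention context, route embedding.
    hybrid = np.concatenate([h_dec, attn, p["E"][r]])
    return route_probs, traj_output(hybrid, p)

# Hypothetical sizes: l encoder steps, hidden size d, 3 route clusters.
rng = np.random.default_rng(0)
l, d, n_routes, n_feat, n_hidden = 6, 8, 3, 4, 16
params = {
    "Wc": 0.1 * rng.normal(size=(n_routes, d)), "bc": np.zeros(n_routes),
    "E": 0.1 * rng.normal(size=(n_routes, d)),  # route embedding table
    "W1": 0.1 * rng.normal(size=(n_hidden, 3 * d)), "b1": np.zeros(n_hidden),
    "W2": 0.1 * rng.normal(size=(n_feat, n_hidden)), "b2": np.zeros(n_feat),
}
context = rng.normal(size=(l, d))  # stands in for the encoder's returned sequence
merged = rng.normal(size=d)        # stands in for the merged hidden vector
h_dec = rng.normal(size=d)         # stands in for the decoder state h'_i

# Assumed bounds for (latitude, longitude, velocity, course).
lo = np.array([-90.0, -180.0, 0.0, 0.0])
hi = np.array([90.0, 180.0, 30.0, 360.0])

route_probs, a_scaled = mtl_step(h_dec, context, merged, params)
forecast = minmax_inverse(a_scaled, lo, hi)
print(route_probs.round(3), forecast.round(3))
```

Because the TO module ends in a sigmoid, each predicted feature lies in [0, 1], the same range as the min-max-normalized inputs, so the inverse transform always returns values within the chosen feature bounds.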