Electricity Price Forecasting Model based on Gated
Recurrent Units
Nafise Rezaei, Faculty of ECE, Qom University of Technology, Qom, Iran (rezaii.n@qut.ac.ir)
Roozbeh Rajabi, Faculty of ECE, Qom University of Technology, Qom, Iran (rajabi@qut.ac.ir)
Abouzar Estebsari, School of the Built Environment and Architecture, London South Bank University, London, United Kingdom (estebsaa@lsbu.ac.uk)
arXiv:2207.14225v1 [cs.LG] 28 Jul 2022

   Abstract—The participation of consumers and producers in demand response programs has increased in smart grids, which reduces the investment and operation costs of power systems. With the advent of renewable energy sources, the electricity market is also becoming more complex and unpredictable. To implement demand response programs effectively, forecasting the future price of electricity is crucial for producers in the electricity market. Electricity prices are very volatile and change under the influence of various factors such as temperature, wind speed, rainfall, and the intensity of commercial and daily activities. Considering these influencing factors as input variables can therefore increase the accuracy of the forecast. In this paper, a model for electricity price forecasting based on Gated Recurrent Units (GRUs) is presented. The electrical load consumption is considered as an input variable in this model. Noise in the electricity price seriously reduces the efficiency and effectiveness of the analysis, so an adaptive noise reducer is integrated into the model. Stacked autoencoders (SAEs) are then used to extract features from the de-noised electricity price. Finally, the de-noised features are fed into the GRU to train the predictor. Results on a real dataset show that the proposed methodology can effectively predict the electricity price.
   Index Terms—Electricity Price, Gated Recurrent Unit (GRU), Feature Extraction, Stacked Auto Encoder (SAE).

I. INTRODUCTION
   Current technology cannot store electricity in large amounts efficiently and cost-effectively. Traditionally, production has followed demand to keep the system balanced; in smart grids with more intermittent renewable generation, demand also follows generation through appropriate demand-side management schemes [1]. Electricity demand depends on temperature, wind speed, precipitation, the intensity of business and everyday activities, etc. [2], [3]. These factors result in price dynamics. Electricity market actors find it challenging to forecast the electricity load [4], generation, and price accurately and efficiently as the systems face more uncertainty, non-linearity, and volatility [5]. There are different methods and solutions to tackle these challenges, each with its pros and cons. Empirical Mode Decomposition (EMD) is a technique that decomposes a signal into a series of Intrinsic Mode Functions (IMFs) with different frequencies and a residual. Besides its advantages, EMD has two main issues: i) the extreme values at the two ends of the signal cannot be determined; and ii) the same IMF may contain more than two main frequencies, which limits the noise reduction [6]. These problems are addressed by Ensemble EMD (EEMD) in [7], where white Gaussian noise is added to the signal. However, adding noise can produce varying numbers of IMFs, which leaves residual noise in the reconstructed signal after the decomposition. This problem of an unbalanced EMD decomposition scale was resolved by Complete EEMD with Adaptive Noise (CEEMDAN) [8], which adds white noise components in the sifting process of each IMF. Selecting the sensitive modes to distinguish relevant from irrelevant IMFs efficiently remains a serious challenge in EMD-based methods. To tackle these challenges, we use the CEEMDAN method, which can potentially avoid spurious modes and reduce the amount of noise contained in the modes. In the following sections of this paper, related works on electricity price forecasting are summarized, an improved version of CEEMDAN together with a GRU model is introduced, and results evaluating the efficiency of the modified method are presented. The paper then concludes with remarks.

II. LITERATURE REVIEW
   Creating high-quality and effective forecasting models is becoming more challenging in the new electricity markets, where most consumers and producers take active roles in the market, more renewable resources with intermittent behavior are integrated into the network, and more fluctuations and volatility are observed in the system [5]. The literature shows that these challenges have led stakeholders and researchers to develop several different methods and models. These models can be classified into three categories: statistical methods, artificial intelligence methods, and hybrid methods [9]. Statistical methods such as autoregressive moving average (ARMA) [10], autoregressive integrated moving average (ARIMA) [11], generalized autoregressive conditional heteroskedasticity (GARCH) [12], vector autoregression (VAR) [13], and Kalman filters (KF) [14] outperform other methods in stable power markets [15]. However, these methods do not account for the rapid changes and nonlinear characteristics of electricity prices, which is a limitation [5]. Compared
to these statistical methods, artificial intelligence methods are capable of handling nonlinear properties and rapid changes [16]. There are many studies on electricity price prediction based on artificial intelligence methods such as artificial neural networks (ANN) [17], [18], recurrent neural networks (RNN) [19], [20], and the Extreme Learning Machine (ELM) [21]. Nevertheless, applying only traditional single models may overlook the complex characteristics of the original nonlinear and nonstationary electricity prices. To overcome the disadvantages of single models, many hybrid models have been developed for electricity price forecasting that use data analysis methods to preprocess nonlinear and nonstationary electricity price data before forecasting [16]. For example, Zhang et al. [22] developed a forecasting model that combines the wavelet transform (WT), a generalized regression neural network (GRNN), and a GARCH model; it was validated in the Spanish electricity market and performed well [23].
III. PROPOSED METHOD
   The growing attention to the prediction of time series such as electricity prices reflects its potential application in many fields. However, predicting time series is not an easy task. Dealing with noise and extracting features from the time series are the two main challenges in obtaining accurate predictions. Because time series are nonstationary and often contain noise, data processing is necessary to reduce the effect of noise on the prediction. To address these issues, a framework for noise removal and feature extraction for electricity price forecasting is proposed. In this framework, the CEEMDAN technique is used to decompose the time series and eliminate noise. A combination of stacked autoencoders and a GRU is then used for feature extraction and prediction. The diagram of the proposed model is presented in Figure 1.
Fig. 1. The proposed ANR-SAE-GRU method.

A. Adaptive Noise Reduction
   First, the CEEMDAN technique [8] is used to decompose the electricity price into IMFs and a residual (R). This technique can decompose a signal into a set of IMFs that include noise modes and information modes. Therefore, it can be a powerful tool for extracting IMFs for signal reconstruction. The main issue is how to choose a sensitive mode to distinguish efficiently between relevant and unrelated IMFs. The residual has the lowest frequency, and the frequencies of the IMFs decrease from the first IMF to the last. It is assumed that the electricity price time series is $X = \{x_1, x_2, \ldots, x_m\}$ [24].

   Step 1: Gaussian noise $\varepsilon_0 w_i$ $(i = 1, 2, \cdots, I)$ is added to the electricity price, and $X + \varepsilon_0 w_i$ is decomposed by the EMD technique to obtain the first IMF:

$$IMF_1 = \frac{1}{I} \sum_{i=1}^{I} E_1(X + \varepsilon_0 w_i) \qquad (1)$$

   In this relation, $\varepsilon_0$ is the adaptive coefficient and $I$ is the number of added Gaussian noise realizations. In this paper, $I$ is set to 100.

   Step 2: For $k = 1, 2, \cdots, K$, the $k$th residue is calculated as $r_k = r_{k-1} - IMF_k$ with $r_0 = X$. Then $r_k + \varepsilon_k E_k(w_i)$ $(i = 1, 2, \cdots, I)$ is decomposed by EMD to obtain its first mode, and the $(k+1)$th IMF is defined as follows:

$$IMF_{k+1} = \frac{1}{I} \sum_{i=1}^{I} E_1(r_k + \varepsilon_k E_k(w_i)) \qquad (2)$$

   Step 3: Step 2 is repeated until the residue cannot be decomposed any further. The final residue is as follows:

$$R = X - \sum_{k=1}^{K} IMF_k \qquad (3)$$

   where $K$ is the number of IMFs, and the electricity price time
series can be rewritten as follows:

$$X = \sum_{k=1}^{K} IMF_k + R \qquad (4)$$

   In the next step, noisy and noise-free IMFs must be separated. To do this, the permutation entropy method is used. Permutation Entropy (PE) [25] measures the complexity, or randomness, of a time series. The higher the permutation entropy, the more noise is present in the time series. IMF classification is based on the PE value of each IMF. The PE threshold is set to 0.7 based on several experiments with different methods. If the PE value of an IMF is higher than the threshold, the IMF contains noise; otherwise, the IMF is noise-free. Assuming that $P$ is the point of separation between high-frequency noise IMFs and low-frequency noise-free IMFs, the decomposed series can be expressed as follows:

$$X = \sum_{k=1}^{P-1} IMF(k) + \sum_{k=P}^{K} IMF(k) + R \qquad (5)$$

B. Noise Reduction
   To reduce noise in the IMFs with high-frequency noise, the adaptive threshold and the threshold function are determined using the following equations:

$$\lambda = \sigma \sqrt{\frac{2\ln(m)}{\ln(k+1)}}, \qquad w_\lambda = \begin{cases} \mathrm{sgn}(w)(|w| - \lambda), & |w| \ge \lambda \\ 0, & |w| < \lambda \end{cases} \qquad (6)$$

   where $\sigma$, $m$, and $k$ are the standard deviation, the length, and the number (index) of the IMF, respectively. The notation $w$ is a value in the noisy IMF. If the absolute value of $w$ is greater than or equal to $\lambda$, it takes the value $\mathrm{sgn}(w)(|w| - \lambda)$; otherwise it takes 0 [24].

C. Reconstruction
   The noise-free time series is obtained by adding the denoised high-frequency IMFs and the low-frequency noise-free IMFs:

$$X = \sum_{k=1}^{P-1} IMF(k) + \sum_{k=P}^{K} IMF'(k) + R \qquad (7)$$

   where $IMF'(k)$ $(k = P, P+1, \cdots, K)$ is the denoised high-frequency IMF, $IMF(k)$ is the low-frequency noise-free IMF, and $R$ is the residue.

D. Unsupervised Learning
   To analyze the price of electricity effectively, it is important to extract the features and the interdependence between variables. Feature extraction methods can be divided into supervised and unsupervised learning methods. Supervised learning methods extract features from labeled datasets. However, electricity price data are often unlabeled, and converting them to labeled data is costly and requires professional knowledge. Unsupervised learning can learn a feature representation layer from unlabeled data. In addition, these layers can be stacked to create deep networks. The proposed model uses a stacked autoencoder (SAE), which is a kind of unsupervised learning method. Figure 2 shows the structure of a typical autoencoder. An autoencoder consists of two parts, the encoder and the decoder. In this paper, an autoencoder with three hidden layers is used, which extracts the features required to increase the accuracy of forecasting.

Fig. 2. Auto Encoder Structure.

E. GRU Predictor
   The Gated Recurrent Unit (GRU) is especially effective in learning the characteristics of longer time series, because it needs fewer trainable weights and less computation than LSTM to learn the properties of long sequences. The GRU is therefore used here to process and learn the time-domain characteristics of the electricity prices extracted by the SAE and to improve the accuracy of the final model forecast.

IV. EVALUATION AND RESULTS
   In this section, the proposed framework is compared with a framework that uses LSTM to predict the electricity price. First, the dataset is described, and then the method for data sampling for evaluation is described. Two criteria are used to evaluate the prediction models: root mean square error (RMSE) and mean absolute error (MAE).

A. Dataset
   The data used in this article are the prices of electricity consumed in Iran over three years. The dataset contains hourly electricity prices, 35040 samples in total. In this study, 1168 days are used for training (28032 samples, one sample per hour) and 292 days for testing (7008 samples). The Iran electricity market price dataset is public and can be downloaded from https://www.igmc.ir/Electronic-Services/Power-Market-Deputy/Reports.
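To make the data sampling concrete, the following is a minimal pure-Python sketch of a chronological train/test split and of building input/target pairs from the hourly price series. The function names `chronological_split` and `make_windows` are hypothetical, and the choice of a single target value `horizon` steps after the input window is an assumption; the paper only states the split sizes (28032 and 7008 samples), an input length of 24, and horizons of 3, 6, 9, and 12 hours.

```python
from typing import List, Sequence, Tuple


def chronological_split(series: Sequence[float], n_train: int) -> Tuple[List[float], List[float]]:
    """Keep time order: the first n_train points are for training, the rest for testing."""
    return list(series[:n_train]), list(series[n_train:])


def make_windows(series: Sequence[float], input_len: int = 24,
                 horizon: int = 3) -> List[Tuple[List[float], float]]:
    """Pair each window of input_len past prices with the price horizon steps ahead.

    Assumption: one scalar target per window, located horizon steps after
    the last value of the window.
    """
    samples = []
    last_start = len(series) - input_len - horizon
    for start in range(last_start + 1):
        window = list(series[start:start + input_len])
        target = series[start + input_len + horizon - 1]
        samples.append((window, target))
    return samples
```

Calling `chronological_split(prices, 28032)` on the 35040 hourly samples would leave 7008 samples for testing, and `make_windows` can then be applied separately to each part so that no test value leaks into a training window.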
TABLE I
COMPARISON RESULTS BASED ON RMSE AND MAE.

                     RMSE, horizon (hours)          MAE, horizon (hours)
Model                3      6      9      12        3      6      9      12
Proposed Method      2.86   3.89   5.18   6.48      1.8    2.83   3.92   5
ANR-SAE-LSTM         4.33   5.28   6.39   7.5       2.09   2.94   4      5.42
Fig. 3. Prediction results compared to actual values.

B. Evaluation Measures
   The evaluation was performed using the RMSE and MAE indicators for both the training and test sets. Eqs. (8) and (9) define these criteria, respectively:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2} \qquad (8)$$

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} |y_i - \hat{y}_i| \qquad (9)$$

   where $N$ is the total number of sample points, and $y_i$ and $\hat{y}_i$ are the actual and predicted values of the $i$th sample, respectively.
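The two measures in Eqs. (8) and (9) can be computed directly from paired lists of actual and predicted prices. The sketch below uses only the Python standard library; the function names `rmse` and `mae` and the length check are illustrative choices, not taken from the paper's code.

```python
import math
from typing import Sequence


def _check(actual: Sequence[float], predicted: Sequence[float]) -> None:
    """Both series must be non-empty and have one prediction per actual value."""
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equally long")


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error, Eq. (8)."""
    _check(actual, predicted)
    squared = [(y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error, Eq. (9)."""
    _check(actual, predicted)
    absolute = [abs(y - y_hat) for y, y_hat in zip(actual, predicted)]
    return sum(absolute) / len(absolute)
```

Because RMSE squares the errors before averaging, it penalizes occasional large misses more heavily than MAE, which is why reporting both gives a fuller picture of forecast quality.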
C. Results and discussion
   In this section, the proposed GRU-based method is compared with the LSTM-based method on an electricity price dataset to demonstrate the superiority of the proposed method in electricity price prediction. Simulations are performed in Python using the NumPy, PyEEMD, and Keras libraries.
   In our experiment, the prediction horizon is set to 3, 6, 9, and 12 hours, and the corresponding input data length is set to 24. Table I summarizes the prediction results for the different forecast horizons. As the forecast horizon increases, the forecasting error of both methods becomes larger, but the proposed method performs better on both criteria and has the least error growth. On average over the forecast horizons, the proposed method is better than the other method by 1.54 in RMSE and 0.3 in MAE.
   Figure 3 shows the forecast results of the methods compared to the actual price values for 12 days. The proposed method provides predicted values closer to the actual values than ANR-SAE-LSTM.

V. CONCLUSION
   To deal with noise in the electricity price, an adaptive noise reducer (ANR) was used in this paper. Permutation Entropy (PE) is used to determine the boundary point between noisy and low-noise IMFs, and an adaptive threshold function is constructed to eliminate noise from the high-frequency IMFs. An SAE was used to extract features that take the electricity price forecasting performance into account and prevent overfitting, and the GRU model was employed to obtain a robust forecaster. The proposed GRU-based method was compared with the LSTM-based method on an electricity price dataset. Two criteria, RMSE and MAE, were considered for the comparison of results. The results showed that the forecast error of both methods becomes larger as the forecast horizon increases. On average over the forecast horizons, the proposed method is better than the other method by 1.54 in RMSE and 0.3 in MAE.
   Our approach forecasts electricity prices using only historical electricity price data. However, since the price of electricity depends on several factors, the effect of these additional variables on the price forecast can be considered. In future work, we will further explore possible methods of multi-objective forecasting. In the proposed method, only one GRU predictor is used in the aggregation stage, whereas several predictors such as LSTM and GRU could be combined in an ensemble to capture consumption patterns.

REFERENCES
[1] J. Bauer, Upgrading Leadership’s Crystal Ball. CRC Press, 2013.
[2] R. Weron, Modeling and forecasting electricity loads and prices: A statistical approach. John Wiley & Sons, 2007.
[3] A. Estebsari and R. Rajabi, “Single residential load forecasting using deep learning and image encoding techniques,” Electronics, vol. 9, no. 1, 2020. [Online]. Available: https://www.mdpi.com/2079-9292/9/1/68
[4] R. Rajabi and A. Estebsari, “Deep learning based forecasting of individual residential loads using recurrence plots,” in 2019 IEEE Milan PowerTech, 2019, pp. 1–5.
[5] S.-C. Chan, K. M. Tsui, H. Wu, Y. Hou, Y.-C. Wu, and F. F. Wu, “Load/price forecasting and managing demand response for smart grids: Methodologies and challenges,” IEEE Signal Processing Magazine, vol. 29, no. 5, pp. 68–85, 2012.
[6] J. Zhang, Y. Guo, Y. Shen, D. Zhao, and M. Li, “Improved CEEMDAN–wavelet transform de-noising method and its application in well logging noise reduction,” Journal of Geophysics and Engineering, vol. 15, no. 3, pp. 775–787, 2018.
[7] Z. Wu and N. E. Huang, “Ensemble empirical mode decomposition: a noise-assisted data analysis method,” Advances in Adaptive Data Analysis, vol. 1, no. 01, pp. 1–41, 2009.
[8] M. E. Torres, M. A. Colominas, G. Schlotthauer, and P. Flandrin, “A complete ensemble empirical mode decomposition with adaptive noise,” in 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2011, pp. 4144–4147.
[9] M. Lei, L. Shiyan, J. Chuanwen, L. Hongling, and Z. Yan, “A review on the forecasting of wind speed and generated power,” Renewable and Sustainable Energy Reviews, vol. 13, no. 4, pp. 915–920, 2009.
[10] F.-L. Chu, “Forecasting tourism demand with ARMA-based methods,” Tourism Management, vol. 30, no. 5, pp. 740–751, 2009.
[11] T. Jakaša, I. Andročec, and P. Sprčić, “Electricity price forecasting – ARIMA model approach,” in 2011 8th International Conference on the European Energy Market (EEM). IEEE, 2011, pp. 222–225.
[12] R. C. Garcia, J. Contreras, M. Van Akkeren, and J. B. C. Garcia, “A GARCH forecasting model to predict day-ahead electricity prices,” IEEE Transactions on Power Systems, vol. 20, no. 2, pp. 867–874, 2005.
[13] H. Nyberg and P. Saikkonen, “Forecasting with a noncausal VAR model,” Computational Statistics & Data Analysis, vol. 76, pp. 536–555, 2014.
[14] H. Takeda, Y. Tamura, and S. Sato, “Using the ensemble Kalman filter for electricity load forecasting and analysis,” Energy, vol. 104, pp. 184–198, 2016.
[15] F. J. Nogales, J. Contreras, A. J. Conejo, and R. Espínola, “Forecasting next-day electricity prices by time series models,” IEEE Transactions on Power Systems, vol. 17, no. 2, pp. 342–348, 2002.
[16] W. Yang, J. Wang, T. Niu, and P. Du, “A novel system for multi-step electricity price forecasting for electricity market management,” Applied Soft Computing, vol. 88, p. 106029, 2020.
[17] D. Singhal and K. Swarup, “Electricity price forecasting using artificial neural networks,” International Journal of Electrical Power & Energy Systems, vol. 33, no. 3, pp. 550–555, 2011.
[18] I. P. Panapakidis and A. S. Dagoumas, “Day-ahead electricity price forecasting via the application of artificial neural network based models,” Applied Energy, vol. 172, pp. 132–151, 2016.
[19] S. Anbazhagan and N. Kumarappan, “Day-ahead deregulated electricity market price forecasting using recurrent neural network,” IEEE Systems Journal, vol. 7, no. 4, pp. 866–872, 2012.
[20] U. Ugurlu, I. Oksuz, and O. Tas, “Electricity price forecasting using recurrent neural networks,” Energies, vol. 11, no. 5, 2018. [Online]. Available: https://www.mdpi.com/1996-1073/11/5/1255
[21] X. Chen, Z. Y. Dong, K. Meng, Y. Xu, K. P. Wong, and H. Ngan, “Electricity price forecasting with extreme learning machine and bootstrapping,” IEEE Transactions on Power Systems, vol. 27, no. 4, pp. 2055–2062, 2012.
[22] J. Zhang, Z. Tan, and C. Li, “A novel hybrid forecasting method using GRNN combined with wavelet transform and a GARCH model,” Energy Sources, Part B: Economics, Planning, and Policy, vol. 10, no. 4, pp. 418–426, 2015.
[23] D. Wang, H. Luo, O. Grunder, Y. Lin, and H. Guo, “Multi-step ahead electricity price forecasting using a hybrid model based on two-layer decomposition technique and BP neural network optimized by firefly algorithm,” Applied Energy, vol. 190, pp. 390–407, 2017.
[24] F. Liu, M. Cai, L. Wang, and Y. Lu, “An ensemble model based on adaptive noise reducer and over-fitting prevention LSTM for multivariate time series forecasting,” IEEE Access, vol. 7, pp. 26102–26115, 2019.
[25] C. Bandt and B. Pompe, “Permutation entropy: A natural complexity measure for time series,” Phys. Rev. Lett., vol. 88, p. 174102, Apr 2002. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevLett.88.174102