Towards An Efficient Model For Network Intrusion Detection System (IDS) : Systematic Literature Review

This paper presents a systematic literature review on network intrusion detection systems (IDS), focusing on anomaly, signature, and hybrid-based approaches. It highlights the gaps in existing research, particularly the lack of comprehensive analyses of hybrid methods, and proposes future research directions to enhance IDS models. The study adheres to systematic review principles, providing a detailed synthesis of methodologies, datasets, and performance metrics used in current IDS research.
Wireless Networks (2024) 30:453–482

https://doi.org/10.1007/s11276-023-03495-2

ORIGINAL PAPER

Towards an efficient model for network intrusion detection system (IDS): systematic literature review
Oluwadamilare Harazeem Abdulganiyu1 • Taha Ait Tchakoucht1 • Yakub Kayode Saheed2

Accepted: 30 August 2023 / Published online: 14 September 2023


© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023

Abstract
With the recent rise in internet usage, the volume of crucial, private, and confidential data traveling online has increased. Attackers have exploited weaknesses in security mechanisms to break into networks, gaining access to sensitive data in ways that could disrupt the operation of systems and jeopardize the confidentiality of the data. Intrusion detection systems, a key component of cybersecurity, are used to defend against these threats. Numerous review papers have examined various intrusion detection system (IDS) approaches for networks, many of which focused primarily on anomaly-based IDS while paying less attention to the signature-based and hybrid-based approaches. Additionally, many of these studies took a non-systematic approach, comparing existing techniques without the in-depth analytical synthesis of methodologies and results needed to provide a thorough grasp of the state of the art. To provide a thorough assessment of the current status of network IDS, this work gives an in-depth insight into what is attainable in research on anomaly-based, signature-based, and hybrid-based methods by adhering to the principles of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) and the guidelines for the software engineering domain. Based on the study's findings, we identified unexplored study topics and unsolved research issues. We end by highlighting potential high-impact future research areas for improving IDS models.

Keywords Network intrusion detection system · Signature-based · Anomaly-based · Artificial intelligence · Rule-based · Pattern matching

Oluwadamilare Harazeem Abdulganiyu
h.abdulganiyuoluwadamilare@ueuromed.org

1 EuroMed Research Center, School of Digital Engineering and Artificial Intelligence, Euro-Mediterranean University of Fes, 30030 Fes, Morocco
2 School of IT and Computing, American University of Nigeria, Yola, Nigeria

1 Introduction

With gaps in security systems, attackers have attempted to intrude into networks, thereby gaining access to sensitive information, which may harm the operation of the systems and compromise the data's availability and confidentiality. Thus, cybersecurity has been perceived as a crucial domain to fight against cyber-attacks and alleviate related damages and costs. To this end, the Intrusion Detection System (IDS), which acts as a tool to recognize various forms of intrusions, is an essential component of the security infrastructure: it monitors a network and spots suspicious activity, malicious activity, or policy violations in network traffic. IDSs can be either anomaly-based or signature-based [49, 52, 59].

An IDS works by searching for either signatures of known attacks or deviations from a predefined profile of normal activity [64, 68]. Typically, the signature-based IDS, also known as "misuse intrusion detection" or "knowledge-based intrusion detection," examines a database of known attack signatures and attempts to match the observed activity to a particular attack signature in the database; if there is a match, the system labels the activity as malicious [74]. Because signatures of existing attacks are readily available, this method has the advantage of easily identifying known attacks. However, because the database does not contain signature patterns for novel or unknown attacks, this technique is unable to detect them. It also consumes enormous resources because of the extensive signature database that has to be maintained, updated, and


compared with the data packets for possible attacks. In contrast, anomaly-based intrusion detection systems, often known as "behavior-based IDS," create a predetermined profile of typical behavior and classify any deviation from it as an intrusion [76]. This strategy's strength is its capacity to identify previously unknown attacks. However, it typically leads to a higher false-positive rate, which limits its application. Researchers have made numerous attempts to develop IDS techniques that can effectively detect attacks. The majority of IDSs recently presented in the literature have focused on the anomaly-based approach using artificial intelligence (deep learning and machine learning techniques) to identify intrusions in networks, while few are based on the signature-based approach using rule-based and pattern matching techniques [77]. Moreover, little attention has been paid to investigating a hybrid-based method that combines the signature and anomaly techniques [80].

The current IDS methods still have a high false alert rate [22, 28, 32, 47, 69] and a low detection rate [11, 14, 16, 47], despite the many studies that have been conducted in the context of IDS. As a result, there has been a shift towards deep learning methods, which researchers have used to mitigate some of these issues. Even so, these approaches still struggle with false-positive rates, imbalanced data, and computational and resource limitations [1, 8, 28, 34, 47].

A large number of review papers have looked at different network intrusion detection methodologies, but most of them adopt a non-systematic approach and merely compare the approaches currently in use, without an in-depth analytical synthesis of the methodologies and results that would provide a comprehensive picture of the IDS state of the art. Anomaly-based IDS with deep learning models was the main emphasis of many of these investigations, whereas signature-based and hybrid-based (signature + anomaly-based) IDS received much less attention. This work examines the current methods for signature, anomaly, and hybrid-based IDS and, in contrast to earlier review studies, discusses their advantages, disadvantages, and perspectives. It also provides a thorough analysis of the most recent methods, datasets, attack types, and performances. The novelty of this review paper lies in its in-depth synthesis and analysis of the various existing solutions, which gives more guidance for informed choices, together with a summary of the pros and cons of the various categories of algorithms and datasets and a vision of the future evolution of these algorithms. Additionally, previous review papers failed to review hybrid-based (signature + anomaly-based) IDS; this gap is filled by the current study.

The remainder of the paper is structured as follows: Sect. 2 discusses related work. Section 3 discusses the study's methodology, which includes the review protocol, research questions, search strategy, selection of records, inclusion and exclusion criteria, quality assessment, and data extraction. Section 4 covers the study selection, publication distributions, and discussion of results and findings. Section 5 discusses challenges and future directions, and Sect. 6 draws conclusions and perspectives.

2 Related work

IDS was thoroughly evaluated by [25] with a focus on the signature and anomaly-based approaches, but a systematic literature review approach was not used, and there were no in-depth analyses or comparisons of the methods, datasets, attacks, or performances of the techniques. Similarly, [51] used a systematic strategy to review hybrid-based IDS, but its hybridization focused on integrating techniques rather than approaches, and there was no synthesis or comparison of the performances of the techniques. Likewise, [57] examined the literature on intrusion detection systems with a focus on the IoT environment, but it covered neither the hybrid approach nor an analysis and synthesis of the techniques.

Intrusion detection in the network environment was reviewed in a different study by [37]. The research used a conventional methodology that merely compares different methodologies without any in-depth synthesis or analysis. Finally, systematic studies of anomaly-based IDS were conducted by [41, 63]; however, they mainly focused on deep learning approaches and did not go into detail about hybrid or signature-based IDS. Therefore, this study is conducted to fill the gaps in these publications and to describe the state of the art in signature-based, anomaly-based, and hybrid-based IDS. Table 1 compares the most recent review studies.

Table 1 Comparison with related reviewed papers

Paper          Year  Systematic study  Signature-based  Anomaly-based  Hybrid-based
[25]           2021  No                Yes              Yes            No
[37]           2021  No                No               Yes            No
[41]           2021  Yes               No               Yes            No
[51]           2022  Yes               Yes              Yes            No
[57]           2022  Yes               Yes              Yes            No
[63]           2022  Yes               No               Yes            No
Current Study  2022  Yes               Yes              Yes            Yes

3 Research methodology

The study followed a Systematic Literature Review (SLR) approach in identifying, evaluating, and interpreting all the relevant parts of the research studies that address specific objectives. This SLR adheres to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) standards [84] as well as the standards established for conducting systematic reviews in the field of software

engineering [85, 86]. In carrying out the SLR, the study followed the systematic procedures described below.

3.1 Literature review planning protocol

For this review, the following review planning protocol was taken into account: review background definition, determination of the need for an SLR, identification of research questions, search strategy, quality assessment, data extraction, selection criteria for studies, and data synthesis.

3.2 Research questions

Q1 What methods or techniques are used for Signature, Anomaly, and Hybrid-Based IDS?
Q2 What datasets are used for evaluating the Signature, Anomaly, and Hybrid-Based IDS?
Q3 What are the performance metrics used in evaluating Signature, Anomaly, and Hybrid-Based IDS?
Q4 What types of attacks are detected in Signature, Anomaly, and Hybrid-Based IDS?
Q5 What is the performance level of the techniques employed in Signature, Anomaly, and Hybrid-Based IDS studies?
Q6 What are the challenges in Signature, Anomaly, and Hybrid-Based IDS?

3.3 Search strategy

The keywords used to build the search strings were chosen based on terms that are frequently used in the literature and terms that are relevant to this investigation. For the execution of the SLR, specific keyword strings were mapped out and applied for each database, as follows:

("Network Intrusion Detection System" OR "NIDS" OR "Intrusion Detection System" OR "IDS") AND ("Signature-Based" or "Misuse Detection")
("Network Intrusion Detection System" OR "NIDS" OR "Intrusion Detection System" OR "IDS") AND ("Anomaly-Based" or "Anomaly")
("Network Intrusion Detection System" OR "NIDS" OR "Intrusion Detection System" OR "Network Intrusion Detection System" OR "NIDS" OR "IDS") AND ("Machine Learning" or "Deep Learning" or "Artificial Intelligence")
("Network Intrusion Detection System" OR "NIDS" OR "Intrusion Detection System" OR "IDS") AND ("Rule-Based" or "Pattern Matching")

The literature for anomaly-based IDS covers 2018 to 2022, while that for signature-based and hybrid-based IDS covers 2014 to 2022. No filters were applied for publication type or language while retrieving the records from the databases.

3.4 Selection of records

The initial record selection step excluded duplicate copies, and the remaining records were then screened by title and abstract to exclude lecture notes, reports, books, and other sources aside from journals and conference proceedings. The selection was therefore restricted to works published in reliable journals and conference sessions. Primary records were filtered to produce secondary records using eligibility (inclusion and exclusion) criteria developed in response to the study questions. These criteria were used to ensure that the data analysis only contained relevant research.

3.5 Inclusion criteria

IC1. Articles related to Network Intrusion Detection System
IC2. Relevant content to Signature, Anomaly, Hybrid-Based IDS
IC3. Publication of articles in peer-reviewed journals.
IC4. Accessible research articles.

3.6 Exclusion criteria

EC1. Research Articles dated before the year 2014 for the Signature and Hybrid-Based
EC2. Articles dated before the year 2018 for the Anomaly


EC3. Articles not related to Network-Based IDS
EC4. Research articles published in predatory journals
EC5. Inaccessible articles.
EC6. Articles that do not present experimentation results

3.7 Quality assessment (QA) of the selected eligible records

The best articles for the study were chosen based on a set of five QA questions, which are as follows:
QA1. Does the paper's topic connect to network-based IDS?
QA2. Is there a clear explanation of the background problem the study has addressed?
QA3. Is the research approach properly described and interpreted?
QA4. Does the publication include the findings of its experiments?
QA5. Does the paper outline prospects for additional future work?
Each QA question is attributed a score of 1, implying that papers that fulfill all five (5) QA criteria have a total score of five (5). In scoring each paper against the five (5) defined criteria, two conditions determine whether the paper is regarded as a quality paper and considered for the study:
C1. The paper must score a total of 3 points or more
C2. The paper must fulfill criteria QA1, QA3, QA4.

3.8 Data extraction and synthesis of the review paper

The survey was conducted on April 18, 2022. Seven (7) well-known databases with a scientific focus were used for the literature search. A total of 776 papers were retrieved; 705 of those were rejected and excluded using the exclusion and quality assessment criteria, and the 71 papers that passed the quality evaluation were chosen for this study (Fig. 1). Significant information was extracted from each of the chosen studies, including authors, publication year, methodology, datasets, performance measures, strengths, and limitations. The information was then combined to conduct a detailed analysis of the issues with intrusion detection in networks. The fields used for data extraction are listed below:
D1. Employed Methods or Techniques
D2. Datasets employed in the evaluation of the system
D3. The desired (output) predictions (Performance Metrics)
D4. Strength and Limitations.

Fig. 1 Study selection flowchart

Table 2 Publication distribution among databases

Online databases (Publishers)  Publication
Science Direct                 45
Springer Nature                7
IEEExplore                     6
MDPI                           4
Hindawi                        2
PeerJ                          1
Taylor & Francis               1
Others                         5
Total                          71

4 Analysis and result

4.1 Distribution among online databases, journals, and conferences

The publication distribution among databases, journals, and conferences is depicted in Tables 2 and 3. The distribution of publications by IDS approach and by year is shown in Figs. 2 and 3, respectively.
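The selection arithmetic of Sect. 3.8 and the quality-assessment rule of Sect. 3.7 can be made concrete with a short sketch. The Python below is illustrative only: the function name `passes_quality_assessment` and the dictionary layout are assumptions introduced here, not part of the review protocol, and the per-publisher counts are copied from Table 2.

```python
# Illustrative sketch; names and data layout are assumptions, not the authors' tooling.

QA_QUESTIONS = ("QA1", "QA2", "QA3", "QA4", "QA5")
MANDATORY = ("QA1", "QA3", "QA4")  # criterion C2
MIN_SCORE = 3                      # criterion C1


def passes_quality_assessment(answers):
    """Return True if a paper meets both C1 and C2.

    `answers` maps each QA question to True (fulfilled, score 1) or
    False (score 0); questions missing from the mapping count as 0.
    """
    score = sum(1 for q in QA_QUESTIONS if answers.get(q, False))
    meets_c1 = score >= MIN_SCORE
    meets_c2 = all(answers.get(q, False) for q in MANDATORY)
    return meets_c1 and meets_c2


# Counts reported in Sect. 3.8 and in Table 2.
RETRIEVED = 776
EXCLUDED = 705
PER_PUBLISHER = {
    "Science Direct": 45, "Springer Nature": 7, "IEEExplore": 6, "MDPI": 4,
    "Hindawi": 2, "PeerJ": 1, "Taylor & Francis": 1, "Others": 5,
}

if __name__ == "__main__":
    included = RETRIEVED - EXCLUDED
    print(included)                                 # 71
    print(sum(PER_PUBLISHER.values()) == included)  # True
    # Fulfils the three mandatory questions, so the score is 3.
    print(passes_quality_assessment({"QA1": True, "QA3": True, "QA4": True}))  # True
    # Scores 3 but misses QA3 and QA4, so C2 fails.
    print(passes_quality_assessment({"QA1": True, "QA2": True, "QA5": True}))  # False
```

Under this reading, C2 already implies C1: a paper that fulfills QA1, QA3, and QA4 scores at least 3, so the score threshold matters mainly for reporting.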


Table 3 Publication distribution of journals and conferences

Title of Journals  No
Journal of Computer & Security  11
Journal of Computer Networks  5
Journal of Computer Communications  3
ICT Express  3
Knowledge Based Systems  3
Advancing Technology for Humanity  5
Expert Systems with Applications  3
Computer & Electrical Engineering  2
Neural Computing and Applications  1
Future Generation Computer Systems  1
Engineering Application of Artificial Intelligence  1
Machine Learning with Application  1
Scientific African  1
Applied Intelligence  1
Cluster Computing  1
Cyber Security  1
Microprocessor & Microsystems  1
Journal of Visual Communication & Image Representation  1
Computer Networks and Communication  1
Security & Communication Networks  1
Journal of Network and Computer Application  1
Journal of Computational Science  1
Journal of King Saud Uni. Computer & Info. Science  1
International Journal of Computing & Digital System  1
Sensors  1
Evolutionary Intelligence  1
Electronics  1
Mathematics  1
PeerJ Computer Science  1
Journal of Information Privacy and Security  1
Soft Computing in Data Analytics  1
Preprints  1
IJITEE  1
Total  60

Titles of Conferences  No
Materials Today: Proceeding  3
Procedia Computer science  2
Frontiers and Advances in Data Science  1
Global Transition Proceedings  1
Artificial Intelligence and Evolutionary Computations in Engineering System  1
Computer Science and Electronic Engineering Conference  1
Advances in Intelligent Info. Hiding & Multimedia Signal Processing  1
International Conference on Information Technology  1
Total  11

Fig. 2 Distribution of studies by type of IDS approaches

Fig. 3 Distribution of selected studies by year

4.2 Result distribution of anomaly-based intrusion detection system

Accuracy, recall, precision, false-positive rate (FPR), also known as false alarm rate (FAR), F-measure, false-negative rate (FNR), and true positive rate (TPR) are the most often used metrics in the evaluation of anomaly-based IDS (Fig. 4). In terms of performance, [43] report accuracy, precision, and recall all close to 100%, with an almost nil false alarm rate, using a model based on CNN and AE. AE algorithms can find anomalies by using both linear and nonlinear dimensionality reduction. AE can likewise handle the high-dimensional, unstructured, and diverse data produced by network traffic. Additionally, the training stage of AE requires reconstructing the input data from versions that have been largely corrupted. One characteristic of CNN is that it independently learns the key aspects of each class, which makes it an appealing technique because it automatically detects relevant features [1–5]. Nonetheless, hyperparameters such as stride, kernel size, padding, and the number of filters may affect how well CNN performs. Integrating AE and CNN will probably reduce the dimensionality of the data and produce promising performance outcomes. However, the model's processing and training phases can take too long, resulting in high computational complexity and resource consumption. In addition, the model concentrated on a small number of


attack types. In a related study [1], a model for IDS was created using the CNN technique on the UNSW-NB15 dataset, and it had a 94.4 percent accuracy rate. The model's performance was severely degraded because it failed to account for the class imbalance problem in the data, and the detection rate was appreciably low and insufficient to properly detect a zero-day attack. When [10] applied ANN to the CSE-CIC-IDS2018 dataset, they obtained similar findings. They had a low false alarm rate and 99.9% accuracy, recall, and precision. The model, however, only takes a few different attack types into account, and training and preprocessing take too long.

Fig. 4 Frequency distribution of performance metrics

[58] proposed a hybrid approach utilizing PCA-GWO and an AE classifier using the Kaggle dataset. The proposed model achieved an accuracy of 99.9%, a sensitivity of 95.4%, and a specificity of 100%. The methodology, though, is resource-intensive, and multiclass classification was not taken into account. The KDDCup99 dataset was used to develop an Information Entropy Deep Belief Network (IE-DBN) model [12] for network IDS. Information entropy (IE) was used to calculate the number of hidden neurons in the DBN network and its depth, while information gain (IG) was used to reduce the feature dimensionality and remove extraneous features. The synthetic minority oversampling technique (SMOTE) algorithm was also used to address the problem of data class imbalance. The obtained results of 98.76% detection accuracy and 0.76 false alarm rate demonstrate that utilizing IE and IG with DBN increases the model's learning efficiency and detection accuracy while lowering the network's computational cost and FAR. However, the study found reduced detection accuracy for the minority attack classes on larger datasets, even with the SMOTE oversampling technique applied to correct the class imbalance in the data. Additionally, the researchers have not considered testing their algorithm against modern attack types or employing metrics like accuracy and false warning rate.

[47] created a model called MSCNN-LSTM, which combines Long Short-Term Memory and a Multiscale Convolutional Neural Network. MSCNN and LSTM were used to examine the spatial and temporal components of the dataset, respectively. The model employed the spatial-temporal features to obtain the classification. The system was tested using the UNSW-NB15 dataset. According to experimental results, the MSCNN-LSTM model performs better than traditional neural networks, with an accuracy of 95.6%, a false negative rate of 1.6%, and a false alarm rate of 9.8%. Nevertheless, it has a high computational time compared with other models. Additionally, the model neither accounted for rare attacks nor addressed the issue of imbalanced datasets.

[44] proposed the Auto-Encoder Intrusion Detection System (AE-IDS), a lightweight online approach. Rather than a labeled dataset, the model was trained on actual traffic. To select the most important features effectively, the method used Random Forest. The selected features were then divided into multiple subsets using the Affinity Propagation (AP) algorithm to achieve feature grouping. This was done in order to identify the highly correlated features that can improve the expressive ability of AE. Then, AE processes the incoming traffic and calculates the Root Mean Square Error (RMSE), whose value is the criterion for determining whether or not the network traffic is normal. KMeans and the Gaussian Mixture Model (GMM) were also deployed at this point. The issue of class imbalance was also addressed using AE. According to experimental results, the online AE-IDS was able to reduce computational complexity through feature selection and feature grouping. Additionally, it performs better than some offline methods and other traditional machine learning approaches. Network attacks have evolved to be more encrypted and, as a result, blend in with regular network traffic, which makes it necessary to take other data into account, such as system logs, in order to have an IDS that can effectively defend against these malicious activities. Therefore, simply extracting features from network traffic is insufficient to ensure a good IDS model. Additionally, recall and computing time were the only metrics used to assess the model.

[8] provided an IDS architecture based on SVM and naive Bayes. The original features were transformed using the naive Bayes method to provide new, high-quality data, and the SVM classifier was trained on the transformed data to produce the intrusion detection model. For the datasets UNSW-NB15, CICIDS2017, NSL-KDD, and Kyoto


2006+, experimental results indicated good accuracy of 93.75%, 98.92%, 99.35%, and 98.58%, respectively. The approach also fared better than other methods in terms of detection rate and false alarm rate. However, only binary cases were taken into account for intrusion detection, so the model has to be expanded to include scenarios with other attack types. Additionally, the methodology did not adequately address the class imbalance.

[14] proposed an IDS model based on the stacked denoising autoencoder extreme learning machine (SDAE-ELM) and Deep Belief Network Softmax (DBN-Softmax). The dataset's features were first learned using SDAE, whose output was then used as an input to the ELM method, where the features were adjusted to create the trained model. Finally, the test data were applied to the SDAE-ELM model for intrusion detection. The entire network is assumed to be a stack of Restricted Boltzmann Machines (RBMs) because each layer in the DBN phase is an RBM. Unsupervised training was used to train the entire network layer by layer, and the Softmax classifier was then employed with the Back Propagation (BP) technique in the DBN model to classify the data in order to identify the type of intrusion and improve accuracy. Both the ELM method and the Softmax classifier were used to optimize the SDAE and DBN models. According to experimental findings, both models outperformed more conventional machine learning models in terms of detection accuracy at both the binary and multiclass classification levels. Although the SDAE-ELM model demonstrates effectiveness, its detection performance on a limited dataset is subpar. Additionally, the DBN-Softmax model is unreliable for real-time detection and takes too long to train on large datasets.

[35] presented an ensemble system that combined multiple modified adaptive boosting with area under the curve (M-AdaBoost-A) based classifiers into an ensemble by using strategies like particle swarm optimization. The system was designed to detect network intrusions successfully while addressing the issue of class imbalance. The model did not call for data preprocessing such as feature selection or variable scaling, which prevented loss of information in the data. The model achieved an accuracy of 0.99 (99%) and a false positive rate of 0.00.

An innovative network IDS algorithm is presented in [3]. A GA with a KNN fitness function was used to select an improved feature subset, and Fuzzy C-Means Clustering (FCM) was then used to enhance the selected features. In order to create a high-quality deep feature subset that can learn most attack types across multiple layers, CNN was also used as an extractor. Finally, BG was applied as a classifier with 5-fold cross validation. The results revealed that CNN, GA, and FCM were able to create a high-quality feature set, while CNN and BG together greatly improved detection performance, with a 98.2% detection rate, a 0.5 false alarm rate, and a 95.4% true positive rate. However, the GA and CNN used to identify the pertinent features made the proposed model time-consuming. For the real simulation, repeating the 5-fold cross validation also required a lot of time. In addition, data imbalances that skew the outcomes in favor of the dominant class are a possibility when computer networks are deployed in the real world.

In an effort to address the problem of high false positive rates and poor detection rates in detecting intrusions, [87] developed an IDS model that utilizes Multi-Layer Perceptron Neural Networks (MLP-NNs) and Multi-Objective Genetic Algorithms (MOGA). The proposed method achieved a detection accuracy of 97% and a false positive rate of 2% on the NSL-KDD dataset. The approach's major flaw is that it requires iterative calculations and takes a long time. Because the technique was also evaluated using very small subsets of the dataset, a more realistic dataset should be used to illustrate how it may be used.

In order to recognize attacks, [18] combined the Recursive Feature Elimination (RFE) technique with a variety of machine learning methods, including Decision Tree, Support Vector Machine, and an Ensemble Classifier Random Forest in the form of discriminant analysis. To reduce the dimensionality of the KDD CUP99 dataset, RFE was used to remove redundant or extraneous features. The findings showed that the proposed strategy achieved good classification rates for all classes of attacks. A comparison of the three classification techniques before feature selection showed that Random Forest outperformed SVM. However, following feature selection, SVM outperformed Random Forest and Decision Tree.

[15] examined five machine learning methods for recognizing various kinds of network attacks using the UNSW-NB15 dataset: Random Forest, Decision Tree, Logistic Regression, K-Nearest Neighbors, and Artificial Neural Networks. Among the classifiers, Random Forest had the highest accuracy, scoring 89.29%. After the Synthetic Minority Oversampling Technique (SMOTE) was applied to address the problem of class imbalance, there was evidence of further improvement in classification accuracy, with the Random Forest classifier achieving the highest accuracy of 95.1 percent with 24 features selected by the Principal Component Analysis method. However, class balancing had no benefit for the LR and ANN classifiers and instead lowered their accuracy. A summary of the analyzed works is given in Table 4.


Table 4 Techniques, datasets, and performance of anomaly-based studies models

| Paper | Year | Classifier | Dataset | Metrics | Pros | Cons |
| [1] | 2021 | CNN | UNSW-NB15 | AC: 95.6; P: 97.9 | Demonstrates a substantial advancement in the multiclass paradigm for identifying new assaults | Class imbalance issues lead to a considerable model degradation and low detection rate for the underrepresented classes |
| [10] | 2019 | ANN | CSECIC-DS2018 | AC: 99.9; P: 99.9; R: 99.9; FM: 99.9; FAR: 0.03 | High rate of detection and low percentage of false positives | Time-consuming to train and process, only detects a few types of assaults |
| [7] | 2021 | NB + GA | NSL-KDD | AC: 99.7; P: 99.1; FM: 47.1 | Higher processing speed. Low training time | Low rate of detection of U2R and R2L attack types. Low f-score. Dataset used does not reflect modern day attack types |
| [58] | 2020 | AE + PCA + GWO | NA | AC: 99.9; TNR: 99.9 | High detection rate for binary classification. Low computational complexity | No multiclass classification was performed, as a result could not show its capability at detecting various types of attacks |
| [12] | 2021 | IE-DBN + SMOTE + IG | KDDCUP99; NSL-KDD | AC: 98.1; AC: 98.7 | An improved learning efficiency and detection accuracy leading to reduced computational cost and low false positive | Low detection rate of the minority class of attacks |
| [33] | 2018 | ABC; AFS, CART; FCM + CBFS | NSL-KDD; UNSW-NB15 | P: 99.0 | High detection rate with low false positives | High overhead with respect to computational complexity and time cost. Low processing speed |
| [13] | 2019 | DNN; LSTM-RNN; DBN + DoublePSO | NSL-KDD; CICIDS2017 | AC: 99.7; P: 99.8; R: 99.8; FM: 99.8; FAR: 0.23; TNR: 99.7 | High detection rate with significant reduction in the false positive when Double PSO was combined with DBN | Low detection rate of the underrepresented classes due to class imbalance in dataset. Huge computational complexity |
| [2] | 2021 | DNN; CNN; XGBoost; LSTM (LIO) | NSLKDD; CIDDS001; CICIDS2017 | AC: 87.0; P: 87.0; R: 83.0 | Addressed class imbalance problem in data. Minimal expended time | Low detection rate |
| [34] | 2020 | Ensemble of Discriminant Approach | KDDCUP99 | AC: 98.9; P: 99.7 | Yields good detection accuracy for different types of attacks | Dataset used does not reflect contemporary attacks. Also, class imbalance problem was not addressed |
| [20] | 2018 | SVM + GA | CICIDS20; ADFA-LDWMN | AC: 99.8; P: 99.4 | Demonstrates the potential of detecting attacks with less computational complexity and less communication overhead | The dataset used to train the model only contains a small number of attack types, which could lead to misclassification when employed on networks with more attack types |
| [22] | 2019 | DT-SVM Ensemble | NSL-KDD | AC: 99.9; P: 99.9; FAR: 0.11 | High detection rate | High False Positive |


Table 4 (continued)

| Paper | Year | Classifier | Dataset | Metrics | Pros | Cons |
| [35] | 2020 | M-AdaBoost; SVM Ensemble + PSO | | TPR: 99.0; AC: 99.0; P: 88.0; R: 95.0; FM: 91.0; FAR: 0.01 | High detection accuracy. Low false positive | High computational complexity |
| [82] | 2020 | PIO | KDDCUP99; NSL-KDD; UNSW-NBS | TPR: 97.0; AC: 94.7; FM: 88.2; FAR: 9.7 | Significant reduction of features to select most important subset leading to high detection accuracy, low false positive, and reduction in training time | Could not detect all types of attacks |
| [23] | 2021 | OC-SVM; PIO | KDDCUP99; NSLKDD; UNSWNB15 | AC: 99.7; P: 99.8; FM: 99.2; FAR: 0.01 | Lightweight NIDS. High detection rate with low false positives | No multiclass classification was carried out to show the effectiveness of detection against different types of attacks |
| [24] | 2020 | SVM | | AC: 96.2 | Low false alarm rate | Low detection accuracy |
| [83] | 2021 | PABCE; KSIDO | N/A | FNR: 90.0; P: 88.0 | High detection accuracy for various attack types with minimized time and space complexity | There is no information on multiclass classification that demonstrate how well the model can recognize different types of attacks |
| [87] | 2019 | AMGA2 (MOGA) | NSLKDD | P: 88.0; FAR: 0.02 | Yield good accuracy in detection of anomalies | Requires iterative computations, and takes a lot of time, with high false positives |
| [55] | 2021 | O-CNN; HM-LSTM; LSO | NSLKDD; ISCXIDS; UNSWNB15 | FNR: 5.7; AC: 96.3; P: 99.9; R: 95.8; FM: 98.1; FAR: 5.8 | Effective performance in detecting anomalies | High computational complexity |
| [26] | 2021 | LNNLS-KH | NSL-KDD; CICIDS2017 | AC: 98.2; P: 97.7; FAR: 1.18 | An improved performance in accuracy and optimal fitness iteration curve and convergence speed with low false positive | Huge training time |
| [16] | 2018 | C4.5; BN; FA | KDDCUP99 | P: 90.0; FM: 77.3; FAR: 0.02 | Indicated that 10 features is sufficient to build an improved classifier for IDS with high detection rate | High computational time |
| [47] | 2019 | MS-CNN + LSTM | UNSWB15 | FNR: 1.6; AC: 95.6; FAR: 0.09 | High overall detection accuracy with low false positive | High computational time, and Poor detection performance on imbalanced datasets |
| [44] | 2020 | AE | CSECICIDS2018 | R: 99.7 | Reduced computational complexity through feature selection and feature grouping. Addressed class imbalance problems | Low detection rate |


Table 4 (continued)

| Paper | Year | Classifier | Dataset | Metrics | Pros | Cons |
| [8] | 2021 | SVM + NB | UNSW-NB15; CICIDS2017; NSLKDD; Kyoto 2006+ | AC: 93.7; P: 99.4; FAR: 3.0 | High detection rate. Low false positives | Class imbalance in dataset led to low detection of the minority class of the attacks |
| [28] | 2021 | TS-RF | UNSW-NB15 | AC: 83.1 | Reduced feature space/vector, therefore resulted in low computational complexity. Improved detection rate for low sample attacks. Low false positive | Increased misclassification rate among different attacks types as a result of class imbalance problem in dataset |
| [14] | 2021 | DBN-SoftMax; SDAE-ELM | KDDCUP99; ADFA; NSLKDD; UNSW-NB; CIDDS-001 | TPR: 0.04; AC: 78.0; P: 96.0; R: 64.0; FM: 76.0; FAR: 0.06 | High detection accuracy for binary and multiclass classification | Poor detection ability on small dataset. High training time |
| [69] | 2021 | Light GBM Ensemble; ADASYN Technique | NSLKDD; UNSW-NB15; CICIDS2017 | AC: 99.9; FAR: 0.01 | Application of ADASYN addressed the problem of imbalanced dataset, and led to an improved detection rate of the minority classes, with low computational complexity | False positive is relatively high |
| [70] | 2020 | Adaboost | KDDCUP99; UNSW-NB15; TRABID; NSLKDD; CICIDS2017 | FAR: 0.02 | Modest Adaboost has higher processing time but lower performance and higher error rate compared to Gentle and Real Adaboost | Gentle and Real Adaboost present good accuracy for binary classifications of intrusions |
| [79] | 2019 | PSO; ACA; GA | NSLKDD; UNSW-NB15 | TPR: 91.3; AC: 91.2; P: 91.6; FAR: 8.9 | Enhanced specificity, precision and accuracy | Performance improvement for only binary classification |
| [18] | 2021 | DT; SVM; En-RF; RFE | KDDCUP99 | AC: 98.5; P: 95.0; R: 95.5; FM: 95.0 | Good representation of dimensionality reduction leading to good classification rate for all the classes of attacks | Class imbalance in dataset led to low classification accuracy of the minority attack classes |
| [45] | 2021 | SAE + DNN | KDDCUP99; NSL-KDD; UNSW-NB15 | AC: 99.9; P: 99.9; R: 99.9; FM: 99.9; FAR: 0.17 | High detection accuracy rate | Fair representation of feature learning and dimensionality reduction |
| [81] | 2021 | GB-GWO | NSL-KDD | AC: 98.6 | Enhanced accuracy towards attack detection | High time complexity |
| [17] | 2022 | DT, GBT, MLP, AdaBoost, LSTM, GRU | UNSWNB15; Network TON-IOT | AC: 99.0; R: 99.8; FM: 99.8 | DT with the feature selection technique gave a promising result in detecting anomalies accurately when compared with other techniques | The model did not extend its capability towards reflecting different types of attacks detected nor time expended |


Table 4 (continued)

| Paper | Year | Classifier | Dataset | Metrics | Pros | Cons |
| [3] | 2020 | CNN; BG; FCM, GA | NSLKDD | AC: 98.2; FAR: 0.5 | Produced high quality feature set. Improved detection accuracy | High computational complexity. Imbalanced data which led to biased classification towards the majority class |
| [30] | 2021 | RF | | AC: 99.9; P: 99.9 | High detection rate for binary classification | Could not effectively detect various types of attacks |
| [54] | 2021 | RCNN + LSTM | DARPA; CSE-ICIDS2018 | AC: 93.8; P: 99.9; R: 80.5; FM: 89.2 | Detects various kinds of attacks with minimal error rate | High false positive rate, and low detection accuracy |
| [4] | 2020 | CNN | NSLKDD | AC: 94.0; P: 95.3 | Detect anomalies of network traffic with huge semantic coding space | Low performance in anomaly detection for network traffic with low semantic space |

TPR: True Positive Rate; FNR: False Negative Rate; AC: Accuracy; P: Precision; R: Recall; FM: F-Measure; FAR: False Alarm Rate; TNR: True Negative Rate
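
The abbreviations in the Table 4 legend can be related to one another through a confusion matrix. The following is a minimal sketch under the usual textbook definitions, treating "attack" as the positive class; the reviewed studies may compute some metrics differently (for example per class or averaged). The function name `ids_metrics` and the example counts are hypothetical, and the sketch assumes every denominator is non-zero.

```python
def ids_metrics(tp, fp, tn, fn):
    """Return the Table 4 legend metrics as fractions in [0, 1].

    tp/fn: attack samples detected/missed; tn/fp: normal samples
    accepted/flagged. Assumes every denominator is non-zero.
    """
    recall = tp / (tp + fn)        # R, identical to TPR
    precision = tp / (tp + fp)     # P
    return {
        "TPR": recall,
        "FNR": fn / (tp + fn),     # missed attacks
        "AC": (tp + tn) / (tp + fp + tn + fn),
        "P": precision,
        "R": recall,
        "FM": 2 * precision * recall / (precision + recall),
        "FAR": fp / (fp + tn),     # normal traffic raised as alarms
        "TNR": tn / (fp + tn),
    }


# Hypothetical counts, for illustration only.
for name, value in ids_metrics(tp=90, fp=5, tn=95, fn=10).items():
    print(f"{name}: {100 * value:.1f}")
```

Under these definitions FAR and TNR sum to one, as do TPR and FNR, which is a quick consistency check on any reported row that gives both members of a pair.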

4.2.1 Machine learning (ML) and deep learning (DL) techniques for anomaly-based IDS

ML and DL algorithms are used by IDS that take an anomaly-based approach for the IDS's training and
prediction stages [88]. In this section, the various ML and DL techniques utilized in anomaly-based
network IDS are described in general terms. Table 5 offers a brief description of ML and DL
techniques, their advantages and disadvantages, and references to related research.

4.2.1.1 Naïve Bayes (NB) NB, a simple and highly scalable probabilistic classifier [89], applies
Bayes' Theorem to calculate the probability of an event occurring based on previous observations of
related events [90]. It functions well during the phases of training and categorization. NB assumes
every feature to be equally significant and independent. NB is used in intrusion detection to
forecast the likelihood that a sample belongs to the normal class or an attack class based on the
prior data. NB calculates the posterior probability before classifying unlabeled traffic as normal
or anomalous. A set of variables from the observed traffic, such as status flags, protocols, and
latency, is used to estimate the probability that the observed data is normal or anomalous. Since
the NB classifier is a simple and easy-to-implement algorithm, many IDSs have employed it to
recognize anomalous traffic [7–9]. It requires only a limited amount of training data and can
classify using binary or multiple labels [91]. However, it does not take feature interdependencies
into account while classifying data, which lowers the method's accuracy [92].

4.2.1.2 K nearest neighbour (KNN) To classify a test sample, a KNN supervised machine learning
classifier first finds the classes of the k training samples that are most similar to it. The
majority class among these closest samples is returned as the prediction for the test sample
[93, 94]. The goal of these methods is to classify an unlabeled data sample based on its k closest
neighbors. KNN is a non-parametric method. The distance between neighbors is determined using the
Euclidean formula [93]. The underlying principle of the KNN classification method is that new data
instances are classified into previously observed classes according to their relative proximity to
each class. Some studies [9, 15] on network intrusion detection have used KNN-based classification,
which offers a decent level of accuracy in identifying attacks.

4.2.1.3 Decision tree (DT) Decision Trees (DTs) first extract features from the samples in a dataset
before organizing an ordered tree according to the value of a feature. Each feature is represented
by a node in the tree, and the branches originating from each node indicate the corresponding
values of that feature. The feature node that best divides the tree into two sections serves as the
tree's root node [95]. The root node that splits the training dataset most effectively is
determined using a number of criteria, including the Gini index [96] and Information Gain [97].
DT algorithms build a model, then perform the classification using the induction and inference
procedures [98]. A trained DT selects multiple elements from a packet to identify its class. An
ideal DT holds the most data with as few levels as possible [97]. Many methods have been proposed
to produce optimal trees, including Iterative


Table 5 Strengths and limitations of ML and DL based methods

| Paper | ML/DL Methods | Pros | Cons |
| [7–9] | NB | Few samples are required for training. It can perform binary and multi-label classification. It exhibits robustness towards irrelevant features | Interdependencies between features are not taken into account for classification purposes, which affects its accuracy |
| [9, 15] | KNN | It is simple to use | It is quite challenging to determine the optimal value of K and identify missing nodes. Also, prediction computation can be expensive |
| [15–19] | DT | It is simple and easy to use | Huge storage is required. High computational complexity. It is easier to use only when few DTs are used. Another difficult concern in DT is over-fitting |
| [20–24] | SVM | Highly scalable for its simplicity. Capable of detecting network anomalies in real-time, works well with online learning. It is suitable for data with large number of feature attributes. Lesser storage is required | The optimal kernel function used in separating data when it is not linearly separable remains an obstacle towards achieving desired classification speed. In the vicinity of the hyperplane, SVMs are sensitive to noise |
| [28–30] | RF | It results in a more robust and accurate result that is resistant to overfitting. Fewer inputs are required and process of feature selection is not required | The use of RF may be impractical in real time applications that require large dataset, this is because RF constructs several DTs |
| [12–14] | DBN | Suitable for vital feature extraction with training on unlabeled data | High computational costs |
| [14, 29, 34–36] | EL | It is robust to overfitting. Performs better than a single classifier. It reduces variance. Performs better in uncertain environments with a high number of features | Increased time complexity, due to the use of multiple classifiers in parallel |
| [13] | RNN | It is best suited in environments where data is to be processed sequentially | The problem of vanishing gradients, which hinders learning of long data sequences |
| [1, 4, 5, 42, 43] | CNN | It is suitable for effective feature extraction from raw data. It possesses good potential in network security application | High computational complexity. Consequently, it is difficult to implement them in a situation with limited resources |
| [43–46] | AE | It is very effective for feature extraction and dimensionality reduction | It is computationally heavy. It may yield undesired results when the training dataset is not representative of the testing dataset |
| [12] | RBM | Feedback function of RBM facilitates the extraction of discriminative features from large and complex datasets which are then used to capture the behaviour of the network traffic | Single RBM lacks the capability of feature representation. High computational resources |
| [2, 13, 47, 54, 55] | LSTM | Designed to address concerns with exploding or vanishing gradients | Training is expensive in terms of both time and resources |
| [14, 36] | ELM | Extremely quick rate of learning. For real-time retraining, they are intriguing | They are less reliable than conventional networks |
| [33] | FCM | It provides a more practical approach to addressing patterns and can identify potential outliers | The FCM can result in a local minimum |
| [6, 58] | PCA | Leads to a reduced training time | It is an attribute reduction technique that should be used in conjunction with additional prediction techniques to find incursions |


Table 5 (continued)

| Paper | ML/DL Methods | Pros | Cons |
| [2, 65] | XGBoost | Able to withstand overfitting, and full use of the computing resources | May struggle when dealing with outliers |
| [69] | LightGBM | Quick training and little memory utilization | Light GBM divides the tree along the leaves, which might cause overfitting because it results in trees that are more complicated. Since light GBM is susceptible to overfitting, it is likely to overfit little data |

Naïve Bayes (NB); K Nearest Neighbour (KNN); Decision Tree (DT); Random Forest (RF); Fuzzy C-Means (FCM); Support Vector Machine (SVM); Principal Component Analysis (PCA); Extreme Learning Machine (ELM); Convolutional Neural Network (CNN); Long Short-Term Memory (LSTM); Recurrent Neural Network (RNN); AutoEncoder (AE); Restricted Boltzmann Machine (RBM); Deep Belief Network (DBN); Ensemble Learning (EL)
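
The two split criteria named in the decision tree discussion, the Gini index and information gain, can be illustrated with a few lines of code. This is a minimal sketch using the standard textbook formulas, not the procedure of any particular reviewed study; the function names and the toy "normal"/"attack" labels are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of the class distribution."""
    n = len(labels)
    return -sum((count / n) * log2(count / n)
                for count in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of its children."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical node holding 4 normal and 4 attack samples.
traffic = ["normal"] * 4 + ["attack"] * 4
pure_split = [["normal"] * 4, ["attack"] * 4]          # separates the classes
useless_split = [["normal", "normal", "attack", "attack"],
                 ["normal", "normal", "attack", "attack"]]

print(gini(traffic))                             # impurity of the unsplit node
print(information_gain(traffic, pure_split))     # gain of the separating split
print(information_gain(traffic, useless_split))  # gain of the uninformative split
```

A tree builder would evaluate such a score for every candidate feature and place the feature with the best score at the root, then repeat the procedure on each child node.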

Dichotomiser 3 (ID3), C4.5 [97], and classification and regression tree (CART) [99]. There are many
measures available to assess DT performance. Because ID3 does not support features with missing or
continuous values and instead selects the features with the highest information gain (entropy), its
use is constrained. Because it offers a simple classification method, is simple to comprehend and
use, and frequently enables superior generalization through post-construction pruning, DT is a
common model in intrusion detection.

In the context of intrusion detection, DTs have the potential to be employed as classifiers
[73, 100]. The complexity of processing and growing storage requirements must be considered, though
[98]. The poor robustness of DT is another flaw, meaning that even small changes to the training
data could result in a completely different DT. Because information gain is biased in favor of
features with more levels, larger DTs might require manual pruning [101].

4.2.1.4 Support vector machine (SVM) A different type of classifier is the SVM, which creates a
hyperplane from the feature set of two or more classes. SVMs are particularly well suited for the
use case when classes with extensive feature sets need to be identified using smaller data samples
[102]. The splitting hyperplane is determined by maximizing the distance to the nearest data point
of each class. A kernel, such as a polynomial, linear, Gaussian radial basis function (GRBF), or
hyperbolic tangent kernel, can be used to obtain different kinds of separating hyperplanes. When
data must be categorized into normal and abnormal classes, SVMs, which are based on statistical
learning [103], are well suited to anomaly identification. SVMs are extremely scalable and capable
of real-time anomaly-based intrusion detection in addition to online learning because of their
simplicity [104, 105]. Another advantage of using SVM is that it uses less memory and storage. The
effectiveness of SVM-based IDSs in network environments was evaluated by several studies
[8, 20, 21, 23, 24], and it was discovered that SVM gave more accurate findings than DTs, NB, and
Random Forest. When the data cannot be separated linearly, SVM relies on a kernel function to
separate it, although choosing the optimal kernel still limits classification speed.

4.2.1.5 Random forest (RF) RF is a supervised machine learning method. An RF is built using numerous
DTs to produce classification results that are more accurate and error-resistant [106]. Randomly
generated DTs are trained, and the classification result is decided by majority vote [106]. Both
DTs and RF are classification techniques. However, whereas a DT creates a rule-set during training
for the subsequent classification of new samples, an RF creates a rule-subset using all of the
member DTs. This generates a more trustworthy and accurate output that is resistant to overfitting
with significantly fewer inputs and without the requirement for feature selection [107]. Numerous
studies [9, 28, 30] have found that RF is suitable for network anomaly intrusion detection.
Furthermore, according to other studies [15, 108], RF outperformed KNN, ANN, and SVM in networks
for DDoS detection because it required fewer input features and could do away with the
time-consuming calculations required for feature selection in real-time IDS [108].

4.2.1.6 Fuzzy C-means (FCM) FCM groups the data into clusters so that the data points with the
highest degree of similarity fall in the same cluster and the data points with the highest degree
of dissimilarity fall in separate clusters [109]. It also allows the assignment of a data point to
several classes with varying degrees of membership and provides further conceptual breakthroughs in
grouping. Because of this, it provides a more practical way of treating patterns and can identify
potential outliers [110].

4.2.1.7 Principal component analysis (PCA) PCA is a statistical technique that has been widely used
to pinpoint


the characteristics of future network traffic and reduce unnecessary or noisy resources. It attempts
to reduce the dimensionality of the data by eliminating the correlation between attributes, which
also cuts training time [110]. PCA selects a small number of uncorrelated features, known as
principal components, from a large number of features, since its goal is to capture the most
variance with the fewest principal components [111].

4.2.1.8 Extreme learning machine (ELM) The ELM is a single hidden layer feedforward neural network
that randomly chooses the hidden-layer biases and input weights and calculates the output weights
analytically [112]. In practice, this technique frequently provides good generalization performance
at a rapid learning rate. A unique learning approach (ELM) was proposed by [36] to overcome the
limitations of ANN. Experimental results confirm the ELM's high accuracy and time-saving benefits
for network intrusion detection.

4.2.1.9 Convolutional neural network (CNN) The CNN is a shared-weight artificial neural network that
was modeled after biological processes. It is based on convolutional kernels or filters. It was
developed to decrease the quantity of data inputs required for a conventional artificial neural
network (ANN) by utilizing equivariant representation, sparse interaction, and parameter sharing
[113]. Because it extracts local features, which also lowers the number of weights and the
computational complexity, the CNN is more scalable and requires less time to train and predict than
the DNN. CNN is best known for its exceptional performance in computer vision and digital image
processing, but more recently, it has also shown promising results in its application in IDS.

In the intrusion detection field, CNNs are frequently used to extract features from the raw data.
The three layer types typical of a CNN are the convolutional layer, pooling layer, and activation
unit. The convolutional layers convolve the data inputs using various kernels [114]. The pooling
layers downsample their input, which in turn shrinks the sizes of the following layers. Both
maximum pooling and average pooling are employed. After segmenting the input into several clusters,
max pooling selects the maximum value for each cluster in the preceding layer [115]. By comparison,
average pooling computes the average value for every cluster in the preceding layer. The activation
unit applies a non-linear activation function to each feature in the feature set [115]. CNN excels
in rapidly and effectively extracting features from raw data, but this requires a lot of processing
power [116].

4.2.1.10 Recurrent neural network (RNN) RNN is a DL technique that performs well while processing
data sequentially. In contrast to other neural networks, its output is reliant on back-propagation
rather than forward propagation [113, 117, 118]. Since a DNN has a hard time fitting data that
changes over time, the RNN was suggested as a fix. An RNN has a temporal layer for sequential data
analysis, followed by a learning phase where hidden recurrent component units are taught about
multi-dimensional differences [119]. In reaction to the data it encounters, the neural network then
modifies these hidden units, leading to continuing updates and the manifestation of the neural
network's present state.

By estimating upcoming hidden states as the activation of a previous hidden state, an RNN analyzes
the current hidden state of the neural network. Neurons of the preceding layer provide feedback to
neurons in the form of outputs from RNN functions. RNNs have long been important in areas like
natural language processing and action recognition [120], but recently they have gained more
relevance and use in IDS because the vast bulk of the data in this field consists of temporally
continuous data streams. It has been noted that this approach has produced useful findings,
especially for time series-based threats, and past studies [13, 78] have recommended RNNs for
network intrusion detection through the study of network traffic behavior. RNNs can be effectively
used to predict the temporal relationships between malicious conduct and security attacks. The
present and prior states of the features determine the likelihood of an attack, and RNNs can be
trained for this purpose using both historical and real-time inputs [121]. However, because RNNs do
not have a specific treatment for the activation function, the continued product of their partial
derivatives can easily lead to vanishing or even exploding gradients when the number of layers in
the network is considerable.

4.2.1.11 Long short-term memory (LSTM) LSTM network designs, a specific kind of RNN, were also used
in the creation of IDS. LSTM fixes the vanishing gradient problem that affects the conventional RNN
by introducing additional storage states [122]. The main feature of LSTM-based RNNs is their
capacity to store data or cell state for later usage in the network. This trait makes them
appropriate for examination of temporal data that evolves over time. The LSTM efficiently controls
the degree of gradient vanishing by using a gate function as the activation function to selectively
allow some information to pass through. Forget gates were introduced by [122] to the original LSTM
architecture to enable the LSTM to reset its state and simulate memory forgetting. The LSTM has
become one of the most common RNN types because it fixes the vanishing gradient problem that
affects traditional RNNs. Many IDS investigations employ


LSTM networks because of their superior performance for classification and prediction based on
time-series data and because their forgetting mechanism is better suited for the detection of data
streams [42, 47, 54]. However, due to the nature of RNNs [123], the basic LSTM architecture cannot
be trained in parallel, making the usage of LSTM-based models occasionally uneconomical.

4.2.1.12 AutoEncoder (AE) Auto-encoders are feed-forward ANNs whose input layer and output layer
have an identical number of neurons. An auto-encoder may have many hidden layers, and it is built
with the intention of reducing the difference between the input and the output. Each auto-encoder
also includes an encoder and a decoder, which respectively translate input data to the code and
reproduce the input data from the code. Auto-encoders specifically facilitate unsupervised learning
of dataset encoding for dimensionality reduction by instructing the network to ignore signal noise
[124]. One use for AE is the extraction of features from datasets. However, the demand for powerful
computation makes this difficult.

4.2.1.13 Restricted Boltzmann machine (RBM) The RBM uses an unsupervised learning approach to build
a deep, generative, and undirected model [125]. No two nodes in the same layer of an RBM are linked
together. Visible layers and hidden layers are the two types of layers that make up an RBM. While
the hidden layer, which is composed of several layers, holds the unknown latent variables, the
visible layer contains the known input parameters. Working in a hierarchical fashion, the
subsequent layer receives latent variables that reflect features obtained from a dataset. RBMs were
used in a number of research papers for network intrusion detection systems [126, 127]. RBM
implementation is challenging since it calls for considerable computing resources. Furthermore, a
single RBM does not allow for feature representation. This constraint can be overcome, though, by
using two or more RBM layers to build a Deep Belief Network (DBN).

4.2.1.14 Deep belief network (DBN) DBNs are stacked versions of RBMs that are generative
probabilistic models. In a DBN, each RBM's output functions as the subsequent RBM's input.
Furthermore, neurons in the DBN layers are connected to those in the layer above them, but not the
other way around. DBNs can fix ANN training problems and prevent problems such as training that is
excessively slow, needs a large training dataset, or falls into a local minimum. DBNs have proven
to be more effective than other ML algorithms in speech recognition, image identification, and
natural language processing. They are employed in a variety of industries, including intrusion
detection. These deep learning networks are efficient at classifying data and learning features
[128].

4.2.1.15 Ensemble learning (EL) Ensemble learning combines a number of base classifier models to
improve performance. The tendency for higher accuracy of an ensemble classifier reduces the
probability of selecting the wrong classifier. The result can be stabilized [129] by combining
their predictions. EL generates a majority vote for classification by integrating the results of
various classifiers and leveraging their advantages. Classification accuracy is increased by
combining the outputs of various homogeneous/heterogeneous classifiers [130, 131]. EL is based on a
study [132] that found that the application and relevant data affect how accurate an ML
classification method is. Among the ensemble strategies that can be used are bagging, boosting, and
stacking.

The bagging [133] method uses numerous versions of the base classifier to create an aggregate
classifier. The learning set is bootstrapped to produce fresh learning sets, each of which is then
used to train one version. The instability of the base classifier is a significant factor. If
changing the learning set can result in significant changes to the constructed classifier, bagging
can improve accuracy [133]. Minimizing bias and variance in supervised learning is the primary
objective of the ensemble learning technique known as "boosting". The objective is to convert base
classifiers into an aggregate classifier. Unlike bagging, this combination is accomplished by
having each classifier function inside the limitations of the preceding classifiers.

Therefore, later base classifiers are only put to the test or trained when the prior classifiers
failed to achieve a sufficient level of precision [129]. The stacking method considers a number of
simple heterogeneous models, trains them concurrently, and then combines them to produce a
metamodel that bases its predictions on a number of weak models. Since no ML method can be stated
to be a "one size fits all solution," EL-like combinations may be the most appropriate for general
applications in order to enhance accuracy through a decrease in variance and avoid overfitting
[132]. The gain in accuracy of EL comes at the cost of higher time complexity since numerous
classifiers are used in parallel [134, 135]. The efficiency of EL for intrusion detection has been
examined in a number of studies [19, 22, 29, 34]. The EL algorithm produced more precise and
superior results than each member classifier individually, according to studies on the viability of
EL under IDS [136, 137].
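
Two ingredients of the ensemble learning description, bootstrapping the learning set (bagging) and combining classifier outputs by majority vote, can be sketched as follows. This is an illustrative sketch only: the helper names, the fixed random seed, and the toy predictions are assumptions, and the base classifiers themselves are left out.

```python
import random
from collections import Counter


def bootstrap_sample(samples, rng):
    """Draw len(samples) items with replacement, as bagging does per version."""
    return [rng.choice(samples) for _ in samples]


def majority_vote(predictions_per_classifier):
    """Combine per-classifier label lists into one label per sample."""
    votes_per_sample = zip(*predictions_per_classifier)
    return [Counter(votes).most_common(1)[0][0] for votes in votes_per_sample]


# Hypothetical outputs of three base classifiers on the same three flows.
predictions = [
    ["attack", "normal", "attack"],   # classifier 1
    ["attack", "attack", "normal"],   # classifier 2
    ["normal", "normal", "attack"],   # classifier 3
]
print(majority_vote(predictions))

# One bootstrapped learning set drawn from a toy set of six sample ids.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
print(bootstrap_sample([0, 1, 2, 3, 4, 5], rng))
```

Using an odd number of base classifiers avoids ties in the binary case; with an even number, `Counter.most_common` resolves a tie in favour of the label it encountered first, which is an arbitrary choice in this sketch.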


Table 6 Techniques, datasets, and performance of signature-based studies models

| Paper | Year | Methodology | Dataset | Metrics | Pros | Cons |
| [31] | 2021 | FSM Based Regex Pattern Matching Algorithm | N/A | MU: 1.3Mb; TP: 593.4 | Memory Efficiency and High throughput rate | No clear description on the types of attacks detected or dataset employed |
| [39] | 2019 | Similarity & IP Blacklist | N/A | AC: 99.4; PT: 25 s | The ability to detect attacks more quickly was improved by the use of numerous smaller databases. Additionally, the use of an IP blacklist and similarity allowed for the discovery of new attack signatures | Both the datasets used and the sorts of attacks that were discovered lacked sufficient information, also the complexity of memory |
| [27] | 2016 | Combinatorial Algorithm | N/A | TPR: 96.0; AC: 96.5; FAR: 3.0 | High detection rate | High false positive rate, and did not cover wide range of attacks |
| [53] | 2019 | DT Models (C5, CHAID, CART, QUEST) | UNSWB15; RTNTP18 | AC: 83.8; DR: 90.3; FM: 84.5; FAR: 2.0 | High detection rate with low false positive | Not able to detect zero-day attacks, and not able to detect on real-time data set |
| [62] | 2017 | Parallel Processing with Snort Rules | N/A | PT: 7m31s | The issue of database size and infrequent signatures was solved with reduced matching time using multithreaded approach | Because choosing signatures is done manually, it is challenging to update the database of signatures |
| [66] | 2019 | MapReduce with Pattern Matching Mechanism | N/A | MU: 10gb; PT: 26 m | High execution speed. Reduced searching time | Focused only on few types of attacks |
| [38] | 2018 | AC, KMP, WM, RK | N/A | MU: 740,792; PT: 0.03 m | As more tables are created by KMP to prevent recurring matches, less memory is used. WM utilizes less run-time | The complexity of space and time needs to be improved for pattern matching |
| [56] | 2016 | RA, BRA | N/A | PT: 0.94ms | Low processing time | Low detection rate of various kinds of attacks |

TPR: True Positive Rate; MU: Memory Used; AC: Accuracy; DR: Detection Rate; FM: F-Measure; FAR: False Alarm Rate; PT: Processing Time; TP: Throughput; NA: Not Available

Table 7 Strengths and limitations of pattern matching techniques

Paper | Algorithm | Pros | Cons
[27] | CA | Effective at classifying unknown traffic as either match or mismatch | High time complexity
[38] | AC | Good performance for matching shorter patterns | Due to the trie depth, the required memory grows exponentially with the number of patterns
[38] | KMP | Performs very well for large packet traces, as it does not require a backward match in the packets | Exhibits longer run time
[38] | WM | Very efficient for larger patterns | Performance degrades for shorter patterns
[56] | RA | Decreased searching time | High resource constraints; low detection rate for various types of attacks
[56] | BR | Quick search; effective in preprocessing of data packets | High space complexity for larger packets

Combinatorial Algorithm (CA); Aho-Corasick (AC); Knuth-Morris-Pratt (KMP); Raita Algorithm (RA); Wu-Manber (WM); Berry-Ravindran Algorithm (BR); Boyer-Moore Algorithm (BM)
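
To make one row of Table 7 concrete, the following is a minimal Python sketch of KMP applied to a packet payload. It is an illustration under stated assumptions, not the implementation evaluated in [38]: the function names `build_failure_table` and `kmp_search` and the example payload are hypothetical, and a single signature is matched at a time. The pre-computed table is what lets the scan move strictly forward through the packet without re-reading earlier bytes.

```python
def build_failure_table(pattern: bytes) -> list[int]:
    """fail[i] = length of the longest proper prefix of pattern[:i + 1]
    that is also a suffix of it (the pre-computed KMP table)."""
    fail = [0] * len(pattern)
    k = 0  # length of the prefix currently matched
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]  # fall back to a shorter matching prefix
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(payload: bytes, pattern: bytes) -> list[int]:
    """Return the start offset of every occurrence of pattern in payload.

    Each payload byte is read once, left to right; the scan never moves
    backward in the packet.
    """
    if not pattern:
        return []
    fail = build_failure_table(pattern)
    matches = []
    k = 0  # number of pattern bytes matched so far
    for i, byte in enumerate(payload):
        while k > 0 and byte != pattern[k]:
            k = fail[k - 1]
        if byte == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = fail[k - 1]  # keep going to find overlapping matches
    return matches


if __name__ == "__main__":
    # Hypothetical signature and payload, for illustration only.
    print(kmp_search(b"GET /etc/passwd HTTP/1.1", b"/etc/passwd"))
```

A multi-pattern matcher such as AC or WM would replace the single table with a structure shared across all signatures, which is where the memory costs listed in Table 7 come from.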


4.3 Results distribution of signature-based IDS

To find intrusions, signature-based IDSs use a procedure in which signatures are searched and compared. Pattern matching algorithms or rule-based approaches are frequently used for this purpose. The evaluation metrics frequently used for signature-based systems are processing time, throughput, memory usage, accuracy, precision, recall, F-measure, False-Positive Rate (FPR), also known as False Alarm Rate (FAR), True Positive Rate (TPR), and Missing Rate (MR), also known as False Negative Rate (FNR).

[38] established standardized criteria and common metrics for assessing how well various pattern matching algorithms identify intrusions. Four algorithms were implemented: Wu-Manber (WM), Aho-Corasick (AC), Rabin-Karp (RK), and Knuth-Morris-Pratt (KMP). Rabin-Karp performed poorly in terms of both memory and run-time. The results also showed that KMP beat WM in terms of memory use, whereas WM outperformed AC. KMP adds a second table to stop repeated matches, making it more memory-efficient for large datasets. WM generates three more hash tables than KMP and uses more memory, but its run-time is shorter. In contrast, AC creates a sizable Deterministic Finite Automaton (DFA) trie, whose size increases exponentially for bigger signature datasets. However, the algorithms were implemented in C#, a garbage-collected language, which reduces the accuracy of the memory measurements. It is also important to keep in mind that an algorithm's time and memory requirements can be influenced by the programming language used. For all three techniques, longer traces typically require more memory.

A multi-fusion pattern matching method (MFPM) that combines the Raita and Berry-Ravindran algorithms was proposed by [56]. Berry-Ravindran was used during the preprocessing phase to set the shift value (distance) of the pattern window. This shift value is then used when searching for patterns in the text, so that as few character comparisons as possible are made. The Raita algorithm was used for the searching stage. The study's findings indicate that the proposed MFPM algorithm outperformed the existing ones in terms of processing time (0.94 ms).

Combinatorial algorithm (CA)-based IDS models have been proposed in [27]. Additional databases were installed to speed up the combinatorial algorithm's processing of network traffic while also helping the Signature Based System (SBS) act as an Anomaly Based System (ABS) and detect new attacks. An appropriate threshold of 12 was also set, based on the analysis of earlier studies, in order to improve the IDS's accuracy and lower the false-positive rate. The findings showed that CA achieved an accuracy of 96.5 percent but a comparatively high false-positive rate of 3 percent. Additionally, the detection rate for various sorts of attacks is low, and the dataset used is not adequately documented. Table 6 lists the investigations employing signature-based methods.

4.3.1 Pattern matching/rule-based algorithms for signature-based IDS

This section gives a quick overview of the algorithms and methods that have been used for signature-based IDS. The strengths and limitations of each of the algorithms under discussion are shown in Table 7.

4.3.1.1 Combinatorial algorithm Combinatorial pattern matching algorithms were originally developed in bioinformatics to search a database of sequences for a given sequence [27]. The basic combinatorial pattern matching technique is shown in Algorithm 1. Algorithm 1 takes two inputs: the network traffic (t) and the preset pattern (p). The operation begins at position i and scans the substring of t with length n, $t_i = t_i \ldots t_{i+n-1}$. The algorithm reports the pattern as follows. If $t_i = p$:

[Algorithm 1: basic combinatorial pattern matching]

In the study by [27], the CA was combined with a database of normal traffic to help the analysis engine classify unfamiliar traffic as either a match or a mismatch using a scoring system.

4.3.1.2 Aho-Corasick (AC) AC is a multiple-pattern matching technique that builds a Deterministic Finite Automaton (DFA) [138]. All patterns that need to be searched are encoded into the DFA by AC using the "goto" and "failure" stages. In the first, "goto", phase, AC encodes the patterns as sequences of states, one character at a time. The root node or state represents the non-matching state. The AC builds a new node from the root node for


each character in each pattern in the dataset. The process is repeated for the remaining pattern characters [38]. The depth of a state is measured from that state to the root, so the longest pattern determines the depth of the trie. The "failure" step deals with links for the case in which a mismatch occurs but the characters read so far still partially match another pattern. When matching fails, the automaton follows the failure link to the state with the longest matching suffix instead of going back to the root node. A matching pattern can be found simply by moving through the trie while reading each character in the packet one at a time. A pattern has been matched if the end of a branch is reached; otherwise, no match is recorded [139]. For shorter patterns, AC outperforms other approaches, but as the number of patterns rises, the trie depth also rises, substantially raising memory needs. Each time a new pattern is added, the trie needs to be rebuilt.

4.3.1.3 Knuth-Morris-Pratt (KMP) KMP is the first linear-time algorithm for pattern matching. It is based on the naive approach but uses fewer comparisons [38]. KMP compares each character of the pattern with the packet; on a mismatch, it advances the pattern and resumes the matching process. Pre-computed tables reduce the need for repetitive matching. KMP performs remarkably well for long packet traces because, unlike its predecessor, it does not need to move backward in the packet [140].

4.3.1.4 Wu-Manber (WM) WM is an algorithm for multiple pattern matching that uses Boyer-Moore's bad-character heuristic [142] and a sliding window of B characters [141]. The method creates shift, hash, and prefix tables to expedite the matching process. The shift table determines the skip-forward values for the various window hashes. After hashing the final B characters of the sliding window, WM reads the associated value from the shift table. Because a shift of zero denotes a likely match, the algorithm must then use the prefix and hash tables to find and confirm the match. The WM algorithm works badly for shorter patterns but very well for longer patterns [38].

4.3.1.5 Boyer-Moore algorithm (BM) After preprocessing the pattern, the technique generates two tables, known as the Boyer-Moore bad-character (bmBc) and Boyer-Moore good-suffix (bmGs) tables [56]. The bad-character table maintains a shift value for each letter of the alphabet, based on where that character occurs in the pattern. The good-suffix table, on the other hand, records the matching shift value for each character in the pattern. During the searching phase, after each attempt, the larger of the shift value given by the bmBc-dependent expression and the shift value from the bmGs table for the matching suffix is used. This methodology is the foundation of several pattern-matching algorithms [142].

4.3.1.6 Raita algorithm (RA) The Raita algorithm searches for a pattern by comparing characters of the pattern with the input text in a fixed order. First, the last character of the pattern is compared to the window's rightmost character. If they match, the first character of the pattern is compared to the window's leftmost character. If they match again, the middle character of the pattern is compared to the middle character of the window. If all of these comparisons succeed, the remaining characters are compared, starting with the second character and finishing with the second-to-last. If there is a mismatch at any time during the algorithm's execution, the bad-character shift function that was computed during the preprocessing stage is used. This function is the same as the bad-character shift function of the Boyer-Moore algorithm [56].

4.3.1.7 Berry-Ravindran algorithm (BR) The Berry-Ravindran algorithm is a combination of the Boyer-Moore algorithm and a distinct variant known as the Zhu-Takaoka algorithm [56]. The window shifts are carried out by considering the "bad-character shift" for the two consecutive text characters to the right of the window. The shift values are computed in the preprocessing stage and stored in a two-dimensional array. With this method, the shift value of the pattern on the text is computed only from two consecutive characters. After each trial, the larger of the shift value given by the bmBc-dependent expression (for the character in the text where a mismatch occurred) and the shift value from the bmGs table for the matching suffix is used. This methodology is the basis of many pattern-matching algorithms. In the algorithm suggested by [56], the Berry-Ravindran and Raita algorithms have been integrated. The Berry-Ravindran bad-character (brBc) function, used in the preprocessing stage, was shown to be effective in establishing the shift value (distance) of the pattern window, which is used in the searching stage to identify patterns in the text with the fewest possible character comparisons. The algorithms used for signature-based IDS in the examined literature are summarized in Table 7, along with their strengths and weaknesses.

4.4 Result distribution of hybrid-based IDS

Anomaly-based and signature-based approaches are combined to form hybrid-based IDS. Although signature-based IDS has a low FAR and is very good at recognizing known attacks, it struggles to identify new or undiscovered threats. In contrast, anomaly-based IDSs have a high FAR but a


high rate of new and undiscovered attack detection. A hybrid method aims to balance out each strategy's advantages and disadvantages in order to improve performance at identifying network intrusions.

Adopting a hybrid-based technique, [50] proposed the D-Sign model, which consists of three tiers: a honeypot server, a detection engine (DE), and a signature generation engine (SGE). D-Sign was evaluated using the NSL-KDD and CICIDS 2017 datasets. Honeypot servers simulate vulnerable servers, and anomalous traffic is rerouted to them instead of to the network's real production servers. The DE then investigates the traffic gleaned from the honeypot's records in order to understand more about the attacks. Within the DE, a Misuse Detection Engine (MDE) and an Anomaly Detection Engine (ADE) are used in a hybrid technique to identify attacks or anomalies in the packets stored by the honeypot server. To categorize the attacks, a rule-based technique was used for the MDE and an LSTM-based recurrent neural network was used for the ADE. The MDE raises an alarm if the signature patterns match. If there is no signature match, the data stream is sent to the ADE to look for new attacks. If the DE discovers an attack, the SGE develops a set of signatures for the malicious traffic packets and stores them in the IDS's signature repository. For the CICIDS2017 and NSL-KDD datasets, the model achieved a TPR of 99.45% and 98.51%, a TNR of 99.32% and 99.20%, and an accuracy of 99.10% and 99.40%, respectively. However, the system did not take multiclass classification into account, the issue of an unbalanced dataset was not addressed, and the researchers believe that the signature matching technique might be enhanced to improve the quality of signatures. Additionally, the system evaluation did not take into account parameters like FAR and processing time.

Table 8 Techniques, datasets, and performance of hybrid-based models

Paper | Year | Methodology | Dataset | Metrics | Pros | Cons
[32] | 2020 | RNN LSTM, Bro signature analyzer | N/A | TPR: 96.4; FNR: 4.1; FAR: 2.7 | Good detection rate for zero-day attacks | High false positive rate
[40] | 2020 | FRS-FS, GA-GOGMM | NSL-KDD, Nidsbench-based | DR: 96.5; FAR: 1.4 | Low false positive and missing rate | Low accuracy for known and unknown attacks
[48] | 2020 | Snort + Naïve Bayes | | | Increased execution speed with decreased processing time | Low detection accuracy rate; huge resource consumption
[49, 50] | 2019 | Rule-based approach and RNN using LSTM | CICIDS 2017, NSL-KDD | TPR: 99.4; AC: 99.1; TNR: 99.2 | High detection accuracy rate | Time expended by the model is relatively high; the imbalanced nature of the data leads to low detection of some specific kinds of attacks
[60, 61] | 2019 | DT, association rule mining, GA | DARPA, KDDCup99, NSL-KDD, UGR16, Kyoto 2006+ | AC: 99.9; DR: 99.8; FAR: 0.2 | Enhanced detection performance | Different attack types were not explored
[67] | 2021 | AE, SVM, Snort AC rules algorithm | N/A | PT: 1.6 us/100 byte; TP: 621,118/100 byte; MU: 682.7 Mb/5 byte | Improved time complexity to process each packet | Only detects a few types of attacks
[73] | 2014 | C4.5 decision tree (DT), 1-class SVM | NSL-KDD | PT: 11.2 s | High detection performance for unknown attacks, low processing time, and less training time | Degraded profiling ability; uneven data distribution leading to high training and testing time

[67] offered a hybrid approach for NIDS based on Deep Learning-based Feature Extraction (DLFE) and Optimization of Pattern Matching (OPM). AE was used to extract features, SVM to classify attacks, and Snort rules to match patterns. According to the results, the strategy performed well in terms of time, throughput, and memory. The study found that the DLFE phase was evaluated using


performance traits like accuracy, precision, and FAR, while the OPM phase was assessed using packet processing time, throughput, and memory use. A pattern matching engine requires more time to process a data packet when the packet is larger. The suggested model outperformed the Snort pattern matching engine in terms of time complexity. However, the model did not account for other attack types. The results reported for the DLFE phase are not yet sufficient. The study also stated that each phase was investigated independently, and there is minimal data on how the various components interact.

Another hybrid-based IDS was provided by [73], in which a 1-class SVM and a decision tree were evaluated on the NSL-KDD data set. The misuse detection model was created using a C4.5 decision tree (DT), and the same tree was used to divide the normal training data into smaller subsets. A 1-class SVM was then used in each decomposed region to develop a model for identifying anomalies. When building profiles of normal behavior, this model can therefore make more effective use of information about known attacks. The experimental results suggested that the proposed model would improve the IDS's functionality and speed in detecting unknown threats while also minimizing the time complexity of the training and testing procedures. Training and testing times for the model were 56.58 and 11.20 seconds, respectively, on average. However, a well-formed normal cluster can be divided during the decomposition procedure because the initial C4.5 decision tree does not take clusters in the normal data set into consideration. This can degrade the profiling ability of the 1-class SVM model, since a well-formed normal cluster would otherwise be enclosed by a single decision boundary. Additionally, training and testing take longer when the distribution of the data is uneven. The time reduction achieved by the proposed strategy therefore falls short of expectations. Different classifiers need to be investigated in order to improve the model's performance without lowering its ability to spot unknown attacks. Table 8 displays the relevant research on the hybrid-based approach.

4.5 Analysis of the research questions

4.5.1 Evaluation of techniques employed

Table 9 displays the range of accuracy for each method used to find anomalies. ANN covers a wide range, from 76.9 to 99.9%, and was applied four times in total. In addition, RF covers a wide range of accuracy, starting at 83 percent and going all the way to 99.8 percent, while DBN covers a range from 87.2 to 98.1% and CNN a range from 87 to 99%. SVM was applied five times, with an accuracy range of 96.2 to 99.9%. The combination of CNN and LSTM was used four times. An accuracy of 100% was reached with the combination of CNN and AE, which was only used once. Table 9 lists the methods that were employed in the investigations.

Table 9 Accuracy range for techniques

Study | No. of studies | Technique | Accuracy range
[1–6] | 6 | CNN | 87%-99%
[9–11] | 3 | ANN | 76.9%-99.9%
[7, 9] | 2 | NB | 99.0%-99.7%
[12–14] | 3 | DBN | 87.2%-98.1%
[33] | 1 | ABC + AFS + CART | 99.0%
[13] | 1 | LSTM + RNN | 98.8%
[2] | 1 | XGBoost | 87.0%
[9, 15, 28–30] | 6 | RF | 83.1%-99.9%
[2] | 1 | LSTM | 87.0%
[9, 20, 22–24] | 5 | SVM | 96.2%-99.9%
[42, 47, 54, 55] | 4 | CNN + LSTM | 76.8%-95.6%
[14, 33, 58] | 3 | AE | 87.2%-99.7%
[8] | 1 | SVM + NB | 99.3%
[69] | 1 | Light GBM Ensemble | 99.9%
[9, 35, 70–72] | 5 | AdaBoost | 98.9%-99.3%
[9] | 1 | KNN | 99.0%
[17, 18, 30] | 3 | DT | 85.2%-99.9%
[45, 46] | 2 | AE + DNN | 89.0%-99.9%
[75] | 1 | LR + SVM | 86.0%
[65] | 1 | XGBoost + DNN | 97.0%
[43] | 1 | CNN + AE | 100%
[78] | 1 | RNN | 94.5%
[36] | 1 | KELM | 95.8%

4.5.2 Evaluation of datasets employed and types of attacks detected

NSL-KDD and KDDCup99 are the most often used datasets, as seen in Table 10. However, these datasets do not represent modern attack types. Table 10 lists the datasets that have been used in the literature. An important aspect of determining an efficient IDS is the types of attacks it can identify. Even though some of the reviewed models were quite successful in correctly classifying a particular attack type, newer attack types were not covered in their analyses. Consequently, a model with a greater detection rate and the ability to detect a wide range of attacks, including contemporary ones, is required. For instance, the NSL-KDD dataset was used to test the model proposed by [7]. This model identified User to Root (U2R), Remote to Local (R2L), Denial of Service (DoS), and Probing attacks. Similarly, studies by [2–4, 6, 12, 16, 23, 34, 35, 42, 81, 87] identified the same attack types while utilizing the KDDCup99 and NSL-KDD datasets. Furthermore, in
KDDCup99 and NSL-KDD datasets. Furthermore, in


Table 10 The datasets used in the study

Dataset | Year | No. of studies | Authenticity | Labelled | No. of labels | Description
KDDCup99 | 1999 | 19 | Emulated | Yes | 4 | DARPA network dataset files served as the foundation for the KDD99 dataset. The DARPA dataset's characteristics were examined, and the data were preprocessed before the dataset was created. The collection includes around 4.9 million vectors from seven weeks of network activity. Attacks fall into four categories: user-to-root (U2R), remote-to-local (R2L), probing, and denial-of-service (DoS). Each instance is represented by 41 features divided into three groups: (1) basic; (2) traffic; and (3) content. Basic features are extracted from TCP/IP connections. Traffic features are divided into those that share the same host characteristics and those that share the same service characteristics. Content features relate to suspicious behavior in the data portion. The most comprehensive dataset for assessing intrusion detection methods is this one
NSL-KDD | 2009 | 25 | Emulated | Yes | 4 | The NSL-KDD dataset was proposed to address some of the inherent issues of the KDD99 dataset. Although this updated version of the KDD dataset still has some issues and may not be a perfect representation of real networks, the scarcity of publicly available data sets for network-based IDSs means it can still be used as a useful benchmark dataset to aid researchers in comparing various intrusion detection strategies. The NSL-KDD train and test sets contain a reasonable number of records. Due to this benefit, all of the data can be used for the tests without having to pick a small sample at random. In turn, evaluation outcomes from various research projects will be uniform and comparable
CSE-CIC-IDS2018 | 2018 | 4 | Emulated | Yes | 7 | Emerged from a joint project between the Canadian Institute for Cybersecurity (CIC) and the Communication Security Establishment (CSE). Brute-force, botnet, DDoS, DoS, web attacks, and infiltration are among the seven attack scenarios in the dataset. The dataset contains 80 features that CICFlowMeter-V3 extracted from the recorded network traffic, as well as the system logs of each computer
CICIDS2017 | 2017 | 8 | Emulated | Yes | 7 | CICIDS2017 contains benign traffic and common attacks, with both source data (PCAPs) and results of network traffic analysis based on timestamps, source and destination IPs, source and destination ports, protocols, and token flows of attacks. The B-Profile technology was used by the researchers to create benign background traffic by profiling the abstract behavior of human interactions. The abstracted behaviors of 25 users over the HTTP, HTTPS, FTP, SSH, and email protocols are included in the dataset. The attacks include brute-force FTP, brute-force SSH, DoS, Heartbleed, web attacks, infiltration, botnets, and DDoS
ADFA-LD | 2013 | 2 | Emulated | Yes | 6 | The dataset provides a contemporary Linux dataset for the evaluation of traditional HIDS
UNSW-NB15 | 2015 | 17 | Emulated | Yes | 9 | The Cyber Range Laboratory of the Australian Cyber Security Center developed UNSW-NB15. It is often used because of the diversity of attacks it offers. Attacks are classified as Fuzzers, Analysis, Backdoor, DoS, Exploits, Generic, Reconnaissance, Shellcode, or Worms. It has a training set of 175,341 records and a testing set of 82,332 records
CIDDS-001 | 2017 | 2 | Emulated | Yes | 6 | The CIDDS-001 dataset is made up of unidirectional NetFlow data and traffic information from an OpenStack environment with internal servers (backup, mail, file, and web) and external servers (file synchronization and web server) that are deployed on the internet to record real-time and recent internet traffic. Four different attack types (suspicious, attacker, unknown, and victim) are represented in the dataset as attack flows


Table 10 (continued)

Dataset | Year | No. of studies | Authenticity | Labelled | No. of labels | Description
ISCX-IDS | 2012 | 1 | Real | Yes | Unknown | The ISCX 2012 benchmark dataset includes statistical features (time_stamp, source_bytes, dst_bytes, source_packets, dst_packets, protocol, direction, Tag, source_ip, dst_ip) collected from a single switch interface to which all traffic is directed. Real network traffic traces were examined to ascertain the typical behavior of computers, based on actual HTTP, IMAP, SMTP, POP3, SSH, and FTP traffic. The dataset relies on realistic, labeled network traffic that incorporates several attack scenarios. It is a labeled dataset with over two million traffic packets, with the attack data representing 2% of the total traffic. Four types of attack scenarios are included: infiltration of the network from within, HTTP denial of service (DoS), brute-force SSH, and distributed denial of service
Kyoto 2006+ | 2006 | 1 | Real | Yes | Unknown | Kyoto 2006+ is a publicly available honeypot dataset of actual network traffic; however, it only contains a small number and narrow range of realistic, typical user behaviors. The researchers converted the packet-based data into a new format called sessions. Each session has 24 features, 14 of which are statistical features derived from the KDD CUP 99 dataset. The remaining 10 are common traffic-based attributes, including IP address (anonymized), port, and duration. The data were gathered over three years and contain over 93 million sessions in total
DARPA | 1999 | 1 | Emulated | Yes | 4 | Communications between source IPs and destination IPs make up the DARPA dataset. This dataset includes many attacks from various IPs. Source: Scalable Dynamic Network Embedding (dynode2vec). The dataset offers a sizable sample of computer intrusions carried out in the midst of common background data, and numerous realistic intrusion scenarios

[1, 14, 28, 33, 36, 45, 47, 53, 55, 69], NSL-KDD and UNSW-NB15 broadened the attack coverage by identifying contemporary attacks such as Shellcode, Fuzzers, Worms, Exploits, Backdoors, Analysis, Generic, and Reconnaissance. [10, 44] applied CSE-CIC-IDS2018, which covers attacks such as Brute force, Heartbleed, Bot, Web attack, DoS-Hulk, Infiltration, DoS-GoldenEye, DoS-SlowHTTPTest, DDoS-LOIC-HTTP, DoS-Slowloris, and DDoS-HOIC. A variety of classes, including DoS/DDoS, Botnet, Web Attack, Brute Force, Infiltration, PortScan, grey hole, black hole, SQL Injection, Benign, DoS Hulk, FTP-Patator, SSH-Patator, DoS Slowloris, DoS Slowhttptest, and Heartbleed, were covered in the studies by [2, 20] using the CICIDS 2017 dataset.

The authors of the studies [8, 9, 13, 21, 22, 32, 54, 58, 67, 70, 82] focused only on binary classification, that is, whether an attack was launched or not. The datasets used in the investigations [4, 22, 69] were not made clear. The methods used in the preceding investigations have demonstrated their ability to recognize a wide range of attack patterns, including the above-mentioned attacks. However, due in part to imbalanced datasets, there are discrepancies in how well they can recognize various attack types. Therefore, additional research is required to increase the rate at which the different attacks mentioned above are detected, as well as the capacity to accurately identify recent attack types.

4.5.3 Evaluation of metrics employed in evaluating performance of the various models

The frequency with which each of the commonly used evaluation metrics was applied in the publications under review is shown in Fig. 4. As illustrated in Fig. 4, the most often employed metrics are accuracy, precision, recall, F1 value, and false-alarm rate (FAR). Most studies employ precision and accuracy, followed by recall and false alarm rate. Precision is the fraction of samples identified as attacks that are actually attacks; it measures how exact the classifier is when it flags an attack. The accuracy of a classifier is determined by dividing the number of samples that were correctly predicted by the total number of samples that were predicted. FAR is a crucial parameter to consider when assessing intrusion detection systems. False positives show up as false alarms, and a high volume of them puts more strain on the system and its human resources.
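
As a worked illustration of the metrics discussed above, the short Python sketch below computes accuracy, precision, recall, FAR, and F-measure from the four confusion-matrix counts: true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). The names `ConfusionCounts` and `ids_metrics`, the example counts, and the choice to return 0.0 when a denominator is empty are assumptions made for this example rather than anything prescribed by the reviewed studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # attacks correctly flagged as attacks
    fp: int  # normal traffic wrongly flagged as attacks (false alarms)
    tn: int  # normal traffic correctly passed
    fn: int  # attacks that were missed


def _ratio(numerator: float, denominator: float) -> float:
    """Divide, returning 0.0 for an empty denominator (a convention chosen here)."""
    return numerator / denominator if denominator else 0.0


def ids_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute the commonly reported IDS metrics from confusion-matrix counts."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.tn + c.fp + c.fn),
        "precision": precision,
        "recall": recall,
        "far": _ratio(c.fp, c.fp + c.tn),
        "f_measure": _ratio(2 * precision * recall, precision + recall),
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in ids_metrics(ConfusionCounts(tp=90, fp=5, tn=95, fn=10)).items():
        print(f"{name}: {value:.3f}")
```

With these made-up counts, accuracy and recall can both look high while FAR is still 5%, which is why the reviewed studies tend to report FAR alongside accuracy rather than instead of it.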


Once the data has been classified, it can be separated into The analysis was able to show that processing time,
four groups: true positives (TP), false positives (FP), true memory, and throughput are frequently the metrics that are
negatives (TN), and false negatives (FN). The following is most employed in evaluating the IDS with regard to sig-
the calculating formula: nature-based IDS. Processing time in this IDS quantifies
TP þ TN how long it takes a pattern matching engine to process a
Accuracy ¼ ð1Þ packet, whereas throughput assesses how many packets the
TP þ TN þ FP þ FN
engine can process in a second, or in Megabits per second
TP
precision ¼ ð2Þ (Mbps). Memory consumption, on the other hand, counts
TP þ FP the number of Megabytes used by patterns during pattern
TP matching.
Recall ¼ ð3Þ
TP þ FN
FP
FAR ¼ ð4Þ 5 Challenges and potential future directions
TN þ FP
precision  recall
F  measure ¼ 2  ð5Þ Network IDSs have been the focus of numerous studies and
precision þ recall
published research papers. Particularly for anomaly-based
Another typical evaluation metric in the field of intru- network intrusion detection systems, there are still a
sion detection is detection time. Our study has found few number of outstanding research challenges and issues to be
papers that cover time performance. Detection time is the time a trained model needs to classify a sample. Even with techniques such as feature selection for dimensionality reduction, IDS research frequently runs into the curse of dimensionality because of the complexity of network traffic, which ultimately shows up as long detection times. Some of the numerous intrusion detection algorithms in the literature are practically absent from engineering implementations, and one key factor is their lengthy detection times. From the application perspective, the major objective of intrusion detection is to achieve an appropriate detection rate with minimal resource consumption, which calls for an optimal IDS model structure and parameter configuration. A long detection time typically indicates that the algorithm's complexity is too high. Reviewing earlier research reveals a distinct tradeoff between model complexity and performance. Although deep learning-based methods typically outperform other methods in detection capability, they are challenging to apply in big data scenarios because of their excessively long detection times. Although computational complexity directly affects detection time, most papers report only the training and testing times of their algorithms on a given dataset, because the computational complexity of some algorithms is hard to determine or is debatable under different assumptions. It also remains difficult to judge the superiority of an algorithm in terms of time complexity from running time alone, because the platforms used to obtain each result and the dataset preparation techniques vary. In conclusion, we think that, rather than focusing solely on detection time, current IDS research still needs a unified complexity evaluation criterion.
solved. The issue is that no standard mechanism exists to guarantee the applicability of the proposed systems or methods. The bulk of research papers evaluate the proposed systems using simulated datasets, which might not be appropriate for situations involving real data and other difficulties. This study and similar ones on the state of the art in IDS show how difficult it is to create an IDS that includes, at the very least, the most crucial features, such as the ability to be quickly deployed online, scalability, effectiveness with real data, and meeting the needs of all stakeholders. Instead, the majority of the literature uses biased parameters, covers only a tiny fraction of the system, and reports results evaluated on fictitious datasets.
The literature has described a number of techniques for Signature-Based IDS, such as pattern matching and rule-based algorithms, with the main goal of speeding up the matching of signatures (time complexity), while researchers have paid little attention to space complexity (memory consumption), especially as the size of the signature database grows. It is challenging to compare the performance of these approaches because they use so many different metrics, datasets, and simulation environments. However, the memory and search time requirements for pattern matching or signature search operations are now higher. This calls for an improved approach that can reduce processing time and memory utilization.
There were only seven studies [32, 40, 48, 50, 60, 67, 73] that specifically addressed Hybrid-Based IDSs in the literature. Since multiclass categorization was not used, the research under review did not offer enough details about the attacks found. In addition, not enough prior knowledge has been provided on the integration and evaluation of the entire system.
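To make the notion of detection time discussed above concrete (the time a trained model needs to classify one sample), the following minimal Python sketch measures it for two feature dimensionalities. Everything in it is an illustrative assumption rather than a method from the reviewed studies: the nearest-centroid classifier, the synthetic "traffic" generator, and the helper names train_centroids, predict, mean_detection_time, and make_traffic are all hypothetical. Measured times depend on the platform, which is exactly why running time alone is a weak basis for comparing algorithms.

```python
import random
import time
from statistics import mean


def train_centroids(samples, labels):
    """Fit a toy nearest-centroid classifier: one mean vector per class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(centroids[y], x)),
    )


def mean_detection_time(centroids, samples, repeats=5):
    """Average wall-clock seconds needed to classify one sample."""
    per_run = []
    for _ in range(repeats):
        start = time.perf_counter()
        for x in samples:
            predict(centroids, x)
        per_run.append((time.perf_counter() - start) / len(samples))
    return mean(per_run)


def make_traffic(n, n_features, rng):
    """Synthetic stand-in for flow records (assumption): attack features
    are drawn from a Gaussian shifted by +2 relative to normal traffic."""
    labels = [rng.choice(["normal", "attack"]) for _ in range(n)]
    samples = [
        [rng.gauss(2.0 if y == "attack" else 0.0, 1.0) for _ in range(n_features)]
        for y in labels
    ]
    return samples, labels


rng = random.Random(0)
for n_features in (10, 40):
    X, y = make_traffic(2000, n_features, rng)
    model = train_centroids(X[:1500], y[:1500])   # "training" split
    t = mean_detection_time(model, X[1500:])      # timed on held-out samples
    print(f"{n_features} features: {t * 1e6:.1f} microseconds per sample")
```

Reporting a per-sample figure such as this alongside the hardware and preprocessing used is one simple way to make detection times from different studies easier to compare, although it does not replace a platform-independent complexity analysis.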


Additionally, it is quite challenging to describe or demonstrate the precision and thoroughness of any suggested IDS. The main conclusion of this research is that it is difficult to develop an all-encompassing IDS that offers high levels of accuracy, scalability, resilience, and threat prevention. The following discussion focuses on the main issues and challenges that academics face today and will face in the future. The potential for future study in this area is enormous, particularly for hybrid intrusion detection systems that combine signature and anomaly methods. The most recent network IDS issues are discussed below:
• To test and validate a suggested NIDS, an excellent dataset is a must. Such a dataset should contain a substantial volume of labeled network traffic information that describes both attacks and normal activities. However, the majority of publicly available datasets fall short in supplying the essential components: they lack labels, have insufficient network features, are missing raw pcap files, are challenging to comprehend, and/or have incomplete CSV files. Creating a dataset that can address these issues in a real-world situation will be a challenging undertaking and a potential area of study.
• The creation of an online, real-time network IDS is very challenging. This is because an IDS of this kind would need to learn typical behavior before it could detect unusual or malicious behavior. The learning phase requires traffic that is free of noise and attacks, which cannot be guaranteed. False alarms may result from such an IDS if these issues are not fixed.
• The majority of anomaly-based NIDS attempt to create a model that captures the characteristics of each prospective action or typical traffic pattern. This is particularly challenging because these models have been shown to favor the dominant class, which results in high false-positive rates. The rate of false-negative outcomes increases because it is hard to detect every single potential normal observation that could be generated in a network. Eliminating or lowering the false-positive and false-negative rates in NIDS is another research objective.
• Computational complexity rises during several stages of the design and implementation of NIDS, including feature reduction and data preprocessing, model training, and deployment, especially for ML- and DL-based NIDS. Thus, developing an efficient NIDS with low computational requirements is a difficult task and an interesting area for future research.
• The proposed IDSs' feature selection and dimensionality reduction algorithms are suitable for a specific type of normal traffic and a specific kind of attack detection, but they might not work as well if the environment for normal or attack sequences changes somewhat. Therefore, one of the main research objectives is to create a dynamic and computationally efficient feature selection method that can work on both normal and attack traffic.
• The use of DL and ML approaches is common when developing a model on a large dataset. This has made it simpler to respond properly to cyberattacks. Nevertheless, researchers need to pay attention to a few concerns regarding the use of DL and ML approaches for attack detection in networks. For example, resource constraints restrict the use of DL/ML algorithms [19, 87] for network security [143]. Another challenge in applying ML/DL to big, distributed networks is scalability, for instance across varied scenarios and IDS deployment options. The use of an ensemble of ML/DL algorithms that outperforms a single Machine Learning algorithm is one potential remedy for the shortcomings of individual Deep or Machine Learning approaches, as suggested by several of the authors [19]. However, due to their high computing costs, these methods resulted in network latency issues, which are costly in critical network contexts.
• The creation of an IDS for network security has not yet been properly researched and evaluated using semi-supervised learning, transfer learning, and reinforcement learning (RL). Therefore, to achieve important objectives such as real-time operation, quick training, and unified models for anomaly detection, future research may concentrate on semi-supervised learning, transfer learning, and RL techniques for developing an IDS for network security [143]. The benefits of integrating RL and DL in network systems with high data dimensionality and non-stationary circumstances further make this a fascinating research topic.
• Finally, more research on the creation of a hybrid IDS that combines signature- and anomaly-based techniques would be valuable, because it is clear from the literature that this component has not been properly researched and tested.

6 Conclusion

This study presents a survey of anomaly-, signature-, and hybrid-based IDS. With a focus on network intrusion detection, this work aims to give academics a concise but thorough understanding of the security concerns that are currently being addressed and of potential solutions. This study follows the guidelines set forth by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). A set of specified keywords and RQs was used to categorize and summarize the literature on signature-based, anomaly-based, and hybrid-based network intrusion detection. After applying the exclusion, inclusion, and quality criteria, 71 papers in total were selected for the review. This study summarized what can be learned from prior studies on hybrid, anomaly, and signature-based network intrusion detection. Additionally, it evaluated each contribution's performance and limitations while comparing the contributions' methodologies, datasets, types of attacks detected, and assessment metrics.
The SLR indicates that research has shifted significantly from the signature-based approach to the anomaly-based approach, whereas the hybrid-based approach has received little attention from researchers. The study also shows how researchers have used various machine learning and deep learning algorithms for intrusion detection, the most popular being CNN, ANN, DBN, AE, AdaBoost, LSTM, RF, and SVM. However, CNN + AE was found to perform better in terms of accuracy.
Thirteen different datasets were used to evaluate the models in the studies that were analyzed. The most frequently used datasets were KDDCup99, NSL-KDD, and UNSW-NB15. Researchers have criticized these datasets as outdated because, despite their documented popularity, they do not reflect modern types of attacks. To address this, the SLR analysis identified newly emerging datasets that include contemporary attacks. These include the CICIDS2017, UNSW-NB15, ADFA-LD, ISCX-IDS, RTNTP18, UGR16, and CSE-CIC-IDS2018 datasets. Therefore, to build a successful IDS, researchers must make use of contemporary datasets.
Deep Learning (DL) approaches are clearly the latest trend in anomaly-based IDS and were used in the majority of the reviewed literature. In terms of performance, CNN and AE were reported to achieve almost 100%. It is crucial to remember that the FAR must be reduced to the absolute minimum while taking into account the various attack types present in the datasets. We advise researchers to look into more datasets that contain more recent attacks. More study can be done in this area because several DL approaches have yet to be investigated. High attack detection accuracy may be achieved through the hybridization of two or more approaches, although this may require a significant investment in training time and resources.
A number of techniques have been used to speed up signature matching. The algorithms most often used in the literature for signature-based IDS are rule-based and pattern matching. However, because these approaches use various measures, datasets, and simulation environments, it is difficult to compare their performance. Moreover, present approaches require more memory and search time because they use pattern matching or signature search operations. This necessitates a more effective strategy that can optimize memory usage and processing time.
Only seven studies in the searched literature addressed hybrid-based IDSs. Because they lacked multiclass classification, the studies under consideration did not offer enough detail about the attacks found. Adequate background on the evaluation of the entire system is also lacking. To address these difficulties, a hybrid system must be designed.
Further review showed that, while processing time and memory use were used to evaluate the performance of signature-based IDS, Accuracy, Precision, and FAR were the three metrics most frequently used for anomaly-based IDS. Future studies ought to consider using all available performance measures, particularly detection time, which is essential for real-time systems. IDS models must be able to adjust to the changing network environment, which necessitates models with the ability to independently learn about their surroundings and update themselves.

Data availability Not Applicable.

Declarations

Conflict of interest The authors declare that there is no conflict of interest in this paper.

References

1. Ashiku, L., & Dagli, C. (2021). Network intrusion detection system using deep learning. Procedia Computer Science, 185, 239–247. https://doi.org/10.1016/j.procs.2021.05.025
2. Gupta, N., Jindal, V., & Bedi, P. (2021). LIO-IDS: Handling class imbalance using LSTM and improved one-vs-one technique in intrusion detection system. Computer Networks, 192, 108076. https://doi.org/10.1016/j.comnet.2021.108076
3. Nguyen, M. T., & Kim, K. (2020). Genetic convolutional neural network for intrusion detection systems. Future Generation Computer Systems, 113, 418–427. https://doi.org/10.1016/j.future.2020.07.042
4. Wu, Z., Wang, J., Hu, L., Zhang, Z., & Wu, H. (2020). A network intrusion detection method based on semantic re-encoding and deep learning. Journal of Network and Computer Applications, 164, 102688. https://doi.org/10.1016/j.jnca.2020.102688
5. Kim, J., Kim, J., Kim, H., Shim, M., & Choi, E. (2020). CNN-based network intrusion detection against denial-of-service attacks. Electronics. https://doi.org/10.3390/electronics9060916


6. Xiao, Y., Xing, C., Zhang, T., & Zhao, Z. (2019). An intrusion detection model based on feature reduction and convolutional neural networks. IEEE Access, 7, 42210–42219. https://doi.org/10.1109/ACCESS.2019.2904620
7. Onah, J. O., Abdullahi, M., Hassan, I. H., & Al-Ghusham, A. (2021). Genetic algorithm based feature selection and naïve Bayes for anomaly detection in fog computing environment. Machine Learning with Applications, 6, 100156. https://doi.org/10.1016/j.mlwa.2021.100156
8. Gu, J., & Lu, S. (2021). An effective intrusion detection approach using SVM with naïve Bayes feature embedding. Computers and Security, 103, 102158. https://doi.org/10.1016/j.cose.2020.102158
9. Kanimozhi, V., & Jacob, T. P. (2021). Artificial intelligence outflanks all other machine learning classifiers in network intrusion detection system on the realistic cyber dataset CSE-CIC-IDS2018 using cloud computing. ICT Express, 7(3), 366–370. https://doi.org/10.1016/j.icte.2020.12.004
10. Kanimozhi, V., & Jacob, T. P. (2019). Artificial intelligence based network intrusion detection with hyper-parameter optimization tuning on the realistic cyber dataset CSE-CIC-IDS2018 using cloud computing. ICT Express, 5(3), 211–214. https://doi.org/10.1016/j.icte.2019.03.003
11. Mebawondu, J. O., Alowolodu, O. D., Mebawondu, J. O., & Adetunmbi, A. O. (2020). Network intrusion detection system using supervised learning paradigm. Scientific African, 9, e00497. https://doi.org/10.1016/j.sciaf.2020.e00497
12. Jia, H., Liu, J., Zhang, M., He, X., & Sun, W. (2021). Network intrusion detection based on IE-DBN model. Computer Communications, 178, 131–140. https://doi.org/10.1016/j.comcom.2021.07.016
13. Elmasry, W., Akbulut, A., & Zaim, A. H. (2020). Evolving deep learning architectures for network intrusion detection using a double PSO metaheuristic. Computer Networks, 168, 107042. https://doi.org/10.1016/j.comnet.2019.107042
14. Wang, Z., Liu, Y., He, D., & Chan, S. (2021). Intrusion detection methods based on integrated deep learning model. Computers and Security, 103, 102177. https://doi.org/10.1016/j.cose.2021.102177
15. Ahmed, H. A., Hameed, A., & Bawany, N. Z. (2022). Network intrusion detection using oversampling technique and machine learning algorithms. PeerJ Computer Science, 8, 820. https://doi.org/10.7717/peerj-cs.820
16. Selvakumar, B., & Muneeswaran, K. (2019). Firefly algorithm based feature selection for network intrusion detection. Computers and Security, 81, 148–155. https://doi.org/10.1016/j.cose.2018.11.005
17. Disha, R. A., & Waheed, S. (2022). Performance analysis of machine learning models for intrusion detection system using gini impurity-based weighted random forest (GIWRF) feature selection technique. Cybersecurity, 5(1), 1. https://doi.org/10.1186/s42400-021-00103-8
18. Sharma, N. V., & Yadav, N. S. (2021). An optimal intrusion detection system using recursive feature elimination and ensemble of classifiers. Microprocessors and Microsystems, 85, 104293. https://doi.org/10.1016/j.micpro.2021.104293
19. Gao, X., Shan, C., Hu, C., Niu, Z., & Liu, Z. (2019). An adaptive ensemble machine learning model for intrusion detection. IEEE Access, 7, 82512–82521. https://doi.org/10.1109/ACCESS.2019.2923640
20. Vijayanand, R., Devaraj, D., & Kannapiran, B. (2018). Intrusion detection system for wireless mesh network using multiple support vector machine classifiers with genetic-algorithm-based feature selection. Computers and Security, 77, 304–314. https://doi.org/10.1016/j.cose.2018.04.010
21. Hadem, P., Saikia, D. K., & Moulik, S. (2021). An SDN-based intrusion detection system using SVM with selective logging for IP traceback. Computer Networks, 191, 108015. https://doi.org/10.1016/j.comnet.2021.108015
22. Gu, J., Wang, L., Wang, H., & Wang, S. (2019). A novel approach to intrusion detection using SVM ensemble with feature augmentation. Computers and Security, 86, 53–62. https://doi.org/10.1016/j.cose.2019.05.022
23. Alazzam, H., Sharieh, A., & Sabri, K. E. (2022). A lightweight intelligent network intrusion detection system using OCSVM and Pigeon inspired optimizer. Applied Intelligence, 52(4), 3527–3544. https://doi.org/10.1007/s10489-021-02621-x
24. Krishnaveni, S., Vigneshwar, P., Kishore, S., Jothi, B., & Sivamohan, S. (2020). Anomaly-based intrusion detection system using support vector machine. In Artificial Intelligence and Evolutionary Computations in Engineering Systems, Singapore, S. S. Dash, C. Lakshmi, S. Das, & B. K. Panigrahi (Eds.), Springer Singapore, pp. 723–731.
25. Ozkan-Okay, M., Samet, R., Aslan, Ö., & Gupta, D. (2021). A comprehensive systematic literature review on intrusion detection systems. IEEE Access, 9, 157727–157760. https://doi.org/10.1109/ACCESS.2021.3129336
26. Li, X., Yi, P., Wei, W., Jiang, Y., & Tian, L. (2021). LNNLS-KH: A feature selection method for network intrusion detection. Security and Communication Networks, 2021, 8830431. https://doi.org/10.1155/2021/8830431
27. Folorunso, O., Ayo, F. E., & Babalola, Y. E. (2016). Ca-NIDS: A network intrusion detection system using combinatorial algorithm approach. Journal of Information Privacy and Security, 12(4), 181–196. https://doi.org/10.1080/15536548.2016.1257680
28. Nazir, A., & Khan, R. A. (2021). A novel combinatorial optimization based feature selection method for network intrusion detection. Computers and Security, 102, 102164. https://doi.org/10.1016/j.cose.2020.102164
29. Zhou, Y., Cheng, G., Jiang, S., & Dai, M. (2020). Building an efficient intrusion detection system based on feature selection and ensemble classifier. Computer Networks, 174, 107247. https://doi.org/10.1016/j.comnet.2020.107247
30. Chiche, A., & Meshesha, M. (2021). Towards a scalable and adaptive learning approach for network intrusion detection. Journal of Computer Networks and Communications, 2021, 8845540. https://doi.org/10.1155/2021/8845540
31. Nagaraju, S., Shanmugham, B., & Baskaran, K. (2021). High throughput token driven FSM based regex pattern matching for network intrusion detection system. Materials Today: Proceedings, 47, 139–143. https://doi.org/10.1016/j.matpr.2021.04.028
32. Sohi, S. M., Seifert, J.-P., & Ganji, F. (2021). RNNIDS: Enhancing network intrusion detection systems through deep learning. Computers and Security, 102, 102151. https://doi.org/10.1016/j.cose.2020.102151
33. Hajisalem, V., & Babaie, S. (2018). A hybrid intrusion detection system based on ABC-AFS algorithm for misuse and anomaly detection. Computer Networks, 136, 37–50. https://doi.org/10.1016/j.comnet.2018.02.028
34. Bhati, B. S., Rai, C. S., Balamurugan, B., & Al-Turjman, F. (2020). An intrusion detection scheme based on the ensemble of discriminant classifiers. Computers and Electrical Engineering, 86, 106742. https://doi.org/10.1016/j.compeleceng.2020.106742
35. Zhou, Y., Mazzuchi, T. A., & Sarkani, S. (2020). M-AdaBoost-a based ensemble system for network intrusion detection. Expert Systems with Applications, 162, 113864. https://doi.org/10.1016/j.eswa.2020.113864
36. Lv, L., Wang, W., Zhang, Z., & Liu, X. (2020). A novel intrusion detection system based on an optimal hybrid kernel


extreme learning machine. Knowledge-Based Systems, 195, 105648. https://doi.org/10.1016/j.knosys.2020.105648
37. Ayyagari, M. R., Kesswani, N., Kumar, M., & Kumar, K. (2021). Intrusion detection techniques in network environment: a systematic review. Wireless Networks, 27(2), 1269–1285. https://doi.org/10.1007/s11276-020-02529-3
38. Aldwairi, M., Alshboul, M. A., & Seyam, A. (2018). Characterizing realistic signature-based intrusion detection Benchmarks, In Proceedings of the 6th international conference on information technology: IoT and smart City, Hong Kong. https://doi.org/10.1145/3301551.3301591.
39. AlYousef, M. Y., & Abdelmajeed, N. T. (2019). Dynamically detecting security threats and updating a signature-based intrusion detection system's database. Procedia Computer Science, 159, 1507–1516. https://doi.org/10.1016/j.procs.2019.09.321
40. Liu, J., et al. (2020). Adaptive intrusion detection via GA-GOGMM-based pattern learning with fuzzy rough set-based attribute selection. Expert Systems with Applications, 139, 112845. https://doi.org/10.1016/j.eswa.2019.112845
41. Alsoufi, M. A., et al. (2021). Anomaly-based intrusion detection systems in IoT using deep learning: a systematic literature review. Applied Sciences. https://doi.org/10.3390/app11188383
42. Jiang, K., Wang, W., Wang, A., & Wu, H. (2020). Network intrusion detection combined hybrid sampling with deep hierarchical network. IEEE Access, 8, 32464–32476. https://doi.org/10.1109/ACCESS.2020.2973730
43. Hwang, R. H., Peng, M. C., Huang, C. W., Lin, P. C., & Nguyen, V. L. (2020). An unsupervised deep learning model for early network traffic anomaly detection. IEEE Access, 8, 30387–30399. https://doi.org/10.1109/ACCESS.2020.2973023
44. Li, X., Chen, W., Zhang, Q., & Wu, L. (2020). Building auto-encoder intrusion detection system based on random forest feature selection. Computers and Security, 95, 101851. https://doi.org/10.1016/j.cose.2020.101851
45. Rao, K. N., Rao, K. V., & PVGD, P. R. (2021). A hybrid intrusion detection system based on sparse autoencoder and deep neural network. Computer Communications, 180, 77–88. https://doi.org/10.1016/j.comcom.2021.08.026
46. Yang, Y., Zheng, K., Wu, C., & Yang, Y. (2019). Improving the classification effectiveness of intrusion detection by using improved conditional variational autoencoder and deep neural network. Sensors. https://doi.org/10.3390/s19112528
47. Zhang, J., Ling, Y., Fu, X., Yang, X., Xiong, G., & Zhang, R. (2020). Model of the intrusion detection system based on the integration of spatial-temporal features. Computers and Security, 89, 101681. https://doi.org/10.1016/j.cose.2019.101681
48. Ugtakhbayar, N., Usukhbayar, B., & Baigaltugs S. (2020). A hybrid model for anomaly-based intrusion detection system, in Advances in Intelligent Information Hiding and Multimedia Signal Processing, Singapore, J.-S. Pan, J. Li, P.-W. Tsai, & L. C. Jain (Eds.), Springer Singapore, pp. 419–431.
49. Saheed, Y. K., Abdulganiyu, O. H., & Tchakoucht, T. A. (2023). A novel hybrid ensemble learning for anomaly detection in industrial sensor networks and scada systems for smart city infrastructures. Journal of King Saud University-Computer and Information Sciences, 35(5), 101532.
50. Kaur, S., & Singh, M. (2020). Hybrid intrusion detection and signature generation using deep recurrent neural networks. Neural Computing and Applications, 32(12), 7859–7877. https://doi.org/10.1007/s00521-019-04187-9
51. Maseno, E. M., Wang, Z., & Xing, H. (2022). A systematic review on hybrid intrusion detection system. Security and Communication Networks, 2022, 9663052. https://doi.org/10.1155/2022/9663052
52. Abolfathi, M., Shomorony, I., Vahid, A., & Jafarian, J. H. (2022). A Game-theoretically optimal defense paradigm against traffic analysis attacks using multipath routing and deception, Proceedings of the 27th ACM on symposium on access control models and technologies.
53. Kumar, V., Sinha, D., Das, A. K., Pandey, S. C., & Goswami, R. T. (2020). An integrated rule based intrusion detection system: analysis on UNSW-NB15 data set and the real time online dataset. Cluster Computing, 23(2), 1397–1418. https://doi.org/10.1007/s10586-019-03008-x
54. Thilagam, T., & Aruna, R. (2021). Intrusion detection for network based cloud computing by custom RC-NN and optimization. ICT Express, 7(4), 512–520. https://doi.org/10.1016/j.icte.2021.04.006
55. Kanna, P. R., & Santhi, P. (2021). Unified deep learning approach for efficient intrusion detection system using integrated spatial–temporal features. Knowledge-Based Systems, 226, 107132. https://doi.org/10.1016/j.knosys.2021.107132
56. ManoharNaik, S., & Geethanjali, N. (2016). A multi-fusion pattern matching algorithm for signature-based network intrusion detection system, Preprints, pp. 1–8, https://doi.org/10.20944/preprints201608.0197.v1.
57. Luo, G., Chen, Z., & Mohammed, B. O. (2022). A systematic literature review of intrusion detection systems in the cloud-based IoT environments. Concurrency and Computation: Practice and Experience, 34(10), e6822. https://doi.org/10.1002/cpe.6822
58. RM, S. P., et al. (2020). An effective feature engineering for DNN using hybrid PCA-GWO for intrusion detection in IoMT architecture. Computer Communications, 160, 139–149. https://doi.org/10.1016/j.comcom.2020.05.048
59. Abu Al-Haija, Q., & Al-Badawi, A. (2021). Attack-aware IoT network traffic routing leveraging ensemble learning. Sensors, 22(1), 241.
60. Kalavadekar, P. N., & Sane, S. S. (2019). Building an effective intrusion detection system using combined signature and anomaly detection techniques. International Journal Innovative Technology Explore Engineering, 8(10), 429.
61. Aldweesh, A., Derhab, A., & Emam, A. Z. (2020). Deep learning approaches for anomaly-based intrusion detection systems: A survey, taxonomy, and open issues. Knowledge-Based Systems, 189, 105124. https://doi.org/10.1016/j.knosys.2019.105124
62. Almutairi, A. H., & Abdelmajeed, N. T. (2017). Innovative signature based intrusion detection system: Parallel processing and minimized database. In 2017 International Conference on the Frontiers and Advances in Data Science (FADS), pp. 114–119, https://doi.org/10.1109/FADS.2017.8253208.
63. Yang, Z., et al. (2022). A systematic literature review of methods and datasets for anomaly-based network intrusion detection. Computers and Security, 116, 102675. https://doi.org/10.1016/j.cose.2022.102675
64. Abu Al-Haija, Q., & Al Badawi, A. (2022). High-performance intrusion detection system for networked UAVs via deep learning. Neural Computing and Applications, 34(13), 10885–10900. https://doi.org/10.1007/s00521-022-07015-9
65. Devan, P., & Khare, N. (2020). An efficient XGBoost–DNN-based classification model for network intrusion detection system. Neural Computing and Applications, 32(16), 12499–12514. https://doi.org/10.1007/s00521-020-04708-x
66. Rao, C. S., & Raju, K. B. (2019). Mapreduce accelerated signature-based intrusion detection mechanism (idm) with pattern matching mechanism. In Soft Computing in Data Analytics: Proceedings of International Conference on SCDA 2018 (pp. 157-164). Springer Singapore.
67. Abbasi, J. S., Bashir, F., Qureshi, K. N., ul Islam, M. N., & Jeon, G. (2021). Deep learning-based feature extraction and


optimizing pattern matching for intrusion detection using finite state machine. Computers and Electrical Engineering, 92, 107094.
68. Abu Al-Haija, Q., Al Badawi, A., & Bojja, G. R. (2022). Boost-defence for resilient IoT networks: A head-to-toe approach. Expert Systems, 39(10), e12934.
69. Liu, J., Gao, Y., & Hu, F. (2021). A fast network intrusion detection system using adaptive synthetic oversampling and LightGBM. Computers and Security, 106, 102289. https://doi.org/10.1016/j.cose.2021.102289
70. Shahraki, A., Abbasi, M., & Haugen, Ø. (2020). Boosting algorithms for network intrusion detection: A comparative evaluation of real AdaBoost, Gentle AdaBoost and modest AdaBoost. Engineering Applications of Artificial Intelligence, 94, 103770. https://doi.org/10.1016/j.engappai.2020.103770
71. Mazini, M., Shirazi, B., & Mahdavi, I. (2019). Anomaly network-based intrusion detection system using a reliable hybrid artificial bee colony and AdaBoost algorithms. Journal of King Saud University–Computer and Information Sciences, 31(4), 541–553. https://doi.org/10.1016/j.jksuci.2018.03.011
72. Ahmad, I., Ul Haq, Q. E., Imran, M., Alassafi, M. O., & AlGhamdi, R. A. (2022). An efficient network intrusion detection and classification system. Mathematics, 10(3), 530.
73. Kim, G., Lee, S., & Kim, S. (2014). A novel hybrid intrusion detection method integrating anomaly detection with misuse detection. Expert Systems with Applications, 41(4), 1690–1700. https://doi.org/10.1016/j.eswa.2013.08.066
74. Masdari, M., & Khezri, H. (2020). A survey and taxonomy of the fuzzy signature-based intrusion detection systems. Applied Soft Computing, 92, 106301. https://doi.org/10.1016/j.asoc.2020.106301
75. Meftah, S., Rachidi, T., & Assem, N. (2019). Network based intrusion detection using the UNSW-NB15 dataset. International Journal of Computing and Digital Systems, 8(5), 478–487.
76. Masdari, M., & Khezri, H. (2021). Towards fuzzy anomaly detection-based security: A comprehensive review. Fuzzy Optimization and Decision Making, 20(1), 1–49. https://doi.org/10.1007/s10700-020-09332-x
77. Ashfaq, R. A. R., Wang, X.-Z., Huang, J. Z., Abbas, H., & He, Y.-L. (2017). Fuzziness based semi-supervised learning approach for intrusion detection system. Information Sciences, 378, 484–497. https://doi.org/10.1016/j.ins.2016.04.019
78. Larijani, H., Ahmad, J., & Mtetwa, N. (2018, September). A novel random neural network based approach for intrusion detection systems. In 2018 10th Computer Science and Electronic Engineering (CEEC) (pp. 50-55). https://doi.org/10.1109/CEEC.2018.8674228
79. Tama, B. A., Comuzzi, M., & Rhee, K. (2019). TSE-IDS: A two-stage classifier ensemble for intelligent anomaly-based intrusion detection system. IEEE Access, 7, 94497–94507. https://doi.org/10.1109/ACCESS.2019.2928048
80. Abdulganiyu, O. H., Ait Tchakoucht, T., & Saheed, Y. K. (2023). A systematic literature review for network intrusion detection system (IDS). International Journal of Information Security. https://doi.org/10.1007/s10207-023-00682-2
81. Yerriswamy, T., & Murtugudde, G. (2021). An efficient algorithm for anomaly intrusion detection in a network. Global Transitions Proceedings, 2(2), 255–260.
82. Alazzam, H., Sharieh, A., & Sabri, K. E. (2020). A feature selection algorithm for intrusion detection system based on pigeon inspired optimizer. Expert Systems with Applications,
classifier for anomaly intrusion detection. Materials Today: Proceedings. https://doi.org/10.1016/j.matpr.2021.01.765
84. Liberati, A., et al. (2009). The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate healthcare interventions: explanation and elaboration. BMJ, 339, b2700. https://doi.org/10.1136/bmj.b2700
85. Kitchenham, B., & Brereton, P. (2013). A systematic review of systematic review process research in software engineering. Information and Software Technology, 55(12), 2049–2075. https://doi.org/10.1016/j.infsof.2013.07.010
86. Kitchenham, B. A., & Stuart, C. (2007). Guidelines for performing systematic literature reviews in software engineering, in EBSE Technical Report, Keele University and Durham University Joint Report, Report EBSE 2007–001, 2007. Available: https://www.elsevier.com/__data/promis_misc/525444systematicreviewsguide.pdf.
87. Zhao, H., Li, M., & Zhao, H. (2020). Artificial intelligence based ensemble approach for intrusion detection systems. Journal of Visual Communication and Image Representation, 71, 102736. https://doi.org/10.1016/j.jvcir.2019.102736
88. Abu Al-Haija, Q., & Zein-Sabatto, S. (2020). An efficient deep-learning-based detection and classification system for cyber-attacks in IoT communication networks. Electronics, 9(12), 2152.
89. Saheed, Y. K., Abiodun, A. I., Misra, S., Holone, M. K., & Colomo-Palacios, R. (2022). A machine learning-based intrusion detection for detecting internet of things network attacks. Alexandria Engineering Journal, 61(12), 9395–9409.
90. D’Agostini, G. (1995). A multidimensional unfolding method based on Bayes’ theorem. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 362(2), 487–498. https://doi.org/10.1016/0168-9002(95)00274-X
91. Box, G. E. P., & Tiao, G. C. (1973). Bayesian inference in statistical analysis. International Statistical Review, 43, 242.
92. Ng, A., & Jordan, M. (2001). On discriminative vs. generative classifiers: A comparison of logistic regression and naive bayes. Advances in neural information processing systems, 14.
93. Soucy, P., & Mineau, G. W. (2001). A simple KNN algorithm for text categorization. In Proceedings 2001 IEEE International Conference on Data Mining, pp. 647–648, https://doi.org/10.1109/ICDM.2001.989592.
94. Li, W., Yi, P., Wu, Y., Pan, L., & Li, J. (2014). A new intrusion detection system based on KNN classification algorithm in wireless sensor network. Journal of Electrical and Computer Engineering, 2014.
95. Kotsiantis, S. B. (2007) Supervised machine learning: A review of classification techniques, presented at the Proceedings of the 2007 conference on Emerging Artificial Intelligence Applications in Computer Engineering: Real Word AI Systems with Applications in eHealth, HCI, Information Retrieval and Pervasive Technologies.
96. Du, W., & Zhan, Z. (2002) Building decision tree classifier on private data, presented at the Proceedings of the IEEE international conference on Privacy, security and data mining - Volume 14, Maebashi City, Japan.
97. Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1(1), 81–106. https://doi.org/10.1007/BF00116251
98. Kotsiantis, S. B. (2013). Decision trees: A recent overview. Artificial Intelligence Review, 39(4), 261–283. https://doi.org/10.1007/s10462-011-9272-4
99. Loh, W.-Y. (2011). Classification and regression trees. WIREs Data Mining and Knowledge Discovery, 1(1), 14–23. https://doi.
148, 113249. https://doi.org/10.1016/j.eswa.2020.113249 org/10.1002/widm.8
83. Sona, A. S., & Sasirekha, N. (2021). Kulczynski indexed drag- 100. Goeschel, K. (2016). Reducing false positives in intrusion
onfly feature optimization based polytomous adaptive base detection systems using data-mining techniques utilizing support


vector machines, decision trees, and naive Bayes for off-line analysis. SoutheastCon, 2016, 1–6.
101. Deng, H., Runger, G., & Tuv, E. (2011). Bias of Importance Measures for Multi-valued Attributes and Solutions. In Artificial Neural Networks and Machine Learning – ICANN 2011, Berlin, Heidelberg, T. Honkela, W. Duch, M. Girolami, & S. Kaski (Eds.), Springer Berlin Heidelberg, pp. 293–300.
102. Tong, S., & Koller, D. (2001). Support vector machine active learning with applications to text classification. Journal of Machine Learning Research, 2, 45–66.
103. Miranda, C., Kaddoum, G., Bou-Harb, E., Garg, S., & Kaur, K. (2020). A collaborative security framework for software-defined wireless sensor networks. IEEE Transactions on Information Forensics and Security, 15, 2602–2615. https://doi.org/10.1109/TIFS.2020.2973875
104. Liu, Y., & Pi, D. (2017). A novel kernel SVM algorithm with game theory for network intrusion detection. KSII Transactions on Internet and Information Systems, 11, 4043–4060.
105. Hu, W., Liao, Y., & Vemuri, V. R. (2003). Robust support vector machines for anomaly detection in computer security. In ICMLA.
106. Cutler, D. R., et al. (2007). Random forests for classification in ecology. Ecology, 88(11), 2783–2792. https://doi.org/10.1890/07-0539.1
107. Buczak, A. L., & Guven, E. (2016). A survey of data mining and machine learning methods for cyber security intrusion detection. IEEE Communications Surveys and Tutorials, 18, 1153–1176.
108. Doshi, R., Apthorpe, N., & Feamster, N. (2018, May). Machine learning DDoS detection for consumer internet of things devices. In 2018 IEEE Security and Privacy Workshops (SPW) (pp. 29–35). IEEE.
109. Pal, N. R., Pal, K., Keller, J. M., & Bezdek, J. C. (2005). A possibilistic fuzzy c-means clustering algorithm. IEEE Transactions on Fuzzy Systems, 13, 517–530.
110. Moustafa, N., Ahmed, M., & Ahmed, S. (2020, December). Data analytics-enabled intrusion detection: Evaluations of ToN_IoT linux datasets. In 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom) (pp. 727–735). IEEE.
111. Abdi, H., & Williams, L. J. (2010). Principal component analysis. WIREs Computational Statistics, 2(4), 433–459. https://doi.org/10.1002/wics.101
112. Huang, G.-B., Zhu, Q.-Y., & Siew, C.-K. (2006). Extreme learning machine: Theory and applications. Neurocomputing, 70(1), 489–501. https://doi.org/10.1016/j.neucom.2005.12.126
113. LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. https://doi.org/10.1038/nature14539
114. Chen, X. W., & Lin, X. (2014). Big data deep learning: Challenges and perspectives. IEEE Access, 2, 514–525. https://doi.org/10.1109/ACCESS.2014.2325029
115. Ciresan, D. C., Meier, U., Masci, J., Gambardella, L. M., & Schmidhuber, J. (2011). Flexible, high performance convolutional neural networks for image classification. In Twenty-second international joint conference on artificial intelligence.
116. Chen, Y., Zhang, Y., & Maharjan, S. (2017). Deep learning for secure mobile edge computing. arXiv preprint arXiv:1709.08025.
117. Hermans, M., & Schrauwen, B. (2013). Training and analyzing deep recurrent neural networks. In NIPS 2013.
118. Pascanu, R., Gülçehre, Ç., Cho, K., & Bengio, Y. (2014). How to construct deep recurrent neural networks. CoRR, vol. abs/1312.6026.
119. Nweke, H. F., Teh, Y. W., Al-garadi, M. A., & Alo, U. R. (2018). Deep learning algorithms for human activity recognition using mobile and wearable sensor networks: State of the art and research challenges. Expert Systems with Applications, 105, 233–261. https://doi.org/10.1016/j.eswa.2018.03.056
120. Tang, T. A., Mhamdi, L., McLernon, D., Zaidi, S. A. R., & Ghogho, M. (2018). Deep recurrent neural network for intrusion detection in SDN-based networks. In 2018 4th IEEE Conference on Network Softwarization and Workshops (NetSoft), pp. 202–206. https://doi.org/10.1109/NETSOFT.2018.8460090
121. Yu, Y., Si, X., Hu, C., & Zhang, J. (2019). A review of recurrent neural networks: LSTM cells and network architectures. Neural Computation, 31(7), 1235–1270. https://doi.org/10.1162/neco_a_01199
122. Gers, F. A., Schmidhuber, J., & Cummins, F. (2000). Learning to forget: Continual prediction with LSTM. Neural Computation, 12(10), 2451–2471.
123. Bai, S., Kolter, J. Z., & Koltun, V. (2018). An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv preprint arXiv:1803.01271.
124. Tschannen, M., Bachem, O., & Lucic, M. (2018). Recent advances in autoencoder-based representation learning. arXiv preprint arXiv:1812.05069.
125. Hinton, G. E. (2012). A practical guide to training restricted Boltzmann machines. In Neural Networks: Tricks of the Trade: Second Edition (pp. 599–619). Berlin, Heidelberg: Springer Berlin Heidelberg.
126. Mayuranathan, M., Murugan, M., & Dhanakoti, V. (2021). Best features based intrusion detection system by RBM model for detecting DDoS in cloud environment. Journal of Ambient Intelligence and Humanized Computing, 12, 3609–3619.
127. Fiore, U., Palmieri, F., Castiglione, A., & Santis, A. D. (2013). Network anomaly detection with the restricted Boltzmann machine. Neurocomputing, 122, 13–23. https://doi.org/10.1016/j.neucom.2012.11.050
128. Keyvanrad, M. A., & Homayounpour, M. M. (2014). A brief survey on deep belief networks and introducing a new object oriented toolbox (DeeBNet). arXiv preprint arXiv:1408.3264.
129. Dietterich, T. G. (2000). Ensemble methods in machine learning. In International workshop on multiple classifier systems (pp. 1–15). Berlin, Heidelberg: Springer Berlin Heidelberg.
130. Woźniak, M., Graña, M., & Corchado, E. (2014). A survey of multiple classifier systems as hybrid systems. Information Fusion, 16, 3–17. https://doi.org/10.1016/j.inffus.2013.04.006
131. Illy, P., Kaddoum, G., Moreira, C. M., Kaur, K., & Garg, S. (2019). Securing fog-to-things environment using intrusion detection system based on ensemble learning. In 2019 IEEE wireless communications and networking conference (WCNC) (pp. 1–7). IEEE.
132. Domingos, P. M. (2012). A few useful things to know about machine learning. Communications of the ACM, 55, 78–87.
133. Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123–140. https://doi.org/10.1007/BF00058655
134. Baba, N. M., Makhtar, M., Fadzli, S. A., & Awang, M. K. (2015). Current issues in ensemble methods and its applications. Journal of Theoretical & Applied Information Technology, 81(2).
135. Santana, L. E., Silva, L., Canuto, A. M., Pintro, F., & Vale, K. O. (2010). A comparative analysis of genetic algorithm and ant colony optimization to select attributes for an heterogeneous ensemble of classifiers. In IEEE congress on evolutionary computation (pp. 1–8). IEEE.
136. Bosman, H. H. W. J., Iacca, G., Tejada, A., Wörtche, H. J., & Liotta, A. (2015). Ensembles of incremental learners to detect anomalies in ad hoc sensor networks. Ad Hoc Networks, 35, 14–36.
137. Abu Al-Haija, Q., & Al-Dala’ien, M. A. (2022). ELBA-IoT: An ensemble learning model for botnet attack detection in IoT networks. Journal of Sensor and Actuator Networks, 11(1), 18.


138. Aho, A. V., & Corasick, M. J. (1975). Efficient string matching. Communications of the ACM, 18, 333–340.
139. Alicherry, M., Muthuprasanna, M., & Kumar, V. (2006, November). High speed pattern matching for network IDS/IPS. In Proceedings of the 2006 IEEE International Conference on Network Protocols (pp. 187–196). IEEE.
140. Knuth, D. E., Morris, J. H., & Pratt, V. R. (1977). Fast pattern matching in strings. SIAM Journal on Computing, 6, 323–350.
141. Wu, S., & Manber, U. (1994). A fast algorithm for multi-pattern searching (pp. 1–11). Tucson, AZ: University of Arizona. Department of Computer Science.
142. Boyer, R. S., & Moore, J. S. (1977). A fast string searching algorithm. Communications of the ACM, 20(10), 762–772. https://doi.org/10.1145/359842.359859
143. Asharf, J., Moustafa, N., Khurshid, H., Debie, E., Haider, W., & Wahab, A. (2020). A review of intrusion detection systems using machine and deep learning in internet of things: Challenges, solutions and future directions. Electronics, 9(7), 1177.

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Oluwadamilare Harazeem Abdulganiyu earned a master’s degree in computer science from the Department of Computer Science, Bayero University Kano. He also received his Bachelor’s degree in Computer Science from Al-Hikmah University Ilorin. He has worked as a System Analyst for about 8 years and is currently a lecturer at the University of Maiduguri, Nigeria. He is currently pursuing his Ph.D. degree in the School of Engineering and Artificial Intelligence, EuroMed University of Fes, Morocco. His research interests include Artificial Intelligence, Cyber Security and Data Mining.

Taha Ait Tchakoucht was born in Rabat, Morocco. He received an Engineering degree in computer science from the National Superior School of Mines of Rabat in 2012 and worked as a Software Engineer before receiving his Ph.D. degree in 2018 from the Faculty of Sciences and Techniques (FST) of Tangier, in "Artificial Intelligence". Currently he is an Assistant Professor at Euromed University of Fes. Artificial intelligence and cybersecurity are the main subjects of his research. He has a significant body of work in recognized journals both domestically and internationally.

Yakub Kayode Saheed (Member, IEEE) received his PhD degree in computer science from Kwara State University, Malete, Nigeria. He is currently an assistant professor with the School of IT & Computing, American University of Nigeria. He is co-founder of Kaptain Machine Learning Lab, where he leads the machine learning lab. He is a member of the Internet Society, IAENG, and SDWIC. He has published in high-impact international journals and conference proceedings.
