CAPSTONE PROJECT 2
Implement IDS system integrating machine
learning for Hai Dang Travel company
C2NE.02
1
OUR TEAM
Mentor
Assoc. Prof.
Nhu, Nguyen Gia
Leader Hoang,
Hieu, Le Khai, Tran
Vu, Duong Duong
Quang Dinh
The Ngoc
2
TABLE OF CONTENTS
01 INTRODUCTION
02 PROJECT OBJECTIVES
03 OPERATION DIAGRAM
04 DATA PROCESSING
05 DEMO
06 CONCLUSION
3
01
Introduction
4
INTRODUCTION
The growing number and variety of networked devices inevitably widens the attack surface, while the impact of successful attacks is becoming increasingly severe as these devices assume more critical responsibilities.
5
HAI DANG TRAVEL
COMPANY
6
HAI DANG TRAVEL
COMPANY
Services
Tour in country
Overseas tour
Group tour
Event
Study abroad
Visa
Customer satisfaction is our success
7
!!!
PROBLEM
Upgrade your network to warn about
and prevent attacks
8
THEM
They need a way for network administrators
to receive alerts about attacks.
US
The problem we faced was that they wanted
to keep to a moderate budget.
9
SOLUTION
We came up with a solution to deploy an IDS
system with machine learning to detect and
prevent attacks.
10
02
PROJECT OBJECTIVES
The goal of the project is to fulfill the requirements
of the customer.
11
PROJECT
OBJECTIVES
Research new approaches to intrusion detection that do not depend on signatures.
Build an Intrusion Detection System.
Build a Machine Learning Model.
Prevent intrusions.
12
PRODUCT
OVERVIEW
IDS
An intrusion detection system (IDS) is a device or software application that monitors a network or systems for malicious activity or policy violations.
MACHINE LEARNING
Machine learning is the study of computer algorithms that improve automatically through experience. It is seen as a subset of artificial intelligence.
DATASET
A data set consists of roughly two components. The two components are rows and columns. Additionally, a key feature of a data set is that it is organized so that each row contains one observation.
13
03
OPERATION DIAGRAM
14
Company local network diagram
Hai Dang Travel
Network diagram
15
An overview of Logical Network
Diagram
Network diagram with the appearance of IDS
16
Intrusion Detection System Operation
How does the IDS work?
17
Intrusion Detection System Operation
How does the Machine Learning Model work?
Machine Learning Model Operation
18
04
DATA PROCESSING
19
DATA PROCESSING
Datasets
CSE-CIC-IDS2018 dataset, provided by the Canadian Institute for Cybersecurity.
Realistic background traffic was generated and different attack scenarios were conducted.
The dataset contains both benign network traffic and captures of the most common network attacks.
Ten days of operation inside a controlled network environment on AWS.
20
DATA PROCESSING
Datasets Overview
21
Number of flows per attack type
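The per-attack-type counts shown on this slide can be reproduced with a simple tally over the label column. The sketch below is only an illustration: the `labels` list is a hypothetical stand-in for the label column of the dataset files, and the function name is ours.

```python
from collections import Counter

# Hypothetical labels, one per flow record (not real dataset values).
labels = ["Benign", "Benign", "DoS", "Benign", "Bot", "DoS"]

def flows_per_label(labels):
    """Return (label, count) pairs sorted from most to least frequent."""
    return Counter(labels).most_common()

counts = flows_per_label(labels)
for label, n in counts:
    print(f"{label}: {n}")
```

A tally like this makes class imbalance visible before any resampling is applied.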
22
DATA PROCESSING
Datasets Problems
23
Data cleaning and feature engineering
Remove duplicate headers
Replace infinity values with the mean
Drop all null and negative values
Upsample data to 100000 samples per attack category
Remove strongly correlated features
Scale the data using a Standard Scaler
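Several of the cleaning steps listed above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the project's actual pipeline, which would more likely rely on a dataframe library and a library scaler. All function names, the toy rows, and the tiny `per_class=3` target (the slide states 100000) are assumptions made for the example.

```python
import math
import random
from statistics import mean, pstdev

def replace_inf_with_mean(rows):
    """Replace +/-inf in each column with the mean of that column's finite values."""
    out = [list(r) for r in rows]
    for j in range(len(rows[0])):
        finite = [r[j] for r in rows if r[j] is not None and math.isfinite(r[j])]
        col_mean = mean(finite)  # assumes each column has at least one finite value
        for r in out:
            if r[j] is not None and math.isinf(r[j]):
                r[j] = col_mean
    return out

def drop_null_and_negative(rows, labels):
    """Keep only rows whose values are all present, not NaN, and non-negative."""
    kept = [(r, y) for r, y in zip(rows, labels)
            if all(v is not None and not math.isnan(v) and v >= 0 for v in r)]
    return [r for r, _ in kept], [y for _, y in kept]

def upsample(rows, labels, per_class, seed=0):
    """Sample with replacement so every label ends up with exactly per_class rows."""
    rng = random.Random(seed)
    by_label = {}
    for r, y in zip(rows, labels):
        by_label.setdefault(y, []).append(r)
    out_rows, out_labels = [], []
    for y, group in by_label.items():
        out_rows += rng.choices(group, k=per_class)
        out_labels += [y] * per_class
    return out_rows, out_labels

def standard_scale(rows):
    """Scale each column to zero mean and unit variance, as a standard scaler does."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) or 1.0 for c in cols]  # constant columns are left unscaled
    return [[(v - m) / s for v, m, s in zip(r, mus, sigmas)] for r in rows]

# Toy data (hypothetical): two feature columns, five flows.
rows = [[1.0, 10.0], [float("inf"), 20.0], [3.0, None], [-1.0, 30.0], [5.0, 40.0]]
labels = ["Benign", "Benign", "Malicious", "Benign", "Malicious"]

filled = replace_inf_with_mean(rows)
kept_rows, kept_labels = drop_null_and_negative(filled, labels)
bal_rows, bal_labels = upsample(kept_rows, kept_labels, per_class=3)
scaled = standard_scale(bal_rows)
print(len(kept_rows), "rows kept;", len(scaled), "rows after resampling")
```

Note that sampling with replacement to a fixed size also shrinks any class that starts out larger than the target, so the same helper balances classes in both directions.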
24
Remove strongly correlated features
Before After
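One common way to remove strongly correlated features is to compute pairwise Pearson correlations and keep a feature only if it is not highly correlated with any feature already kept. The sketch below illustrates that idea; the 0.9 threshold, the function names, and the toy columns (including the made-up `flow_duration_copy`) are assumptions, since the slides do not state how the filtering was done.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column has no defined correlation; treat as 0
    return cov / sqrt(vx * vy)

def drop_correlated(columns, threshold=0.9):
    """Return the names of columns to keep, scanning in insertion order."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Toy columns (hypothetical values); the second is an exact multiple of the first.
columns = {
    "flow_duration": [1, 2, 3, 4, 5],
    "flow_duration_copy": [2, 4, 6, 8, 10],
    "down_up_ratio": [3, 1, 4, 1, 5],
}
kept = drop_correlated(columns)
print(kept)
```

Which member of a correlated pair survives depends on column order, so a real pipeline may want a more deliberate rule for choosing between them.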
25
Machine Learning
Gradient Boosting
26
down_up_ratio active_mean flow_pkts_s flow_duration label
0.714285714 0 1396.084766 8821.249008 Malicious
1.116666667 2415745.579 1.98322469 64048571.59 Benign
Build Decision Tree from data
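To make the idea of boosting small decision trees concrete, here is a minimal sketch of gradient boosting with one-split trees (stumps) and squared-error loss, where each stump is fitted to the residuals left by the previous ones. It is a simplification: classifiers usually boost on a log-loss objective, and the project's actual model and hyperparameters are not given in the slides. The function names, `n_rounds`, and `learning_rate` are assumptions; the two rows are the ones shown in the table above, which is far too little data for a meaningful model.

```python
def fit_stump(X, residuals):
    """Find the single-feature threshold split that minimizes squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lmean, rmean = sum(left) / len(left), sum(right) / len(right)
            err = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, t, lmean, rmean)
    if best is None:  # no feature varies: fall back to a constant prediction
        m = sum(residuals) / len(residuals)
        return (0, float("inf"), m, m)
    return best[1:]

def stump_predict(stump, row):
    j, t, left_value, right_value = stump
    return left_value if row[j] <= t else right_value

def fit_gradient_boosting(X, y, n_rounds=20, learning_rate=0.3):
    """Start from the mean label, then repeatedly fit stumps to the residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]  # negative gradient
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, row)
                 for p, row in zip(preds, X)]
    return base, stumps, learning_rate

def predict_score(model, row):
    base, stumps, lr = model
    return base + lr * sum(stump_predict(s, row) for s in stumps)

# Columns: down_up_ratio, active_mean, flow_pkts_s, flow_duration
X = [
    [0.714285714, 0, 1396.084766, 8821.249008],            # Malicious
    [1.116666667, 2415745.579, 1.98322469, 64048571.59],   # Benign
]
y = [1, 0]  # 1 = Malicious, 0 = Benign

model = fit_gradient_boosting(X, y)
for row in X:
    score = predict_score(model, row)
    print("Malicious" if score > 0.5 else "Benign", round(score, 3))
```

Each round only corrects part of the remaining error (scaled by the learning rate), which is why many weak trees are combined rather than fitting one large tree.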
28
EXAMPLE
GRADIENT
BOOSTING
29
PRODUCT DEMO
Demo Diagram
31
CONCLUSION
In this project, we did our best and completed the work. However, some issues remain and will need to be addressed in later updates. In addition, our project has received many positive contributions from international friends through GitHub.
32