What is Anomaly Detection
Mr Hew Ka Kian
hew_ka_kian@rp.edu.sg
What is Anomaly Detection
• Anomaly detection (also outlier detection) is the identification of rare items,
events or observations which raise suspicions by differing significantly from the
majority of the data.
• The anomalous items may
translate to some kind of problem such
as bank fraud, network breach, a structural
defect, medical problems or errors in a text.
• Anomalies are also referred to as outliers,
novelties, noise, deviations and exceptions.
• Anomaly detection applied to unlabeled data is known as unsupervised anomaly
detection, although supervised anomaly detection is possible with labeled data
Source: https://en.wikipedia.org/wiki/Anomaly_detection
Anomaly Types
• Anomaly is a broad concept, which may refer to many different types
of events in time series.
• A spike in value, a shift in volatility, etc. could all be anomalous or
normal, depending on the specific context.
Time series data anomaly detection
• Successful anomaly detection hinges on an ability to accurately analyze time series data in real
time.
• Time series data is composed of a sequence of values over time. That means each point is
typically a pair of two items — a timestamp for when the metric was measured, and the value of
that metric.
• Time series data anomaly detection can be used for valuable metrics such as:
seismic readings, virus infection cases, power generator output, transaction
volume, login attempts and mobile app installs.
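As a minimal sketch (the dates and values below are made up for illustration), such a series can be held as a pandas Series whose index carries the timestamps and whose values carry the metric:

import pandas as pd

# Each point pairs a timestamp (the index) with the value of the metric
s = pd.Series(
    [21.5, 22.0, 35.7, 21.8],
    index=pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04"]),
    name="temperature",
)
print(s)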
Univariate vs. Multivariate
Univariate
• Looking at one variable.
• If we want to look at anomalous weather patterns, univariate anomaly detection
will measure a single indicator, such as temperature. We can then ask questions
like "is this temperature strange for this region?"
Multivariate
• Need to consider multiple factors and the relationship between them.
• If we want to look at anomalous weather patterns, multivariate analysis will
consider a host of factors, like precipitation, humidity and air pressure.
Anomaly Detection Toolkit
Anomaly Detection Toolkit (ADTK) is a Python package for unsupervised time
series anomaly detection.
This package offers a set of functions that make training on a dataset and
detecting anomalies easier to code.
It also provides functions to process and visualize time series and
anomaly events.
ADTK is open source; its code and many examples of how to use the
package are at https://github.com/arundo/adtk
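For orientation, the parts of ADTK used in the rest of this deck come from three submodules (these exact imports appear in the exercises that follow):

from adtk.data import validate_series        # check and clean a time series before detection
from adtk.detector import ThresholdAD        # detectors (ThresholdAD, QuantileAD, ...) live here
from adtk.visualization import plot          # plot a series together with flagged anomalies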
ADTK Anomaly Types
• ADTK can detect a point anomaly where there is a data point whose value is
significantly different from others.
• An outlier point in a time series is one that exceeds the normal range of the
series.
• To detect outliers, the normal range of time series values (baseline) is what a
detector needs to learn.
ADTK Anomaly Types
• Spike and Level Shift: In some situations, whether a time point is
normal depends on whether its value is aligned with its near past.
• An abrupt increase or decrease of value is called a spike if the change
is temporary.
• However, we should use ADTK to detect a level shift if the change is
permanent.
Source: https://cloud.google.com/ai-platform/docs/ml-solutions-overview
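As a hedged sketch of how these two cases map onto ADTK detectors (the parameter values are illustrative, not tuned): PersistAD flags temporary spikes against the recent past, while LevelShiftAD flags permanent shifts by comparing two adjacent windows.

from adtk.detector import PersistAD, LevelShiftAD

# Spike: a value deviates from its near past and then returns to normal
persist_ad = PersistAD(c=3.0, side='both')
# Level shift: the values of two adjacent 5-point windows differ persistently
level_shift_ad = LevelShiftAD(c=6.0, side='both', window=5)

# Assuming s is a validated series (validate_series is covered in Exercise D):
# spike_anomalies = persist_ad.fit_detect(s)
# shift_anomalies = level_shift_ad.fit_detect(s)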
Workflow: Source data
(Workflow steps: Source data → Prepare data → Select the algorithm → Train the model → Test the model → Use the model for inference)
• Data can come from places you have access to, like credit card transactions if you work in a bank.
• You can also get your hands on privileged data through a commercial arrangement, like paying for the data.
• There are also plenty of open (public) datasets shared with the public for free:
• https://kaggle.com is an online community for machine learning enthusiasts and it has many open datasets.
• https://data.gov.sg was first launched in 2011 as the government's one-stop portal to its publicly-available datasets from 70 public agencies. To date, more than 100 apps have been created using the government's open data.
Workflow: Prepare data
Filter the data
• The rows of interest, like those above 65 years old
• The columns of interest, basically just the datetime and the feature columns
Transform
• Combine multiple sources
• Extract features by applying mathematical functions, like a moving average of 20 data points
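A minimal pandas sketch of these preparation steps (the file name, column names and age threshold are made up for illustration):

import pandas as pd

df = pd.read_csv("data.csv", index_col="datetime", parse_dates=True)  # hypothetical source file

# Filter: the rows and columns of interest
df = df[df["age"] > 65]            # e.g. only rows above 65 years old
df = df[["reading"]]               # keep just the feature column (datetime is the index)

# Transform: extract a feature with a mathematical function,
# e.g. a moving average over 20 data points
df["reading_ma20"] = df["reading"].rolling(window=20).mean()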
Workflow: Select the algorithm
What type of anomaly? Global (point) anomalies, contextual anomalies or collective anomalies.
• ADTK threshold detector detects values outside certain threshold values
• ADTK quantile detector detects values outside a certain percentile
• ADTK volatility shift detector detects a shift of volatility by comparing 2 windows of values
• ADTK Seasonal detector detects departure from a repeating pattern
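As a sketch, the four detectors named above correspond roughly to these ADTK classes (the constructor arguments are illustrative and taken from the exercises below, except SeasonalAD which uses its defaults):

from adtk.detector import ThresholdAD, QuantileAD, VolatilityShiftAD, SeasonalAD

threshold_ad = ThresholdAD(high=30, low=15)         # values outside a fixed baseline range
quantile_ad = QuantileAD(high=0.99, low=0.01)       # values outside learned percentiles
volatility_shift_ad = VolatilityShiftAD(window=30)  # shift of volatility between 2 adjacent windows
seasonal_ad = SeasonalAD()                          # departure from a repeating (seasonal) pattern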
Workflow: Train and Test
To train the model using ADTK, call the fit(df) function.
To test, or to use the model to detect anomalies, call the detect(df) function.
fit_detect(df) is a convenience function that trains and then detects in one step.
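A short sketch of the two equivalent call patterns, assuming s is a validated series and quantile_ad is the QuantileAD detector used in Exercise D below:

# Train first, then detect, as two separate calls
quantile_ad.fit(s)                    # learn the normal range from the series
anomalies = quantile_ad.detect(s)     # flag points outside the learned range

# Or do both in one step
anomalies = quantile_ad.fit_detect(s)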
Exercise D
Pandas is a fast and powerful Python library for data
manipulation.
Import the library
• import pandas as pd
Read the content of the comma separated values (CSV) into a
Pandas Series object
• s = pd.read_csv('dataset.csv', index_col="pr_date",
parse_dates=True, infer_datetime_format=True)
• index_col is the column that holds the datetime
• infer_datetime_format=True tries to guess the date format
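One caveat, not stated on the slide: pd.read_csv returns a DataFrame, not a Series, and infer_datetime_format is deprecated in pandas 2.0+. A hedged variant that yields a true Series:

import pandas as pd

# read_csv gives a DataFrame indexed by the datetime column
df = pd.read_csv('dataset.csv', index_col="pr_date", parse_dates=True)
# If only one value column remains, squeeze it into a Series
s = df.squeeze("columns")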
Exercise D
Use ADTK's validate_series(df) to check for errors in the Series
Import the library
• from adtk.data import validate_series
Validate the Series
• s = validate_series(s)
Exercise D Detecting Simple Threshold
Use ThresholdAD to detect outliers (point anomalies) that exceed the
baseline threshold
Import the library
• from adtk.detector import ThresholdAD
Create the ThresholdAD object with the high and low threshold values. Values above the
high or below the low threshold are flagged as anomalies
• threshold_ad = ThresholdAD(high=30, low=15)
Detect anomalies
• anomalies = threshold_ad.detect(s)
Exercise D
Use the plot() function to visualize the graph
Import the library
• from adtk.visualization import plot
Plot the graph with the DataFrame and anomalies. We can specify the anomaly
marker colour and set the anomaly tag to "marker" (a dot on the graph)
• plot(s, anomaly=anomalies, anomaly_color='red',
anomaly_tag="marker");
Exercise D Percentile
• Percentile: the value below which a percentage of data falls.
Source: https://www.mathsisfun.com/data/percentiles.html
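A small worked example (made-up numbers): for the values 1 to 10, the 90th percentile is the value below which 90% of the data falls.

import pandas as pd

values = pd.Series(range(1, 11))   # 1, 2, ..., 10
print(values.quantile(0.90))       # 9.1 under pandas' default linear interpolation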
Exercise D Percentile
Use QuantileAD to detect outliers (point anomalies) that fall outside
certain percentiles of the series
Import the library
• from adtk.detector import QuantileAD
Create the object with the high and low percentiles; values outside
this boundary are considered anomalous
• quantile_ad = QuantileAD(high=0.99, low=0.01)
The detector needs to be trained on the range before detection. fit_detect() does this
in one step
• anomalies = quantile_ad.fit_detect(s)
Exercise D
Use VolatilityShiftAD() to detect a change in the volatility in the
time series
Import the library
• from adtk.detector import VolatilityShiftAD
VolatilityShiftAD() compares the volatility between 2 adjacent windows. We
have to create the object specifying how many time points the windows contain
• volatility_shift_ad = VolatilityShiftAD(window=30)
Exercise E
• Read tsla.us.txt csv file and print the content.
s = pd.read_csv('data/tsla.us.txt',
index_col="Date", parse_dates=True,
infer_datetime_format=True)
print(s)
• What are the columns?
Open, High, Low, Close, Volume, OpenInt
Exercise E
• Write the code to drop all columns except the Date and Volume columns.
s = s.drop(['High','Low','Open','Close','OpenInt'],axis=1)
• Write the code to detect the volatility shift with window of 60 values
s = validate_series(s)
volatility_shift_ad = VolatilityShiftAD(window=60)
anomalies = volatility_shift_ad.fit_detect(s)
• Plot the graph. Do you get something like the graph shown in the worksheet, which
detects the start of increased volatility?
plot(s, anomaly=anomalies, anomaly_color='red');
• What did you find that was the likely reason for the anomaly? Bear in mind that an anomaly
does not have to be bad, simply something that deviates from the standard or norm.
• Tesla announced that it was becoming profitable in 2013.
Exercise F
• Open the file weekly-infectious-disease-bulletin-cases.csv using Excel.
• What are the columns?
epi_week: week of the year
disease: disease
no._of_cases: number of cases
Exercise F
• How do we filter for the rows with ‘Dengue Fever’?
infectious = infectious[infectious['disease'] == 'Dengue Fever']
• Set the epi_week as the index column.
infectious = infectious.set_index('epi_week')
• Drop the disease column.
infectious = infectious.drop('disease',axis=1)
Student Activity
In Exercise G, we are going to do the whole ML workflow:
1. Source the data at Data.gov.sg
2. Prepare the data
• Change the datetime format
• Choose a disease to examine and filter out the irrelevant columns
3. Select the algorithm – shift in volatility
4. Train and use the model for inference
Exercise G
• Acute Upper Respiratory Tract infections
import pandas as pd
from adtk.detector import VolatilityShiftAD
from adtk.visualization import plot

infectious = pd.read_csv(
    'data/average-daily-polyclinic-attendances-for-selected-diseases.csv')
# Build a datetime index from the epi_week column (e.g. "2012-W01" -> Monday of that week)
infectious['epi_week'] = pd.to_datetime(
    infectious['epi_week'] + '-1 00:00:00', format='%Y-W%W-%w %H:%M:%S')
# Keep only the rows for the chosen disease, then drop the disease column
infectious = infectious[
    infectious['disease'] == 'Acute Upper Respiratory Tract infections']
infectious = infectious.set_index('epi_week')
infectious = infectious.drop('disease', axis=1)
# Detect volatility shifts by comparing adjacent windows of 52 weeks
volatility_shift_ad = VolatilityShiftAD(window=52)
anomalies = volatility_shift_ad.fit_detect(infectious)
plot(infectious, anomaly=anomalies, anomaly_color='red');
Problem Solution
Is there any abrupt change in the trend of visitors to Singapore?
Problem Solution
• Source data
• Get the visitor-international-arrivals-to-singapore-by-region-monthly.csv from Data.gov.sg
• The datetime column is well defined, so there is no need to modify it
s = pd.read_csv('data/visitor-international-arrivals-to-singapore-by-region-monthly.csv',
index_col="month", parse_dates=True)
• Filter for a region like Africa
s = s[s['region']=='Africa']
• Drop the region column
s = s.drop(['region'],axis=1)
• Train and inference
s = validate_series(s)
volatility_shift_ad = VolatilityShiftAD(window=12)
anomalies = volatility_shift_ad.fit_detect(s)
• Plot
plot(s, anomaly=anomalies, anomaly_color='red')
Exercise H
• Any anomaly?
Yes, around 2002-2003 (depending on the region)