Decreased Anomaly Score by Repeated Sequence (DASRS)

An Anomaly Detection Algorithm for Time Series

The DASRS algorithm identifies and counts the sequences of normalized values that appear in a time series and generates an anomaly score as a function of the number of times it has seen each sequence. We normalize observed values to limit the number of distinct sequences without changing the main characteristics of the time series. Because normalization reduces the number of distinct sequences, it improves the performance of the anomaly detection algorithm. The first time DASRS identifies a given sequence, the returned score is as high as possible because the algorithm interprets the sequence as a new behavior. On later occurrences, the returned score decreases as the number of times the sequence has been found increases.

Sequence

Let X_t be a time series with the observations x₁, x₂, ... . A sequence of X_t is a subset of X_t consisting of consecutive elements, for example, x_i, ..., x_j, with i < j.
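As a minimal sketch of the definition above, the snippet below lists every run of consecutive observations of a given length. The function name `sliding_sequences` and the fixed `length` parameter are assumptions made for illustration; this overview does not state how DASRS chooses the sequence length.

```python
from typing import List, Sequence, Tuple


def sliding_sequences(values: Sequence[float], length: int) -> List[Tuple[float, ...]]:
    """Return every run of `length` consecutive observations, in order.

    `length` is a hypothetical parameter: the definition only requires
    consecutive elements x_i, ..., x_j with i < j.
    """
    if length < 2:
        raise ValueError("length must be at least 2 so that i < j")
    # Each window starts at index k and covers values[k], ..., values[k + length - 1].
    return [tuple(values[k:k + length]) for k in range(len(values) - length + 1)]


if __name__ == "__main__":
    print(sliding_sequences([3, 3, 4, 7], 2))
```

Tuples are used so that each sequence is hashable and can later serve as a dictionary key when counting occurrences.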

Normalization

The normalization applied by DASRS consists of transforming the observation value into an integer between 0 and a normalization factor that we call θ. The equation below represents the operations performed on x_i to get its normalized value (x_i'):

Where x_i is the input observation (x_i ∈ ℝ), min_X and max_X are, respectively, the smallest and largest possible observation values of X_t, θ is the normalization factor, and x_i' is the normalized value of x_i, with x_i' ∈ ℕ and 0 ≤ x_i' ≤ θ.
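The following sketch shows one way to map an observation to an integer between 0 and θ that is consistent with the symbols defined above. It assumes min-max scaling followed by rounding to the nearest integer; the exact rounding rule used by DASRS is not given in this overview, so treat `normalize` and its clamping of out-of-range inputs as illustrative choices rather than the project's implementation.

```python
def normalize(x: float, min_x: float, max_x: float, theta: int) -> int:
    """Map x from [min_x, max_x] to an integer in [0, theta].

    Assumption: min-max scaling, then Python's built-in round().
    The rounding rule is a guess, not taken from the DASRS source.
    """
    if max_x <= min_x:
        raise ValueError("max_x must be greater than min_x")
    # Clamp so that observations outside the declared range stay in [0, theta].
    clipped = min(max(x, min_x), max_x)
    return round((clipped - min_x) / (max_x - min_x) * theta)


if __name__ == "__main__":
    # CPU usage in percent, normalized with theta = 7.
    print([normalize(v, 0.0, 100.0, 7) for v in (0.0, 30.0, 62.0, 100.0)])
```

With θ = 7, every CPU reading falls into one of eight integer levels, which is what keeps the number of distinct sequences small.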

The graphs below illustrate, respectively, a time series with observations of CPU usage and its equivalent normalized time series, with θ = 7.

Original Values	Normalized Values

Raw Score

We calculate the raw anomaly score, taking into account the current normalized sequence and the number of times that sequence appeared in the past. The raw anomaly score is given by the following equation:

Where occurrences is the number of times the current sequence has appeared so far.
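The counting step can be sketched as below. The scoring function used here, `1 / occurrences`, is a placeholder and not the DASRS formula: it was chosen only because it is maximal on the first occurrence of a sequence and decreases as the count grows, which matches the behavior described in the text. The name `raw_scores` is likewise hypothetical.

```python
from collections import Counter
from typing import Hashable, Iterable, List


def raw_scores(sequences: Iterable[Hashable]) -> List[float]:
    """Score each sequence by how often it has been seen so far.

    Placeholder scoring: 1 / occurrences (NOT the published DASRS equation).
    """
    counts: Counter = Counter()
    scores: List[float] = []
    for seq in sequences:
        counts[seq] += 1  # the current appearance is included in the count
        scores.append(1.0 / counts[seq])
    return scores


if __name__ == "__main__":
    stream = [(1, 2), (1, 2), (3, 4), (1, 2)]
    print(raw_scores(stream))
```

A new sequence always scores 1.0 under this placeholder, and a sequence seen for the n-th time scores 1/n.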

The equation above does not produce the final anomaly score. As explained in "Real-Time Anomaly Detection for Streaming Analytics" and "Unsupervised real-time anomaly detection for streaming data", the datasets under analysis often exhibit unpredictable behavior caused by noise or by the random nature of some metrics, which generates a large number of false positives.

To address this, we developed two versions of DASRS: DASRS Rest and DASRS Likelihood.

The graph below shows the raw anomaly scores calculated by DASRS.

Raw Anomaly Scores

DASRS Rest

DASRS Rest defines a period, starting when an anomaly is identified, during which the final anomaly score is smoothed. DASRS Rest treats anomalies that occur very close together as part of the same phenomenon. Therefore, after identifying an anomaly, it decreases the following scores for a period defined by the restPeriod parameter.
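A rough sketch of the rest-period idea follows. Only `rest_period` corresponds to a parameter named in the text (restPeriod); the `threshold` that decides what counts as an anomaly and the `damping` factor applied during the rest period are assumptions introduced for illustration, since this overview does not say how much the scores are decreased.

```python
from typing import Iterable, List


def apply_rest(raw: Iterable[float], rest_period: int,
               threshold: float = 1.0, damping: float = 0.5) -> List[float]:
    """Dampen the scores that follow an anomaly for `rest_period` steps.

    `threshold` and `damping` are hypothetical; only the rest period
    itself is described in the README.
    """
    final: List[float] = []
    rest_left = 0  # number of upcoming scores that are still to be dampened
    for score in raw:
        if rest_left > 0:
            final.append(score * damping)
            rest_left -= 1
        else:
            final.append(score)
            if score >= threshold:
                rest_left = rest_period  # an anomaly starts a new rest period
    return final


if __name__ == "__main__":
    print(apply_rest([1.0, 1.0, 0.5, 1.0, 0.25], rest_period=2))
```

In this sketch a high score that arrives during a rest period is dampened and does not restart the period, which reflects the idea that nearby anomalies belong to the same phenomenon.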

The graph below shows the final anomaly scores calculated by DASRS Rest.

DASRS Rest Anomaly Scores

DASRS Likelihood

We based the DASRS Likelihood version on the NuPIC library. DASRS Likelihood uses the anomaly likelihood metric, which measures the probability that the current state is anomalous based on the history of the raw anomaly scores calculated by the algorithm. A detailed explanation of how the likelihood score is calculated is given in "Real-Time Anomaly Detection for Streaming Analytics" and "Unsupervised real-time anomaly detection for streaming data".
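To give an intuition for the metric, here is a simplified, standalone sketch of an anomaly likelihood computed from a history of raw scores: it compares a short-term average against the mean and standard deviation of a longer window using a Gaussian tail probability. It does not call NuPIC and does not reproduce NuPIC's API or defaults; the function name and the `long_window` and `short_window` values are assumptions.

```python
import math
from statistics import mean, pstdev
from typing import Sequence


def anomaly_likelihood(raw_scores: Sequence[float],
                       long_window: int = 100, short_window: int = 5) -> float:
    """Simplified likelihood: 1 - Q((recent_mean - mu) / sigma).

    Window sizes are illustrative, not NuPIC's settings.
    """
    history = list(raw_scores[-long_window:])
    if len(history) < 2:
        return 0.5  # not enough history to estimate a distribution
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return 0.5  # all scores identical, so the recent mean equals mu
    recent = mean(history[-short_window:])
    z = (recent - mu) / sigma
    # Q(z): upper-tail probability of the standard normal distribution.
    q = 0.5 * math.erfc(z / math.sqrt(2))
    return 1.0 - q


if __name__ == "__main__":
    quiet_then_burst = [0.0] * 95 + [1.0] * 5
    print(anomaly_likelihood(quiet_then_burst))
```

A value close to 1 means the recent raw scores are unusually high relative to their own history, so a single isolated high score in a noisy series contributes less than a sustained change.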

The graph below shows the final anomaly scores calculated by DASRS Likelihood.

DASRS Likelihood Anomaly Scores

Reference

This readme is a brief overview. For more details about DASRS, see the following paper:

Toward an Efficient Real-Time Anomaly Detection System for Cloud Datacenters
