
UNIT-IV

Introduction to Software Defined Network (SDN)

• All traditional networking devices, such as routers and switches, use a distributed
control plane. The newer model of networking, Software-Defined
Networking (SDN), uses a centralized control plane.
• A distributed control plane means that the control plane of each networking device
lies within the device itself.
• Each device has its own control plane to control its data plane.
• In a centralized control plane system, there is one device that contains the control
plane of all devices.
• This device controls the activities of the data planes of all networking devices
simultaneously.
• This device is called the Controller or SDN controller.

The following figure shows a model of controller-based networking.

Figure: Controller-based network model

1. Southbound Interface:
In SDN, all networking devices must be connected to the controller so that it can
regulate the data planes of all devices. When drawing the architecture of a network,
the network architect usually places the networking devices below the controller.
According to map conventions, the interfaces between the controller and the networking
devices therefore lie to the south of the controller. Hence, these interfaces are
called the Southbound Interface.

The southbound interface is an interface between a program on the controller and a
program on a networking device. Note that the interfaces we are discussing are
software interfaces, not physical ones.

2. Northbound Interface:
The controller needs a great deal of information about the network so that it can
control the data planes of the networking devices. All this information is provided by
the network programmer, who supplies it to the controller through various software
or scripts describing what functions the controller has to perform.
This software is placed above the controller in the network
architecture, which puts the interfaces between the controller and the software
to the north, according to map conventions.
Hence, the interfaces between the controller and the software are called the Northbound
Interface. These interfaces enable the programmability of the network.

3. All the interfaces discussed above are program-based interfaces. In a broader
sense, these interfaces are called Application Program Interfaces (APIs).
An API is an interface through which two programs can exchange data.
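The programmability enabled by the northbound interface can be illustrated with a small sketch: a Python script querying an SDN controller's northbound REST API using the `requests` library. The endpoint path follows Floodlight's documented pattern; the controller address, port, and response field names are assumptions and vary by controller.

```python
# A minimal sketch of a northbound API call: a management script asking the
# controller which switches it currently manages. Host, port, and endpoint
# are assumptions; adjust to the controller actually in use.
import requests

CONTROLLER = "http://127.0.0.1:8080"  # assumed controller address and port

resp = requests.get(f"{CONTROLLER}/wm/core/controller/switches/json")
resp.raise_for_status()

for switch in resp.json():
    # Field name varies by controller/version; 'switchDPID' is Floodlight's.
    print("Connected switch:", switch.get("switchDPID"))
```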
In order to understand software defined networks, we need to understand the
various planes involved in networking.

Data plane:
All the activities involving, as well as resulting from, data packets sent by the end
user belong to this plane. This includes:
• Forwarding of packets
• Segmentation and reassembly of data
• Replication of packets for multicasting
Control plane:
All activities that are necessary to perform data plane activities but do not involve
end-user data packets belong to this plane. In other words, this is the brain of the
network. The activities of the control plane include:
• Making routing tables
• Setting packet handling policies
In a traditional network, each switch has its own data plane as well as its own control
plane. The control planes of the various switches exchange topology information and
hence construct forwarding tables that decide where an incoming data packet
has to be forwarded via the data plane.

Advantages of SDN:
• The network is programmable and hence can easily be modified via the controller
rather than via individual switches.
• Switch hardware becomes cheaper, since each switch only needs a data plane.
• Hardware is abstracted; hence applications can be written on top of the controller
independent of the switch vendor.
• Provides better security, since the controller can monitor traffic and deploy
security policies. For example, if the controller detects suspicious activity in
network traffic, it can reroute or drop the packets.
Disadvantages of SDN:
The central dependency of the network means a single point of failure: if the
controller gets corrupted, the entire network will be affected.

SDN Architecture

A typical SDN architecture consists of three layers.

• Application layer:
It contains typical network applications such as intrusion detection, firewalls,
and load balancing.
• Control layer:
It consists of the SDN controller, which acts as the brain of the network. It
also provides hardware abstraction to the applications written on top of it.
• Infrastructure layer:
This consists of the physical switches, which form the data plane and carry out
the actual movement of data packets.

The layers communicate via a set of interfaces called the northbound
APIs (between the application and control layers) and southbound APIs (between the
control and infrastructure layers).

Challenges

✓ Rule placement

✓ Controller placement

Rule placement

✓ Switches forward traffic based on a rule – a 'Flow-Rule' – defined by the
centralized controller.

▪ Traditionally, there is a Routing Table in every switch (L3 switch/router).
SDN instead maintains a Flow Table at every switch.

▪ Flow-Rules reside in the Flow Table.

✓ Each rule has a specific format, which is also defined by a protocol (e.g.,
OpenFlow).

✓ The size of ternary content-addressable memory (TCAM) is limited at the
switches.

▪ Only a limited number of rules can be inserted.

✓ Fast processing is done using TCAM at the switches.

✓ TCAM is very expensive.

✓ On receiving a request for which no flow-rule is present in the switch, the
switch sends a PACKET-IN message to the controller.

✓ The controller decides a suitable flow-rule for the request.

✓ The flow-rule is then inserted at the switch.

✓ Typically, a delay of 3-5 ms is involved in placing a new rule.

✓ Open questions: How to define/place the rules at switches while considering the
available TCAM? How to define rules so that fewer PACKET-IN
messages are sent to the controller? (A minimal controller sketch follows below.)
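To make the PACKET-IN/flow-rule cycle concrete, here is a minimal sketch of a controller application written with the Ryu framework (a Python-based controller framework; the choice of framework here is an assumption for illustration). It is a toy: it matches only on the ingress port and floods, whereas real rules would match on IP addresses, ports, and so on.

```python
# A minimal Ryu app (OpenFlow 1.3): on PACKET-IN, install a flow-rule so
# that later packets of the same flow are matched in the switch's TCAM
# without contacting the controller again.
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import MAIN_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3


class SimpleRulePlacer(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPPacketIn, MAIN_DISPATCHER)
    def packet_in_handler(self, ev):
        msg = ev.msg                       # the PACKET-IN message
        dp = msg.datapath                  # the switch that sent it
        ofp, parser = dp.ofproto, dp.ofproto_parser

        # Toy match: ingress port only; flood as the action.
        match = parser.OFPMatch(in_port=msg.match['in_port'])
        actions = [parser.OFPActionOutput(ofp.OFPP_FLOOD)]
        inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS,
                                             actions)]

        # Insert the flow-rule at the switch.
        dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=1,
                                      match=match, instructions=inst))
```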

OpenFlow Protocol

✓ Only one protocol is available for rule placement – OpenFlow.

✓ It has different versions – 1.0, 1.1, 1.2, 1.3, etc. – which support different
numbers of match-fields.

✓ Different match-fields

▪ Source IP

▪ Destination IP

▪ Source Port

▪ Priority

▪ etc.

How long is a flow-rule to be kept at the switch?

✓ Hard timeout

▪ All rules are deleted from the switch at the hard timeout.

▪ This can be used to reset the switch.

✓ Soft timeout

▪ If NO flow associated with a rule is received for a particular time,
the rule is deleted.

▪ This is used to free the rule-space by deleting unused rules.
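The two timeouts map directly onto fields of the OpenFlow FlowMod message. A short sketch, reusing `dp`, `match`, and `inst` from the Ryu example earlier (the timeout values are arbitrary):

```python
# Soft (idle) and hard timeouts on a flow-rule, OpenFlow 1.3 via Ryu.
flow_mod = dp.ofproto_parser.OFPFlowMod(
    datapath=dp,
    priority=1,
    match=match,
    instructions=inst,
    idle_timeout=30,    # soft timeout: deleted if unused for 30 s
    hard_timeout=300,   # hard timeout: deleted 300 s after insertion
)
dp.send_msg(flow_mod)
```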

✓ SDN is NOT OpenFlow.

▪ SDN is a technology/concept.

▪ OpenFlow is a protocol used to communicate between the data plane
and the control plane.

▪ We may have other protocols for this purpose; however, OpenFlow
is the only protocol present today.

OpenFlow Switch Software

• Indigo: Open source; it runs on Mac OS X.

• LINC: Open source; it runs on Linux, Solaris, Windows, MacOS, and
FreeBSD.

• Pantou: Turns a commercial wireless router/access point into an OpenFlow-enabled
switch. OpenFlow runs on OpenWRT.

• Of13softswitch: User-space software switch based on the Ericsson
TrafficLab 1.1 softswitch.

• Open vSwitch: Open source; it is the MOST popular one present today.

Controller Placement

✓ Controllers define flow-rules according to application-specific
requirements.

✓ The controllers must be able to handle all incoming requests from the
switches.

✓ Rules should be placed without incurring much delay.

✓ Typically, a controller can handle 200 requests per second (through a
single thread).

✓ The controllers are logically connected to the switches at one-hop
distance.

✓ Physically, they are connected to the switches over multi-hop distances.

✓ If we have a very small number of controllers for a large network, the
network might become congested with control packets (i.e., PACKET-IN
messages).

✓ Looking at controller placement, there are different architectures.

Flat Architecture

The basic architecture is called the flat architecture. Here the switch and the
controller are logically one hop away. The switch sends a PACKET-IN message to the
controller if it does not already have a flow rule for the particular flow it has
received. The controller then sends back the corresponding flow rule, i.e., the
instruction describing how the switch should treat that flow. The underlying
assumption of SDN technology is that the controller knows how the different flows
and packets are to be handled.
Hierarchical (tree) Architecture

This is the hierarchical, or tree, architecture, and it is quite straightforward:
the controllers are placed hierarchically and connected to the different switches
in a tree-like fashion, with a PACKET-IN message and the corresponding flow rule
travelling along each of these connections.

Ring Architecture

In the ring architecture we have a similar kind of arrangement, but the controllers
are placed in a ring-like fashion. We may have multiple controllers placed in the
ring, but a particular switch is connected to only one controller in this version.
When a PACKET-IN request has to be sent, it is sent to that single controller only,
not to any of the other controllers in the ring, and the flow rule is returned to
the particular switch that requested it.

Then we have the mesh architecture. Mesh, as we know, increases reliability: for
instance, two different switches can be connected to a single controller, and if
one connection goes down there is another that can take over, and so on.

Control Mechanisms
✓ Distributed

▪ The control decisions can be taken in a distributed manner.
▪ Ex: each subnetwork is controlled by a different controller.
✓ Centralized
▪ The control decisions are taken in a centralized manner.
▪ Ex: A network is controlled by a single controller.

Backup Controller

✓ If a controller is down, what will happen?

▪ A backup controller is introduced.

▪ A replica of the main controller is created.

▪ If the main controller is down, the backup controller controls the
network, providing uninterrupted network management.

With SDN one can have an enhanced level of security in the network; in this
particular case, with the help of a firewall, an HTTP proxy, and an IDS,
security can be improved using this technology.

Very briefly (improving security with SDN is not discussed here in much detail),
a paper published in SIGCOMM in 2013 proposed a simplifying protocol for policy
enforcement.

What does it do? Consider the figure: it shows an example of a potential data-plane
ambiguity when implementing the policy chain firewall-IDS-proxy in a particular
topology. The sequence of flow is as follows: when an HTTP request comes in, it is
sent from one switch to another; from that switch it goes to the IDS, comes back,
and then goes to the proxy, the forwarding element, and the firewall; finally it
reaches the last switch and leaves the network. This is how security is implemented
and enhanced using SDN. The point here is only to show that security can indeed be
improved with the help of SDN; the issue is not discussed further.

Experimenting with SDN

✓ Simulator/Emulator

▪ Infrastructure deployment – MUST be supported with OpenFlow.

▪ Controller placement – MUST support OpenFlow.

▪ Remote controller – can be situated at a remote place and communicated
with using an IP address and port number.

▪ Local controller.

Switch Deployment

✓ Mininet

▪ Used to create a virtual network with OpenFlow-enabled switches.

▪ Based on the Python language.

▪ Supports remote and local controllers (see the sketch below).

✓ There is also controller configuration software, for example POX, NOX,
Floodlight, OpenDaylight, and ONOS; OpenDaylight and ONOS in particular
are the most popular ones used for controller configuration.
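A minimal Mininet sketch of the setup described above: two hosts, one OpenFlow switch, and a remote controller assumed to be listening at 127.0.0.1:6653 (e.g., a Ryu, POX, or ONOS instance). Run with root privileges on a machine with Mininet installed.

```python
# Build a tiny virtual network and test connectivity; each ping that finds
# no matching flow-rule triggers a PACKET-IN at the remote controller.
from mininet.net import Mininet
from mininet.node import RemoteController
from mininet.log import setLogLevel

setLogLevel('info')

net = Mininet(controller=None)
net.addController('c0', controller=RemoteController,
                  ip='127.0.0.1', port=6653)   # assumed controller address

s1 = net.addSwitch('s1')
h1, h2 = net.addHost('h1'), net.addHost('h2')
net.addLink(h1, s1)
net.addLink(h2, s1)

net.start()
net.pingAll()   # h1 <-> h2 reachability test
net.stop()
```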

SDN for IoT

Benefits of Integrating SDN in IoT

✓ Intelligent routing decisions can be deployed using SDN.

✓ Simplification of information collection, analysis, and decision making.

✓ Visibility of network resources – network management is simplified based
on user, device, and application-specific requirements.

✓ Intelligent traffic-pattern analysis and coordinated decisions.

SDN for IoT-I

Looking at this figure, we have different IoT devices, possibly in different
subnetworks. Through mobile-access or fixed-access channels, the data from these
devices can be acquired and transmitted to the data aggregator, where the data
received from the different IoT devices is aggregated. It then passes through a
transport network to the different gateways, where packet segregation is done.

This is a simplified view of an IoT network. When we want to integrate SDN, we use
an SDN controller. The SDN controller controls each of these different aspects,
improves the orchestration between the different devices and the different protocols
running in the network, and overall improves the service logic behind it.

With the implementation of SDN, centralized control of the IoT end devices –
sensors, actuators, RFID tags, and any other IoT devices – is made possible. At the
access devices, rule placement can be implemented while considering issues such as
mobility and the heterogeneity of the end devices. Rule placement and traffic
engineering for the backbone network are taken care of at the transport network,
while flow classification and enhanced security are taken care of at the
data-center networks.

Data Handling and Analytics

Data handling is the process of ensuring that research data is stored, archived or
disposed of in a safe and secure manner during and after the conclusion of a
research project. This includes the development of policies and procedures to
manage data handled electronically as well as through non-electronic means.

Considerations/issues in data handling

Issues that should be considered in ensuring the integrity of handled data include
the following:

• The type of data handled and its impact on the environment (especially if it is on
a toxic medium).
• The type of media containing the data and its storage capacity, handling and storage
requirements, reliability, longevity (in the case of a degradable medium),
retrieval effectiveness, and ease of upgrade to newer media.
• Data handling responsibilities/privileges, that is, who can handle which
portion of the data, at what point during the project, for what purpose, etc.
• Data handling procedures that describe how long the data should be kept,
and when, how, and by whom it should be handled for storage, sharing, archival,
retrieval and disposal purposes.

In recent days, most data concerns Big Data, due to:

✓ the heavy traffic generated by IoT devices, and

✓ the huge amount of data generated by the deployed sensors.

What is Big Data

• A collection of data sets so large and complex that it becomes difficult to
process them using on-hand database management tools or traditional data
processing applications.
• "Big Data" is data whose scale, diversity, and complexity require new
architectures, techniques, algorithms, and analytics to manage it and extract
value and hidden knowledge from it.
• 'Big Data' is similar to 'small data', but bigger in size.
• It aims to solve new problems, or old problems in a better way.
• Big Data generates value from the storage and processing of very large
quantities of digital information that cannot be analyzed with traditional
computing techniques.

Types of Data:

There are two types of data:

✓ Structured data
▪ Data that can be easily organized.
▪ Usually stored in relational databases.
▪ Structured Query Language (SQL) manages structured data in databases (see the sketch below).
▪ It accounts for only 20% of the total data available in the world today.
✓ Unstructured data
▪ Information that does not possess any pre-defined model.
▪ Traditional RDBMSs are unable to process unstructured data.
▪ Processing it enhances the ability to extract better insight from huge datasets.
▪ It accounts for 80% of the total data available in the world today.
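As a tiny illustration of structured data, the sketch below uses Python's built-in sqlite3 module; the table and values are made up for the example.

```python
# Structured data: rows and columns with a fixed schema, queried with SQL.
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database
conn.execute("CREATE TABLE readings (sensor_id TEXT, temperature REAL)")
conn.executemany("INSERT INTO readings VALUES (?, ?)",
                 [("s1", 21.5), ("s2", 22.1), ("s1", 21.9)])

# SQL manages the structured data: aggregate per sensor.
for row in conn.execute("SELECT sensor_id, AVG(temperature) "
                        "FROM readings GROUP BY sensor_id"):
    print(row)
```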

Characteristics of Big Data:

✓ Big Data is characterized by 7 Vs –

✓ Volume

✓ Velocity

✓ Variety

✓ Variability

✓ Veracity (Accuracy)

✓ Visualization

✓ Value

❖ Volume

o Quantity of data that is generated.

o Sources of data are added continuously.

o Examples of volume –

▪ 30 TB of images will be generated every night by the Large
Synoptic Survey Telescope (LSST).

▪ 72 hours of video are uploaded to YouTube every minute.

❖ Velocity

o Refers to the speed of generation of data.

o Data processing times are decreasing day by day in order to provide
real-time services.

o Older batch-processing technology is unable to handle such high velocity
of data.

o Examples of velocity –

▪ 140 million tweets per day on average (according to a survey
conducted in 2011).

▪ The New York Stock Exchange captures 1 TB of trade information
during each trading session.

❖ Variety

o Refers to the category to which the data belongs.

o No restriction over the input data formats.

o Data is mostly unstructured or semi-structured.

o Examples of variety –

▪ Pure text, images, audio, video, web, GPS data, sensor data,
SMS, documents, PDFs, flash, etc.

❖ Variability

o Refers to data whose meaning is constantly changing.

o The meaning of the data depends on the context.

o Data can appear as an indecipherable mass without structure.

o Examples:

▪ Language processing, hashtags, geo-spatial data, multimedia,
sensor events.

❖ Veracity

o Veracity refers to the biases, noise and abnormality in data.

o It is important in programs that involve automated decision-making, or
that feed the data into an unsupervised machine-learning algorithm.

o Veracity isn't just about data quality; it's about data understandability.

❖ Value

o It means extracting useful business information from scattered data.

o It includes a large volume and variety of data.

o It is easy to access and delivers quality analytics that enable informed
decisions.

Data Handling Technologies:

❖ Cloud computing

o Essential characteristics according to NIST (the National Institute of
Standards and Technology):

▪ On-demand self-service

▪ Broad network access

▪ Resource pooling

▪ Rapid elasticity

▪ Measured service

o Basic service models provided by cloud computing:

▪ Infrastructure-as-a-Service (IaaS)

▪ Platform-as-a-Service (PaaS)

▪ Software-as-a-Service (SaaS)

❖ Internet of Things (IoT)

o According to Techopedia, IoT "describes a future where every day
physical objects will be connected to the internet and will be able to
identify themselves to other devices."

o Sensors are embedded into various devices and machines and deployed
into fields.

o Sensors transmit sensed data to remote servers via the Internet.

o Continuous data acquisition from mobile equipment, transportation
facilities, public facilities, and home appliances.

❖ Data handling at data centers

o Stores, manages, and organizes data.

o Estimates and provides the necessary processing capacity.

o Provides sufficient network infrastructure.

o Effectively manages energy consumption.

o Replicates data to keep backups.

o Develops business-oriented strategic solutions from big data.

o Helps business personnel to analyze existing data.

o Discovers problems in business operations.

Flow of Data

Data Sources:

✓ Enterprise data

▪ Online trading and analysis data.

▪ Production and inventory data.

▪ Sales and other financial data.

✓ IoT data

▪ Data from industry, agriculture, traffic, and transportation.

▪ Medical-care data.

▪ Data from public departments and families.

✓ Bio-medical data

▪ Masses of data generated by gene sequencing.

▪ Data from medical clinics and medical R&D.

✓ Other fields

▪ Fields such as computational biology, astronomy, nuclear research, etc.

Data Acquisition:

✓ Data collection

▪ Log files or record files are automatically generated by data sources
to record activities for further analysis.

▪ Sensory data, such as sound waves, voice, vibration, automobile,
chemical, current, weather, pressure, and temperature data.

▪ Complex and varied data is collected through mobile devices, e.g.,
geographical location, 2D barcodes, pictures, and videos.

✓ Data transmission

▪ After collection, the data is transferred to a storage system for
further processing and analysis.

▪ Data transmission can be categorized as inter-DCN transmission
and intra-DCN transmission.

✓ Data pre-processing

▪ Collected datasets suffer from noise, redundancy, inconsistency, etc.;
thus, pre-processing of the data is necessary.

▪ Pre-processing of relational data mainly involves integration,
cleaning, and redundancy mitigation (see the sketch below).

▪ Integration combines data from various sources and provides users
with a uniform view of the data.

▪ Cleaning identifies inaccurate, incomplete, or unreasonable data,
and then modifies or deletes such data.

▪ Redundancy mitigation eliminates data repetition through the
detection, filtering and compression of data, to avoid unnecessary
transmission.
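The three pre-processing steps can be sketched with pandas; the column names, value ranges, and data below are illustrative assumptions.

```python
# Integration, cleaning, and redundancy mitigation on toy sensor data.
import pandas as pd

# Integration: combine data from two sources into one uniform view.
a = pd.DataFrame({"device": ["d1", "d2"], "temp": [21.0, None]})
b = pd.DataFrame({"device": ["d2", "d2", "d3"], "temp": [22.5, 22.5, 150.0]})
data = pd.concat([a, b], ignore_index=True)

# Cleaning: drop incomplete rows and remove unreasonable values.
data = data.dropna()
data = data[data["temp"].between(-40, 60)]   # assumed plausible sensor range

# Redundancy mitigation: drop repeated records before storage/transmission.
data = data.drop_duplicates()
print(data)
```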

Data Storage

✓ File system

▪ Distributed file systems store massive data and ensure the
consistency, availability, and fault tolerance of the data.

▪ GFS is a notable example of a distributed file system that supports
large-scale storage, though its performance is limited in the case of
small files.

▪ The Hadoop Distributed File System (HDFS) and Kosmosfs are other
notable file systems, derived from the open-source code of GFS.

✓ Databases

▪ Non-traditional relational databases (NoSQL) have emerged in order
to deal with the characteristics that big data possesses.

▪ There are three main families of NoSQL databases – key-value databases,
column-oriented databases, and document-oriented databases.

Data Handling Using Hadoop:

✓ Hadoop is a software framework for the distributed processing of large datasets
across large clusters of computers.

✓ Hadoop is an open-source implementation of Google's GFS and MapReduce.
Apache Hadoop's MapReduce and Hadoop Distributed File System (HDFS)
components were originally derived from Google's MapReduce and
Google File System (GFS), respectively.
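The classic introductory MapReduce job is word count. The sketch below shows the map and reduce phases as plain Python functions in the style of Hadoop Streaming (where each phase would normally be its own script reading stdin and writing stdout, with Hadoop sorting the mapper output by key in between):

```python
# Word count: map emits (word, 1); reduce sums counts per word.
import sys
from itertools import groupby


def mapper(lines):
    # Map phase: one (key, value) pair per word.
    for line in lines:
        for word in line.split():
            yield word, 1


def reducer(pairs):
    # Reduce phase: input grouped by key (here via an explicit sort).
    for word, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)


if __name__ == "__main__":
    for word, total in reducer(mapper(sys.stdin)):
        print(f"{word}\t{total}")
```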

Building Blocks of Hadoop:

✓ Hadoop Common

▪ A module containing the utilities that support the other Hadoop
components.

✓ Hadoop Distributed File System (HDFS)

▪ Provides reliable data storage and access across the nodes.

✓ MapReduce

▪ A framework for applications that process large amounts of data in
parallel.

✓ Yet Another Resource Negotiator (YARN)

▪ Next-generation MapReduce, which assigns CPU, memory and
storage to applications running on a Hadoop cluster.

Hadoop Distributed File System (HDFS):

✓ Centralized node

▪ Namenode: maintains metadata information about the files.

✓ Distributed node

▪ Datanode: stores the actual data.

✓ Files are divided into blocks.

✓ Each block is replicated.

Name and Data Nodes

✓ Namenode

▪ Stores the filesystem metadata.

▪ Maintains two in-memory tables that map the datanodes to the blocks,
and vice versa.

✓ Datanode

▪ Stores the actual data.

▪ Datanodes can talk to each other to rebalance and replicate data.

▪ Datanodes update the namenode with block information periodically.

▪ Before updating, datanodes verify the checksums.

Job and Task Trackers

✓ Job Tracker –

▪ Runs with the Namenode.

▪ Receives the user's job.

▪ Decides how many tasks will run (the number of mappers).

▪ Decides where to run each mapper (the concept of locality).

✓ Task Tracker –

▪ Runs on each datanode.

▪ Receives tasks from the Job Tracker.

Hadoop Master / Slave Architecture:

✓ Master-slave, shared-nothing architecture.

✓ Master

▪ Executes operations like opening, closing, and renaming files and
directories.

▪ Determines the mapping of blocks to Datanodes.

✓ Slave

▪ Serves read and write requests from the file system's clients.

▪ Performs block creation, deletion, and replication as instructed by the
Namenode.

What is Data Analytics

✓ "Data analytics (DA) is the process of examining data sets in order to draw
conclusions about the information they contain, increasingly with the aid of
specialized systems and software. Data analytics technologies and
techniques are widely used in commercial industries to enable
organizations to make more-informed business decisions and
researchers to verify or disprove scientific models, theories and
hypotheses."

[An admin's guide to AWS data management]

Types of Data Analysis

✓ Two types of analysis

✓ Qualitative Analysis

▪ Deals with the analysis of data that is categorical in nature.

✓ Quantitative Analysis

▪ Refers to the process by which numerical data is analyzed.

Qualitative Analysis

✓ Data is not described through numerical values.

✓ It is described by some sort of descriptive context, such as text.

✓ Data can be gathered by many methods, such as interviews, video and audio
recordings, and field notes.

✓ Data needs to be interpreted, i.e., grouped into identifiable themes.

✓ Qualitative analysis can be summarized by three activities:

▪ Notice things

▪ Collect things

▪ Think about things

Quantitative Analysis

✓ Quantitative analysis refers to the process by which numerical data is
analyzed.

✓ It involves descriptive statistics such as the mean, median, and standard
deviation.

✓ The following are often involved in quantitative analysis:

▪ Statistical models

▪ Analysis of variance

▪ Data dispersion

▪ Analysis of relationships between variables

▪ Contingency and correlation

▪ Regression analysis

▪ Statistical significance

▪ Precision

▪ Error limits

Comparison:

Qualitative Data                     | Quantitative Data
-------------------------------------|---------------------------------
Data is observed                     | Data is measured
Involves descriptions                | Involves numbers
Emphasis is on quality               | Emphasis is on quantity
Examples: color, smell, taste, etc.  | Examples: volume, weight, etc.

Advantages

✓ Allows for the identification of important (and often mission-critical) trends.

✓ Helps businesses identify performance problems that require some sort of
action.

✓ Results can be viewed in a visual manner, which leads to faster and better
decisions.

✓ Provides better awareness of the habits of potential customers.

✓ It can provide a company with an edge over its competitors.

Statistical models

✓ A statistical model is defined as a mathematical equation formulated in the
form of relationships between variables.

✓ A statistical model illustrates how a set of random variables is related to
another set of random variables.

✓ A statistical model is represented as the ordered pair (X, P)

▪ X denotes the set of all possible observations

▪ P refers to the set of probability distributions on X

Statistical models are broadly categorized as:

✓ Complete models

▪ A complete model has the number of variables equal to the number of
equations.

✓ Incomplete models

▪ An incomplete model does not have the same number of variables as
the number of equations.

✓ In order to build a statistical model:

▪ Data gathering

▪ Descriptive methods

▪ Thinking about predictors

▪ Building the model

▪ Interpreting the results (a small sketch of these steps follows below)
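A small sketch of these steps using ordinary least squares from statsmodels; the data is made up purely for illustration.

```python
# Build and interpret a simple statistical model: y = a + b*x + noise.
import numpy as np
import statsmodels.api as sm

# 1-2. Gather data and inspect it descriptively.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
print("mean of y:", y.mean())

# 3-4. Choose a predictor and build the model.
X = sm.add_constant(x)          # adds the intercept column
model = sm.OLS(y, X).fit()

# 5. Interpret the results.
print(model.params)             # estimated intercept and slope
print(model.rsquared)           # goodness of fit
```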

Analysis of variance

✓ Analysis of Variance (ANOVA) is a parametric statistical technique used to
compare datasets.

✓ ANOVA is best applied where more than 2 populations or samples are to be
compared.

✓ To perform an ANOVA, we must have a continuous response variable and at
least one categorical factor (e.g., age, gender) with two or more levels
(e.g., Locations 1 and 2).

✓ ANOVA requires data from approximately normally distributed populations.

✓ Properties required to perform an ANOVA –

▪ Independence of cases: the sample should be selected randomly, and
there should not be any pattern in the selection of the sample.

▪ Normality: the distribution of each group should be normal.

▪ Homogeneity: the variance between the groups should be the same
(e.g., one should not compare data from cities with data from slums).

✓ Analysis of variance (ANOVA) has three types:

▪ One-way analysis: one fixed factor (levels set by the investigator);
factors can be age, gender, etc.

▪ Two-way analysis: there are two factor variables.

▪ K-way analysis: there are k factor variables.

✓ Total sum of squares

▪ In statistical data analysis, the total sum of squares (TSS or SST) is a
quantity that appears as part of a standard way of presenting results of
such analyses. It is defined as the sum, over all observations, of
the squared differences of each observation from the overall mean:

▪ Total SS = Σ (Yi − Ȳ)², where Ȳ is the mean of Y.

✓ F-ratio

▪ Helps to understand the ratio of variance between two data sets.

▪ The F-ratio is approximately 1.0 when the null hypothesis is true, and
greater than 1.0 when the null hypothesis is false.

▪ F = MS_between / MS_within

✓ Degrees of freedom

▪ Factors which have no effect on the variance.

▪ The number of degrees of freedom is the number of values in the final
calculation of a statistic that are free to vary.
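A one-way ANOVA can be sketched with scipy; the three groups below are made-up samples of one categorical factor with three levels.

```python
# One-way ANOVA: F-ratio (MS_between / MS_within) and its p-value.
from scipy import stats

group1 = [23, 25, 21, 24, 26]
group2 = [30, 31, 29, 32, 28]
group3 = [24, 26, 25, 23, 27]

f_ratio, p_value = stats.f_oneway(group1, group2, group3)
print(f"F = {f_ratio:.2f}, p = {p_value:.4f}")
# An F much greater than 1.0 with a small p suggests the group means differ.
```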

Data dispersion

✓ A measure of statistical dispersion is a nonnegative real number that is zero
if all the data are the same and increases as the data become more diverse.

✓ Examples of dispersion measures:

▪ Range

▪ Average absolute deviation

▪ Variance and standard deviation

✓ Range

▪ The range is calculated by simply taking the difference between the
maximum and minimum values in the data set.

✓ Average absolute deviation

▪ The average absolute deviation (or mean absolute deviation) of a data
set is the average of the absolute deviations from the mean.

✓ Variance

▪ Variance is the expectation of the squared deviation of a random
variable from its mean.

✓ Standard deviation

▪ Standard deviation (SD) is a measure used to quantify the
amount of variation or dispersion of a set of data values.
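The four dispersion measures can be computed with Python's standard library; the sample data is made up.

```python
import statistics

data = [4, 8, 6, 5, 3, 7]
mean = statistics.mean(data)

data_range = max(data) - min(data)                    # range
mad = sum(abs(x - mean) for x in data) / len(data)    # average absolute deviation
variance = statistics.pvariance(data)                 # population variance
std_dev = statistics.pstdev(data)                     # standard deviation

print(data_range, mad, variance, std_dev)
```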

Contingency and correlation

✓ In statistics, a contingency table (also known as a cross tabulation or
crosstab) is a type of table in a matrix format that displays the (multivariate)
frequency distribution of the variables.

✓ It provides a basic picture of the interrelation between two variables.

✓ A crucial problem of multivariate statistics is finding the (direct-)dependence
structure underlying the variables contained in high-dimensional
contingency tables.

✓ Correlation is a technique for investigating the relationship between two
quantitative, continuous variables.

✓ Pearson's correlation coefficient (r) is a measure of the strength of the
association between the two variables.

✓ Correlations are useful because they can indicate a predictive relationship
that can be exploited in practice.
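Both ideas can be sketched with pandas and scipy; all data below is made up for illustration.

```python
import pandas as pd
from scipy import stats

# Contingency table: frequency distribution of two categorical variables.
df = pd.DataFrame({"gender": ["M", "F", "F", "M", "F"],
                   "smoker": ["yes", "no", "yes", "no", "no"]})
print(pd.crosstab(df["gender"], df["smoker"]))

# Pearson's r for two quantitative, continuous variables.
height = [150, 160, 165, 170, 180]
weight = [50, 58, 63, 66, 77]
r, p = stats.pearsonr(height, weight)
print(f"r = {r:.2f}, p = {p:.4f}")
```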

Regression analysis:

✓ In statistical modeling, regression analysis is a statistical process for
estimating the relationships among variables.

✓ It focuses on the relationship between a dependent variable and one or more
independent variables.

✓ Regression analysis estimates the conditional expectation of the dependent
variable given the independent variables.

✓ The estimation target is a function of the independent variables called the
regression function.

✓ It also characterizes the variation of the dependent variable around the
regression function, which can be described by a probability distribution.

✓ Regression analysis is widely used for prediction and forecasting, where its
use has substantial overlap with the field of machine learning.

✓ Regression analysis is also used to understand which of the independent
variables are related to the dependent variable.
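A minimal sketch of simple linear regression with numpy: estimate the regression function of a dependent variable y on an independent variable x, then use it for prediction (data made up).

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2.2, 4.1, 5.9, 8.2, 9.9])

# Least-squares fit of the regression function y ≈ slope*x + intercept.
slope, intercept = np.polyfit(x, y, deg=1)
print(f"y = {slope:.2f}*x + {intercept:.2f}")

# Prediction: the conditional expectation of y at a new x.
print("prediction at x = 6:", slope * 6 + intercept)
```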

Statistical significance

✓ Statistical significance is the likelihood that the difference in conversion
rates between a given variation and the baseline is not due to random
chance.

✓ The statistical significance level reflects the risk tolerance and confidence
level.

✓ There are two key variables that go into determining statistical significance:

▪ Sample size

▪ Effect size

✓ Sample size refers to the sample size of the experiment. The larger your
sample size, the more confident you can be in the result of the experiment
(assuming that it is a randomized sample).

✓ The effect size is the standardized mean difference between the two groups.

✓ If a particular experiment is replicated, the different effect-size estimates
from each study can easily be combined to give an overall best estimate of the
effect size.
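A sketch tying the two variables together: a two-sample t-test for significance with scipy, plus the effect size as Cohen's d (the standardized mean difference). The conversion-rate samples are made up.

```python
import numpy as np
from scipy import stats

baseline = np.array([0.10, 0.12, 0.11, 0.13, 0.12, 0.11])
variation = np.array([0.14, 0.15, 0.13, 0.16, 0.15, 0.14])

# Significance: a small p means the difference is unlikely to be random chance.
t_stat, p_value = stats.ttest_ind(variation, baseline)
print(f"p = {p_value:.4f}")

# Effect size (Cohen's d): difference of means over the pooled SD.
pooled_sd = np.sqrt((baseline.var(ddof=1) + variation.var(ddof=1)) / 2)
d = (variation.mean() - baseline.mean()) / pooled_sd
print(f"effect size d = {d:.2f}")
```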

Precision and Error limits:

✓ Precision refers to how close estimates from different samples are to each
other.

✓ The standard error is a measure of precision.

✓ When the standard error is small, estimates from different samples will be
close in value and vice versa.

✓ Precision is inversely related to standard error.

✓ The limits of error are the maximum overestimate and the maximum
underestimate from the combination of the sampling and the non-sampling
errors.

✓ The margin of error is defined as –

▪ Limit of error = Critical value × Standard deviation of the statistic

▪ Critical value: determines the tolerance level of error.

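The margin-of-error formula can be sketched for a sample mean at 95% confidence; the data and the choice of critical value (z = 1.96) are illustrative. For the sample-mean statistic, its standard deviation is the standard error.

```python
import math
import statistics

sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 12.0]

critical_value = 1.96                                    # z for 95% confidence
std_error = statistics.stdev(sample) / math.sqrt(len(sample))

margin_of_error = critical_value * std_error             # limit of error
print(f"{statistics.mean(sample):.2f} ± {margin_of_error:.2f}")
```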
