002 2019 Methodology
Learn more about BICSI 002 and view the first five
sections for free at www.bicsi.org/002
An Overview of the ANSI/BICSI 002-2019 Data Center Availability Class Methodology
With the increased reliance on 24/7 availability of information and data processing support, the reliance on mission-critical data processing facilities has also increased. Mission-critical data centers have not traditionally been high-profile projects, yet their design issues are increasingly complex and critical. With an emerging design terminology and vocabulary, their rapid evolution calls for an exceptional degree of building and IT systems coordination and integration. These data centers are not merely warehouses for servers; instead, they rival medical operating rooms or semiconductor plants, with their precise environmental controls and power requirements.

To increase the likelihood of success of a mission-critical facility, required performance levels of availability and reliability should be defined, prior to the start or formalization of the design, procurement, and maintenance requirements and processes. Failure to define performance and availability levels prior to the project start often yields higher construction, implementation, and operational costs as well as inconsistent and unpredictable performance.

What is Availability?

Availability is the probability that a component or system is in a condition to perform its intended function. While similar to reliability, availability is affected by more events than a failure requiring repair or replacement of a component or system.

While there are different formulae to calculate availability for calculations involving systems, availability, in its simplest form, is the ratio of uptime observed during a specified interval over the total time of that interval and can be expressed as the following equation:

$$\text{Availability} = \frac{\text{Uptime within Observation Interval}}{\text{Total Time of Observation Interval}}$$

While the previous equation can generate the availability of a system, the result does not provide information which can be used to improve the observed availability value. By splitting total time into its two primary elements (uptime and downtime), the equation changes to the following:

$$\text{Availability} = \frac{\text{Uptime}}{\text{Uptime} + \text{Downtime}}$$

Going one step further, downtime itself can be split into two types: scheduled and unscheduled. When the two types of downtime are inserted into the availability equation, it results in the following:

$$\text{Availability} = \frac{\text{Uptime}}{\text{Uptime} + \text{Scheduled Downtime} + \text{Unscheduled Downtime}}$$

Thus, availability can be increased by reductions in one or both types of downtime. Examples of common scheduled and unscheduled events are shown in Table 1.
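The availability ratio described in this section, with downtime split into scheduled and unscheduled parts, can be sketched in a few lines of Python. This is a minimal illustration only: the function name `availability` and the hour figures are assumptions chosen for the example, not values from ANSI/BICSI 002.

```python
def availability(uptime, scheduled_downtime=0.0, unscheduled_downtime=0.0):
    """Return uptime / (uptime + scheduled downtime + unscheduled downtime).

    All three arguments must use the same time unit (hours here).
    """
    total = uptime + scheduled_downtime + unscheduled_downtime
    if total <= 0:
        raise ValueError("the observation interval must be longer than zero")
    return uptime / total


# Hypothetical one-year observation interval, in hours (illustrative numbers).
HOURS_PER_YEAR = 8760.0
scheduled = 8.0       # assumed planned maintenance windows
unscheduled = 0.76    # assumed unplanned outages

baseline = availability(HOURS_PER_YEAR - scheduled - unscheduled,
                        scheduled, unscheduled)

# Same interval if the scheduled downtime could be eliminated entirely.
improved = availability(HOURS_PER_YEAR - unscheduled, 0.0, unscheduled)

print(f"baseline: {baseline:.5f}")
print(f"improved: {improved:.5f}")
```

Keeping the two downtime terms separate makes it visible which kind of downtime dominates, which is the information the single-ratio form does not provide.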
Data Center Design Tools
Risk Analysis

It is impossible to eliminate the risk of downtime, but risk reduction is an important planning element. In an increasingly competitive world, it is imperative to address downtime in business decisions. The design of systems supporting critical IT functions depends on the interaction between the criticality of the function and its operational profile.

Criticality is defined as the relative importance of a function or process as measured by the consequences of its failure or inability to function. The operational profile expresses the time intervals over which the function or process must operate.

To provide optimal design solutions for a mission-critical data center, consider several key factors. NFPA 75 identifies seven considerations for protection of the environment, equipment, function, programming, records, and supplies in a data center. These include:

- What are the life safety aspects of the function? For example, if the system failed unexpectedly, would lives be put at risk? Examples of such applications include automated safety systems, air traffic control, and emergency call centers.
- What is the threat to occupants or exposed property from natural, man-made, or technology-caused catastrophic events?
- What would be the economic loss to the organization from the loss of function or loss of records?
- What is the access to redundant off-site processing systems (e.g., "high performance computing", massively parallel systems, cloud service provider, disaster recovery site, backup data center)?
- What would be the economic loss to the organization from damaged or destroyed equipment?
- What would be the impact of disrupted service to the organization's reputation? For example, would subscribers switch to a competitor's service?
- What would be the regulatory or contractual impact, if any? For example, if unplanned downtime resulted in loss of telephone service or electrical service to the community, would there be penalties from the government?

Data Center Availability Classes

To a great degree, design decisions are guided by the identified Availability Class. Therefore, it is essential to understand the meaning of each Availability Class before determining an Availability Class for a specific data center.

Availability Class 0

The objective of Class 0 is to support the basic requirements of the IT functions without supplementary equipment. Capital cost avoidance is the major driver. There is a high risk of downtime because of planned and unplanned events.

Availability Class 1

The objective of Class 1 is to support the basic requirements of the IT functions. There is a high risk of downtime because of planned and unplanned events. However, in Class 1 data centers, remedial maintenance can be performed during nonscheduled hours.
Of note, the cost of downtime must be weighed against the cost of mitigating risks in achieving high availability. Even an event as brief as a power interruption of less than a second or a cooling interruption of a few minutes can result in hours of recovery time. Thus, the objective is to identify the intersection between the allowed maximum annual downtime and the intended operational level. A function or process that has a high availability requirement with a low operational profile has less risk associated with it than a similar function with a higher operational profile.

Step 3: Determine Impact of Downtime

The third step in determining the Availability Class is to identify the impact or consequences of downtime. This is an essential component of risk management because not all downtime has the same impact on mission-critical data center services. Identifying the impact of downtime on mission-critical functions helps determine the tactics that will be deployed to mitigate downtime risk. As shown in Table 5, there are five impact classifications, each associated with a specific impact scope.

Step 4: Identify the Data Center Availability Class

The final step in determining the data center Availability Class is to combine the previously identified factors to arrive at a usable expression of availability. Since operational level is subsumed within the availability ranking, the task is to cross-reference the availability ranking against the impact of downtime to arrive at an appropriate Availability Class. Table 6 shows the intersection of these two values, and the resultant Data Center Availability Class.
Table 5: Impact classifications and impact scope

| Classification | Impact scope |
| Isolated | Local in scope, affecting only a single function or operation, resulting in a minor disruption or delay in achieving non-critical organizational objectives. |
| Minor | Local in scope, affecting only a single site, or resulting in a minor disruption or delay in achieving key organizational objectives. |
| Major | Regional in scope, affecting a portion of the enterprise (although not in its entirety) or resulting in a moderate disruption or delay in achieving key organizational objectives. |
| Severe | Multiregional in scope, affecting a major portion of the enterprise (although not in its entirety) or resulting in a major disruption or delay in achieving key organizational objectives. |
| Catastrophic | Affecting the quality of service delivery across the entire enterprise or resulting in a significant disruption or delay in achieving key organizational objectives. |
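The "allowed maximum annual downtime" mentioned before Step 3 follows from an availability target by simple arithmetic. The sketch below shows that conversion; it is an illustration under stated assumptions, not the ranking procedure of the standard. The function name, the default of 8,760 operating hours per year, and the sample targets are all hypothetical.

```python
def max_annual_downtime_minutes(availability_target,
                                operating_hours_per_year=8760.0):
    """Convert an availability target (0.0 to 1.0) into minutes of downtime
    per year, counted over the hours the function is required to operate."""
    if not 0.0 <= availability_target <= 1.0:
        raise ValueError("availability_target must be between 0 and 1")
    if operating_hours_per_year <= 0:
        raise ValueError("operating_hours_per_year must be positive")
    return (1.0 - availability_target) * operating_hours_per_year * 60.0


# Illustrative targets for a function that must operate around the clock.
for target in (0.99, 0.999, 0.9999):
    minutes = max_annual_downtime_minutes(target)
    print(f"{target:.2%} available -> {minutes:.1f} minutes of downtime per year")

# A function required only 2,000 hours per year (assumed operational profile)
# has a smaller downtime allowance in absolute terms at the same target.
part_time = max_annual_downtime_minutes(0.999, operating_hours_per_year=2000.0)
```

For a round-the-clock function, a 99.99% target leaves roughly 52.6 minutes of downtime per year, which shows how quickly the allowance shrinks as the target rises.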
Prior to virtualization, location-transparent applications, and cloud services, the optimal data center services configuration consisted of an alignment of the reliability classes across all the data center service layers. This provided the minimum required level of reliability and redundancy without overbuilding any one of the data center service layers.

System designs with clustered systems having nodes spread across two or more Class 3 data centers can meet or exceed the uptime of a system in a single Class 4 data center. In such a design, the first failover is to the local node (synchronous), the second failover is to a nearby data center (~16 km [10 miles], and still synchronous), and the third is to a remote data center (but asynchronous).

Such a design does increase the facility's overhead and, therefore, the cost. However, it offers a way for designers to avoid many of the costs associated with Class 4 data centers, whether owned, leased or collocated.

One of the values of the BICSI data center services reliability framework model is that it can be used to:

- Identify the minimum reliability targets.
- Provide a structured, methodical approach to guide decisions on how to adjust lower layer services to compensate for higher layer services reliability inadequacies.
- Guide discussions regarding the possible technical and cost benefits of increasing the reliability of the network architecture and higher layers above the targeted reliability class across multiple data centers so that cost savings can be realized by building each of the data center facilities to a lower Class than the targeted reliability classification.

On the following pages, three examples are provided to illustrate how the framework provides for multi-data center architecture.
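The statement that nodes spread across two or more Class 3 data centers can meet or exceed the uptime of a single Class 4 data center can be illustrated with the standard formula for redundant systems, in which the service is unavailable only when every site is unavailable at once. The sketch below assumes that site failures are independent and that failover always succeeds. The availability figures are hypothetical placeholders, not values assigned to any Class by ANSI/BICSI 002.

```python
def combined_availability(site_availabilities):
    """Availability of a service that stays up while at least one site is up.

    Assumes independent site failures and perfect failover, so the combined
    unavailability is the product of the individual unavailabilities.
    """
    sites = list(site_availabilities)
    if not sites:
        raise ValueError("at least one site is required")
    all_down = 1.0
    for a in sites:
        if not 0.0 <= a <= 1.0:
            raise ValueError("each availability must be between 0 and 1")
        all_down *= (1.0 - a)
    return 1.0 - all_down


# Hypothetical figures chosen only to show the effect of redundancy.
lower_class_site = 0.999     # assumed availability of one lower-class site
higher_class_site = 0.9999   # assumed availability of one higher-class site

two_sites = combined_availability([lower_class_site, lower_class_site])
print(f"two redundant lower-class sites: {two_sites:.6f}")
print(f"one higher-class site:           {higher_class_site:.6f}")
```

The independence assumption is the weak point of this estimate. A common mode event that takes out both sites together breaks it, which is why geographically shared risks need to be identified and evaluated.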
High Availability In-House Multi-Data Center Architecture Example

In this example, a customer has identified Class 3 as the targeted data center services reliability level. The customer has multiple facilities that can support critical data center functions. By provisioning the applications with a high-availability configuration across two data center facilities, the customer will be able to achieve the targeted reliability and availability objectives.

It is important that any man-made or natural event common mode risks that may exist within the geographical region that is common between the two data centers be identified and evaluated. The communications between the two data centers can be synchronous or asynchronous, depending on the recovery point objective (RPO) and recovery time objective (RTO) of the disaster recovery/business continuity requirements and the physical distance limitations between the two data centers.

There are times when there are man-made or natural event common mode risks to both data centers that have been deemed an acceptable risk to the organization. An example would be multi-regional events, such as multi-state power outages, that an organization deems acceptable. There would be no loss of data within the data center (running on backup power sources); however, the users would not have access to the applications or data as their networks and systems would be off-line throughout the multi-state region. The organization might determine that the users would not have an expectation of accessing the data in this scenario, and there would be no loss of revenue or business reputation as a result. Therefore, the costs associated with building out multiple data centers across a wider geographical area (possibly outside synchronous communication capabilities) may not be justified.
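One reason physical distance limits synchronous communication between two data centers is signal propagation delay, since each synchronous write waits for an acknowledgement from the other site. The sketch below estimates the round-trip propagation delay over optical fiber. The refractive index of 1.47 is a typical assumed value, the fiber route is assumed to be as long as the stated distance, and equipment and protocol delays are ignored, so the result is a lower bound rather than a design figure.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # speed of light in vacuum


def round_trip_delay_us(distance_km, refractive_index=1.47):
    """Estimate round-trip propagation delay over fiber, in microseconds.

    Light in fiber travels at roughly c / refractive_index. The default
    index is an assumed typical value, not a figure from the standard.
    """
    if distance_km < 0:
        raise ValueError("distance_km must not be negative")
    if refractive_index < 1.0:
        raise ValueError("refractive_index must be at least 1.0")
    one_way_seconds = distance_km * refractive_index / SPEED_OF_LIGHT_KM_PER_S
    return 2.0 * one_way_seconds * 1e6


# The ~16 km nearby-site distance mentioned earlier in this document,
# compared with a hypothetical 1,000 km remote site.
nearby = round_trip_delay_us(16.0)
remote = round_trip_delay_us(1000.0)
print(f"16 km:   about {nearby:.0f} microseconds per round trip")
print(f"1000 km: about {remote / 1000:.1f} milliseconds per round trip")
```

Whether a given delay is tolerable depends on the application and on the RPO and RTO it must meet, so a calculation like this only informs the choice between synchronous and asynchronous communication.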
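[Figure: data center services reliability framework for the in-house multi-data center example. Layers: Applications (A0 to A4), Network Architecture (N0 to N4), Telecommunications Cabling Infrastructure (C0 to C4), Facilities (F0 to F4). Link labels: Sync, Sync. Targeted class: Class 3.]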
[Figure: data center services reliability framework. Layers: Applications (A0 to A4), Network Architecture (N0 to N4), Telecommunications Cabling Infrastructure (C0 to C4), Facilities (F0 to F4). Link labels: Async, Async, Sync. Targeted class: Class 3.]
Private Cloud Multi-Data Center Architecture – Class 4 Solution/Four Class 2 Facilities

The second example is a customer that has identified two Class 4 data centers as the targeted data center services reliability level. By provisioning the private cloud applications across four Class 2 data center facilities, both within a common region and outside common regions, the customer may be able to achieve similar reliability and availability objectives. The applications can move around each of the data center facilities with the loss of any one facility or facilities within a region having little or no impact on the enterprise.

Within each common region, two of the data centers are connected via synchronous communications. The pairs of data centers located within each common region are connected to one another via asynchronous communications. Each pair of data centers would be located outside the other pair's region, ensuring no natural or man-made event represents a common mode of failure.

This example is not provided as a solution that will always equate to two Class 4 data centers, but it is provided to show how the data center services reliability framework can be used to evaluate various options.
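[Figure: data center services reliability framework for the private cloud multi-data center example. Layers: Applications (A0 to A4), Network Architecture (N0 to N4), Telecommunications Cabling Infrastructure (C0 to C4), Facilities (F0 to F4). Link labels: Async, Async, Sync. Targeted class: Class 4.]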
Project name:
Project number:
Project description:
Project location:
Further Your Data Center Career and Knowledge
BICSI Data Center Design Consultant (DCDC)
Data centers are more than just cabling and servers. Electrical and mechanical systems,
as well as the physical building and building systems all interact to meet a data center’s
operational requirements and mission.
Be recognized for not only having the knowledge, but also the ability to apply your
knowledge and skills in data center design. Those who have been awarded the DCDC
come from different facets of the data center environment, be it data center design,
construction or operations.
Relatively new to data centers? Qualifying to take the DCDC exam only requires a few
years of approved and verifiable experience in the design or construction of data centers
within the last six years. Hesitant because of today’s data center work pace? The DCDC
exam can be taken at any of 4,500 Pearson VUE Authorized Testing Centers worldwide,
avoiding the need to take extended time away from the site.
Learn more about all that BICSI has to offer for data
centers and the DCDC at www.bicsi.org/dcdc