Active Directory Disaster Recovery: Whitepaper Resource

WHITEPAPER RESOURCE

Active Directory Disaster Recovery

Written by Russell Smith

SPONSORED BY

A BWW Media Group Brand


EXECUTIVE SUMMARY

As the cornerstone of most enterprise IT systems, Active Directory has grown both in importance and complexity in recent years. Enterprise IT environments have evolved with the rise of the mobile workforce and cloud-based applications, and as a result, businesses have become increasingly dependent on Active Directory for authentication and authorization.

The new Active Directory usage landscape has introduced greater complexity to the enterprise IT environment, raising the risk of AD disasters tied to human error and cyberattack. More and more frequently, attackers are using Active Directory as an attack vector to compromise enterprises and, in some severe cases, wiping out the entire IT environment.

Active Directory Disaster Recovery has always been an extremely complicated process, requiring lengthy preparation, planning and testing. Depending on the size of the forest, and source of AD failure, restoring Active Directory can take days or more, rendering businesses non-functional during the recovery process. This white paper examines the complexity of Active Directory recovery, outlines potential Active Directory failures and solutions, and proves the necessity for an Active Directory Disaster Recovery plan.

THE COMPLEXITY OF ACTIVE DIRECTORY RECOVERY

Active Directory is not immune to disasters and recovering AD in the event of a disaster requires in-depth knowledge of how it works. Active Directory is designed for distributed networks and uses a multi-master replication model to ensure that the directory can be updated and queried efficiently in any location. As a multi-master replicated database, it is subject to replication timing constraints with the potential for different directory views depending on the Domain Controller (DC) to which you are connected.

The key challenge in Active Directory recovery is that you cannot simply restore a single domain controller from a backup and hope the environment is back to normal. DCs work together to form a topology and provide a set of services across an organization - this topology is built into the metadata of an AD forest and that metadata is retained within the AD’s backup. When you restore a domain controller, you must ensure that the metadata of the restored environment is consistent with the servers that are available, not those that used to be available. Otherwise, client systems will be unable to correctly leverage your newly restored environment.

In addition, the restoration itself must be carefully orchestrated. Root domain services must be brought up before children, Flexible Single Master Operation (FSMO) roles must be restored and the Global Catalog must be re-built. If you throw DNS into the mix, client systems will not find the right services if DNS, a critical piece of AD health, does not reflect the actual environment as it exists right now (as opposed to reflecting the previous environment).

In order to properly restore the existing environment, at least one domain controller in each domain should be restored from backup in isolation, and then reconnected to recreate the forest. Only once privileged accounts have had their passwords reset and any issues that were present before the restore operation have been corrected, should the remaining domain controllers be redeployed, and the new directory database allowed to replicate.

The bottom line is that the orchestration of the recovery of Active Directory is just as important as having backups of your DCs, and this complexity can greatly prolong the recovery process, if being done manually. The Microsoft Active Directory forest recovery guide only provides generic instructions that need to be adapted for each unique restore operation and require a lot of manual effort, meaning that Active Directory could be unavailable for a few days if you need to restore a full forest. The complexity of the recovery process will depend on what caused the disaster, so it’s critical to understand the root cause of the failure prior to performing a restore.
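
Orchestrating a restore depends on knowing which domain controllers held the FSMO roles and which were Global Catalogs before the failure. The following is a minimal sketch, not part of the original white paper, of how that topology might be recorded while the forest is still healthy. It assumes the ActiveDirectory PowerShell module (RSAT) is installed and that the account running it can read every domain in the forest.

```powershell
# Sketch only: run while the forest is healthy and keep the output
# somewhere that does not depend on Active Directory for access.
Import-Module ActiveDirectory

$forest = Get-ADForest

# Forest-wide FSMO role holders and the current Global Catalog servers
$forest | Select-Object Name, RootDomain, SchemaMaster, DomainNamingMaster, GlobalCatalogs

# Domain-wide FSMO role holders for each domain in the forest
foreach ($domainName in $forest.Domains) {
    Get-ADDomain -Server $domainName |
        Select-Object DNSRoot, ParentDomain, PDCEmulator, RIDMaster, InfrastructureMaster
}
```

The ParentDomain value shows the parent-before-child order in which domains would need to be brought back.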

WHEN DISASTER STRIKES


Information systems rely on Active Directory for user authentication and security, so any outage can be catastrophic. Some common events that cause Active Directory to fail, or actions that are irreversible, include:

• Database corruption
• Accidental or intentional deletion of objects
• Planned schema changes
• Unplanned or unsanctioned schema changes
• Raising the functional level of the domain or forest
• Permission changes

Disk and memory errors can cause database corruption, which often results in lsass.exe errors in the System event log and causes Active Directory Domain Services to halt. If you have two or more domain controllers in each site, the temporary unavailability of a single domain controller shouldn’t be critical. But if physical or logical corruption spreads to more than one domain controller, then it might be necessary to perform a complete forest restore.

For example, when a British Hospital Trust suffered a complete Active Directory failure in 2013, it took the IT team days to diagnose and repair the failure. The outage was caused by database corruption which happened over a long holiday weekend and went unnoticed until the following Tuesday morning. Experts from Microsoft and an IT consultancy worked for two days to restore the Trust’s Active Directory.

The outage delayed the treatment of 706 patients, and new appointments were recorded on paper for the duration of the outage and then entered manually once systems were back online.

RANSOMWARE & MALICIOUS ACTS

Ransomware, and other types of malware, infect end-user devices, but IT infrastructure is increasingly the target. End-user devices are usually the first target because they are not secured to the same level as domain controllers. Hackers can harvest privileged Active Directory account password hashes and Kerberos tickets from users’ PCs to stealthily access domain controllers without needing to know an account password. This so-called “pass the hash” attack on privileged Active Directory credentials gives hackers access to domain controllers and any systems that rely on Active Directory for security.

But ransomware isn’t the only danger. Insiders can intentionally or accidentally compromise Active Directory, especially in situations where security best practices are not followed. IT staff are commonly granted privileged access to Active Directory on a permanent basis, which makes a hacker’s job easier. Furthermore, separation of administration roles is rarely practiced, and security dependencies are created between highly-trusted systems, like domain controllers, and systems with lower trust, like end-user devices.

Automation technologies, like PowerShell scripts, can make large numbers of changes to Active Directory that quickly propagate. But poorly tested code can result in failures of production systems. Malicious software can also find its way onto domain controllers. But it only takes a single change to cause a failure that prevents domain controllers servicing logon requests, breaks replication, or prevents additional domain controllers being added to the domain or changes being made to the directory.

Because of these threats, organizations need to protect Active Directory and prepare for worst-case scenarios where the only option is to perform a complete forest restore.

PLANNED & MALICIOUS CHANGES

Regardless of how much planning and testing you carry out, applications and systems in your production environment could be affected by changes to Active Directory. Changes sometimes happen accidentally or are the result of malicious activity. Strict change control procedures can prevent unwanted changes, but unsanctioned changes could be carried out by a malicious actor, a disgruntled insider, or accidentally by a system administrator.

Schema changes, and raising the forest and domain functional levels, are both irreversible actions. Forest and domain functional levels determine the level of compatibility for the forest and domain respectively with domain controllers running older versions of Windows Server. When the domain and forest functional levels are raised, all domain controllers in the forest and domain must be running a version of Windows Server that is at least the same version as the functional level of the forest and domain.

Raising domain and forest functional levels is a safe operation if all domain controllers are running the required version of Windows Server to support the new functional level. Schema changes can be more problematic and should be tested in a pre-production lab environment before being approved for release in production because there’s no supported method for backing out of schema changes. If schema or functional level changes need to be reversed, the only option is to perform a complete forest restore.

OBJECT DELETION

Deleting directory objects, or changes to permissions on objects, can cause Active Directory to fail. Strict change control procedures, and adhering to security best practices, are the best ways to avoid accidental object deletion or modification. Active Directory also includes a flag that can be set on important objects to prevent users deleting them with one click. To enable the flag on every Organizational Unit (OU) in a domain, use the Get-ADOrganizationalUnit and Set-ADObject PowerShell cmdlets as shown below.

Get-ADOrganizationalUnit -Filter * | Set-ADObject -ProtectedFromAccidentalDeletion:$true

The Active Directory Recycle Bin can be used to restore deleted objects but it isn’t enabled by default. The forest functional level must be set to Windows Server 2008 R2 (or higher) and enabling the Recycle Bin is an irreversible change. Starting with the administration tools for Windows Server 2012, deleted objects can be restored using Active Directory Administrative Center (ADAC).

Using the Recycle Bin is preferable to restoring objects from backup or reanimating tombstoned objects. Performing an authoritative restore requires booting a domain controller into Directory Services Restore Mode. Note that removed link-valued attributes, such as group memberships, and cleared non-link-valued attributes, are not restored when you reanimate tombstoned objects.

Some organizations implement lag sites as a recovery solution and for restoring deleted objects. A lag site is an Active Directory site which has delayed replication from other sites in the domain. If objects are deleted from the directory, the lag site can be used to restore them. But lag sites shouldn’t be used as a complete recovery solution for several reasons.

Microsoft doesn’t support lag sites as a recovery solution. In the event of a malicious attack, AD can be configured so that objects are replicated immediately to the lag site. Additionally, lag sites are a security threat when objects deleted in the main site remain in the lag site. Consider a situation where a user account is deleted but still exists in the lag site. If the Netlogon service is enabled on domain controllers in the lag site, a deleted user might still be able to log on.

BOUNCING BACK

RECOVERING A SINGLE DOMAIN CONTROLLER

Corruption problems can sometimes be repaired in Directory Services Restore Mode (DSRM) using ntdsutil, a built-in command-line tool. DSRM is a safe mode for Active Directory that allows administrators to carry out repairs while the database is offline. In a worst-case scenario, where the database can’t be repaired and only one domain controller is affected, the server can be removed from the domain and re-promoted.

If one domain controller needs to be removed from the domain, move or seize (depending upon the state of the domain controller) any FSMO roles it holds and then remove the domain controller from the domain using the Uninstall-ADDSDomainController PowerShell cmdlet or Server Manager. If the domain controller is running Windows Server 2016, Windows Server 2012 R2, or Windows Server 2012, the demoted domain controller’s metadata is automatically removed from the directory, provided that “Force the removal of this domain controller” is not selected during removal, or that the -ForceRemoval parameter isn’t set to $true when using PowerShell.
Reinstall Active Directory on the same or different hardware and then let the directory partitions replicate to it. If you decide to use the same server hardware, it is important to determine the root cause of the failure before reinstating the server.

PERFORMING A FOREST RESTORE

In the event of a complete outage, security breach, or irreversible change to Active Directory, you should perform a forest restore to bring back all the domains in a forest. Restoring a forest is a complicated process that involves restoring Active Directory from full server backups of one domain controller in each domain, connecting the restored domains on an isolated network, and then adding the remaining domain controllers.

Performing a forest restore involves many steps that mean Active Directory could be unavailable for a couple of days. Microsoft only provides generic forest recovery instructions that need to be adapted for each unique restore operation. You can download the white paper here.

One domain controller in each domain must be restored from backup in isolation, and then reconnected to recreate the forest. Only once privileged accounts have had their passwords reset and any issues that were present before the restore operation have been corrected, should the remaining domain controllers be redeployed, and the new directory database allowed to replicate.

SELECTING A TRUSTED BACKUP

Microsoft recommends that you use a trusted backup that is a few days old to avoid restoring a copy of the database that reintroduces the problem that caused the failure, unless you can pinpoint exactly when the problem was introduced into the directory, with the help of the Windows event logs. In the case of a malicious attack, or complete forest meltdown, the event logs might not be available unless they are regularly shipped to a server that doesn’t rely on Active Directory for its security.

Using a backup that is a few days old will mean that the restored domains won’t include changes made to the directory in the days before the outage. But the effort required to reinstate these changes can offset the time lost in restoring domains that don’t resolve the issues present before the outage occurred. All group memberships should be reviewed after restoration, and this process will identify a significant number of the changes made post backup.

Starting in Windows Server 2008, the Active Directory database mounting tool (Dsamain.exe) can be used to mount the Active Directory database from backups made using ntdsutil, Windows Backup, or a backup tool that supports Active Directory. A mounted database can be viewed using ldp.exe or Active Directory Users and Computers (ADUC). The ability to view the database in this way is useful when determining which backup to use for a restore operation. In older versions of Windows Server, it was necessary to restore a domain controller to view the Active Directory database.

FULL SERVER RESTORE

Windows Server 2008 (and later) doesn’t support restoring a server using the system state to a new installation of Windows, regardless of whether it is installed on the same or new hardware. Therefore, you should make full server backups of domain controllers, perform full server restores, and only perform a system state restore after a full server restore to mark SYSVOL as authoritative if the restored server is the first writeable domain controller in the domain. At least two writeable domain controllers should be backed up in each domain.

If you don’t want to make two backups for each domain controller, i.e. a full server backup and a system state backup, then SYSVOL can be marked authoritative by editing the msDFSR-Options attribute (https://support.microsoft.com/en-us/help/2218556/how-to-force-an-authoritative-and-non-authoritative-synchronization-fo) in Active Directory if SYSVOL is replicated using DFSR. If it is replicated using FRS, then you will need to stop the FRS service, edit the BurFlags registry key (https://support.microsoft.com/en-us/help/290762/using-the-burflags-registry-key-to-reinitialize-file-replication-servi), and restart the service.

A writeable domain controller should be restored in the forest root first to make sure that the Schema Admins and Enterprise Admins groups are present before other domains are restored and to make sure that the trust hierarchy isn’t broken during the restore process. Unless the forest consists of a single domain, the domain controller you restore should not be a Global Catalog. If you have no choice but to restore a domain controller that was a Global Catalog, disable the Global Catalog after the restore operation is complete to prevent lingering objects. You should perform a non-authoritative restore of Active Directory ‘Directory Services’ and an authoritative restore of the SYSVOL share so that when additional domain controllers are added to the domain, they synchronize the contents of SYSVOL from a server that has been set as authoritative. You can perform a restore using the built-in Windows Backup tool or a third-party backup solution that supports Active Directory. Once the forest root domain is in place, you can begin to recover other domains simultaneously, providing that parent domains are always restored before the child domains. The last step is to make a domain controller in the forest root a Global Catalog. Once all the domains are restored, you can check that they are working using the dcdiag, nltest, and repadmin tools on an isolated network.

Before connecting the restored forest back to the production network and redeploying other domain controllers, you should clean up the metadata for all other writeable domain controllers in the domain. This will make sure that NTDS-settings objects are not duplicated, and unnecessary replication links are not created. Furthermore, if a restored domain controller held the RID master FSMO role before recovery, it won’t be able to create new relative IDs (RIDs) until the metadata for all other writeable domain controllers is removed. RIDs form part of the unique security identifier (SID) that is assigned to each new Active Directory security principal.

RESTORING THE REMAINING DOMAIN CONTROLLERS

If you are sure that the forest failure wasn’t caused by something outside of Active Directory, e.g. a hardware failure or security breach, you can connect the restored forest to the production network and add the remaining domain controllers to each domain without reinstalling Windows Server. Before you add the domain controllers back to the restored forest, forcibly remove them from the failed domain. If forest failure was caused by something outside of Active Directory, like ransomware, then you must reinstall Windows Server.

RISK AND IMPACT ASSESSMENT

Implementing security best practices, and the latest technologies in Windows Server 2016 and Windows 10, helps reduce the likelihood of a successful ransomware attack. But systems can never be one hundred percent secure, so a disaster recovery plan for Active Directory is essential. Performing an impact assessment for Active Directory involves mapping security dependencies to determine which critical business systems rely on Active Directory for security. Once these dependencies have been established, you will be able to identify all the systems that rely on Active Directory. Bringing Active Directory online as quickly as possible after a failure requires a tested disaster recovery plan. The details of the plan will depend on many factors:

• which version of Windows Server each domain controller is running
• whether domain controllers are installed on physical or virtual hosts
• how you will determine which is the latest trusted backup for each domain
• whether domain controllers will be installed to the same or new hardware
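
Some of the factors listed above can be gathered directly from the directory. As a minimal sketch, assuming the ActiveDirectory PowerShell module and read access to every domain in the forest, the following lists each domain controller with its operating system version, site, Global Catalog status, and FSMO roles. It does not show whether a host is physical or virtual, which has to be recorded separately.

```powershell
Import-Module ActiveDirectory

# Inventory every domain controller in every domain of the current forest.
(Get-ADForest).Domains | ForEach-Object {
    Get-ADDomainController -Filter * -Server $_ |
        Select-Object Domain, HostName, Site, OperatingSystem,
                      IsGlobalCatalog, IsReadOnly, OperationMasterRoles
} | Sort-Object Domain, HostName | Format-Table -AutoSize
```

An inventory like this is only useful during a recovery if a copy is kept somewhere that does not rely on Active Directory for access.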
Regulatory standards and service level agreements (SLAs) may also impact decisions in how you plan for disaster recovery. The recovery process can be sped up by using full server restores instead of system state backups. But orchestrating a full forest restore is difficult using the standard tools because they are not designed for automation. As part of designing a recovery plan, you should determine which domain controllers are required to get line-of-business systems back online even if performance is impacted.

A further concern for companies with a hybrid cloud solution is Azure Active Directory, which in larger organizations is almost always synchronized with on-premises Active Directory. Extending on-premises Active Directory to the cloud introduces additional risk and complexity to management and to the recovery process.

A forest-wide Active Directory failure can cause a complete outage of all business systems, and recovery can be complex and time-consuming. Following best practice advice from Microsoft is an essential step in ensuring that Active Directory is protected. But nothing can replace a proven disaster recovery plan.

SUMMARY

In recent years, businesses have become increasingly dependent on Active Directory, expanding their reliance on AD to include authentication and authorization of the mobile workforce and cloud-based applications. This increased dependency has led to greater complexity in the enterprise IT environment and raised the risk of Active Directory disasters tied to ransomware, malicious acts or misconfigurations, and human error. While some Active Directory failures can be repaired manually, recovering Active Directory in case of a disaster is a long, cumbersome process that can leave businesses offline for days. The only way to ensure continued business operations is by making sure that Active Directory is truly protected and a solid disaster recovery plan is in place.

Semperis is an enterprise identity protection company that enables organizations to quickly
recover from accidental or malicious changes and disasters that compromise Active Directory,
on-premises and on cloud. The Semperis Directory Services Protection Platform™ provides
enterprises with the capabilities to automatically restore an entire Active Directory forest,
quickly recover thousands of objects or a single crucial attribute, and instantly revert to a
previous Active Directory state. Semperis customers include Fortune 500 companies and
enterprises spanning financial, healthcare, government and other industries worldwide.
