Database Recovery

Recovery Concepts

In database management systems (DBMS), recovery concepts refer to the mechanisms and
processes used to ensure that a database remains in a consistent state despite failures or errors.
The core recovery concepts used in most DBMSs are undo/redo logging, write-ahead logging
(WAL), checkpointing, transaction commit and rollback, and recovery algorithms such as ARIES.
1. Undo/Redo Logging:
 Undo Logging: Before making changes to a database, the DBMS writes the original data
values (undo information) to a log file. If a transaction needs to be rolled back, the DBMS
uses this undo information to restore the original data values.
 Redo Logging: After making changes to a database, the DBMS writes the new data values
(redo information) to a log file. If a transaction commits but the changes have not been
permanently written to the database files, the redo information is used to reapply the changes
during recovery.
2. Write-Ahead Logging (WAL):
 In WAL, changes to the database are first written to a log file before being applied to the
actual database files. This ensures that the log reflects every change before it reaches the
database, providing a way to recover the database to a consistent state if a failure occurs.
(A minimal sketch combining undo/redo logging with write-ahead ordering follows this list.)
3. Checkpointing:
 Checkpointing is the process of periodically writing all modified data pages from memory to
disk and flushing the corresponding log records to disk. This ensures that all committed
changes are safely stored on disk, reducing the amount of redo work needed during recovery.
4. Transaction Commit and Rollback:
 A commit operation marks the successful completion of a transaction, ensuring that its
changes are durable and can be recovered in case of failure. A rollback operation reverses the
changes made by a transaction that has not yet committed.
5. Recovery Algorithms:
 DBMSs use recovery algorithms such as ARIES (Algorithms for Recovery and Isolation
Exploiting Semantics) to ensure that the database can be recovered to a consistent state after
a failure.
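
To make the interaction between undo/redo information and the write-ahead rule concrete, here is a minimal, self-contained Python sketch. It is an illustration only, not the logging code of any real DBMS: the LogRecord and SimpleWAL names, and the use of in-memory lists and dictionaries in place of log files and data pages, are assumptions made for the example.

```python
# Minimal sketch of undo/redo logging with write-ahead ordering.
# All names here are illustrative, not a real DBMS API.
from dataclasses import dataclass

@dataclass
class LogRecord:
    txn_id: str
    item: str
    old_value: object   # undo information
    new_value: object   # redo information

class SimpleWAL:
    def __init__(self):
        self.log = []          # stands in for the log file on disk
        self.database = {}     # stands in for the data pages on disk

    def write(self, txn_id, item, new_value):
        # Write-ahead rule: append (and "flush") the log record
        # before the database item itself is changed.
        self.log.append(LogRecord(txn_id, item, self.database.get(item), new_value))
        self.database[item] = new_value

    def rollback(self, txn_id):
        # UNDO: restore old values, scanning the log backwards.
        for rec in reversed(self.log):
            if rec.txn_id == txn_id:
                if rec.old_value is None:
                    self.database.pop(rec.item, None)
                else:
                    self.database[rec.item] = rec.old_value

    def redo(self, txn_id):
        # REDO: reapply new values in forward order after a crash,
        # for transactions that committed but whose pages were lost.
        for rec in self.log:
            if rec.txn_id == txn_id:
                self.database[rec.item] = rec.new_value

wal = SimpleWAL()
wal.write("T1", "x", 42)
wal.rollback("T1")          # x is removed: T1 never committed
```

Rolling back uses the old values (undo), while redo reapplies new values after a crash; in both cases the log record is written before the data item, which is exactly the write-ahead ordering described above.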

NO-UNDO/NO-REDO
In database management systems, the NO-UNDO/NO-REDO recovery concept refers to a recovery
scheme that uses neither undo nor redo logs for transaction recovery. Instead, it relies on other
mechanisms to ensure data consistency and durability. This approach is primarily used in some
real-time and embedded systems where the overhead of logging and recovery is considered too
high.
Here's how the NO-UNDO/NO-REDO recovery concept works:
1. No Undo Logs: In NO-UNDO systems, there are no undo logs. Uncommitted changes never reach
the database on disk, so there is nothing to undo after a failure; once a transaction commits, its
changes are considered permanent.
2. No Redo Logs: Redo logs are used to replay changes made by committed transactions during a
system crash or failure. In NO-REDO systems, there are no redo logs. Instead, the system relies
on periodic checkpoints to ensure that committed changes are flushed to disk.
3. Checkpointing: Checkpointing is used to periodically write the database's current state to disk,
ensuring that all committed changes are safely stored. Checkpoints are crucial in
NO-UNDO/NO-REDO systems because they remove the need for redo logs.
4. Logging for Crash Recovery: Although there are no undo or redo logs, some form of logging is
still used for crash recovery. This logging typically involves recording metadata and transaction
commit information to ensure that the database can recover to a consistent state after a crash.

Recovery Based on Deferred Update


The deferred update approach postpones any actual updates to the database itself until the
transaction completes its execution successfully and reaches its commit point.
During transaction execution, the updates are recorded only in the log and in the transaction
workspace. After the transaction reaches its commit point and the log is force-written to disk, the
updates are recorded in the database itself.
If a transaction fails before reaching its commit point, there is no need to undo any operations,
because the transaction has not affected the database in any way.
The steps involved in the deferred update protocol are as follows:
1. When a transaction starts, write an entry start_transaction(T) to the log.
2. When any operation is performed that will change values in the database, write a log entry
write_item(T, x, new_value); the old value is not needed, because no undo is ever performed.
3. When a transaction is about to commit, write a log record of the form commit(T); write all log
records to disk.
4. Commit the transaction, using the log to write the updates to the database; the writing of data to
disk need not occur immediately.
5. If the transaction aborts, ignore the log records and do not write the changes to disk.

The database is never updated until after the transaction commits, and there is never a need to
UNDO any operations. Hence this technique is known as the NO-UNDO/REDO algorithm.
The REDO is needed in case the system fails after the transaction commits but before all its
changes are recorded in the database. In this case, the transaction operations are redone from the
log entries.
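
The deferred update steps above can be sketched directly in Python. This is a toy illustration, not a real recovery manager: DeferredUpdateManager, the in-memory workspace, and the tuple-based log are all hypothetical names chosen for the example, and only the new value is logged because no undo is ever performed.

```python
# Illustrative sketch of the deferred update (NO-UNDO/REDO) protocol.
# Names and structure are hypothetical, not a specific DBMS.

class DeferredUpdateManager:
    def __init__(self):
        self.database = {}     # "disk" copy, touched only after commit
        self.log = []          # redo-only log entries
        self.workspace = {}    # per-transaction buffered writes

    def start_transaction(self, t):
        self.log.append(("start_transaction", t))
        self.workspace[t] = {}

    def write_item(self, t, item, new_value):
        # Step 2: record only the new value (no undo info is needed)
        # and buffer the change in the transaction workspace.
        self.log.append(("write_item", t, item, new_value))
        self.workspace[t][item] = new_value

    def commit(self, t):
        # Steps 3-4: force the log, then apply the buffered updates.
        self.log.append(("commit", t))
        for item, value in self.workspace.pop(t).items():
            self.database[item] = value

    def abort(self, t):
        # Step 5: simply discard the workspace; the database was never touched.
        self.workspace.pop(t, None)

    def redo_after_crash(self):
        # REDO committed transactions from the log; nothing is ever undone.
        committed = {e[1] for e in self.log if e[0] == "commit"}
        for entry in self.log:
            if entry[0] == "write_item" and entry[1] in committed:
                _, _, item, new_value = entry
                self.database[item] = new_value
```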

Recovery Technique Based on Immediate Update


In the immediate update technique, the database may be updated by the operations of a
transaction immediately, before the transaction reaches its commit point. However, these
operations are typically recorded in the log on disk by force-writing before they are applied to
the database, so that recovery is possible. A toy sketch of this protocol follows the steps below.
When immediate update is allowed, provisions must be made for undoing the effect of update
operations on the database, because a transaction can fail after it has applied some updates to the
database itself. Hence recovery schemes based on immediate update must include the capability
to roll back a transaction by undoing the effect of its write operations.
The steps involved in the immediate update protocol are as follows:
1. When a transaction starts, write an entry start_transaction(T) to the log;
2. When any operation is performed that will change values in the database, write a log entry
write_item(T, x, old_value, new_value);
3. Write the log to disk;
4. Once the log record is written, write the update to the database buffers;
5. When convenient, write the database buffers to the disk;
6. When a transaction is about to commit, write a log record of the form commit(T);
7. Write the log to disk.
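A matching toy sketch of the immediate update protocol is shown below. Again, the class name, the dictionary standing in for database buffers, and the tuple-based log records are assumptions made for illustration; the point is that old values are logged before the database is changed, so rollback can undo a failed transaction.

```python
# Illustrative sketch of the immediate update protocol (UNDO/REDO).
# Hypothetical names; real systems operate on pages and buffers, not dicts.

class ImmediateUpdateManager:
    def __init__(self):
        self.database = {}   # stands in for database buffers/disk
        self.log = []        # (record_type, T, item, old_value, new_value)

    def start_transaction(self, t):
        self.log.append(("start_transaction", t, None, None, None))

    def write_item(self, t, item, new_value):
        # Steps 2-4: log old and new values (force-write the log),
        # then apply the change to the database immediately.
        old_value = self.database.get(item)
        self.log.append(("write_item", t, item, old_value, new_value))
        self.database[item] = new_value

    def commit(self, t):
        # Steps 6-7: write the commit record and force the log to disk.
        self.log.append(("commit", t, None, None, None))

    def rollback(self, t):
        # UNDO: scan the log backwards and restore old values for T.
        for rec in reversed(self.log):
            if rec[0] == "write_item" and rec[1] == t:
                _, _, item, old_value, _ = rec
                if old_value is None:
                    self.database.pop(item, None)
                else:
                    self.database[item] = old_value
```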
Shadow paging
In the shadow page scheme, the database is not directly modified but a copy, stored on
permanent storage (e.g. disk), is made of the portion of the database to be modified and all
modifications are made to this copy. Meanwhile, the old version of the database remains intact.
Once the transaction commits, the modified copy replaces the original in an atomic manner, i.e.
the replacement is carried out in its entirety or not at all. If a system crashes at this point, the old
version is still available for recovery.
Shadow paging scheme in a single-user environment
In the shadow page scheme, two page tables are used: the original page table (the shadow page
table) and the current page table. Initially, both page tables point to the same blocks of physical
storage. The current page table is the only route through which a transaction can gain access to
the data stored on disk; that is, a transaction always uses the current page table to retrieve the
appropriate database blocks.

During transaction execution, the shadow page table is never modified. When a write operation
is performed, a new copy of the modified database page is created, but the old copy of that page
is not overwritten. Instead, the new page is written elsewhere on some previously unused disk
block. The current page table entry is modified to point to the new disk block, whereas the
shadow page table is not modified and continues to point to the old, unmodified disk block. For
pages updated by the transaction, two versions are therefore kept: the old version, referenced by
the shadow page table, and the new version, referenced by the current page table.
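The two-table mechanism can be sketched as follows. This is an illustrative model, assuming a flat list of disk blocks and copy-on-write of whole pages; the class and method names are hypothetical.

```python
# Illustrative sketch of shadow paging with a shadow and a current page table.
# Hypothetical structure: "disk" is a growable list of blocks.

class ShadowPagingDB:
    def __init__(self, pages):
        self.disk = list(pages)                       # physical blocks
        self.shadow_table = list(range(len(pages)))   # never modified during the txn
        self.current_table = list(self.shadow_table)  # used for all access

    def read(self, page_no):
        return self.disk[self.current_table[page_no]]

    def write(self, page_no, data):
        # Copy-on-write: put the new version on an unused block and
        # point only the current page table at it.
        self.disk.append(data)
        self.current_table[page_no] = len(self.disk) - 1

    def commit(self):
        # Atomic switch: the current table becomes the new shadow table.
        self.shadow_table = list(self.current_table)

    def abort_or_crash(self):
        # Recovery is free: discard the current table and reuse the shadow.
        self.current_table = list(self.shadow_table)

db = ShadowPagingDB(["A0", "B0"])
db.write(0, "A1")      # old block for page 0 is left intact
db.abort_or_crash()    # page 0 still reads as "A0"
```

Commit simply makes the current table the new shadow table, and a crash or abort discards the current table, so neither undo nor redo logging is needed.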
Database Backup & Recovery from Catastrophic Failure
A catastrophic failure is one in which a stable secondary storage device is corrupted or destroyed,
so all the valuable data stored on it is lost.
In other words, a catastrophic failure is a type of database failure that is severe and often results in
the loss of a significant amount of data.
In these cases, it may not be possible to recover the database using traditional methods, such as
restoring from a backup or repairing the database (a small backup sketch appears after the steps below).
If you are faced with a catastrophic failure, there are several steps you can take to try to recover
the database:
1. Assess the damage: The first step in recovering from a catastrophic failure is to assess the
damage and determine the extent of the data loss. This will help you to understand what needs to
be done to rebuild the database.
2. Shut down the archiver: The archiving software may still be trying to write archive logs or
incremental backups; shut it down, since the back end is not working.
3. Identify the point of failure: Ascertain which components have failed, whether hardware,
software, or the database environment. Repair or replace components as required.
4. Gather data from other sources: If you do not have a recent, valid backup of the database, you
may need to gather data from other sources, such as spreadsheets or text files, to rebuild the
database.
5. Recreate the database structure: You will need to recreate the database structure, including
tables, indexes, and other objects, in order to rebuild the database.
6. Import the data: Once the database structure has been recreated, you can import the data into
the new database. This may involve cleaning and formatting the data to ensure that it is
compatible with the new database structure.

7. Test and validate the database: Once the database has been rebuilt, it is important to test and
validate the data to ensure that it is complete and accurate. This may involve comparing the data
to the original source or running queries to verify that the data is correct.
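As a small, hedged illustration of the backup step that makes such recovery possible, the sketch below uses the online-backup API of Python's built-in sqlite3 module as a stand-in for a full DBMS backup tool; the file paths are hypothetical.

```python
# Sketch: taking and restoring a backup with Python's built-in sqlite3
# online-backup API, as a stand-in for a full DBMS backup tool.
# The file paths are hypothetical.
import sqlite3

def take_backup(db_path="production.db", backup_path="backup.db"):
    src = sqlite3.connect(db_path)
    dst = sqlite3.connect(backup_path)
    with dst:
        src.backup(dst)          # consistent copy even while src is in use
    src.close()
    dst.close()

def restore_backup(backup_path="backup.db", db_path="rebuilt.db"):
    src = sqlite3.connect(backup_path)
    dst = sqlite3.connect(db_path)
    with dst:
        src.backup(dst)          # restore is just a copy in the other direction
    src.close()
    dst.close()
```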
Database Performance Tuning
Database Performance Tuning refers to the process of optimizing database systems for improved
performance and efficiency. It involves making adjustments in the database configuration, query
design, indexing, and other system parameters to achieve maximum performance. By fine-tuning
databases, businesses can ensure a faster response time for data retrieval and analytics,
contributing to better decision making and enhanced user experiences.
Database Performance Tuning involves various techniques and methodologies aimed at
improving database performance (a small index-tuning sketch appears after this list), such as:
 Query optimization: Rewriting SQL queries for better execution plans
 Index management: Creating, modifying, and deleting indexes to optimize data access
 Resource allocation: Assigning memory, CPU, and disk resources for optimal performance
 Database design: Designing database schemas that allow efficient data processing
 Data partitioning: Dividing large tables into smaller, more manageable pieces for improved
query performance
 Caching: Storing frequently accessed data in memory for faster retrieval
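The sketch below illustrates index management and query-plan inspection using SQLite through Python's sqlite3 module. The table, column, and index names are invented for the example; the same idea applies to the EXPLAIN facilities of other DBMSs.

```python
# Sketch: how an index changes the access path, using SQLite via
# Python's sqlite3 module. Table, column, and index names are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
conn.executemany("INSERT INTO orders (customer_id, total) VALUES (?, ?)",
                 [(i % 100, i * 1.5) for i in range(10_000)])

query = "SELECT SUM(total) FROM orders WHERE customer_id = 42"

# Without an index: SQLite reports a full table scan.
print(conn.execute("EXPLAIN QUERY PLAN " + query).fetchall())

# Index management: create an index on the filtered column.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# With the index: the plan now searches idx_orders_customer instead of scanning.
print(conn.execute("EXPLAIN QUERY PLAN " + query).fetchall())
```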
Database security
Database security refers to the protection of databases from unauthorized access, use, disclosure,
disruption, modification, or destruction. It includes a variety of security measures to ensure the
confidentiality, integrity, and availability of data stored in a database.
Database security is essential for organizations to protect sensitive information, such as personal
data, financial records, intellectual property, and other critical assets. Breaches in database
security can lead to data loss, identity theft, financial fraud, and damage to an organization's
reputation.
Key aspects of database security include:
1. Access Control: Implementing mechanisms to ensure that only authorized users and
applications can access the database. This includes user authentication, role-based access control
(RBAC), and the principle of least privilege (a minimal sketch of these ideas follows this list).
2. Data Encryption: Encrypting sensitive data to protect it from unauthorized access. Encryption
should be applied both at rest (when data is stored) and in transit (when data is transmitted over
networks).
3. Auditing and Monitoring: Monitoring and logging database activities to detect and respond to
security incidents and anomalies. Auditing can help identify unauthorized access attempts and
ensure compliance with security policies and regulations.
4. Database Patching and Updates: Regularly applying patches and updates to the database
management system (DBMS) and associated software to protect against known vulnerabilities
and security threats.
5. Backup and Recovery: Implementing regular backups of the database to ensure data
availability and recoverability in the event of a data loss or security breach.
6. Security Policies and Procedures: Establishing and enforcing database security policies and
procedures to govern access, usage, and handling of sensitive data.
7. Security Best Practices: Following industry best practices and guidelines for database security
to enhance the overall security posture of the database environment.
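As a minimal sketch of two of these aspects, authentication and role-based access control, the example below uses only the Python standard library. The role table, role names, and function names are hypothetical; a real DBMS enforces these controls internally.

```python
# Sketch: password hashing (authentication) and a role-based access
# check (RBAC), using only the Python standard library. The in-memory
# "role table" and all names are hypothetical.
import hashlib
import hmac
import os

ROLE_PERMISSIONS = {          # role -> allowed operations (least privilege)
    "analyst": {"SELECT"},
    "app":     {"SELECT", "INSERT", "UPDATE"},
    "dba":     {"SELECT", "INSERT", "UPDATE", "DELETE", "ALTER"},
}

def hash_password(password, salt=None):
    # Salted, slow hash so stored credentials are not reversible.
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return salt, digest

def verify_password(password, salt, digest):
    return hmac.compare_digest(hash_password(password, salt)[1], digest)

def authorize(role, operation):
    # Deny by default; allow only operations granted to the role.
    return operation in ROLE_PERMISSIONS.get(role, set())

salt, digest = hash_password("s3cret")
assert verify_password("s3cret", salt, digest)
assert authorize("analyst", "SELECT") and not authorize("analyst", "DELETE")
```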

Concept of Parallel and Distributed Databases


A distributed database is defined as a collection of logically related, shared data that is
physically distributed across different sites on a computer network.
A distributed DBMS (DDBMS) is the software that manages the distributed database and makes
the distributed data available to users.
A distributed DBMS consists of a single logical database that is divided into a number of pieces
called fragments. In a DDBMS, each site is capable of independently processing users'
requests.
Users can access the DDBMS via applications classified as:
1. Local Applications –
Applications that do not require data from other sites are classified as local applications.
2. Global Applications –
Applications that require data from other sites are classified as global applications.
Characteristics of a DDBMS:
A DDBMS has the following characteristics (a small fragmentation sketch follows this list):
1. A collection of logically related shared data.
2. The data is split into a number of fragments.
3. Fragments may be replicated.
4. Fragments are allocated to sites.
5. The data at each site is under the control of, and managed by, a local DBMS.
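The sketch below illustrates horizontal fragmentation, with rows of one logical table allocated to sites by a selection predicate. The table contents, the region-based allocation rule, and the site names are all hypothetical.

```python
# Sketch: horizontal fragmentation of one logical table across sites.
# The allocation rule (by region) and all names are hypothetical.

customers = [
    {"id": 1, "name": "Asha",  "region": "east"},
    {"id": 2, "name": "Binod", "region": "west"},
    {"id": 3, "name": "Chen",  "region": "east"},
]

# Each fragment is defined by a selection predicate and allocated to a site.
fragments = {
    "site_A": [row for row in customers if row["region"] == "east"],
    "site_B": [row for row in customers if row["region"] == "west"],
}

def global_query(predicate):
    # A global application touches several sites and unions the results;
    # a local application would read only its own site's fragment.
    return [row for frag in fragments.values() for row in frag if predicate(row)]

print(global_query(lambda r: r["id"] < 3))
```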
Parallel DBMS:
A parallel DBMS is a DBMS that runs across multiple processors and is designed to execute
operations in parallel whenever possible. A parallel DBMS links a number of smaller
machines to achieve the same throughput as expected from a single large machine.
There are three main architectures for a parallel DBMS:
1. Shared Memory –
Shared memory is a tightly coupled architecture in which a number of processors within a single
system share system memory. It is also known as symmetric multiprocessing (SMP). This
approach is popular on platforms such as personal workstations that support a few
microprocessors in parallel.
2. Shared Disk –
Shared disk is a loosely coupled architecture used for applications that are centralized and
require high availability and performance. Each processor can access all disks directly,
but each has its own private memory. This architecture is also known as a cluster.
3. Shared Nothing –
Shared nothing is a multiple-processor architecture in which every processor is part of a
complete system with its own memory and disk storage (its own resources). It is also
called Massively Parallel Processing (MPP); a toy sketch of this style appears after the feature list below.
Features:
1. CPUs work in parallel.
2. Performance is improved.
3. Large tasks are divided into smaller subtasks.
4. Work is completed very quickly.
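The sketch below is a toy model of shared-nothing-style parallelism: each worker process scans only its own partition and a coordinator combines the partial results. It uses Python's multiprocessing module purely for illustration; the partition contents and function names are hypothetical.

```python
# Sketch: shared-nothing-style parallel aggregation. Each worker scans
# only its own partition (its "own disk") and the partial sums are
# combined by a coordinator. Names are hypothetical.
from multiprocessing import Pool

def scan_partition(rows):
    # Local work on one partition: filter and partially aggregate.
    return sum(amount for amount in rows if amount > 100)

if __name__ == "__main__":
    partitions = [
        [50, 120, 300],      # partition on node 1
        [90, 150, 40],       # partition on node 2
        [500, 10, 110],      # partition on node 3
    ]
    with Pool(processes=len(partitions)) as pool:
        partials = pool.map(scan_partition, partitions)   # parallel scans
    print(sum(partials))     # coordinator combines partial results
```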
Data Warehouse
A data warehouse is a repository where data is stored for useful mining. It is a high-performance
system with exceptionally large data storage capacity. Data from an organization's various
systems is copied to the warehouse, where it can be fetched and conformed to remove errors.
Complex analytical requests can then be made against the data stored in the warehouse.

A data warehouse combines data from numerous sources, which helps ensure data quality, accuracy,
and consistency. A data warehouse also boosts system performance by separating analytics processing
from transactional databases. Data flows into a data warehouse from different databases. A data
warehouse works by organizing data into schemas that describe the format and types of the data,
and query tools examine the data tables using these schemas.
Data warehouses and databases are both relational data systems, but they are built to serve
different purposes. A data warehouse is built to store a huge amount of historical data and to
support fast queries over all of that data, typically using Online Analytical Processing (OLAP).
A database is built to store current transactions and allow quick access to specific transactions
for ongoing business processes, commonly known as Online Transaction Processing (OLTP).
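The difference can be illustrated with a small, hypothetical fact table in SQLite: an OLTP-style query fetches one specific transaction, while an OLAP-style query aggregates history across the whole table.

```python
# Sketch: OLTP-style lookup vs OLAP-style aggregation over a tiny,
# hypothetical fact table, using SQLite in memory.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_fact (order_id INTEGER, region TEXT, year INTEGER, amount REAL)")
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?)", [
    (1, "east", 2023, 120.0), (2, "west", 2023, 80.0),
    (3, "east", 2024, 200.0), (4, "west", 2024, 150.0),
])

# OLTP: fetch one current transaction quickly.
print(conn.execute("SELECT * FROM sales_fact WHERE order_id = ?", (3,)).fetchone())

# OLAP: aggregate historical data across the whole table.
print(conn.execute(
    "SELECT region, year, SUM(amount) FROM sales_fact GROUP BY region, year"
).fetchall())
```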
Advantages of Data Warehouse:
o More accurate data access
o Improved productivity and performance
o Cost-efficient
o Consistent and quality data
Data Mining:
Data mining refers to the analysis of data. It is the computer-supported process of analyzing huge
sets of data that have either been compiled by computer systems or downloaded into the
computer. In the data mining process, the computer analyzes the data and extracts useful
information from it. It looks for hidden patterns within the data set and tries to predict future
behavior. Data mining is primarily used to discover and indicate relationships among the data
sets.

Data mining aims to enable business organizations to view business behaviors, trends, and
relationships so that the business can make data-driven decisions. It is also known as
Knowledge Discovery in Databases (KDD). Data mining tools utilize AI, statistics, databases, and
machine learning systems to discover relationships within the data. Data mining tools can
answer business-related questions that were traditionally too time-consuming to resolve.
Important features of Data Mining:
The important features of Data Mining are given below:
o It automates the discovery of patterns.
o It predicts the expected results.
o It focuses on large data sets and databases
o It creates actionable information.
Advantages of Data Mining:
i. Market Analysis:
Data mining can predict market behavior, which helps businesses make decisions. For example, it
can predict who is likely to purchase what type of products.
ii. Fraud detection:
Data Mining methods can help to find which cellular phone calls, insurance claims, credit, or
debit card purchases are going to be fraudulent.
iii. Financial Market Analysis:
Data mining techniques are widely used to help model financial markets.
iv. Trend Analysis:
Analyzing current trends in the marketplace is a strategic benefit because it helps reduce costs
and align the manufacturing process with market demand.
Big data
Big data refers to extremely large and complex data sets that cannot be easily managed,
processed, or analyzed using traditional data processing applications. Big data is characterized by
the volume, velocity, and variety of the data.
 Volume: Big data involves massive amounts of data, often ranging from terabytes to petabytes
or even exabytes.
 Velocity: Big data is generated at high speed and must be processed rapidly. This includes data
from various sources such as social media, sensors, and online transactions.
 Variety: Big data comes in various formats, including structured data (like databases), semi-
structured data (like XML files), and unstructured data (like text documents or social media
posts).
Big data is often analyzed to uncover patterns, trends, and associations, which can provide
valuable insights for businesses, governments, and other organizations to make better decisions.
NoSQL databases
NoSQL databases, or "Not Only SQL" databases, are a type of database system that provides
a mechanism for storage and retrieval of data that is modeled in ways other than the tabular
relations used in relational databases. NoSQL databases are often used for large-scale and real-
time web applications.
Some key features of NoSQL databases include:
1. Schema-less: NoSQL databases do not require a fixed schema, allowing for more flexible and
dynamic data models.
2. Scalability: NoSQL databases are designed to scale horizontally across a large number of
servers, making them suitable for handling large volumes of data and high traffic loads.
3. High availability: NoSQL databases are often designed with built-in replication and automatic
failover capabilities, ensuring that the system remains available even in the event of hardware or
network failures.
4. Types of NoSQL databases: There are several types of NoSQL databases, including document-
oriented, key-value, column-oriented, and graph databases. Each type is optimized for different
use cases and data models.
Popular examples of NoSQL databases include MongoDB (document-oriented), Apache
Cassandra (column-oriented), Redis (key-value), and Neo4j (graph).
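The sketch below is a toy, in-memory document/key-value store written for illustration only; it is not the API of MongoDB, Redis, or any real NoSQL product. It shows the schema-less idea: documents stored under keys may have different fields.

```python
# Sketch: a toy in-memory, schema-less document store (not the API of
# MongoDB, Redis, or any real product) illustrating the key-value /
# document model.

class TinyDocumentStore:
    def __init__(self):
        self.collections = {}   # collection name -> {key: document}

    def put(self, collection, key, document):
        # Schema-less: each document is just a dict; fields may differ.
        self.collections.setdefault(collection, {})[key] = document

    def get(self, collection, key):
        return self.collections.get(collection, {}).get(key)

    def find(self, collection, **criteria):
        # Very simple query: exact match on the given fields.
        return [doc for doc in self.collections.get(collection, {}).values()
                if all(doc.get(k) == v for k, v in criteria.items())]

store = TinyDocumentStore()
store.put("users", "u1", {"name": "Asha", "roles": ["admin"]})
store.put("users", "u2", {"name": "Binod", "city": "Kathmandu"})  # different fields
print(store.find("users", name="Binod"))
```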
