Database Management Systems
AL 502
Content
⚫ UNIT I:DBMS Concepts and architecture Introduction,
⚫ Database approach v/s Traditional file accessing approach, Advantages of
  database systems, Data models, Schemas and instances,
⚫ Data independence, Data Base Language and interfaces, Overall Database
⚫ Structure, Functions of DBA and designer,
⚫   ER data model: Entitles and attributes, Entity types, Defining the E-R diagram,
⚫   Concept of Generalization, Aggregation and Specialization.
⚫ Transforming ER diagram into the tables.
⚫ Various other data models object oriented data
⚫ Model, Network data model, and Relational data model,
⚫   Comparison between the three types of models.
⚫ Storage structures: Secondary Storage Devices, Hashing & Indexing
⚫ structures: Single level & multilevel indices.
What is Database
⚫The database is a collection of inter-related data which
 is used to retrieve, insert and delete the data efficiently.
 It is also used to organize the data in the form of a
 table, schema, views, and reports, etc.
⚫For example: The college Database organizes the data
 about the admin, staff, students and faculty etc.
⚫Using the database, you can easily retrieve, insert, and
 delete the information.
Database Management System
⚫Database management system is a software which is
 used to manage the database. For example: MySQL,
 Oracle, etc are a very popular commercial database
 which is used in different applications.
⚫DBMS provides an interface to perform various
 operations like database creation, storing data in it,
 updating data, creating a table in the database and a lot
 more.
⚫It provides protection and security to the database. In
 the case of multiple users, it also maintains data
 consistency.
DBMS allows users the
following tasks:
⚫ Data Definition: It is used for creation, modification, and
  removal of definition that defines the organization of data in
  the database.
⚫ Data Updation: It is used for the insertion, modification, and
  deletion of the actual data in the database.
⚫ Data Retrieval: It is used to retrieve the data from the
  database which can be used by applications for various
  purposes.
⚫ User Administration: It is used for registering and monitoring
  users, maintain data integrity, enforcing data security, dealing
  with concurrency control, monitoring performance and
  recovering information corrupted by unexpected failure.
Characteristics of DBMS
⚫ It uses a digital repository established on a server to store and
  manage the information.
⚫ It can provide a clear and logical view of the process that
  manipulates data.
⚫ DBMS contains automatic backup and recovery procedures.
⚫ It contains ACID properties which maintain data in a healthy
  state in case of failure.
⚫ It can reduce the complex relationship between data.
⚫ It is used to support manipulation and processing of data.
⚫ It is used to provide security of data.
⚫ It can view the database from different viewpoints according to
  the requirements of the user.
Advantages of DBMS
⚫ Controls database redundancy: It can control data redundancy
  because it stores all the data in one single database file and that
  recorded data is placed in the database.
⚫ Data sharing: In DBMS, the authorized users of an organization can
  share the data among multiple users.
⚫ Easily Maintenance: It can be easily maintainable due to the
  centralized nature of the database system.
⚫ Reduce time: It reduces development time and maintenance need.
⚫ Backup: It provides backup and recovery subsystems which create
  automatic backup of data from hardware and software failures and
  restores the data if required.
⚫ multiple user interface: It provides different types of user interfaces
  like graphical user interfaces, application program interfaces
Disadvantages of DBMS
⚫ Cost of Hardware and Software: It requires a high speed
  of data processor and large memory size to run DBMS
  software.
⚫ Size: It occupies a large space of disks and large memory
  to run them efficiently.
⚫ Complexity: Database system creates additional
  complexity and requirements.
⚫ Higher impact of failure: Failure is highly impacted the
  database because in most of the organization, all the data
  stored in a single database and if the database is damaged
  due to electric failure or database corruption then the data
  may be lost forever.
DBMS vs. File System
Database
What is Data?
⚫ Data is a collection of a distinct small unit of
  information. It can be used in a variety of forms like
  text, numbers, media, bytes, etc. it can be stored in
  pieces of paper or electronic memory, etc.
⚫ Word 'Data' is originated from the word 'datum' that
  means 'single piece of information.' It is plural of the
  word datum.
⚫ In computing, Data is information that can be
  translated into a form for efficient movement and
  processing. Data is interchangeable.
DBMS Architecture
⚫The DBMS design depends upon its architecture. The
 basic client/server architecture is used to deal with a
 large number of PCs, web servers, database servers
 and other components that are connected with
 networks.
⚫The client/server architecture consists of many PCs
 and a workstation which are connected via the
 network.
⚫DBMS architecture depends upon how users are
 connected to the database to get their request done.
Types of DBMS Architecture
1-Tier Architecture
⚫In this architecture, the database is directly available to
 the user. It means the user can directly sit on the
 DBMS and uses it.
⚫Any changes done here will directly be done on the
 database itself. It doesn't provide a handy tool for end
 users.
⚫The 1-Tier architecture is used for development of the
 local application, where programmers can directly
 communicate with the database for the quick response.
2-Tier Architecture
⚫ The 2-Tier architecture is same as basic client-server. In the
  two-tier architecture, applications on the client end can
  directly communicate with the database at the server side.
  For this interaction, API's like: ODBC, JDBC are used.
⚫ The user interfaces and application programs are run on the
  client-side.
⚫ The server side is responsible to provide the functionalities
  like: query processing and transaction management.
⚫ To communicate with the DBMS, client-side application
  establishes a connection with the server side.
2-Tier Architecture
3-Tier Architecture
⚫ The 3-Tier architecture contains another layer between
  the client and server. In this architecture, client can't
  directly communicate with the server.
⚫ The application on the client-end interacts with an
  application server which further communicates with the
  database system.
⚫ End user has no idea about the existence of the database
  beyond the application server. The database also has no
  idea about any other user beyond the application.
⚫ The 3-Tier architecture is used in case of large web
  application.
3-Tier Architecture
Three schema Architecture
⚫The three schema architecture is also called
 ANSI/SPARC architecture or three-level architecture.
⚫This framework is used to describe the structure of a
 specific database system.
⚫The three schema architecture is also used to separate
 the user applications and physical database.
⚫The three schema architecture contains three-levels. It
 breaks the database down into three different
 categories.
Three schema Architecture
Three schema Architecture
1. Internal Level
⚫ The internal level has an internal schema which describes the physical
   storage structure of the database.
⚫ The internal schema is also known as a physical schema.
⚫ It uses the physical data model. It is used to define that how the data will
   be stored in a block.
⚫ The physical level is used to describe complex low-level data structures
   in detail.
2. Conceptual Level
⚫ The conceptual schema describes the design of a database at the
   conceptual level. Conceptual level is also known as logical level.
⚫ The conceptual schema describes the structure of the whole database.
⚫ The conceptual level describes what data are to be stored in the
   database and also describes what relationship exists among those data.
⚫ In the conceptual level, internal details such as an implementation of the
   data structure are hidden.
⚫ Programmers and database administrators work at this level.
Three schema Architecture
3. External Level
⚫ At the external level, a database contains several
  schemas that sometimes called as subschema. The
  subschema is used to describe the different view of the
  database.
⚫ An external schema is also known as view schema.
⚫ Each view schema describes the database part that a
  particular user group is interested and hides the
  remaining database from that user group.
⚫ The view schema describes the end user interaction
  with database systems.
Data model Schema and
Instance
⚫ The data which is stored in the database at a particular moment of
  time is called an instance of the database.
⚫ The overall design of a database is called schema.
⚫ A database schema is the skeleton structure of the database. It
  represents the logical view of the entire database.
⚫ A schema contains schema objects like table, foreign key, primary
  key, views, columns, data types, stored procedure, etc.
⚫ A database schema can be represented by using the visual diagram.
  That diagram shows the database objects and relationship with
  each other.
⚫ A database schema is designed by the database designers to help
  programmers whose software will interact with the database. The
  process of database creation is called data modeling.
Data Independence
Data independence can be explained using the three-schema architecture.
Data independence refers characteristic of being able to modify the
   schema at one level of the database system without altering the schema
   at the next higher level.
There are two types of data independence:
1. Logical Data Independence
⚫ Logical data independence refers characteristic of being able to change
   the conceptual schema without having to change the external schema.
⚫ Logical data independence is used to separate the external level from
   the conceptual view.
⚫ If we do any changes in the conceptual view of the data, then the user
   view of the data would not be affected.
⚫ Logical data independence occurs at the user interface level.
Data Independence
2. Physical Data Independence
⚫ Physical data independence can be defined as the
  capacity to change the internal schema without having
  to change the conceptual schema.
⚫ If we do any changes in the storage size of the
  database system server, then the Conceptual structure
  of the database will not be affected.
⚫ Physical data independence is used to separate
  conceptual levels from the internal levels.
⚫ Physical data independence occurs at the logical
  interface level.
Data Independence
Database Language
⚫A DBMS has appropriate languages and interfaces to
 express database queries and updates.
⚫Database languages can be used to read, store and
 update the data in the database.
Types of Database Language
1. Data Definition Language
DDL stands for Data Definition Language. It is used to define database structure or
  pattern.
⚫ It is used to create schema, tables, indexes, constraints, etc. in the database.
⚫ Using the DDL statements, you can create the skeleton of the database.
⚫ Data definition language is used to store the information of metadata like the number
  of tables and schemas, their names, indexes, columns in each table, constraints, etc.
Here are some tasks that come under DDL:
⚫ Create: It is used to create objects in the database.
⚫ Alter: It is used to alter the structure of the database.
⚫ Drop: It is used to delete objects from the database.
⚫ Truncate: It is used to remove all records from a table.
⚫ Rename: It is used to rename an object.
⚫ Comment: It is used to comment on the data dictionary.
These commands are used to update the database schema that's why they come under
  Data definition language.
2. Data Manipulation
Language
DML stands for Data Manipulation Language. It is used for accessing
  and manipulating data in a database. It handles user requests.
Here are some tasks that come under DML:
⚫ Select: It is used to retrieve data from a database.
⚫ Insert: It is used to insert data into a table.
⚫ Update: It is used to update existing data within a table.
⚫ Delete: It is used to delete all records from a table.
⚫ Merge: It performs UPSERT operation, i.e., insert or update
  operations.
⚫ Call: It is used to call a structured query language or a Java
  subprogram.
⚫ Explain Plan: It has the parameter of explaining data.
⚫ Lock Table: It controls concurrency.
3. Data Control Language
DCL stands for Data Control Language. It is used to retrieve the
  stored or saved data.
⚫ The DCL execution is transactional. It also has rollback parameters.
⚫ (But in Oracle database, the execution of data control language does
  not have the feature of rolling back.)
Here are some tasks that come under DCL:
⚫ Grant: It is used to give user access privileges to a database.
⚫ Revoke: It is used to take back permissions from the user.
There are the following operations which have the authorization of
  Revoke:
⚫ CONNECT, INSERT, USAGE, EXECUTE, DELETE, UPDATE
  and SELECT.
4. Transaction Control
Language
⚫TCL is used to run the changes made by the DML
 statement. TCL can be grouped into a logical
 transaction.
⚫Here are some tasks that come under TCL:
⚫Commit: It is used to save the transaction on the
 database.
⚫Rollback: It is used to restore the database to original
 since the last Commit.
Database Administrator (DBA)
⚫ A Database Administrator (DBA) is individual or person
  responsible for controlling, maintenance, coordinating, and
  operation of database management system. Managing, securing,
  and taking care of database system is prime responsibility.
⚫ They are responsible and in charge for authorizing access to
  database, coordinating, capacity, planning, installation, and
  monitoring uses and for acquiring and gathering software and
  hardware resources as and when needed. Their role also varies
  from configuration, database design, migration, security,
  troubleshooting, backup, and data recovery. Database
  administration is major and key function in any firm or
  organization that is relying on one or more databases. They are
  overall commander of Database system.
Types of Database
Administrator (DBA)
⚫ Administrative DBA –
  Their job is to maintain server and keep it functional. They are concerned with data backups, security,
  trouble shooting, replication, migration etc.
⚫ Data Warehouse DBA –
  Assigned earlier roles, but held accountable for merging data from various sources into data
  warehouse. They also design warehouse, with cleaning and scrubs data prior to loading.
⚫ Development DBA –
  They build and develop queries, stores procedure, etc. that meets firm or organization needs. They are
  par at programmer.
⚫ Application DBA –
  They particularly manages all requirements of application components that interact with database and
  accomplish activities such as application installation and coordinating, application upgrades, database
  cloning, data load process management, etc.
⚫ Architect –
  They are held responsible for designing schemas like building tables. They work to build structure
  that meets organisation needs. The design is further used by developers and development DBAs to
  design and implement real application.
⚫ OLAP DBA –
  They design and builds multi-dimensional cubes for determination support or OLAP systems.
Importance of Database
Administrator (DBA) :
⚫ Database Administrator manages and controls three levels of
  database like internal level, conceptual level, and external level
  of Database management system architecture and in discussion
  with comprehensive user community, gives definition of world
  view of database. It then provides external view of different
  users and applications.
⚫ Database Administrator ensures held responsible to maintain
  integrity and security of database restricting from unauthorized
  users. It grants permission to users of database and contains
  profile of each and every user in database.
⚫ Database Administrator also held accountable that database is
  protected and secured and that any chance of data loss keeps at
  minimum.
Role and Duties of Database
Administrator (DBA) :
⚫ Decides hardware –
    They decides economical hardware, based upon cost, performance and efficiency of
    hardware, and best suits organisation. It is hardware which is interface between end
    users and database.
⚫   Manages data integrity and security –
    Data integrity need to be checked and managed accurately as it protects and restricts
    data from unauthorized use. DBA eyes on relationship within data to maintain data
    integrity.
⚫   Database design –
    DBA is held responsible and accountable for logical, physical design, external model
    design, and integrity and security control.
⚫   Database implementation –
    DBA implements DBMS and checks database loading at time of its implementation.
⚫   Query processing performance –
    DBA enhances query processing by improving their speed, performance and accuracy.
⚫   Tuning Database Performance –
    If user is not able to get data speedily and accurately then it may loss organization
    business. So by tuning SQL commands DBA can enhance performance of database.
Role and Duties of Database
Administrator (DBA) :
⚫ Decides hardware –
    They decides economical hardware, based upon cost, performance and efficiency of
    hardware, and best suits organisation. It is hardware which is interface between end
    users and database.
⚫   Manages data integrity and security –
    Data integrity need to be checked and managed accurately as it protects and restricts
    data from unauthorized use. DBA eyes on relationship within data to maintain data
    integrity.
⚫   Database design –
    DBA is held responsible and accountable for logical, physical design, external model
    design, and integrity and security control.
⚫   Database implementation –
    DBA implements DBMS and checks database loading at time of its implementation.
⚫   Query processing performance –
    DBA enhances query processing by improving their speed, performance and accuracy.
⚫   Tuning Database Performance –
    If user is not able to get data speedily and accurately then it may loss organization
    business. So by tuning SQL commands DBA can enhance performance of database.
Data Model
⚫ Data models define how the logical structure of a database is modeled. Data
  Models are fundamental entities to introduce abstraction in a DBMS. Data
  models define how data is connected to each other and how they are processed
  and stored inside the system.
⚫ The very first data model could be flat data-models, where all the data used are
  to be kept in the same plane. Earlier data models were not so scientific, hence
  they were prone to introduce lots of duplication and update anomalies.
⚫ Relational Model
⚫ Entity-Relationship Model
⚫ Object-Based Data Model
⚫ Semi-Structured Data Model
⚫ Hierarchical Model
⚫ Network Model
Relational Model
⚫ Relational Model is the most widely used model. In this
  model, the data is maintained in the form of a two-
  dimensional table. All the information is stored in the form
  of row and columns. The basic structure of a relational
  model is tables. So, the tables are also called relations in
  the relational model. Example: In this example, we have an
  Employee table.
Features of Relational Model
⚫Tuples: Each row in the table is called tuple. A row
 contains all the information about any instance of the
 object. In the above example, each row has all the
 information about any specific individual like the first
 row has information about John.
⚫Attribute or field: Attributes are the property which
 defines the table or relation. The values of the attribute
 should be from the same domain. In the above
 example, we have different attributes of the employee
 like Salary, Mobile_no, etc.
Entity-Relationship Model
⚫ Entity-Relationship Model or simply ER Model is a high-level data
  model diagram. In this model, we represent the real-world problem in the
  pictorial form to make it easy for the stakeholders to understand. It is
  also very easy for the developers to understand the system by just
  looking at the ER diagram. We use the ER diagram as a visual tool to
  represent an ER Model. ER diagram has the following three components:
⚫ Entities: Entity is a real-world thing. It can be a person, place, or even a
  concept. Example: Teachers, Students, Course, Building, Department, etc
  are some of the entities of a School Management System.
⚫ Attributes: An entity contains a real-world property called attribute. This
  is the characteristics of that attribute. Example: The entity teacher has the
  property like teacher id, salary, age, etc.
⚫ Relationship: Relationship tells how two attributes are related. Example:
  Teacher works for a department.
 Entity-Relationship Model
In the above diagram, the entities are Teacher and Department. The
attributes of Teacher entity are Teacher_Name, Teacher_id, Age,
Salary, Mobile_Number. The attributes of entity Department entity
are Dept_id, Dept_name. The two entities are connected using the
relationship. Here, each teacher works for a department.
Features of ER Model
⚫Graphical Representation for Better Understanding:
 It is very easy and simple to understand so it can be
 used by the developers to communicate with the
 stakeholders.
⚫ER Diagram: ER diagram is used as a visual tool for
 representing the model.
⚫Database Design: This model helps the database
 designers to build the database and is widely used in
 database design.
Object-Oriented Data Model
⚫ The real-world problems are more closely represented
 through the object-oriented data model. In this model,
 both the data and relationship are present in a single
 structure known as an object. We can store audio,
 video, images, etc in the database which was not
 possible in the relational model(although you can store
 audio and video in relational database, it is adviced not
 to store in the relational database). In this model, two
 are more objects are connected through links. We use
 this link to relate one object to other objects. This can
 be understood by the example given below.
Object Oriented Data Model
Data models…..
⚫ Object-Relational Model
  As the name suggests it is a combination of both the relational
  model and the object-oriented model. This model was built to
  fill the gap between object-oriented model and the relational
  model. We can have many advanced features like we can make
  complex data types according to our requirements using the
  existing data types. The problem with this model is that this
  can get complex and difficult to handle. So, proper
  understanding of this model is required.
⚫ Flat Data Model
  It is a simple model in which the database is represented as a
  table consisting of rows and columns. To access any data, the
  computer has to read the entire table. This makes the modes
  slow and inefficient.
Hierarchical Model
⚫Hierarchical Model was the first DBMS model. This
 model organises the data in the hierarchical tree
 structure. The hierarchy starts from the root which has
 root data and then it expands in the form of a tree
 adding child node to the parent node. This model
 easily represents some of the real-world relationships
 like food recipes, sitemap of a website etc. Example:
 We can represent the relationship between the shoes
 present on a shopping website in the following way:
Hierarchical Model
Features of a Hierarchical Model
⚫ One-to-many relationship: The data here is organised in a tree-like
  structure where the one-to-many relationship is between the datatypes.
  Also, there can be only one path from parent to any node. Example: In
  the above example, if we want to go to the node sneakers we only have
  one path to reach there i.e through men's shoes node.
⚫ Parent-Child Relationship: Each child node has a parent node but a
  parent node can have more than one child node. Multiple parents are
  not allowed.
⚫ Deletion Problem: If a parent node is deleted then the child node is
  automatically deleted.
⚫ Pointers: Pointers are used to link the parent node with the child node
  and are used to navigate between the stored data. Example: In the
  above example the 'shoes' node points to the two other nodes 'women
  shoes' node and 'men's shoes' node.
Network Model
⚫This model is an extension of the hierarchical model. It
 was the most popular model before the relational
 model. This model is the same as the hierarchical
 model, the only difference is that a record can have
 more than one parent. It replaces the hierarchical tree
 with a graph. Example: In the example below we can
 see that node student has two parents i.e. CSE
 Department and Library. This was earlier not possible
 in the hierarchical model.
Network Model
Features of a Network Model
⚫ Ability to Merge more Relationships: In this model, as
  there are more relationships so data is more related. This
  model has the ability to manage one-to-one relationships
  as well as many-to-many relationships.
⚫ Many paths: As there are more relationships so there can
  be more than one path to the same record. This makes data
  access fast and simple.
⚫ Circular Linked List: The operations on the network
  model are done with the help of the circular linked list.
  The current position is maintained with the help of a
  program and this position navigates through the records
  according to the relationship.
Storage System in DBMS
⚫ A database system provides an ultimate view of the
  stored data. However, data in the form of bits, bytes
  get stored in different storage devices.
Types of Data Storage
⚫ For storing the data, there are different types of
  storage options available. These storage types differ
  from one another as per the speed and accessibility.
  There are the following types of storage devices used
  for storing the data:
⚫ Primary Storage
⚫ Secondary Storage
⚫ Tertiary Storage
 Hashing
⚫ Bullet Points:Hashing is a technique to map large
  amounts of data into fixed-size values.
⚫ It uses a hash function to generate a unique hash
  code for each data entry.
⚫ Provides fast and efficient data access in databases.
⚫ Visual: Image of a data record being mapped to a
  hash code via a hash function.
 Indexing in DBMS
⚫ Indexing is used to optimize the performance of a database by minimizing
   the number of disk accesses required when a query is processed.
⚫ The index is a type of data structure. It is used to locate and access the data in
   a database table quickly.
⚫ Index structure:
Indexes can be created using some database columns.
⚫ The first column of the database is the search key that contains a copy of the
  primary key or candidate key of the table. The values of the primary key are
  stored in sorted order so that the corresponding data can be accessed easily.
⚫ The second column of the database is the data reference. It contains a set of
  pointers holding the address of the disk block where the value of the
  particular key can be found.
Indexing Methods