DBMS
Database Management System (DBMS)
DATA:
Facts and statistics stored or free flowing over a network.
Generally it's raw and unprocessed.
INFORMATION:
Data becomes information when it is processed, turning it into
something meaningful.
Database:
It is collection of related data, organized in a way that data can
be easily accessed, managed and updated
Database Management System
DBMS= Database + Management System
DBMS is a collection of inter-related data and set of
programs to store & access those data in an easy and
effective manner.
Popular DBMS
 ❑ MySql
 ❑ Oracle
 ❑ SQL Server
 ❑ IBM DB2
 ❑ Amazon SimpleDB (cloud based) etc.
APPLICATIONS OF DBMS
• Banking
• Airlines
• Universities
• Credit card transactions
• Telecommunication
• Finance
• Sales
• Manufacturing
• Human Resource
• etc
File System vs Database management System
• File System :
  File system is basically a way of arranging the files in a storage medium
  like hard disk. File system organizes the files and helps in retrieval of files
  when they are required. File systems consists of different files which are
  grouped into directories. The directories further contain other folders and
  files. File system performs basic operations like management, file naming,
  giving access rules etc.
• DBMS(Database Management System) :
  Database Management System is basically a software that manages the
  collection of related data. It is used for storing data and retrieving the
  data effectively when it is needed. It also provides proper security
  measures for protecting the data from unauthorized access. In Database
  Management System the data can be fetched by SQL queries and
  relational algebra. It also provides mechanisms for data recovery and data
  backup.
   Disadvantage of using file systems
• Data redundancy and inconsistency
   • Multiple file formats, duplication of information in different files
• Difficulty in accessing data
   • Need to write a new program to carry out each new task
• Data isolation
   • Multiple files and formats
• Integrity problems
   • Integrity constraints (e.g., account balance > 0) become “buried” in program code
     rather than being stated explicitly
   • Hard to add new constraints or change existing ones
          Disadvantage of using file systems(Cont.)
• Atomicity of updates
   • Failures may leave database in an inconsistent state with partial updates
     carried out
   • Example: Transfer of funds from one account to another should either
     complete or not happen at all
• Concurrent access by multiple users
   • Concurrent access needed for performance
   • Uncontrolled concurrent accesses can lead to inconsistencies
      • Example: Two people reading a balance (say 100) and updating it by withdrawing
        money (say 50 each) at the same time
• Security problems
   • Hard to provide user access to some, but not all, data
Database systems offer solutions to all the above problems
Characteristics of DBMS
• Data stored into tables
• Reduced Redundancy
• Data Consistency
• Support multiple users and concurrent access
• Query language
• Security
Components of DBMS
• Hardware
• Software
• Data
• Procedures
• DB Access Language
1-Tier Architecture
• In this architecture, the database is directly available to the user. It
  means the user can directly sit on the DBMS and uses it.
• Any changes done here will directly be done on the database itself. It
  doesn't provide a handy tool for end users. It is used for
  development of the local application, where programmers can
  directly communicate with the database for the quick response.
2-Tier Architecture
• The 2-Tier architecture is same as basic client-server. In the two-tier
  architecture, applications on the client end can directly communicate
  with the database at the server side. For this interaction, API's like:
  ODBC, JDBC are used.
• The user interfaces and application programs are run on the client-
  side. The server side is responsible to provide the functionalities like:
  query processing and transaction management.
• To communicate with the DBMS, client-side application establishes a
  connection with the server side.
3-Tier Architecture
• The 3-Tier architecture contains another layer between the
  client and server. In this architecture, client can't directly
  communicate with the server.
• The application on the client-end interacts with an application
  server which further communicates with the database system.
• End user has no idea about the existence of the database
  beyond the application server. The database also has no idea
  about any other user beyond the application.
• The 3-Tier architecture is used in case of large web application.
2-tier Architecture vs 3-tier Architecture
Data Abstraction in DBMS
• Database systems are made-up of complex data structures. To ease
  the user interaction with database, the developers hide internal
  irrelevant details from users. This process of hiding irrelevant details
  from user is called data abstraction.
We have three levels of abstraction:
• Physical level: This is the lowest level of data abstraction. It describes
  how data is actually stored in database. You can get the complex data
  structure details at this level. (e.g., instructor)
• Logical level/conceptual level: This is the middle level of 3-level data
  abstraction architecture. It describes what data is stored in database.
• View level: Highest level of data abstraction. This level describes the
  user interaction with database system. application programs hide
  details of data types. Views can also hide information (such as an
  employee’s salary) for security purposes.
An architecture for a database system
  Database Schema
• A database schema is the skeleton structure that represents the
  logical view of the entire database. It defines how the data is
  organized and how the relations among them are associated. It
  formulates all the constraints that are to be applied on the data.
• A database schema defines its entities and the relationship among
  them. It contains a descriptive detail of the database, which can be
  depicted by means of schema diagrams. It’s the database designers
  who design the schema to help programmers understand the
  database and make it useful.
A database schema can be divided broadly into two categories −
• Physical Database Schema − This schema pertains to the actual
  storage of data and its form of storage like files, indices, etc. It
  defines how the data will be stored in a secondary storage.
• Logical Database Schema − This schema defines all the logical
  constraints that need to be applied on the data stored. It defines
  tables, views, and integrity constraints.
 Database Instance
• It is important that we distinguish these two terms individually.
  Database schema is the skeleton of database. It is designed
  when the database doesn't exist at all. Once the database is
  operational, it is very difficult to make any changes to it.
• A database schema does not contain any data or information.
• A database instance is a state of operational database with data
  at any given time. It contains a snapshot of the database.
• Database instances tend to change with time.
• A DBMS ensures that its every instance (state) is in a valid state,
  by diligently following all the validations, constraints, and
  conditions that the database designers have imposed.
Data Independence
• Data Independence is defined as a property of DBMS that
  helps you to change the Database schema at one level of a
  database system without requiring to change the schema
  at the next higher level.
• Data independence helps you to keep data separated from
  all programs that make use of it.
• There are two ways with which we can have data
  independence.
         • Physical Data Independence
         • Logical Data Independence
Data Independence
Physical Data Independence
• Examples of changes under Physical Data Independence
• Due to Physical independence, any of the below change will not
  affect the conceptual layer.
• Using a new storage device like Hard Drive or Magnetic Tapes
• Modifying the file organization technique in the Database
• Switching to different data structures.
• Changing the access method.
• Modifying indexes.
• Changes to compression techniques or hashing algorithms.
• Change of Location of Database from say C drive to D Drive
      Logical Data Independence
• Logical Data Independence is the ability to change the conceptual
  scheme without changing
        1.   External views
        2.   External API or programs
• Examples of changes under Logical Data Independence
    • Due to Logical independence, any of the below change will not
      affect the external layer.
    • Add/Modify/Delete a new attribute, entity or relationship is
      possible without a rewrite of existing application programs
    • Merging two records into one
    • Breaking an existing record into two or more records
Data Base Users
• Naïve Users: Using some application programs user
  is accessing the data , so no direct communication.
• Application Programmers: They are using the DB
  while writing the applications.
• Sophisticated Users: They directly interact with DB
  with permission. So they should know the language.
• Specialized Users: They interact with the DB only
  for a particular application at a time.
Data Base Administrator (DBA)
• DBA performs following tasks:
1. Schema Definition
2. Storage structure and access method definition
3. Schema and physical organization modification
4. Granting of authorization of data access
5. Routine maintenance
6. Free storage management
7. System performance
    Data Models
• A collection of tools for describing
     •   Data
     •   Data relationships
     •   Data semantics
     •   Data constraints
•   Data Model Basic Building Blocks
     • Entity: Unique and distinct object used to collect and store data.
     • Attribute: Characteristic of an entity
     • Relationship: Describes an association among entities
• Types of Data Models:
    • Relational model
    • Entity-Relationship data model (mainly for database design)
    • Object-based data models (Object-oriented and Object-
    relational)
    • Semistructured data model (XML)
    • Other older models:
           • Network model
           • Hierarchical model
Relational Model
    • All the data is stored in various tables.
    • Example of tabular data in the relational model
                                                        Columns
                                                                  Rows
Entity-Relationship Model
Entity-Relationship (ER) Model is based on the notion of real-world
entities and relationships among them. While formulating real-world
scenario into the database model, the ER Model creates entity set,
relationship set, general attributes and constraints.
ER Model is best used for the conceptual design of a database.
ER Model is based on −
•Entities and their attributes.
•Relationships among entities.
• Object-based Data Model: An extension of the ER model with
  notions of functions, encapsulation, and object identity, as well.
  This model supports a rich type system that includes structured
  and collection types. Thus, in 1980s, various database systems
  following the object-oriented approach were developed. Here, the
  objects are nothing but the data carrying its properties.
• Semistructured Data Model: This type of data model is different
  from the other three data models (explained above). The
  semistructured data model allows the data specifications at places
  where the individual data items of the same type may have
  different attributes sets. The Extensible Markup Language, also
  known as XML, is widely used for representing the semistructured
  data.
Hierarchical Model
• This database model organises data into a tree-like-structure, with a
  single root, to which all the other data is linked. In this model, a child
  node will only have a single parent node.
• In hierarchical model, data is organised with one one-to-many
  relationship between two different types of data, for example, one
  department can have many courses, many professors and of-course
  many students.
Network Model
• This is an extension of the Hierarchical model. In this model data is
  organised more like a graph, and are allowed to have more than
  one parent node.
• In this database model data is more related as more relationships
  are established in this database model. Also, as the data is more
  related, hence accessing the data is also easier and fast. This
  database model was used to map many-to-many data
  relationships.
 Database Language
• Structured Query Language (SQL) is the database
  language by the use of which we can perform certain
  operations on the existing database and also we can use
  this language to create a database.
• SQL uses certain commands like Create, Drop, and Insert
  etc. to carry out the required tasks.
• These SQL commands are mainly categorized into three
  categories as:
      DDL – Data Definition Language
      DML – Data Manipulation Language
      DCL – Data Control Language
Data Definition Language (DDL)
• Specification notation for defining the database schema
   Example: create table instructor (
                        ID           char(5),
                        name          varchar(20),
                        dept_name varchar(20),
                        salary        numeric(8,2))
• DDL compiler generates a set of table templates stored in a
  data dictionary
• Data dictionary contains metadata (i.e., data about data)
   • Database schema
   • Integrity constraints
       • Primary key (roll uniquely identifies student)
   • Authorization
       • Who can access what
Data Definition Language (DDL)
• Examples of DDL commands:
   • CREATE – is used to create the database or its objects
     (like table, index, function, views, store procedure and
     triggers).
   • DROP – is used to delete objects from the database.
   • ALTER-is used to alter the structure of the database.
   • TRUNCATE–is used to remove all records from a table,
     including all spaces allocated for the records are
     removed.
   • COMMENT –is used to add comments to the data
     dictionary.
   • RENAME –is used to rename an object existing in the
     database.
  DML(Data Manipulation Language)
• DML(Data Manipulation Language) : The SQL commands that deals
  with the manipulation of data present in the database belong to DML
  or Data Manipulation Language and this includes most of the SQL
  statements.
• Examples of DML:
    • INSERT – is used to insert data into a table.
    • UPDATE – is used to update existing data within a table.
    • DELETE – is used to delete records from a database table.
    • SELECT – is used to retrieve data from the a database.
    • MERGE – is used to make changes in one table based on values
      matched from anther. It can be used to combine insert, update,
      and delete operations into one statement.
DCL(Data Control Language)
• DCL (Data Control Language): DCL includes commands
  such as GRANT and REVOKE which mainly deals with the
  rights, permissions and other controls of the database
  system.
• Examples of DCL commands:
    • GRANT-gives user’s access privileges to database.
    • REVOKE-withdraw user’s access privileges given by
      using the GRANT command.
DBMS INTERFACES
• It is a user interface that allows the users to input the queries to the
  DB without using the query language itself.
• Casual end users use high level language to specify their request
  whereas programmers use the DML.
• For Naïve and parametric users, there usually are user friendly
  interfaces for interacting with DB.
• User friendly interfaces provided by DBMS are:
➢Menu based interfaces for web clients or browsing
➢Form based interfaces
➢GUI
➢Natural Language interfaces
➢Speech input and output
➢Interfaces for DBA
    OVERALL DBMS STRUCTURE
• In this topic, we will cover the Structure of Database
  Management System (DBMS). DBMS is responsible to
  store huge amounts of data and is capable of handling
  multiple requests from users simultaneously, it should
  be arranged properly.
• A database is partitioned in modules that deal with
  each of the responsibilities of the overall system.
                             OVERALL DBMS STRUCTURE
Overall Structure of DBMS.
  OVERALL DBMS STRUCTURE
Components of DBMS are broadly classified as follows:
1. Query Processor :
(a) DML Compiler
(b) Embedded DML pre-compiler
(c) DDL Interpreter
(d) Query Evaluation Engine
  OVERALL DBMS STRUCTURE
2. Storage Manager :
(a) Authorization and Integrity Manager
(b) Transaction Manager
(c) File Manager
(d) Buffer Manager
3. Data Structure :
(a) Data Files
(b) Data Dictionary
(c) Indices
(d) Statistical Data
          Query Processor Components
• DML compiler: It translates DML statements in a query language
  into low level instructions that query evaluation engine
  understands.
• Embedded DML Pre-compiler: It converts DML statements
  embedded in an application program to normal procedure calls in
  the host language. The Pre-compiler must interact with the DML
  compiler to generate the appropriate code.
• DDL Interpreter: It interprets the DDL statements and records
  them in a set of tables containing meta data or data dictionary.
• Query Evaluation Engine: It executes low-level instructions
  generated by the DML compiler.
          Storage Manager Components
They provide the interface between the low-level data stored in the database and
application programs and queries submitted to the system.
   • Authorization and Integrity Manager: It tests for the satisfaction of integrity
constraints checks the authority of users to access data.
    • Transaction Manager : It ensures that the database remains in a consistent
state despite the system failures and that concurrent transaction execution
proceeds without conflicting.
    • File Manager : It manages the allocation of space on disk storage and the
data structures used to represent information stored on disk.
    • Buffer Manager : It is responsible for fetching data from disk storage into
main memory and deciding what data to cache in memory.
                       Data Structures
Following data structures are required as a part of the physical system
implementation.
• Data Files: It stores the database.
• Data Dictionary: It stores meta data (data about data) about the structure of the
database.
• Indices: Provide fast access to data items that hold particular values.
• Statistical Data: It stores statistical information about the data in the database.
This information is used by query processor to select efficient ways to execute
query.
Database Users and Administrators
              Database
Database System Internals