LECTURE -1
DATABASE MANAGEMENT SYSTEMS
14 AUGUST 2020
       TEXT BOOK : DATABASE SYSTEM CONCEPTS,SILBERCSHATZ,KORTH, SUDERSHAN. MC GRAW HILL,5/E
INTRODUCTION
❑ Good decisions require good information derived from
  raw facts
❑ Data managed most efficiently when stored in a database
❑ Databases evolved from computer file systems
❑ Understanding file system characteristics is important
                                                           3
DATA VS. INFORMATION
❑ Data are raw facts
❑ Information is the result of processing raw data to reveal
  meaning
❑ Information requires context to reveal meaning
❑ Raw data must be formatted for storage, processing, and
  presentation
❑ Data are the foundation of information, which is the
  bedrock of knowledge
                                                           4
DATA VS. INFORMATION (CONTINUED)
❑ Data: building blocks of information
❑ Information produced by processing data
❑ Information used to reveal meaning in data
❑ Accurate, relevant, timely information is the key to good
  decision making
❑ Good decision making is the key to organizational
  survival
                                                          5
DATABASE MANAGEMENT SYSTEM
❑ Database: shared, integrated computer structure that stores a
  collection of data
    ❖ End-user data: raw facts of interest to the end user
    ❖ Metadata: data about data
❑ Metadata provides description of data characteristics and
  relationships in data
    ❖ Complements and expands value of data
❑ Database management system (DBMS): collection of programs
    ❖ Manage structure and control access to data
                                                                  6
ROLE OF DBMS
❑ DBMS is the intermediary between the user and the
  database
❑ Database structure stored as file collection
❑ Access database through the DBMS
❑ DBMS enables data to be shared
❑ DBMS integrates many users’ views of the data
                                                      7
 DATABASE MANAGEMENT SYSTEM (DBMS)
❑ DBMS contains information about a particular enterprise
   ▪ Collection of interrelated data
   ▪ Set of programs to access the data
   ▪ An environment that is both convenient and efficient to use
❑ Database Applications:
   ▪ Banking: all transactions
   ▪ Airlines: reservations, schedules
   ▪ Universities: registration, grades
   ▪ Sales: customers, products, purchases
   ▪ Online retailers: order tracking, customized recommendations
   ▪ Manufacturing: production, inventory, orders, supply chain
   ▪ Human resources: employee records, salaries, tax deductions
❑ Databases touch all aspects of our lives
PURPOSE OF DATABASE SYSTEMS
❑ In the early days, database applications were built directly on
  top of file systems
❑ Drawbacks of using file systems to store data:
  ❖ Data redundancy and inconsistency
     ▪ Multiple file formats, duplication of information in different files
  ❖ Difficulty in accessing data
     ▪ Need to write a new program to carry out each new task
  ❖ Data isolation — multiple files and formats
  ❖ Integrity problems
     ▪ Integrity constraints (e.g. account balance > 0) become “buried”
       in program code rather than being stated explicitly
     ▪ Hard to add new constraints or change existing ones
PURPOSE OF DATABASE SYSTEMS (CONT.)
❑ Drawbacks of using file systems (cont.)
 ❖ Atomicity of updates
    ▪ Failures may leave database in an inconsistent state with partial
      updates carried out
    ▪ Example: Transfer of funds from one account to another should either
      complete or not happen at all
 ❖ Concurrent access by multiple users
    ▪ Concurrent accessed needed for performance
    ▪ Uncontrolled concurrent accesses can lead to inconsistencies
       ▪ Example: Two people reading a balance and updating it at the same
         time
 ❖ Security problems
    ▪ Hard to provide user access to some, but not all, data
❑ Database systems offer solutions to all the above problems
LEVELS OF ABSTRACTION
❑ Physical level: describes how a record (e.g., customer) is stored.
❑ Logical level: describes data stored in database, and the
  relationships among the data.
        type customer = record
        customer_id : interger;
   customer_name : string;
   customer_street : string;
   customer_city : string;
     end;
❑ View level: application programs hide details of data types. Views
  can also hide information (such as an employee’s salary) for
  security purposes.
VIEW OF DATA
❑ An architecture for a database system
 INSTANCES AND SCHEMAS
❑ Similar to types and variables in programming languages
❑ Schema – the logical structure of the database
     ▪ Example: The database consists of information about a set of
       customers and accounts and the relationship between them)
     ▪ Analogous to type information of a variable in a program
     ▪ Physical schema: database design at the physical level
     ▪ Logical schema: database design at the logical level
❑ Instance – the actual content of the database at a particular point in
  time
    ▪ Analogous to the value of a variable
❑ Physical Data Independence – the ability to modify the physical
  schema without changing the logical schema
    ▪ Applications depend on the logical schema
    ▪ In general, the interfaces between the various levels and components
      should be well defined so that changes in some parts do not seriously
      influence others.
DATA MODELS
❑ A collection of tools for describing
     ▪ Data
     ▪ Data relationships
     ▪ Data semantics
     ▪ Data constraints
❑ Relational model
❑ Entity-Relationship data model (mainly for database design)
❑ Object-based data models (Object-oriented and
  Object-relational)
❑ Semistructured data model (XML)
❑ Other older models:
     ▪ Network model
     ▪ Hierarchical model
DATA MANIPULATION LANGUAGE (DML)
❑ Language for accessing and manipulating the data organized by
  the appropriate data model
    ▪ DML also known as query language
❑ Two classes of languages
   ▪ Procedural – user specifies what data is required and how to get
     those data
   ▪ Declarative (nonprocedural) – user specifies what data is
     required without specifying how to get those data
❑ SQL is the most widely used query language
DATA DEFINITION LANGUAGE (DDL)
❑ Specification notation for defining the database schema
 Example:    create table
                  account (
                   account-number char(10),
                   balance        integer
                            )
❑ DDL compiler generates a set of tables stored in a data dictionary
❑ Data dictionary contains metadata (i.e., data about data)
  ❖ Database schema
  ❖ Data storage and definition language
     ▪ Specifies the storage structure and access methods used
  ❖ Integrity constraints
     ▪ Domain constraints
     ▪ Referential integrity (references constraint in SQL)
     ▪ Assertions
  ❖ Authorization
RELATIONAL MODEL
                                                  Attributes
Example of tabular data in the relational model
A SAMPLE RELATIONAL DATABASE
STRUCTURED QUERY LANGUAGE (SQL)
❑ SQL: widely used non-procedural language
 ❖ Example: Find the name of the customer with customer-id 192-83-7465
   select customer.customer_name
   from customer
   where customer.customer_id = ‘192-83-7465’
 ❖ Example: Find the balances of all accounts held by the customer with
   customer-id 192-83-7465
   select account.balance
   from depositor, account
   where depositor.customer_id = ‘192-83-7465’ and
        depositor.account_number = account.account_number
❑ Application programs generally access databases through one of
   • Language extensions to allow embedded SQL
   • Application program interface (e.g., ODBC/JDBC) which allow SQL
     queries to be sent to a database
DATABASE DESIGN
❑ The process of designing the general structure of the database:
❑ Logical Design – Deciding on the database schema. Database design
  requires that we find a “good” collection of relation schemas.
   ▪ Business decision – What attributes should we record in the database?
   ▪ Computer Science decision – What relation schemas should we have
     and how should the attributes be distributed among the various relation
     schemas?
❑ Physical Design – Deciding on the physical layout of the database
THE ENTITY-RELATIONSHIP MODEL
❑ Models an enterprise as a collection of entities and relationships
  ▪ Entity: a “thing” or “object” in the enterprise that is distinguishable
    from other objects
     ▪ Described by a set of attributes
  ▪ Relationship: an association among several entities
❑ Represented diagrammatically by an entity-relationship diagram:
STORAGE MANAGEMENT
❑ Storage manager is a program module that provides the
  interface between the low-level data stored in the database and
  the application programs and queries submitted to the system.
❑ The storage manager is responsible to the following tasks:
   ▪ Interaction with the file manager
   ▪ Efficient storing, retrieving and updating of data
❑ Issues:
    ▪ Storage access
    ▪ File organization
    ▪ Indexing and hashing
            QUERY PROCESSING
1.   Parsing and translation
2.   Optimization
3.   Evaluation
      QUERY PROCESSING (CONT.)
❑ Alternative ways of evaluating a given query
   ▪ Equivalent expressions
   ▪ Different algorithms for each operation
❑ Cost difference between a good and a bad way of evaluating a query
  can be enormous
❑ Need to estimate the cost of operations
  ▪ Depends critically on statistical information about relations which the
    database must maintain
  ▪ Need to estimate statistics for intermediate results to compute cost of
    complex expressions
TRANSACTION MANAGEMENT
❑ A transaction is a collection of operations that performs a single
  logical function in a database application
❑ Transaction-management component ensures that the database
  remains in a consistent (correct) state despite system failures (e.g.,
  power failures and operating system crashes) and transaction failures.
❑ Concurrency-control manager controls the interaction among the
  concurrent transactions, to ensure the consistency of the database.
DATABASE ARCHITECTURE
The architecture of a database systems is greatly influenced by
 the underlying computer system on which the database is running:
❑   Centralized
❑   Client-server
❑   Parallel (multi-processor)
❑   Distributed
2-TIER AND 3-TIER ARCHITECTURE
 DATABASE USERS
Users are differentiated by the way they expect to interact with
the system
❑ Application programmers – interact with system through DML calls
❑ Sophisticated users – form requests in a database query language
❑ Specialized users – write specialized database applications that do not
  fit into the traditional data processing framework
❑ Naïve users – invoke one of the permanent application programs that
   have been written previously
  ▪ Examples, people accessing database over the web, bank tellers,
   clerical staff
DATABASE ADMINISTRATOR
❑ Coordinates all the activities of the database system; the
  database administrator has a good understanding of the
  enterprise’s information resources and needs.
❑   Database administrator's duties include:
      ❖ Schema definition
      ❖ Storage structure and access method definition
      ❖ Schema and physical organization modification
      ❖ Granting user authority to access the database
      ❖ Specifying integrity constraints
      ❖ Acting as liaison with users
      ❖ Monitoring performance and responding to changes in
        requirements
OVERALL SYSTEM STRUCTURE
                         1
                   R E
                TU
              C
            LE
      O F
  D
EN