Unit 1: The Database Environment: Topic 1: Basic Concepts and Terminologies Data vs. Information - Data
Unit 1: The Database Environment: Topic 1: Basic Concepts and Terminologies Data vs. Information - Data
   -   information system
           o facilitates the transformation of data into information, and it allows for the
              management of both data and information
           o complete information system is composed of people, hardware, software, the
              database(s), application programs, and procedures
           o One key characteristics of current information systems is the strategic value of
              information in the age of global business and should always be aligned with the
              strategic business goals
           o Current information systems should always be integrated with the company’s
              enterprise-wide information systems architecture
   -   System analysis
           o the process that establishes the need for and the extent of an information
              system
   -   Systems development
           o The process of creating an information system
   -   Performance of Information system depends on three factors:
           o Database design and implementation
           o Application design and implementation
           o Administrative procedures
   -   Database development
           o describes the process of database design and implementation
           o The primary objective in database design is to create complete, normalized,
              non-redundant (to the extent possible), and fully integrated conceptual, logical,
              and physical database models
The Systems Development Life Cycle (SDLC)
   -   Systems Development Life Cycle (SDLC)
           o Traces the history (life cycle) of an information system. Perhaps more important
             to the system designer
           o provides the big picture within the database design and application development
             can be mapped out and evaluated
           o the traditional SDLC is divided into five phases:
                  planning
                  analysis
                  detailed systems design
                      implementation
                      maintenance
Phase 1. PLANNING
   -   yields a general overview of the company and its objectives
   -   An initial assessment of the information flow-and-extent requirements must be made
       during this discovery portion of the SDLC
Phase 2: ANALYSIS
   -   Problems defined during the planning phase are examined in greater detail during the
       analysis phase
   -   A macro analysis must be made of both individual needs and organizational needs
   -   The result of analysis should be a better understanding of the system’s functional areas,
       actual and potential problems, and opportunities
   -   analysis phase also includes the creation of a logical systems design
          o The logical design must specify the appropriate conceptual data model, inputs,
            processes, and expected output requirements
Phase 3: DETAILED SYSTEMS DESIGN
   -   the designer completes the design of the system’s processes
   -   The design includes all the necessary technical specifications for the screens, menus,
       reports, and other devices that might be used to help make the system a more efficient
       information generator.
Phase 4: IMPLEMENTATION
   -   the hardware, DBMS software, and application programs are installed, and the database
       design is implemented
   -   During the initial stages of the implementation phase, the system enters into a cycle of
       coding, testing, and debugging until it is ready to be delivered
   -   actual database is created
Phase 5: MAINTENANCE
   -   Almost as soon as the system is operational, end users begin to request changes in it.
       Those changes generate system maintenance activities, which can be grouped into
       three types:
           o Corrective maintenance in response to systems errors
           o Adaptive maintenance due to changes in the business environment
           o Perfective maintenance to enhance the system
           o Computer-aided systems engineering (CASE) tools
                   such as System Architect or Visio Professional, helps make it possible to
                     produce better systems within a reasonable amount of time and at a
                     reasonable cost
Topic 2: Database Development Process
The Database Life Cycle (DBLC)
   -   The Database Life Cycle (DBLC) contains six phases:
          o database initial study
          o database design
          o implementation and loading,
          o testing and evaluation
          o operation
          o maintenance and evolution
Phase 1: THE DATABASE INITIAL STUDY
   -   The overall purpose of the database initial study is to:
          o Analyze the company situation
                   The company situation describes the general conditions in which a
                      company operates, its organizational structure, and its mission
          o Define problems and constraints
                   The designer has both formal and informal sources of information. If the
                      company has existed for any length of time, it already has some kind of
                      system in place (either manual or computer-based).
          o Define objectives
                   A proposed database system must be designed to help solve at least the
                      major problems identified during the problem discovery process
          o Define scope and boundaries
                   The system’s scope defines the extent of the design according to
                      operational requirements
                      The proposed system is also subject to limits known as boundaries, which
                       are external to the system.
   -   The output of the database design phase is a series of instructions detailing the creation of
       tables, attributes, domains, views, indexes, security constraints, and storage and performance
       guidelines. In this phase, you actually implement all these design specifications.
           a. Install the DBMS
                      Virtualization is a technique that creates logical representations of computing
                        resources that are independent of the underlying physical computing resources.
                      This technique is used in many areas of computing such as the creation of virtual
                        servers, virtual storage, and virtual private networks
           b. Create the Database(s)
                      In most modern relational DBMSs, a new database implementation requires the
                        creation of special storage-related constructs to house the end-user tables. The
                        constructs usually include the storage group (or file groups), the table spaces,
                        and the tables.
           c. Load or Convert the Data
                      After the database has been created, the data must be loaded into the database
                        tables. Typically, the data will have to be migrated from the prior version of the
                        system
   -   This phase occurs in conjunction with applications programming. Programmers use database
       tools to prototype the applications during the coding of the programs.
           o Test the database
                     In database testing you must you must test the following:
                           o Physical security allows only authorized personnel physical access
                               to specific areas
                       o Password security allows the assignment of access rights to
                         specific authorized users and is usually enforced at login time at
                         the operating system level
                     o Access rights can be established through the use of database
                         software.
                     o Audit trails are usually provided by the DBMS to check for access
                         violations. Although the audit trail is an after-the-fact device, its
                         mere existence can discourage unauthorized use.
                     o Data encryption can be used to render data useless to
                         unauthorized users who might have violated some of the
                         database security layers
                     o Diskless workstations allow end users to access the database
                         without being able to download the information from their
                         workstations
        o Fine-Tune the Database
               Different systems will place different performance requirements on the
                 database
               Many factors can impact the database’s performance on various tasks
               Environmental factors, such as the hardware and software environment
                 in which the database exists, can have a significant impact on database
                 performance
        o Evaluate the Database and Its Application Programs:
               Testing and evaluation of the individual components should culminate in
                 a variety of broader system tests to ensure that all of the components
                 interact properly to meet the needs of the users.
               To ensure that the data contained in the database are protected against
                 loss, backup and recovery plans are tested.
               Database backups can be performed at different levels:
                     o A full backup of the database, or dump of the entire database. In
                         this case, all database objects are backed up in their entirety
                     o A differential backup of the database, in which only the last
                     o modifications to the database (when compared with a previous
                         full backup copy) are copied. In this case, only the objects that
                         have been updated since the last full backup are backed up.
                     o A transaction log backup, which backs up only the transaction log
                         operations that are not reflected in a previous backup copy of the
                         database. In this case, only the transaction log is backed up; no
                         other database objects are backed up.
PHASE 6: MAINTENANCE AND EVOLUTION
   -   The database administrator must be prepared to perform routine maintenance activities
       within the database. Some of the required periodic maintenance activities include:
          o Preventive maintenance (backup).
          o Corrective maintenance (recovery).
          o Adaptive maintenance (enhancing performance, adding entities and attributes,
               and so on).
          o Assignment of access permissions and their maintenance for new and old users.
          o Generation of database access statistics to improve the efficiency and
               usefulness of system audits and to monitor system performance.
          o Periodic security audits based on the system-generated statistics.
          o Periodic (monthly, quarterly, or yearly) system-usage summaries for internal
               billing or budgeting purposes.
The Database Development Process
Software Development Life Cycle – Waterfall
Waterfall Model
   -   shows the process as a strict sequence of steps where the output of one step is the
       input to the next and all of one step has to be completed before moving onto the next
   -   We can use the waterfall process as a means of identifying the tasks that are required,
       together with the input and output for each activity. What is important is the scope of
       the activities, which can be summarized as follows:
           o Establishing requirements involves consultation with, and agreement among,
               stakeholders about what they want from a system, expressed as a statement of
               requirements.
           o Analysis starts by considering the statement of requirements and finishes by
               producing a system specification. The specification is a formal representation of
               what a system should do, expressed in terms that are independent of how it may
               be realized
           o Design begins with a system specification, produces design documents and
               provides a detailed description of how a system should be constructed
           o Implementation is the construction of a computer system according to a given
               design document and taking into account the environment in which the system
               will be operating (e.g., specific hardware or software available for the
               development). Implementation may be staged, usually with an initial system that
               can be validated and tested before a final system is released for use.
           o Testing compares the implemented system against the design documents and
               requirements specification and produces an acceptance report or, more usually,
               a list of errors and bugs that require a review of the analysis, design and
             implementation processes to correct (testing is usually the task that leads to the
             waterfall model iterating through the life cycle).
           o Maintenance involves dealing with changes in the requirements or the
             implementation environment, bug fixing or porting of the system to new
             environments (e.g., migrating a system from a standalone PC to a UNIX
             workstation or a networked environment). Since maintenance involves the
             analysis of the changes required, design of a solution, implementation and
             testing of that solution over the lifetime of a maintained software system, the
             waterfall life cycle will be repeatedly revisited
Database Life Cycle - Waterfall
       We can use the waterfall cycle as the basis for a model of database development that
incorporates three assumptions:
   1. We can separate the development of a database – that is, specification and creation of
      a schema to define data in a database – from the user processes that make use of the
      database.
   2. We can use the three-schema architecture as a basis for distinguishing the activities
      associated with a schema.
   3. We can represent the constraints to enforce the semantics of the data once within a
      database, rather than within every user process that uses the data
Requirements Gathering
   -   The first step is requirements gathering where the database designers have to interview
       the customers (database users) to understand the proposed system and obtain and
       document the data and functional requirements.
-   The result of this step is a document that includes the detailed requirements provided
    by the users.
-   Establishing requirements involves consultation with, and agreement among all the
    users as to what persistent data they want to store along with an agreement as to the
    meaning and interpretation of the data elements.
-   The data administrator plays a key role in this process as they overview the business,
    legal and ethical issues within the organization that impact on the data requirements
-   The data requirements document is used to confirm the understanding of requirements
    with users. To make sure that it is easily understood, it should not be overly formal or
    highly encoded. The document should give a concise summary of all users’
    requirements – not just a collection of individuals’ requirements – as the intention is to
    develop a single shared database.
-   The requirements should not describe how the data is to be processed, but rather what
    the data items are, what attributes they have, what constraints apply and the
    relationships that hold between the data items
Analysis
-   Data analysis begins with the statement of data requirements and then produces a
    conceptual data model.
-   The aim of analysis is to obtain a detailed description of the data that will suit user
    requirements so that both high and low level properties of data and their use are dealt
    with
-   The conceptual data model
        o provides a shared, formal representation of what is being communicated
            between clients and developers during database development it is focused on
            the data in a database, irrespective of the eventual use of that data in user
            processes or implementation of the data in specific computer environments.
        o concerned with the meaning and structure of data, but not with the details
            affecting how they are implemented.
        o a formal representation of what data a database should contain and the
            constraints the data must satisfy. This should be expressed in terms that are
            independent of how the model may be implemented.
-   As a result, analysis focuses on the questions, “What is required?” not “How is it
    achieved?”
Logical Design
-   Database design starts with a conceptual data model and produces a specification of a
    logical schema; this will determine the specific type of database system (network,
    relational, object-oriented) that is required.
    -   The output of this stage is a detailed relational specification, the logical schema, of all
        the tables and constraints needed to satisfy the description of the data in the
        conceptual data model. It is during this design activity that choices are made as to which
        tables are most appropriate for representing the data in a database.
    Implementation
    -   involves the construction of a database according to the specification of a logical
        schema
    -    This will include the specification of an appropriate storage schema, security
        enforcement, external schema and so on
    -   Implementation is heavily influenced by the choice of available DBMSs, database tools
        and operating environment
Realizing the Design
    -   After the logical design has been created, we need our database to be created according to the
        definitions we have produced. For an implementation with a relational DBMS, this will probably
        involve the use of SQL to create tables and constraints that satisfy the logical schema description
        and the choice of appropriate storage schema (if the DBMS permits that level of control).
    -   One way to achieve this is to write the appropriate SQL DDL statements into a file that can be
        executed by a DBMS so that there is an independent record, a text file, of the SQL statements
        defining the database. Another method is to work interactively using a database tool like SQL
        Server Management Studio or Microsoft Access.
    -   Whatever mechanism is used to implement the logical schema, the result is that a database,
        with tables and constraints, is defined but will contain no data for the user processes.
    -   After a database has been created, there are two ways of populating the tables – either from
        existing data or through the use of the user applications developed for the database
Note: These are general guidelines that will assist in developing a strong basis for the actual database
design (the logical model):
   -   Many persons are involved in the design, use, and maintenance of a large database with a few
       hundred users. Here we will consider people who may be called “Actors on the Scene”, whose
       jobs involve the day-to-day use of a large database.
   1. Database Administrators:
                   person responsible for the control of the centralized and shared database
                   Administrating the primary (database) and secondary (DBMS and related
                     software) is the responsibility of the database administrator (DBA).
                   The DBA is responsible for authorizing access to the database, for coordinating
                     and monitoring its use, and for acquiring software and hardware resources as
                     needed.
                   DBA function might be defined as either a staff or line position.
                         o DBA function in a staff position often creates a consulting environment
                             in which the DBA is able to devise the data administration strategy but
                             does not have the authority to enforce it or to resolve possible conflicts.
                         o DBA function in a line position has both the responsibility and the
                             authority to plan, define, implement, and enforce the policies,
                             standards, and procedures used in the data administration activity
   2. Database Designers:
                   Database Designers are responsible for identifying the data to be stored in the
                     database and for choosing appropriate structures to represent and store this
                     data.
                   Database designer typically interact with each potential group and users and
                     develop a view of the database that meets the data and processing
                     requirements of this groups.
   3. End Users (database users):
                   End users are the people whose jobs require access to the database for
                     querying, updating and generating reports; the database primarily exists for
                     their use.
                   There are several categories of end users:
                         o Casual end user:
                                  o Occasionally access the database, but they may need different
                                      information each time. They are typically middle-or high-level
                                      managers or other occasional browsers.
                         o Naive or Parametric end user:
                                  o Their main job function revolves around constantly querying
                                      and updating the database, using standard types of queries and
                                      updates that have been carefully programmed and tested. Bank
                                      tellers, Reservation Clerks for airlines, hotels, etc. are the
                                      example of Naive end users
                          o   Sophisticated end users:
                                 o Sophisticated end users include engineers, scientist, business
                                      analyst and others who thoroughly familiarize themselves with
                                      the facilities of the DBMS so as to implement their applications
                                      to meet their complex requirements.
                          o   Stand-alone users:
                                  o They maintain personal database by using ready-made program
                                     packages that provide easy-to-use menu or graphics-based
                                     interfaces
   4. Software Engineers:
                  System analysts determines the requirements of end users, especially naive and
                     parametric end users, and develop specifications for canned transactions that
                     meet these requirements. Application programmers implement these
                     specifications as programs; then they test, debug, document, and maintain
                     these canned transactions.
                  Such analyst and programmers are called Software Engineers.
   5. Data Administrator (DA)
                  The DA, also known as the information resource manager (IRM), usually reports
                     directly to top management
                  given a higher degree of responsibility and authority than the DBA
                  responsible for controlling the overall corporate data resources, both
                     computerized and manual
                  in charge of controlling not only the computerized data but also the data
                     outside the scope of the DBMS.
                  responsible for providing a global and comprehensive administrative strategy
                     for all of the organization’s data
                  responsible for the consolidation and consistency of both manual and
                     computerized data
                  The DA must also set data administration goals. Those goals are defined by
                     issues such as:
                         o Data “sharability” and time availability.
                         o Data consistency and integrity.
                         o Data security and privacy.
                         o Data quality standards.
                         o Extent and type of data use.
   -   DA’s job typically has a strong managerial orientation with companywide scope.
   -   DBA’s job tends to be more technically oriented and has a narrower DBMS-specific scope.
          o Both the DA and the DBA perform “people” functions common to all departments in an
               organization.
DBA or Database Administrator
            Is the focal point for data/user interaction
            DBA defines and enforces the procedures and standards to be used by
              programmers and end users during their work with the DBMS
            DBA also verifies that programmer and end-user access meet the required
              quality and security standards.
            Database users might be classified by the:
                  o Type of decision-making support required (operational, tactical, or
                       strategic).
                  o Degree of computer knowledge (novice, proficient, or expert).
                  o Frequency of access (casual, periodic, or frequent).
            DBA must be able to interact with all of those people, understand their different
              needs, answer questions at all levels of the expertise scale, and communicate
              effectively.
            DBA must perform two distinct roles.
                  o The DBA’s managerial role is focused on personnel management and on
                       interactions with the end-user community
                           o As a manager, the DBA must concentrate on the control and
                                planning dimensions of database administration.
                           o the DBA’s managerial role is responsible for:
                                     Coordinating, monitoring, and allocating database
                                       administration resources: people and data.
                                     Defining goals and formulating strategic plans for the
                                       database administration function.