Database and Information Resource Management
BSc in BS i MIS
Handout 2 : Database Design Life Cycle, Database Architecture and Database Types rch tecture ypes *The lecture slides are slides of Dr Prasad Wimalaratne with minor The Dr.Prasad modifications.
Overview
Database design must reflect the information system of which the database is a part y p Information systems undergo evaluation and revision within a framework known as the Systems Development Life Cycle (SDLC) Databases also undergo evaluation and revision within a framework known as the h f kk h Database Life Cycle (DBLC) T Two general design strategies exist ld i t t i i t
top-down vs. bottom-up design centralized vs. d t li d decentralized d i t li d design
2
Lessons from Business Automation
Era of finance and operations: 60s - 70s
Business accounting systems g y
Manufacturing software: 70s - 80s
Separate applications for inventory, ordering, p pp y forecasting, shop floor operations, logistics, etc.
Era of the business enterprise: 90s - 00s
Separate applications get rolled into enterprise resource planning system Sales force automation, customer service center automation center, campaign management, automated email response, etc. get rolled into customer relationship management. t
3
Successful Automation Requires an Interlocking of Several Entities
User
(Someone doing real work )
Infrastructure (Computer and Human) )
Management (Organization)
4
American Airlines
American Airlines settled a lawsuit with Budget RentA-Car, Marriott Corp. and Hilton Hotels after the $165 million CONFIRM car rental and hotel illi t l dh t l reservation system project collapsed into chaos.
Typical SW Project Scenarios
Project Outcomes P O
84% of all automation projects have significant or major problems
6
Typical SW Project Scenarios
Percent O P Over Budget B d
9% 10% 4% 16% <20% 21% - 50% 51% - 100% 101%-200% % % 201%-400% >400%
31% 30%
53% of all automation proj cts are more than 50% 5 a automat on projects ar mor 5 over budget 23% of all automation projects are more than 100% over b d t budget
7
Typical SW Project Scenarios
Percent of Time Under Estimated P fT d E d
11% 1% 14% <20% 21%-50% 18% 36% 51-100% 51 100% 101%-200% 201%-400% >400% 20%
49% of all automation projects take twice as long to complete as planned
8
Typical SW Project Scenarios
Percent Planned Functionality P Pl dF l
7% 5% 27% 39% <25% 25-49% 50-74% 50 74% 75-99% 100%
22%
54% of all automation projects deliver less than half of the promised functionality
9
10
Most Problems are Non-Technical
Poorly selected data Badly organized data Incorrect data models Software has limited capability (oversell) Systems managers underestimate time requirements Systems can be underutilized Systems can be (and have been) abandoned Personnel problems
11
Changing Data into Information
Data
Raw facts stored in databases Need additional processing to become useful
Information
Required by decision maker Data processed and presented in a meaningful form Transformation
12
Information System (recap from handout 1)
Database
Carefully designed and constructed repository of y g p y facts Part of an information system
Information System f
Provides data collection, storage, and retrieval Facilitates data transformation F ilit t d t t f ti Includes people, hardware, and software Software: Database(s), Application programs, and Database(s) programs Procedures
13
From Databases to Business Intelligence
14
From Databases to Business Intelligence Multi Tiered Multi-Tiered Architectures
Other Sources Operational DBs Metadata Monitor & Integrator OLAP Server Analysis Query Reports Data mining
Extract Transform Load Refresh
Data Warehouse
Serve
Data Marts Data Sources Data Storage OLAP Engine Front-End Tools
15
e.g New York City Police Department Command Center
Data visualization in action
16
From Databases to Business Intelligence Management Cockpit
17
From Databases to Business Intelligence e.g Performance Dashboard
18
From Databases to Business Intelligence e.g Performance Dashboard
19
Information System (Cont.)
System Analysis
Establishes need and extent of an information system
Systems Development
Process of creating information system
Database Development
Process of database design and implementation Creation of data models Implementation Creating storage structure Loading data into database Providing for data management
20
Organizational Context for Using Database Requirements Systems ?
Consolidation and integration of data across organization g Maintenance of complex data p y p g pp Simplicity of developing new applications Data independence
Protecting application programs from changes in g pp p g g the underlying logical organization and in the physical access paths and storage structures
E t External Schemas lS h
Allow the same data to be used for multiple applications with each application having its own view of the data
21
Information System (Cont.)
Information System includes all resources involved in the collection, management, use g and dissemination of the information resources of the organization We consider two systems life cycles:
Macro Life Cycle
I f Information S t ti System Life C l Lif Cycle
Micro Life Cycle
Database System Life Cycle y f y
22
Database Design Methodology
A structured approach that uses procedures, techniques, tools, and documentation aids to q support and facilitate the process of design.
Conceptual Database Design Logical Database Design Physical Database Design
23
Database Design Steps
Outputs? O
Conceptual Design p g Logical Design Physical Design
24
Database Design Process
Problem
Design the logical and physical structure of one or more databases to accommodate the i f d t b t d t th information ti needs of the users in an organization for a defined set of applications. Satisfy the content requirements P id easy structuring of information Provide s st t i fi f ti Support processing requirements and performance objectives
Goals
25
Feasibility Analysis
Phases of Information System Life Cycle ( (Macro) )
Analyzing potential application areas Identifying the economics of information gathering and dissemination Performing cost benefit studies Setting up priorities among applications Detailed Requirements Collection Interaction with Users
Requirement Collection and Analysis D si Design
Design of Database System Design of programs that use and process the database
26
Implementation
Phases of Information System Life Cycle ( (Cont) )
Information system is implemented Database is loaded & its transactions are implemented and tested Testing against users requirements Testing against performance criteria Data conversion Training System maintenance S t i t Performance monitoring g Database tuning
Validation and Acceptance Testing
Deployment Operation and Maintenance Deployment,
27
Systems Development Life Cycle
System Analysis
Database Organization
28
Database Lifecycle (DBLC)
Phase 1 Phase 2 Phase 3 Phase 4 Phase 5 Phase 6
29
Database Organization O i i
Database System Life Cycle(DBLC)
System Definition Database Design
Defining scope of database system, its users and applications li ti Logical and physical design of the database system on the chosen DBMS Database implementation
Database Implementation and Loading
Specifying conceptual, external and internal database definitions Creating empty database files Implementing software applications
30
Database System Life Cycle (Cont...)
Database Implementation and Loading Cont
Loading or data conversion g
Populating the database
Application conversion
Converting applications to the new system l h
Testing and validation O Operation ti
Running the new system
M it i and M i t Monitoring d Maintenance
System maintenance Performance monitoring
31
Phase 1: Database Initial Study
Purposes
Analyze company situation y p y
Operating environment Organizational structure
D fi problems and constraints Define bl d i Define objectives Define scope and boundaries
32
Initial Study Activities
33
Phase 2: Database Design
Most Critical DBLC phase Makes sure final product meets requirements Focus on data requirements Subphases
I. Create Conceptual Design II. DBMS Software Selection III. Create Logical Design IV. Create Physical Design
34
Two Views of Data
35
I. Conceptual Design
Data modeling creates abstract data structure to represent real-world items p High level of abstraction p Steps
Data analysis and requirements *Entity relationship modeling and normalization* *Data model verification*
36
Conceptual Design Phase
TOP-DOWN Identify Entities
Identify Relationships
Identify Attributes
BOTTOM-UP
Identify Relationships
Identify Dependencies
DATA
Collect Data
37
Conceptual Schema Design
Goal
Complete understanding of the database structure, semantics, interrelationships and constraints ti i t l ti hi d t i t
Serves as a stable description of the database contents Good understanding crucial for the users and designers g Diagrammatic description serves as an excellent communication tool
38
Step 1 : Data Analysis and Requirements
Focus on:
Information needs Information users Information sources
Data Sources
Developing and gathering end-user data views Direct observation of current system Interfacing with systems design group
B i Business R l Rules
39
Step 2 : Entity Relationship g Modeling and Normalization
40
E-R Modeling is Iterative
41
Transaction Design
Design characteristics of known database transactions in a DBMS Types of Transactions
Retrieval Transactions Update Transactions Mixed Transactions
Update data Used to retrieve data
Techniques for Specifying Transactions
Input/output F Functional Behavior
Combination of update and retrieval
42
Conceptual Design: Tools and Sources
43
Step 3 : Data Model Verification
E-R model is verified against proposed system p processes
End user views and required transactions Business-imposed data requirements and constraints t i t
Reveals additional entity and attribute details
44
E-R Model Verification Process
45
Iterative Process of Verification
46
Approaches to Conceptual Schema Design
Centralized Schema Design Approach
Also known as one-shot approach Requirements of different applications and user groups are merged into a single set of requirements and a single schema is designed q g g Time consuming, places the burden on DBA to reconcile conflicts Schema is designed for each user group or application pp These schemas are then merged into a global conceptual schema during the view integration phase More practical
View Integration Approach
47
Strategies for Schema Design
Top Down Strategy gy
Start with a schema containing high-level abstractions and then apply successive topown down refinements
48
Strategies for Schema Design (Cont...)
Bottom-Up Strategy
Start with a schema containing g basics abstractions and then combine or add to these abstractions
49
Strategies for Schema Design (contd.)
Inside-out Strategy
Start with central set of concepts and then spread p p outward by considering new concepts in the vicinity of existing ones
Mixed Strategy
U a combination of t d Use bi ti f top-down and b tt d bottom-up strategies
50
Schema Integration
Identifying correspondence and conflict among different schemas
Naming Conflicts
Synonyms: The same concept but different names Homonyms: Different concepts but same name
e.g. entity types CUSTOMER and CLIENT e.g. entity type PART as computer parts and furniture parts
Type Conflicts: Representing the same concept by different m ff modeling constructs g
Also known as value set conflicts lso confl cts e.g. SSN as an integer and as a character string
Domain Conflicts: Attribute has different domains Conflict Among Constraints: Two schemas impose different constraints
e.g. DEPARTMENT may be an entity type and an attribute
e.g. different key of an entity type in different schemas
51
Desired Characteristics of Conceptual Data Model
Expressiveness
Able to distinguish different types of data, relationships and constraints l ti hi d t i t Easy to understand
Simplicity and Understandability Minimality
Diagrammatic Representation Formality
Small number of distinct basic concepts f p Diagrammatic notation for representing conceptual schema h Formal unambiguous specification of data
52
II. DBMS Software Selection
DBMS software selection is critical Advantages and disadvantages need study Some Key Factors affecting purchasing decision
Cost DBMS features and tools Underlying model Portability DBMS h d hardware requirements i
53
Choice of DBMS Factors to Consider
Technical Factors
Type of DBMS: Relational, object-relational, object etc etc. Storage Structures Architectural options Acquisition, maintenance, training and operating costs Database creation and conversion cost Organizational philosophy
Relational or Object Oriented Vendor Preference
E Economic Factors i F
Organizational Factors
Familiarity of staff with the system Availability of vendor services
54
III. Logical Design
Translates conceptual design into internal model Maps objects in model to specific DBMS constructs Design components
Tables Indexes Views Transactions T ti Access authorities Others
55
Logical Design Phase
Conceptual E.R M d l E R Model
1. REFINE THE CONCEPTUAL MODEL
Refined C R fi d Conceptual M d l t l Model
2.ER to RELATIONAL MAPPING & APPLY THE RULES OF NORMALIZATION Logical Data Model
56
IV. Physical Design
Selection of data storage and access characteristics
Technical More important in older hierarchical and network models d l
Becomes more complex for distributed systems Designers favor software that hides physical details
57
Physical Database Design
Logical Data Model
Track TR
Logical Process Model
01 Country
Physical Implementation Process
Database creation CREATE DATABASE CREATE TABLE LOAD
58
Physical Database Design
Design the specifications for the stored database in terms of physical storage structures, structures record placements and indexes. indexes Design Criteria
Response Time p Space Utilization
Elapsed Time between submitting a database transaction for execution and receiving a response St Storage space used b d t b d by database fil and their access files d th i path structures Average number of transactions/minute Must be measured under peak conditions
Transaction throughput
Result
I i i ld Initial determination of storage structures and i i f d access paths for database files
59
Phase 3: Implementation and Loading
Creation of special storage-related constructs to house end-user tables Data loaded into tables Other issues
Performance Security Backup and recovery Integrity Company standards C d d Concurrency controls
60
Phase 4: Testing and Evaluation
Database is tested and fine-tuned for performance, integrity, concurrent access, and security constraints d it t i t Done in parallel with application programming Actions taken if tests fail
Fine-tuning based on reference manuals Modification of physical design Modification of logical design Upgrade or change DBMS software or hardware
61
Phase 5: Operation
Database considered operational Starts process of system evaluation Unforeseen problems may surface Demand for change is constant
62
Phase 6: Maintenance and Evaluation
Preventative maintenance Corrective maintenance Adaptive maintenance Assignment of access permissions Generation of database access statistics to monitor performance Periodic security audits based on systemg generated statistics Periodic system usage-summaries
63
DB Design Strategy Notes Top-down
1) Identify data sets 2) Define data elements
Bottom up Bottom-up
1) Identify data elements 2) Group them into d t sets G th i t data t
64
Top-Down vs. Bottom-Up
65
Centralized vs. Decentralized Design
Centralized design
Typical of simple databases yp p Conducted by single person or small team
Decentralized design g
Larger numbers of entities and complex relations Spread across multiple sites Developed by teams
66
Decentralized Design
67
Critical Success Factors in Database Design
Work interactively with the users as much as p possible. . Follow a structured methodology throughout the data modelling process. Incorporate structural and integrity considerations into the data models. C bi Combine conceptualisation, normalisation, and t li ti li ti d transaction validation techniques into the data mode ng methodo ogy. modelling methodology.
68
Exercise 1 (Conceptual Design)
Create a conceptual ER model of the database for the
following lists. (List up the necessary entities ) DATA ITEMS, ITEMS set up ENTITIES and their ATTRIBUTES, and ATTRIBUTES identify the relationship among
List 1
Track No: 1 Participant code Track name: Managing information using Database Participant name Age Position Country Address
List 1 is the list of participants information by track List 2
Country code Country name Participant code Participant name Track name
List 2 is the list of participants information by countries
69
Exercise 2 : Primary and Foreign Keys
Please identify primary and foreign key.
Participant code Participant name Age Position Address Country code
Country code
Country name
Track code
Track name
Participant code
Primary key y y Foreign key
70
Exercise 3: Number the steps according to the correct order
____ Normalize the conceptual model. ____ Obtain a general description of company operations. operations ____ Load the database. ____ Create a description of each system process. process ___ Test the system. ____ Draw a data flow diagram and system flow charts. charts ____ Create a conceptual model, using E-R diagrams. ___ Create the application programs. C t th li ti s ____ Interview the mechanics. ____ Create the file (table) structures. ____ Interview the shop manager.
71
References
Chap 12 Practical Database Design Methodology and Use of UML Diagrams
72