CXB 3104 Advanced Database Systems
Outline Data & Information Database Types of database Database Schema Database models
Lecture 1
Muthu muthma@sunway.edu.my
Data
Meta data
Data can be classified as raw facts or the building block of information. These are usually unprocessed information. Data can exists in a variety of forms such as numbers or text on pieces of paper and bytes stored in electronic memory, or as facts stored in a persons mind.
3
Metadata describes how and when and by whom a particular set of data was collected, and how the data is formatted.
Information Information can be classified as processed data. Information should be meaningful, relevant, accurate and timely. These are essential ingredients to good decision-making, which is the key to organizational success.
Data management Data Management is a discipline that focuses on the proper generation, storage and retrieval of data
What is a database?
Database organization
A collection of information organized in such a way that a computer program can quickly select desired pieces of data. Databases are organized as fields, records, and files.
A field is a single piece of information that has a specific meaning and is used to define and store data. A record is one complete set of fields describing a person, place or thing. A file is a collection of related records.
Database management system
Advantages of DBMS
To access information from a database, you need a database management system A DBMS is a collection of programs that enables you to store, modify, and extract information from a database. DBMSs makes data management more effective
9
The end-users has better access to more and better-managed data. Allows integrated view of organizations data leading. The ability to produce quick responses to ad hoc queries.
10
DBMS Facilities
Types of Databases
11
Data Manipulation Language
has a central repository for all data and data descriptions provides inquiry facilities to this data called a query language.
Single-user database Desktop database Multi-user database Workgroup database Enterprise database
12
Data Definition Language
allows users to specify the data-type and structures, and the constraints on that data to be stored in the database
Other classifications of Databases
Database Design
A centralized database is one that supports data located at a single site. A distributed database is a single logical database that is spread physically across computers in multiple locations The transactional database is used to process transactions A data-warehouse database focuses on storage of data that will be used for information generation for the purpose of decision-making.
Database design refers to the design of the database structure that will be used to store and manage data. It does NOT refer to the designing the DBMS software. A well- designed database will minimize the problem of data-redundancy. A poorly designed database tends to generate errors
13
14
Schema
Database Schema vs. Database State
Database Schema:
The description of a database. Includes descriptions of the database structure, data types, and the constraints on the database.
Distinction
The database schema changes very infrequently. The database state changes every time the database is updated.
15
16
Example of a Database Schema
Example of a database state
17
18
Three-Schema Architecture
Three-Schema Architecture
Proposed to support DBMS characteristics of:
Program-data independence. Support of multiple views of the data.
Defines DBMS schemas at three levels:
Internal schema at the internal level to describe physical storage structures and access paths (e.g. indexes).
Typically uses a physical data model.
Not explicitly used in commercial DBMS products, but has been useful in explaining database system organization
Conceptual schema at the conceptual level to describe the structure and constraints for the whole database for a community of users.
Uses a conceptual or an implementation data model.
External schemas at the external level to describe the various user views.
Usually uses the same data model as the conceptual schema.
19
20
The three-schema architecture
Data Independence
Logical Data Independence:
The capacity to change the conceptual schema without having to change the external schemas and their associated application programs.
Physical Data Independence:
The capacity to change the internal schema without having to change the conceptual schema. For example, the internal schema may be changed when certain file structures are reorganized or new indexes are created to improve database performance
21
22
Business rules
Business rules
A clear description of a policy, procedure, or principle within an organizations environment. Properly written business rules are used to define entities, attributes relationships and constraints. To be effective, a business rule must be easy to understand and widely disseminated to ensure that every person in the organization shares a common interpretation.
Examples
A customer may generate many invoices. Only one customer generates each invoice. A minimum of 15 participants is needed for the training to be scheduled
23
24
Business rules Why?
Hierarchical Model
They help standardize the companys view of data. They constitute a communications tool between users and designers. They allow the designer to understand the nature, role and scope of all data. They allow the designer to understand business processes. They allow the designer to develop appropriate relationship participation rules and constraints. .
Records are arranged in a top-down structure The parent and child are often used in describing hierarchical model a child is related to only one parent
25
26
Hierarchical Model
DEPARTMENT DEPTNO DNAME LOCATION
Network Data Model
no distinction between parent and child record types Any record type may be associated with any number of different record types network schema - the conceptual organization of the entire database as viewed by the database administrator. The subschema, which defines the portion of the database seen by the application programs that actually produce the desired information from the data contained in the database.
28
EQUIPMENT IDENT COST NUMBER
EMPLOYEE
EMPNO ENAME YEARS
DEPENDENT DNAME AGE
SKILL CODE SNAME NOYEARS
27
Network Data Model
Relational Model
Based on the mathematical concept of a relation, represented as a table. A table stores a collection of related entities. The database resembles a file but a unlike a file a table yields complete data and structural independence because it is purely logical. How the data is stored in the database is of no consequence to the user or designer. A table is a matrix made up a series of row and columns.
DEPARTMENT DEPTNO DNAME LOCATION
DEPT-EMP
EMPLOYEE EMPNO ENAME YEARS
DEPT-PROJ PROJECT
PROJNO DESCRIPTION
PROJ-EMP
29
30
Relational Model
Bno. B1 B2 B3 B4 Street 22, Oxley Rd 44, Shenton Way 11, Chai Chee Rd 4, Jurong West Drive location Central South East West PostCode 589652 548796 585698 526478 TelNo. 67894561 67896325 67665892 68875425
Advantages of the Relational model
Structural Independence Improved conceptual simplicity Easier design, implementation and management Ad hoc query capability Powerful DBMS
Fname Sno. S11 S12 S14 John Stella Muthu
Lname Doe Chan Ramasam y
Position Manager Clerk Senior Exec
Salar y 5000 1800 3200
Sex M F M
Bno B1 B3 B3
31
32
Disadvantages of the Relational model
Substantial
Object-oriented Model
overhead
Objects are abstraction of the real world entities that exhibit states and behaviors state of objects is expressed as values of the attributes of the object behavior of an object is expressed by a set of methods that operate on its attributes.
Can facilitate poor design and implementation Promote islands of information problem
33
34
Features of OO data model
Advantages of the OO model
Add semantic content Visual presentation includes semantic content. Database integrity Both structural and data independence
An object is an abstraction of real-world entity. It represents a single occurrence of an entity. Attributes describe the properties of an object. Objects that share similar characteristics are grouped in classes. A class is a collection of similar objects with shared structure.
A method represents a real-world action such as display; to print out the details of the Student. Methods define the behavior of an object. Classes are organized in a class hierarchy.
Inheritance is the ability of an object within the class hierarchy to inherit the attributes and methods of classes above it(parent).
35
36
Disadvantages of the OO model
Slow pace of OODM standards and development Complex navigational data access Steep learning curve High systems overhead slows transactions .
Relational Database Model
37
Relational Database Model
Tables
View of data is logical rather than physical. DBMS in between the application and database allows the designer to focus on the logical representation of the data and their relationships.
Logical view of the relational database is facilitated by the creation of relationships based on a logical construct known as a table or a relation.
39
40
Table characteristics
Table - example
A table is composed of rows and columns. Each row (tuple) represents a single occurrence. Each column represents an attribute and is uniquely named. Each row/column intersection represents a single piece of data. All values in a single column must be of the same data format. The range of possible values of a column is known as the attribute domain. Each table must have an attribute or combination of attributes that uniquely identifies each row.
Bno. B1
Street 22, Oxley Rd
location Central
PostCode 589652
Tel No. 67894561
B2 B3 B4
44, Shenton Way 11, Chai Chee Rd 4, Jurong West Drive
South East West
548796 585698 526478
67896325 67665892 68875425
41
42
Relational Keys
Candidate Key It is an attribute or combination of attributes that enables each row to be uniquely identified. Primary key It is an attribute or a set of attributes that uniquely identify a specific instance of an entity. Composite Key It is a primary key that made up of more than one attribute. Secondary key An attribute or combination of attributes used strictly for data retrieval purposes. Foreign Key - An attribute whose values match primary key values in the related table
Relational Keys
43
44
Integrity rules
Entity integrity
Relational Data Manipulation
Relational tables are sets. Rows of the tables can be considered as elements of the set Operations that can be performed on sets can be done on relational tables.
primary key can never be a null value used to identify a unique row in a relational table its value must always be specified. insert, update, and delete operations maintain the uniqueness and existence of all primary keys. A foreign key, must either be null or match the values in the relational table in which that foreign key is a primary key.
Referential integrity
45
46