0% found this document useful (0 votes)
74 views9 pages

Lecture 6

Week 6 covers topics related to data storage and manipulation, including: - The definition and differences between flat file and relational databases - Key terms used in relational databases like table, field, record, data type, primary key and foreign key - Software tools for creating database structures and manipulating data using SQL - Methods for presenting data and producing reports from the database
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
74 views9 pages

Lecture 6

Week 6 covers topics related to data storage and manipulation, including: - The definition and differences between flat file and relational databases - Key terms used in relational databases like table, field, record, data type, primary key and foreign key - Software tools for creating database structures and manipulating data using SQL - Methods for presenting data and producing reports from the database
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 9

Week 6. Data storage and manipulation.

Topics to be covered: Definition of the terms flat file database and relational


database. Advantages and disadvantages of using relational database rather than a
flat file database. Definition of the term's relation, table, field, attribute, record,
data type, primary key, foreign key. Software tools to store data appropriately in a
database. Creation of a database structure. Manipulation of data. Using SQL.
Presentation of data. Producing reports to display all the required data.

6.1 Flat file database and relational database


6.2 Main database terms
6.3 Relational Database Software.
6.4 Creation of a database structure. Manipulation of data and Using SQL.
6.5 Presentation of data and Producing reports

6.1 Flat file database and relational database


A database is an organized collection of structured data, typically stored
electronically in a computer system.
A database is usually controlled by a database management system
(DBMS). Together, the data and the DBMSare referred to as a database system, or
to just database.
Data within the database is typically modeled in rows and columns in a series
of tables to make processing and data querying efficient. The data can be easily
accessed, managed, modified, updated, controlled, and organized using structured
query language (SQL) for writing and querying data.
Data stored on an ICT system is stored and organized in files. When the data
is organized in rows and columns with data values being repeated, it is called a flat
file.
A flat file looks like a spreadsheet; it is two dimensional and has no related
files (or tables). Flat files are easy to create and use (for small amounts of data).
Data is often repeated in a flat file and therefore they can be quite large in size.
In the example
given in the figure, each subject that the student studies has to be entered as a
separate record.
The file given in the figure has a very simple row and column structure. There
are no relationships between data. The main weakness of this type of file is that
data has to be repeated. Look at the entry for “Omarov Askar”. His name is
recorded twice because he studies both ICT and Maths.
In a relational database system, this would not happen. The first name and last
name of each person would be stored only once and a related table would be
created to hold the rest of the data. The two tables would then be linked. When
data is repeated unnecessarily it is said to be redundant.
If another entry is made for Omarov Askar his name would repeated thrice.
The computer has more data to read through, so accessing and searching
through them can be slow.
Data has to be repeated and leads to inputting errors and inconsistencies. File
sizes can be large due to repeated data.
A relational database holds information in files or tables which are linked
together using key fields. The designer of the database will have identified
relationships between fields in each of the tables. The data will then be organized
in a way which makes adding records, searching and sorting easier and less error
prone than if it had been stored in flat files.
The relationships between the tables means that data is less likely to be
repeated so there should be no data redundancy. The number of times that data has
to be retyped is kept to a minimum so data is more likely to be free from typing
errors. This means that the integrity of the data is likely to be better.
Each table in a relational database has a field which is called a key field. The
key field is a value which will uniquely identify each record in the table. It can be
of any data type. In the case of the Student Table, the unique key field would be
the Student ID.
Relationships
are links that have been made between two tables, The links are made using
fields that are contained within both tables. From the flat files example, the Student
Table and Grades Table could be created:
The link relies on the fact that the data in the Student ID field in the Student
Table and the Grades Table are identical.
One Student has many Grades. This means that there is a one-to-many
relationship between the two tables. 
We can use a relational database package such as Microsoft Access to create
the relationship between the tables. The two tables are linked on Student ID, as this
field occurs in both tables. Each Student ID entered into the Grades table must be
present in the Student table. The database will not allow a grade to be entered for a
student who is not in the Student table. This helps ensure the integrity of the data.
When the tables are created and the data is entered, each student´s name is
entered only once regardless of the number of results that have been recorded for
that student.
Primary key is a field or group of fields in a table that uniquely identify
every row in that table. The Primary Key can't be a duplicate meaning the same
value can't appear more than once in the table. A table cannot have more than one
primary key. 
Rules for defining Primary key:
 Two rows can't have the same primary key value
 It must for every row to have a primary key value. 
 The primary key field cannot be null.
 The value in a primary key column can never be modified or updated if any
foreign key refers to that primary key.
A foreign key is a field (or collection of fields) in one table that uniquely
identifies a row of another table.
A foreign key creates a relationship between two tables. The purpose of
Foreign keys is to maintain data integrity and allow navigation between two
different instances of an entity. It acts as a cross-reference between two tables as it
references the primary key of another table.
In other words, the foreign key is defined in a second table, but it refers to the
primary key in the first table. 
Databases Relational over Flat File
Advantages Relational Databases over Flat File Databases are in:
 Data Integrity that means You Only need to change the data in one of the
tables, it will then update itself
 Data Redundancy that means By having a relational databse it ensures that
no attributes are repeated
 Data Consistency that means There is no chance of the same attribute being
stored in a different format in a different file
 Data Flexibility that means When dealing with queries, it gets much easier
creating deeper queries with a relational Database
 Greater Efficiency that means as you only have to input the data only once
into a relational database it saves time and human resources
Disadvantages are that
Relational databases can be very complex and without the necessary
training can be very hard to break down.
Relational databases are mostly commercial and require the user to buy
that piece of software or licenses for more than one machine
6.2 Main database terms
A relational database management system (RDBMS) stores and retrieves data
that is represented in tables. A relational database consists of a collection of tables
that store interrelated data.
In a relational database, all data is held in tables, which are made up
of rows and columns.
Each table has one or more columns, and each column is assigned a specific
datatype, such as an integer number, a sequence of characters (for text), or a date.
Each row in the table has a value for each column.
A typical fragment of a table containing employee information may look as
follows.
The tables of a relational database have some important characteristics:
 There is no significance to the order of the columns or rows.
 Each row contains one and only one value for each column.
 Each value for a given column has the same type.

Record is a complete set of information about one thing or individual. A


record will be made up from many fields.
A single item of data in a database is called a field.
A field refers to a single piece of data about a person or object. There can be
many fields in a record. Surname would be an example of a field. Each field is
given a name to describe its contents and assigned a data type so that the computer
knows how it can be processed Some fields will refer to a numeric value, whilst
others will refer to character or textual data.
When a table is created each of the fields will need to have a data type
assigned to them so that the database knows how to process the data.
The table in the next slide shows the most common data types in MS Access.
Most databases have similar datatypes.
The most common data types in MS Access are:
AutonumberThe next number generated in sequence automatically by
Access. Used or key fields.
CurrencyNumeric field with 2 decimal places and automatically displays the
local currency symbol (£)
Date/TimeStores both the date and time.
HyperlinkSpecial text field to represent hyperlinks
NumberOnly numerical data (decimal places can be specified if required)
Text255 characters to store letters, numbers and symbols (.,' etc).
Yes/NoBoolean data can also represent true/no
Most databases have similar datatypes
Primary and foreign keys define the relational structure of a database. These
keys enable each row in the database tables to be identified, and define the
relationships between the tables.

A primary key is a minimal set of attributes (columns) in a table that uniquely


identifies tuples (rows) in that table.

A foreign key is a field (or collection of fields) in one table that uniquely identifies
a row of another table.

In other words, the foreign key is defined in a second table, but it refers to the
primary key in the first table.
We have given definitions to main terms of relational database such as:
 Tables
 Fields
 Records
 Data types
 Relations
 Primary key
 Foreign key
6.3 Relational Database Software.
• A relational database is a data repository that stores data in structured tables
with rows and columns.

• Because data is stored in a structured way, it can be retrieved using a query


language that understands the table structure.
A Relational Database has the following Features & Capabilities:
• ACID (Atomicity, Consistency, Isolation, Durability) compliance ensures
database transactions to be processed reliably
• Referential integrity ensures that relationships between tables remain
consistent
• Synchronization mechanisms for managing simultaneous access of the same
data by multiple users
• Support for Unicode for multilingual capability
• Ability to run database seamlessly on multiple platforms
• Ability to recover data in the case of a failure
• A table is referred to as a relation in the sense that each row contains data
that is related to a single entity, while each column represents different keys or
concepts.
• When each row in a table is different, it is possible to use one or more
columns to identify a particular row. There is always a “primary key” which serves
as the unique identifier for each entity or record. For example, in the case of a table
of company employees, the primary key might be employee number.
• Data is retrieved from the table using a specialized language called
Structured Query Language or SQL. SQL allows users to query the data using
SELECT statements and WHERE clauses. A group of SQL statements can be
called as a mini-program or executable which is called a stored procedure.
• SQL is a widely understood language with a deep bench of practitioners in
the marketplace. For this reason, tools using SQL are being developed to query
non-relational big data stores like Hadoop, which use less well known, and harder
to use, interfaces to retrieve data.
One of the most popular Relational Database Software is Microsoft's SQL
Server. It has custom-built graphical integration of database designs.
Advantages of Microsoft's SQL Server are:
• Easy to set up a new database server from scratch
• Creates various designs, tables, and view data without syntax
• Can handle complicated queries and integrate with other programs
Disadvantages are:
• The lightweight package lacks user-friendliness in its user-interface.
• The execution of long queries often takes longer than the calculated time.
Postgre SQL is The open-source DBMS.
Postgre SQL supports Python programming languages and JSON format.

Advantages of Postgre SQL are:


• Storage and management of data in higher volumes
• Relatively secured data processing than others
• Straightforward installation process on Linux and Windows operating
system (OS)
Disadvantages are:
• Native interface limits the manipulation of data.
• The advanced nature of the tool slows down the insertion of small databases.
• The installation and configuration of the software can be time-consuming.
My SQL is a high-speed data processing and data productivity tool with
comprehensive features. The tool is designed to increase the security and
scalability of your databases.
Advantages of My SQL are:
• Easy to learn the foundational features without a programming background
• The open-source nature grant users complete freedom to customize data
• Well-suited for small businesses and entrepreneurs because of the low-cost
tructure
• Compatible with up-to-date industry's DBMS practices
• Instant integration with Apache web engines
• The data development process is adjustable for small and large applications.
Disadvantages are:
• Queries get stuck even after a refresh or restart
There is overdependence on third-party add-ons
Oracle RDBMS tool is one of the most effective object-relational DBMS
tools.
• You can run Oracle RDBMS on a PC or a high-powered supercomputer.
The active designs, for instance, trigger a dedicated referential system within the
software
Advantages of Oracle are:
• You can create partitions to have better administrative control over your
data
• Smooth transactional process and data security 
Disadvantages are:
• The price range of the tool is discouraging for small businesses
• Client applications are often plugged-in automatically
Microsoft Access works solely on Windows OS. It's a great tool to create
custom databases in numerous formats as per your business needs.
Advantages of Microsoft Access are:
• Custom templates of Microsoft Access are ideal for adding web databases
and simultaneously tracking, reporting, or sharing with other users
• Users get to have full access to Office Suite Packages such as Word, Excel,
and Outlook.
Disadvantages is:
• License extension to integrate third-party tools can take time.
Informix Dynamics Server is a commercial DBMS tool that is perfect for
businesses that want parallel and multi-threaded processing of data. The scalability
of the software makes it ideal, plus it has a great in-depth support option.
Advantages of Microsoft Access are:
• It's Online Transactional Processing (OLTP) performance is better than
others
• Flexibility to scale small systems to multi-layered distributed nodes
• It can merge relational data in a time series
Disadvantages is:
• The top edition is expensive for small businesses.
IBM DB2 is a commercial DBMS tool attracts users because of its easy
installation and setup to store a large amount of data instantly. It supports most of
the data science languages to handle simple or complex frameworks.
Advantages of IBM DB2 are:
• It offers AI-dedicated capabilities that are designed to manage and structure
complex data
• It supports private as well as cloud environments
Disadvantages is
• The functional features of the DBMS tool may be too much for a non-
developer

6.4 Creation of a database structure. Manipulation of data and Using


SQL.
A query is a database term that is used to retrieve data from a database
• To write queries for databases it is used SQL – Structured query language.
QL has it’s own syntax. SQL is divided into Data Definition Language (DDL)
• Data Manipulation Language (DML).
• Data Definition Language helps you to define the database structure or
schema.
• Data Manipulation Language (DML) allows you to modify the database
instance by inserting, modifying, and deleting its data. It is responsible for
performing all types of data modification in a database.
For example, SELECT queries are used to extract information from a
database and refers to DML
In the Example on the slide we
• SELECT all records with first_name, last_name
• FROM the table president
• WHERE state equals to ‘NY’;
Pay attention to the semicolon at the end of the query. A query should be
ended by semicolon.
DDL commands are:
CREATE
CREATE statements is used to define the database structure schema:
DROP
Drops commands remove tables and databases from RDBMS.
ALTER
Alters command allows you to alter the structure of the database.
TRUNCATE
This command used to delete all the rows from the table and free the space
ontaining the table.
Here are some important DML commands:
INSERT
• This is a statement is a SQL query. This command is used to insert data into the
row of a table.
UPDATE
• This command is used to update or modify the value of a column in the table.
DELETE
• This command is used to remove one or more rows from a table.
In the first query we
SELECT all records with first_name, last_name
FROM the table president
WHERE birth_date starts from 19 and followed by any sequence of
characters;
In the second query we compute the number of records of the table president that
matches a condition that state = ‘VT’;
In the third query we
SELECT all records with first_name, last_name
FROM the table president
WHERE state is either ME or NH or VT or MA or CT or RI);
To CREATE a TABLE it is Used the CREATE statement, specifying fields
and column types. CREATE refers to DDL.
In the Example on the slide we are creating a table friends with fields
first_name of varchar data type with (25) characters,
last_name of varchar data type with (25) characters,
gender can be either ‘M’ or ‘F’),
grade is unsigned small integer data type,
hair_color is a set of (20) characters,
email is a set of (30) characters,
screen_name varchar(25),
phone char(12));
In order to add new records into the table we Use the INSERT statement:
In the Example on the slide it is demonstrated adding a new record into the table
friends
In order to change the value of the certain attribute of the table we Use
the UPDATE statement:
In the Example on the slide it is demonstrated editing the value of the field phone
of the table friends according to the condition specified in the where clause
In order to remove records from the table we Use the DELETE statement:
In the Example on the slide it is demonstrated deleting the rows from the table
friends according to the condition specified in the where clause
Summarizing all we have learned in this video we can say that
SQL is a database language designed for the retrieval and management of data in a
relational database.
SQL is the standard language for database management. All the RDBMS
systems like MySQL, MS Access, Oracle, Sybase, Postgres, and SQL Server use
SQL as their standard database language.
widely used SQL queries are:
• Data Definition Language (DDL)
• Data Manipulation Language (DML)
Data Definition Language helps you to define the database structure or schema.
Data Manipulation Language (DML) allows you to modify the database instance
by inserting, modifying, and deleting its data. It is responsible for performing all
types of data modification in a database.
6.5 Presentation of data and Producing reports
• If you need to share information from your database with someone but don't
want that person actually working with your database, consider creating a report.
• Reports allow you to organize and present your data in a reader-friendly, visually
appealing format.
• MS Access makes it easy to create and customize a report using data from any
query or table in your database.
• Reports are created from one or more tables or queries. To use several
tables, you would first create a query to retrieve data from those tables.
• There are three basic ways to create a report in MS Access: with a single mouse
click, with the Report wizard, or in Design view.
To create a report in MS Access with a single click,
• Open the table or query upon which you want to base the report.
• On the Create tab, click Report.
• The report is created and displays all the fields from the underlying table or
query.
You can use the Report Wizard to be more selective about what fields appear
on your report. You can also specify how the data is grouped and sorted, and you
can use fields from more than one table or query, provided you have specified the
relationships between the tables and queries beforehand.
The Report Wizard asks you questions and creates a report based on your answers.
This method is best when you want to be stepped through the process of creating a
report.
• To create a report using the wizard,
• On the Create tab, click Report Wizard.
• Follow the onscreen prompts.
Design view gives you total freedom and control, right from the beginning.
Controls can be bound to data from your database, displaying it directly, or they
can be unbound, without a link to a data source (such as descriptive text, dividing
lines, a logo, etc).
• To create a report in Design view,
• On the Create tab, click Report Design.
The resulting report looks like this:
Summarizing all we have learned we can say that
• Reports allow you to organize and present your data in a reader-friendly, visually
appealing format.
• a report in MS Access can be created in three basic ways: with a single mouse
click, with the Report wizard, or in Design view.

You might also like