DBMS in 5
DBMS in 5
• Chapter-1 (Basics): Data & information, Database System vs File System, Views of Data Base, Data Independence, • UNIT-1 : INTRODUCTION Overview, Database System vs File System, Database System Concept and Architecture, Data
Instances & Schema, OLAP Vs OLTP, Types of Data Base, DBA, Architecture. Model Schema and Instances, Data Independence and Database Language and Interfaces, Data Definitions Language,
• Chapter-2 (ER Diagram): Entity, Attributes, Relationship, Degree of a Relationship, Mapping, Weak Entity set, DML, Overall Database Structure. Data Modeling Using the Entity Relationship Model: ER Model Concepts, Notation
Conversion from ER Diagram to Relational Model, Generalization, Specification, Aggregation. for ER Diagram, Mapping Constraints, Keys, Concepts of Super Key, Candidate Key, Primary Key, Generalization,
• Chapter-3 (RDBMS & Functional Dependency): Basics & Properties, Update Anomalies, Purpose of Normalization, Aggregation, Reduction of an ER Diagrams to Tables, Extended ER Model, Relationship of Higher Degree.
Functional Dependency, Closure Set of Attributes, Armstrong’s axioms, Equivalence of two FD, Canonical cover, Keys. • UNIT-2 : RELATIONAL DATA MODEL Relational Data Model Concepts, Integrity Constraints, Entity Integrity, Referential
• Chapter-4 (Normalization): 1NF, 2NF, 3NF, BCNF, Multivalued Dependency, 4NF, Lossy-Lossless Decomposition, 5NF, Integrity, Keys Constraints, Domain Constraints, Relational Algebra, Relational Calculus, Tuple and Domain Calculus.
Dependency Preserving Decomposition. Introduction on SQL: Characteristics of SQL, Advantage of SQL. SQl Data Type and Literals. Types of SQL Commands.
• Chapter-5 (Indexing): Overview of indexing, Primary indexing, Clustered indexing and Secondary Indexing, B-Tree. SQL Operators and Their Procedure. Tables, Views and Indexes. Queries and Sub Queries. Aggregate Functions. Insert,
• Chapter-6 (Relational Algebra) : Query Language, Select, Project, Union, Set Difference, Cross Product, Rename Update and Delete Operations, Joins, Unions, Intersection, Minus, Cursors, Triggers, Procedures in SQL/PL SQL.
Operator, Additional or Derived Operators. • UNIT-3 : DATA BASE DESIGN & NORMALIZATION Functional dependencies, normal forms, first, second, 3 third normal
• Chapter-7(SQL) : Introduction to SQL, Classification, DDL Commands, Select, Where, Set Operations, Cartesian forms, BCNF, inclusion dependence, loss less join decompositions, normalization using FD, MVD, and JDs, alternative
Product, Natural Join, Outer Join, Rename, Aggregate Functions, Ordering, String, Group, having, Trigger, embedded, approaches to database design.
dynamic SQL. • UNIT-4 : TRANSACTION PROCESSING CONCEPT Transaction System, Testing of Serializability, Serializability of
• Chapter-8 (Relational Calculus) : Overview, Tuple Relation Calculus, Domain Relation Calculus. Schedules, Conflict & View Serializable Schedule, Recoverability, Recovery from Transaction Failures, Log Based
• Chapter-9 (Transaction) : What is Transaction, ACID Properties, Transaction Sates, Schedule, Conflict Serializability, Recovery, Checkpoints, Deadlock Handling. Distributed Database: Distributed Data Storage, Concurrency Control,
View Serializability, Recoverability, Cascade lessness, Strict Schedule. Directory System.
• Chapter-10 (Recovery & Concurrency Control) : Log Based Recovery, Shadow Paging, Data Fragmentation, TIME • UNIT-5 : CONCURRENCY CONTROL TECHNIQUES Concurrency Control, Locking Techniques for Concurrency Control,
STAMP ORDERING PROTOCOL, THOMAS WRITE RULE, 2 phase locking, Basic 2pl, Conservative 2pl, Rigorous 2pl, Strict Time Stamping Protocols for Concurrency Control, Validation Based Protocol, Multiple Granularity, Multi Version
2pl, Validation based protocol Multiple Granularity. Schemes, Recovery with Concurrent Transaction, Case Study of Oracle.
Knowledge Gate Website Knowledge Gate Website
Aspect File System Database Management System View of Data Base (Data Abstraction)
Data Access
Slower data retrieval due to unstructured Structured querying capabilities allow for • Physical Level
querying capabilities. quicker data access.
Challenges in correlating data across Facilitates data integration, reducing data • Logical Level/ Conceptual Level
Data Isolation
separate files leading to data isolation. isolation issues.
• View Level
Risk of inadvertent data alterations or Features to prevent unauthorized data
Data Integrity
deletions creating integrity problems. alterations, maintaining integrity.
Types of Data Base • Temporal Database: Keeps track of changing data over time, allowing for
queries concerning time-based data. A historical trading database in the
• Commercial Database: Predominantly used in the business sector to handle
financial sector which keeps track of stock prices over time.
large volumes of transactions and customer data. A CRM system like Salesforce
which handles large volumes of customer data and transactions.
• Geological Information System (GIS): Stores, organizes, and analyzes
• Multimedia Database: Stores data types such as images, audio, and video files,
geographical data, aiding in spatial analysis and mapping projects. A system like
facilitating the management and retrieval of multimedia content. A digital asset
ArcGIS which enables the storage, analysis, and visualization of geographical
management system like Adobe Experience Manager that facilitates the storage
data.
and retrieval of multimedia content.
1.Query Processor: This is the component of a DBMS that interprets and executes user queries. It comprises several A Database Management System (DBMS) consists of three primary components:
sub-components including:
1. DML Compiler: Processes Data Manipulation Language (DML) statements into low-level instructions that can
be executed.
1. Internal Level: Concerns the physical storage of data in databases, overseeing data storage
2. DDL Interpreter: Processes Data Definition Language (DDL) statements into metadata tables. on hardware devices, and managing low-level aspects like data compression and indexing.
3. Embedded DML Pre-compiler: Translates DML statements embedded in application programs into
procedural calls. 2. Conceptual Level: Represents the logical layout of the database, detailing the schema with
4. Query Optimizer: Determines the most efficient way to execute a query by evaluating different query plans. tables and attributes and their interrelations. It's independent of specific DBMS
2.Storage Manager: Also known as the Database Control System, it is responsible for managing the data stored in
the database, ensuring its consistency and integrity. It includes the following sub-components:
implementations, focusing on organizing and connecting data elements.
1. Authorization Manager: Manages access controls and privileges.
2. Integrity Manager: Ensures that data modifications adhere to integrity constraints. 3. External Level: Embodies the user interface of the database, facilitating data access and
3. Transaction Manager: Manages concurrent access to the database and maintains database consistency interaction through user-friendly views and interfaces tailored to various user groups.
during transactions.
4. File Manager: Manages file space and data structures representing information in the database.
5. Buffer Manager: Manages data cache and data transfer between main memory and secondary storage.
3.Disk Storage: Represents the storage aspect of a DBMS, encompassing the following components:
1. Data Files: Files where the actual data is stored.
2. Data Dictionary: Repository containing information about the structure and characteristics of database
objects.
3. Indices: Data structures that facilitate faster data retrieval.
Knowledge Gate Website Knowledge Gate Website
ER Diagram ER diagram for bank
• Developed by Dr. Peter Chen in 1976, this conceptual level
method, grounded in real-world perceptions, facilitates
diagrammatic data representation, simplifying
comprehension for non-technical users.
• An entity may be concrete, such as a person or a book, or it may be abstract, such as a course,
a course offering, or a flight reservation.
• Types of Entity • In ER diagram we cannot represent an entity, as entity is an instant not schema,
and ER diagram is designed to understand schema
• Tangible - Entities which physically exist in real world. E.g. - Car, Pen, locker
• In a relational model entity is represented by a row or a tuple or a record in a
• Intangible - Entities which exist logically. E.g. – Account, video. table.
• In an ER diagram an entity set is represented by a rectangle • Attributes are the descriptive properties possessed by each member of an entity set.
for each attribute there is a set of permitted values called domain.
• In a relational model it is represented by a separate table
• In an ER diagram attributes are represented by ellipse or oval connected to rectangle. Types of Attributes
• Single valued- Attributes having single value at any instance of time for an entity. E.g. –
• While in a relational model they are represented by independent column. e.g. Aadhar no, dob.
Instructor (ID, name, salary, dept_name)
• Multivalued - Attributes which can have more than one value for an entity at same time. E.g.
- Phone no, email, address.
• A multivalued attribute is represented by a double ellipse in an ER diagram and by an
independent table in a relational model.
• Separate table for each multivalued attribute, by taking mva and pk of main table as fk in
new table
Customer
Customer
• Simple - Attributes which cannot be divided further into sub parts. E.g. Age
Customer ID First Name Surname Telephone Number
123 Rabri Devi Singh 555-861-2025, 192-122-1111
• Degree
• NAME- every relation must have a unique name. Degree of a relationship/relationship set
• Means number of entities set(relations/tables) associated(participate) in the
relationship set.
• Most of the relationship sets in a data base system are binary.
• Occasionally however relationship sets involve more than two entity sets.
• Logically, we can associate any number of entity set in a relationship called N-ary
Relationship.
• Ternary Relationship - When three entities participate in a Relationship. E.g. • Quaternary Relationship - When four entities participate in a Relationship.
The University might need to record which teachers taught which subjects in
which courses.
• An E-R enterprise schema may define certain constraints to which the contents
of a database must conform.
MAPPING CARDINALITIES / CARNINALITY RATIOS One to One (1:1) Relationship - An entity in A is associated with at most one entity
in B, and an entity in B is associated with at most one entity in A.
• Express the number of entities to which another entity can be associated via a
relationship set. Four possible categories are-
• One to One (1:1) Relationship.
• One to Many (1: M) Relationship.
• Many to One (M: 1) Relationship.
• Many to Many (M: N) Relationship.
E.g.- The directed line from relationship set advisor to both entities set indicates that ‘an instructor
may advise at most one student, and a student may have at most one advisor’.
E.g.- This indicates that student may have many instructors but an instructor can
advise at most one student.
E.g.- This indicates that an instructor may advise many students, but a student may
have at most one advisor.
Many to Many(M:N) Relationship - An entity in A is associated with any number • PARTICIPATION CONSTRAINTS- it defines participations of entities of an entity
(zero or more) of entities in B, and an entity in B is associated with any number type in a relationship.
(zero or more) of entities in A.
• Partial participation
• Total Participation
E.g.- This indicates a student may have many advisors and an instructor may advise
many students.
• An entity set that does not process sufficient attributes to form a primary key is called a • The identifying entity set is said to own weak entity set that it identifies.
weak entity set. • A weak entity set may participate as owner in an identifying relationship with another
• It contains discriminator attributes (partial key) which contain partial information about weak entity set.
the entity set, but it is not sufficient enough to identify each tuple uniquely.
Represented by double rectangle.
• The relationship associating the weak entity set with the identifying entity set is called REASONS TO HAVE WEAK ENTITY SET
the identifying relationship (double diamonds).
• The identifying relationship is many to one from the weak entity set to identifying • Weak entities reflect the logical structure of an entity being dependent on
entity set, and the participation of the weak entity set in relationship is always total. another.
• The primary key of weak entity set will be the union of primary key and discriminator • Weak entity can be deleted automatically when their strong entity is deleted.
attributes. • Without weak entity set it will lead to duplication and consequent possible
inconsistencies.
Specialization Aggregation
• A process where a higher-level entity is broken down into more specific, lower-
level entities. • A concept wherein relationships are abstracted to form higher-level
• This top-down approach delineates complexity into simpler components. entities, enabling a more organized representation of complex
• Acts as the converse of the generalization process, focusing on differentiating relationships.
properties rather than similarities.
• Domain (set of permissible value in particular column) is a set of atomic values. • Tuple - Each row of a Relation/Table is called Tuple.
• By atomic we mean that each value in the domain is indivisible as far as the formal • Arity/Degree - No. of columns/attributes of a Relation. E.g. - Arity is 5 in Table Student.
relational model is concerned.
• A common method of specifying a domain is to specify a data type from which the • Cardinality - No of rows/tuples/record of a Relational instance. E.g. - Cardinality is 4 in table
data values forming the domain are drawn. Student. NAME ID CITY COUNTRY HOBBY
NISHA 1 AGRA INDIA PLAYING
• E.g. Names: The set of character strings that represent names of persons. Rows/Tuples/Record/ NIKITA 2 DELHI INDIA DANCING
Cardinality AJAY 3 AGRA INDIA CHESS
ARPIT 4 PATNA INDIA READING
Domain/ NAME ID CITY COUNTRY HOBBY
NISHA 1 AGRA INDIA PLAYING
Field/ NIKITA 2 DELHI INDIA DANCING
Column/ AJAY 3 AGRA INDIA CHESS
ARPIT 4 PATNA INDIA READING
Arity/
Degree Knowledge Gate Website Knowledge Gate Website
Properties of Relational tables Problems in relational database
1. Cells contains atomic values • Update Anomalies- Anomalies that cause redundant work to be done during
insertion into and Modification of a relation and that may cause accidental loss
2. Values in a column are of the same kind of information during a deletion from a relation
• Insertion Anomalies
3. Each row is unique
• Modification Anomalies
4. No two tables can have the same name in a relational schema.
• Deletion Anomalies
5. Each column has a unique name
• Insertion anomalies: An independent piece of information cannot be recorded • Modification anomalies: The update of a piece of information must occur at
into a relation unless an irrelevant information must be inserted together at the multiple locations.
same time.
Roll no name Age Br_code Br_name Br_hod_name
Roll no name Age Br_code Br_name Br_hod_name 1 A 19 101 Cs Abc
1 A 19 101 Cs Abc 2 B 18 101 Cs Abc
2 B 18 101 Cs Abc 3 C 20 101 Cs Abc
3 C 20 101 Cs Abc
4 D 20 102 Ec Pqr
4 D 20 102 Ec Pqr
Br_code Br_hod_name
Br_code Br_hod_name
Q Consider the following relation instance, which of the following • Trivial Functional dependency - If β is a subset of α, then the functional
dependency doesn’t hold dependency α → β will always hold.
X Y Z
A) A → b
जजसका होना न होना बराबर हो 1 4 2
A B C 1 4 3
2 6 3
1 2 3 3 2 2
B) BC → A
4 2 3
C) B → C 5 3 3
D) AC → B
Knowledge Gate Website Knowledge Gate Website
ATTRIBUTES CLOSURE/CLOSURE ON ATTRIBUTE SET/ CLOSURE SET OF ATTRIBUTES ARMSTRONG’S AXIOMS
• Attribute closure of an attribute set can be defined as set of attributes which can
1. An axiom or postulate is a statement that is taken to be true, to
be functionally determined from F. serve as a premise or starting point for further reasoning and
• DENOTED BY F+ arguments.
Q R(ABCDE)
R (A, B, C, D) R(ABCDEFG) A → BC 2. Armstrong's axioms are a set of axioms (or, more
precisely, inference rules) used to infer all the functional
A→B A→B CD → E dependencies on a relational database. They were developed
B→C BC → DE B→D
by William W. Armstrong in his 1974 paper.
3. The axioms are sound in generating only functional dependencies
AB → D AEG → G E→A in the closure of a set of functional dependencies (denoted as F+)
when applied to that set (denoted as F).
A+ = (AC)+ =? (B)+ =
Armstrong Axioms From these rules, we can derive these secondary rules-
• Augmentation: If X → Y, then XZ → YZ
• Pseudo transitivity: If X → Y and WY → Z, then WX → Z
Q Consider the following set of fd R(ACDEH) To find the MINIMAL COVER / CANONICAL COVER / IRREDUCIBLE SET
F G • A canonical cover (also known as a minimal cover) for a set of functional dependencies in a database is
a minimal set of functional dependencies that is equivalent to the original set, but with redundant
A→C A → CD dependencies and extraneous attributes removed. It is used in the normalization process of database
design to simplify the set of functional dependencies and to find a good set of relations.
AC → D • There may be any following type of redundancy in the set of functional dependencies: -
E → AH • Complete production may be Redundant.
E → AD • One or more than one attributes may be redundant on right hand side of a production.
• One or more than one attributes may be redundant on Left hand side of a production.
E→H
• Candidate key which are not chosen as primary key is alternate key. • The concept of referential integrity is derived from foreign key theory.
PARTIAL DEPENDENCY- When a non – prime attribute is dependent only on a part (Proper SECOND NORMAL FORM
subset) of candidate key then it is called partial dependency. (PRIME > NON-PRIME) • Relation R is in 2NF if,
• R should be in 1 NF.
Full DEPENDENCY- When a non – prime attribute is dependent on the entire candidate key then • R should not contain any Partial dependency. (that is every non-prime
it is called Full dependency. attribute should be fully dependent upon candidate key)
e.g. R(ABCD)
AB→D
A→C
TRANSITIVE DEPENDENCY – A functional dependency from non-Prime attribute to THIRD NORMAL FORM
non-Prime attribute is called transitive
• Let R be the relational schema, it is said to be in 3 NF
E.g.- R(A, B, C, D) with A as a candidate key • R should be in 2NF
A→B • It must not contain any transitive dependency
B→C [ transitive dependency]
C→D [transitive dependency]
Q Find the normal form of relation R(A, B, C, D, E) having FD set F= {A → B, BC → E, Multivalued Dependency
ED → A}.
• Denoted by, A →→ B, Means, for every value of A, there may exist more than
one value of B.
• E.g. S_name → → Club_name
S_Name Club_name
Kamesh Dance
Kamesh Guitar
S_name → → Club_name
S_name → → Club_name
S_name → → P_no S_Name Club_name P_no
Kamesh Dance 123
S_Name Club_name S_Name Club_name P_no
Kamesh Dance Kamesh Guitar 123
Kamesh Dance 123
Kamesh Guitar Kamesh Dance 789
Kamesh Guitar 123
Kamesh Guitar 789
Kamesh Dance 789
Kamesh Guitar 789
NOTE: The above Student schema is in BCNF as no functional dependency holds on EMP, but
still redundancy due to MVD.
r
• if we decompose a table r into two tables r1 and r2 because of normalization then
at some later stage if we want to join(combine) (natural join) these tables r1 and A B C
r2, then we must get back the original table r, without any extra or less tuple. 1 a p
But some information may be lost during retrieval of original relation or table. 2 b q
For e.g. 3 a r
r
r1 r2
A B C
A B B C
1 a p
1 a a p
2 b q
2 b b q
3 a r A B C
3 a a r
r1 r2
A B B C
A B C D E How to check for lossless join decomposition using FD set, following conditions must hold:
A 122 1 W A • Union of Attributes of R1 and R2 must be equal to attribute of R. Each attribute of R must be
E 236 4 X B either in R1 or in R2. Att(R1) U Att(R2) = Att(R)
A 199 1 Y C
B 213 2 Z D • Intersection of Attributes of R1 and R2 must not be NULL. Att(R1) ∩ Att(R2) ≠ Φ
• Common attribute must be a key for at least one relation (R1 or R2)
• Att(R1) ∩ Att(R2) → (R1) or Att(R1) ∩ Att(R2) → (R2)
• Let relation R be decomposed into Relations R1, R2, R3…………. RN with their respective
• A Relational table R is said to be in 5th normal form if functional Dependencies set as F1, F2, F3…………. FN, then the Decomposition is Dependency
• it is in 4 NF Preserving iff
• it cannot be further non-loss decomposed • {F1 ∪ F2 ∪ F3 ∪ F4………. ∪ FN }+ = F+
Q Suppose we have ordered file with records stored r = 30,000 on a disk with Block Size B 1. Indexes can be established on any relation field, be it primary key or
= 1024 B. File records are of fixed size and are unspanned with record length R = 100 B. non-key.
Suppose that ordering key field of file is 9 B long and a block pointer is 6 B long,
Implement primary indexing? 2. Each attribute can have a dedicated index file, meaning multiple
index files may exist for one main file.
3. Index files are always organized, allowing for the utilization of binary
search advantages, irrespective of the main file's order.
4. Indexing accelerates data retrieval time but also introduces space
overhead for storing the index file.
5. The correct block in the main file can be located with log2(number of
blocks in index file) + 1 accesses.
Index
• Main file is always sorted according to primary key.
• Indexing is done on Primary Key, therefore called as primary indexing
Single
Multilevel • Index file have two columns, first primary key and second anchor pointer (base
level
address of block)
Primary Clustering Secondary Simple • It is an example of Sparse Indexing.
B tree B+ tree
Index Index index multilevel
• Here first record (anchor record) of every block gets an entry in the index file
• No. of entries in the index file = No of blocks acquired by the main file.
• In single-level indexing, an index file is created for the main file, marking the end
of the indexing process.
• Multiple-level indexing, on the other hand, involves creating an index for the
index file and continually repeating this procedure until only a single block
remains.
Knowledge Gate Website Knowledge Gate Website
CLUSTERED INDEXING Q Suppose we have ordered file with records stored r = 30,000 on a disk with Block Size B
= 1024 B. File records are of fixed size and are unspanned with record length R = 100 B.
• Main file will be ordered on some non-key attributes Suppose that ordering key field of file is 9 B long and a block pointer is 6 B long,
Implement Secondary indexing?
• No of entries in the index file = no of unique values of the attribute on which
indexing is done.
• Means user provides both what data to be retrieved and how data to be retrieved. e.g. • What data to be retrieved e.g. Relational Calculus. Tuple relational calculus, Domain relational
Relational Algebra. calculus are declarative query languages based on mathematical logic
• Every operator in the RA accepts (one or two) relation/table as input arguments • It also does not consider duplicity by default as it is based on set theory. Same
and returns always a single relation instance as the result without a name. query is written in RA and SQL the result may be different as SQL considers
duplication.
Name Symbol
Select
(σ) Name Symbol DERIVED FROM
Project
(∏) Join (⋈) (Χ)
Union
(∪) (−)
Set difference
(−)
Intersection (∩) A∩B=A-(A-B)
Cross product
(Χ) Division (÷) (X,-, ∏)
Rename
(ρ) Assignment (=)Knowledge Gate Website
Knowledge Gate Website
• The select, project, and rename operations are called unary operations, because Relational schema - A relation schema R, denoted by R (A1, A2, ..., An), is made up
they operate on one relation. of a relation name R and a list of attributes, A1, A2, ..., An. Each attribute Ai is the
name of a role played by some domain D in the relation schema R. It is use to
• Union, Cartesian product and set difference operate on pairs of relations and describe a Relation.
are, therefore, called binary operations. E.g. Schema representation of Table Student is as –
STUDENT (NAME, ID, CITY, COUNTRY, HOBBY).
• Relational algebra also provides the framework for query optimization.
Relational Instance - Relations with its data at particular instant of time.
• Πcolumn_name (table_name)
Q Write a RELATIONAL ALGEBRA query to find the name of all customer without duplication The Select Operation (Horizontal Selection)
having bank account? • The select operation selects tuples that satisfy a given predicate/Condition p.
Πcustomer_name (depositor)
• Lowercase Greek letter sigma (σ) is used to denote selection.
Q Write a RELATIONAL ALGEBRA query to find all the details of bank branches?
• It is a unary operator.
(branch)
• Eliminates only tuples/rows.
• σ condition (table_name)
• Commutative in Nature, σp1 ^ p2(r) = σp2 ^ p1(r) = σp1( σp2(r)) = σp2( σp1(r)) The Union Operation
• It is a binary operation, denoted, as in set theory, by ∪.
• We allow comparisons using =, ≠, <, >, ≤ and ≥ in the selection predicate. • Written as, Expression1 ∪ Expression2, r ∪ s = {t | t ∈ r or t ∈ s}
• Using the connectives and (∧), or (∨), and not (¬), we can combine several predicates into a • For a union operation r ∪ s to be valid, we require that two conditions hold:
larger predicate.
• The relations r and s must be of the same arity. That is, they must have the same number
• Minimum number of tuples selected can be 0, Maximum selected tuples can be all. of attributes.
• The domains of the ith attribute of r and the ith attribute of s must be the same, for all i.
Q Write a RELATIONAL ALGEBRA query to find all the customer name who have a loan or an The Set-Difference Operation
account or both?
Πcustomer_name(depositor)) U Πcustomer_name(borrower)) • The set-difference operation, denoted by −, allows us to find tuples that are in
one relation but are not in another. It is a binary operator.
• The expression r − s produces a relation containing those tuples in r but not in s.
Q Write a RELATIONAL ALGEBRA query to find all the customer name who have a loan but do not
have an account? • We must ensure that set differences are taken between compatible relations.
Πcustomer_name(borrower)) - (Πcustomer_name(depositor)))
• For a set-difference operation r − s to be valid, we require that the relations r
and s be of the same arity, and that the domains of the ith attribute of r and the
ith attribute of s be the same, for all i.
• 0 <= ӀR - SӀ <= ӀRӀ
The Cartesian-Product Operation • R1 Χ R2 returns a relational instance whose schema contains all the fields of R1 (in
• The Cartesian-product operation, denoted by a cross (×), allows us to combine order as they appear in R1) and all fields of R2 (in order as they appear in R2).
information from any two relations.
• It is a binary operator; we write the Cartesian product of relations R1 and R2 as R1 × R2. • If R1 has m tuples and R2 has n tuples the result will be having = m*n tuples.
• Cartesian-product operation associates every tuple of R1 with every tuple of R2.
• R1 Χ R2 = {rs | r ∈ R1 and s ∈ R2}, contains one tuple <r, s> (concatenation of tuples r • Same attribute name may appear in both R1 and R2, we need to devise a naming
and s) for each pair of tuples r ϵ R1, s ϵ R2. schema to distinguish between these attributes.
R1 X R2
A R1.B R2.B C
1 P Q X
R1 R2 1 P R Y
A B B C 1 P S Z
Q X 2 Q Q X
1 P 2 Q R Y
2 Q R Y 2 Q S Z
3 R S Z 3 R Q X
3 R R Y
Knowledge3 GateRWebsite
S Z Knowledge Gate Website
Q Write a RELATIONAL ALGEBRA query to find the name of all the customers along Q Write a RELATIONAL ALGEBRA query to find the name of all the customers along
with account balance, who have an account in the bank? with account balance, who have an account in the bank?
• ρLearner(Student)
• r ∩ s = r − (r − s)
Πcustomer_name(depositor)) U Πcustomer_name(borrower))
• Set intersection is not a fundamental operation and does not add any
power to the relational algebra.
• r ∩ s = {t | t ∈ r and t ∈ s}
DIVISION
The Natural-Join Operation
*
• The natural join is a binary operation that allows us to combine certain
• In general, the DIVISION operation is applied when we have query like student
selections and a Cartesian product into one operation. who have completed both database1 and data base2 tasks.
• The natural join is associative in nature, the natural join is a Lossy operator.
R1 R2
A B B C R1 ⋈ R2
1 P Q X A B C
2 Q R Y 2 Q X
3 R S Z 3 R Y
CREATE TABLE table_name ( • list of some common data types supported by SQL along with a brief description of each:
column1 data_type [constraints],
• Numeric Data Types:
column2 data_type [constraints],
• 1. `INT`: For storing integer values.
column3 data_type [constraints], • 2. `SMALLINT`: A smaller range of integers compared to INT.
... • 3. `BIGINT`: For storing larger integers.
); • 4. `DECIMAL(p, s)`: For storing exact numerical values, where `p` is the precision and `s` is
the scale.
• 5. `FLOAT`: For storing floating-point numbers.
CREATE TABLE Students ( • 6. `REAL`: A data type that can store floating-point numbers, generally with less precision
StudentID INT PRIMARY KEY, compared to FLOAT.
FirstName VARCHAR(50),
LastName VARCHAR(50), • String Data Types:
• 7. `VARCHAR(n)`: Variable-length character string, where `n` is the maximum length.
Age INT,
• 8. `CHAR(n)`: Fixed-length character string, where `n` is the length.
Email VARCHAR(100) • 9. `TEXT`: For storing long text strings.
);
Knowledge Gate Website Knowledge Gate Website
Adding a New Column Renaming a Column CREATE TABLE Orders (
ALTER TABLE Employees
ALTER TABLE Employees RENAME COLUMN PhoneNumber TO ContactNumber;
OrderID INT PRIMARY KEY,
ADD PhoneNumber VARCHAR(15); CustomerID INT,
Renaming a table OrderDate DATE,
Dropping a Column FOREIGN KEY (CustomerID) REFERENCES Customers (CustomerID)
ALTER TABLE Employees ALTER TABLE Employees
RENAME TO Staff; );
DROP COLUMN PhoneNumber;
INSERT INTO table_name (column1, column2, column3, ...) DELETE FROM table_name
VALUES (value1, value2, value3, ...); WHERE condition;
INSERT INTO Students (StudentID, FirstName, LastName, Age) DELETE FROM Students
VALUES WHERE StudentID = 1;
(1, 'Amit', 'Sharma', 20),
(2, 'Priya', 'Gupta', 22),
(3, 'Ravi', 'Kumar', 19);
DELETE FROM table_name;
• The basic structure of an SQL query consists of three clauses: select, from, and where. The 2. SQL in general is not case sensitive i.e. it doesn’t matter whether we write query in upper or lower case.
query takes it’s input the relations listed in the from clause, operates on them as specified in
3. In practice, duplicate elimination is time-consuming. Therefore, SQL allows duplicates in relations as well as in
the where and select clauses, and then produces a relation as the result without any name the results of SQL expressions. In those cases where we want to force the elimination of duplicates, we insert
unless specified. the keyword distinct after select, will discuss in detail later. SQL allows us to use the keyword all to specify
explicitly that duplicates are not removed, Since duplicate retention is the default, we shall not use all in our
• A typical SQL query has the form. examples.
Select Clause Q Write a SQL query to find all the details of bank branches?
• The function of Select clause in SQL is more or less same as that of ‘∏’ projection in the relational
algebra. It is used to pick the column required in result of the query out of all the columns in Q Write a SQL query to find each loan number along with loan amount?
relation/table. (Vertical filtering)
• Select A1, A2,..., An (Column name) Q Write a SQL query to find the name of all customer without duplication having bank account?
• We can use ‘*’ to specify that we need all columns
• Select * Q Write a SQL query to find all account_no and balance with 6% yearly interest added to it?
• The select clause may also contain arithmetic expressions involving the operators +, -, / and *
operating on constants or attributes of tuples. however, that it does not result in any change to the
relation/table.
Q find all account_no and balance with 6% yearly interest added to it?
Select account_number, balance*1.06
from account
Q Write a SQL query to find all account_no where balance is less the 1000? Q find all account_no where balance is less the 1000?
Select account_number
Q Write a SQL query to find branch name which is situated in Delhi and having assets less than 1,00,000? from account
Where balance < 1000
Q Write a SQL query to find branch name and account_no which has balance greater than equal to 1,000 but less
than equal to 10,000? Q Write a SQL query to find branch name which is situated in Delhi and having assets less than 1,00,000?
Select branch_name
from branch
Where branch_city = ‘delhi’ and assets < 1,00,000
Q Write a SQL query to find branch name and account_no which has balance greater than equal to 1,000 but less
than equal to 10,000?
Select branch_name, account_number
from account
Where balance between 1000 and 10000
3. The intersect operation automatically eliminates duplicates. If we want to retain all duplicates,
we must write intersect all in place of intersect.
Natural Join Q Write a SQL query to find the name of all the customers along with account
• To make the life of an SQL programmer easier for this common case, SQL supports an operation called
balance, who have an account in the bank?
the natural join. The natural join operation like cartesian product operates on two relations and
produces a relation as the result.
• Natural join considers only those pairs of tuples with the same value on those attributes that appear in
the schemas of both relations. Notice that we do not repeat those attributes that appear in the
schemas of both relations; rather they appear only once.
• Notice also the order in which the attributes are listed: first the attributes common to the schemas of
both relations, second those attributes unique to the schema of the first relation, and finally, those
attributes unique to the schema of the second relation. Commutative in nature.
R1 R2
A B B C R1 ⋈ R2
1 P Q X A B C
2 Q R Y 2 Q X
3 R S Z 3 R Y
Knowledge Gate Website Knowledge Gate Website
Q Write a SQL query to find the name of all the customers along with account Outer Join
balance, who have an account in the bank? • The problem with natural join or join or inner join is only those values that appears in both
relations will manage to reach final table, but if some value is explicitly in table one or in
Select customer_name, balance second table then that information will be lost, and that will be loss of information.
From account natural join depositor
• The outer join operation works in a manner similar to the join operations we have already
studied, but preserve those tuples that would be lost in a join, by creating tuples in the result
containing null values.
• There are in fact three forms of outer join:
• The left outer join (left join) preserves tuples only in the relation named before (to the
left of) the left outer join operation.
• The right outer join (right join) preserves tuples only in the relation named after (to the
right of) the right outer join operation.
• The full outer join preserves tuples in both relations.
R1 R2 R1 * R2 R1 ⋈ R2
Alias Operation/rename
A B B C A R1.B R2.B C A B C • SQL aliases are used to give a table, or a column in a table, a temporary name. Just create a
1 P Q X 1 P Q X 2 Q X new copy but do not change anything in the data base. An alias only exists for the duration of
2 Q R Y 1 P R Y 3 R Y
3 R S Z 1 P S Z
the query.
2 Q Q X • Aliases are often used to make column names more readable.
2 Q R Y
2 Q S Z • It uses the as clause, taking the form: old-name as new-name. The as clause can appear in
3 R Q X both the select and from clauses.
3 R R Y
3 R S Z
R1 ⟕ R2 R1 ⟖R2 R1 ⟗ R2
A B C A B C A B C
1 P null 2 Q X 1 P null
2 Q X
2 Q X 3 R Y
3 R Y
3 R Y null S Z null S Z
Q Write a SQL query to find the loan_no with maximum loan amount? Aggregate Functions
Select balance • Aggregate functions are functions that take a collection (a set or multiset) of values as input
From account and return a single value. SQL offers five built-in aggregate functions:
Except • Average: avg
Select A.balance • Minimum: min
From account as A, account as B • Maximum: max
• Total: sum
Where A.balance <B.balance
• Count: count
• The input to sum and avg must be a collection of numbers, but the other operators can
operate on collections of nonnumeric data types, such as strings, as well. Count is the only
aggregate function which can work with null, all other aggregate functions simply ignore null.
• We use the aggregate function count frequently to count the number of tuples in a relation.
The notation for this function in SQL is count (*).
Select sum(balance)/count(balance)
from account
Q find all the branch name who have exactly 5 character in their name ? • We define the escape character for a like comparison using the escape keyword. To illustrate,
Select branch_name consider the following patterns, which use a backslash (\) as the escape character:
from branch
where branch_name like ‘_ _ _ _ _ _’ • like ’ab\%cd%’ escape ’\’ matches all strings beginning with “ab%cd”.
Q find all the customer name who have ‘kumar’ in their name ? • like ’ab\\cd%’ escape ’\’ matches all strings beginning with “ab\cd”.
Select customer_name
from customer
where customer_name like ‘%kumar%’
Q find the branch name of Gwalior city with average balance more than 1500? • Trigger Definition
Select branch.branch_name, avg(balance) • A trigger is a special procedure that automatically activates in response to modifications on
from branch, account a table or view in a database, aiding in maintaining data integrity.
Where branch.branch_name = account.branch_name and branch_city = ‘gwalior’
Group by branch_name • Usage of Triggers
Having avg(balance) > 1500 • Triggers help maintain database integrity by automatically enforcing conditions or
modifying data during database operations. They are crucial for applying business rules,
auditing database alterations, and supporting data replication, ensuring compliance before
any data insertions, updates, or deletions.
Q Find the loan number for each loan of amount over 1200? Q Find the name of all the customers who have a loan from Noida branch?
{t| ∃s ∈ loan (t[loan number] = s[loan number]) ∧ s[amount] > 1200] } {t| ∃s ∈ borrower (t[customer name] = s[customer name])
∧ ∃u ∈ loan (u[customer name] = s[customer name])
∧ u[branch name] = ‘Noida’ }
Q Find the name of all the customers who have a loan from the bank and do not Expressive Power of Languages
have a account?
• The tuple relational calculus restricted to safe expressions is equivalent in expressive
{t| ∃u ∈ depositor borrower (t[customer name] = u[customer name]) power to the basic relational algebra (with the operators ∪, −, ×, and , but without
the extended relational operations such as generalized projection and aggregation
∧ ¬∃s ∈ borrower (t[customer name] = s[customer name]) } (G)).
• For every relational-algebra expression using only the basic operations, there is an
equivalent expression in the tuple relational calculus, and for every tuple-relational-
calculus expression, there is an equivalent relational algebra expression.
• There are infinitely many tuples that are not in instructor. Most of these tuples • The variable range over single values from domains of attributes. To form a relation of
contain values that do not even appear in the database. We do not want to degree n for a query result, we must have n of these domain variables. One for each
allow such expressions. attribute.
• An expression of the domain calculus is of the form
• (x1, x2, …, xn | COND(x1, x2, …, xn, xn+1, xn+2, …, x n+m ))
• Where x1, x2, …, xn are domain variables that range over domains and COND is a
condition of the domain relational calculus.
Q Find the loan number for each loan of amount over 1200? Q Find the name of all the customers who have a loan from Noida branch with loan amount?
{ < c, a > | ∃l (< c, l > ∈ borrower ∧ ∃b (< l, b, a > ∈ loan ∧ b = ‘Noida’ ))}
{ < l > | ∃b, a < l, b, a > ∈ loan ∧ a > 1200] }
T1
Read(A)
A = A-100
Write(A)
Read(B)
B = B+100
Write(B)
Knowledge Gate Website Knowledge Gate Website
Desirable properties of transaction • Transactions should possess several properties, often called the ACID
properties; to provide integrity and consistency of the data in the
• Now as the smallest unit which have atomicity in DBMS view is transaction, so if database. The following are the ACID properties:
want that our data should be consistent then instead of concentrating on data
base, we must concentrate on the transaction for our data to be consistent.
T1
Read(A)
A = A-100
Write(A)
Read(B)
B = B+100
Write(B)
T1
Read(A)
A = A-100
Write(A)
Read(B)
B = B+100
Write(B)
• Isolation - A transaction should appear as though it is being executed in isolation • Durability - The changes applied to the database by a
from other transactions, even though many transactions are executing
concurrently.
committed transaction must persist in the database.
• That is, the execution of a transaction should not be interfered with by any other
transactions executing concurrently.
• Sometimes it is possible that even though individual transaction are satisfying the acid
properties even though the final statues of the system will be inconsistent.
• Conclusion of schedules
• We do not have any method to proof that a schedule is consistent, but from the
above discussion we understand that a serial schedule will always be consistent. On the basis of
SERIALIZABILITY
• So if somehow we proof that a non-serial schedule will also have same effects as of
a serial schedule than we can proof that, this particular non-serial schedule will
also be consistent.
Conflict
serializable
View
serializable
T1 T2 T1 T2
T1 T2 T1 T2 T1 T2 T1 T2
R(A) W(B) R(A) W(A) R(A) R(B)
W(B) R(A) W(A) R(A)
A=A-50 B=B+50
R(B) R(A)
T1 T2 T1 T2 T1 T2 T1 T2
B=B+50 A=A-50
R(A) R(A) W(A) W(A)
R(A) R(A) W(A) W(A) R(B) R(B)
B=B+50 B=B+50
R(A) R(A)
A=A+10 A=A+10
Knowledge Gate Website Knowledge Gate Website
• So, the instructions I and J are said to be conflicting, if they are operations by different
transactions on the same data item, and at least one of these instructions is a write operation.
• If an edge Ti → Tj exists in the precedence graph, then in any serial schedule S’ equivalent to S,
Ti must appear before Tj.
• If the precedence graph for S has no cycle, then schedule S is conflict serializable, else it is
not.
VIEW SERIALIZABLE • Two schedules S and S’ are view equivalent, if they satisfy following conditions –
• If a schedule is not conflict serializable, still it can be consistent, so let us study a weaker form • For each data item Q, if the transaction Ti reads the initial value of Q in schedule S , then then the
of serializability called View serializability, and even if a schedule is view serializable still it can transaction Ti must, in schedule S’ ,also read the initial value of Q.
be consistent. • If a transaction Ti in schedule S reads any data item Q, which is updated by transaction Tj, then a
transaction Ti must in schedule S’ also read data item Q updated by transaction Tj in schedule S’.
• If a schedule is conflict serializable then it will also be view serializable, so we must check view
serializability only if a schedule is not conflict serializable. • For each data item Q, the transaction (if any) that performs the final write(Q) operation in schedule
S, then the same transaction must also perform final write(Q) in schedule S’.
A schedule S is view serializable, if it is view equivalent to a serial schedule. On the basis of On the basis of
SERIALIZABILITY RECOVERABILITY
Conflict
Recoverable
serializable
View
Cascadeless
serializable
Strict
S1 S2 S3
T1 T2 T1 T2 T1 T2
R(a) R(a) R(a)
W(a) W(a) R(b)
W(a) C W(a)
C W(a) W(b)
R(a) R(a) C
C C R(a)
C
Log based Recovery Aspect Deferred Database Modification Immediate Database Modification
• The system log, or transaction log, records all the changes or activities happening in the Write Operation Timing Only at the commit point. As soon as changes occur.
database, ensuring that transactions are durable and can be recovered in the event of a
failure.
Reduced, as changes are batched
Increased, as each change triggers
• Different types of log records represent various stages or actions in a transaction. and written at once during the
I/O Operations immediate write operations, leading
• <Ti start>: This log entry indicates that the transaction Ti has started its execution. commit, saving on intermediate I/O
to more frequent I/O operations.
• < Ti, Xj, V1, V2>: This log entry documents a write operation. It states that transaction Ti has operations.
changed the value of data item Xj from V1 to V2.
• < Ti commit>: This log entry marks the successful completion of transaction Ti.
• < Ti abort>: This log entry denotes that the transaction Ti has been aborted, either due to
an error or a rollback operation. Simpler, as uncommitted changes More complex, as it may require
are not reflected in the database, undoing changes from uncommitted
Recovery Complexity
• Before any write operation modifies the database, a log record of that operation needs to be making rollback easier in case of transactions that have been written
created. This is to ensure that in case of a failure, the system can restore the database to a failures. to the database.
consistent state using the log records.
Knowledge Gate Website Knowledge Gate Website
Shadow Paging Recovery Technique • Commit: Upon transaction commit, the shadow page table is discarded, and the current page
table becomes the new committed state of the database. The database atomically switches to
In the shadow paging recovery technique, the database maintains two page tables during a using the current page table, ensuring that changes are installed all at once.
transaction: the current page table (reflecting the state before the transaction began) and the
shadow page table (which tracks changes made during the transaction). Here's how it operates: • Recovery: In case of a system failure before the transaction commits, the database can easily
recover by discarding the current page table and reverting back to the shadow page table,
• Initialization: When a transaction begins, the database creates a shadow copy of the page thereby restoring the database to its state before the transaction began.
table. The database pages themselves are not duplicated; only the page table entries are
duplicated.
• Modifications: As the transaction progresses, any changes are reflected in the current page
table, while the shadow page table retains the original state. If a page is modified, a new copy
of the page is created, and the current page table is updated to point to this new version,
thereby ensuring that the shadow page table still points to the original unmodified page.
• Validation based protocol – Majority of transactions are read only transactions, the rate of
conflicts among the transaction may be low, thus many of transaction, if executed without the
supervision of a concurrency control scheme, would nerveless leave the system in a consistent
Knowledge Gate Website state. Knowledge Gate Website
• Basic idea of time stamping is to decide the order between the transaction • With each transaction ti, in the system, we associate a unique fixed timestamp, denoted by
TS(ti).
before they enter in the system using a stamp (time stamp), in case of any
conflict during the execution order can be decided using the time stamp. • This timestamp is assigned by database system to a transaction at time transaction enters
into the system.
• Let’s understand how this protocol works, here we have two idea of
timestamping, one for the transaction, and other for the data item. • If a transaction has been assigned a timestamp TS(ti) and a new transaction tj , enters into
the system with a timestamp TS(tj), then always TS(ti) <TS(tj).
• These timestamps are updated whenever a new read(Q) or write(Q) instruction is executed.
2. If TS(Ti )≥ W-timestamp(Q), then the read operation is executed, and R-timestamp(Q) is 2. If TS(Ti ) < W-timestamp(Q), then Ti is attempting to write an obsolete value of Q. Hence, this
set to the maximum of R-timestamp(Q) and TS(Ti). write operation is rejected, and Ti is rolled back.
3. If TS(Ti ) ≥ R-timestamp(Q), then the write operation is executed, and W-timestamp(Q) is set to
max(W-timestamp(Q), TS(Ti )).
4. If TS(Ti ) ≥ W-timestamp(Q), then the write operation is executed, and W-timestamp(Q) is set to
max(W-timestamp(Q), TS(Ti )).
Time Stamp Ordering YES YES NO NO YES • Thomas write is an improvement in time stamping protocol, which makes some modification
and may generate those protocols that are even view serializable, because it allows greater
Thomas Write Rule potential concurrency.
Basic 2PL
• It is a Modified version of the timestamp-ordering protocol in which Blind write operations
Conservative 2PL may be ignored under certain circumstances.
Rigorous 2PL
• The protocol rules for read operations remain unchanged. while for write operation, there is
Strict 2PL slightly change in Thomas write rule than timestamp ordering protocol.
When Ti attempts to write data item Q, • This modification is valid as the any transaction with TS(Ti ) < W-timestamp(Q), the value
written by this transaction will never be read by any other transaction performing Read(Q)
ignoring such obsolete write operation is considerable.
• if TS(Ti ) < W-timestamp(Q), then Ti is attempting to write an obsolete value
of {Q}. Rather than rolling back Ti as the timestamp ordering protocol would • Thomas' Write Rule allows greater potential concurrency. Allows some view-serializable
have done, this {write} operation can be ignored. schedules that are not conflict serializable.
• In general, we support two modes of lock because, to provide better concurrency. Lock –Compatibility Matrix
• Conclusion shared is compatible only with shared while exclusive is not compatible either with
• Shared mode shared or exclusive.
• If transaction Ti has obtained a shared-mode lock (denoted by S) on any data item
• To access a data item, transaction Ti must first lock that item, if the data item is already locked
Q, then Ti can read, but cannot write Q, any other transaction can also acquire a
by another transaction in an incompatible mode, or some other transaction is already waiting
shared mode lock on the same data item(this is the reason we called this shared in non-compatible mode, then concurrency control manager will not grant the lock until all
mode). incompatible locks held by other transactions have been released. The lock is then granted.
• Exclusive mode
• If transaction Ti has obtained an exclusive-mode lock (denoted by X) on any data
item Q, then Ti can both read and write Q, any other transaction cannot acquire
either a shared or exclusive mode lock on the same data item. (this is the reason we
called this exclusive mode)
Two phase locking protocol(2PL) • Initially a transaction is in growing phase and acquires lock as needed and in between can
• The protocol ensures that each transaction issue lock and unlock requests in two phases, note perform operation reach to lock point and once a transaction releases a lock, it can issue no
that each transaction will be 2 phased not schedule. more lock requests i.e. it enters the shrinking phase.
• Growing phase- A transaction may obtain locks, but not release any locks.
T1 T2
• Shrinking phase- A transaction may release locks, but may not obtain any new locks. LOCK-X(A)
READ(A)
WRITE(A)
LOCK-S(B)
READ(B)
LOCK-X(B)
READ(B)
WRITE(B)
LOCK-S(A)
READ(A)
UNLOCK(B)
UNLOCK(A)
UNLOCK(B)
Knowledge Gate Website Knowledge Gate Website
UNLOCK(A)
Conflict View Recoverability Cascadelessness Deadlock
Serializability Serializability Freedom Properties
• 2PL ensures conflict serializability, and the ordering of transaction over lock points is itself a
Time Stamp Ordering YES YES NO NO YES serializability order of a schedule in 2PL.
Thomas Write Rule NO YES NO NO YES
• If a schedule is allowed in 2PL protocol then definitely it is always conflict serializable. But it is
Basic 2PL YES YES NO NO NO not necessary that if a schedule is conflict serializable then it will be generated by 2pl.
Conservative 2PL Equivalent serial schedule is based on the order of lock points.
IS IX S X IS IX S X
IS IS true true true false
IX IX true true false false
S S true false true false
X X false false false false
Validation-Based Protocols • Read phase. During this phase, the system executes transaction Ti. It reads the values of the
various data items and stores them in variables local to Ti. It performs all write operations on
• In cases where a majority of transactions are read-only transactions, the rate of conflicts temporary local variables, without updates of the actual database.
among transactions may be low.
• Validation phase. The validation test (described below) is applied to trans- action Ti. This
• Thus, many of these transactions, if executed without the supervision of a concurrency-control determines whether Ti is allowed to proceed to the write phase without causing a violation of
scheme, would nevertheless leave the system in a consistent state. serializability. If a transaction fails the validation test, the system aborts the transaction.
• A concurrency-control scheme imposes overhead of code execution and possible delay of • Write phase. If the validation test succeeds for transaction Ti, the temporary local variables
transactions. It may be better to use an alternative scheme that imposes less overhead. that hold the results of any write operations performed by Ti are copied to the database.
Read-only transactions omit this phase.
• A difficulty in reducing the overhead is that we do not know in advance which transactions
will be involved in a conflict. To gain that knowledge, we need a scheme for monitoring the
system.
• The validation protocol requires that each transaction Ti executes in two or three different
phases in its lifetime, depending on whether it is a read-only or an update transaction. The
phases are, in order:
Knowledge Gate Website Knowledge Gate Website