0% found this document useful (0 votes)
25 views52 pages

DBMS Course File

The document outlines the course file for the Database Management Systems course (CS3492) offered in the 4th semester of the Information Technology program. It includes the course's vision, mission, educational objectives, outcomes, specific outcomes, syllabus, and a timetable, emphasizing the importance of technical skills, ethical standards, and lifelong learning. Additionally, it lists the students enrolled in the course and provides a detailed lecture schedule for the academic year.

Uploaded by

vinotha
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
25 views52 pages

DBMS Course File

The document outlines the course file for the Database Management Systems course (CS3492) offered in the 4th semester of the Information Technology program. It includes the course's vision, mission, educational objectives, outcomes, specific outcomes, syllabus, and a timetable, emphasizing the importance of technical skills, ethical standards, and lifelong learning. Additionally, it lists the students enrolled in the course and provides a detailed lecture schedule for the academic year.

Uploaded by

vinotha
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 52

DEPARTMENT OF INFORMATION TECHNOLOGY

PREFACE OF THE COURSE FILE

Batch :

Academic Year : / EVEN

Program : INFORMATION TECHNOLOGY

Year & Semester : 2nd Year / 4th Semester

Course Code : CS3492 NBA Course Code:

Name of the Course : DATABASE MANAGEMENT SYSTEMS

Faculty in-charge : T.VINOTHA/AP

Signature of the Faculty in-charge HoD / IT


VISION
To produce globally competent Information Technologists through outcome-based education, cutting-
edge research, critical thinking, hands-on skills development, and adherence to ethical principles,
while fostering responsible citizens of India who contribute to societal and global growth.

MISSION
Outcome based Education: To impart quality education in Information Technology that meets the
evolving needs of the profession and society, ensuring that students are equipped with the necessary
skills and knowledge for successful careers in agriculture and related fields.
Innovation and Research Environment: To create an excellent innovation and research
environment that encourages students and faculty to engage in cutting-edge research, addressing
pressing challenges in information technology and contributing to sustainable development.
Networking Opportunities: To facilitate networking opportunities with alumni, industry
professionals, and research organizations, fostering collaborations that enhance learning and career
prospects for our students.
Technical Skills and Ethical Standards: To enlighten and develop the hidden technical skills and
abilities of youth while instilling the highest standards of professional ethics, teamwork, transparency,
and entrepreneurial skills, preparing them to be responsible leaders in the field of Information
Technology.

PROGRAM EDUCATIONAL OBJECTIVES (PEOs) :


PEO 1 Career Advancement in Information Technology: Graduates will establish successful
careers in the Information Technology Industry by applying cutting-edge knowledge and skills to
develop Innovative solutions that meet the evolving needs of the profession and society.
PEO 2-Research and Development in IT: Graduates will engage in ploneering research and
development activities addressing current challenges in Information Technology and contributing to
technological advancements in areas such as software engineering. cybersecurity, data science, and
emerging technologies.
PEO 3-Lifelong Learning and Professional Networking: Graduates will pursue lifelong learning
and professional development, actively engaging with alumni, Industry professionals, and research
organizations to foster collaboration, gain exposure to real-world practices, and enhance career
advancement opportunities.
PEO 4- Ethical Leadership and Technical Excellence: Graduates will demonstrate high standards
of professional ethics, teamwork, and transparency, leveraging their technical skills and
entrepreneurial abilities to lead responsibly in the field of Information Technology, fostering
Innovation while adhering to ethical principles.

PROGRAMME OUTCOMES (POs):


On successful completion of the programme,
1. Engineering knowledge: Graduates will demonstrate knowledge of mathematics, science and
engineering.
2. Problem Analysis: Graduates will demonstrate an ability to identify, formulate and solve
engineering problems.
3. Design/development of solutions: Design solutions for complex engineering problems and design
system components or processes that meet the specified needs with appropriate consideration for the
public health and safety and the cultural, societal and environmental consideration.
4. Conduct investigation of complex problems: Graduate will demonstrate an ability to design and
conduct experiments, analyze and interpret data, and synthesis of the information to provide valid
conclusions.
5. Modern tool usage: create, select and apply appropriate technique, recourses and modern
engineering and IT tools including prediction and modeling to complex engineering activities with an
understanding of the limitations.
6. The engineer and society: apply reasoning informed by the contextual knowledge to assess
societal, health, safety, legal and cultural issues and consequent responsibilities relevant to the
professional engineering practice.
7. Environment and sustainability: understand the impact of the professional engineering solutions
in societal and environmental contexts and demonstrate the knowledge of, and need for sustainable
development
8. Ethics: apply ethical principles and commit to professional ethics and responsibilities and norms of
the engineering practice.
9. Individual and team work: function effectively as an individual and as a member or leader in
diverse teams and in multidisciplinary settings.
10. Communication: communicate effectively on complex engineering activities with the
engineering community and with society at large, such as being able to comprehend and write
effective reports and design documentation, make effective presentations, and give and receive clear
instructions
11. Project management and finance: demonstrate knowledge and understanding of the engineering
and management principles and apply these to one’s own work, as a member and leader in a team, to
manage projects and in multidisciplinary environments.
12. Life-long learning: recognize the need for, and have the preparation and ability to engage in
independent and life-long learning in the broadest context of technological change.

PROGRAM SPECIFIC OUTCOMES (PSOS):


PSO 1- Development and Management of IT Solutions: Graduates will demonstrate the ability to
design, develop, and manage innovative IT solutions that meet the evolving needs of businesses and
organizations. They will be proficient in utilizing modern software development methodologies,
frameworks, and tools to create scalable and efficient applications.
PSO 2 Cyber security and Risk Management Competence: Graduates will possess a strong
understanding of cyber security principles and practices, enabling them to assess and mitigate risks in
information systems. They will be capable of implementing security protocols and strategies to
protect sensitive data and ensure the integrity, confidentiality, and availability of information
resources

STUDENTS NAME LIST


Sl.No Register Number Name of the Student
.
1 811223205001 Aruldhas A
2 811223205002 Arunkumar A
3 811223205003 Aswin S
4 811223205004 Balaji S
5 811223205006 Bennyhinn Meshanth S
6 811223205008 Dhanush R
7 811223205010 Dhavakirshanan S
8 811223205011 Eniya R
9 811223205012 Eswaran C
10 811223205014 Gokul S
11 811223205015 Gokulamani M
12 811223205016 Hariharan E
13 811223205017 Jedan J
14 811223205018 Jegatheeswaran M
15 811223205019 Jenily Christy J
16 811223205022 Kathirvel V
17 811223205023 Kaviya G
18 811223205024 Kilson Christopher A
19 811223205025 Krishna khumaran T.U
20 811223205026 Lenin N
21 811223205027 S.Madheshwaran
22 811223205028 Madhumitha M
23 811223205029 Mavitha K
24 811223205030 Mayuri B
25 811223205031 Mohamed Jameer Basha A
26 811223205032 Nisha M
27 811223205033 Nivash P
28 811223205034 Pandiyarajan M
29 811223205035 Paul S
30 811223205036 Praveen Kumar S
31 811223205037 Premkumar E
32 811223205039 Priyadharshini K
33 811223205040 Rajeswari S
34 811223205042 Reethika S
35 811223205043 Sanjay R
36 811223205044 Sanjay P
37 811223205045 Sanjay P
38 811223205047 Saravanan D
39 811223205048 Shalika Shahana M
40 811223205049 Shobana R
41 811223205050 Sneka V
42 811223205051 Sowmiya S
43 811223205052 Sowmiya S
44 811223205053 Sudarvizhi S
45 811223205054 Sudhar singh M
46 811223205055 Udhayam U
47 811223205056 Varshini S
48 811223205057 Vetrivel P
49 811223205059 Vigneshwarar A
50 811223205060 Vinitha S
51 811223205061 Vishalini R
52 811223205062 Viviliya joicy A
53 811223205063 Yogeshwaran P

Signature of the Faculty in-charge HoD / IT

COURSE TIMETABLE
INDRA GANESAN COLLEGE OF ENGINEERING
IG Valley, Manikandam, Tiruchirappalli, Tamil Nadu – 620 012, India
(Approved by AICTE, New Delhi, Affiliated to Anna University, Chennai-25)

DEPARTMENT OF INFORMATION TECHNOLOGY

Faculty Time Table

Day Order 1 2 3 4 5 6 7 8

I DBMS LAB
II DBMS

III
DBMS
IV DBMS
V DBMS

S. Code Title Year / Branch Hours


CS3492 II/ IT 5
TOTAL - 13 hours

Signature of the Faculty in-charge HoD / IT


SYLLABUS

CS3492 DATABASE MANAGEMENT SYSTEMS

COURSE OBJECTIVES:
 To learn the fundamentals of data models, relational algebra and SQL
 To represent a database system using ER diagrams and to learn normalization
techniques
 To understand the fundamental concepts of transaction, concurrency and recovery
processing
 To understand the internal storage structures using different file and indexing
techniques which will help in physical DB design
 To have an introductory knowledge about the Distributed databases, NOSQL and
database security

UNIT I RELATIONAL DATABASES 10
Purpose of Database System – Views of data – Data Models – Database System
Architecture – Introduction to relational databases – Relational Model – Keys – Relational
Algebra – SQL fundamentals – Advanced SQL features – Embedded SQL– Dynamic SQL

UNIT II DATABASE DESIGN 8


Entity-Relationship model – E-R Diagrams – Enhanced-ER Model – ER-to-Relational
Mapping – Functional Dependencies – Non-loss Decomposition – First, Second, Third
Normal Forms, Dependency Preservation – Boyce/Codd Normal Form – Multi-valued
Dependencies and Fourth Normal Form – Join Dependencies and Fifth Normal Form

UNIT III TRANSACTIONS 9


Transaction Concepts – ACID Properties – Schedules – Serializability – Transaction support
in SQL – Need for Concurrency – Concurrency control –Two Phase Locking- Timestamp –
Multiversion – Validation and Snapshot isolation– Multiple Granularity locking – Deadlock
Handling – Recovery Concepts – Recovery based on deferred and immediate update –
Shadow paging – ARIES Algorithm

UNIT IV IMPLEMENTATION TECHNIQUES 9


RAID – File Organization – Organization of Records in Files – Data dictionary Storage –
Column Oriented Storage– Indexing and Hashing –Ordered Indices – B+ tree Index Files –
B tree Index Files – Static Hashing – Dynamic Hashing – Query Processing Overview –
Algorithms for Selection, Sorting and join operations – Query optimization using Heuristics -
Cost Estimation.

UNIT VADVANCED TOPICS 9

Distributed Databases: Architecture, Data Storage, Transaction Processing, Query


processing and optimization – NOSQL Databases: Introduction – CAP Theorem – Document
Based systems – Key value Stores – Column Based Systems – Graph Databases. Database
Security: Security issues – Access control based on privileges – Role Based access control
– SQL Injection – Statistical Database security – Flow control – Encryption and Public Key
infrastructures – Challenges

TOTAL: (L=+P=)

COURSE OUTCOMES:
 CO1: Construct SQL Queries using relational algebra
 CO2: Design database using ER model and normalize the database
 CO3: Construct queries to handle transaction processing and maintain consistency
of the database
 CO4: Compare and contrast various indexing strategies and apply the knowledge to
 tune the performance of the database
 CO5: Appraise how advanced databases differ from Relational Databases and find a
suitable database for the given requirement.
TOTAL:45 PERIODS

TEXT BOOKS:
1. Abraham Silberschatz, Henry F. Korth, S. Sudharshan, “Database System Concepts”,
Seventh Edition, McGraw Hill, 2020.
2. Ramez Elmasri, Shamkant B. Navathe, “Fundamentals of Database Systems”, Seventh
Edition, Pearson Education, 2017

REFERENCES:
1. C.J.Date, A.Kannan, S.Swamynathan, “An Introduction to Database Systems”, Eighth
Edition, Pearson Education, 2006

COs- PO’s & PSO’s MAPPING


CO’s PO’s PSO’s
1 2 3 4 5 6 7 8 9 10 11 12 1 2 3
1 2 2 3 2 1 - - - 2 1 1 1 2 1 3
2 3 1 1 1 1 - - - 2 3 3 3 3 1 2
3 3 2 3 2 1 - - - 2 1 1 2 2 3 3
4 1 2 3 2 - - - - 3 2 3 3 1 2 3
5 1 1 3 3 2 - - - 1 3 3 1 2 2 2
AVg. 2 2 3 2 1 - - - 2 2 2 2 2 2 3

Signature of the Faculty in-charge HoD / IT

Lecture Schedule
Degree/Program: B. Tech Information Technology
Course code &Name:CS3492 Data Base Management systems

Duration: 2024-2025 Even Semester: IV Faculty: T.vinotha

WEEK 1: RELATIONAL DATABASES(


 Topics: Purpose of Database System – Views of data – Data Models – Database System
 . Views of data – Data Models
 Architecture
 WEEK 2- UNIT I

 SQL fundamentals – Advanced SQL features – Embedded SQL– Dynamic SQL

 Introduction to relational databases – Relational Model – Keys – Relational Algebra.

 Topics:
1. Relational Model – Keys.
 WEEK 3

 Relational Algebra. SQL fundamentals


1. Drawing of tangents and normals to the above curves.
Advanced SQL features Embedded SQL– Dynamic SQL

WEEK 4- UNIT II –Topics: DATABASE DESIGN(


 Entity-Relationship model – E-R Diagrams
 Enhanced-ER Model – ER-to-Relational Mapping –
 Functional Dependencies – Non-loss Decomposition.
WEEK 5
 First, Second, Third Normal Forms, Dependency Preservation
 Boyce/Codd Normal Form – Multi-valued Dependencies
 Fourth Normal Form – Join Dependencies and Fifth Normal Form
WEEK 6: UNIT III –TRANSACTIONS (
 Topics:
 Transaction Concepts – ACID Properties – Schedules.
 Serializability – Transaction support in SQL – Need for Concurrency – Concurrency control.
 Two Phase Locking- Timestamp – Multiversion – Validation and Snapshot isolation
.
WEEK 7
 Multiple Granularity locking – Deadlock Handling – Recovery Concepts
 Recovery based on deferred and immediate update
 Shadow paging – ARIES Algorithm

WEEK 8-9: UNIT IV IMPLEMENTATION TECHNIQUES (6 Hours)


 Topics:
 RAID – File Organization – Organization of Records in Files.
 Data dictionary Storage – Column Oriented Storage– Indexing and Hashing
 Ordered Indices – B+ tree Index Files – B tree Index Files
 Static Hashing – Dynamic Hashing – Query Processing Overview
 Algorithms for Selection, Sorting and join operations
 Query optimization using Heuristics - Cost Estimation.

WEEK 10-11: UNIT V – ADVANCED TOPICS (6+ Hours)


 Topics:
 Distributed Databases: Architecture, Data Storage,
 Transaction Processing, Query processing and optimization
 NOSQL Databases: Introduction – CAP Theorem
 Document Based systems – Key value Stores – Column Based Systems
 Graph Databases. Database Security: Security issues – Access control based on privileges.
 Role Based access control – SQL Injection –
 Statistical Database security – Flow control
 Encryption and Public Key infrastructures – Challenges

Signature of the Faculty in-charge HoD / IT

CIA I QUESTION PAPER

Register
Number:

INDRA GANESAN COLLEGE OF


ENGINEERING
(AN AUTONOMOUS INSTITUTION)
IG Valley, Manikandam, Tiruchirappalli, Tamil Nadu – 620 012, India
(Approved by AICTE, New Delhi and affiliated to Anna University,
Chennai)
Continuous Internal Assessment -I Date 05.04.2025 Marks 100
Course code CS3492 Course Title DATABASE MANAGEMENT SYSTEMS
Academic
Regulation 2024 Duration 3 hrs 2024-25
Year
Year II Semester II Department IT
COURSE OUTCOMES
Q.No. Question CO BTS
PART A
(Answer all the Questions 5 x 20 = 100 Marks)
1.Define database management system and its applications.
A DBMS is software that allows users to define, create, maintain, and control access to databases.
Applications include banking, airlines, universities, telecommunications, and online shopping.
2.What is the purpose of a database management system?
To efficiently store, retrieve, and manage data in databases while ensuring data integrity, security,
and concurrency.
3.Name the composite attributes.
Composite attributes consist of multiple components. Example: Name can be split into FirstName
and LastName.
4 4.List the basic operations of relational algebra.

 Selection (σ)
 Projection (π)
 Union (∪)
 Set difference (−)
 Cartesian product (×)
 Rename (ρ)

44
5.Define specialization and multivalued attributes.

Specialization: Process of defining a set of subclasses of an entity type.Multivalued


Attribute: An attribute that can have multiple values (e.g., phone numbers).
6.Differentiate generalization and aggregation.

Generalization: Combining similar entities into a generalized entity.


Aggregation: Abstracting a relationship set as a higher-level entity
List the constraints involved in creating in ER diagram.

 Key constraints
 Participation constraints
 Cardinality constraints
 Referential integrity constraints
8.What is schema?
A schema is the logical structure of the database, describing tables, relationships, and
constraints.
9.Tell the ACID properties.

 Atomicity
 Consistency
 Isolation
 Durability

10.Outline the states of Transaction.

 Active → Partially Committed → Committed


 Or: Active → Failed → Aborted → Restart

P PART B – (Answer any 5 × 13 = 65 Marks)

11a. Explain DBMS Architecture with an example.

 Three-level architecture:
o Internal level (physical)
o Conceptual level (logical)
o External level (view)
 Example: University DB where students and professors view different interfaces.

OR

11b. Illustrate the various types of SQL commands. Discuss the features of
SQL.

 DDL (CREATE, ALTER, DROP)


 DML (SELECT, INSERT, UPDATE, DELETE)
 DCL (GRANT, REVOKE)
 TCL (COMMIT, ROLLBACK)
 Features: Easy syntax, supports joins, views, transactions.

12a. Define relational algebra. Explain various relational algebraic operations


with example.

 Selection (σ): σ(age>25)(STUDENT)


 Projection (π): π(name)(STUDENT)
 Join, Union, Set Difference, Rename

OR
12b.

i) Find the different types of keys used in DBMS.

 Super Key, Candidate Key, Primary Key, Foreign Key, Composite Key, Alternate
Key.

ii) Summarize Normalization. Explain 1NF, 2NF, 3NF, BCNF with examples.

 1NF: Atomic values


 2NF: Full functional dependency
 3NF: No transitive dependency
 BCNF: Determinant must be a candidate key

13a. Demonstrate an ER model by taking Hospital/Banking/University


Database and explain all the relationship sets with cardinality.

 Entities: Doctor, Patient, Treatment


 Relationships: Treats (1:N), Admits (1:1)
 Show cardinalities and keys.

OR
13b. Extend EER Model with a neat sketch for School Database.

 Entities: Student, Teacher, Course


 ISA: Student ISA UG_Student, PG_Student
 Show attributes and specialization.

14a. Show suitable example, the constraints of specialization and


generalization in ER data modelling.

 Generalization: Vehicle → {Car, Bike}


 Specialization constraints:
o Disjoint vs Overlapping
o Total vs Partial

OR

14b. Build the ACID properties. Explain Transactions with SQL support for a
banking application.

 Example: Transfer funds from Account A to B

sql
CopyEdit
BEGIN;
UPDATE account SET balance = balance - 1000 WHERE acc_id = 'A';
UPDATE account SET balance = balance + 1000 WHERE acc_id = 'B';
COMMIT;

 ACID discussed with this transaction.

: PART C – (1 × 15 = 15 Marks)

15a. Utilize and explain in detail about serializability with suitable transaction
examples.

 Conflict Serializability: Reordering of instructions


 Example with two transactions accessing same data item
 Precedence Graph to determine serializability

OR
15b.Construct an E-R diagram for a car insurance company whose customers own
one or more cars each. Each car has associated with it zero to any number of
recorded accidents.

 Entities: Customer, Car, Accident


 Relationships:
o Owns (Customer–Car) – 1:N
o Involved_in (Car–Accident) – 0:N
 Keys and cardinalities shown clearly
 Attributes:
o Customer (CID, Name, Phone)
o Car (RegNo, Model)
o Accident (AID, Date, Location

Signature of the Faculty in-charge HoD / IT


STUDENTS MARK STATEMENT - CIA I
24GE201, ENGINEERING GRAPHICS
YEAR/SEM: I/II MONTH & YEAR: April & 2025
CO1 CO2 C03
S.NO REG NO STUDENT NAME TOTAL (100)
(40) (40) (20)
8112232050
1. Aruldhas A
01

2. 811223205002 Arunkumar A
3. 811223205003 Aswin S
4. 811223205004 Balaji S
5. 811223205006 Bennyhinn Meshanth S
6. 811223205008 Dhanush R
7. 811223205010 Dhavakirshanan S
8. 811223205011 Eniya R

9. 811223205012 Eswaran C

10 811223205014 Gokul S

11 811223205015 Gokulamani M
12 811223205016 Hariharan E
13 811223205017 Jedan J
14 811223205018 Jegatheeswaran M
15 811223205019 Jenily Christy J
16 811223205022 Kathirvel V
17 811223205023 Kaviya G
18 811223205024 Kilson Christopher A
19 811223205025 Krishna khumaran T.U
20 811223205026 Lenin N
21 811223205027 S.Madheshwaran
22 811223205028 Madhumitha M
23 811223205029 Mavitha K
24 811223205030 Mayuri B
25 811223205031 Mohamed Jameer Basha A
26 811223205032 Nisha M
27 811223205033 Nivash P
28 811223205034 Pandiyarajan M
29 811223205035 Paul S
30 811223205036 Praveen Kumar S
31 811223205037 Premkumar E
32 811223205039 Priyadharshini K
33 811223205040 Rajeswari S
34 811223205042 Reethika S
35 811223205043 Sanjay R
36 811223205044 Sanjay P
37 811223205045 Sanjay P
38 811223205047 Saravanan D
39 811223205048 Shalika Shahana M
40 811223205049 Shobana R
41 811223205050 Sneka V
42 811223205051 Sowmiya S
43 811223205052 Sowmiya S
44 811223205053 Sudarvizhi S
45 811223205054 Sudhar singh M
46 811223205055 Udhayam U
47 811223205056 Varshini S
48 811223205057 Vetrivel P
49 811223205059 Vigneshwarar A
50 811223205060 Vinitha S
51 811223205061 Vishalini R
52 811223205062 Viviliya joicy A
53 811223205063 Yogeshwaran P

MARKS RANGE:
<20 20-30 31-40 41-50 51-60 61-70 71-80 81-90 91-100

Total No.of Candidates Present 42


Total No.of Candidates Absent 15
Total No.of Students Pass 11
Total No. of Students Fail 31
Percentage of Pass 26.19

Signature of the Faculty in-charge HoD / IT

CIA II QUESTION PAPER

Register
Number:
INDRA GANESAN COLLEGE OF
ENGINEERING
(AN AUTONOMOUS INSTITUTION)
IG Valley, Manikandam, Tiruchirappalli, Tamil Nadu – 620 012, India
(Approved by AICTE, New Delhi and affiliated to Anna University,
Chennai)
Continuous Internal Assessment –II/Set 2 Date 30.04.2025 Marks 100
Course code Course Title Data Base Management Systems
Academic
Regulation 2024 Duration 3 hrs 2024-25
Year
Year I Semester II Department IT
Q.No. Question CO RBT
PART A
(Answer all the Questions
Part A (Answer all the Questions 10 x 2 = 20 Marks)
1. Define Atomicity and Consistency of ACID properties.

Atomicity: An atomic transaction is an indivisible unit of work. Either all


operations within the transaction are completed successfully, or none of them are. If
any part of the transaction fails, the entire transaction is rolled back, leaving the
database in its state before the transaction began. This is often referred to as "all or
nothing."

Consistency: A consistent transaction brings the database from one valid state to
another valid state. It ensures that any data written to the database must be valid
according to all defined rules, constraints, and triggers (e.g., primary key
constraints, foreign key constraints, check constraints). If a transaction violates any
of these rules, it is aborted.

2. Explain ACID properties in transaction management.

 ACID is an acronym for a set of properties that guarantee reliable processing of


database transactions:

Atomicity: (Explained above) All or nothing.

Consistency: (Explained above) Ensures valid data.

Isolation: Ensures that concurrent transactions do not interfere with each other.
Each transaction appears to execute in isolation, as if it were the only
transaction running. This prevents phenomena like dirty reads, unrepeatable
reads, and phantom reads.

Durability: Guarantees that once a transaction has been committed, its changes
are permanent and will survive any subsequent system failures (e.g., power
outages, crashes). This is typically achieved by writing committed changes to
non-volatile storage (like disk) and using logging.
o .3. What are the challenges associated with concurrency in databases?
 Lost Update Problem: When two transactions read the same data, and both try to
update it, one update might overwrite the other, leading to data loss.
 Dirty Read (Uncommitted Dependency): A transaction reads data that has been
written by another transaction that has not yet committed. If the second transaction
aborts, the first transaction will have read invalid data.
 Unrepeatable Read: A transaction reads a data item twice, and the item's value
changes between the two reads because another committed transaction updated it.

4.List the commands used for transaction control in SQL.

 COMMIT: Saves all changes made during the current transaction permanently to the
database.
 ROLLBACK: Undoes all changes made during the current transaction, restoring the
database to its state before the transaction began.
 SAVEPOINT: Sets a point within a transaction to which you can later roll back. This
allows for partial rollbacks within a transaction.
 SET TRANSACTION: Establishes the characteristics for the current transaction (e.g.,
isolation level, read-write mode).

5. Illustrate the isolation in transaction management.

 Isolation ensures that concurrent transactions behave as if they are executing


sequentially. The goal is to prevent the problems mentioned in Q3 (lost updates,
dirty reads, unrepeatable reads, phantom reads).

6.List the primary key and foreign key.

 Primary Key: A column or a set of columns in a table that uniquely identifies each
row in that table.
o Characteristics: Must contain unique values, cannot contain NULL values.
A table can have only one primary key.
 Foreign Key: A column or a set of columns in one table (the "referencing table")
that refers to the primary key or a unique key in another table (the "referenced
table").
o Purpose: Establishes a link between two tables, enforcing referential
integrity. Ensures that relationships between tables are maintained

7. Define an Entity, Entity Type, and Entity Set in DBMS.

 Entity: A "thing" or object in the real world that is distinguishable from other
objects. It represents something about which data is stored. Examples: a specific
student (e.g., "John Doe"), a specific course (e.g., "Database Management").
 Entity Type: A collection of entities that share common characteristics or
attributes. It describes the structure of a group of similar entities. Examples:
"Student" (representing all students), "Course" (representing all courses).

8 What are the challenges associated with concurrency in databases?


Unrepeatable Read: A transaction reads a data item twice, and the item's value changes
between the two reads because another committed transaction updated it.
 Phantom Read: A transaction reads a set of rows satisfying a WHERE clause. When the
transaction reads the set again, new rows appear (phantoms) that were inserted by another
committed transaction.

 Incorrect Summary Problem: When a transaction calculates an aggregate function


(e.g., sum, average) over a set of data, and other transactions are updating the data
concurrently, the aggregate result might be incorrect.
9. Classify is concurrency control important in database management systems?

 Yes, concurrency control is very important in database management systems for


several reasons:
o Data Integrity: Prevents anomalies like lost updates, dirty reads,
unrepeatable reads, and phantom reads, ensuring that the data remains
consistent and correct even with simultaneous access.
o Consistency: Guarantees that transactions move the database from one
consistent state to another, upholding defined business rules and constraints.

10. Define Relational Algebra and Relational Calculus.

 Relational Algebra:
o Definition: A procedural query language that consists of a set of operations
that take one or two relations (tables) as input and produce a new relation as
output. It specifies how to obtain the desired result.
o Operations: Common operations include select (σ), project (π), union (∪),
intersection (∩), set difference (−), Cartesian product (×), join (⋈), and
rename (ρ).

Part B (Answer all the Questions 5 x 13 = 65 Marks)


11a. Explain the concept of RAID. Discuss the different types of RAID levels and their
advantages and disadvantages.

 RAID (Redundant Array of Independent Disks):


o Concept: A data storage virtualization technology that combines multiple
physical disk drive components into one or more logical units for the
purposes of data redundancy, performance improvement, or both. Data is
distributed across the drives in various ways called "RAID levels."
 Different Types of RAID Levels:

1. RAID 0 (Stripping):
 Concept: Data is split into blocks and written across multiple drives
in a "striped" fashion. No redundancy.
 Advantages: Highest performance (both read and write) due to
parallel access. All disk space is usable.
 Disadvantages: No fault tolerance. If one drive fails, all data is lost.
 Use Case: Applications where performance is critical and data loss
is acceptable (e.g., temporary files, video editing scratch disks).
2. RAID 1 (Mirroring):
 Concept: Data is duplicated (mirrored) across two or more drives.
For every block written to one drive, an identical copy is written to
another.
 Advantages: Excellent fault tolerance (can tolerate N-1 drive
failures in an N-drive mirror set). Fast read performance (can read
from either mirror).
 Disadvantages: High cost per GB, as only 50% of the raw capacity
is usable.
 Use Case: Critical data where high availability and reliability are
paramount (e.g., operating system drives, transactional databases).
3. RAID 5 (Stripping with Distributed Parity):
 Concept: Data is striped across multiple drives, and parity
information (used for recovery) is distributed among all drives.
Requires at least 3 drives.
 Advantages: Good balance of performance, capacity, and fault
tolerance (can tolerate one drive failure). More cost-effective than
RAID 1 for large capacities.
 Disadvantages: Write performance can be slower due to parity
calculation and write. Recovery from a drive failure can be slow and
I/O intensive.
 Use Case: General-purpose servers, file servers, web servers.
4. RAID 6 (Stripping with Dual Parity):
 Concept: Similar to RAID 5 but includes two independent parity
blocks distributed across the drives. Requires at least 4 drives.
 Advantages: Higher fault tolerance than RAID 5 (can tolerate two
simultaneous drive failures).
 Disadvantages: Slower write performance than RAID 5 due to
calculating two parity blocks. Higher cost than RAID 5 due to extra
parity drive equivalent.
 Use Case: Mission-critical applications requiring very high data
availability, especially with large-capacity drives where rebuild
times are long and a second failure during rebuild is a concern.
5. RAID 10 (RAID 1+0 - Stripped Mirrors):
 Concept: A nested RAID level combining RAID 1 (mirroring) and
RAID 0 (stripping). Data is mirrored in pairs, and then these
mirrored pairs are striped. Requires at least 4 drives.
 Advantages: Excellent performance (both read and write) due to
stripping and fast recovery due to mirroring. High fault tolerance
(can lose one drive in each mirrored set).
 Disadvantages: High cost per GB, similar to RAID 1.
 Use Case: High-performance, highly available database servers,
intensive I/O applications

OR
11b. Examine the Query Processing Steps in DBMS. Determine the Concept and
Operations of Views in SQL.

 Query Processing Steps in DBMS: (This is largely a repeat of Q15a from the first
image, but I'll re-summarize for completeness in this section.)
1. Parsing and Translation:
 Parser: Checks the query for correct syntax and semantics (e.g.,
valid table/column names).
 Translator: Converts the valid query into an internal representation,
often a relational algebra tree.
2. Optimization:
 The query optimizer takes the internal representation and generates
multiple alternative execution plans.
 It uses database statistics (e.g., number of rows, index availability,
data distribution) and cost models to estimate the cost (CPU, I/O,
network) of each plan.
 The goal is to find the plan with the lowest estimated cost. This
involves techniques like:
 Reordering operations (e.g., pushing selections down).
 Choosing join algorithms (e.g., nested-loop, hash join, sort-
merge join).
 Selecting appropriate indexes.
3. Code Generation:
 The chosen optimal execution plan is translated into a sequence of
low-level instructions (executable code) that the database engine can
directly execute.
4. Execution:
 The generated code is executed by the query execution engine.
 This involves interacting with the storage manager to retrieve data
blocks, performing necessary operations (filtering, joining, sorting,
aggregation), and returning the final result to the user.
 Concept and Operations of Views in SQL:

o Concept of Views:
 A view in SQL is a virtual table based on the result-set of a SQL
query. It does not store data itself; instead, its content is derived
dynamically from the underlying base tables whenever the view is
queried.
 Views provide a way to:
 Simplify Complex Queries: By encapsulating complex
JOIN operations or aggregations.
 Enhance Security: By restricting user access to specific
rows and columns of underlying tables, rather than granting
access to the entire table.
 Customize Data Presentation: Allowing different users to
see the same data in different ways.
 Provide Data Independence: If the underlying base table
schema changes, the view definition can be modified to keep
the view consistent, without affecting applications that use
the view.
o Operations on Views in SQL:
 CREATE VIEW: Used to define a new view.

SQL

CREATE VIEW Sales_Summary AS


SELECT ProductID, SUM(Quantity) AS TotalQuantity
FROM OrderDetails
GROUP BY ProductID;
 SELECT from a View: You can query a view just like a regular table.

SQL

SELECT * FROM Sales_Summary WHERE TotalQuantity > 1000;

 ALTER VIEW (or CREATE OR REPLACE VIEW): Used to modify an


existing view definition.

SQL

CREATE OR REPLACE VIEW Sales_Summary AS


SELECT ProductID, SUM(Quantity) AS TotalQuantity,
COUNT(OrderID) AS NumberOfOrders
FROM OrderDetails
GROUP BY ProductID;

DROP VIEW: Used to delete a view.

SQL

DROP VIEW Sales_Summary;

 INSERT, UPDATE, DELETE on Views:


 While views are not physical tables, some views are
"updatable," meaning you can perform INSERT, UPDATE, and
DELETE operations on them.
 Updatable View Conditions: Generally, a view is updatable
if it:

Is based on a single base table.

Does not contain DISTINCT, GROUP BY, HAVING, or aggregate functions.

Does not contain subqueries in the SELECT list.

Does not use set operators (UNION, INTERSECT, EXCEPT).

All non-nullable columns from the base table are included in the view (or have default
values).

If a view is updatable, the changes made to the view are automatically propagated to the
underlying base table. OR
12a. Evaluate the key differences, and when would you use each type (of clustered and non-
clustered indexes).

 Clustered Index:
o Concept: A clustered index determines the physical order of data storage in
a table. The data rows themselves are stored in the order of the clustered
index key. A table can have only one clustered index.
o Key Differences:
 Data is physically ordered according to the index.
 Faster for range queries and retrieving large sets of ordered data.
 The leaf nodes of a clustered index are the data pages.
o When to Use: Ideal for columns frequently used in ORDER BY, GROUP BY,
JOIN clauses, or range queries, especially on primary keys. Good for
columns with unique or highly selective values.
 Non-Clustered Index:
o Concept: A non-clustered index creates a separate structure that contains
the index key and a pointer to the actual data row in the table (either a row
ID or the clustered index key if one exists). The physical order of data is not
affected. A table can have multiple non-clustered indexes.
o Key Differences:
 Data is not physically ordered according to the index.
 Slower than clustered for range queries that require scanning large
portions of the table.
 The leaf nodes of a non-clustered index contain pointers to the data
rows.
o When to Use: Suitable for columns frequently used in WHERE clauses for
specific lookups, or on columns that are part of frequently accessed foreign
keys. Use when you need to quickly find individual rows based on specific
column values.

Evaluate the different sorting algorithms used in query processing. Discuss the trade-
offs between different sorting techniques in terms of time complexity and space
requirements.

 Sorting Algorithms in Query Processing: Databases often use external sorting


algorithms because data typically exceeds available main memory. Common
techniques include:
1. External Merge Sort:
 How it works: Divides the data into smaller sorted runs that fit in
memory, writes them to disk, and then repeatedly merges these runs
until a single sorted run is produced.
 Time Complexity: Generally O(NlogN) where N is the number of
records, but the constant factor can be high due to disk I/O. Number
of passes is O(logBN) where B is the number of blocks that can be
merged at once.
 Space Requirements: Requires disk space for intermediate runs.
The amount of main memory available affects the size of the initial
runs and thus the number of merge passes.
 Trade-offs: Robust for large datasets, relatively simple to
implement. Can be I/O bound.
2. Hybrid Sort-Merge (e.g., using quicksort/heapsort for initial runs):
 How it works: Uses an efficient in-memory sorting algorithm (like
quicksort or heapsort) to create the initial sorted runs, then proceeds
with the merge phase as in external merge sort.
 Time Complexity: Similar to external merge sort for the overall
process, but the initial sort phase is optimized for in-memory
performance.
 Space Requirements: Similar to external merge sort.
 Trade-offs: Leverages the efficiency of internal sorting for the
initial runs, potentially reducing the total number of merge passes
compared to pure merge sort if initial run creation is optimized.
3. Block-based Sort (e.g., using multi-way merge sort with larger blocks):
 How it works: Optimizes I/O by reading and writing larger blocks
of data at a time, reducing the number of disk accesses.
 Time Complexity: Improves the constant factor of O(NlogN) by
reducing I/O operations.
 Space Requirements: Requires larger buffer pool space for blocks.
 Trade-offs: Highly dependent on buffer pool size and efficient I/O
management.
 Trade-offs (General for External Sorting):

o Time Complexity (Dominant Factor): The primary concern is minimizing


disk I/O. Algorithms are designed to reduce the number of passes over the
data.
o Space Requirements: The amount of main memory available directly
impacts the size of the initial sorted runs and the number of merge passes,
which in turn affects the total time. More memory generally means fewer
I/O operations and faster sorting.
o Buffer Pool Management: Efficient use of the buffer pool to cache data
blocks is crucial for performance.
o Disk Characteristics: Sequential I/O is generally faster than random I/O,
and external sort algorithms are designed to maximize sequential access.

12b. Elaborate the structure and benefits of ordered indices in databases. How do
ordered indices improve query performance?

 Structure of Ordered Indexes:


o Ordered indexes store key-pointer pairs in a sorted sequence. The most
common data structures for ordered indexes are B-trees and B+ trees.
o B-Tree: A self-balancing tree data structure that maintains sorted data and
allows searches, sequential access, insertions, and deletions in logarithmic
time. Internal nodes can hold both keys and data pointers.
o B+ Tree: A variation of a B-tree where all data pointers (or records
themselves) are stored only at the leaf nodes. All leaf nodes are linked
together in a sequential manner, forming a sorted linked list. Internal nodes
only contain keys and pointers to child nodes.
o Key Characteristics:
 Sorted Order: Entries are kept in sorted order based on the index
key.
 Tree Structure: Organizes data hierarchically, typically as a
balanced tree, to allow efficient searching.
 Nodes: Consist of pages (blocks) on disk.
 Pointers: Contain pointers to actual data records (in non-clustered
indexes) or are the data records themselves (in clustered indexes).
 Benefits of Ordered Indexes:

Fast Data Retrieval (Equality and Range Queries):


1.
 For equality searches (e.g., WHERE id = 123), the tree structure
allows for quick traversal to the exact location of the record in
O(logN) time.
 For range queries (e.g., WHERE salary BETWEEN 50000 AND
70000), the sorted nature of the index (especially B+ trees with
linked leaf nodes) allows efficient scanning of the relevant range
without reading the entire table.
2. Efficient Sorting and Grouping: When queries involve ORDER BY or
GROUP BY clauses on the indexed columns, the database can often use the
pre-sorted index, avoiding the need for a separate sort operation, which can
be very expensive.
3. Faster Joins: Indexes on join columns (especially foreign keys) can
significantly speed up join operations by allowing the database to quickly
find matching rows in the joined tables.
4. Enforcing Uniqueness: Unique indexes prevent duplicate values in the
indexed column(s), which is crucial for primary keys and other unique
constraints.
5. Reduced I/O Operations: By providing a direct path to the data, indexes
reduce the number of disk I/O operations required to retrieve data compared
to a full table scan.
 How Ordered Indexes Improve Query Performance:

o Direct Access: Instead of scanning every row in a table (full table scan), an
ordered index allows the database to go directly to the relevant data page(s)
by traversing the tree structure. This is analogous to using an index in a
book.
o Reduced Disk I/O: Since disk I/O is typically the slowest operation in
database systems, minimizing it is crucial. Indexes achieve this by reducing
the amount of data that needs to be read from disk.
o Pre-sorted Data: For queries requiring sorted results, the database can
often read data directly from the sorted index, eliminating the need for an
explicit sort operation, which can be computationally intensive and require
temporary disk space.
o Faster Join Processing: When joining tables on indexed columns, the
database can use index lookups to quickly find matching records, rather than
performing nested loops or hash joins on unsorted data.

OR
13a. Explain the concept of RAID. Discuss the different types of RAID levels and their advantages
and disadvantages..

 RAID (Redundant Array of Independent Disks):


o Concept: A data storage virtualization technology that combines multiple
physical disk drive components into one or more logical units for the
purposes of data redundancy, performance improvement, or both. Data is
distributed across the drives in various ways called "RAID levels."
 Different Types of RAID Levels:

1. RAID 0 (Stripping):
 Concept: Data is split into blocks and written across multiple drives
in a "striped" fashion. No redundancy.
 Advantages: Highest performance (both read and write) due to
parallel access. All disk space is usable.
 Disadvantages: No fault tolerance. If one drive fails, all data is lost.
 Use Case: Applications where performance is critical and data loss
is acceptable (e.g., temporary files, video editing scratch disks).
2. RAID 1 (Mirroring):
 Concept: Data is duplicated (mirrored) across two or more drives.
For every block written to one drive, an identical copy is written to
another.
 Advantages: Excellent fault tolerance (can tolerate N-1 drive
failures in an N-drive mirror set). Fast read performance (can read
from either mirror).
 Disadvantages: High cost per GB, as only 50% of the raw capacity
is usable.
 Use Case: Critical data where high availability and reliability are
paramount (e.g., operating system drives, transactional databases).
3. RAID 5 (Stripping with Distributed Parity):
 Concept: Data is striped across multiple drives, and parity
information (used for recovery) is distributed among all drives.
Requires at least 3 drives.
 Advantages: Good balance of performance, capacity, and fault
tolerance (can tolerate one drive failure). More cost-effective than
RAID 1 for large capacities.
 Disadvantages: Write performance can be slower due to parity
calculation and write. Recovery from a drive failure can be slow and
I/O intensive.
 Use Case: General-purpose servers, file servers, web servers.
4. RAID 6 (Stripping with Dual Parity):
 Concept: Similar to RAID 5 but includes two independent parity
blocks distributed across the drives. Requires at least 4 drives.
 Advantages: Higher fault tolerance than RAID 5 (can tolerate two
simultaneous drive failures).
 Disadvantages: Slower write performance than RAID 5 due to
calculating two parity blocks. Higher cost than RAID 5 due to extra
parity drive equivalent.
 Use Case: Mission-critical applications requiring very high data
availability, especially with large-capacity drives where rebuild
times are long and a second failure during rebuild is a concern.
5. RAID 10 (RAID 1+0 - Stripped Mirrors):
 Concept: A nested RAID level combining RAID 1 (mirroring) and
RAID 0 (stripping). Data is mirrored in pairs, and then these
mirrored pairs are striped. Requires at least 4 drives.

13b. List out the different methods of indexing (e.g., B-tree, B+ tree, bitmap indexing).

 Different Methods of Indexing:


1. Ordered Indexes:
 B-tree Indexes:
 Concept: A self-balancing tree structure where each node
can have multiple children. Data can be stored in both
internal and leaf nodes. Good for exact match and range
queries.
 Usage: General-purpose index for most data types and query
patterns.
 B+ tree Indexes:
 Concept: A variant of B-tree where all data pointers are
stored only at the leaf nodes, and leaf nodes are linked
sequentially. Internal nodes only store keys for navigation.
 Usage: Most commonly used index structure in modern
relational databases due to its efficiency for both random and
sequential access, and good handling of disk I/O.
 Clustered Indexes: (As discussed, defines physical order of data,
typically implemented with a B+ tree.)
 Non-Clustered Indexes: (As discussed, separate structure with
pointers to data, typically implemented with a B+ tree.)
2. Hash Indexes:
 Concept: Uses a hash function to compute the address of a data
record directly from its key. Data is not stored in sorted order.
 Usage: Extremely fast for equality searches (WHERE key = value).
Not suitable for range queries or ordered retrieval. Less common as
a primary indexing method than B-trees due to lack of support for
ranges.
3. Bitmap Indexes:
 Concept: Stores data as a collection of bitmaps (bit arrays). Each
distinct value in the indexed column has a bitmap, where a '1' at a
particular position indicates that the corresponding row has that
value, and a '0' indicates it does not.
 Usage: Highly effective for columns with a low cardinality (few
distinct values), such as gender, marital_status, boolean_flags.
Excellent for complex queries with multiple AND or OR conditions, as
bitmaps can be efficiently combined using bitwise operations.
 Disadvantages: Inefficient for high cardinality columns or columns
with frequent updates.
4. Full-Text Indexes:
 Concept: Designed for efficient searching of text within large text
columns (e.g., VARCHAR(MAX), TEXT). It parses the text, creates a list
of words, and stores their locations.
 Usage: For keyword searches, natural language queries, and
proximity searches on textual data.
5. Spatial Indexes:
 Concept: Optimized for geographic or geometric data (points, lines,
polygons). Uses structures like R-trees to efficiently query spatial
relationships (e.g., "find all points within this area").
 Usage: GIS applications, location-based services.
6. Function-Based (Expression) Indexes:
 Concept: An index created on an expression or a function rather
than directly on a column (e.g., INDEX ON (UPPER(LastName))).
 Usage: Improves performance for queries where the WHERE clause
involves a function or expression on a column.

OR
14a. Distinguish query optimization using heuristics. How do heuristic-based query
optimization techniques improve query performance, and what are their limitations?

 Distinguishing Query Optimization using Heuristics:


o Heuristic-based optimization relies on a set of predefined rules or
"common sense" strategies to transform an initial query plan into a more
efficient one, without explicitly calculating the cost of different plans. It
aims to achieve "good enough" performance rather than strictly optimal
performance.
o In contrast, cost-based optimization estimates the cost of various execution
plans using statistical information about the data and the database system,
and then chooses the plan with the lowest estimated cost.
 How Heuristic-Based Query Optimization Techniques Improve Query
Performance:

1. Predicate Pushdown (Selection Pushdown): Moving SELECT operations as


early as possible in the query execution plan (before joins or other
expensive operations). This reduces the number of rows processed in
subsequent steps.
 Example: SELECT * FROM Orders JOIN Customers ON
Orders.CustomerID = Customers.CustomerID WHERE
Orders.OrderDate > '2025-01-01'. Heuristic will apply the
WHERE clause filter on Orders before joining with Customers.
2. Projection Pushdown: Moving PROJECT operations as early as possible to
eliminate unnecessary columns. This reduces the amount of data transferred
and processed.
 Example: SELECT ProductName FROM Products JOIN
Categories ON Products.CategoryID =
Categories.CategoryID WHERE CategoryName =
'Electronics'. Heuristic will project only ProductName and
CategoryID from Products before the join, rather than all columns.
3. Combining Select and Project Operations: Merging multiple SELECT and
PROJECT operations into a single scan to minimize passes over the data.
4. Using Indexes for Selections: If an index exists on a column used in a
WHERE clause, the optimizer will try to use the index to speed up data
retrieval.
5. Performing Joins on Smaller Relations First: In a multi-way join, a
heuristic might suggest joining smaller tables first to reduce the size of
intermediate results.
6. Commutativity and Associativity of Operations: Rearranging the order of
joins or other binary operations.
 Limitations of Heuristic-Based Optimization:

Suboptimal Plans: Since heuristics don't use actual data statistics, they might not always
find the truly optimal execution plan, especially for complex queries or highly skewed data
distributions.

Lack of Flexibility: The rules are fixed and may not adapt well to different data
characteristics or system configurations.

Ignores Data Distribution: Heuristics don't consider the actual distribution of data (e.g., if
a column has many duplicate values or a few unique ones), which can significantly impact
the efficiency of an execution plan.

Cannot Handle All Cases: Some complex queries or scenarios might not have a clear
heuristic rule, leading to less efficient plans.

No Cost Estimation: Without cost estimation, it's impossible to compare different plans
quantitatively or to make informed decisions about resource allocation
OR
14b. Discuss the tradeoffs between different sorting techniques in terms of time complexity and
space requirements.

 Sorting Algorithms in Query Processing: Databases often use external sorting


algorithms because data typically exceeds available main memory. Common techniques
include:

1. External Merge Sort:


o How it works: Divides the data into smaller sorted runs that fit in memory,
writes them to disk, and then repeatedly merges these runs until a single
sorted run is produced.
o Time Complexity: Generally O(NlogN) where N is the number of records,
but the constant factor can be high due to disk I/O. Number of passes is
O(logBN) where B is the number of blocks that can be merged at once.
o Space Requirements: Requires disk space for intermediate runs. The
amount of main memory available affects the size of the initial runs and thus
the number of merge passes.
o Trade-offs: Robust for large datasets, relatively simple to implement. Can
be I/O bound.
2. Hybrid Sort-Merge (e.g., using quicksort/heapsort for initial runs):
o How it works: Uses an efficient in-memory sorting algorithm (like
quicksort or heapsort) to create the initial sorted runs, then proceeds with the
merge phase as in external merge sort.
o Time Complexity: Similar to external merge sort for the overall process,
but the initial sort phase is optimized for in-memory performance.
o Space Requirements: Similar to external merge sort.
o Trade-offs: Leverages the efficiency of internal sorting for the initial runs,
potentially reducing the total number of merge passes compared to pure
merge sort if initial run creation is optimized.
3. Block-based Sort (e.g., using multi-way merge sort with larger blocks):
o How it works: Optimizes I/O by reading and writing larger blocks of data
at a time, reducing the number of disk accesses.
o Time Complexity: Improves the constant factor of O(NlogN) by reducing
I/O operations.
o Space Requirements: Requires larger buffer pool space for blocks.
o Trade-offs: Highly dependent on buffer pool size and efficient I/O
management.

OR
15a. Determine an overview of query processing in a database management system,
the steps involved in query execution and optimization.

 Overview of Query Processing: Query processing is the process of transforming a


user's query into an efficient execution plan that can retrieve the desired data from
the database. The goal is to minimize the time and resources required to answer the
query.
 Steps Involved in Query Execution:
1. Parsing and Translation: The query is parsed to check for syntax errors
and translated into an internal representation (e.g., a relational algebra tree).
2. Optimization: The query optimizer analyzes the query and generates
multiple possible execution plans, evaluating their costs (based on I/O,
CPU, network usage) to choose the most efficient one. This involves
techniques like join order optimization, selection pushdown, and projection
pushdown.
3. Code Generation: The chosen execution plan is translated into executable
code that the database system can run.
4. Execution: The generated code is executed, accessing and processing data
from the database to produce the final result.
 Query Optimization: Query optimization is the phase where the database system
tries to find the most efficient way to execute a given query. It involves:

o Heuristic-based Optimization: Uses a set of rules (heuristics) to reorder


operations in a query to improve performance (e.g., performing selections
before joins).
o Cost-based Optimization: Estimates the cost of different execution plans
using statistics about the data (e.g., number of rows, index availability) and
chooses the plan with the lowest estimated cost.

15b. Classify the different types of file organization. Explain the factors that influence
the choice of file organization.

 Types of File Organization:


1. Heap (Unordered) File Organization: Records are stored in the order they
are inserted, without any specific order. Simple to implement but inefficient
for searching.
2. Sequential File Organization: Records are stored in a specific order based
on a key field. Efficient for batch processing and sequential access, but
insertions/deletions can be costly.
3. Indexed Sequential File Organization (ISAM): Combines sequential
organization with an index to allow both sequential and direct access. Good
for both types of access but can have performance issues with frequent
insertions/deletions.
4. Hash File Organization: Records are stored based on a hash function
applied to a key field, distributing records across disk blocks. Provides very
fast direct access if the hash function is good, but range queries are difficult.
5. B-tree/B+ tree File Organization: Data is stored in a tree structure where
internal nodes hold keys and pointers, and leaf nodes contain data or
pointers to data. Excellent for both sequential and direct access, and handles
dynamic insertions/deletions well. B+ trees are generally preferred for
database indexes.

Part C(Answer all the Questions 1 x 15 = 15 Marks)


6a(i). Explain the concept of clustered and non-clustered indexes. What are the key
differences, and when would you use each type?

 Clustered Index:
o Concept: A clustered index determines the physical order of data storage in
a table. The data rows themselves are stored in the order of the clustered
index key. A table can have only one clustered index.
o Key Differences:
 Data is physically ordered according to the index.
 Faster for range queries and retrieving large sets of ordered data.
 The leaf nodes of a clustered index are the data pages.
o When to Use: Ideal for columns frequently used in ORDER BY, GROUP BY,
JOIN clauses, or range queries, especially on primary keys. Good for
columns with unique or highly selective values.
 Non-Clustered Index:
o Concept: A non-clustered index creates a separate structure that contains
the index key and a pointer to the actual data row in the table (either a row
ID or the clustered index key if one exists). The physical order of data is not
affected. A table can have multiple non-clustered indexes.
o Key Differences:
 Data is not physically ordered according to the index.
 Slower than clustered for range queries that require scanning large
portions of the table.
 The leaf nodes of a non-clustered index contain pointers to the data
rows.
o When to Use: Suitable for columns frequently used in WHERE clauses for
specific lookups, or on columns that are part of frequently accessed foreign
keys. Use when you need to quickly find individual rows based on specific
column values.

OR
16b(ii). Evaluate the different sorting algorithms used in query processing. Discuss
the trade-offs between different sorting techniques in terms of time complexity and
space requirements.

 Sorting Algorithms in Query Processing: Databases often use external sorting


algorithms because data typically exceeds available main memory. Common
techniques include:
1. External Merge Sort:
 How it works: Divides the data into smaller sorted runs that fit in
memory, writes them to disk, and then repeatedly merges these runs
until a single sorted run is produced.
 Time Complexity: Generally O(NlogN) where N is the number of
records, but the constant factor can be high due to disk I/O. Number
of passes is O(logBN) where B is the number of blocks that can be
merged at once.
 Space Requirements: Requires disk space for intermediate runs.
The amount of main memory available affects the size of the initial
runs and thus the number of merge passes.
 Trade-offs: Robust for large datasets, relatively simple to
implement. Can be I/O bound.
2. Hybrid Sort-Merge (e.g., using quicksort/heapsort for initial runs):
 How it works: Uses an efficient in-memory sorting algorithm (like
quicksort or heapsort) to create the initial sorted runs, then proceeds
with the merge phase as in external merge sort.
 Time Complexity: Similar to external merge sort for the overall
process, but the initial sort phase is optimized for in-memory
performance.
 Space Requirements: Similar to external merge sort.
 Trade-offs: Leverages the efficiency of internal sorting for the
initial runs, potentially reducing the total number of merge passes
compared to pure merge sort if initial run creation is optimized.
3. Block-based Sort (e.g., using multi-way merge sort with larger blocks):
 How it works: Optimizes I/O by reading and writing larger blocks
of data at a time, reducing the number of disk accesses.
 Time Complexity: Improves the constant factor of O(NlogN) by
reducing I/O operations.
 Space Requirements: Requires larger buffer pool space for blocks.
 Trade-offs: Highly dependent on buffer pool size and efficient I/O
management.
 Trade-offs (General for External Sorting):

o Time Complexity (Dominant Factor): The primary concern is minimizing


disk I/O. Algorithms are designed to reduce the number of passes over the
data.
o Space Requirements: The amount of main memory available directly
impacts the size of the initial sorted runs and the number of merge passes,
which in turn affects the total time. More memory generally means fewer
I/O operations and faster sorting.
o Buffer Pool Management: Efficient use of the buffer pool to cache data
blocks is crucial for performance.
o Disk Characteristics: Sequential I/O is generally faster than random I/O,
and external sort algorithms are designed to maximize sequential access.

STUDENTS' MARK STATEMENT - CIA II


24GE201, ENGINEERING GRAPHICS
YEAR/SEM: I/II MONTH & YEAR: May & 2025
CO3 CO4 C05
S.NO REG NO STUDENT NAME TOTAL (100)
(20) (40) (40)
8112232050
1. Aruldhas A
01

2. 811223205002 Arunkumar A
3. 811223205003 Aswin S
4. 811223205004 Balaji S
5. 811223205006 Bennyhinn Meshanth S
6. 811223205008 Dhanush R
7. 811223205010 Dhavakirshanan S
8. 811223205011 Eniya R

9. 811223205012 Eswaran C

10 811223205014 Gokul S

11 811223205015 Gokulamani M
12 811223205016 Hariharan E
13 811223205017 Jedan J
14 811223205018 Jegatheeswaran M
15 811223205019 Jenily Christy J
16 811223205022 Kathirvel V
17 811223205023 Kaviya G
18 811223205024 Kilson Christopher A
19 811223205025 Krishna khumaran T.U
20 811223205026 Lenin N
21 811223205027 S.Madheshwaran
22 811223205028 Madhumitha M
23 811223205029 Mavitha K
24 811223205030 Mayuri B
25 811223205031 Mohamed Jameer Basha A
26 811223205032 Nisha M
27 811223205033 Nivash P
28 811223205034 Pandiyarajan M
29 811223205035 Paul S
30 811223205036 Praveen Kumar S
31 811223205037 Premkumar E
32 811223205039 Priyadharshini K
33 811223205040 Rajeswari S
34 811223205042 Reethika S
35 811223205043 Sanjay R
36 811223205044 Sanjay P
37 811223205045 Sanjay P
38 811223205047 Saravanan D
39 811223205048 Shalika Shahana M
40 811223205049 Shobana R
41 811223205050 Sneka V
42 811223205051 Sowmiya S
43 811223205052 Sowmiya S
44 811223205053 Sudarvizhi S
45 811223205054 Sudhar singh M
46 811223205055 Udhayam U
47 811223205056 Varshini S
48 811223205057 Vetrivel P
49 811223205059 Vigneshwarar A
50 811223205060 Vinitha S
51 811223205061 Vishalini R
52 811223205062 Viviliya joicy A
53 811223205063 Yogeshwaran P

MARKS RANGE:
<20 20-30 31-40 41-50 51-60 61-70 71-80 81-90 91-100
3 4 4 5 3 7 9 8 0
Total No.of Candidates Present 43
Total No.of Candidates Absent 14
Total No.of Students Pass 28
Total No. of Students Fail 15
Percentage of Pass 65.11

Signature of the Faculty in-charge H


Assignment Question Paper-1
Assignment – 01 Date of Issue: 09.03.2025 Marks 10
Course code CS3492 Course Title Data Base Management Systems
th
Year II Semester/Section 4 Year II
Q.N Questions CO
o
1 Purpose of Database System Views of data Data Models Database CO1, CO2 & CO3
System Architecture
2. Introduction to relational databases – Relational Model

Signature of the Faculty in-charge HoD / IT


STUDENTS MARK STATEMENT - ASSIGNMENT 1
Course code CS3492 Course Title Data Base Management Systems
YEAR/SEM: I/II MONTH & YEAR: March & 2025
Assignment
S.NO REG NO STUDENT NAME 1
(10)
1. 811223205001 Aruldhas A

2. 811223205002 Arunkumar A
3. 811223205003 Aswin S
4. 811223205004 Balaji S
5. 811223205006 Bennyhinn Meshanth S
6. 811223205008 Dhanush R
7. 811223205010 Dhavakirshanan S
8. 811223205011 Eniya R

9. 811223205012 Eswaran C

10 811223205014 Gokul S

11 811223205015 Gokulamani M
12 811223205016 Hariharan E
13 811223205017 Jedan J
14 811223205018 Jegatheeswaran M
15 811223205019 Jenily Christy J
16 811223205022 Kathirvel V
17 811223205023 Kaviya G
18 811223205024 Kilson Christopher A
19 811223205025 Krishna khumaran T.U
20 811223205026 Lenin N
21 811223205027 S.Madheshwaran
22 811223205028 Madhumitha M
23 811223205029 Mavitha K
24 811223205030 Mayuri B
25 811223205031 Mohamed Jameer Basha A
26 811223205032 Nisha M
27 811223205033 Nivash P
28 811223205034 Pandiyarajan M
29 811223205035 Paul S
30 811223205036 Praveen Kumar S
31 811223205037 Premkumar E
32 811223205039 Priyadharshini K
33 811223205040 Rajeswari S
34 811223205042 Reethika S
35 811223205043 Sanjay R
36 811223205044 Sanjay P
37 811223205045 Sanjay P
Assignment Question Paper-2
Assignment – 02 Date of Issue: 09.04.2025 Marks 20
Course code CS3492 Course Title Data Base Management Systems
th
Year II Semester/Section 4 Date of Submission: 12.04.2025
Q.N Questions CO
o
1 SQL – Need for Concurrency – Concurrency control
CO3, CO4 & CO5
2. File Organization – Organization of Records in Files

Signature of the Faculty in-charge HoD / IT


Rubrics for Assignments 1 & 2
Needs
Excellent (4 Average (2
Criteria Good (3 Marks) Improvement (1
Marks) Marks)
Mark)

All 11 models
Completion of All 9–10 models 7–8 models Less than 7 models
completed
Models completed completed completed
accurately

Shapes are perfectly Several shapes


Geometric Minor flaws in a Many models are
proportional and lack correct
Accuracy few shapes incorrect or unclear
correct geometry

All models labeled


Labeling & Most models Some labels Poor or missing
correctly and
Identification labeled missing or unclear labels
clearly

Average finish
Craftsmanship & Very neat, clean, Generally neat with Untidy, poorly
with noticeable
Neatness and well-assembled minor imperfections constructed models
flaws

Highly creative, Basic presentation,


Creativity & Good use of Lacks creativity or
attractive use of minimal
Presentation materials and colors effort
materials decoration

Signature of the Faculty in-charge HoD / IT

STUDENTS MARK STATEMENT - ASSIGNMENT 1I


CS3492 DATABASE MANAGEMENT SYSTEMS
YEAR/SEM: II/4th sem MONTH & YEAR: March & 2025
Assignment
S.NO REG NO STUDENT NAME 1
(20)
1 811223205001 Aruldhas A

2 811223205002 Arunkumar A
3 811223205003 Aswin S
4 811223205004 Balaji S
5 811223205006 Bennyhinn Meshanth S
6 811223205008 Dhanush R
7 811223205010 Dhavakirshanan S
8 811223205011 Eniya R

9 811223205012 Eswaran C

10 811223205014 Gokul S

11 811223205015 Gokulamani M
12 811223205016 Hariharan E
13 811223205017 Jedan J
14 811223205018 Jegatheeswaran M
15 811223205019 Jenily Christy J
16 811223205022 Kathirvel V
17 811223205023 Kaviya G
18 811223205024 Kilson Christopher A
19 811223205025 Krishna khumaran T.U
Si 20 811223205026 Lenin N
21 811223205027 S.Madheshwaran
22 811223205028 Madhumitha M
23 811223205029 Mavitha K
24 811223205030 Mayuri B
25 811223205031 Mohamed Jameer Basha A
26 811223205032 Nisha M
27 811223205033 Nivash P
28 811223205034 Pandiyarajan M
29 811223205035 Paul S
30 811223205036 Praveen Kumar S
31 811223205037 Premkumar E
32 811223205039 Priyadharshini K
33 811223205040 Rajeswari S
34 811223205042 Reethika S
35 811223205043 Sanjay R
36 811223205044 Sanjay P
37 811223205045 Sanjay P
Signature of the Faculty in-charge HoD / IT

You might also like