CS614 Quiz

The document consists of multiple-choice questions related to database normalization, data warehousing, data mining, and software development models. Key concepts include the definitions and characteristics of normal forms, de-normalization techniques, and the importance of understanding user needs in data warehouse implementation. The correct answers and explanations for each question are provided, emphasizing the principles of data management and analysis.

Uploaded by

sajidaruqaya760

2025 ) Total Marks: 1

The relation R will be in 3rd Normal Form if:

Select the correct option

It is in 1NF and each non key attribute is dependent upon a single column of composite primary key.

It is in 2NF and every non-key column is non-transitively dependent upon its primary key.

It is in 2NF and each cell contains single value.

It is in 2NF and each non key attribute is dependent upon other non key attribute.

The correct option is:

✅ It is in 2NF and every non-key column is non-transitively dependent upon its primary
key.

Explanation:

 1NF (First Normal Form): No repeating groups, atomic (single) values in each cell.
 2NF (Second Normal Form): Must be in 1NF, and all non-key attributes are fully
functionally dependent on the entire primary key.
 3NF (Third Normal Form): Must be in 2NF, and no transitive dependency exists
between non-key attributes and the primary key.
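
As a rough illustration of the 3NF rule above, the following Python sketch checks a functional dependency on sample rows and then splits out the transitively dependent column. The relation, its column names (`emp_id`, `dept_id`, `dept_name`) and the helper `holds` are hypothetical, and a dependency that holds in sample data is only evidence for, not proof of, a schema-level dependency.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in these rows."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        val = tuple(row[c] for c in rhs)
        # setdefault stores val the first time key is seen, else returns the stored value
        if seen.setdefault(key, val) != val:
            return False
    return True

# Hypothetical relation with primary key emp_id.
# dept_name depends on dept_id (a non-key column), so
# emp_id -> dept_id -> dept_name is a transitive dependency.
employees = [
    {"emp_id": 1, "dept_id": 10, "dept_name": "Sales"},
    {"emp_id": 2, "dept_id": 10, "dept_name": "Sales"},
    {"emp_id": 3, "dept_id": 20, "dept_name": "Finance"},
]

transitive = holds(employees, ["dept_id"], ["dept_name"])

# 3NF decomposition: dept_name moves to its own relation keyed by dept_id.
emp = [{"emp_id": r["emp_id"], "dept_id": r["dept_id"]} for r in employees]
dept = {r["dept_id"]: r["dept_name"] for r in employees}
```

After the split, each department name is stored once in `dept` instead of once per employee.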

2025 ) Total Marks: 1


“Header size is reduced, allowing more rows per block, thus reducing I/O”. The above
statement is TRUE with respect to:

Select the correct option

Adding redundant column

Horizontal splitting

Vertical splitting

None of the given options

The correct option is:


✅ Vertical splitting
Explanation:
 Vertical splitting divides a table into smaller tables with fewer columns.
 This reduces the size of each row, and hence, the header size (metadata about
columns) is also smaller.
 As a result, more rows can fit into a block, which reduces I/O because fewer blocks
need to be read.
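
The arithmetic behind this can be sketched in Python. The block size, row widths, and row count below are assumed values chosen for illustration, and the sketch ignores per-block and per-row overhead.

```python
BLOCK_SIZE = 8192  # bytes per block (assumed)

def rows_per_block(row_bytes, block_size=BLOCK_SIZE):
    """Whole rows that fit in one block, ignoring block overhead."""
    return block_size // row_bytes

def blocks_to_scan(num_rows, row_bytes, block_size=BLOCK_SIZE):
    """Blocks read by a full scan of num_rows rows."""
    per_block = rows_per_block(row_bytes, block_size)
    return -(-num_rows // per_block)  # ceiling division

rows = 1_000_000
full_row = 400  # bytes per row with all columns (assumed)
hot_row = 80    # bytes per row after keeping only frequently used columns (assumed)

full = blocks_to_scan(rows, full_row)  # 20 rows per block
hot = blocks_to_scan(rows, hot_row)    # 102 rows per block
```

Under these assumptions a scan of the narrow table reads 9,804 blocks instead of 50,000.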

2025 ) Total Marks: 1


Which of the following is not a technique of ‘De-Normalization’?

Select the correct option

Pre-joining

Adding Redundant Columns

ER Modeling

Splitting Tables

The correct option is:

✅ ER Modeling

Explanation:

 ER Modeling (Entity-Relationship Modeling) is a design technique used in the
early stages of database design to model data and relationships, typically before
normalization.
 The other options — Pre-joining, Adding Redundant Columns, and Splitting
Tables — are actual de-normalization techniques used to improve query
performance by reducing the number of joins or optimizing data access.
2025 ) Total Marks: 1
Suppose the amount of data recorded in an organization is doubled every year. This increase
is __________ .

Select the correct option

Linear

Logarithmic

Exponential

Quadratic

The correct option is:

✅ Exponential

Explanation:

 When data doubles every year, it follows an exponential growth pattern, not linear
or quadratic.
 Exponential growth means the data increases by a constant multiple over equal time
intervals (e.g., doubling each year).
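
A minimal Python sketch makes the contrast concrete. The starting volume of 1 unit and the linear step of 1 unit per year are assumptions for illustration.

```python
def doubled(initial, years):
    """Volume after `years` of doubling every year (exponential growth)."""
    return initial * 2 ** years

def linear(initial, years, step):
    """Volume after `years` of adding a fixed `step` every year (linear growth)."""
    return initial + step * years

# (year, exponential volume, linear volume) for the first six years
table = [(y, doubled(1, y), linear(1, y, 1)) for y in range(6)]
```

After 10 years the doubling series has reached 1,024 units while the linear series has reached only 11.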


5 ) Total Marks: 1
In the decision support environment, the decision maker is interested in _______________.

Select the correct option

Big picture of organizational data

Only sale related data

Only customer related data

Only limited organizational data

The correct option is:


✅ Big picture of organizational data

Explanation:

 In a decision support environment, the goal is to help decision makers analyze
comprehensive and integrated data from across the organization.
 This enables strategic planning, trend analysis, and informed decision-making—
not just focusing on a narrow subset like sales or customer data.

2025 ) Total Marks: 1


One of the fundamental purposes of de-normalization is to _____________ the number of
physical tables, which ultimately reduces the number of joins needed to answer a query.

Select the correct option

Delete

Decrease

Share

Increase

The correct option is:

✅ Decrease

Explanation:

One of the main goals of de-normalization is to decrease the number of physical
tables that must be accessed, by merging tables or duplicating data, which reduces
the number of joins needed during query execution.

The trade-off is larger tables and added redundancy, which is acceptable for
read-heavy workloads such as decision support.
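
Pre-joining can be sketched with Python's built-in `sqlite3` module. The `customer` and `orders` tables and their sample rows are hypothetical. The point is only that the second query reads one table and needs no join.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER, amount REAL);
INSERT INTO customer VALUES (1, 'Lahore'), (2, 'Karachi');
INSERT INTO orders VALUES (100, 1, 50.0), (101, 1, 25.0), (102, 2, 40.0);

-- Pre-joined (de-normalized) table: city is stored redundantly on every order
CREATE TABLE orders_prejoined AS
  SELECT o.order_id, o.cust_id, c.city, o.amount
  FROM orders o JOIN customer c ON c.cust_id = o.cust_id;
""")

# Normalized design: the query must join two tables.
with_join = con.execute("""
  SELECT c.city, SUM(o.amount)
  FROM orders o JOIN customer c ON c.cust_id = o.cust_id
  GROUP BY c.city ORDER BY c.city
""").fetchall()

# De-normalized design: the same answer comes from a single table.
without_join = con.execute("""
  SELECT city, SUM(amount)
  FROM orders_prejoined
  GROUP BY city ORDER BY city
""").fetchall()
```

Both queries return the same totals per city. The cost is that `city` must be kept in sync in two places.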


2025 ) Total Marks: 1


Which one is a characteristic of data warehouse queries?

Select the correct option

Return very few rows

Use multiple tables

High selectivity

Use primary keys

The correct option is:

✅ Use multiple tables

Explanation:

 Data warehouse queries are typically complex and analytical in nature.


 They often involve joining multiple tables (facts and dimensions) to gather insights
across various aspects of the business.
 Unlike OLTP systems, they are not focused on single rows or high selectivity, and
primary keys are less relevant in complex aggregations or trend analysis.

2025 ) Total Marks: 1


Which one among the following is not an advantage of horizontal splitting?

Select the correct option

Increase I/O Overhead

Enhance security

Fast data retrieval

Organize tables for different queries

The correct option is:

✅ Increase I/O Overhead

Explanation:
 Horizontal splitting divides a table into multiple tables with the same columns but
fewer rows, often based on row criteria (e.g., region or time).
 This generally improves performance, enhances security, and organizes data
better for specific queries.
 However, increasing I/O overhead is not an advantage — it's actually a potential
disadvantage if the system has to scan multiple partitions unnecessarily.
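
A small Python sketch of horizontal splitting follows. The `sales` rows, the `region` split key, and the helper `split_horizontally` are hypothetical. Each partition keeps every column but holds only the rows for one region.

```python
from collections import defaultdict

def split_horizontally(rows, key):
    """Group rows into partitions by the value of `key`; columns are unchanged."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[key]].append(row)
    return dict(parts)

sales = [
    {"id": 1, "region": "North", "amount": 10},
    {"id": 2, "region": "South", "amount": 20},
    {"id": 3, "region": "North", "amount": 30},
]

partitions = split_horizontally(sales, "region")

# A query about one region reads only that region's partition.
north_total = sum(r["amount"] for r in partitions["North"])
```

A query restricted to the North region scans 2 rows here instead of all 3. A query spanning all regions still has to visit every partition.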


2025 ) Total Marks: 1


Telecommunication data warehouse is dominated by the _____________ volume of data
generated at the call level.

Select the correct option

Incomplete

Partial

Semi-complete

Sheer

The correct option is:

✅ Sheer

Explanation:

 The term "sheer volume" is commonly used to emphasize the massive amount of
data generated, especially in industries like telecommunications.
 At the call level, telecom companies generate huge volumes of data (call detail
records, durations, locations, etc.), making "sheer volume" the correct descriptor.

2025 ) Total Marks: 1


History is an excellent predictor of the __________.

Select the correct option

History

Past

Present

Future

The correct option is:

✅ Future

Explanation:

 The saying "history is an excellent predictor of the future" reflects the idea that
patterns and trends from the past can be used to forecast future events or
behaviors.
 This principle is widely applied in fields like data analytics, forecasting, and
business intelligence.

) Total Marks: 1

In nested-loop join case, if there are ‘M’ rows in outer table and ‘N’ rows in inner table, time
complexity is

Select the correct option

O (M + N)

O (M log N)

O (log MN)

O (MN)

The correct option is:

✅ O(MN)

Explanation:

In a nested-loop join, for each row in the outer table (M rows), the algorithm scans all
rows in the inner table (N rows). This results in:

Total comparisons = M × N

So the time complexity is:

O(MN)
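
The count can be checked with a naive nested-loop join in Python. This is a minimal sketch with hypothetical tables and a hypothetical join key `k`. A real database engine may use indexes or block-based variants that do fewer comparisons.

```python
def nested_loop_join(outer, inner, key):
    """Naive nested-loop equi-join; also counts row comparisons."""
    matches, comparisons = [], 0
    for o in outer:          # M iterations
        for i in inner:      # N iterations for each outer row
            comparisons += 1
            if o[key] == i[key]:
                matches.append({**o, **i})
    return matches, comparisons

outer = [{"k": n, "a": n * 10} for n in range(4)]       # M = 4 rows, k = 0..3
inner = [{"k": n, "b": n * 100} for n in range(2, 7)]   # N = 5 rows, k = 2..6

matches, comparisons = nested_loop_join(outer, inner, "k")
```

With M = 4 and N = 5 the loop performs 20 comparisons, regardless of how many rows actually match.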

) Total Marks: 1

Implementation of a data warehouse requires ________ activities

Select the correct option

Tightly decoupled

Highly integrated

Loosely integrated

None of the given

The correct option is:

✅ Highly integrated

Explanation:

The implementation of a data warehouse involves combining data from various sources
into a single, consistent data store. This requires:

 Cleaning
 Transforming
 Integrating data from multiple heterogeneous sources

Because of this, the process is highly integrated to ensure consistency, accuracy, and
usability across the entire warehouse.

So, the correct answer is: Highly integrated.

) Total Marks: 1

In context of analytical applications track, the application development activity can begin
after:

Select the correct option

A subset of historical data has been loaded


Database design is complete

The data access tools and metadata are installed

All of the given options


The correct option is:

✅ All of the given options

Explanation:

In the context of analytical applications (such as in data warehousing or business
intelligence projects), application development typically begins after the following
foundational steps are completed:

1. A subset of historical data has been loaded – to allow for meaningful testing and
development.
2. Database design is complete – so the structure is in place.
3. The data access tools and metadata are installed – enabling efficient querying and
understanding of data.

Since all these components are essential prerequisites, the correct answer is:

👉 All of the given options.

) Total Marks: 1

In data mining, initially you _____ what you are looking for.

Select the correct option

Know

None of the given options

May or may not know

Don’t know

The correct option is:

✅ Don’t know

Explanation:

In data mining, the goal is to discover hidden patterns, relationships, or insights in large
datasets — often without a clear idea of what you'll find at the start. This is different from
querying or reporting, where you're looking for specific information.

So, initially, you don’t know what you're looking for, and the data mining process helps
uncover useful knowledge.

🔍 Correct answer: Don’t know.

) Total Marks: 1

In context of data mining definition, the term “nontrivial” means:

Select the correct option

We can not discover information

Discovering information is a complex task

We simply find things rather than discovery

Discovering information is a simple task


The correct option is:

✅ Discovering information is a complex task

Explanation:

In the context of data mining, the term "nontrivial" means that the process of discovering
patterns, trends, or knowledge from data is not obvious or simple. It requires sophisticated
algorithms, analysis, and effort to extract meaningful and useful insights from large
datasets.

So, "nontrivial" emphasizes that data mining involves complex and intelligent processing,
not just straightforward or surface-level data lookup.

🔍 Correct answer: Discovering information is a complex task.

) Total Marks: 1

Spiral model is ________

Select the correct option

All of the given options

An iterative model

Sequence of waterfall model

Risk oriented model


The correct option is:

✅ All of the given options

Explanation:

The Spiral Model is a software development model that combines elements of both design
and prototyping in stages. It is:

1. ✅ An iterative model – development occurs in repeated cycles (or spirals).


2. ✅ Sequence of waterfall model – each spiral (iteration) includes steps similar to the
waterfall model (requirements, design, implementation, etc.).
3. ✅ Risk oriented model – a key feature of the spiral model is identifying and
addressing risks at every iteration.

Since all these characteristics are true, the correct answer is:

👉 All of the given options.

) Total Marks: 1

Waterfall model is appropriate when

Select the correct option

When the deadline is strict

When the budget is low

Requirements are clearly defined

When resources are limited


The correct option is:

✅ Requirements are clearly defined

Explanation:

The Waterfall model is a sequential software development process where each phase must
be completed before the next begins. It works best when:

 Requirements are well-understood and unlikely to change.


 There is little need for frequent feedback or iteration.

Therefore, it is most appropriate when requirements are clearly defined from the start.

🔍 Correct answer: Requirements are clearly defined.

) Total Marks: 1

Horizontally wide data means:

Select the correct option

Dataset has attribute skews

Dataset has large no. of records

Dataset has partitioning skews

Dataset has large no. of attributes

The correct option is:


✅ Dataset has large no. of attributes

Explanation:

In data terminology:

 Horizontally wide data refers to a dataset with many attributes (columns) but
possibly fewer records (rows).
 This is the opposite of vertically long data, which has fewer columns and more rows.

So, when we say a dataset is horizontally wide, we mean it has a large number of
attributes (features/columns).

🔍 Correct answer: Dataset has large no. of attributes.

) Total Marks: 1

The first step of the “12-steps data warehouse implementation approach” of Shaku Atre is:

Select the correct option

Finding system scope

Planning system resources

Data acquisition and cleansing

Finding user needs

The correct option is:


✅ Finding user needs

Explanation:

In Shaku Atre's 12-step data warehouse implementation approach, the first step is to:

🔍 Understand and find out the user needs — because the data warehouse must be designed
to support decision-making and analytical requirements of its users.

This step involves:

 Interviewing users
 Understanding business processes
 Identifying what kind of data and reports are needed

So, the correct answer is: Finding user needs.

) Total Marks: 1

________ refers to the overall process of discovering useful knowledge from data and data
mining refers to a particular step in this process.

Select the correct option

Statistics

Information cleansing

Clustering

Knowledge discovery in database
