DWDM Question Bank MCQ

Uploaded by ramyashan

Q1

Which of the following refers to the problem of finding abstracted patterns (or structures) in
unlabeled data?

● Answer: Unsupervised learning

Q2

Which of the following is an essential process in which intelligent methods are applied to extract
data patterns?

● Answer: Data Mining

Q3

Suppose one wants to predict the number of newborns from the size of the stork population using
supervised learning. Which type of task is this?

● Answer: Regression

Q4

Euclidean distance measure can also be defined as

● Answer: The distance between two points as calculated using the Pythagoras theorem
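
As a minimal sketch of the definition above (the function name `euclidean_distance` and the sample points are illustrative assumptions, not part of the question bank), the distance is the square root of the summed squared coordinate differences:

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two points with the same number of coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    # Sum the squared differences per coordinate, then take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# The classic 3-4-5 right triangle: legs 3 and 4 give a hypotenuse of 5.
print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```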

Q5

Which of the following is also used as the first step in the knowledge discovery process?

● Answer: Data Cleaning

Q6

Identify the correct option which defines Data Mart.

● Answer: A subgroup of data warehouse

Q7

Choose the incorrect property of the data warehouse.

● Answer: Volatile

Q8

The time horizon in a Data Warehouse is usually

● Answer: 5-10 years

Q9

The proportion of transactions supporting X in T is called

● Answer: Support
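
A small sketch of how support could be computed, assuming transactions are represented as Python sets (the `support` helper and the four sample transactions are made up for illustration):

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

# Hypothetical toy database T with four transactions.
transactions = [
    {"bread", "jam"},
    {"bread", "milk"},
    {"jam"},
    {"bread", "jam", "milk"},
]

print(support({"bread"}, transactions))         # 0.75
print(support({"bread", "jam"}, transactions))  # 0.5
```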

Q10

The _______ step eliminates the extensions of (k-1)-itemsets that are not found to be frequent from
being considered for support counting.

● Answer: Pruning

Q11

The goal of _______ is to discover both the dense and sparse regions of a data set.

● Answer: Clustering

Q12

Which of the following technologies is not well suited for data mining?

● Answer: Expert System technology

Q13

The star schema is composed of _______ fact table.

● Answer: 1

Q14

_______ is data about data.

● Answer: Metadata

Q15

The second stage of the Apriori algorithm is

● Answer: Candidate generation

Q16

Each neuron is made up of a number of nerve fibers called

● Answer: Dendrites

Q17

Which of the following processes is not involved in the data mining process?

● Answer: Data archaeology


Q18

Which statement given below closely defines the term data selection?

● Answer: The selection of correct data for the process of Knowledge Discovery in Databases

Q19

Classification rules are extracted from

● Answer: Decision tree

Q20

Extreme values that occur infrequently are called

● Answer: Outliers

Q21

Incorrect or invalid data is known as

● Answer: Noisy data

Q22

If T consists of 500,000 transactions, 20,000 transactions contain bread, 30,000 transactions contain
jam, and 10,000 transactions contain both bread and jam. Then the support of bread and jam is

● Answer: 2%

Q23

The proportion of transactions supporting X in T is called

● Answer: Support

Q24

In a feed-forward network, the connections between layers are _______ from input to output.

● Answer: Unidirectional

Q25

In web mining, _______ is used to know which URLs tend to be requested together.

● Answer: Associations

Q1

What is a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of
management decisions?

● Answer: Data Warehousing

Reason: This is Inmon's standard definition of a data warehouse, which is built to support management
decision-making.

Q2

The data is stored, retrieved, and updated in

● Answer: OLTP

Reason: OLTP (Online Transaction Processing) systems manage day-to-day transactions efficiently,
covering storage, retrieval, and updates.

Q3

Removing duplicate records is a process called

● Answer: Data cleansing

Reason: "Data cleaning" and "data cleansing" are used interchangeably; the process removes duplicate,
inconsistent, or erroneous records.

Q4

Which of the following refers to the problem of finding abstracted patterns (or structures) in
unlabeled data?

● Answer: Unsupervised learning

Reason: Unsupervised learning involves working with unlabeled data to identify hidden patterns or
groupings.

Q5

What is KDD in data mining?

● Answer: Knowledge Discovery in Databases

Reason: KDD stands for Knowledge Discovery in Databases, a process for extracting valuable
knowledge from large datasets.

Q6

Classification rules are extracted from

● Answer: Decision tree

Reason: Decision trees are commonly used to derive classification rules by splitting data based on feature
values.

Q7

In ______, the groups are not predefined.

● Answer: Clustering

Reason: Clustering is an unsupervised technique where groups or clusters are formed without predefined
labels.

Q8

Multiple numbers of data sources get combined in which step of the Knowledge Discovery process?

● Answer: Data Integration

Reason: Data integration is the step where data from different sources is combined into a coherent data
repository.

Q9

If T consists of 500,000 transactions, 20,000 transactions contain bread, 30,000 transactions contain
jam, and 10,000 transactions contain both bread and jam, what is the confidence of buying bread
with jam?

● Answer: 50%

Reason: Confidence = (Transactions containing both items) / (Transactions containing the first item) =
10,000 / 20,000 = 50%.
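
The arithmetic can be checked with a short sketch using the counts from the question (the `confidence` helper and the variable names are illustrative assumptions):

```python
def confidence(count_both, count_antecedent):
    """Confidence of X -> Y: transactions with both X and Y, divided by those with X."""
    return count_both / count_antecedent

total_transactions = 500_000
bread = 20_000           # transactions containing bread
bread_and_jam = 10_000   # transactions containing both bread and jam

# Confidence of the rule bread -> jam.
print(confidence(bread_and_jam, bread))    # 0.5
# Support of {bread, jam}, for comparison.
print(bread_and_jam / total_transactions)  # 0.02
```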

Q10

The step that eliminates the extensions of (k-1)-itemsets, which are not frequent, from being
considered for counting support is called

● Answer: Pruning

Reason: Pruning in the Apriori algorithm discards candidate itemsets that have an infrequent subset, so
their support never needs to be counted.

Q11

________ describes the data contained in the data warehouse.

● Answer: Metadata

Reason: Metadata provides detailed information about the structure, contents, and organization of data in
a data warehouse.

Q12

The star schema is composed of ______ fact table.

● Answer: One

Reason: In a star schema, there is a central fact table surrounded by dimension tables.

Q13

________ is a good alternative to the star schema.

● Answer: Snowflake schema

Reason: A snowflake schema normalizes dimension tables, unlike the denormalized star schema.

Q14

Rule-based classification algorithms generate ______ rules to perform the classification.

● Answer: If-then

Reason: Rule-based classifiers create "if-then" rules for assigning classes based on attribute values.

Q15

Records cannot be updated in

● Answer: Data warehouse

Reason: Data warehouses are designed for analysis and reporting, not for transactional updates.

Q16

Data warehouse contains ______ data that is never found in the operational environment.

● Answer: Summary

Reason: A data warehouse stores summarized (aggregated) data derived from detailed records; such
summaries are generally not kept in operational systems.

Q17

In customer relationship management, we can detect outlier customers using

● Answer: Contextual outlier detection


Reason: Contextual outlier detection is specifically used to identify anomalies within specific contexts,
such as customer behavior.

Q18

Which two-step process does the Apriori algorithm follow?

● Answer: Join and Prune

Reason: The Apriori algorithm joins itemsets to create candidate itemsets and prunes non-frequent ones.
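
The two steps can be sketched as follows. This is a simplified illustration, not the textbook procedure verbatim: the join here unions any two frequent (k-1)-itemsets that differ in exactly one item, whereas the classic formulation joins itemsets sharing their first k-2 items in sorted order. The name `apriori_gen` and the sample itemsets are assumptions.

```python
from itertools import combinations

def apriori_gen(frequent_prev, k):
    """Build candidate k-itemsets from the frequent (k-1)-itemsets."""
    prev = {frozenset(s) for s in frequent_prev}
    # Join step: union pairs of (k-1)-itemsets whose union has exactly k items.
    joined = {a | b for a in prev for b in prev if len(a | b) == k}
    # Prune step: keep a candidate only if all of its (k-1)-subsets are frequent.
    return {
        c for c in joined
        if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
    }

# Hypothetical frequent 2-itemsets.
frequent_2 = [{"A", "B"}, {"A", "C"}, {"B", "C"}, {"B", "D"}]
candidates_3 = apriori_gen(frequent_2, 3)
print(candidates_3)
```

With these inputs the join produces {A, B, C}, {A, B, D}, and {B, C, D}; pruning removes the last two because {A, D} and {C, D} are not frequent, leaving only {A, B, C}.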

Q19

Assume the minimum support is 60% and the database contains 5 transactions. Find the minimum
support count.

● Answer: 3

Reason: Support value = (Minimum support) x (Total transactions) = 0.6 x 5 = 3.
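
A minimal sketch of the same calculation, assuming the threshold is given as a whole-number percentage and that a fractional result is rounded up (the helper name `min_support_count` is hypothetical):

```python
def min_support_count(min_support_percent, n_transactions):
    """Smallest number of transactions an itemset needs to meet the threshold."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-min_support_percent * n_transactions // 100)

print(min_support_count(60, 5))  # 3
```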

Q20

It was shown that the Naive Bayesian method

● Answer: Can be almost optimal only when the attributes are independent

Reason: Naive Bayes assumes independence between attributes, making it most effective in such cases.

Q21

The terms equality and roll-up are associated with

● Answer: OLAP

Reason: Roll-up is an OLAP operation used to aggregate data along a hierarchy.
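
As a rough illustration of roll-up, the sketch below aggregates sales from the city level up to the country level of a location hierarchy. The `roll_up` function and the sales figures are invented for the example.

```python
from collections import defaultdict

# Hypothetical fact rows: (city, country, sales amount).
sales = [
    ("Chennai", "India", 120),
    ("Mumbai", "India", 80),
    ("Paris", "France", 50),
]

def roll_up(rows):
    """Aggregate city-level sales up to the country level."""
    totals = defaultdict(int)
    for _city, country, amount in rows:
        totals[country] += amount
    return dict(totals)

print(roll_up(sales))  # {'India': 200, 'France': 50}
```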


Q22

________ are highly simplified models of biological neurons.

● Answer: Artificial Neurons

Reason: Artificial neurons in artificial neural networks mimic the structure and function of biological
neurons.

Q23

OLAP stands for

● Answer: Online Analytical Processing

Reason: OLAP is designed for analyzing multidimensional data interactively.

Q24

The human brain consists of a network of

● Answer: Neurons

Reason: The brain’s network consists of interconnected neurons that transmit signals.

Q25

The time horizon in a Data Warehouse is usually

● Answer: 5-10 years

Reason: Data warehouses typically maintain data over a longer time horizon to support historical
analysis.
