1. Which of the following describes the process of summarizing data by navigating from a
more detailed data view to a less detailed data view in an OLAP cube?
A) Drill-down
B) Slice
C) Dice
D) Roll-up ✅
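
To make roll-up concrete, here is a minimal sketch that aggregates sales from the (city, quarter) level up to the (country, quarter) level. The sales figures, the city-to-country hierarchy, and the function name `roll_up` are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical detailed facts: (city, quarter) -> sales amount
sales_by_city = {
    ("Toronto", "Q1"): 100,
    ("Vancouver", "Q1"): 150,
    ("Chicago", "Q1"): 200,
    ("Toronto", "Q2"): 120,
}

# Hypothetical concept hierarchy for the location dimension: city -> country
city_to_country = {"Toronto": "Canada", "Vancouver": "Canada", "Chicago": "USA"}


def roll_up(facts, hierarchy):
    """Sum the facts one level up the location hierarchy (city -> country)."""
    totals = defaultdict(int)
    for (city, quarter), amount in facts.items():
        totals[(hierarchy[city], quarter)] += amount
    return dict(totals)


sales_by_country = roll_up(sales_by_city, city_to_country)
print(sales_by_country)
```

Drill-down is the reverse navigation: going from `sales_by_country` back to the more detailed `sales_by_city` view.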
2. A data warehouse schema that consists of a large central fact table surrounded by several
dimension tables is known as a:
A) Star schema ✅
B) Snowflake schema
C) Fact constellation schema
D) Relational schema
3. Which of the following is a primary characteristic of a data warehouse?
A) It is volatile
B) It is subject-oriented ✅
C) It is used for day-to-day transactional processing
D) Data is frequently updated and modified
4. The process of discovering patterns and insights from large datasets is called:
A) Data warehousing
B) OLAP
C) Data mining ✅
D) Data visualization
5. Which of the following is a technology used in cloud data warehousing?
A) Mainframe computers
B) On-premises servers
C) SaaS (Software as a Service)
D) Massively Parallel Processing (MPP) architecture ✅
6. What is the main purpose of a data cube in OLAP?
A) To store raw, unsummarized data
B) To provide a multi-dimensional view of data ✅
C) To perform online transaction processing (OLTP)
D) To normalize data for efficient storage
7. Which statistical measure is used to describe the degree of spread or dispersion of a set of
values?
A) Mean
B) Mode
C) Median
D) Standard deviation ✅
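
As a quick illustration of how standard deviation measures spread, the sketch below uses Python's standard `statistics` module on a made-up list of values. Mean, mode, and median describe central tendency instead.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample for illustration

mean = statistics.mean(values)             # central tendency: 5
population_sd = statistics.pstdev(values)  # divides by n:     2.0
sample_sd = statistics.stdev(values)       # divides by n - 1, so slightly larger

print(mean, population_sd, sample_sd)
```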
8. The Euclidean distance is a common metric used to measure:
A) Data integration
B) Data visualization
C) Data dissimilarity ✅
D) Data reduction
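
A minimal sketch of Euclidean distance as a dissimilarity measure between two numeric objects. The points and the function name `euclidean_distance` are chosen only for illustration.

```python
import math


def euclidean_distance(p, q):
    """Return the straight-line distance between two equal-length numeric tuples."""
    if len(p) != len(q):
        raise ValueError("both objects must have the same number of attributes")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


x1 = (1, 2)
x2 = (4, 6)
d = euclidean_distance(x1, x2)  # sqrt(3**2 + 4**2) = 5.0
print(d)
```

A larger distance means the two objects are more dissimilar; identical objects have distance 0.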
9. In data mining, an example of a frequent pattern is:
A) A single transaction in a dataset
B) A customer's age and salary
C) The price of a single product
D) The co-occurrence of 'milk' and 'bread' in many transactions ✅
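
To show what "frequent" means here, the sketch below computes the support of the itemset {milk, bread} over a small invented transaction list. The transactions and the 0.5 minimum-support threshold are assumptions made for this example.

```python
# Hypothetical market-basket transactions
transactions = [
    {"milk", "bread", "eggs"},
    {"milk", "bread"},
    {"bread", "butter"},
    {"milk", "bread", "butter"},
    {"eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


min_support = 0.5  # assumed threshold
s = support({"milk", "bread"}, transactions)  # 3 of 5 transactions -> 0.6
is_frequent = s >= min_support
print(s, is_frequent)
```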
10. What is the primary difference between OLAP and OLTP?
A) OLAP is for daily business transactions; OLTP is for long-term analysis.
B) OLAP uses a relational database; OLTP uses a data warehouse.
C) OLAP focuses on a few records for updates; OLTP focuses on historical data for queries.
D) OLAP is for analytical processing; OLTP is for transactional processing. ✅
11. The process of filling in or removing missing values, smoothing noisy data, and identifying
or removing outliers is part of which data preprocessing step?
A) Data integration
B) Data reduction
C) Data cleaning ✅
D) Data transformation
12. Which data reduction technique replaces the original data with a smaller representation
of it by aggregating or summarizing the data?
A) Data cube aggregation ✅
B) Attribute subset selection
C) Clustering
D) Normalization
13. When data from multiple heterogeneous sources are combined into a coherent data store,
this process is known as:
A) Data transformation
B) Data discretization
C) Data integration ✅
D) Data reduction
14. Which of the following is a common method for handling missing values in a dataset?
A) Deleting the entire dataset
B) Ignoring the tuple ✅
C) Keeping the missing values as they are
D) Multiplying the values by a constant
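
A minimal sketch of "ignoring the tuple": records with any missing attribute value are simply dropped. The records and the use of `None` to mark a missing value are assumptions made for this example.

```python
# Hypothetical customer records; None marks a missing value
records = [
    {"id": 1, "age": 34, "income": 52000},
    {"id": 2, "age": None, "income": 61000},
    {"id": 3, "age": 45, "income": None},
    {"id": 4, "age": 29, "income": 48000},
]

# Keep only tuples in which every attribute has a value
complete = [r for r in records if all(v is not None for v in r.values())]
kept_ids = [r["id"] for r in complete]
print(kept_ids)
```

This approach is simple but discards the remaining usable values in each dropped tuple, so it works poorly when many tuples have missing values.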
15. The process of replacing a numeric attribute's values with a small number of intervals or
categories is called:
A) Normalization
B) Discretization ✅
C) Smoothing
D) Aggregation
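
As an illustration of discretization, the sketch below maps a numeric age attribute onto a few labeled intervals. The cut points and labels are arbitrary choices for this example, not a standard.

```python
import bisect


def discretize(value, cut_points, labels):
    """Map a number to the label of the interval it falls in.

    cut_points must be sorted; a value equal to a cut point goes to the
    interval on its right.
    """
    return labels[bisect.bisect_right(cut_points, value)]


cut_points = [18, 40, 65]  # assumed interval boundaries
labels = ["child", "young adult", "middle-aged", "senior"]

ages = [5, 18, 33, 64, 70]
categories = [discretize(a, cut_points, labels) for a in ages]
print(categories)
```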
17. Which of the following is a key issue in data integration?
A) Reducing the dimensionality of the data
B) Identifying and resolving schema integration issues ✅
C) Removing duplicate tuples from a single dataset
D) Converting continuous attributes to discrete ones
18. In the context of data preprocessing, what is the main goal of data reduction?
A) To remove all noisy data and outliers
B) To increase the number of dimensions for analysis
C) To obtain a reduced representation of the data set that is much smaller in volume yet
produces the same (or nearly the same) analytical results ✅
D) To combine data from multiple sources
19. Which of the following is an example of data smoothing?
A) Using regression to fill in missing values
B) Binning and using bin means to replace values ✅
C) Removing all duplicate records
D) Selecting a subset of attributes
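
A minimal sketch of smoothing by bin means: the values are sorted, split into equal-frequency bins, and each value is replaced by the mean of its bin. The price list, the bin size of 3, and the function name `smooth_by_bin_means` are assumptions made for this example.

```python
def smooth_by_bin_means(values, bin_size):
    """Sort values, partition them into bins of bin_size, replace each by its bin mean."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        bin_mean = sum(bin_values) / len(bin_values)
        smoothed.extend([bin_mean] * len(bin_values))
    return smoothed


prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]  # sample values for illustration
smoothed = smooth_by_bin_means(prices, bin_size=3)
print(smoothed)  # bins (4, 8, 15), (21, 21, 24), (25, 28, 34) -> means 9, 22, 29
```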
20. The process of converting data into a format or structure that is suitable for data mining
and analysis is called:
A) Data cleaning
B) Data integration
C) Data transformation ✅
D) Data reduction