Safety and Risk Analytics Quiz

The document discusses key concepts around data quality, missing data imputation methods, data normalization, and data representation. It contains 10 multiple choice questions related to these topics, with explanations for each answer. Key areas covered include traceability, representational quality, missing data mechanisms, imputation methods, normalization using z-scores, defuzzification of linguistic variables, use of histograms for data representation, and using column profiles to analyze similarities from a contingency table.

Uploaded by

Vidhi Gaba
Assignment 3

Safety and Risk Analytics


1. The extent to which the data are well documented, verifiable, and easily attributed to a
source is defined as

a) Relevancy of data
b) Believability of data
c) Flexibility of data
d) Traceability of data

Ans: (d) Traceability of data

Solution: Refer to Lecture 11

2. Interpretability of data is a part of

a) Intrinsic data quality


b) Contextual data quality
c) Representational data quality
d) Accessibility data quality

Ans: (c) Representational data quality

Solution: Refer to Lecture 11


3. Fill in the blank: ______ is a type of missing data mechanism.

a) NCAR

b) NAR

c) NMAR

d) MMAR

Ans: (c) NMAR

Solution: Refer to Lecture 12


4. Which method imputes missing data on a variable for a given case by matching that case
with other cases in the data set on several other key variables that have complete
information for those cases?

a) Hot deck imputation

b) Maximum Likelihood Expectation-Maximization Imputation

c) Regression Imputation

d) All of the above

Ans: (d) All of the above

Solution: Refer to Lecture 12

5. Suppose the data quality scores for the dimensions interpretability, ease of understanding,
representational consistency, and concise representation are 0.70, 0.85, 0.55, and 0.60,
respectively. If the category score is the average of the data quality scores of its
dimensions, which of the following is true?

a) Intrinsic quality = 0.675

b) Representational quality = 0.575

c) Representational quality = 0.675

d) Contextual data quality = 0.775

Ans: (c) Representational quality = (0.70 + 0.85 + 0.55 + 0.60)/4 = 0.675

Solution: Refer to Lecture 11
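
The averaging in Question 5 can be checked with a short Python sketch. The dictionary name and layout are illustrative assumptions, not part of the assignment; only the four scores come from the question.

```python
from statistics import mean

# Dimension scores as given in Question 5 (all four belong to the
# representational data quality category).
representational_scores = {
    "interpretability": 0.70,
    "ease of understanding": 0.85,
    "representational consistency": 0.55,
    "concise representation": 0.60,
}

# The category score is the plain average of its dimension scores.
category_score = mean(representational_scores.values())
print(round(category_score, 3))
```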


Questions 6-8 are based on the following description.

Consider the following data table of sample observations.

Incident path (IP)    Probability (P)    Consequence (C)
IP1                   VH                 1
IP2                   M                  4
IP3                   H                  2
IP4                   L                  ?
IP5                   VL                 5

Here, the probabilities of occurrence of the incident paths are given in linguistic terms.
Also, consider the following fuzzy scale.

Linguistic variable    Trapezoidal fuzzy number
Very Low (VL)          (0, 0, 0.1, 0.2)
Low (L)                (0.2, 0.3, 0.3, 0.4)
Medium (M)             (0.3, 0.4, 0.5, 0.6)
High (H)               (0.6, 0.7, 0.7, 0.8)
Very High (VH)         (0.8, 0.9, 1, 1)

6. Using the mean imputation method, the missing value of consequence (C) for IP4 is

a) 3
b) 1
c) 4
d) 5
Ans: Option (a)
Solution: Refer to Lecture 12
(1+4+2+5)/4 = 3
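
As a minimal sketch of the mean imputation used in Question 6, the missing consequence can be filled with the mean of the observed values. The variable names are hypothetical, and `None` is an assumed stand-in for the "?" in the table.

```python
from statistics import mean

# Consequence (C) column from the data table; None marks the missing value.
consequence = {"IP1": 1, "IP2": 4, "IP3": 2, "IP4": None, "IP5": 5}

# Mean imputation: average only the observed values, then fill the gaps.
observed = [value for value in consequence.values() if value is not None]
fill_value = mean(observed)
imputed = {
    path: (fill_value if value is None else value)
    for path, value in consequence.items()
}
print(imputed["IP4"])
```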
7. Consider the complete data table by including the imputed value obtained in Q.6. Using the
z-score normalization, the transformed values of consequences (C) for IP2 and IP4,
respectively are

a) 0, 0.63
b) -0.63, 0
c) 0, -0.63
d) 0.63, 0
Ans: Option (d)
Solution: Refer to Lecture 13
\[
(C_{IP2})_{\text{normalized}} = \frac{C_{IP2} - \text{mean}_C}{\text{st.dev}_C} = \frac{4 - 3}{1.58} = 0.63
\]
\[
(C_{IP4})_{\text{normalized}} = \frac{C_{IP4} - \text{mean}_C}{\text{st.dev}_C} = \frac{3 - 3}{1.58} = 0
\]
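
The z-score calculation in Question 7 can be reproduced with the sketch below. It assumes the sample standard deviation (divisor n - 1), which is what yields the 1.58 used in the solution; the variable names are illustrative.

```python
from statistics import mean, stdev

# Completed consequence (C) column, with IP4 imputed as 3 from Question 6.
consequence = {"IP1": 1, "IP2": 4, "IP3": 2, "IP4": 3, "IP5": 5}

values = list(consequence.values())
mean_c = mean(values)
# stdev() is the sample standard deviation (n - 1 in the denominator).
std_c = stdev(values)

# z-score normalization: (value - mean) / standard deviation.
z_scores = {path: (value - mean_c) / std_c for path, value in consequence.items()}
print(round(std_c, 2), round(z_scores["IP2"], 2), round(z_scores["IP4"], 2))
```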
8. Consider the data table and fuzzy scale in Q.6. The transformed defuzzified values of
probability (P) for IP2 and IP4, respectively are

a) 0.15, 0.08
b) 0.3, 0.16
c) 0.45, 0.3
d) 0.08, 0.15
Ans: Option (c)
Solution: Refer to Lecture 14
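For a trapezoidal fuzzy number (a_1, a_2, a_3, a_4), the defuzzified value is
\[
P = \frac{-a_1 a_2 + a_3 a_4 - \frac{1}{3}(a_2 - a_1)^2 + \frac{1}{3}(a_4 - a_3)^2}{-a_1 - a_2 + a_3 + a_4}
\]
\[
P_{IP2} = \frac{-(0.3 \times 0.4) + (0.5 \times 0.6) - \frac{1}{3}(0.4 - 0.3)^2 + \frac{1}{3}(0.6 - 0.5)^2}{-0.3 - 0.4 + 0.5 + 0.6} = \frac{0.18}{0.4} = 0.45
\]
\[
P_{IP4} = \frac{-(0.2 \times 0.3) + (0.3 \times 0.4) - \frac{1}{3}(0.3 - 0.2)^2 + \frac{1}{3}(0.4 - 0.3)^2}{-0.2 - 0.3 + 0.3 + 0.4} = \frac{0.06}{0.2} = 0.3
\]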
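
The defuzzification in Question 8 can be sketched in Python as follows. The function name `defuzzify_trapezoid` and the dictionary layout are assumptions made for illustration; the formula and the fuzzy scale are the ones given in the solution and the description.

```python
def defuzzify_trapezoid(a1, a2, a3, a4):
    """Defuzzified value of the trapezoidal fuzzy number (a1, a2, a3, a4)."""
    numerator = -a1 * a2 + a3 * a4 - (a2 - a1) ** 2 / 3 + (a4 - a3) ** 2 / 3
    denominator = -a1 - a2 + a3 + a4
    return numerator / denominator

# Fuzzy scale from the question description.
fuzzy_scale = {
    "VL": (0, 0, 0.1, 0.2),
    "L": (0.2, 0.3, 0.3, 0.4),
    "M": (0.3, 0.4, 0.5, 0.6),
    "H": (0.6, 0.7, 0.7, 0.8),
    "VH": (0.8, 0.9, 1, 1),
}

# Linguistic probabilities of the incident paths asked about in Question 8.
p_ip2 = defuzzify_trapezoid(*fuzzy_scale["M"])
p_ip4 = defuzzify_trapezoid(*fuzzy_scale["L"])
print(round(p_ip2, 2), round(p_ip4, 2))
```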
9. The non-parametric approach for replacing the original data volume by alternative smaller
forms of data representation is/are

a) Principal Component Analysis

b) Histograms

c) Both a and b

d) None of the above

Ans: Option (b)

Solution: Refer to Lecture 15


10. Consider the following contingency table.

      I1    I2    I3    I4
Q1    2     4     7     6
Q2    5     1     1     2
Q3    7     3     2     3
Q4    3     2     2     1

Here, Q1, Q2, Q3, and Q4 are the quarters of a year, and I1, I2, I3, and I4 are the types of
incidents in that year.
The similarities and differences among the four types of incidents with respect to the four
quarters can be obtained by computing
a) Row profile of the contingency table
b) Column profile of the contingency table
c) Weighted Chi-square distances
d) Singular value decomposition
Ans: Option (b).
Solution: Refer to Lecture 16
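
A column profile divides each column of the contingency table by its column total, so each incident type is described by its distribution across the quarters. The sketch below is one hedged way to compute it for the table in Question 10; the variable names are illustrative assumptions.

```python
# Contingency table from Question 10: rows are quarters, columns are incident types.
incidents = ["I1", "I2", "I3", "I4"]
table = {
    "Q1": [2, 4, 7, 6],
    "Q2": [5, 1, 1, 2],
    "Q3": [7, 3, 2, 3],
    "Q4": [3, 2, 2, 1],
}

# Total count of each incident type over the four quarters.
column_totals = [sum(row[j] for row in table.values()) for j in range(len(incidents))]

# Column profile: share of each quarter within a given incident type.
column_profiles = {
    incident: [table[quarter][j] / column_totals[j] for quarter in table]
    for j, incident in enumerate(incidents)
}

for incident, profile in column_profiles.items():
    print(incident, [round(share, 3) for share in profile])
```

Incident types whose profiles are close to each other are distributed similarly across the quarters.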
