ROLL NO.:
NAME:
CS 536 / CS 432 – Data Mining
Final Exam Part B
May 13, 2020
Duration: 3 hours (online)
Instructions:
(1) Provide your answers on separate sheets of paper. Start each question on a new page. Scan all pages
into one document for submission.
(2) Alternatively, provide your answers in a typed Word document.
(3) Use a calculator where necessary and write answers rounded to 3 decimal places.
(4) There are 3 questions in this exam.
1. (20 points) Consider the following transactional database for a supermarket. The price of
each item is also given. Using the Apriori algorithm, find all itemsets that have a minimum
support count of 2 and a sum of item prices less than or equal to 55. Show your working,
including pruning steps.
TID   Itemset
1     ABE
2     ABDEF
3     AF
4     BE
5     BEF
6     BCF
7     BDE
8     ABE
9     BCD
10    AEF

Item  Price
A     10
B     15
C     20
D     25
E     30
F     35
(Entries are itemset: support count. Numbers in parentheses are price sums.)

C1      L1      C2            L2       C3             L3
A: 5    A: 5    AB: 3         AB: 3    ABE: 3 (55)    ABE: 3
B: 8    B: 8    AC: 0         AE: 4    ABF: 1
C: 2    C: 2    AD: 1         AF: 3    AEF: 2 (75)
D: 3    D: 3    AE: 4         BC: 2    BDE: 2 (70)
E: 7    E: 7    AF: 3         BD: 3    BDF: 1 (75)
F: 5    F: 5    BC: 2         BE: 6    BEF: 2 (80)
                BD: 3         BF: 3
                BE: 6         DE: 2
                BF: 3
                CD: 1
                CE: 0 (50)
                CF: 1 (55)
                DE: 2 (55)
                DF: 1 (60)
                EF: 3 (65)
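
The working above can be cross-checked with a short script. The following is a minimal sketch, not a reference solution: the function names (support, constrained_apriori) and their parameters are illustrative assumptions, and the price check is applied as an extra pruning step, which is safe here because all prices are positive (a superset can only cost more).

```python
from itertools import combinations

# Transactions and prices copied from the question.
TRANSACTIONS = ["ABE", "ABDEF", "AF", "BE", "BEF", "BCF", "BDE", "ABE", "BCD", "AEF"]
PRICES = {"A": 10, "B": 15, "C": 20, "D": 25, "E": 30, "F": 35}


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def constrained_apriori(transactions, prices, min_count=2, max_price=55):
    """Return {itemset: support count} for itemsets meeting both constraints."""
    transactions = [frozenset(t) for t in transactions]

    # L1: single items that are frequent and within the price limit.
    level = {}
    for item, price in prices.items():
        count = support(frozenset([item]), transactions)
        if count >= min_count and price <= max_price:
            level[frozenset([item])] = count

    result = dict(level)
    k = 2
    while level:
        # Join step: unions of two (k-1)-itemsets that contain exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        next_level = {}
        for cand in candidates:
            # Subset prune: every (k-1)-subset must already be satisfying.
            if any(frozenset(s) not in level for s in combinations(cand, k - 1)):
                continue
            # Price prune: the sum of item prices must not exceed the limit.
            if sum(prices[i] for i in cand) > max_price:
                continue
            count = support(cand, transactions)
            if count >= min_count:
                next_level[cand] = count
        result.update(next_level)
        level = next_level
        k += 1
    return result
```

Calling constrained_apriori(TRANSACTIONS, PRICES) returns 15 itemsets: the six single items, the eight pairs AB, AE, AF, BC, BD, BE, BF, DE, and the single triple ABE with support count 3.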
2. (15 points) Consider the directed graph depicting similarity between data objects A to F:
CS 536 – Data Mining (Sp 19-20) – Dr. Asim Karim Page 1 of 2
a. (5 points) Write down the adjacency matrix (A) for this graph.
b. (10 points) Apply two iterations of the deterministic Chinese Whispers clustering
algorithm. Report the clusters after each iteration.
3. (15 points) Compute the LOF of the objects 25 and 80 from the 1-D data set: 25, 28, 29, 35,
38, 40, 42, 48, 50, 60, 70, 80. Assume k = 2. Which object is a stronger outlier?
80 is the stronger outlier.
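
The comparison can be checked numerically. The sketch below assumes the standard LOF definition (k-distance, reachability distance, local reachability density) and distinct data values. The function name lof_1d and its helpers are illustrative, not part of the exam.

```python
def lof_1d(data, k=2):
    """Return {x: LOF_k(x)} for distinct 1-D values, using the standard definition."""

    def dist(a, b):
        return abs(a - b)

    def k_distance(p):
        # Distance from p to its k-th nearest neighbour.
        return sorted(dist(p, q) for q in data if q != p)[k - 1]

    def neighbours(p):
        # All points within the k-distance of p (ties included).
        kd = k_distance(p)
        return [q for q in data if q != p and dist(p, q) <= kd]

    def reach_dist(p, o):
        # Reachability distance of p with respect to o.
        return max(k_distance(o), dist(p, o))

    def lrd(p):
        # Local reachability density: inverse of the mean reachability distance.
        nbrs = neighbours(p)
        return len(nbrs) / sum(reach_dist(p, o) for o in nbrs)

    def lof(p):
        # Mean ratio of the neighbours' density to p's own density.
        nbrs = neighbours(p)
        return sum(lrd(o) for o in nbrs) / (len(nbrs) * lrd(p))

    return {p: lof(p) for p in data}


DATA = [25, 28, 29, 35, 38, 40, 42, 48, 50, 60, 70, 80]
scores = lof_1d(DATA, k=2)
print(round(scores[25], 3), round(scores[80], 3))  # prints 0.938 1.25
```

Under these assumptions LOF(25) = 0.9375 and LOF(80) = 1.25. A value close to 1 means the object is about as dense as its neighbours, so 80 has the larger outlier score.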