
Data Mining & Warehouse Q&A

The document contains questions about data mining and warehousing concepts. It includes questions about data cleaning, data modeling, data cube technology, classification vs prediction, data warehousing vs operational databases, the relationship between data warehousing and data mining, data preprocessing techniques, data reduction, data warehouse architecture, clustering methods, association rule mining and frequent pattern mining.

Uploaded by

Tanvi Shah
Copyright
© All Rights Reserved

Data Mining & Warehouse

(Question bank 2)

1 What do you mean by data cleaning?


2 What do you mean by noisy data?
3 What do you understand by the multidimensional data model?
4 Briefly explain data cube technology.
5 What is the difference between classification and prediction?
6 What is the difference between an operational database system and a data warehouse?
7 What is the relation between data warehouse and data mining?
8 List the various forms of data pre-processing and describe the data reduction techniques.
9 Discuss the need and importance of data preprocessing.
10 Discuss 3-tier data warehouse architecture.
11 Define an efficient procedure for cleaning the noisy data.
12 Distinguish between data similarity and dissimilarity.
13 Show the graphic displays of basic statistical descriptions of data.
14 Classify the various methods for data smoothing.
15 Credit card fraud detection using transaction records.
16 Predicting the future stock price of a company using historical records.
17 Discuss on descriptive and predictive data mining tasks with illustrations.
18 Explain the various methods of data cleaning and the data reduction techniques.
19 How would you show your understanding of the multidimensional data model?
20 Generalize the function of OLAP tools on the Internet.
21 How would you evaluate the goals of data mining?
22 Can you list the categories of tools in business analysis?
23 Give the need for OLAP.
24 Compare drill down with roll up approach.
25 Design the data warehouse architecture.
26 What are the prediction techniques supported by a data mining system?
27 Identify what changes you make to solve the problem in cluster analysis.
28 Formulate the role of applications and challenges in clustering.
29 List the challenges of outlier detection.
30 Classify the hierarchical clustering methods.
31 Distinguish between classification and clustering.
32 Show the intrinsic methods in cluster analysis.
33 Evaluate the different types of data used for cluster analysis.
34 Evaluate agglomerative and divisive hierarchical clustering.
35 Discuss the various types of data in cluster analysis.
36 Explain the categories of major clustering methods.
37 State and explain the algorithms for k-means and k-medoids.
38 Explain the different types of OLAP servers.
39 Define association and correlations.
40 List the ways in which interesting patterns should be mined.
41 Are all generated patterns interesting and useful? Give reasons to justify your answer.
42 Compare the advantages of the FP-growth algorithm over the Apriori algorithm.
43 How will you apply FP growth algorithm in Data mining?
44 Analyze the constraint based frequent pattern mining.
45 Evaluate classification using frequent patterns.
46 Generalize on Mining Closed and Max Patterns.
47 Define correlation and market basket analysis.
48 Diagrammatically illustrate and describe the architecture of MOLAP and ROLAP.
49 Identify the major differences between MOLAP and ROLAP.
50 Compare the similarities and differences between the database and data warehouse.
51 Explain what data visualization is. How does it help in data warehousing?
52 Depict the 3-tier data warehousing architecture and explain its features in detail.
53 Summarize in detail about various kinds of association rules.
54 Summarize the various classification methods using frequent patterns.
55 Analyze the various frequent itemset mining methods with examples.
56 What is Naïve Bayesian classification? How does it differ from Bayesian classification?
57 What approach would you use to apply decision tree induction?
58 Explain how the Bayesian Belief Networks are trained to perform classification.
59 Explain the hierarchical based method for cluster analysis.
60 What are the prediction techniques supported by a data mining system?
61 Develop an algorithm for classification using decision trees. Illustrate the algorithm with
a relevant example.
62 What is Classification? What are the features of Bayesian classification? Explain in detail
with an example.
63 Generalize the Bayes theorem of posterior probability and explain the working of a
Bayesian classifier with an example.
64 Generalize how pattern mining is done in multilevel and multidimensional space with
necessary examples.
65 Define classification. With an example explain how support vector machines can be used
for classification.
66 Define a distance-based outlier. Illustrate efficient algorithms for distance-based
outlier detection.
67 With relevant examples, discuss multidimensional online analytical processing
and multi-relational online analytical processing.
68 What is a data warehouse? Give the steps for the design and construction of data warehouses
and explain with a three-tier architecture diagram.
69 Define market basket analysis. Describe frequent itemsets, closed itemsets and
association rules.
70 Discuss constraint-based association rule mining with examples and state how
the step from association mining to correlation analysis is dealt with.
1. Consider five points {X1, X2, X3, X4, X5} with the following coordinates as a two-
dimensional sample for clustering: X1 = (0, 2.5); X2 = (0, 0); X3 = (1.5, 0); X4 = (5, 0); X5 =
(5, 2). Compose the K-means partitioning algorithm using the above data set.
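A hand calculation for this problem can be checked with a minimal K-means sketch such as the one below. The question does not state the number of clusters or the starting centroids, so K = 2 and the initial centroids X1 and X4 are assumptions made only for illustration; a different choice can produce a different partition.

```python
from math import dist

# The five sample points from the question.
points = {
    "X1": (0.0, 2.5),
    "X2": (0.0, 0.0),
    "X3": (1.5, 0.0),
    "X4": (5.0, 0.0),
    "X5": (5.0, 2.0),
}

def k_means(points, centroids, max_iter=100):
    """Plain K-means: assign each point to its nearest centroid, then
    recompute centroids, until the centroids stop changing."""
    clusters = []
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for name, p in points.items():
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(name)
        new_centroids = [
            tuple(sum(points[n][d] for n in members) / len(members) for d in range(2))
            if members else centroids[i]  # keep the old centroid if a cluster is empty
            for i, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return clusters, centroids

# Assumption: K = 2, seeded with X1 and X4 (not given in the question).
clusters, centroids = k_means(points, [points["X1"], points["X4"]])
print(clusters)
print(centroids)
```

With these assumed seeds the assignment stabilises after the first centroid update; the printed centroids are the means of the points in each final cluster.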

2. Suppose that the data for analysis include the attribute age. The age values for the data tuples
are 13, 15, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35,
35, 35, 36, 40, 45, 46, 52, 70.
i) Use smoothing with a bin depth of 3. Illustrate your steps.
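The binning steps can be sketched in a few lines of Python. The question does not say which smoothing variant is intended, so smoothing by bin means is an assumption here (bin medians or bin boundaries would follow the same partitioning step). The list holds the 26 values exactly as given, so the last equal-depth bin contains only two values.

```python
# Age values exactly as listed in the question (26 values).
ages = [13, 15, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33,
        35, 35, 35, 35, 36, 40, 45, 46, 52, 70]

def smooth_by_bin_means(values, depth):
    """Sort the data, split it into equal-depth bins, and replace every
    value in a bin with that bin's mean (rounded to 2 decimals)."""
    values = sorted(values)
    smoothed = []
    for start in range(0, len(values), depth):
        bin_values = values[start:start + depth]
        mean = round(sum(bin_values) / len(bin_values), 2)
        smoothed.extend([mean] * len(bin_values))
    return smoothed

smoothed = smooth_by_bin_means(ages, depth=3)
# Print each bin next to its smoothed value.
for start in range(0, len(ages), 3):
    print(sorted(ages)[start:start + 3], "->", smoothed[start])
```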

3. Perform the KNN classification algorithm on the following data set and predict the class for
X (P1 = 3, P2 = 7) using the K = 3 nearest neighbors.

4. Apply the KNN algorithm and predict the food type to which tomato
(sweet = 6, crunch = 4) belongs.

Ingredient   Sweet  Crunch  Food type
grape        8      5       fruit
green bean   3      7       vegetable
nuts         3      6       protein
orange       7      3       fruit
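A short sketch of the nearest-neighbor vote for the tomato example follows. The question does not fix K or the distance measure, so K = 3 and Euclidean distance are assumptions; the function name knn_predict is hypothetical.

```python
from collections import Counter
from math import dist

# Training rows from the table: (ingredient, (sweet, crunch), food type).
training = [
    ("grape", (8, 5), "fruit"),
    ("green bean", (3, 7), "vegetable"),
    ("nuts", (3, 6), "protein"),
    ("orange", (7, 3), "fruit"),
]

def knn_predict(query, training, k):
    """Rank training rows by Euclidean distance to the query and
    return the majority label among the k closest rows."""
    ranked = sorted(training, key=lambda row: dist(query, row[1]))
    nearest = ranked[:k]
    votes = Counter(label for _, _, label in nearest)
    return votes.most_common(1)[0][0], [name for name, _, _ in nearest]

# Assumption: K = 3 (not stated in the question).
label, neighbors = knn_predict((6, 4), training, k=3)
print(label, neighbors)
```

Under these assumptions the three closest ingredients are orange, grape and nuts, so the majority vote classifies tomato as a fruit.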

5. Find all frequent itemsets for the given training set using Apriori and FP-growth respectively.
Compare the efficiency of the two mining processes (minimum support = 30%).
TID   Items bought
T100  {M, O, N, K, E, Y}
T200  {D, O, N, K, E, Y}
T300  {M, A, K, E}
T400  {M, U, C, K, Y}
T500  {C, O, O, K, I, E}
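The level-wise Apriori search for this data set can be sketched as below. This is a minimal illustration, not an optimised implementation: transactions are treated as sets (so the repeated O in T500 counts once), and the 30% threshold is converted to a minimum count of 2 out of 5 transactions by rounding up.

```python
from itertools import combinations
from math import ceil

transactions = [
    {"M", "O", "N", "K", "E", "Y"},   # T100
    {"D", "O", "N", "K", "E", "Y"},   # T200
    {"M", "A", "K", "E"},             # T300
    {"M", "U", "C", "K", "Y"},        # T400
    {"C", "O", "O", "K", "I", "E"},   # T500 (duplicate O collapses in a set)
]

def apriori(transactions, min_support):
    """Return {itemset: support count} for all frequent itemsets."""
    min_count = ceil(min_support * len(transactions))

    def count(itemset):
        return sum(itemset <= t for t in transactions)

    items = sorted(set().union(*transactions))
    level = [frozenset([i]) for i in items if count(frozenset([i])) >= min_count]
    frequent = {}
    k = 1
    while level:
        frequent.update((s, count(s)) for s in level)
        k += 1
        prev = set(level)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must be frequent, then check support.
        level = [
            c for c in candidates
            if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
            and count(c) >= min_count
        ]
    return frequent

frequent = apriori(transactions, min_support=0.30)
for itemset, n in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

FP-growth should produce the same set of frequent itemsets; the comparison asked for in the question concerns how many candidates and database scans each method needs, which this sketch does not measure.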

6. Generate the FP-tree for the following transaction data set (minimum support = 30%).

Transaction ID  Items
1               E, A, D, B
2               D, A, C, E, B
3               C, A, B, E
4               B, A, D
5               D
6               D, B
7               A, D, E
8               B, C
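The tree construction can be sketched as follows, assuming the usual two-pass procedure: count item supports, then insert each transaction with its items sorted by descending support. Two details are assumptions of this sketch rather than part of the question: ties in support (B and D both occur 6 times) are broken alphabetically, and the 30% threshold is rounded up to a minimum count of 3 out of 8 transactions. The names Node and build_fp_tree are hypothetical.

```python
from collections import Counter
from math import ceil

transactions = [
    ["E", "A", "D", "B"],
    ["D", "A", "C", "E", "B"],
    ["C", "A", "B", "E"],
    ["B", "A", "D"],
    ["D"],
    ["D", "B"],
    ["A", "D", "E"],
    ["B", "C"],
]

class Node:
    def __init__(self, item):
        self.item = item
        self.count = 0
        self.children = {}

def build_fp_tree(transactions, min_support):
    """Pass 1 counts item supports; pass 2 inserts each transaction,
    keeping only frequent items in descending-support order."""
    min_count = ceil(min_support * len(transactions))
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i for i, c in counts.items() if c >= min_count}
    root = Node(None)
    for t in transactions:
        node = root
        # Assumption: ties in support are broken alphabetically.
        for item in sorted(set(t) & frequent, key=lambda i: (-counts[i], i)):
            if item not in node.children:
                node.children[item] = Node(item)
            node = node.children[item]
            node.count += 1
    return root, counts

def show(node, depth=0):
    """Print the tree with one indented 'item:count' line per node."""
    for child in node.children.values():
        print("  " * depth + f"{child.item}:{child.count}")
        show(child, depth + 1)

root, counts = build_fp_tree(transactions, min_support=0.30)
show(root)
```

A different tie-breaking rule between B and D gives a differently shaped but equally valid tree.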
