PAPER III – 1.
DATA WAREHOUSING AND MINING
UNIT - I
Data Warehousing Introduction – Definition – Architecture – Warehouse Schema –
Warehouse server – OLAP operations. Data Warehouse technology – Hardware and operating
system – Warehousing Software – Extraction tools – Transformation tools – Data quality tools –
Data loaders – Data Access and retrieval tools – Data Modeling tools – Fact tables and
dimensions. Data warehousing case studies: Data warehousing in Government, Tourism,
Industry, Genomics data. Information Retrieval – Introduction – Role of IR – Information
Retrieval systems – IR Application Areas – IR Algorithms – Retrieval algorithms – Filtering
algorithms – Indexing algorithms – Evaluation in Information Retrieval.
UNIT - II
Data Mining definition – DM Techniques – Current trends in data mining –
Different forms of Knowledge – Data selection, Cleaning, Integration, Transformation,
Reduction and Enrichment. Data: Types of data – Data Quality – Data Preprocessing – Measures
of similarity and dissimilarity. Exploration: Summary statistics – Visualization.
UNIT – III
Association rules: Introduction – Methods to discover association rules – Apriori
algorithm – Partition algorithm – Pincer search algorithm – Dynamic Itemset Counting algorithm –
FP-Tree growth algorithm. Classification: Decision Tree classification – Bayesian Classification –
Classification by Back Propagation.
UNIT – IV
Clustering Techniques: Introduction – Clustering Paradigms – Partitioning
Algorithms – K-Means and K-Medoid algorithms – CLARA – CLARANS – Hierarchical
clustering – DBSCAN – BIRCH – Categorical Clustering algorithms – STIRR – ROCK –
CACTUS. Introduction to machine learning – Supervised learning – Unsupervised learning –
Machine learning and data mining. Neural Networks: Introduction – Use of NN –
Working of NN. Genetic Algorithm: Introduction – Working of GA.
UNIT - V
Web Mining: Introduction – Web content mining – Web structure mining – Web usage
mining – Text mining – Text clustering, Temporal mining – Spatial mining – Visual data
mining – Knowledge mining – Case Studies using R and Python – Analysis and Forecasting of
House Price Indices, Customer Response Prediction and Profit Optimization, Predictive Modeling
of Big Data with Limited Memory, Twitter Information Diffusion.
REFERENCE BOOKS:
1. Charu C. Aggarwal, "Data Mining: The Textbook", Springer, 2015.
2. Han, Jiawei, Jian Pei, and Micheline Kamber, "Data Mining: Concepts and Techniques", 3rd
Edition, Elsevier, 2011.
3. Margaret H. Dunham, "Data Mining: Introductory and Advanced Topics", Pearson
Education, 2012.
4. Bing Liu, "Web Data Mining: Exploring Hyperlinks, Content, and Usage Data", 2nd
Edition, Springer, 2011.
5. Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, "Introduction to
Information Retrieval", Cambridge University Press, 2008.
6. Pang-Ning Tan, Michael Steinbach, Vipin Kumar, "Introduction to Data Mining", 2007.
7. Stefan Büttcher, Charles L. A. Clarke, Gordon V. Cormack, "Information Retrieval:
Implementing and Evaluating Search Engines", MIT Press, 2010.
M.Phil. / Ph.D. Computer Science (2018-19 onwards) Page 7 of 22
8. Yanchang Zhao, "R and Data Mining: Examples and Case Studies", Elsevier Publication,
2015.
9. Yanchang Zhao, Yonghua Cen, "Data Mining Applications with R", Elsevier Publication,
2013.
10. Robert Layton, "Learning Data Mining with Python", Packt Publishing, 2011.
11. Finn Årup Nielsen, "Data Mining with Python" (working draft), Technical University of
Denmark, 2017.