DATA WAREHOUSING AND DATA MINING L T P C
3 0 0 3
OBJECTIVES:
To understand data warehouse concepts, architecture, business analysis and tools.
To understand data pre-processing and data visualization techniques.
To study algorithms for finding hidden and interesting patterns in data.
To understand and apply various classification and clustering techniques using tools.
UNIT I DATA WAREHOUSING, BUSINESS ANALYSIS AND ON-LINE ANALYTICAL PROCESSING (OLAP) 9
Basic Concepts – Data Warehousing Components – Building a Data Warehouse – Database Architectures
for Parallel Processing – Parallel DBMS Vendors – Multidimensional Data Model – Data Warehouse
Schemas for Decision Support, Concept Hierarchies – Characteristics of OLAP Systems – Typical OLAP
Operations, OLAP and OLTP.
UNIT II DATA MINING – INTRODUCTION 9
Introduction to Data Mining Systems – Knowledge Discovery Process – Data Mining Techniques – Issues
– Applications – Data Objects and Attribute Types, Statistical Description of Data, Data Preprocessing –
Cleaning, Integration, Reduction, Transformation and Discretization, Data Visualization, Data Similarity
and Dissimilarity Measures.
UNIT III DATA MINING – FREQUENT PATTERN ANALYSIS 9
Mining Frequent Patterns, Associations and Correlations – Mining Methods – Pattern Evaluation Method
– Pattern Mining in Multilevel, Multidimensional Space – Constraint-Based Frequent Pattern Mining,
Classification using Frequent Patterns.
UNIT IV CLASSIFICATION AND CLUSTERING 9
Decision Tree Induction – Bayesian Classification – Rule-Based Classification – Classification by
Backpropagation – Support Vector Machines – Lazy Learners – Model Evaluation and Selection – Techniques
to Improve Classification Accuracy. Clustering Techniques – Cluster Analysis – Partitioning Methods –
Hierarchical Methods – Density-Based Methods – Grid-Based Methods – Evaluation of Clustering –
Clustering High-Dimensional Data – Clustering with Constraints, Outlier Analysis – Outlier Detection
Methods.
UNIT V WEKA TOOL 9
Datasets – Introduction, Iris plants database, Breast cancer database, Auto imports database –
Introduction to WEKA, The Explorer – Getting started, Exploring the explorer, Learning algorithms,
Clustering algorithms, Association-rule learners.
OUTCOMES:
Upon completion of the course, the students should be able to:
Design a data warehouse system and perform business analysis with OLAP tools.
Apply suitable pre-processing and visualization techniques for data analysis.
Apply frequent pattern and association rule mining techniques for data analysis.
Apply appropriate classification and clustering techniques for data analysis.
TEXT BOOK:
Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques", Third Edition, Elsevier,
2012.
REFERENCES:
Alex Berson and Stephen J. Smith, "Data Warehousing, Data Mining & OLAP", Tata McGraw-Hill
Edition, 35th Reprint 2016.
K.P. Soman, Shyam Diwakar and V. Ajay, "Insight into Data Mining: Theory and Practice", Eastern
Economy Edition, Prentice Hall of India, 2006.
Ian H. Witten and Eibe Frank, "Data Mining: Practical Machine Learning Tools and Techniques", Elsevier,
Second Edition.