DM N DW New

The document outlines the syllabus for the Data Warehousing and Data Mining course (C303) for the III B Tech I Semester, detailing course objectives, units of study, and specific topics covered in each unit. It includes information on course outcomes, program-specific outcomes, and a lecture plan with references and teaching aids. Additionally, it lists textbooks and reference materials to support the curriculum.


DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

SYLLABUS
Course Name: DATA WAREHOUSING AND DATA MINING Course Code: C303
Year/ Sem: III B TECH I SEM Regulation: R20
Admitted Batch: 2020-21 Academic Year: 2022-23

Course Objectives:

● Introduce the basic concepts and techniques of data warehousing and data mining.
● Examine the types of data to be mined and apply pre-processing methods to raw data.
● Understand and analyze supervised and unsupervised models.
● Understand the issues involved in discovering interesting patterns and their solutions.
● Understand various unsupervised models and estimate the accuracy of the algorithms.

SYLLABUS
UNIT - I
Data Warehousing and Online Analytical Processing: Data Warehouse: Basic concepts, Data
Warehouse Modeling: Data Cube and OLAP, Data Warehouse Design and Usage, Data Warehouse
Implementation, Introduction: Why and What is data mining, What kinds of data and patterns
can be mined, Which technologies are used, Which kinds of applications are targeted.

UNIT - II

Data Pre-processing: An Overview, Data Cleaning, Data Integration, Data Reduction, Data Transformation
and Data Discretization.

UNIT - III

Classification: Basic Concepts, General Approach to solving a classification problem, Decision Tree
Induction: Attribute Selection Measures, Tree Pruning, Scalability and Decision Tree Induction, Visual
Mining for Decision Tree Induction.

UNIT - IV
Association Analysis: Problem Definition, Frequent Itemset Generation, Rule Generation: Confidence-Based
Pruning, Rule Generation in Apriori Algorithm, Compact Representation of Frequent Itemsets,
FP-Growth Algorithm.

UNIT - V
Cluster Analysis: Overview, Basics and Importance of Cluster Analysis, Clustering techniques, Different
Types of Clusters; K-means: The Basic K-means Algorithm, K-means Additional Issues, Bisecting
K-means.

Text Books:
1. Data Mining: Concepts and Techniques, 3/e, Jiawei Han, Micheline Kamber, Jian Pei, Elsevier, 2011.
2. Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Pearson, 2012.
Reference Books:
1. Data Mining Techniques and Applications: An Introduction, Hongbo Du, Cengage Learning.
2. Data Mining, Vikram Pudi and P. Radha Krishna, Oxford Publisher.
3. Data Mining and Analysis: Fundamental Concepts and Algorithms, Mohammed J. Zaki, Wagner Meira Jr., Oxford.
4. Data Warehousing, Data Mining & OLAP, Alex Berson, Stephen Smith, TMH.
5. http://onlinecourses.nptel.ac.in/noc18_cs14/preview (NPTEL course by Prof. Pabitra Mitra)
6. http://onlinecourses.nptel.ac.in/noc17_mg24/preview (NPTEL course by Dr. Nandan Sudarshanam & Dr. Balaraman Ravindran)
7. http://www.saedsayad.com/data_mining_map.htm

COURSE COORDINATOR HEAD OF THE DEPARTMENT


CO-PO-PSO MAPPING
Course Name: DATA WAREHOUSING AND DATA MINING Course Code: C303
Year/ Sem: III YEAR - I SEM Regulation: R20
Admitted Batch: 2020-21 Academic Year: 2022-23
Course Coordinator: Mrs. K. S. Rupa
COURSE OUTCOMES
CO DESCRIPTION LEVEL
C303.1 Summarize the basic concepts of data mining. K2
C303.2 Describe various data pre-processing procedures and their application scenarios. K2
C303.3 Use decision trees to solve classification problems. K3
C303.4 Illustrate alternative classification techniques on data. K3
C303.5 Discuss association analysis on frequent itemsets. K3

PROGRAM SPECIFIC OUTCOMES


PSO1 Graduates exhibit knowledge of basic sciences and skills in engineering specializations such as information security, cloud computing, networking, software engineering and data analytics.
PSO2 Graduates can adapt to evolving technologies for the design and development of full-stack applications, using optimal programming skills.

COs     PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8  PO9  PO10  PO11  PO12  PSO1  PSO2
C303.1  3    3    2    2    3    -    -    -    -    -     3     2     -     -
C303.2  3    2    2    3    3    -    -    -    -    -     3     2     -     3
C303.3  3    2    2    2    3    -    -    -    -    -     3     2     -     -
C303.4  3    3    -    3    3    -    -    -    -    -     3     3     3     -
C303.5  3    3    2    3    3    -    -    -    -    -     3     3     -     3
AVG     3    2.6  2    2.6  3    -    -    -    -    -     3     2.4   3     3
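
The AVG row appears to be the mean of the non-blank entries in each column, with "-" cells skipped rather than counted as zero. That reading is inferred from the numbers and is not stated on the form. A minimal sketch of the arithmetic under that assumption, with the names `COLUMNS`, `MATRIX` and `column_averages` chosen here purely for illustration:

```python
# Mapping strengths copied from the CO-PO-PSO matrix; None stands for a "-" cell.
COLUMNS = ["PO1", "PO2", "PO3", "PO4", "PO5", "PO6", "PO7", "PO8", "PO9",
           "PO10", "PO11", "PO12", "PSO1", "PSO2"]

MATRIX = {
    "C303.1": [3, 3, 2, 2, 3, None, None, None, None, None, 3, 2, None, None],
    "C303.2": [3, 2, 2, 3, 3, None, None, None, None, None, 3, 2, None, 3],
    "C303.3": [3, 2, 2, 2, 3, None, None, None, None, None, 3, 2, None, None],
    "C303.4": [3, 3, None, 3, 3, None, None, None, None, None, 3, 3, 3, None],
    "C303.5": [3, 3, 2, 3, 3, None, None, None, None, None, 3, 3, None, 3],
}


def column_averages(matrix):
    """Return the mean of the non-blank entries per column (None if all blank)."""
    averages = []
    for values in zip(*matrix.values()):
        # Assumption: blank cells are excluded from the mean, not treated as 0.
        mapped = [v for v in values if v is not None]
        averages.append(round(sum(mapped) / len(mapped), 1) if mapped else None)
    return averages


AVERAGES = dict(zip(COLUMNS, column_averages(MATRIX)))

for name, avg in AVERAGES.items():
    print(f"{name}: {'-' if avg is None else avg}")
```

Under this assumption PO3 averages over the four outcomes that map to it, which is why its entry is 2 even though C303.4 is blank in that column.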

COURSE COORDINATOR HEAD OF THE DEPARTMENT

LECTURE PLAN
Course Name: DATA WAREHOUSING AND DATA MINING Course Code: C303
Year/ Sem: III B TECH I SEM Regulation: R20
Admitted Batch: 2020-21 Academic Year: 2022-23
Number of Lectures per week: 05
Course Coordinator: Mrs. K. Santoshi Rupa
Course handled: Section A - Mrs. M. Srividya
Course handled: Section B - Mrs. K. Santoshi Rupa
Course handled: Section C - Mrs. M. Srividya

Lecture Plan :

UNIT - I
Data Warehousing and Online Analytical Processing: Data Warehouse: Basic concepts, Data Warehouse
Modeling: Data Cube and OLAP, Data Warehouse Design and Usage, Data Warehouse Implementation,
Introduction: Why and What is data mining, What kinds of data and patterns can be mined,
Which technologies are used, Which kinds of applications are targeted.

Objective: To help students understand the basic concepts and techniques of data warehousing and data mining.

Session No. Topics to be covered Reference Teaching Aids
1. Data Warehouse: Basic concepts T1:125-134 CB
2. Data Warehouse Modeling T1:135-136 CB/PPT
3. Data cube T1:136-137 CB
4. Stars, Snowflakes, and Fact Constellations: Schemas for Multidimensional Data Models T1:13-141 PPT
5. OLAP Operations T1:146 CB
6. Data Warehouse Design and Usage T1:150-155 CB
7. Data Warehouse Implementation T1:156-163 CB
8. OLAP Architectures T1:164-165 CB/PPT
9. Why and What is data mining T1:1-4 CB
10. Web mining T1:4-6 CB
11. KDD in Ml T1:6-8 CB
12. What kinds of data need to be mined and what patterns can be mined T1:8-15 CB
13. What Kinds of Patterns Can Be Mined T1:15-21 CB
14. Which technologies are used T1:23-27 PPT
15. Which kinds of applications are targeted T1:27-28 CB

UNIT - II
Data Pre-processing: An Overview, Data Cleaning, Data Integration, Data Reduction, Data Transformation
and Data Discretization.

Objective: To understand types of the data to be mined and apply pre-processing methods on raw data.

Session No. Topics to be covered Reference Teaching Aids
16. Data Pre-processing: An Overview T1:84-85 CB
17. Major Tasks in Data Preprocessing T1:85-87 CB
18. Data Cleaning T1:88-91 CB

19. Data Cleaning as a Process T1:91-92 CB

20. Data Integration T1:93-94 CB

21. Redundancy and Correlation Analysis T1:94-98 CB


22. Tuple Duplication; Data Value Conflict Detection and Resolution T1:98-99 CB
23. Data Reduction T1:99-101 CB/PPT

24. Principal Components Analysis T1:102-103 CB


25. Attribute Subset Selection; Regression and Log-Linear Models: Parametric Data Reduction; Histograms; Clustering T1:103-108 CB
26. Sampling; Data Cube Aggregation T1:108-110 CB
27. Data Transformation and Data Discretization. T1:111-115 CB
28. Discretization by Histogram Analysis; Discretization by Cluster, Decision Tree T1:115-116 CB
29. Correlation Analyses; Concept Hierarchy Generation for Nominal Data T1:116-119 PPT

UNIT - III
Classification: Basic Concepts, General Approach to solving a classification problem, Decision Tree
Induction: Attribute Selection Measures, Tree Pruning, Scalability and Decision Tree Induction, Visual
Mining for Decision Tree Induction

Objective: To understand and analyze supervised and unsupervised models

Session No. Topics to be covered Reference Teaching Aids
30. Classification: Basic Concepts T1:327; T2:193-195 CB

31. General Approach to solving a classification problem T1:328-329 CB

32. Decision Tree Induction T1:330-335 PPT

33. Attribute Selection Measures T1:336-343 CB

34. Decision Tree Induction Example T1:336-343 CB

35. Tree Pruning T1:344-346 CB

36. Scalability and Decision Tree Induction T1:347-348 CB

37. Visual Mining for Decision Tree Induction T1:348-350 CB

38. Rule-Based Classification T1:355-358 CB

39. Nearest Neighbor Classifiers T2:208-210 CB

40. * ID3 Algorithm R1:192-194 PPT

41. * Metrics for Evaluating Classifier Performance T1:364-372 CB/PPT

42. * Techniques to Improve Classification Accuracy T1:377-382 CB/PPT

UNIT - IV
Association Analysis: Problem Definition, Frequent Itemset Generation, Rule Generation: Confidence-Based
Pruning, Rule Generation in Apriori Algorithm, Compact Representation of Frequent Itemsets,
FP-Growth Algorithm.

Objective: To understand the issues involved in discovering interesting patterns and their solutions.

Session No. Topics to be covered Reference Teaching Aids
43. Association Analysis: Problem Definition T2:358-359 CB
44. Frequent Itemset Generation T2:362-363 CB

45. The Apriori Principle; Frequent Itemset Generation in the Apriori Algorithm T2:363-367 CB

46. Candidate Generation and Pruning T2:368-372 CB

47. Support Counting T2:373-376 CB

48. Computational Complexity T2:377-379 CB

49. Rule Generation T2:380 PPT

50. Confidence-Based Pruning T2:380-381 CB

51. Rule Generation in Apriori Algorithm T2:381-382 CB

52. An Example: Congressional Voting Records T2:382-383 CB

53. Compact Representation of Frequent Itemsets T2:384-386 CB

54. FP-Growth Algorithm T2:393-397 CB

UNIT - V
Cluster Analysis: Overview, Basics and Importance of Cluster Analysis, Clustering techniques, Different
Types of Clusters; K-means: The Basic K-means Algorithm, K-means Additional Issues, Bisecting
K-means.

Objective: To understand various unsupervised models and estimate the accuracy of the algorithms.

Session No. Topics to be covered Reference Teaching Aids
55. Cluster Analysis: Overview T2:525-527 CB
56. Different Types of Clusterings T2:528-529 CB
57. Different Types of Clusters T2:529-530 CB

58. Different Types of Clusters T2:531-533 CB

59. K-means: The Basic K-means Algorithm T2:534-543 CB

60. K-means Additional Issues T2:544-546 CB

61. Bisecting K-means T2:547-548 PPT

62. K-means and Different Types of Clusters T2:548-549 CB

63. Strengths and Weaknesses; K-means as an Optimization Problem T2:549-553 CB
64. Hierarchical Methods T1:457-467 CB
65. * Agglomerative Hierarchical Clustering T2:554-564 CB

66. * DBSCAN T2:565-569 CB

* Session duration: 50 mins
* CB: Chalk & Board

TEXT BOOKS:
1. Data Mining: Concepts and Techniques, 3/e, Jiawei Han, Micheline Kamber, Jian Pei, Elsevier, 2011.
2. Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Pearson, 2012.

REFERENCE BOOKS:
1. Data Mining Techniques and Applications: An Introduction, Hongbo Du, Cengage Learning.
2. Data Mining, Vikram Pudi and P. Radha Krishna, Oxford Publisher.
3. Data Mining and Analysis: Fundamental Concepts and Algorithms, Mohammed J. Zaki, Wagner Meira Jr., Oxford.

e-Resources:
1. http://onlinecourses.nptel.ac.in/noc17_mg24/preview

COURSE COORDINATOR HEAD OF THE DEPARTMENT
