Syllabus M. Tech (Data Science & Artificial Intelligence)
                                           First Semester
    CS-5001              Fundamentals of Data Science                               L-T-P-C:3-0-0-3
Course objective:
     To provide a strong foundation in data science and its application areas, and to understand
      the underlying core concepts and emerging technologies in data science.
Unit 1
Introduction to core concepts and technologies: Introduction, Terminology, data science process, data
science toolkit, Types of data, Example applications, Mathematical Foundations for Data Science: linear
algebra; Analytical and numerical solutions of linear equations; Mathematical structures, concepts and
notations used in discrete mathematics. Introduction to Statistical Methods: basic and some advanced
concepts of probability and statistics; Concepts of statistics in solving problems arising in data science.
Unit 2
Data collection and management: Introduction, Sources of data, Data collection and APIs, Exploring and
fixing data, Data storage and management, Using multiple data sources.
Unit 3
Data analysis: Introduction, Terminology and concepts, Introduction to statistics, Central tendencies and
distributions, Variance, Distribution properties and arithmetic, Samples/CLT, Basic machine learning
algorithms, Linear regression, SVM, Naive Bayes.
Unit 4
Data visualization: Introduction, Types of data visualization, Data for visualization: Data types, Data
encodings, Retinal variables, mapping variables to encodings, Visual encodings.
Unit 5
Computer science and engineering applications: Data mining, Network protocols, analysis of Web traffic,
Computer security, Software engineering, Computer architecture, operating systems, distributed systems,
Bioinformatics, Machine learning
Unit 6
Applications of Data Science, Technologies for visualization, Bokeh (Python), recent trends in various data
collection and analysis techniques, various visualization techniques, application development methods
used in data science.
Course outcome:
         Explore the fundamental concepts of data science
         Understand data analysis techniques for applications handling large data
         Understand various machine learning algorithms used in data science process
         Visualize and present the inference using various tools.
         Learn to think through the ethics surrounding privacy, data sharing and algorithmic decision-making
  Text Book:
      1. Cathy O’Neil, Rachel Schutt, Doing Data Science, Straight Talk from The Frontline. O’Reilly,
         2013.
      2. Introducing Data Science, Davy Cielen, Arno D. B. Meysman, Mohamed Ali, Manning Publications
         Co., 1st edition, 2016
      3. An Introduction to Statistical Learning: with Applications in R, Gareth James, Daniela Witten,
         Trevor Hastie, Robert Tibshirani, Springer, 1st edition, 2013
  Reference Book:
      1. Jure Leskovec, Anand Rajaraman, Jeffrey Ullman, Mining of Massive Datasets. v2.1, Cambridge
         University Press, 2014.
      2. Data Science from Scratch: First Principles with Python, Joel Grus, O’Reilly, 1st edition, 2015.
      3. Doing Data Science, Straight Talk from the Frontline, Cathy O’Neil, Rachel Schutt, O’Reilly, 1st
         edition, 2013.
     CS 5003                     Advanced Artificial Intelligence                      L T P C:3 0 0 3
Course Objective:
   ● To learn the difference between optimal reasoning and human-like reasoning
   ● To understand the notions of state space representation, exhaustive search, heuristic search along with
       the time and space complexities
   ● To learn different knowledge representation techniques
   ● To understand the applications of AI: namely Game Playing, Theorem Proving, Expert Systems,
       Machine Learning and Natural Language Processing
   ● To be able to work in uncertain environments using probabilistic reasoning techniques.
Course content:
Unit 1
Introduction: What is AI?, History, Overview, Intelligent Agents, Performance Measure, Rationality,
Structure of Agents, Problem solving agents, Problem Formulation, Uninformed Search Strategies. Informed
(Heuristic) Search and Exploration, Greedy best first search, A* search, Memory bounded heuristic search,
Heuristic functions, inventing admissible heuristic functions, Local Search algorithms, Hill climbing,
Simulated Annealing, Genetic Algorithms, Online search
Unit 2
Constraint Satisfaction Problems, Backtracking Search, variable and value ordering, constraint propagation,
intelligent backtracking, local search for CSPs, Adversarial Search, Games, The minimax algorithm,
Alpha-Beta pruning, Imperfect Real Time Decisions, Games that include an Element of Chance
Unit 3
Knowledge Based Agents, Logic, Propositional Logic, Inference, Equivalence, Validity and Satisfiability,
Resolution, Forward and Backward Chaining, DPLL algorithm, Local search algorithms, First Order Logic,
Models for first order logic, Symbols and Interpretations, Terms, Atomic sentences, complex sentences,
Quantifiers, Inference in FOL, Unification and Lifting, Forward Chaining, Backward Chaining, Resolution
Unit 4
Planning, Language of planning problems, planning with state space search, forward and backward state space
search, Heuristics for state space search, partial order planning, planning graphs, planning with propositional
logic
Unit 5
Uncertainty, Handling uncertain knowledge, rational decisions, basics of probability, axioms of probability,
inference using full joint distributions, independence, Bayes’ Rule and conditional independence, Bayesian
networks, Semantics of Bayesian networks, Exact and Approximate inference in Bayesian Networks.
Course outcome:
    ●   Formulate problems so that exploratory search can be applied.
    ●   Implement optimal, heuristic and memory bounded search techniques.
    ●   Represent knowledge using formal logic and design algorithms to work in a semi observable
        environment using logical reasoning.
    ●   Design and develop practical algorithms for solving real life planning problems.
    ●   Implement probabilistic reasoning techniques to work in uncertain environments.
Text Book:
   1. Artificial Intelligence: A Modern Approach, Russell and Norvig, Pearson Education, 2nd edition
   2. Artificial Intelligence – A Practical Approach, Patterson, Tata McGraw Hill, 3rd edition
    HS 5001                       Research Methodology & IPR                            L T P C:3 0 0 3
Course objective:
● Present research methodology and the technique of defining a research problem.
● Learn the meaning of interpretation, techniques of interpretation, and precautions to be taken in
  interpretation during the research process.
● Application of statistical methods in research
● Learn about intellectual property rights and their constituents.
Course content:
Unit 1
Introduction to research, Definitions and characteristics of research, Types of Research, Research Process,
Problem definition, Objectives of Research, Research Questions, Research design, Quantitative vs.
Qualitative Approach, Building and Validating Theoretical Models, Exploratory vs. Confirmatory Research,
Experimental vs. Theoretical Research, Importance of reasoning in research.
Unit 2
Problem Formulation, Understanding Modeling & Simulation, Literature Review, Referencing, Information
Sources, Information Retrieval, Indexing and abstracting services, Citation indexes, Development of
Hypothesis, Measurement Systems Analysis, Error Propagation, Validity of experiments, Statistical Design
of Experiments, Data/Variable Types & Classification, Data collection, Numerical and Graphical Data
Analysis: Sampling, Observation, Interpretation of Results.
Unit 3
Statistics: Probability & Sampling distribution, Estimation, Measures of central tendency, Arithmetic mean,
Median, Mode, Standard deviation, Coefficient of variation (discrete series and continuous series),
Hypothesis testing & application, Correlation & regression analysis, Orthogonal array, ANOVA, Standard
error, Concept of point and interval estimation, Level of significance, Degrees of freedom, Analysis of
variance, One way and two way classified data, ‘F’ test.
Unit 4
Preparation of Dissertation and Research Papers, Tables and illustrations, Guidelines for writing the abstract,
introduction, methodology, results and discussion, conclusion sections of a manuscript. References, Citation
and listing system of documents.
Unit 5
Intellectual property rights (IPR): Patents, Copyrights, Trademarks, Industrial designs, Geographical
indications. Ethics of Research: Scientific Misconduct, Forms of Scientific Misconduct, Plagiarism,
Unscientific practices in thesis work, Ethics in science.
Course outcome:
● Design and formulate a research problem.
● Analyze research related information and statistical methods in research.
● Carry out a research problem individually using a sound scientific method.
● Understand the process of filing patent applications, patent search, and the various tools of IPR,
  copyright, and trademarks.
Text Book:
   1. K. S. Bordens and B. B. Abbott, “Research Design and Methods – A Process Approach”, 8th
       Edition, McGraw Hill, 2011
   2. C. R. Kothari, “Research Methodology – Methods and Techniques”, 2nd Edition, New Age
       International Publishers
   3. Douglas C. Montgomery & George C. Runger, Applied Statistics & Probability for Engineers, 3rd
      edition, 2007, Wiley
   4. Robert P. Merges, Peter S. Menell, Mark A. Lemley, “Intellectual Property in New Technological
      Age”. Aspen Law & Business; 6th edition July 2012
   5. A Beginners Guide to Latex, Chetan Shirore, 5 July 2015.
Reference Book:
   1. Michael P. Marder, “Research Methods for Science”, Cambridge University Press, 2011
   2. T. Ramappa, “Intellectual Property Rights Under WTO”, S. Chand, 2008.
   3. G. W. Snedecor and W. G. Cochran, Statistical Methods, Iowa State University Press, 1967.
   4. Davis, M., Davis K., and Dunagan M., “Scientific Papers and Presentations”, 3rd Edition, Elsevier
       Inc.
 CS 5101: Data Science LAB                                                                 L T P C:0 0 3 2
 List of Lab Assignments / Experiments:
     Python Programming Bootcamp, Data Preparation (Unix), Exploratory Data Analysis (Python, Pandas
     & matplotlib), Joining Multiple Tables (Python & Pandas), Classification and Regression (Python &
     Scikit), Map-Reduce (Python), Information Extraction from Text (Python/NLTK), Page Rank
     (Map-Reduce), Data Cleaning Task, Feature Extraction, Engineering and Clustering.
 CS 5103: Artificial Intelligence Lab                                                     L T P C:0 0 3 2
     AI search algorithms, planning, representational logic, probabilistic inference, machine learning, Markov
     processes, hidden Markov models (HMM) and filters, computer vision, robotics, and natural language
     processing.
                                           Second Semester
     CS 5002                  Data Mining and Data Warehousing                          L T P C:3 0 0 3
Course objective:
   ● To learn embedded system architecture.
   ● Study in detail process management and memory management.
   ● To learn Real Time Operating system principles and its components.
   ● Study in detail Linux kernel and Linux files systems.
   ● Study in detail device drivers.
Course Content:
   1. General Introduction of Warehousing: Historical Perspective, characteristics of data warehousing.
      Data Warehousing: its architecture, Logical design, Data Preprocessing, Data Cleaning methods,
      Descriptive Data Summarization, Data Reduction, Data Discretization and Concept hierarchy
      generation
   2. Multidimensional data model, Attribute oriented induction, Overview of ETL and OLAP,
      Comparison of OLAP and OLTP systems, Data mart. Data mining vs Database, Data Warehousing
      architecture and implementation, Data mining as a component of data warehouse.
   3. Data Mining Techniques: Basic concepts of Association Rule Mining, Frequent Item set mining,
      Mining various kinds of association rules, Classification by decision tree induction
   4. Bayesian Classification, Rule based Classification, Classification by Backpropagation, Associative
      Classification, Lazy Learners, Rough set approach, Clustering methods
   5. Data Objects and Attribute Types, Basic Statistical Descriptions of Data, Measuring Data Similarity
      and Dissimilarity, Partition based clustering, Hierarchical clustering, Density based clustering.
Course Outcome:
On completion of the course, student will be able to
   ● Understand formal machines, languages
   ● Understand stages in building a Data Warehouse
   ● Apply preprocessing techniques for data cleansing
   ● Analyse multidimensional modelling techniques
   ● Analyse and evaluate performance of algorithms for Association Rules
   ● Analyse Classification and Clustering algorithms
Text Book:
   1. Arun K. Pujari, Data Mining Techniques, University Press, 2001
   2. Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, Addison Wesley,
       2006.
   3. Paulraj Ponniah, Data Warehousing: Fundamentals for IT Professionals, Wiley.
Reference Book:
   1. Jiawei Han and M. Kamber, Data Mining: Concepts and Techniques, Second Edition, Elsevier
       Publication, 2011.
    CS 5004                      Advanced Machine Learning                            L T P C:3 0 0 3
Course objective:
   ● Focus on recent advances in deep learning with neural networks, such as recurrent and Bayesian
       neural networks.
   ● Concentrate especially on natural language processing (NLP) and computer vision
       applications.
   ● Introduce the mathematical definitions of the relevant machine learning models and derive their
       associated optimization algorithms.
   ● Cover a range of applications of neural networks in natural language processing, including
       analyzing latent dimensions in text, translating between languages, and answering questions.
Course content:
Unit 1
Introduction to Machine Learning, Examples of Machine Learning applications -
Learning associations, Classification, Regression, Unsupervised Learning, Reinforcement Learning.
Supervised learning- Input representation, Hypothesis class, Version space, Vapnik-Chervonenkis
(VC) Dimension.
Unit 2
Advanced machine learning topics: Bayesian modelling and Gaussian processes, randomized methods,
Bayesian neural networks, approximate inference.
Unit 3
Deep learning: regularization, convolutional neural networks, recurrent neural networks, variational
autoencoders, generative models, applications.
Unit 4
Applications of machine learning in natural language processing: recurrent neural networks, backpropagation
through time, long short term memory, attention networks, memory networks, neural Turing machines,
machine translation, question answering, speech recognition, syntactic and semantic parsing, GPU
optimization for neural networks.
Unit 5
Evaluation in ML: metrics, cross-validation, statistics, addressing the multiple comparisons problem.
Course outcome:
At end of the course, students will be able to:
    ● Understand the definition of a range of neural network models.
    ● Be able to derive and implement optimization algorithms for these models.
    ● Understand neural implementations of attention mechanisms and sequence embedding models and
        how these modular components can be combined to build state of the art NLP systems.
    ● Be able to implement and evaluate common neural network models for language.
    ● Have a good understanding of the two numerical approaches to learning (optimization and integration)
        and how they relate to the Bayesian approach.
    ● Have an understanding of how to choose a model to describe a particular type of data.
    ● Understand the mathematics necessary for constructing novel machine learning solutions.
    ● Be able to design and implement various machine learning algorithms in a range of real world
        applications.
Text Book:
   1. Kevin P. Murphy. Machine Learning: A Probabilistic Perspective. MIT Press 2012
   2. Ian Goodfellow, Yoshua Bengio and Aaron Courville. Deep Learning. MIT Press 2016
Reference Book:
   1. Bayesian Reasoning and Machine Learning, David Barber, Cambridge University Press, 2012.
    HS 5002                 Professional and Communication Skills                       L-T-P-C:2-1-0-3
   Course objectives:
         To enable students to develop effective Language and Communication Skills
         To enhance students’ Personal and Professional skills.
Course content:
Unit 1
Personal Interaction: Introducing Oneself-one’s career goals, Activity: SWOT Analysis, Interpersonal
Interaction: Interpersonal Communication with the team leader and colleagues at the workplace, Activity:
Role Plays/Mime/Skit, Social Interaction: Use of Social Media, Social Networking, gender challenges,
Activity: Creating LinkedIn profile, blogs.
Unit 2
Résumé Writing: Identifying job requirement and key skills, Activity: Prepare an Electronic Résumé,
Interview Skills: Placement/Job Interview, Group Discussions, Activity: Mock Interview and mock group
discussion.
Unit 3
Report Writing: Language and Mechanics of Writing, Study Skills: Note making, Interpreting skills:
Interpret data in tables and graphs, Activity: Transcoding.
Unit 4
Presentation Skills: Oral Presentation using Digital Tools, Activity: Oral presentation on the given topic
using appropriate non-verbal cues, Problem Solving Skills: Problem Solving & Conflict Resolution,
Activity: Case Analysis of a Challenging Scenario.
Text Book:
   1. Nitin Bhatnagar and Mamta Bhatnagar, Communicative English For Engineers And Professionals,
   2010, Dorling Kindersley (India) Pvt. Ltd
Reference Book:
   1. Jon Kirkman and Christopher Turk, Effective Writing: Improving Scientific, Technical and
   Business Communication, 2015, Routledge.
   2. Diana Bairaktarova and Michele Eodice, Creative Ways of Knowing in Engineering, 2017,
   Springer International Publishing.
   3. Clifford A Whitcomb & Leslie E Whitcomb, Effective Interpersonal and Team Communication
   Skills for Engineers, 2013, John Wiley & Sons, Inc., Hoboken: New Jersey.
   4. Arun Patil, Henk Eijkman & Ena Bhattacharya, New Media Communication Skills for Engineers
   and IT Professionals, 2012, IGI Global, Hershey PA.
CS 5102: Data Mining Lab                                                         L T P C: 0 0 3 2
   Build Data Warehouse and Explore WEKA, Perform data preprocessing tasks and Demonstrate
   performing association rule mining on data sets, Demonstrate performing classification on data sets,
   Demonstrate performing clustering on data sets, Demonstrate performing Regression on data sets.
   Beyond the Syllabus: Simple Project on Data Preprocessing
   CS 5106                    Machine Learning Lab                            L T P C:0 0 3 2
Exercises to solve the real-world problems using the following machine learning methods: Linear
Regression, Logistic Regression, Multi-Class Classification, Neural Networks, Support Vector Machines,
K-Means Clustering & PCA.
Develop programs to implement Anomaly Detection & Recommendation Systems.
Implement GPU computing models to solve some of the problems mentioned in Problem 1.
Text Book:                1. R in a Nutshell, 2nd Edition O'Reilly Media.
                          2. Kevin P. Murphy. Machine Learning: A Probabilistic Perspective. MIT
                              Press 2012
Reference Book:           1. Machine Learning, Tom M Mitchell.