RAMRAO ADIK INSTITUTE OF TECHNOLOGY, NERUL
DEPARTMENT OF COMPUTER ENGINEERING
Assignment-I
Subject: Big Data Analytics Academic Year: 2020-21
Class/Sem.: BE / VII Div.: A,B,C
Q.No. Question Marks CO BT
Q.1 Explain the roles of NameNode, DataNode, JobTracker and TaskTracker. Justify why HDFS is better suited to applications with large datasets than to those with many small files. 06m CO1 BT4
Q.2.a Write MapReduce pseudocode to multiply two matrices. Illustrate the procedure on the following matrices. Clearly show all the steps. 07m CO2 BT5
Q.2.b Evaluate the “shuffle” and “sort” operations in the MapReduce framework. Justify your answer with the help of an example that calculates average temperature. 08m CO2 BT4
Q.3 Distinguish between replication and sharding. Discuss the uses of a key-value store with a business use case, and state its weaknesses. 09m CO3 BT4
===================================================================
Course Outcomes (CO) Students will be able to:
CO1: Identify the key issues in big data management and use Hadoop framework for
resolving these issues.
CO2: Apply various tools and techniques for big data analytics, such as Hadoop, MapReduce
and PySpark.
CO3: Understand the NoSQL Data Architecture Patterns and apply various tools and
techniques for NoSQL, such as MongoDB, Cassandra, HBase and Hypertable.
CO4: Apply fundamental enabling techniques and scalable algorithms for stream
mining and frequent-itemset mining.
CO5: Interpret business models and scientific computing paradigms, and apply software tools
for big data analytics.
CO6: Analyze web links for relevant information retrieval and gain an adequate
perspective on big data analytics in various applications, such as recommender systems
and social media applications.
----------------------------------------------------------------------------------------------------------------
Bloom's Taxonomy
BT1- Remember, BT2- Understand, BT3- Apply, BT4- Analyze, BT5- Evaluate, BT6- Create
Subject Incharge DQA Member