Course Code: 19CS0523 R19
SIDDARTHA INSTITUTE OF SCIENCE AND TECHNOLOGY:: PUTTUR
(AUTONOMOUS)
Siddharth Nagar, Narayanavanam Road – 517583
QUESTION BANK (DESCRIPTIVE)
Subject with Code: BIG DATA ANALYTICS (19CS0523) Course & Branch: B.Tech - CSE
Regulation: R19 Year & Sem: III-B.Tech & II-Sem
UNIT –I
Introduction To Big Data And Hadoop
1 Discuss in detail the history of Hadoop. [L2][CO1] [12M]
2 a) Examine the different types of digital data with examples. [L4][CO1] [6M]
b) Discuss Big Data in terms of three dimensions, volume, variety and velocity. [L2][CO1] [6M]
3 Establish the evolution of the Hadoop ecosystem with a neat diagram. [L3][CO2] [12M]
4 Explain the difference between structured, unstructured and semi-structured data with examples. [L4][CO1] [12M]
5 a) List the top challenges facing big data. [L1][CO1] [6M]
b) What is the significance of big data analytics? [L1][CO1] [6M]
6 Distinguish between analysis of data through Unix tools and the Hadoop ecosystem. [L4][CO5] [12M]
7 a) What is big data analytics? Identify the classification of analytics. [L3][CO1] [6M]
b) Illustrate Hadoop streaming in detail. [L2][CO2] [6M]
8 a) What is BigSheets? What can be done with BigSheets? [L1][CO6] [6M]
b) Explain InfoSphere BigInsights in detail. [L2][CO6] [6M]
9 a) Discriminate Big Data in Healthcare, Transportation & Medicine. [L5][CO1] [6M]
b) Why are businesses using big data for competitive advantage? [L4][CO1] [6M]
10 a) How is the IBM Big Data Strategy implemented? [L2][CO1] [6M]
b) Generalize the list of tools related to Hadoop. [L6][CO2] [6M]
UNIT –II
HDFS (Hadoop Distributed File System)
1 Illustrate the HDFS concepts. [L3][CO2] [12M]
2 What are the advantages of Hadoop? Explain the Hadoop architecture and its components with a proper diagram. [L3][CO2] [12M]
3 Explain the block, name node and data node in the Hadoop file system. [L2][CO3] [12M]
4 Determine the basic commands in the Hadoop command line interface. [L3][CO5] [12M]
5 a) What is an interface? Establish the Hadoop system interfaces. [L3][CO2] [6M]
b) Discuss Hadoop Archives and their limitations. [L2][CO2] [6M]
6 Describe the file read and file write operations in HDFS. [L1][CO5] [12M]
7 a) Discuss the data ingest operations using Sqoop and Flume. [L2][CO2] [6M]
b) Differentiate the compression and serialization operations in Hadoop I/O. [L4][CO2] [6M]
8 Elaborate the Avro file format with a diagram. [L6][CO3] [12M]
9 a) What is data serialization? [L3][CO3] [4M]
b) Demonstrate the file-based data structures. [L2][CO2] [8M]
10 a) Analyze the features of Apache Hadoop. [L4][CO6] [6M]
b) How does Hadoop work? [L2][CO2] [6M]
UNIT –III
MapReduce
1 Examine the Anatomy of a MapReduce Job Run. [L4][CO4] [12M]
2 Construct the Classic MapReduce Job Run with a neat diagram. [L6][CO5] [12M]
3 Estimate the Significance of YARN over Classic MapReduce Job Run. [L5][CO3] [12M]
4 a) What are the different types of failures in Classic MapReduce? [L1][CO1] [6M]
b) What are the different types of failures in YARN? [L1][CO1] [6M]
5 a) Examine the different types of job scheduling in MapReduce. [L3][CO4] [6M]
b) Describe the Default MapReduce Job. [L3][CO4] [6M]
6 Describe the shuffle and sort operations on the map side and the reduce side. [L1][CO3] [12M]
7 a) What are the properties of the task execution environment? [L1][CO4] [6M]
b) Discuss speculative execution and its properties. [L2][CO4] [6M]
8 Categorize the different types of input formats in MapReduce. [L4][CO2] [12M]
9 Examine the different types of output formats in MapReduce. [L3][CO2] [12M]
10 Contrast the following features in MapReduce. [L4][CO3] [12M]
a) Counters b) Sorting c) Joins
UNIT –IV
Hadoop Ecosystem - Pig
1 a) Illustrate the concept of Grunt. [L3][CO2] [5M]
b) Why do we need Apache Pig? Identify the features of Pig. [L4][CO2] [7M]
2 What is Pig? How is Pig installed and executed on a Hadoop cluster? [L2][CO5] [12M]
3 a) Compare Pig with databases, with an example. [L5][CO3] [6M]
b) Evaluate the Expressions and types in Pig Latin. [L4][CO4] [6M]
4 Examine the different execution modes available in Pig. [L3][CO4] [12M]
5 Construct User Defined Functions in Pig Latin. [L6][CO5] [12M]
6 a) Explain the arithmetic operators in Pig Latin. [L2][CO3] [6M]
b) Find the grouping and joining operations on data in Pig Latin. [L3][CO3] [6M]
7 Examine the relational operators in Pig Latin. [L4][CO2] [12M]
8 Develop the schemas and functions in Pig Latin. [L3][CO5] [12M]
9 a) Explain the data types in Pig Latin. [L2][CO2] [6M]
b) Develop a program to calculate the maximum recorded temperature by year for the weather dataset in Pig Latin. [L6][CO5] [6M]
10 a) Discriminate the structures and statements in Pig Latin. [L4][CO1] [6M]
b) Evaluate Data Processing Operators in Pig Latin. [L5][CO4] [6M]
UNIT –V
Hive, HBase, Big SQL
1 Illustrate a Hive table with an example. [L3][CO5] [12M]
2 Discuss the Hive shell command line interface. [L2][CO5] [12M]
3 a) Draw a neat sketch of Hive architecture. [L3][CO2] [4M]
b) Explain the components of the Hive architecture. [L2][CO2] [8M]
4 a) Deduce the various services offered by Hive. [L4][CO4] [6M]
b) Examine the characteristics of HBase. [L4][CO1] [6M]
5 a) Infer the advantages of Hive over traditional databases. [L2][CO5] [6M]
b) What are the operators and functions in Hive? [L1][CO2] [6M]
6 a) Appraise the Hive query language. [L4][CO5] [6M]
b) Review the Metastore in Hive. [L2][CO5] [6M]
7 Differentiate HBase from RDBMS. [L4][CO1] [12M]
8 Explain with a neat diagram the architecture of HBase. [L2][CO2] [12M]
9 a) Categorize the joins in HiveQL. [L4][CO5] [6M]
b) Report the implementation of queries on sorting and aggregation of data in Hive. [L6][CO3] [6M]
10 a) Explain IBM Big SQL. [L2][CO6] [6M]
b) Assess how HBase is implemented at Streamy.com. [L4][CO6] [6M]
Prepared by:
Mr. R. Purushothaman, Associate Professor, CSE, SISTK.