KARPAGA VINAYAGA
COLLEGE OF ENGINEERING AND TECHNOLOGY
(Approved by AICTE, New Delhi, Affiliated to Anna University, Chennai and Accredited by NAAC)
GST Road, Chinna kolambakkam, Madhuranthagam Taluk, Chengalpattu District – 603 308, Tamil Nadu
Department of Artificial Intelligence and Data Science
Form No: KVCET/IQAC/coursefile /03
Question Bank
Course Code/Course Name: CCS334/ BIG DATA ANALYTICS
UNIT I UNDERSTANDING BIG DATA
Introduction to big data – convergence of key trends – unstructured data – industry examples of big data
– web analytics – big data applications – big data technologies – introduction to Hadoop – open
source technologies – cloud and big data – mobile business intelligence – crowd sourcing analytics –
inter and trans firewall analytics.
PART-A (2 Marks)
Q.No. Questions BT Level COs
1. Define big data. BT-1 CO1
2. List out the key trends in big data. BT-1 CO1
3. Define Web analytics. BT-1 CO1
4. Write any five industrial applications of big data. BT-2 CO1
5. List out the big data technologies. BT-1 CO1
6. Write down any two disadvantages of big data. BT-1 CO1
7. Write any four applications of big data. BT-1 CO1
8. State the difference between Big data and cloud. BT-1 CO1
9. State the difference between inter and trans firewall analytics. BT-2 CO1
10. Define unstructured data. Give an example. BT-1 CO1
Downloaded by Thiyagarajan (thiyagu.cse86@gmail.com)
PART-B (13 Marks)
Q.No. Questions Marks BT Level COs
1. Explain industry examples of big data in detail. 13 BT-1 CO1
2. Explain any three big data technologies in detail. 13 BT-2 CO1
3. Write brief notes on web analytics. 13 BT-2 CO1
4. Explain crowd sourcing analytics in detail. 13 BT-2 CO1
5. Explain mobile business intelligence with an example. 13 BT-2 CO1
6. Explain inter and trans firewall analytics in detail. 13 BT-2 CO1
PART-C (15 Marks)
1. Explain any five big data applications in detail, with an example. 15 BT-2 CO1
2. Explain web analytics in big data in detail. 15 BT-2 CO1
3. Explain industry examples of big data in detail. 15 BT-2 CO1
4. Explain mobile business intelligence with an example. 15 BT-2 CO1
UNIT II NOSQL DATA MANAGEMENT
Introduction to NoSQL – aggregate data models – key-value and document data models –
relationships – graph databases – schemaless databases – materialized views – distribution models –
master-slave replication – consistency – Cassandra – Cassandra data model – Cassandra examples –
Cassandra clients.
PART-A (2 Marks)
Q.No. Questions BT Level COs
1. Define NoSQL. BT-1 CO2
2. Define schemaless databases. BT-1 CO2
3. What is Cassandra? BT-1 CO2
4. List out the components of Cassandra. BT-1 CO2
5. Write the advantages of NoSQL. BT-1 CO2
6. List out the classification of NoSQL databases. BT-1 CO2
7. Write any two examples of key-value databases. BT-1 CO2
8. Write the syntax for a materialized view. BT-1 CO2
9. Define graph databases. BT-1 CO2
10. List out any four Cassandra examples. BT-1 CO2
11. Define Cassandra clients. BT-1 CO2
12. List out the key features of NoSQL. BT-1 CO2
PART-B (13 Marks)
Q.No. Questions Marks BT Level COs
1. Explain schemaless databases in detail. 13 BT-1 CO2
2. Explain Cassandra clients in detail, with an example. 13 BT-2 CO2
3. Explain master-slave replication in detail. 13 BT-3 CO2
4. Explain materialized views in NoSQL in detail. 13 BT-2 CO2
5. Explain graph databases in detail, with an example. 13 BT-3 CO2
6. Explain the aggregate data model in NoSQL databases in detail. 13 BT-1 CO2
PART-C (15 Marks)
S.No. Questions Marks BT Level COs
1. Define Cassandra. Explain with an example how Cassandra creates multiple copies of data in a database. 15 BT-2 CO2
2. Explain graph databases in detail, with a real-time example. 15 BT-4 CO2
3. Write briefly about the following: 15 BT-2 CO2
(i) Master-slave replication
(ii) Cassandra clients
UNIT III MAP REDUCE APPLICATIONS
MapReduce workflows – unit tests with MRUnit – test data and local tests – anatomy of
MapReduce job run – classic Map-reduce – YARN – failures in classic Map-reduce and YARN – job
scheduling – shuffle and sort – task execution – MapReduce types – input formats – output
formats.
PART-A (2 Marks)
Q.No. Questions BT Level COs
1. Define MapReduce. BT-1 CO3
2. List out the failures in classic MapReduce.
3. Define the term MRUnit. BT-2 CO3
4. List out the different types of job schedulers in Hadoop. BT-2 CO3
5. List out the types of MapReduce. BT-2 CO3
6. List out the features of YARN. BT-1 CO3
7. List out the main components of the YARN architecture. BT-1 CO3
8. Define the term YARN. BT-1 CO3
9. Write the advantages and disadvantages of MapReduce. BT-1 CO3
10. Write the applications of MapReduce. BT-2 CO3
PART-B (13 Marks)
Q.No. Questions Marks BT Level COs
1. Explain the YARN architecture in detail. 13 BT-2 CO3
2. Write about failures in classic MapReduce. 13 BT-2 CO3
3. Explain the anatomy of a MapReduce job run in detail. 13 BT-2 CO3
4. Explain classic MapReduce in detail. 13 BT-2 CO3
5. Discuss the advantages and disadvantages of MapReduce. 13 BT-2 CO3
6. Discuss the application workflow in the Hadoop YARN architecture in detail. 13 BT-2 CO3
PART-C (15 Marks)
S.No. Questions Marks BT Level COs
1. Explain Pig and Pig Latin scripts in detail, with an example. 15 BT-4 CO3
2. Explain in detail the various job schedulers used for job scheduling in YARN. 15 BT-2 CO3
3. Write briefly about shuffling and sorting in Hadoop MapReduce. 15 BT-2 CO3
4. Write briefly about MapReduce workflows. 15 BT-4 CO3
UNIT IV BASICS OF HADOOP
Data format – analyzing data with Hadoop – scaling out – Hadoop streaming –
Hadoop pipes – design of Hadoop distributed file system (HDFS) – HDFS concepts –
Java interface – data flow – Hadoop I/O – data integrity – compression – serialization –
Avro – file-based data structures – Cassandra – Hadoop integration.
PART-A (2 Marks)
Q.No. Questions BT Level COs
1. Define the term scaling out. BT-2 CO4
2. Define the terms name node and data node. BT-2 CO4
3. Write down the advantages of Hadoop. BT-2 CO4
4. List out the types of Hadoop file formats. BT-2 CO4
5. Define HDFS. BT-2 CO4
6. Define the terms name node and data node. BT-1 CO4
7. Write down the disadvantages of the Hadoop file system. BT-2 CO4
8. List out the types of Hadoop data formats. BT-1 CO4
9. Define serialization. BT-1 CO4
10. What is Cassandra and what are its uses? BT-2 CO4
PART-B (13 Marks)
Q.No. Questions Marks BT Level COs
1. Write briefly about the Hadoop distributed file system. 13 BT-1 CO4
2. Explain Hadoop pipes in detail. 13 BT-2 CO4
3. Explain serialization in Hadoop in detail. 13 BT-3 CO4
4. Explain Avro in detail, with an example. 13 BT-3 CO4
5. Write brief notes on Cassandra and its functions in big data. 13 BT-3 CO4
6. Explain HDFS concepts in Hadoop in detail. 13 BT-3 CO4
PART-C (15 Marks)
S.No. Questions Marks BT Level COs
1. Explain the Hadoop I/O system in detail. 15 BT-2 CO4
2. Explain the design of the Hadoop distributed file system (HDFS) in detail. 15 BT-4 CO4
3. Explain file-based data structures in detail. 15 BT-2 CO4
4. Explain HDFS concepts and the Java interface in detail. 15 BT-2 CO4
UNIT V HADOOP RELATED TOOLS
HBase – data model and implementations – HBase clients – HBase examples – praxis. Pig – Grunt – Pig
data model – Pig Latin – developing and testing Pig Latin scripts. Hive – data types and file formats –
HiveQL data definition – HiveQL data manipulation – HiveQL queries.
PART-A (2 Marks)
Q.No. Questions BT Level COs
1. Write any four applications of Pig Grunt in big data. BT-1 CO5
2. State the difference between RDBMS and Hive. BT-1 CO5
3. What is meant by HBase? BT-1 CO5
4. What is HiveQL? BT-1 CO5
5. Write any four applications of Pig Grunt in big data. BT-2 CO5
6. State the difference between RDBMS and Hive. BT-1 CO5
7. What is meant by HBase? BT-1 CO5
8. What is HiveQL? BT-1 CO5
PART-B (13 Marks)
Q.No. Questions Marks BT Level COs
1. Explain the HBase data model in detail. 13 BT-2 CO5
2. Explain the Pig data model in detail. 13 BT-2 CO5
3. Write briefly about Pig Latin scripts. 13 BT-2 CO5
4. Explain HiveQL queries in detail. 13 BT-2 CO5
5. Explain Hive data types and file formats in detail. 13 BT-2 CO5
6. Explain HBase clients with an example. 13 BT-2 CO5
PART-C (15 Marks)
S.No. Questions Marks BT Level COs
1. What is Pig? Analyze the Pig data model with an example. 15 BT-4 CO5
2. Write briefly about the following concepts with neat diagrams: 15 BT-2 CO5
(a) HBase clients
(b) Praxis
3. Explain the HBase data model and its implementations in detail, with an example. 15 BT-2 CO5