
Mid-2 Bits and Questions

Choose the best answer from the following:


1. Which of the following mobile analytical tools is used to test an app?
a) TestFlight b) Mobile App Tracking c) Apsalar d) Mixpanel
2. Point out the correct statement:
a) Hadoop is an ideal environment for extracting and transforming small volumes of
data
b) Hadoop stores data in HDFS and supports data compression/decompression
c) The Giraph framework is less useful than a MapReduce job for solving graph and
machine learning problems
d) None of the mentioned
3. What license is Hadoop distributed under?
a) Apache License 2.0 b) Mozilla Public License c) Shareware d) Commercial
4. Sun also has the Hadoop Live CD ________ project, which allows running a fully
functional Hadoop cluster using a live CD.
a) OpenOffice.org b) OpenSolaris c) GNU d) Linux
5. Which of the following genres does Hadoop produce?
a) Distributed file system b) JAX-RS c) Java Message Service
d) Relational Database Management System
6. What was Hadoop written in?
a) Java (software platform) b) Perl c) Java (programming language)
d) Lua (programming language)
7. Which of the following platforms does Hadoop run on?
a) Bare metal b) Debian c) Cross-platform d) Unix-like
8. Hadoop achieves reliability by replicating the data across multiple hosts, and hence does
not require ________ storage on hosts.
a) RAID b) Standard RAID levels c) ZFS d) Operating system
9. Above the file systems comes the ________ engine, which consists of one Job Tracker,
to which client applications submit MapReduce jobs.
a) MapReduce b) Google c) Functional programming d) Facebook
10. The Hadoop list includes the HBase database, the Apache Mahout ________ system,
and matrix operations.
a) Machine learning b) Pattern recognition c) Statistical classification d) Artificial
intelligence
11. As companies move past the experimental phase with Hadoop, many cite the need for
additional capabilities, including:
a) Improved data storage and information retrieval
b) Improved extract, transform and load features for data integration
c) Improved data warehousing functionality
d) Improved security, workload management and SQL support
12. Point out the correct statement:
a) Hadoop does need specialized hardware to process the data
b) Hadoop 2.0 allows live stream processing of real-time data
c) In the Hadoop programming framework, output files are divided into lines or records
d) None of the mentioned
13. According to analysts, for what can traditional IT systems provide a foundation when
they’re integrated with big data technologies like Hadoop?
a) Big data management and data mining
b) Data warehousing and business intelligence
c) Management of Hadoop clusters
d) Collecting and storing unstructured data
14. Hadoop is a framework that works with a variety of related tools. Common cohorts
include:
a) MapReduce, Hive and HBase b) MapReduce, MySQL and Google Apps
c) MapReduce, Hummer and Iguana d) MapReduce, Heron and Trumpet
15. Point out the wrong statement:
a) Hadoop's processing capabilities are huge and its real advantage lies in the ability to
process terabytes & petabytes of data
b) Hadoop uses a programming model called “MapReduce”; all programs should
conform to this model in order to work on the Hadoop platform
c) The programming model, MapReduce, used by Hadoop is difficult to write and test
d) All of the mentioned
16. Which of the following collectively represents a social network bound via specific sets
of social relationships?
a) Websites b) Big data c) People d) Analytical Tools
17. Which of the following is not performed by social media?
a) Participation b) Online Shopping c) Content Sharing d) Conversation
18. Which of the following text mining tools is used to extract who, what, where, when and
why facts?
a) Active Point b) Attensity c) Cross minder d) Conversation
19. Which of the following is a popular blogging website?
a) Facebook b) LinkedIn c) Twitter d) WordPress
20. Which of the following terms represents passive observation of social media activities?
a) Participation b) Engaging c) Listening d) Production

21. A ________ serves as the master and there is only one NameNode per cluster.
a) Data Node b) NameNode c) Data block d) Replication
22. Point out the correct statement:
a) DataNode is the slave/worker node and holds the user data in the form of Data Blocks
b) Each incoming file is broken into 32 MB by default
c) Data blocks are replicated across different nodes in the cluster to ensure a low degree
of fault tolerance
d) None of the mentioned
23. HDFS works in a __________ fashion.
a) master-worker b) master-slave c) worker/slave d) all of the mentioned
24. ________ NameNode is used when the Primary NameNode goes down.
a) Rack b) Data c) Secondary d) None of the mentioned
25. Point out the wrong statement:
a) Replication Factor can be configured at a cluster level (Default is set to 3) and also at a
file level
b) Block Report from each DataNode contains a list of all the blocks that are stored on
that DataNode
c) User data is stored on the local file system of DataNodes
d) DataNode is aware of the files to which the blocks stored on it belong
26. Which of the following scenarios may not be a good fit for HDFS?
a) HDFS is not suitable for scenarios requiring multiple/simultaneous writes to the same
file
b) HDFS is suitable for storing data related to applications requiring low latency data
access
c) HDFS is suitable for storing data related to applications requiring low latency data
access
d) None of the mentioned
27. The need for data replication can arise in various scenarios like:
a) Replication Factor is changed b) DataNode goes down
c) Data Blocks get corrupted d) All of the mentioned
28. ________ is the slave/worker node and holds the user data in the form of Data Blocks.
a) DataNode b) NameNode c) Data block d) Replication
29. HDFS provides a command line interface called __________ used to interact with HDFS.
a) “HDFS Shell” b) “FS Shell” c) “DFS Shell” d) None of the mentioned
30. HDFS is implemented in _____________ programming language.
a) C++ b) Java c) Scala d) None of the mentioned
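For reference on questions 29 and 30: HDFS is written in Java and, alongside the FS Shell, exposes a Java FileSystem API. Below is a minimal sketch assuming a cluster configured via core-site.xml; the class name and paths are invented purely for illustration.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsCopyExample {
    public static void main(String[] args) throws Exception {
        // Picks up fs.defaultFS from core-site.xml on the classpath
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical paths, used only for illustration
        Path local = new Path("/tmp/sample.txt");
        Path remote = new Path("/user/demo/sample.txt");

        // Copy the local file into HDFS, then read the first bytes back
        fs.copyFromLocalFile(local, remote);
        try (FSDataInputStream in = fs.open(remote)) {
            byte[] buffer = new byte[128];
            int read = in.read(buffer);
            System.out.println(new String(buffer, 0, Math.max(read, 0)));
        }
    }
}
```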
31. A ________ node acts as the Slave and is responsible for executing a Task assigned to it
by the Job Tracker.
a) MapReduce b) Mapper c) Task Tracker d) Job Tracker
32. Point out the correct statement:
a) MapReduce tries to place the data and the compute as close as possible
b) Map Task in MapReduce is performed using the Mapper() function
c) Reduce Task in MapReduce is performed using the Map() function
d) All of the mentioned
33. ___________ part of the MapReduce is responsible for processing one or more chunks of
data and producing the output results.
a) Map task b) Mapper c) Task execution d) All of the mentioned
34. _________ function is responsible for consolidating the results produced by each of the
Map() functions/tasks.
a) Reduce b) Map c) Reducer d) All of the mentioned
35. Point out the wrong statement:
a) A MapReduce job usually splits the input data-set into independent chunks which are
processed by the map tasks in a completely parallel manner
b) The MapReduce framework operates exclusively on <key, value> pairs
c) Applications typically implement the Mapper and Reducer interfaces to provide the
map and reduce methods
d) None of the mentioned
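For reference on questions 31–35: the classic word-count job shows how a Mapper emits <key, value> pairs and a Reducer consolidates them. This is a sketch in the style of the standard Apache example; input and output directories are taken from the command line.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Map task: emits a <word, 1> pair for every token in its input split
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce task: consolidates all counts emitted for the same word
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // Input and output directories are passed on the command line
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```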
36. Although the Hadoop framework is implemented in Java, MapReduce applications need
not be written in:
a) Java b) C c) C# d) None of the mentioned
37. ________ is a utility which allows users to create and run jobs with any executables as
the mapper and/or the reducer.
a) Hadoop Strdata b) Hadoop Streaming c) Hadoop Stream d) None of the mentioned
38. Which of the following mobile analytical tools will you use to work on all mobile
platforms for the measurement of user acquisition, engagement, and outcomes in native
mobile apps?
a) AdMob b) Bango c) Google Analytics d) Localytics
39. Which of the following location based mobile analytical tracking tools will you use to
work on all mobile platforms for the advanced geolocation functionality to mobile
devices running on iOS and Android?
a) Geoloqi b) Placed c) Geckoboard d) Mixpanel
40. Which of the following mobile analytical tools is used to test an app?
a) TestFlight b) Mobile App Tracking c) Apsalar d) Mixpanel
41. Which of the following is a popular blogging website?
a) Facebook b) LinkedIn c) Twitter d) WordPress

42. ________ is the architectural center of Hadoop that allows multiple data processing
engines.
a) YARN b) Hive c) Incubator d) Chukwa
43. Point out the correct statement:
a) YARN also extends the power of Hadoop to incumbent and new technologies found
within the data center
b) YARN is the central point of investment for Hortonworks within the Apache
community
c) YARN enhances a Hadoop compute cluster in many ways
d) All of the mentioned
44. YARN’s dynamic allocation of cluster resources improves utilization over more static
_______ rules used in early versions of Hadoop.
a) Hive b) MapReduce c) Impala d) All of the mentioned
45. The __________ is a framework-specific entity that negotiates resources from the
ResourceManager.
a) NodeManager b) ResourceManager c) ApplicationMaster d) All of the mentioned
46. Point out the wrong statement:
a) From the system perspective, the ApplicationMaster runs as a normal container.
b) The ResourceManager is the per-machine slave, which is responsible for launching the
applications’ containers
c) The NodeManager is the per-machine slave, which is responsible for launching the
applications’ containers, monitoring their resource usage
d) None of the mentioned
47. Apache Hadoop YARN stands for:
a) Yet Another Reserve Negotiator b) Yet Another Resource Network
c) Yet Another Resource Negotiator d) All of the mentioned
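For reference on questions 42–49: a rough sketch of how a client interacts with YARN. The client obtains an application id from the ResourceManager and submits an ApplicationMaster container spec; the ApplicationMaster then negotiates further containers on its own. The application name, queue, resource sizes, and launch command below are placeholders, not a runnable production job.

```java
import java.util.Collections;

import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.YarnClientApplication;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.util.Records;

public class YarnSubmitSketch {
    public static void main(String[] args) throws Exception {
        // The client talks to the ResourceManager, the cluster-wide authority
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(new YarnConfiguration());
        yarnClient.start();

        // Ask the ResourceManager for a new application id
        YarnClientApplication app = yarnClient.createApplication();
        ApplicationSubmissionContext ctx = app.getApplicationSubmissionContext();
        ctx.setApplicationName("demo-app");
        ctx.setQueue("default");

        // Describe the container that will run the ApplicationMaster;
        // the command below is only a placeholder for a real AM launcher
        ContainerLaunchContext amContainer = Records.newRecord(ContainerLaunchContext.class);
        amContainer.setCommands(Collections.singletonList("sleep 30"));
        ctx.setAMContainerSpec(amContainer);
        ctx.setResource(Resource.newInstance(512, 1));

        // The ResourceManager schedules the AM, which then negotiates
        // further containers from the ResourceManager on its own
        ApplicationId appId = yarnClient.submitApplication(ctx);
        System.out.println("Submitted application " + appId);
    }
}
```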
48. Define the Port Numbers for NameNode, Task Tracker and Job Tracker.
a) NameNode b) Task Tracker c) Job Tracker d) All of the above
49. The ____________ is the ultimate authority that arbitrates resources among all the
applications in the system.
a) NodeManager b) ResourceManager c) ApplicationMaster d) All of the mentioned
50. The __________ is responsible for allocating resources to the various running
applications subject to familiar constraints of capacities, queues etc.
a) Manager b) Master c) Scheduler d) None of the mentioned
51. The Capacity Scheduler supports _____________ queues to allow for more predictable
sharing of cluster resources.
a) Networked b) Hierarchical c) Partition d) None of the mentioned
52. Point out the wrong statement:
a) HBase provides only sequential access of data
b) HBase provides high latency batch processing
c) HBase internally provides serialized access
d) All of the mentioned
53. The _________ Server assigns regions to the region servers with the help of Apache
ZooKeeper.
a) Region b) Master c) ZooKeeper d) All of the mentioned
54. Which of the following commands provides information about the user?
a) status b) version c) whoami d) user
55. Which of the following commands does not operate on tables?
a) enabled b) disabled c) drop d) all of the mentioned
56. _________ command fetches the contents of a row or a cell.
a) select b) get c) put d) none of the mentioned
57. HBaseAdmin and ____________ are the two important classes in this package that
provide DDL functionalities.
a) HTableDescriptor b) HDescriptor c) HTable d) HTabDescriptor
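For reference on questions 53–57: a small sketch using the classic (pre-2.0) HBase Java client named in question 57, where HBaseAdmin and HTableDescriptor provide the DDL and Get mirrors the shell's get command. The table, column family, and values are invented for the example.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseDdlSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();

        // DDL: HBaseAdmin and HTableDescriptor create the table and its column family
        HBaseAdmin admin = new HBaseAdmin(conf);
        HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("demo_table"));
        desc.addFamily(new HColumnDescriptor("cf"));
        if (!admin.tableExists("demo_table")) {
            admin.createTable(desc);
        }

        // DML: put a cell, then fetch the row back with get
        HTable table = new HTable(conf, "demo_table");
        Put put = new Put(Bytes.toBytes("row1"));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("col1"), Bytes.toBytes("value1"));
        table.put(put);

        Result result = table.get(new Get(Bytes.toBytes("row1")));
        System.out.println(Bytes.toString(
                result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("col1"))));

        table.close();
        admin.close();
    }
}
```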
58. Which of the following mobile analytical tools is used to test an app?
a) TestFlight b) Mobile App Tracking c) Apsalar d) Mixpanel
59. Which of the following is an example of a mobile application?
a) Android b) HTML c) BlackBerry d) JavaScript
60. Localytics is an example of a ___________
a) Mobile device b) Mobile application c) Mobile Platform
d) Mobile analytics tool
Describe the working of the MapReduce algorithm with an example.

Discuss the role of HBase in big data processing.

Discuss Hadoop 2 YARN and its architecture in detail.

Describe the steps to perform text mining.

Discuss the role of HBase in big data processing.

Discuss the contrast between SQL and MapReduce.

List some common online tools used to perform sentiment analysis.

Discuss in detail the various mobile analytics tools.

Discuss some techniques to optimize MapReduce jobs.

Discuss the points you need to consider while designing a file system in MapReduce.

With an example, discuss the processing of data with Hadoop by the MapReduce Job Tracker.

Discuss in detail the types of applications for mobile analytics.
