Week - 4-1

Uploaded by

many many

Week -4

1. Which of the following statements about Bloom filters is true?

a. Bloom filters guarantee no false negatives
b. Bloom filters use cryptographic hashing functions
c. Bloom filters may produce false positives but no false negatives
d. Bloom filters are primarily used for sorting large datasets

Option a: Bloom filters guarantee no false negatives - Not the best answer. The statement is true as far as it goes, but it is incomplete: Bloom filters guarantee no false negatives (they never report an element as absent when it is present), yet they can produce false positives (reporting an element as present when it is not). Option c states both halves.

Option b: Bloom filters use cryptographic hashing functions - Incorrect. Cryptographic hash functions can be used, but they are not required. A Bloom filter needs several independent, well-distributed hash functions, and fast non-cryptographic hashes are usually sufficient.

Option c: Bloom filters may produce false positives but no false negatives -
Correct. This is the fundamental property of Bloom filters.

Option d: Bloom filters are primarily used for sorting large datasets - Incorrect.
Bloom filters are primarily used for approximate membership testing.
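
The property in option c can be seen in a minimal Python sketch. This is an illustration under stated assumptions, not a production design: the class name `BloomFilter`, the method `might_contain`, and the default sizes are hypothetical, and SHA-256 is used only because it ships with the standard library (as option b notes, a cryptographic hash is not required).

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter sketch: num_bits slots, num_hashes hash functions."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per slot for readability; a real filter would pack bits.
        self.bits = bytearray(num_bits)

    def _positions(self, item: str):
        # Derive num_hashes positions by salting one hash with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True means only "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))
```

Every added item sets all of its positions, so a later lookup for that item can never return False (no false negatives). A deliberately tiny filter such as `BloomFilter(num_bits=1, num_hashes=1)` shows the other half: after adding "a", `might_contain("b")` also returns True, which is a false positive.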

2. How does CAP theorem impact the design of distributed systems?

A) It emphasizes data accuracy over system availability

B) It requires trade-offs between consistency, availability, and partition tolerance

C) It prioritizes system performance over data security

D) It eliminates the need for fault tolerance measures

Option A: It emphasizes data accuracy over system availability - Incorrect. The CAP theorem does not prioritize data accuracy; rather, it highlights the trade-offs between consistency, availability, and partition tolerance in distributed systems.

Option B: It requires trade-offs between consistency, availability, and partition tolerance - Correct. The CAP theorem states that in the presence of a network partition, a distributed system can only guarantee either consistency or availability, but not both.

Option C: It prioritizes system performance over data security - Incorrect. The CAP theorem does not address performance or security; it focuses specifically on the consistency, availability, and partition tolerance trade-offs in distributed systems.

Option D: It eliminates the need for fault tolerance measures - Incorrect. The CAP theorem does not eliminate the need for fault tolerance; in fact, it highlights the challenges that arise in maintaining consistency and availability when partitions occur.
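
The trade-off described for option B can be sketched with a toy two-replica store in Python. Everything here is hypothetical and for illustration only: the names `TwoNodeStore` and `Replica`, the `"CP"`/`"AP"` mode strings, and the `partitioned` flag are assumptions, and real systems are far more nuanced.

```python
class Replica:
    """Holds one copy of a single value."""

    def __init__(self) -> None:
        self.value = None


class TwoNodeStore:
    """Toy two-replica register; mode is "CP" or "AP" (hypothetical labels)."""

    def __init__(self, mode: str) -> None:
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [Replica(), Replica()]
        self.partitioned = False  # True while the replicas cannot communicate

    def write(self, node: int, value) -> bool:
        """Return True if the write is accepted, False if it is refused."""
        if not self.partitioned:
            # Healthy network: update both replicas, so they stay consistent.
            for replica in self.replicas:
                replica.value = value
            return True
        if self.mode == "CP":
            # Partitioned and consistency-first: refuse rather than diverge.
            return False
        # Partitioned and availability-first: accept locally; the peer is stale.
        self.replicas[node].value = value
        return True

    def read(self, node: int):
        return self.replicas[node].value
```

During a partition the `"CP"` store rejects the write and both replicas still agree, while the `"AP"` store accepts it and the two replicas return different values until the partition heals.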

3. Which guarantee does the CAP theorem consider as mandatory for a distributed
system?
a. Consistency
b. Availability
c. Partition tolerance
d. Latency tolerance

Option a: Consistency - Incorrect. The CAP theorem states that it's impossible
to achieve all three guarantees (Consistency, Availability, and Partition tolerance)
simultaneously in a distributed system.

Option b: Availability - Incorrect. Availability is not guaranteed: during a partition, a system that chooses to stay consistent must give up availability.

Option c: Partition tolerance - Correct. The CAP theorem states that partition
tolerance is essential for distributed systems, as network partitions are inevitable.

Option d: Latency tolerance - Incorrect. Latency tolerance is not explicitly mentioned in the CAP theorem.

4. What consistency level in Apache Cassandra ensures that a write operation is
acknowledged only after the write has been successfully written to all replicas?
a. ONE
b. LOCAL_ONE
c. LOCAL_QUORUM
d. ALL

Option a: ONE - Incorrect. ONE requires only one replica to acknowledge the
write.

Option b: LOCAL_ONE - Incorrect. LOCAL_ONE requires one replica within the same datacenter to acknowledge the write.

Option c: LOCAL_QUORUM - Incorrect. LOCAL_QUORUM requires a quorum of replicas within the same datacenter to acknowledge the write.

Option d: ALL - Correct. ALL requires all replicas to acknowledge the write
before returning a response.
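
The difference between these levels comes down to how many replica acknowledgements a write waits for. The Python sketch below is a simplification, assuming a single-datacenter keyspace so that the LOCAL_ levels count the same as their global counterparts; the function name `required_acks` is hypothetical and is not part of any Cassandra driver.

```python
def required_acks(level: str, replication_factor: int) -> int:
    """Replica acknowledgements a write waits for (single-datacenter assumption)."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    # A quorum is a strict majority of the replicas.
    quorum = replication_factor // 2 + 1
    table = {
        "ONE": 1,
        "LOCAL_ONE": 1,
        "QUORUM": quorum,
        "LOCAL_QUORUM": quorum,
        "ALL": replication_factor,
    }
    if level not in table:
        raise ValueError(f"unsupported level in this sketch: {level}")
    return table[level]
```

With a replication factor of 3, ONE waits for 1 replica, LOCAL_QUORUM for 2, and ALL for all 3, which is why only ALL matches the question.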

5. How does Zookeeper contribute to maintaining consistency in distributed systems?

A) By managing data replication

B) By providing a centralized configuration service

C) By ensuring data encryption

D) By optimizing data storage

Option A: By managing data replication – Incorrect. Zookeeper’s role is more about coordination, not directly managing data replication.

Option B: By providing a centralized configuration service – Correct. Zookeeper contributes to maintaining consistency in distributed systems by providing a centralized coordination and configuration service, ensuring consistent synchronization across distributed nodes.

Option C: By ensuring data encryption – Incorrect. Zookeeper doesn't handle encryption.

Option D: By optimizing data storage – Incorrect. Zookeeper is not involved in optimizing data storage.

Explanation:

Centralized Configuration: ZooKeeper acts as a centralized service where distributed applications can store and retrieve configuration information. This helps in ensuring that all nodes in the distributed system have consistent configuration settings, reducing the chances of configuration mismatches or inconsistencies.

6. A ___________ server is a machine that keeps a copy of the state of the entire
system and persists this information in local log files.
a) Master
b) Region
c) Zookeeper
d) All of the mentioned

Option A: Master – Incorrect. The master server may manage parts of the
system but does not persist the full system state in local logs.

Option B: Region – Incorrect. A region server manages a subset of data, but it doesn’t maintain the full system state or persist it in logs.

Option C: Zookeeper – Correct. Zookeeper maintains a consistent view of the system’s state and stores this information in local log files for fault tolerance and recovery.

Option D: All of the mentioned – Incorrect. Only Zookeeper is responsible for persisting the state of the entire system in local logs.

Explanation:

Master Server: A master server typically coordinates tasks within a cluster but doesn't necessarily store the entire system state.

Region Server: This term is often used in the context of distributed databases like HBase, where region servers manage specific data partitions. They wouldn't hold the entire system state.

Zookeeper: Zookeeper is a centralized service that coordinates and manages distributed systems. It keeps a copy of the system's state and persists this information in local log files. This allows it to provide services such as naming, configuration management, and synchronization.

While a Master node might also have some state information, its primary role is
often different, such as coordinating tasks or managing data. A Region node is
typically a unit within a larger distributed system, and its role might involve
managing specific data or tasks.

7. What is Apache Zookeeper primarily used for in Big Data ecosystems?

A) Data storage

B) Data processing

C) Configuration management

D) Data visualization

Option A: Data storage – Incorrect. Zookeeper is not designed for storing large
amounts of data; its main purpose is coordination, not data storage.

Option B: Data processing – Incorrect. Zookeeper does not process data; it provides coordination services for distributed systems.

Option C: Configuration management – Correct. Zookeeper is primarily used for configuration management, leader election, and synchronization in distributed systems within Big Data ecosystems.

Option D: Data visualization – Incorrect. Zookeeper has no role in data visualization. Its function is more about system coordination and management.

8. Which statement correctly describes CQL (Cassandra Query Language)?
a. CQL is a SQL-like language used for querying relational databases
b. CQL is a procedural programming language used for writing stored procedures in
Cassandra
c. CQL is a language used for creating and managing tables and querying data in
Apache Cassandra
d. CQL is a scripting language used for data transformation tasks in Cassandra

Option A: CQL is a SQL-like language used for querying relational databases – Incorrect. While CQL is SQL-like, Cassandra is a NoSQL database, not a relational database.

Option B: CQL is a procedural programming language used for writing stored procedures in Cassandra – Incorrect. CQL is not a procedural language, nor is it used for writing stored procedures.

Option C: CQL is a language used for creating and managing tables and querying data in Apache Cassandra – Correct. CQL is the language used in Cassandra for creating and managing tables and for querying data.

Option D: CQL is a scripting language used for data transformation tasks in Cassandra – Incorrect. CQL is not a scripting language and is not designed for data transformation tasks.

9. Which aspect of CAP theorem refers to a system's ability to continue operating despite network failures?

A) Consistency

B) Accessibility

C) Partition tolerance

D) Atomicity

Option A: Consistency – Incorrect. Consistency refers to ensuring that all nodes see the same data at the same time, not handling network failures.

Option B: Accessibility – Incorrect. Accessibility is not one of the CAP properties. The "A" in CAP stands for availability, which refers to the system's ability to respond to requests and does not specifically address network partitioning.

Option C: Partition tolerance – Correct. Partition tolerance refers to the system's ability to continue functioning even when network failures or partitions occur.

Option D: Atomicity – Incorrect. Atomicity is a concept related to transactions, ensuring that operations are either fully completed or not performed at all; it is not related to network failures.

10. Why are tombstones used in distributed databases like Apache Cassandra?

a. To mark nodes that are temporarily unavailable
b. To mark data that is stored in multiple replicas
c. To mark data that has been logically deleted
d. To mark data that is actively being updated

Option a: To mark nodes that are temporarily unavailable - Incorrect. Tombstones are not used to mark unavailable nodes.

Option b: To mark data that is stored in multiple replicas - Incorrect. Tombstones are not used to mark data replication.

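Option c: To mark data that has been logically deleted - Correct. A tombstone is a marker written in place of deleted data. It is kept for a grace period (gc_grace_seconds in Cassandra) so that the delete can propagate to every replica; without it, a replica that missed the delete could bring the data back during repair. Tombstones are purged by compaction once the grace period has passed.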

Option d: To mark data that is actively being updated - Incorrect. Tombstones are not used to mark data that is being updated.
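
Why a logical delete needs a marker at all can be sketched in Python with a last-write-wins merge between two replicas. This is a simplified model, not Cassandra's storage engine: the names `Cell`, `delete`, `merge`, and `read` are hypothetical, and breaking timestamp ties in favor of the tombstone is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    """One version of a value; tombstone=True marks a logical delete."""
    value: Optional[str]
    timestamp: int
    tombstone: bool = False


def delete(timestamp: int) -> Cell:
    # A delete is stored as a new version, not as the absence of data.
    return Cell(value=None, timestamp=timestamp, tombstone=True)


def merge(a: Cell, b: Cell) -> Cell:
    # Last write wins; on equal timestamps this sketch prefers the tombstone.
    return max(a, b, key=lambda c: (c.timestamp, c.tombstone))


def read(cell: Cell) -> Optional[str]:
    # Readers see a tombstoned cell as missing data.
    return None if cell.tombstone else cell.value
```

If one replica holds `Cell("v1", 1)` and another holds `delete(2)`, merging them keeps the tombstone and `read` returns None. Had the second replica simply dropped the row instead of writing a tombstone, the older value would be the only version left and the deleted data would reappear.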
