1) Write a Short Note on NoSQL
NoSQL ("Not Only SQL") refers to a class of database management systems that do not follow the
traditional relational (table-based) structure. These systems offer flexible schema design, horizontal
scalability, and high performance, making them suitable for handling unstructured and
semi-structured data. NoSQL databases are primarily used in big data applications, real-time web
apps, and distributed systems.
Key Features:
Schema-less
High scalability
Supports unstructured, semi-structured, and structured data
Often open-source
Suitable for distributed architectures
2) Explain CAP Theorem in Detail
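The CAP Theorem, proposed by Eric Brewer, states that a distributed database system can
provide at most two of the following three guarantees simultaneously:
Consistency (C): Every read receives the most recent write or an error.
Availability (A): Every request receives a (non-error) response, though not necessarily the most recent write.
Partition Tolerance (P): The system continues to operate despite network partitions (lost or delayed messages between nodes).
Explanation:
CA (Consistency + Availability): Not partition tolerant; realistic only when partitions cannot occur, such as on a single node.
CP (Consistency + Partition Tolerance): May sacrifice availability by rejecting requests during a partition.
AP (Availability + Partition Tolerance): Keeps responding during a partition but may return stale data.
Because network partitions cannot be ruled out in a distributed system, the practical choice during a partition is between consistency and availability.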
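The CP and AP trade-off can be illustrated with a toy model. This is only a sketch under stated assumptions: `TinyCluster`, its two replicas, and the `heal` method are hypothetical names invented for illustration and do not correspond to any real database API.

```python
class TinyCluster:
    """Two replicas, a and b, behind one toy API (hypothetical model)."""

    def __init__(self, mode):
        self.mode = mode            # "CP" or "AP"
        self.a = None               # value stored on replica a
        self.b = None               # value stored on replica b
        self.partitioned = False    # True while a and b cannot communicate

    def write(self, value):
        """Write through replica a, replicating to b when the link is up."""
        if self.partitioned and self.mode == "CP":
            # CP choice: refuse the write rather than let replicas diverge.
            raise RuntimeError("unavailable during partition")
        self.a = value
        if not self.partitioned:
            self.b = value

    def read_from_b(self):
        """Read from replica b, which may be cut off from a."""
        if self.partitioned and self.mode == "CP":
            # CP choice: b cannot prove its copy is current, so it errors.
            raise RuntimeError("unavailable during partition")
        return self.b  # AP choice: always answer, possibly with stale data

    def heal(self):
        """End the partition and bring b up to date with a."""
        self.partitioned = False
        self.b = self.a


ap = TinyCluster("AP")
ap.write("v1")
ap.partitioned = True
ap.write("v2")                 # accepted by replica a only
stale = ap.read_from_b()       # replica b still holds the old value
ap.heal()
fresh = ap.read_from_b()       # replica b has caught up
print(stale, fresh)
```

In "CP" mode the same sequence raises an error at the second write, which is the availability sacrifice described above.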
3) Write a Short Note on Advantages and Disadvantages of NoSQL
Advantages:
High scalability
Flexible data model
Better performance for large volumes of data
Easily handles big data and real-time web apps
Disadvantages:
Lack of standardization
Limited support for complex queries
No support for multi-row transactions in many cases
Requires more development effort to maintain consistency
4) Give Comparison Between SQL and NoSQL
Feature         | SQL (Relational DB) | NoSQL
Data Model      | Relational (tables) | Document, Key-Value, Column-Family, Graph
Schema          | Fixed               | Dynamic
Scalability     | Mainly vertical     | Mainly horizontal
Query Language  | SQL                 | Varies (no standard)
ACID Compliance | Full                | Partial or BASE
Data Type       | Structured          | Semi-structured/Unstructured
5) Explain BASE Model of NoSQL
BASE stands for:
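Basically Available: The system stays available for reads and writes even when some nodes fail, though responses may be partial or stale.
Soft state: State of the system may change over time, even without new input, as replicas exchange updates.
Eventual consistency: If no new updates arrive, all replicas will converge to the same value over time.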
BASE is an alternative to the ACID model in traditional databases and is more suitable for
distributed systems.
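Eventual consistency can be sketched with three in-memory replicas that reconcile using a last-write-wins rule. This is a simplified illustration, not how any particular database does it; the `merge` and `anti_entropy_round` helpers and the integer timestamps are assumptions made for the example.

```python
def merge(target, source):
    """Copy entries from source that are newer than target's (last-write-wins)."""
    for key, (timestamp, value) in source.items():
        if key not in target or target[key][0] < timestamp:
            target[key] = (timestamp, value)


def anti_entropy_round(replicas):
    """Let every replica pull newer entries from every other replica."""
    for i, replica in enumerate(replicas):
        for j, other in enumerate(replicas):
            if i != j:
                merge(replica, other)


# Each replica maps key -> (timestamp, value).
replicas = [{}, {}, {}]

# Two clients write to different replicas; the replicas now disagree (soft state).
replicas[0]["cart"] = (1, ["book"])
replicas[2]["cart"] = (2, ["book", "pen"])
before = [r.get("cart") for r in replicas]

# After a synchronization round with no new writes, all replicas agree.
anti_entropy_round(replicas)
after = [r.get("cart") for r in replicas]
print(before)
print(after)
```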
6) Give All Comparison of ACID and BASE
Feature      | ACID (SQL)                | BASE (NoSQL)
Focus        | Consistency               | Availability
State        | Strong consistency        | Eventual consistency
Transactions | Reliable and isolated     | Flexible and available
Performance  | Slower due to consistency | High due to relaxation of rules
Suitable For | Banking, ERP              | Social media, IoT, big data
7) Explain Various Technologies Used in Big Data
Hadoop: Distributed storage and processing framework.
Spark: In-memory data processing engine.
Hive: SQL-like queries on large datasets.
Pig: Data flow language for Hadoop.
HBase: Column-oriented NoSQL database.
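Flume/Sqoop: Data ingestion tools (Flume for log and event streams, Sqoop for bulk transfer between Hadoop and relational databases).
Kafka: Distributed event streaming and messaging platform.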
MongoDB: Document-oriented NoSQL database.
Cassandra: Distributed NoSQL database optimized for scalability.
8) Explain Categories of NoSQL Databases
1. Key-Value Stores: Store data as key-value pairs (e.g., Redis, Riak).
2. Document Stores: Use documents (JSON/XML) to store data (e.g., MongoDB).
3. Column Stores (Wide-Column): Group related columns into column families under a row key instead of storing fixed rows (e.g., Cassandra, HBase).
4. Graph Databases: Store data in nodes and edges (e.g., Neo4j).
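The four categories differ mainly in the shape of the stored data. The sketch below models one user in each style using plain Python structures; the field names and the `followed_by` helper are hypothetical, and real products add indexing, persistence, and query languages on top.

```python
import json

# Key-value: an opaque value looked up by a single key.
kv_store = {"user:42": json.dumps({"name": "Asha", "city": "Pune"})}

# Document: a nested, self-describing record that can be queried by field.
document = {
    "_id": 42,
    "name": "Asha",
    "address": {"city": "Pune"},
    "follows": [7],
}

# Wide-column: row key -> column family -> column -> value.
wide_column = {
    "42": {
        "profile": {"name": "Asha", "city": "Pune"},
        "activity": {"last_login": "2024-01-01"},
    }
}

# Graph: nodes plus labeled edges between them.
nodes = {42: {"name": "Asha"}, 7: {"name": "Ravi"}}
edges = [(42, "FOLLOWS", 7)]


def followed_by(user_id):
    """Return the names of users that user_id follows by walking the edges."""
    return [
        nodes[dst]["name"]
        for src, label, dst in edges
        if src == user_id and label == "FOLLOWS"
    ]


print(json.loads(kv_store["user:42"])["city"])
print(document["address"]["city"])
print(wide_column["42"]["profile"]["city"])
print(followed_by(42))
```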
9) Explain NoSQL with Its Features
NoSQL is designed for:
High availability
Horizontal scalability
Flexible schema
Handling large volumes of data
Key Features:
Schema-free design
Fast read/write operations
Supports big data and real-time apps
Optimized for distributed computing
10) Explain Brief History of NoSQL
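1998: Carlo Strozzi used the name "NoSQL" for a lightweight relational database that did not expose a SQL interface.
2009: Johan Oskarsson reintroduced the term for a meetup on open-source, distributed, non-relational databases.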
Growth of social media, big data, and distributed computing led to the rise of NoSQL.
Companies like Google, Amazon, and Facebook drove NoSQL innovation.
11) Explain Need of NoSQL and Features of Its Databases
Need:
Traditional RDBMSs are difficult and costly to scale horizontally.
Unstructured data needs flexibility.
Real-time and big data applications demand high performance.
Features:
Schema-less
Horizontal scalability
High availability
Support for distributed architecture
12) Explain Following Terms:
a) Key-Value Based Database:
Stores data as key-value pairs.
Fast and scalable.
Example: Redis, DynamoDB.
b) Column-Based Database:
Stores data grouped into column families under a row key, rather than as fixed rows.
Efficient for queries involving large datasets.
Example: Cassandra, HBase.
c) Graph-Based Database:
Stores data as nodes and edges.
Suitable for complex relationships.
Example: Neo4j, ArangoDB.
Unit 2
1) Explain MongoDB with Its Features
MongoDB is an open-source, document-oriented NoSQL database designed for high performance,
high availability, and easy scalability. It stores data in flexible, JSON-like documents, making it easier
to work with hierarchical and complex data structures.
Features:
Document-Oriented Storage: Data is stored as BSON (Binary JSON) documents.
Schema-Less: No predefined structure; allows flexibility in data types.
Horizontal Scalability: Supports sharding for distributed data storage.
Indexing: Supports primary, secondary, and compound indexing for faster queries.
Aggregation Framework: Allows complex data processing and transformation.
Replication: Supports replica sets for high availability and fault tolerance.
Support for Geospatial Queries: Useful in location-based applications.
Flexible Data Model: Easy mapping between application objects and database documents.
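The idea behind the aggregation framework is a pipeline of stages, each transforming the documents passed on by the previous one. The sketch below imitates a filter stage followed by a grouping stage in plain Python; the `match` and `group_sum` functions and the `orders` data are made up for illustration and are not MongoDB's API.

```python
from collections import defaultdict

orders = [
    {"customer": "A", "status": "paid", "amount": 30},
    {"customer": "B", "status": "paid", "amount": 20},
    {"customer": "A", "status": "cancelled", "amount": 99},
    {"customer": "A", "status": "paid", "amount": 10},
]


def match(docs, **criteria):
    """Filter stage: keep documents whose fields equal the given values."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


def group_sum(docs, key, field):
    """Grouping stage: sum `field` for each distinct value of `key`."""
    totals = defaultdict(int)
    for d in docs:
        totals[d[key]] += d[field]
    return dict(totals)


# Pipeline: filter paid orders, then total the amount per customer.
totals = group_sum(match(orders, status="paid"), "customer", "amount")
print(totals)
```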
2) Explain Following Terms
a) Speed:
Refers to the rate at which a database can process read and write operations. MongoDB supports
fast operations by caching frequently accessed data in memory, by indexing, and by storing related data together in a single document.
b) Scalability:
Ability of a system to handle a growing amount of data and requests. MongoDB achieves this via
sharding (horizontal scaling), where data is split across multiple servers.
c) Agility:
Database’s ability to adapt quickly to changing requirements. MongoDB’s schema-less design
enables rapid development and easier adjustments in data structure.
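Sharding, mentioned under Scalability, can be sketched as routing each document to a server based on a hash of its shard key. This is a minimal illustration, assuming a fixed number of shards and SHA-256 as the hash; `shard_for` and `distribute` are hypothetical helpers, and MongoDB's own hashed sharding uses its own hash function and chunk management.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a shard key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def distribute(keys, num_shards):
    """Place every key on the shard chosen by shard_for."""
    shards = [[] for _ in range(num_shards)]
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards


user_keys = [f"user:{i}" for i in range(1000)]
shards = distribute(user_keys, 3)

# Each shard holds only part of the data, so load is spread across servers.
print([len(s) for s in shards])
```

A simple modulo scheme like this reshuffles most keys when the shard count changes, which is one reason production systems use more elaborate placement.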
3) What is Non-Relational Approach?
A non-relational approach refers to databases that don’t use the traditional table-based relational
model. Instead, they use various data models like key-value, document, graph, or columnar.
Characteristics:
No fixed schema
Suited for large-scale data storage
Enables rapid development
Flexible data types and structures
Ideal for unstructured or semi-structured data
4) Explain JSON-Based Document and Its Advantages
JSON Document:
JavaScript Object Notation (JSON) is a lightweight data-interchange format that uses key-value pairs.
In MongoDB:
Documents are stored in BSON, an extended form of JSON.
Advantages:
Human-readable format
Easy to map with application-level data structures
Supports nested documents and arrays
Schema flexibility for fast iteration
Ideal for web and mobile applications
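A short example of the mapping between application data and a JSON document, using Python's standard `json` module. The `post` record and its fields are invented for illustration.

```python
import json

# An application-level structure with a nested document and arrays.
post = {
    "title": "Intro to NoSQL",
    "author": {"name": "Asha", "email": "asha@example.com"},
    "tags": ["nosql", "mongodb"],
    "comments": [{"user": "Ravi", "text": "Helpful"}],
}

# Serialize to human-readable JSON text, then parse it back.
text = json.dumps(post, indent=2)
restored = json.loads(text)

print(text)
print(restored["author"]["name"], restored["comments"][0]["user"])
```

The round trip preserves the nesting, which is why such documents map easily onto objects in application code.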
5) Difference Between Performance vs Features
Aspect    | Performance                              | Features
Focus     | Speed and efficiency                     | Functionality and capabilities
Goal      | Minimize latency and maximize throughput | Provide rich tools, APIs, and operations
Trade-off | May reduce features for speed            | May slow down performance with more logic
MongoDB   | High performance due to indexing, etc.   | Rich features like aggregation, geospatial queries
6) Differentiate Between MongoDB and RDBMS
Feature        | MongoDB                     | RDBMS
Data Structure | Document-based (BSON)       | Table-based
Schema         | Dynamic                     | Static
Transactions   | Multi-document (since 4.0)  | Full ACID support
Query Language | Mongo Query Language (MQL)  | SQL
Scalability    | Horizontal (Sharding)       | Vertical
Joins          | Limited support via $lookup | Strong JOIN capabilities
Storage        | Collections                 | Tables
7) What is MongoDB Schema
In MongoDB, schema refers to the structure or organization of documents in a collection. MongoDB
does not enforce a schema by default (it is often called schema-less), so documents within the same
collection can have different fields or data types.
However, schema design is still important:
For efficient indexing
For query optimization
To maintain data integrity (optionally using schema validation)
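Optional schema validation can be pictured as a set of rules checked before a document is accepted. The rule document below follows the general shape of a MongoDB `$jsonSchema` validator (`bsonType`, `required`, `properties`), but the `violations` checker is a hypothetical stand-in that covers only two rule kinds; in MongoDB the server performs the validation itself.

```python
validator = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["title", "type"],
        "properties": {
            "title": {"bsonType": "string"},
            "duration": {"bsonType": "int"},
        },
    }
}

# Assumed mapping from a few BSON type names to Python types.
PYTHON_TYPES = {"string": str, "int": int, "object": dict}


def violations(doc, schema):
    """Return a list of problems found in doc (empty list means valid)."""
    problems = []
    for field in schema.get("required", []):
        if field not in doc:
            problems.append(f"missing required field: {field}")
    for field, rule in schema.get("properties", {}).items():
        expected = PYTHON_TYPES[rule["bsonType"]]
        if field in doc and not isinstance(doc[field], expected):
            problems.append(f"wrong type for field: {field}")
    return problems


schema = validator["$jsonSchema"]
print(violations({"title": "Mongo Basics", "type": "book"}, schema))
print(violations({"title": 5}, schema))
```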
8) Explain What Do You Mean by Running Database Anywhere
This refers to MongoDB’s flexibility in deployment. MongoDB can be deployed:
On-premises
In cloud environments (AWS, Azure, GCP)
As a managed service (MongoDB Atlas)
On containers (Docker)
On serverless platforms
This “run anywhere” capability gives developers the freedom to choose their preferred environment
without changing the codebase.
9) MongoDB Database Model
MongoDB follows a document-oriented database model, where:
Database contains collections
Collections contain documents
Documents are JSON-like objects encoded as BSON
This model supports:
Embedded documents and arrays
No rigid schema
Nested and hierarchical data
10) Difference Between JSON and BSON
Feature     | JSON                       | BSON
Format      | Text-based                 | Binary format
Readability | Human-readable             | Not human-readable
Data Types  | Limited                    | More (e.g., Date, Binary, int32)
Size        | Often smaller              | Often slightly larger due to length and type metadata
Performance | Slower to parse (text)     | Faster to traverse, parse, and encode
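To make the text versus binary difference concrete, the sketch below hand-encodes a tiny document following the published BSON layout: a 4-byte length, then typed elements, then a terminating zero byte. The `encode_bson` function is a deliberately minimal, hypothetical encoder that handles only strings and 32-bit integers in a flat document; real drivers ship a full BSON library.

```python
import json
import struct


def encode_bson(doc):
    """Encode a flat dict of str and int32 values in BSON's binary layout."""
    body = b""
    for name, value in doc.items():
        key = name.encode("utf-8") + b"\x00"       # element name as a C string
        if isinstance(value, str):
            data = value.encode("utf-8") + b"\x00"
            # Type 0x02: string, prefixed with its byte length (incl. the null).
            body += b"\x02" + key + struct.pack("<i", len(data)) + data
        elif isinstance(value, int):
            # Type 0x10: 32-bit little-endian integer, always 4 bytes.
            body += b"\x10" + key + struct.pack("<i", value)
        else:
            raise TypeError("this sketch only supports str and int32 values")
    # Total size = 4-byte length prefix + elements + 1-byte terminator.
    return struct.pack("<i", len(body) + 5) + body + b"\x00"


for doc in ({"a": 1}, {"n": 1234567890}):
    as_json = json.dumps(doc)
    as_bson = encode_bson(doc)
    print(doc, "JSON bytes:", len(as_json), "BSON bytes:", len(as_bson))
```

For the small integer the BSON form is longer than the JSON text, while for the ten-digit integer it is shorter, so relative size depends on the data.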
11) What is Capped Collection and Polymorphic Schema
Capped Collection:
Fixed-size collection that automatically overwrites the oldest documents.
Maintains insertion order.
Ideal for logging, real-time analytics.
Features:
Fast write performance
No deletion allowed
Useful for circular buffer-like behavior
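The circular-buffer behavior can be imitated with a bounded `deque` from Python's standard library. This is an analogy only: the sketch caps the collection by document count, whereas a MongoDB capped collection is sized in bytes, with an optional document limit.

```python
from collections import deque

# A "capped collection" that keeps at most 3 documents.
log = deque(maxlen=3)

for seq in range(1, 6):
    # Appending to a full deque silently discards the oldest entry.
    log.append({"seq": seq, "msg": f"event {seq}"})

# Only the newest documents remain, still in insertion order.
remaining = [doc["seq"] for doc in log]
print(remaining)
```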
Polymorphic Schema:
A collection where documents can have different sets of fields.
Encourages flexible and evolving data models.
Example:
{ "_id": 1, "type": "book", "title": "Mongo Basics" }
{ "_id": 2, "type": "video", "title": "MongoDB Crash Course", "duration": 120 }
Each document is shaped differently based on its type, yet stored in the same collection.
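Application code that reads a polymorphic collection usually branches on a discriminator field such as `type`. The sketch below reuses the two example documents; the `describe` function is a hypothetical helper, and the unit of `duration` is left unstated because the example does not give one.

```python
catalog = [
    {"_id": 1, "type": "book", "title": "Mongo Basics"},
    {"_id": 2, "type": "video", "title": "MongoDB Crash Course", "duration": 120},
]


def describe(doc):
    """Build a label, using fields that exist only for some document types."""
    if doc["type"] == "video":
        return f'video: {doc["title"]} (duration={doc["duration"]})'
    if doc["type"] == "book":
        return f'book: {doc["title"]}'
    return f'unknown type {doc["type"]!r}'


labels = [describe(doc) for doc in catalog]
for label in labels:
    print(label)
```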