Database Fundamentals: A Comprehensive Guide
Class: Computer Science 103
Date: February 24, 2024
What is a Database?
A database is an organized collection of structured information, or data, typically
stored electronically in a computer system. It is designed for efficient storage,
retrieval, and management of data. Think of it as a digital filing cabinet—instead
of papers, you store data, which can be easily accessed, managed, and updated. A
database management system (DBMS) is the software that interacts with the user,
other applications, and the database itself to capture and analyze data.
Relational vs. Non-Relational Databases
Databases are generally classified into two main types based on their underlying
structure.
Relational Databases (SQL)
Relational databases are the most traditional type. They store data in structured
tables consisting of rows and columns. Each row represents a record, and each
column represents a specific attribute. Relationships between tables are defined
using keys to link related data. This structure is ideal for applications that
require a high level of data integrity, complex queries, and predefined schemas.
The primary language for interacting with these databases is Structured Query
Language (SQL).
Examples: MySQL, PostgreSQL, Oracle, SQL Server
Non-Relational Databases (NoSQL)
Non-relational databases, or NoSQL (Not only SQL), provide a more flexible approach
to data storage. They are designed to handle large volumes of unstructured, semi-
structured, and polymorphous data. Instead of tables, they use various data models:
Key-Value: Data is stored as a simple collection of key-value pairs (e.g., Redis).
Document: Data is stored in JSON-like documents, which can contain nested fields
and arrays (e.g., MongoDB).
Column-Family: Data is stored in columns instead of rows, making them highly
efficient for analytical queries (e.g., Cassandra).
Graph: Data is stored in nodes and edges, representing relationships, which is
perfect for social networks or recommendation engines (e.g., Neo4j).
Examples: MongoDB, Redis, Cassandra, Neo4j
Key Database Concepts
Table: A collection of related data held in a structured format within a database.
Row (Record): A single entry in a table, representing a single, complete set of
data.
Column (Field): An attribute or specific piece of information for each record.
Primary Key: A unique identifier for each row in a table. It ensures that each
record is distinct.
Foreign Key: A field in one table that uniquely identifies a row of another table.
It's used to link data between tables, establishing a relationship.
Query: A request for data from a database. SQL is the standard language for this in
relational databases.
SQL vs. NoSQL: Choosing the Right Tool
The choice between a SQL and NoSQL database depends on the specific needs of your
application.
Use SQL when...
You need to handle complex queries and transactions (ACID compliance).
Your data has a well-defined, consistent structure.
You need strong data integrity and relationships.
Use NoSQL when...
You need high scalability and flexibility for rapidly changing data.
Your data is unstructured or semi-structured.
You are building an application with a high-speed data flow (e.g., real-time
analytics).
Database Design and Normalization
Database design is the process of creating a model that determines how data is
stored, structured, and organized. A critical part of this is normalization, a
systematic process that organizes tables to reduce data redundancy and improve data
integrity. It works through a series of "normal forms" (1NF, 2NF, 3NF) to eliminate
repeating groups and ensure that data dependencies are logical.