0% found this document useful (0 votes)
14 views11 pages

Database SQL Guide

This comprehensive guide covers database design, SQL, and best practices for developers, focusing on relational databases and their structure. It includes detailed explanations of tables, data types, constraints, SQL queries, joins, normalization, and performance optimization techniques. The guide emphasizes the importance of proper schema design, indexing, and query optimization for building efficient and scalable database systems.

Uploaded by

Ansh Raj
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as TXT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
14 views11 pages

Database SQL Guide

This comprehensive guide covers database design, SQL, and best practices for developers, focusing on relational databases and their structure. It includes detailed explanations of tables, data types, constraints, SQL queries, joins, normalization, and performance optimization techniques. The guide emphasizes the importance of proper schema design, indexing, and query optimization for building efficient and scalable database systems.

Uploaded by

Ansh Raj
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as TXT, PDF, TXT or read online on Scribd
You are on page 1/ 11

# Database Design and SQL: Comprehensive Guide for Developers

## Introduction to Databases

A database is an organized collection of structured data stored and accessed


electronically. Databases are fundamental to modern applications, enabling
efficient data storage, retrieval, and management. This guide covers relational
databases, SQL, database design principles, and best practices for building
scalable database systems.

## Relational Databases and Tables

### Understanding Relational Database Concepts

A relational database organizes data into tables with rows and columns. Tables are
related through keys, creating a structured system for data management.

**Tables**: Collections of related data organized in rows (records) and columns


(fields).

**Rows**: Individual records containing data for a single entity.

**Columns**: Fields containing specific attributes of the entity.

**Primary Key**: A unique identifier for each row in a table.

**Foreign Key**: A reference to a primary key in another table, creating


relationships.

**Schema**: The structure of a database including tables, columns, data types, and
constraints.

### Creating Tables

```sql
CREATE TABLE users (
id INT PRIMARY KEY AUTO_INCREMENT,
username VARCHAR(50) UNIQUE NOT NULL,
email VARCHAR(100) UNIQUE NOT NULL,
password_hash VARCHAR(255) NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);

CREATE TABLE posts (


id INT PRIMARY KEY AUTO_INCREMENT,
user_id INT NOT NULL,
title VARCHAR(200) NOT NULL,
content TEXT NOT NULL,
published BOOLEAN DEFAULT FALSE,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (user_id) REFERENCES users(id)
);

CREATE TABLE comments (


id INT PRIMARY KEY AUTO_INCREMENT,
post_id INT NOT NULL,
user_id INT NOT NULL,
content TEXT NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (post_id) REFERENCES posts(id),
FOREIGN KEY (user_id) REFERENCES users(id)
);
```

## Data Types

### Common SQL Data Types

**Numeric Types**:
- INT: Integer values (-2,147,483,648 to 2,147,483,647)
- BIGINT: Large integers (-9,223,372,036,854,775,808 to 9,223,372,036,854,775,807)
- DECIMAL(precision, scale): Fixed-point numbers (e.g., prices)
- FLOAT/DOUBLE: Floating-point numbers

**String Types**:
- VARCHAR(length): Variable-length string up to specified length
- CHAR(length): Fixed-length string
- TEXT: Large text (up to 65,535 characters in MySQL)
- LONGTEXT: Very large text

**Date and Time Types**:


- DATE: Date in YYYY-MM-DD format
- TIME: Time in HH:MM:SS format
- DATETIME: Date and time combined
- TIMESTAMP: Date and time with automatic update

**Boolean**:
- BOOLEAN: TRUE or FALSE (stored as 0 or 1)

## Constraints

Constraints enforce data integrity and consistency:

```sql
CREATE TABLE employees (
id INT PRIMARY KEY AUTO_INCREMENT,

-- NOT NULL constraint


first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,

-- UNIQUE constraint
email VARCHAR(100) UNIQUE NOT NULL,
employee_id VARCHAR(20) UNIQUE,

-- DEFAULT constraint
department VARCHAR(50) DEFAULT 'Unassigned',
status VARCHAR(20) DEFAULT 'Active',

-- CHECK constraint
salary DECIMAL(10, 2) CHECK (salary > 0),
age INT CHECK (age >= 18),

-- Foreign key constraint


manager_id INT,
FOREIGN KEY (manager_id) REFERENCES employees(id),
-- Composite primary key
UNIQUE KEY unique_combo (first_name, last_name, email)
);
```

## SELECT Statements

### Basic SELECT Queries

```sql
-- Select all columns
SELECT * FROM users;

-- Select specific columns


SELECT id, username, email FROM users;

-- Select with WHERE clause


SELECT * FROM posts WHERE user_id = 5;

-- Multiple conditions
SELECT * FROM posts
WHERE user_id = 5 AND published = TRUE;

SELECT * FROM posts


WHERE published = TRUE OR user_id IN (5, 10, 15);

-- NOT operator
SELECT * FROM posts WHERE published != TRUE;
SELECT * FROM posts WHERE NOT published;
```

### Filtering with WHERE

```sql
-- Comparison operators
SELECT * FROM products WHERE price > 100;
SELECT * FROM products WHERE price >= 100;
SELECT * FROM products WHERE price < 50;
SELECT * FROM products WHERE price <= 50;
SELECT * FROM products WHERE price <> 100;

-- LIKE pattern matching


SELECT * FROM users WHERE username LIKE 'john%'; -- Starts with 'john'
SELECT * FROM users WHERE username LIKE '%smith'; -- Ends with 'smith'
SELECT * FROM users WHERE username LIKE '%an%'; -- Contains 'an'

-- BETWEEN
SELECT * FROM orders WHERE order_date BETWEEN '2025-01-01' AND '2025-12-31';

-- IN operator
SELECT * FROM users WHERE role IN ('admin', 'moderator', 'user');

-- IS NULL
SELECT * FROM users WHERE middle_name IS NULL;
```

### Sorting and Limiting

```sql
-- ORDER BY
SELECT * FROM posts ORDER BY created_at DESC;
SELECT * FROM products ORDER BY price ASC, name ASC;

-- LIMIT
SELECT * FROM posts LIMIT 10; -- First 10 rows
SELECT * FROM posts LIMIT 10 OFFSET 20; -- Skip first 20, get 10

-- LIMIT for pagination


SELECT * FROM posts
ORDER BY created_at DESC
LIMIT 10 OFFSET (page - 1) * 10;
```

## Joins

Joins combine data from multiple tables:

```sql
-- INNER JOIN - Only matching rows
SELECT users.username, posts.title, posts.created_at
FROM posts
INNER JOIN users ON posts.user_id = users.id
WHERE posts.published = TRUE;

-- LEFT JOIN - All rows from left table + matching from right
SELECT users.username, COUNT(posts.id) as post_count
FROM users
LEFT JOIN posts ON posts.user_id = users.id
GROUP BY users.id, users.username;

-- RIGHT JOIN - All rows from right table + matching from left
SELECT posts.title, users.username
FROM posts
RIGHT JOIN users ON posts.user_id = users.id;

-- FULL OUTER JOIN - All rows from both tables


SELECT *
FROM users
FULL OUTER JOIN posts ON users.id = posts.user_id;

-- CROSS JOIN - Cartesian product


SELECT * FROM users CROSS JOIN roles;

-- Multiple joins
SELECT users.username, posts.title, comments.content
FROM comments
INNER JOIN posts ON comments.post_id = posts.id
INNER JOIN users ON comments.user_id = users.id
WHERE posts.id = 42;
```

## Aggregation Functions

Aggregate functions perform calculations on groups of rows:

```sql
-- COUNT
SELECT COUNT(*) FROM users; -- Total users
SELECT COUNT(DISTINCT user_id) FROM posts; -- Unique users with posts

-- SUM
SELECT SUM(amount) FROM orders;
SELECT SUM(amount) FROM orders WHERE user_id = 5;

-- AVG
SELECT AVG(price) FROM products;

-- MIN/MAX
SELECT MIN(price) as cheapest, MAX(price) as most_expensive FROM products;
SELECT MIN(created_at) as oldest_post FROM posts;

-- GROUP BY
SELECT user_id, COUNT(*) as post_count
FROM posts
GROUP BY user_id;

-- GROUP BY with multiple columns


SELECT category, status, COUNT(*) as count
FROM products
GROUP BY category, status;

-- HAVING clause (filtering groups)


SELECT user_id, COUNT(*) as post_count
FROM posts
GROUP BY user_id
HAVING COUNT(*) > 5;
```

## INSERT, UPDATE, DELETE

### Inserting Data

```sql
-- Single row insert
INSERT INTO users (username, email, password_hash)
VALUES ('john_doe', 'john@example.com', 'hashed_password');

-- Multiple rows insert


INSERT INTO users (username, email, password_hash)
VALUES
('alice', 'alice@example.com', 'hash1'),
('bob', 'bob@example.com', 'hash2'),
('charlie', 'charlie@example.com', 'hash3');

-- Insert with calculated values


INSERT INTO posts (user_id, title, content)
SELECT id, CONCAT('Welcome ', username), 'Welcome to our platform'
FROM users;
```

### Updating Data

```sql
-- Update specific records
UPDATE users
SET email = 'newemail@example.com'
WHERE id = 5;
-- Update multiple columns
UPDATE posts
SET published = TRUE, updated_at = NOW()
WHERE id = 42;

-- Conditional update
UPDATE products
SET price = price * 1.1
WHERE category = 'Electronics';

-- Update from another table


UPDATE orders
SET status = 'shipped'
WHERE user_id IN (
SELECT id FROM users WHERE country = 'USA'
);
```

### Deleting Data

```sql
-- Delete specific records
DELETE FROM comments WHERE id = 123;

-- Delete with condition


DELETE FROM posts WHERE published = FALSE;

-- Delete from multiple tables (careful!)


DELETE FROM orders WHERE user_id NOT IN (
SELECT id FROM users
);

-- Clear entire table (careful - no WHERE clause)


DELETE FROM temp_data;
```

## Database Normalization

Normalization reduces data redundancy and improves data integrity.

### First Normal Form (1NF)

Each column contains atomic (indivisible) values:

```sql
-- ✗ Not normalized - hobbies is a list
CREATE TABLE users_bad (
id INT PRIMARY KEY,
name VARCHAR(100),
hobbies VARCHAR(500) -- "reading, gaming, coding"
);

-- ✓ Normalized - separate table for hobbies


CREATE TABLE users_good (
id INT PRIMARY KEY,
name VARCHAR(100)
);
CREATE TABLE user_hobbies (
user_id INT,
hobby VARCHAR(50),
FOREIGN KEY (user_id) REFERENCES users_good(id)
);
```

### Second Normal Form (2NF)

All non-key attributes depend on the entire primary key:

```sql
-- ✗ Not normalized - instructor_phone depends only on instructor_id
CREATE TABLE classes_bad (
class_id INT,
instructor_id INT,
instructor_phone VARCHAR(20),
PRIMARY KEY (class_id, instructor_id)
);

-- ✓ Normalized - separate table for instructors


CREATE TABLE instructors (
id INT PRIMARY KEY,
phone VARCHAR(20)
);

CREATE TABLE classes_good (


id INT PRIMARY KEY,
instructor_id INT,
FOREIGN KEY (instructor_id) REFERENCES instructors(id)
);
```

### Third Normal Form (3NF)

Non-key attributes depend only on the primary key, not on other non-key attributes:

```sql
-- ✗ Not normalized - city depends on state
CREATE TABLE customers_bad (
id INT PRIMARY KEY,
state_id INT,
state_name VARCHAR(50),
city VARCHAR(50)
);

-- ✓ Normalized
CREATE TABLE states (
id INT PRIMARY KEY,
name VARCHAR(50)
);

CREATE TABLE customers_good (


id INT PRIMARY KEY,
state_id INT,
city VARCHAR(50),
FOREIGN KEY (state_id) REFERENCES states(id)
);
```
## Indexes

Indexes improve query performance by creating a data structure for faster lookups:

```sql
-- Create single column index
CREATE INDEX idx_email ON users(email);

-- Create composite index


CREATE INDEX idx_user_created ON posts(user_id, created_at);

-- Create unique index


CREATE UNIQUE INDEX idx_username ON users(username);

-- Create full-text index


CREATE FULLTEXT INDEX idx_content ON posts(content);

-- Drop index
DROP INDEX idx_email ON users;

-- View indexes
SHOW INDEX FROM users;
```

## Transactions

Transactions ensure data consistency by grouping operations:

```sql
-- Basic transaction
START TRANSACTION;

UPDATE accounts SET balance = balance - 100 WHERE id = 1;


UPDATE accounts SET balance = balance + 100 WHERE id = 2;

-- Commit if successful
COMMIT;

-- Or rollback if there's an error


ROLLBACK;

-- Transaction with error handling (pseudocode)


BEGIN TRANSACTION
INSERT INTO logs VALUES (...)
UPDATE users SET last_login = NOW() WHERE id = 5
COMMIT
```

## Stored Procedures

Stored procedures are prewritten SQL statements stored in the database:

```sql
-- Create procedure
DELIMITER //
CREATE PROCEDURE GetUserPosts(IN user_id INT)
BEGIN
SELECT id, title, created_at
FROM posts
WHERE user_id = user_id
ORDER BY created_at DESC;
END//
DELIMITER ;

-- Call procedure
CALL GetUserPosts(5);

-- Procedure with output parameter


DELIMITER //
CREATE PROCEDURE CountUserPosts(IN user_id INT, OUT post_count INT)
BEGIN
SELECT COUNT(*) INTO post_count FROM posts WHERE user_id = user_id;
END//
DELIMITER ;

-- Call with output


CALL CountUserPosts(5, @count);
SELECT @count;
```

## Views

Views are virtual tables created from query results:

```sql
-- Create view
CREATE VIEW active_users AS
SELECT id, username, email, created_at
FROM users
WHERE status = 'Active';

-- Query view
SELECT * FROM active_users;

-- Update view
UPDATE active_users SET email = 'newemail@example.com' WHERE id = 5;

-- Drop view
DROP VIEW active_users;

-- Create complex view with joins


CREATE VIEW user_post_stats AS
SELECT
u.id,
u.username,
COUNT(p.id) as post_count,
MAX(p.created_at) as latest_post
FROM users u
LEFT JOIN posts p ON u.id = p.user_id
GROUP BY u.id, u.username;
```

## Database Design Best Practices

### 1. Use Appropriate Data Types

```sql
-- ✓ Good - appropriate types
CREATE TABLE products (
id INT PRIMARY KEY,
name VARCHAR(200),
price DECIMAL(10, 2),
stock_quantity INT,
is_active BOOLEAN
);

-- ✗ Bad - inappropriate types


CREATE TABLE products_bad (
id VARCHAR(50) PRIMARY KEY,
name TEXT,
price VARCHAR(50),
stock_quantity VARCHAR(50),
is_active VARCHAR(10)
);
```

### 2. Use Foreign Keys for Referential Integrity

```sql
CREATE TABLE orders (
id INT PRIMARY KEY AUTO_INCREMENT,
user_id INT NOT NULL,
order_date DATETIME,
FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE
);
```

### 3. Create Indexes on Frequently Queried Columns

```sql
-- Index on columns used in WHERE, JOIN, and ORDER BY
CREATE INDEX idx_user_id ON posts(user_id);
CREATE INDEX idx_created_at ON posts(created_at);
```

### 4. Use Naming Conventions

```sql
-- Consistent naming
CREATE TABLE users ( -- Plural table names
id INT PRIMARY KEY, -- id for primary key
user_id INT, -- foreign keys with entity_id
created_at TIMESTAMP, -- snake_case for columns
is_active BOOLEAN -- boolean columns start with 'is_' or 'has_'
);
```

### 5. Document Your Schema

```sql
-- Add comments to explain purpose
ALTER TABLE users COMMENT = 'Stores user account information';
ALTER TABLE users MODIFY COLUMN email VARCHAR(100) COMMENT = 'User email address
must be unique';
```
## Performance Optimization

### Query Optimization

```sql
-- ✗ Inefficient - full table scan
SELECT * FROM large_table WHERE column LIKE '%value%';

-- ✓ More efficient - with index


CREATE INDEX idx_column ON large_table(column);
SELECT * FROM large_table WHERE column = 'value';

-- ✗ Inefficient - subquery
SELECT * FROM orders
WHERE user_id IN (SELECT id FROM users WHERE country = 'USA');

-- ✓ More efficient - join


SELECT o.* FROM orders o
INNER JOIN users u ON o.user_id = u.id
WHERE u.country = 'USA';
```

### Connection Pooling

Use connection pooling in applications to reuse database connections efficiently


rather than creating new connections for each query.

## Conclusion

Database design and SQL are essential skills for modern developers. Understanding
relational concepts, normalization principles, and SQL syntax enables the creation
of efficient, maintainable databases. Proper schema design, appropriate indexing,
and query optimization are crucial for building scalable applications that perform
well as data volumes grow. Regular practice with these concepts ensures developers
can make informed decisions when designing and maintaining databases.

You might also like