Topic
Informatik II D-MAVT Database Queries with SQL
Database queries with SQL Program What is SQL? Creating, altering, and deleting tables Inserting, updating, and deleting data entries Querying the database Queries across multiple tables
Storing queries as data views
Learning goals Be able to understand SQL statements and translate them to clear text. Be able to write SQL statements to create/define tables in a database. Be able to write SQL statements to insert, modify, and delete data in a table. Be able to write SQL statements to query the database across several tables and with search conditions.
Database Design Steps
Requirements Analysis (What needs to be stored, for whom, and why?) Conceptual Modeling (Data entities) Logical Modeling (Data relationships) Schema Refinement (Normalization) Physical Model (Mapping to tables) Implementation (in a RDBMS)
Further reading
Chapter 3, Simple Queries from SQL for Web Nerds, Philip Greenspun (http://philip.greenspun.com/sql/) Database Design for Mere Mortals, Michael J. Hernandez Information Modeling and Relational Databases, Terry Halpin Database Modeling and Design, Toby J. Teorey
What is SQL?
Non-procedural universal language for querying and manipulating data in RDBMS Structured Q uery Language Many standards out there: ANSI SQL SQL92 (a.k.a. SQL2) SQL99 (a.k.a. SQL3) Vendors support various sub- or super-sets of these What we discuss is common to all of them
The two parts of SQL
Data in SQL
Data Definition Language (DDL)
Create/alter/delete tables and their attributes
Data Manipulation Language (DML)
Insert/delete/modify entities in tables Query one or more tables
1. Atomic data types 2. Table attributes are assigned a data type
SQL is case-insensitive for all commands and specifiers!
Basic SQL Commands
Creating tables with CREATE Adding data with INSERT Viewing data with SELECT Removing data with DELETE Modifying data with UPDATE Changing tables with ALTER Destroying tables with DROP
Data Definition Language
SQL commands to dene and manipulate relations in a database. create tables alter tables drop tables create views
(see later)
Creating tables with CREATE
Generic form
CREATE TABLE tablename ( column_name data_type attributes, column_name data_type attributes, );
Data Types in SQL
Characters:
CHAR(20) VARCHAR(40) -- fixed length -- variable length
Numbers:
BIGINT, INT, SMALLINT, TINYINT REAL, FLOAT -- differ in precision MONEY
Times and dates:
DATE, TIME TIMESTAMP
Table and column names cannot have spaces or be reserved words like TABLE, CREATE, etc.
Binary objects (such as pictures, sound, etc.)
BLOB -- stands for Binary Large OBject
Others... All are simple
Column attributes in SQL
Contact Table: Physical Model
Name Address Company Phone Number URL/Web Page Age Height Birthday When we added the entry Character Character Character Character Character Integer Real (float) Date Timestamp
Keys:
PRIMARY KEY KEY INDEX -- primary key of the table -- foreign key (will be indexed) -- field to be indexed for fast search -- field must be filled in -- value to be used if user gives none -- see later
Null values:
NOT NULL
Default value:
DEFAULT value
Automatic values:
AUTO_INCREMENT
Others... Most of them are RDBMS-specific
Contact Table: SQL
CREATE TABLE contacts ( Name Address Company Phone URL Age Height Birthday WhenEntered ); VARCHAR(40), VARCHAR(60), VARCHAR(60), VARCHAR(11), VARCHAR(80), INT, FLOAT, DATE, TIMESTAMP
Contact Table: SQL
CREATE TABLE contacts ( ContactID Name Address Company Phone URL Age Height Birthday WhenEntered ); INT PRIMARY KEY, VARCHAR(40), VARCHAR(60), VARCHAR(60), VARCHAR(11), VARCHAR(80), INT, FLOAT, DATE, TIMESTAMP
Plan your tables very carefully! Once created, they may be difficult to change!
If you are going to use the relational nature of a database, dont forget you need to have a unique way to access records! There is a way to make the key automatically increment, so you dont have to worry about which one is next.
Automatic key generation
AUTO_INCREMENT field attribute:
CREATE TABLE contacts ( ContactID Name INT PRIMARY KEY AUTO_INCREMENT, VARCHAR(40),
Another Create Table Example
address name
CREATE TABLE customer (cust_id INT, name VARCHAR(50) NOT NULL, address VARCHAR(256) NOT NULL, PRIMARY KEY (cust_id)); CREATE TABLE credit_card (cust_id INT NOT NULL, credit_card_type CHAR(5) NOT NULL, credit_card_num INT NOT NULL, KEY (cust_id));
Customer
1 have N
Database automatically gives sequential numbers to the entities Makes sure each number is unique Numbers will never change once assigned Number value has no real-world meaning
Credit card
card type card number
Modifying Tables
Generic form
| | | | | | | |
ALTER specifications
ADD [COLUMN] column_definition [FIRST | AFTER col_name ] ADD {INDEX|KEY} [index_type] (index_col_name,...) MODIFY [COLUMN] column_definition [FIRST | AFTER col_name] DROP [COLUMN] col_name DROP PRIMARY KEY DROP {INDEX|KEY} index_name DROP FOREIGN KEY fk_symbol RENAME [TO] new_tbl_name
ALTER TABLE tablename alter specifications, ;
Examples: ALTER TABLE customer MODIFY COLUMN name VARCHAR(256); ALTER TABLE customer ADD COLUMN credit_limit REAL;
Destroying tables with DROP
Generic Form DROP TABLE tablename; Example: DROP TABLE contacts;
ATTN: All data will be destroyed along with the table. The system will not ask twice!
Data Manipulation Language
SQL commands to manipulate the data in a database or query the database. insert new data entries update existing entries delete entries query the database
Adding data with INSERT
Generic Form
INSERT INTO tablename (column_name,) VALUES (value,)
Inserting a record into contacts
INSERT INTO contacts (contactid,name,address,company,phone, url,age,height,birthday,whenentered) VALUES (1,Joe,123 Any St.,ABC, 800-555-1212,http://abc.com,30,1.9, 6/14/1972, now());
Inserting a partial record
Modifying data with UPDATE
Generic Form
INSERT INTO contacts (contactid,name,phone) VALUES (2,Jane,212-555-1212);
UPDATE table SET column=expression WHERE condition; Example: UPDATE contacts SET company=AOL WHERE company=Time Warner;
Removing data with DELETE
Generic Form DELETE FROM table WHERE condition; Example: DELETE FROM contacts WHERE age<13;
Viewing data with SELECT
Generic Form: SELECT column, FROM table, WHERE condition GROUP BY group_by_expression HAVING condition ORDER BY order_expression; The most used command Probably the most complicated also If used improperly, can cause very long waits because of complex combinations
Lets start simple
Simplified generic form: SELECT attributes FROM table(s) WHERE conditions
Simple SQL Query
Product
PName Gizmo Powergizmo SingleTouch MultiTouch Price $19.99 $29.99 $149.99 $203.99 Category Gadgets Gadgets Photography Household Supplier GizmoWorks GizmoWorks Canon Hitachi
SELECT * FROM Product WHERE category=Gadgets
PName Gizmo Price $19.99 $29.99 Category Gadgets Gadgets Supplier GizmoWorks GizmoWorks
Multiple tables can be joined in a single query Multiple conditions (search criteria) can be imposed Results can be sorted or grouped
selection
Powergizmo
Simple SQL Query
Product
PName Gizmo Powergizmo SingleTouch MultiTouch Price $19.99 $29.99 $149.99 $203.99 Category Gadgets Gadgets Photography Household Supplier GizmoWorks GizmoWorks Canon Hitachi
Eliminating Duplicates
Category
SELECT category FROM Product
Gadgets Gadgets Photography Household
SELECT PName, Price, Supplier FROM Product WHERE Price > 100
PName Price $149.99 $203.99 Supplier Canon Hitachi
Compare to:
Category
SELECT DISTINCT category FROM Product
Gadgets Photography Household
selection and projection
SingleTouch MultiTouch
More SELECT examples
SELECT * FROM contacts;
Display all records in the contacts table
Selection Operations
What goes in the WHERE clause: x = y, x < y, x <= y, etc
For numbers, they have the usual meanings For CHAR and VARCHAR: lexicographic ordering For dates and times: chronological ordering
SELECT contactid,name FROM contacts;
Display only the record number and names
SELECT DISTINCT url FROM contacts;
Display only one entry for every value of URL.
Negation: NOT Pattern matching on strings: s LIKE p (also: NOT LIKE) Strings are enclosed in single quotes: string Multiple criteria can be joined by AND or OR Probe for empty fields with IS NULL Special operators: BETWEEN x AND y, IN(),
WHERE examples
The WHERE subclause allows you to select records based on a condition. SELECT * FROM contacts WHERE age<10;
Display records from contacts where age<10
WHERE examples
SELECT * FROM contacts WHERE name IS NULL; SELECT * FROM contacts WHERE zip IN (14454,12345);
SELECT * FROM contacts WHERE age BETWEEN 18 AND 35;
Display records where age is 18-35
The LIKE operator
s LIKE p: pattern matching on strings p may contain two special symbols:
% = any sequence of characters _ = any single character
More LIKE examples
The LIKE condition
Allows you to look at strings that are alike
SELECT * FROM contacts WHERE name LIKE J%;
Display records where the name starts with J
Product(Name,Price,Category,Supplier) Find all products whose name contains gizmo:
SELECT * FROM Products WHERE PName LIKE %gizmo%
SELECT * FROM contacts WHERE url NOT LIKE %.com;
Display where url does not end in .com
Ordering the Results
SELECT pname, price, supplier FROM Product WHERE category=gadgets AND price > 50 ORDER BY price ASC
Ordering the Results
SELECT category FROM Product ORDER BY Pname ASC
Category Gadgets Household Gadgets Photography
ASC order in ascending order DESC in descending order List of several attributes can be given for nested ordering (e.g. order by price first and within all products of same price, order by supplier: ORDER BY price, supplier ASC)
PName Gizmo Powergizmo SingleTouch MultiTouch
Price $19.99 $29.99 $149.99 $203.99
Category Gadgets Gadgets Photography Household
Supplier GizmoWorks GizmoWorks Canon Hitachi
10
Ordering the Results
Joins in SQL
Connect two or more tables:
Product
PName Gizmo Powergizmo SingleTouch MultiTouch Price $19.99 $29.99 $149.99 $203.99 CName GizmoWorks Canon Hitachi Category Gadgets Gadgets Photography Household StockPrice 25 65 15 Supplier GizmoWorks GizmoWorks Canon Hitachi Country USA Japan Japan
SELECT DISTINCT category FROM Product ORDER BY category
Category Gadgets Household Photography
Company
What is the Connection between them ?
Joins
Product(pname,price,category,supplier) Company(cname,stockPrice,country) Find all products under $200 manufactured in Japan; return their names and prices. Join between Product and Company SELECT PName, Price FROM Product, Company WHERE Supplier=CName AND Country=Japan AND Price <= 200
Product
PName Gizmo Powergizmo SingleTouch MultiTouch Price $19.99 $29.99 $149.99 $203.99 Category Gadgets Gadgets
Joins in SQL
Company
Supplier GizmoWorks GizmoWorks Canon Hitachi Cname GizmoWorks Canon Hitachi StockPrice 25 65 15 Country USA Japan Japan
Photography Household
SELECT PName, Price FROM Product, Company WHERE Supplier=CName AND Country=Japan AND Price <= 200
PName SingleTouch Price $149.99
11
Joins
Product(pname,price,category,supplier) Company(cname,stockPrice,country) Find all countries that manufacture some product in the Gadgets category. SELECT Country FROM Product, Company WHERE Supplier=CName AND Category=Gadgets
Product
PName Gizmo Powergizmo SingleTouch MultiTouch Price $19.99 $29.99 $149.99 $203.99 Category Gadgets Gadgets
Joins in SQL
Company
Supplier GizmoWorks GizmoWorks Canon Hitachi Cname GizmoWorks Canon Hitachi StockPrice 25 65 15 Country USA Japan Japan
Photography Household
SELECT Country FROM Product, Company WHERE Supplier=CName AND Category=Gadgets
Country
Why does USA appear twice ?
USA USA
Internal joint table
RDBMS internally builds a joint table with ALL matches:
PName Gizmo Powergizmo SingleTouch MultiTouch Price $19.99 $29.99 $149.99 $203.99 Category Gadgets Gadgets Photography Household Supplier GizmoWorks GizmoWorks Canon Hitachi Cname GizmoWorks GizmoWorks Canon Hitachi StockPrice 25 25 65 15 Country USA USA Japan Japan
Disambiguating Attributes
Sometimes two relations have the same attr: Person(pname,address,worksfor) Company(cname,address)
SELECT DISTINCT pname, address FROM Person, Company WHERE worksfor=cname Which address ?
SELECT Country FROM Product, Company WHERE Supplier=CName AND Category=Gadgets
Country
First the tables are joint and entries duplicated Then the selection is made!
USA USA
SELECT DISTINCT Person.pname, Company.address FROM Person, Company WHERE Person.worksfor=Company.cname
12
In fact
table names introduced automatically by the system:
Joining together multiple tables
SELECT name,phone,zip FROM people, phonenumbers, address WHERE people.addressid=address.addressid AND people.id=phonenumbers.id;
People
Id 1 2 3 Name Joe Jane Chris Addressid 1 2 3
Company(cname,address)
SELECT cname FROM Company Becomes:
PhoneNumbers
PhoneID 1 2 Id 1 1 1 2 3 3 Phone 5532 2234 3211 3421 2341 3211 1 2 3
Address
AddressID Company ABC XYZ PDQ Address 123 456 789 Zip 12345 14454 14423
SELECT Company.cname FROM Company
3 4 5 6
It NEEDS to be done by hand if column names are ambiguous!
Multiple Joins
Product(pname,price,category,supplier) Purchase(buyer,seller,store,product) Person(persname,phoneNumber,city) Find names of people living in Seattle that bought some product in the Gadgets category, and the names of the stores they bought such product from SELECT DISTINCT persname, store FROM Person, Purchase, Product WHERE persname=buyer AND product=pname AND city=Seattle AND category=Gadgets
Examples
Students
Name John Alice Max 1 TutorID 2
Student.Name John Max Student.Name John Alice Max Student.Name Max John NULL Student.Name John Alice Max NULL
Tutor.Name Greg Bob Tutor.Name Greg NULL Bob Tutor.Name Bob Greg Fabiana Tutor.Name Greg NULL Bob Fabiana
inner
left outer
right outer
Tutors
ID 1 2 3 Name Bob Greg Fabiana
full outer
13
Different types of JOINs
Inner Join
Unmatched rows in either table arent printed
General form of SELECT/JOIN
SELECT columns, FROM left_table join_type JOIN right_table ON condition; SELECT name,phone FROM people JOIN phonenumbers ON people.id=phonenumbers.id;
Left Outer Join
All records from the left side are printed. NULLs are filled in if no match on the right side can be found.
Right Outer Join
All records from the right side are printed. NULLs are filled in if not match on the left side is found.
Full Outer Join
All records are printed. NULLs are filled in for missing records on either side.
Natural Join
Attributes of the same name are automatically joined
Join type examples
SELECT name,phone FROM people LEFT JOIN phonenumbers ON people.id=phonenumbers.id; SELECT name,phone FROM people RIGHT JOIN phonenumbers ON people.id=phonenumbers.id; SELECT name,phone FROM people FULL JOIN phonenumbers ON people.id=phonenumbers.id;
Theta style vs. ANSI
Theta Style (only for equi-joins)
SELECT name,phone,zip FROM people, phonenumbers, address WHERE people.addressid=address.addressid AND people.id=phonenumbers.id;
ANSI Style (for all join types)
SELECT name,phone,zip FROM people JOIN phonenumbers ON people.id=phonenumbers.id JOIN address ON people.addressid=address.addressid;
14
Tuple Variables
Purchase(buyer,seller,store,product) Find buyer and seller of all purchases that were made in store BestBuy: SELECT DISTINCT x.buyer, x.seller FROM Purchase AS x WHERE x.store = BestBuy
Tuple Variables
Tuple variables introduced automatically by the system: Product(name,price,category,supplier) SELECT name FROM Product WHERE price > 100 Becomes: SELECT Product.name FROM Product AS Product WHERE Product.price > 100
Saves typing with long table names!
They NEED to be specified manually when the same table appears more than once!
Tuple Variables
Product(pname,price,category,supplier) Purchase(buyer,seller,store,product) Person(persname,phoneNumber,city) Find all stores that sold at least one product that the store BestBuy also sold: SELECT DISTINCT x.store FROM Purchase AS x, Purchase AS y WHERE x.product = y.product AND y.store = BestBuy
Renaming Columns
Product
PName Gizmo Powergizmo SingleTouch MultiTouch Price $19.99 $29.99 $149.99 $203.99 Category Gadgets Gadgets Photography Household Supplier GizmoWorks GizmoWorks Canon Hitachi
SELECT Pname AS prodName, Price AS askPrice FROM Product WHERE Price > 100
prodName askPrice $149.99 $203.99
JOIN of a table with itself.
Query with renaming
SingleTouch MultiTouch
15
GROUP BY/HAVING
The GROUP BY clause allows you to group results together before applying aggregation functions like:
AVG(), COUNT(), MAX(), MIN(), SUM() COUNT DISTINCT
GROUP BY/HAVING
SELECT customer, SUM(amount) FROM Sales;
Customer Oracle IBM Oracle Amount 17100 17100 17100
Sales
Customer Oracle IBM Oracle
Amount 5500 4500 7100
HAVING allows you to search the grouped results
Customer
Amount 12600 4500
SELECT customer, SUM(amount) FROM Sales GROUP BY customer;
Oracle IBM
GROUP BY/HAVING
SELECT customer, SUM(amount) FROM Sales GROUP BY customer WHERE SUM(amount)>10000;
Customer Oracle IBM Customer Oracle IBM Oracle Amount 5500 4500 7100 Amount 12600 4500
GROUP BY Examples
SELECT company,count(company) FROM contacts GROUP BY company;
How many times does each company appear in the contacts table?
Sales
SUM(amount)=17100 which is always >10000! ==> Cannot use WHERE
Customer Amount 12600
SELECT company,count(company) FROM contacts GROUP BY company HAVING count(company) > 5;
Only show counts for companies that appear more than 5 times.
SELECT customer, SUM(amount) FROM Sales GROUP BY customer HAVING SUM(amount)>10000;
Oracle
16
Nested SELECTs
The WHERE subclause of a SELECT statement can contain another SELECT statement: SELECT * FROM contacts WHERE zip NOT IN ( SELECT zip FROM address WHERE state=NY );
(Select all contacts that have a ZIP code outside NY state.)
Nested SELECTs
RDBMS starts with the inner-most SELECT and executes it Results are pasted into query in place of the executed SELECT Progress outward recursively until outermost SELECT has been executed
Views in SQL
An SQL view is a virtual table that is derived from other real or virtual tables Real tables are defined by CREATE TABLE commands and are permanently stored in the database Virtual tables are defined by the CREATE VIEW command to avoid defining complex SQL retrieval expressions repeatedly Only the definition of a view is stored, but not the data inside. They are recomputed every time the view is used (opened)
Creating and Deleting Views
Generic form of View definition CREATE VIEW viewname AS (SELECT-Statement); Deleting a view DROP VIEW viewname;
17
Example of a View definition
Views are stored queries and are created on the basis of a SELECT statement:
CREATE VIEW contactview AS (SELECT name,phone,zip FROM people,phonenumbers,address WHERE people.id=phonenumbers.id AND people.addressid=address.addressid);
Views can be used in other SELECT statements as if they were tables.
18