C20.0046: Database Management Systems Lecture - PowerPoint PPT Presentation

1 / 46
About This Presentation
Title:

C20.0046: Database Management Systems Lecture

Description:

... total sales for the entire database: Q: Find total sales of ... Movie(title, year, ... the year of each star's first movie. Q: Find the span of each star's ... – PowerPoint PPT presentation

Number of Views:148
Avg rating:3.0/5.0
Slides: 47
Provided by: pagesSt
Category:

less

Transcript and Presenter's Notes

Title: C20.0046: Database Management Systems Lecture


1
C20.0046 Database Management SystemsLecture 13
  • M.P. Johnson
  • Stern School of Business, NYU
  • Spring, 2005

2
Agenda
  • Finish SQL queries
  • Updates and creating tables with SQL
  • Indices, views, programs talking to SQL
  • 3/10 (next week Thurs, before spring break)
  • Midterm
  • Hw2 due
  • Today returning proj2 at end
  • Mean 26
  • Stdev 4

3
Grouping Aggregation ops
  • In SQL
  • aggregation operators in SELECT,
  • Grouping in GROUP BY clause
  • Recall aggregation operators
  • sum, avg, min, max, count
  • strings, numbers, dates
  • Each applies to scalars
  • Count also applies to row count()
  • Can DISTINCT inside aggregation op
    count(DISTINCT x)
  • Grouping group rows that agree on single value
  • Each group becomes one row in result

4
Straight aggregation example
  • Purchase(product, date, price, quantity)
  • Q Find total sales for the entire database
  • Q Find total sales of bagels

SELECT SUM(price quantity) FROM Purchase
SELECT SUM(price quantity) FROM
Purchase WHERE product 'bagel'
5
Straight grouping
  • Group rows together by field values
  • Produces one row for each group
  • I.e., by each (combin. of) grouped val(s)
  • Dont select non-grouped fields
  • Reduces to DISTINCT selections

SELECT product FROM Purchase GROUP BY
product
SELECT DISTINCT product FROM Purchase
6
Illustrated GA example
  • Sometimes want to group and compute aggregations
    by group
  • Aggregation op applied to rows in group,
  • not to all rows in table
  • Purchase(product, date, price, quantity)
  • Find total sales for products that sold for
    0.50

SELECT product, SUM(pricequantity) total FROM
Purchase WHERE price .50 GROUP BY product
7
Illustrated GA example
Purchase
8
Illustrated GA example
  • First compute the FROM-WHERE
  • then GROUP BY product

9
Illustrated GA example
  • Finally, aggregate and select

SELECT product, SUM(pricequantity) total FROM
Purchase WHERW price .50 GROUP BY product
10
Illustrated GA example
  • GROUP BY may be reduced to (maybe more
    complicated) subquery

SELECT product, SUM(pricequantity) total FROM
Purchase WHERE price .50 GROUP BY product
SELECT DISTINCT x.product, (SELECT
SUM(y.pricey.quantity)
FROM Purchase y
WHERE x.product y.product
AND y.price .50) total FROM Purchase
x WHERE x.price .50
11
Multiple aggregations
For every product, what is the total sales and
max quantity sold?
SELECT product, SUM(price quantity)
SumSales, MAX(quantity)
MaxQuantity FROM Purchase WHERE price
.50 GROUP BY product
12
Another grouping/aggregation e.g.
  • Movie(title, year, length, studioName)
  • Q How many total minutes of film have been
    produced by each studio?
  • Strategy Divide movies into groups per studio,
    then add lengths per group

13
Another grouping/aggregation e.g.
SELECT studio, sum(length) totalLength FROM
Movies GROUP BY studio
14
Another grouping/aggregation e.g.
SELECT studio, sum(length) totalLength FROM
Movies GROUP BY studio
15
Another grouping/aggregation e.g.
SELECT studio, sum(length) totalLength FROM
Movies GROUP BY studio
16
Grouping/aggregation example
  • StarsIn(SName,Title,Year)
  • Q Find the year of each stars first movie
  • Q Find the span of each stars career
  • Look up first and last movies

SELECT sname, min(year) firstyear FROM
StarsIn GROUP BY sname
17
  • Acc(name,bal,type)
  • Q Who has the largest balance of each type?
  • Can we do this with grouping/aggregation?

18
G A for constructed relations
  • Movie(title,year,producerSsn,length)
  • MovieExec(name,ssn,netWorth)
  • Can do the same thing for larger, non-atomic
    relations
  • Q How many mins. of film did each producer make?
  • What happens to non-producer movie-execs?

SELECT name, sum(length) total FROM Movie,
MovieExec WHERE producerSsn ssn GROUP BY name
19
HAVING clauses
  • Sometimes we want to limit which tuples may be
    grouped
  • Q How many mins. of film did each rich producer
    (i.e., netWorth 10000000) make?
  • Q Is HAVING necessary here?
  • A No, could just add rich req. to WHERE

SELECT name, sum(length) total FROM Movie,
MovieExec WHERE producerSsn ssn GROUP BY
name HAVING netWorth 10000000
20
HAVING clauses
  • Sometimes we want to limit which tuples may be
    grouped, based on properties of the group
  • Q How many mins. of film did each old producer
    (i.e., who started before 1930) make?
  • Q Is HAVING necessary here?

SELECT name, sum(length) total FROM Movie,
MovieExec WHERE producerSsn ssn GROUP BY
name HAVING min(year)
21
General form of GA
  • NB Any attribute of relations in the FROM
    clause may be aggregated in the HAVING clause,
    but only those attributes that are in the GROUP
    BY list may appear unaggregated in the HAVING
    clause (the same rule as for the SELECT clause)
    (Ullman, p283).

SELECT S FROM R1,,Rn WHERE C1 GROUP
BY As HAVING C2
Why?
  • S may contain attributes As and/or any
    aggregates but no other attributes
  • C1 condition on the attributes in R1,,Rn
  • C2 condition on aggregations or attributes from
    As

Why?
22
Evaluation of GA
SELECT S FROM R1,,Rn WHERE C1 GROUP
BY a1,,ak HAVING C2
  • Evaluation steps
  • Compute the FROM-WHERE part as usual to obtain a
    table with all attributes in R1,,Rn
  • Group by the attributes a1,,ak
  • Compute the aggregates in C2 and keep only groups
    satisfying C2
  • Compute aggregates in S and return the result

23
Web page examples
  • Find all authors who wrote at least 10 documents
  • Authors(login, name)
  • Webpages(url, title, login)
  • Attempt 1 with nested queries

Bad!
SELECT DISTINCT name FROM Authors WHERE
COUNT(SELECT url FROM Webpages
WHERE Authors.loginWebpages.login)
10
24
Web page examples
  • Find all authors who wrote at least 10 documents
  • Attempt 2 Simplify with GROUP BY

Good!
SELECT name FROM Authors, Webpages WHERE
Authors.login Webpages.login GROUP BY
name HAVING count(Webpages.url) 10
No need for DISTINCT get for free from GROUP BY
25
Web page examples
  • Find all authors who have a vocabulary over 10000
    words
  • Authors(login, name)
  • Webpages(url, title, login)
  • Mentions(url, word)

SELECT name FROM Authors, Webpages,
Mentions WHERE Authors.loginWrote.login AND
Webpages.urlMentions.url GROUP BY
name HAVING count(distinct word) 10000
26
Summary SQL queries
  • Only SELECT, FROM required
  • Cant have HAVING without GROUP BY
  • Can have GROUP BY without HAVING
  • Any clauses used must appear in this order

SELECT LFROM Rs WHERE s GROUP
BY L2 HAVING s2 ORDER BY L3
27
New topic Nulls in SQL
  • If we dont have a value, can put a NULL
  • Null can mean several things
  • Value does not exists
  • Value exists but is unknown
  • Value not applicable
  • The schema specifies whether null is allowed for
    each attribute
  • NOT NULL if not allowed
  • By default, null is allowed

28
Null Values
  • x NULL ? 4(3-x)/7 NULL
  • x NULL ? x 3 x NULL
  • x NULL ? 3 (x-x) NULL
  • x NULL ? x Joe is UNKNOWN
  • In general no row using null fields appear in
    the selection test will pass the test
  • With one exception
  • Pace Boole, SQL has three boolean values
  • FALSE 0
  • TRUE 1
  • UNKNOWN 0.5

29
Null values in boolean expressions
  • C1 AND C2 min(C1, C2)
  • C1 OR C2 max(C1, C2)
  • NOT C1 1 C1
  • height 6 UNKNOWN
  • ? UNKNOWN OR weight 190 UNKOWN
  • ? (age

SELECT FROM Person WHERE (age (height 6 OR weight 190)
E.g.age20heightNULLweight180
30
Comparing null and non-nulls
  • Unexpected behavior
  • Some Persons are not included!
  • The trichotomy law does not hold!

SELECT FROM Person WHERE age 25
31
Testing for null values
  • Can test for NULL explicitly
  • x IS NULL
  • x IS NOT NULL
  • But
  • x NULL is always null
  • Now it includes all Persons

SELECT FROM Person WHERE age 25 OR age IS NULL
32
Null/logic review
  • TRUE AND UNKNOWN ?
  • TRUE OR UNKNOWN ?
  • UNKNOWN OR UNKNOWN ?
  • X NULL ?

33
Outerjoin
  • Like L R except that dangling tuples are
    included, padded with nulls
  • Left outerjoin dangling tuples from left are
    included
  • Nulls appear on the right
  • Right outerjoin dangling tuples from right are
    included
  • Nulls appear on the left

34
Joins operations
  • Variations
  • Cross join (Cartesian product)
  • Join On
  • Natural join
  • Outer join
  • Apply to relations appearing in selections

35
Cross join - example
MovieStar
MovieExec
36
Cross join example
SELECT FROM MovieStar CROSS JOIN MovieExec
37
Join On example
SELECT FROM MovieStar JOIN MovieExec ON
MovieStar.Name MovieExec.Name
38
Natural Joins
  • MovieStar(name, address, gender, birthdate)
  • MovieExec(name, address, networth)
  • Natural Join
  • MovieStar Natural Join MovieExec
  • Results in list of individuals who are
    movie-stars as well as executives
  • (Name, address, gender, birthdate, networth)

39
Example - Natural join
MovieStar
MovieExec
SELECT FROM MovieStar NATURAL JOIN MovieExec
40
Outer Join - Example
SELECT FROM MovieStar LEFT OUTER JOIN MovieExec
ON MovieStart.nameMovieExec.name
SELECT FROM MovieStar RIGHT OUTER JOIN
MovieExec ON MovieStart.nameMovieExec.name
41
Outer Join - Example
MovieStar
MovieExec
SELECT FROM MovieStar FULL OUTER JOIN MovieExec
ON MovieStart.nameMovieExec.name
42
New-style join syntax
  • Old-style syntax simply lists tables separated by
    commas
  • New-style makes the join explicit

SELECT FROM A,B WHERE
SELECT FROM A JOIN B ON WHERE
43
New-style join syntax
  • Functionally equivalent to old-style, but perhaps
    more elegant
  • Introduced in Oracle 8i, MySQL 3.x/4.x
  • Older versions / other DBMSs may only support
    old-style syntax

44
New-style join types
  • cross joins (simplest)
  • FROM A CROSS JOIN B
  • Inner joins (regular joins)
  • FROM A INNER JOIN B ON
  • Natural join
  • FROM A NATURAL JOIN B
  • Joins on common fields and merges
  • Outer joins

45
New-style outer joins
  • Outer joins may be left, right, or middle
  • FROM A LEFT OUTER JOIN B
  • FROM A RIGHT OUTER JOIN B
  • FROM A FULL OUTER JOIN B
  • OUTER is optional
  • If OUTER is included, then FULL is the
    default
  • Q How to remember left v. right?
  • A It indicates the side whose rows are always
    included

46
Old-style outer joins in Oracle
  • Outer joins can also be done with the old-style
    syntax, but with the ()
  • WHERE A.attB.att()
  • corresponds to
  • FROM A LEFT JOIN B
  • The () is applied to all B attributes referred
    to in the WHERE clause
  • Q How to remember which side gets the ()?
  • A The side that gets null rows added
Write a Comment
User Comments (0)
About PowerShow.com