Formulating Accurate Queries - PowerPoint PPT Presentation

1 / 38
About This Presentation
Title:

Formulating Accurate Queries

Description:

Contains NO aggregation (or summaries) Summary Query. Contains aggregation of some type (example: maximum, minimum, sum, average, count, etc. ... – PowerPoint PPT presentation

Number of Views:41
Avg rating:3.0/5.0
Slides: 39
Provided by: ReneeM63
Category:

less

Transcript and Presenter's Notes

Title: Formulating Accurate Queries


1
Formulating Accurate Queries
  • Theo Shlien, Systems Integrator

2
Overview
  • Know your business question
  • Get details!
  • Know your data
  • What do you expect back?
  • Table joins
  • Common pitfalls

3
Know Your Business Question
  • Get a clear definition of the question
  • Examples
  • How many CDs does each person in our database
    have?
  • What are the names/IDs of the people that have
    CDs in our database?

4
Know Your Business Question
  • What kind of information you need depends on the
    type of query
  • Two types of queries
  • Detail listing
  • Summary

5
Types of Queries
  • Detail Query
  • Contains detailed information about records that
    you are pulling back
  • Contains NO aggregation (or summaries)
  • Summary Query
  • Contains aggregation of some type (example
    maximum, minimum, sum, average, count, etc.)
  • Easier to make mistakes on summaries

6
General Business Question Limiting Results
(detail and summary)
  • Limiting factors depend on the data/business
  • Entire population or subset - Do we want
    everyone?
  • - Just Teenagers?
  • - Just People who are still alive?
  • Time periods - Do we want all purchases
    within the last year?
  • - Total purchase over their entire
    life?
  • - How many they own?
  • - Were some given as gifts?

7
General Business Question (cont)
  • Historical - Do we want people in the
    program now?
  • - Or have ever been?
  • Geography - Certain regions?
  • - Purchased from specific stores?
  • - Certain states?

8
Limiting Results Define terms
  • Defining terms of the business question is very
    important
  • Consider terms that should be obvious
  • Data storage for these obvious terms could be
    different than expected
  • In our example, what constitutes a
    person?Businesses could be included in person
    data
  • What constitutes a CD?Are computer CDs
    included? Music CDs? DVDs? Xbox games? Blank CDs?

9
Detailed Business Question
  • What information do you want displayed?
  • List the tables and fields where the data is
    present
  • Check to make sure the fields in question are not
    codes, if descriptions are desired
  • If this will be used by someone else, consider
    the formatting and order of the data
  • LIMIT YOUR DATA!

10
Summary Business Question
  • What information do you want displayed? VERY
    IMPORTANT!
  • Which elements will be calculated?
  • How will the elements be calculated? (Formula
    used? Distinct used?)
  • Limiting is less important when aggregating
  • Know your data

11
Know Your Data
  • Important for ALL queries
  • Where is the information stored?
  • What ELSE is stored in the same table?
  • How is the data stored?
  • How many records will come back?
  • How are different tables related?
  • Table Relationships
  • Join types
  • Join fields

12
Know Your Data Table Relationships
  • One-to-oneExample Each person (OK, there may be
    exceptions to this) has one and only one
    biological motherNote One-to-one relationships
    will not result in duplication or missing data
  • One-to-manyExample Each person (in this room)
    has more than one co-workerNote A one-to-many
    relationship could cause a record to be listed
    multiple times.
  • One-to-zeroExample Each person (in this room)
    has no wings.Note A one-to-zero relationship
    could result in the appearance of missing data

13
Table Relationships Example
  • How many CDs does each person have?
  • Person Lists person information
  • Name Lists names of people in Person table
  • CDs Lists information about CDs
  • Purchases Lists people and what CDs they
    purchased

Person
Name
CD
Purchases
14
Table Relationships (cont)
  • Written query counts the number of purchases for
    each person

Purchases
15
Table Relationships (cont)
  • When sorted by count, we notice no zero counts
    show up, and some really HIGH counts DO
  • What did we do wrong?

Purchases
16
Table Relationships (cont)
  • Purchases table does not contain people who have
    not purchased CDs
  • Must use Person table
  • When we use the Person table, we LOST a record
  • What did we do wrong?

Purchases
Person
17
Table Relationships (cont)
  • Person ID 138 is not in the Person tableNote
    This is an example of what happens with a
    one-to-zero relationship
  • Ignoring this for now, what if we want person
    NAMES, instead of IDs?

Purchases
Person
18
Table Relationships (cont)
  • Why are two 20 counts showing up?
  • Re-add Person ID as a displayed field

Purchases
Person
Name
19
Table Relationships (cont)
  • Wilma has changed her name at some pointNOTE
    This is an example of a one-to-many relationship
  • How do we fix the problems weve encountered?

Purchases
Person
Name
20
Table Relationships What We Learned
  • When you need to use more than one table, start
    with one
  • Get an initial count from your primary table to
    get an expected estimate (In the example, we
    probably should have gotten a count of how many
    people we had, first)
  • Add one table at a time
  • Look at each step. Does the result make sense?
  • This works for detail AND summary queries

21
Table Relationships Example Problems
  • Do we want Blockbuster listed? How do we remove
    it?KNOW YOUR DATA
  • How do we get rid of the extra person
    listing?KNOW YOUR DATA
  • How could we get Person 138 back?KNOW YOUR DATA
  • How do we find out who does not own any CDs?KNOW
    YOUR DATA
  • Note If we wanted to answer questions about what
    TYPE of CD we wanted included, wed need to use
    the CDs table.

22
Know Your Data Join types
  • Inner Join Produces results from two tables only
    if there is an exact match in BOTH tablesOur
    example used Inner Joins exclusively
  • Left Outer Join Produces data for each record in
    left table, even if there is no match in the
    right table
  • Right Outer Join Produces data for each record
    in right table, even if there is no match in the
    left table

23
Join Types Correcting Relationship Problems
  • Use an outer join when you have one table
    missing data, but you still want information
    displayed
  • In our example, if using a left outer join,
    which table will be on the left? (The tables
    were working with are People and
    Purchases)Business Question How many CDs does
    each person in our database have?

Purchases
Person
24
Join Types Correcting Relationship Problems
  • Now we can see people with count zero
  • How do we get rid of businesses?Answer This is
    dependant on how the information is stored. Let
    us assume, however, that there is a business
    indicator flag available.

Purchases
Person
25
Join Types Correcting Relationship Problems
  • How do we get names?(Keep in mind the multiple
    names per person)
  • Answer Again, data dependent. Let us assume that
    there is a Name type flag available on the Name
    table

Person
Purchases
26
Join Types Correcting Relationship Problems
  • Use the Name Type to get only the most valid name
  • Note For validation purposes, we kept the Person
    ID field in
  • What about the missing person 138?

Name
Person
Purchases
27
Know Your Data Join Fields
  • Join fields are as important as join type
  • Example How many CDs has each person in our
    database purchased in the last year?

Purchases
Person
28
Know Your Data Join Fields
  • Join fields are as important as join type
  • Example How many CDs has each person in our
    database purchased in the last year?Note We
    still want to see zero counts.
  • Temptation is to keep the left outer join weve
    done before, and add a where clause (or
    qualification) for the purchase date.

Person
Purchases
29
Join Fields (cont)
  • No zeros purchases again
  • Join happens FIRST
  • Where clause eliminates records
  • THEN aggregation occurs
  • How do we fix this?

Purchases
Person
30
Join Fields (cont)
  • Put the date check INTO the left outer join (In
    BI-Query, this would need a dynamic relationship)
  • Join happens first Since its a left outer join,
    all people are kept, even if theres no purchase
  • Then the count is done on purchases

Person
Purchases
31
Join Fields Part 2
  • COMPLETE joins are important
  • Example (incomplete join) No name type meant we
    got too many names per person
  • Example (too specific) How many names does each
    person have?Note Some people may not have any
    names entered
  • If name type is added to THIS query, you well get
    how many names of each type the person has
  • For a detailed query, extra/too specific a join
    could eliminate rows that you want

32
Common Pitfalls Summary
  • Forgetting table/data relationshipsResult
    Missing/extra information or wrong counts
  • Using the wrong join typeResult Missing/extra
    information or wrong counts
  • Using the wrong join fieldsResult Missing/extra
    information or wrong counts

33
Common Pitfalls Additional
  • Using extra fields in summary / grouped
    queriesTemptation in getting extra information
    for the summary
  • Unclear qualifiers / where clauseMultiple
    conditions (due to using and in combination
    with or) cause misinterpretation

34
Common Pitfalls Extra Fields
  • Not a problem in detail queries
  • Example How many CDs were sold from each Label
    (aka Record company)?
  • Expect to see Label/company and a countHowever,
    an additional field (genre) is added for extra
    information.

CD
Purchases
35
Common Pitfalls Extra Fields
  • While information is correct, it is not what we
    wanted / were asked for
  • The fix is very simple, just remove the extra
    column. To avoid this problem, have an idea of
    what you want the results to look like BEFORE the
    query is written

36
Common Pitfalls Unclear Qualifiers
  • Example I want to see a movie and go to dinner
    or go shopping
  • What does this example mean?
  • I want to see a movie and go to dinner, or I want
    to go shoppingOR
  • I want to see a movie and either go to dinner or
    go shopping
  • Same problem occurs with queries

37
Common Pitfalls Unclear Qualifiers Solution
  • Be specific. Group qualifiers explicitly
  • I want to (see a movie and go to dinner) or I
    want to go shoppingOR
  • I want to see a movie and either (go to dinner or
    go shopping)
  • In SQL, just put parenthesis in to enforce order
    of evaluation. In reporting tools, group
    qualifiers together
  • Teradata makes assumptions about the order you
    want to have them evaluated

38
Writing Accurate Queries
  • Know your business question
  • Get details!
  • Know your data
  • What do you expect back?
  • Table joins
  • Common pitfalls
  • Only have the fields you want / need
  • Group qualifiers explicitly
Write a Comment
User Comments (0)
About PowerShow.com