Monicas provisional plan - PowerPoint PPT Presentation

1 / 36
About This Presentation
Title:

Monicas provisional plan

Description:

For each value of a domain, there is a bitmap identifying the row Ids of satisfying tuples ... The second table shows a bitmap index for rating, assuming values ... – PowerPoint PPT presentation

Number of Views:52
Avg rating:3.0/5.0
Slides: 37
Provided by: jenny91
Category:

less

Transcript and Presenter's Notes

Title: Monicas provisional plan


1
Monicas provisional plan
2
F28DM Database Management Systems Indexes in
Oracle
  • Monica Farrow
  • monica_at_macs.hw.ac.uk
  • Room EMG30, Ext 4160
  • Material on Vision my web page
  • Content taken from HW lecturers,
  • books by Rob Coronel, and by Connolly Begg
  • For more info on everything, see Oracle9i
    Database Concepts (on the web)

3
Basic Use of Indexes
  • Primary, Secondary, Composite
  • When to use

4
Indexes
  • An index helps speed up retrieval.
  • A column that is frequently referred to in the
    WHERE clause is a potential candidate for
    indexing.
  • Why use indexing?
  • It's easy and provides immediate value
  • It can be used to speed up existing applications
    without changing any of their logic or queries
  • One of the main reasons for the success of
    relational databases is that they have efficient
    indexing systems.

5
Index recap
  • An index typically consists of the search key
    (one or more attributes) and the row ID(s) of
    record(s) which match this key
  • An index on a file speeds up selections on the
    search key fields specified for that index
  • Any subset of the fields of a relation can be the
    search key for an index on the relation
  • A primary index allows efficient searching on the
    primary key of a relation
  • A secondary index allows efficient searching on
    other attributes which are often used in queries

6
Example the Sailors and Boats Domain
  • Information needs to be stored about sailors,
    boats and reservations that sailors are making
    for boats
  • There are three relations with the following
    schemas
  • Sailor (idint, namestring, ratingint, ageint,
    basestring)
  • Boat (idint, namestring, colourstring,
    basestring)
  • Reservation (sidint, bidint, daydate)

7
Automatically Created Indexes
  • On every relation, Oracle creates a non-clustered
    Btree index for the primary key
  • A B-tree is shown on the next slide
  • A B-tree stays balanced when items are added and
    removed
  • Indexes will be created on
  • Sailor table, id column
  • Boat table, id column
  • Reservations table, a composite index on sid, bid
    and day
  • Additionally Oracle will also create an index for
    every UNIQUE constraint

8
B-tree index
Diagram just to give you an idea, you dont have
to memorise this
9
Composite index keys
  • If the key consists of more than one attribute,
    the index is a composite or concatenated index
  • i.e. an index for several attributes
  • A composite index is sorted lexicographically
    according to the order of the attributes
  • E.g for a composite key (name,age), we
    have(Kelly, 22) lt (Kelly, 63) lt (Smith, 18) lt
    (Smith,36)
  • This is different to a composite key (age, name)

10
Secondary indexes
  • Create secondary indexes for other attributes in
    your tables which are frequently searched by
  • In the SQL WHERE clause, not the SQL SELECT line
  • In this case, you must explicitly create the
    index, as shown on the next slide

11
Explicitly Created Indexes
  • An index
  • Has a name. Choose an informative one.
  • Is created for a sequence of attributes over a
    table which must already exist
  • Can be DROPped
  • Examples
  • CREATE INDEX sailor_name_idx ON Sailor(name)
  • DROP INDEX sailor_name_idx
  • CREATE INDEX sailor_name_and_age_idx ON
    Sailor(name, age)

12
Using CREATE INDEX (1) 
  • The CREATE INDEX statement creates a sorted index
    on the specified columns of the named table.
  • Indexes are automatically used to improve the
    performance of queries issued to the database,
    and to sort queries with an ORDER BY clause.
  • Once an index is created, it is never referenced
    in a SQL statement again except to validate it
    (VALIDATE INDEX) or delete it (DROP INDEX).
  • Indexes cannot be created for views.
  • Add an index to the underlying base table

13
Using CREATE INDEX (2)
  • Indexes are always owned by the owner of the
    table. The index name must be unique for each
    table and owner.
  • Exclusive table use.   
  • You cant create an index if the table is in use.
  • CREATE INDEX can be time consuming and the server
    will not process requests referencing the same
    table while the statement is being processed.
  • As with all other CREATE commands, store the
    command in a text file
  • To remind you what the index is like
  • So it can be rerun if you decide to re-create
    your DB

14
Overheads v performance
  • There is an overhead involved in the maintenance
    and use of secondary indexes.
  • The index must be updated whenever the table is
    updated
  • Disk space is needed for the index
  • Balance the overhead against the performance
    benefits
  • Faster data retrieval

15
How to choose secondary indexes (1)
  • Add a secondary index to a foreign key if it is
    frequently accessed
  • E.g. Boat id in Reservations, if we frequently
    want to know the name of the boat in a
    reservation
  • Add a secondary index to any attribute that is
    frequently used as a search key
  • E.g. day in Reservations (what reservations do we
    have for today or tomorrow?)
  • E.g. name in Sailor (what is the rating for the
    sailor called Lubber?). People often search by
    name rather than id.

16
How to choose secondary indexes (2)
  • Add a secondary index to any attributes
    frequently used in order by, group by, min, max,
    avg
  • E.g. age in Sailor if frequently want list in age
    order
  • Add a composite secondary index to a small set of
    attributes if the index would provide the full
    details for a frequently used query without
    having to search the base table
  • E.g. rating and age in Sailor if a frequent query
    isSELECT rating, AVG(age) FROM SailorGROUP BY
    rating

17
When not to use secondary indexes
  • If the relation is small not many rows
  • If the relation or attribute is frequently
    updated
  • If periodic large updates, drop the index, update
    the data, re-create the index
  • If the attribute is always used in queries that
    retrieve a significant proportion of the rows in
    the relation
  • E.g. If the attribute has low sparsity (i.e. the
    number of different values is small, such as
    gender F or M).

18
A typical exam question
  • The following SQL Create commands have been used
    in Oracle. There will be a large number of
    records in the Employee table but very few
    records in the JobTitle table.
  • CREATE TABLE Employee (eID VARCHAR(8) PRIMARY
    KEY,eName VARCHAR(20),jobTitle VARCHAR (20)
    REFERENCES JobTitle(title),sex CHAR (1) CHECK
    (sex in ('M', 'F')),NHSNumber VARCHAR(12) UNIQUE
    )CREATE TABLE JOBTITLE (title VARCHAR(20)
    PRIMARY KEY,salaryGrade CHAR(1) )

19
A typical exam question
  • Common queries will be
  • Find all details of employees with a particular
    name
  • Find the names of all male employees
  • Find names of all employees with a particular
    salary grade
  • Find all employees with a given job title
  • Find the name of the employee with a given NHS
    number
  • For each of the attributes in both tables,
    discuss whether you should create an index for
    that field, giving

20
More Advanced Indexing Topics
  • Index-organised tables
  • Bitmap indexes
  • Clusters

21
Index organised tables
  • Index organised tables
  • Entries of index are table records rather than
    row IDs
  • Useful if frequent access to complete table via
    the index
  • In Oracle, the default table organisation is heap
  • To store the table in primary key order, add the
    words ORGANIZATION INDEX to the end of the CREATE
    TABLE command
  • In the B-tree, the leaf nodes are replaced by all
    the non-primary key attributes in the row

22
.Bitmap indexes
  • Bitmap indexes
  • An example on the next slide
  • For each value of a domain, there is a bitmap
    identifying the row Ids of satisfying tuples
  • Usually applied to attributes with a sparse
    domain
  • In Oracle, lt100 distinct values
  • E.g. bitmaps for all tuples with sex male and
    for sexfemale
  • Updating a bitmap takes a lot of time, so use for
    tables with hardly any updates, inserts, deletes
  • Ideal for data warehousing
  • See later lecture, if time

23
Bitmap indexes example
  • The first table is the Sailors table
  • The second table shows a bitmap index for rating,
    assuming values are only from 1-3

24
Clusters in Oracle
  • Oracle supports 2 types of clusters
  • Indexed clusters
  • Hashed clusters

25
Clusters
  • Clusters are groups of 1 or more tables
    physically stored together because they share
    common columns and are often used together
    (Connolly Begg)
  • This improves disk access
  • The related columns of the tables are called the
    cluster key

26
Saiilor/Reservation cluster example
  • Sailor and Reservation relations could be stored
    together, with the cluster key Sailor id
  • Each relation can be retrieved independently
  • Whenever the joined relations are required,
    access is faster

27
Should we cluster Sailors and Reservations?
  • A cluster for Sailors and Reservations could be
    created if many queries in the application joined
    these tables together.
  • A cluster for Sailors and Reservations would be
    inefficient if reservations were only a minor
    part of the application, and many more queries
    involved Sailors joined with other tables

28
Indexed clusters
  • Use when
  • Queries retrieve records over a range of cluster
    key values
  • Clustered tables may grow unpredictably
  • With an indexed table or index cluster, Oracle
    locates the rows in a table using key values that
    Oracle stores in a separate index.
  • E.g.
  • No idea how many reservations there are going to
    be
  • Queries like SELECT FROM Sailors S,
    Reservations RWHERE S.id R.sidAND S.id gt 50

29
Defining a indexed cluster in SQL(1)
  • Create a cluster first
  • CREATE CLUSTER sailor_id_cluster (id int)
  • A cluster can be made for a fixed number of typed
    attributes (here just one)
  • Define an index on the cluster
  • CREATE INDEX sailor_id_idx ON CLUSTER
    sailor_id_cluster

30
Defining an indexed cluster in SQL (2)
  • Create table(s) for the cluster
  • CREATE TABLE Sailor (id INT NOT NULL,
  • name VARCHAR(20),

  • ranking INT,
  • age
    INT)CLUSTER sailor_id_cluster(id)
  • CREATE TABLE Reservation (sid INT, bid
    INT.........as before....)CLUSTER
    sailor_id_cluster(sd)

31
Hash Clusters
  • The key of a hash cluster, like the key of an
    index cluster, can be a single column or
    composite key (multiple column key).
  • A hash function is applied to the cluster key to
    obtain a hash value.
  • The hash value determines the location where the
    rows of a table are physically stored. So all
    records with the same cluster key are stored
    together

32
Creating a hash cluster
  • Create a hash cluster
  • CREATE CLUSTER sailor_id_hashcluster (id
    int)HASH IS id HASHKEYS 10000
  • Add table(s) as before
  • CREATE TABLE Sailor (id INT NOT NULL,
  • name VARCHAR(20),

  • ranking INT,
  • age
    INT)CLUSTER sailor_id_hashcluster(id)

33
Use hashing when (1)
  • Most queries are equality queries on the cluster
    key
  • SELECT ... WHERE cluster_key ...
  • the cluster key in the equality condition is
    hashed, and the corresponding hash key is usually
    found with a single read.
  • In comparison, for an indexed table the key value
    must first be found in the index (usually several
    reads), and then the row is read from the table
    (another read).
  • AND WHEN . . .

34
Use hashing when (2)
  • AND
  • The tables in the hash cluster are primarily
    static in size so that you can determine the
    number of rows and amount of space required for
    the tables in the cluster.
  • If tables in a hash cluster require more space
    than the initial allocation for the cluster,
    performance degradation can be substantial
    because overflow blocks are required.

35
Hashing v Indexing
  • To find or store a row in an indexed table or
    cluster, a minimum of two (there are usually
    more) I/Os must be performed
  • One or more I/Os to find or store the key value
    in the index
  • Another I/O to read or write the row in the table
    or cluster
  • In contrast, if a hash function is used to locate
    a row in a hash cluster no I/O is required.
  • As a result, a minimum of one I/O operation is
    necessary to read or write a row in a hash
    cluster.

36
A typical exam question - answers
  • Job title table no because few records.
  • Oracle will automatically create an index for
    primary keys and unique fields, so there is
    already an index for eID, NHSNumber and title
  • eName yes, because used in condition of query
  • sex no, because attribute has low sparsity
  • jobTitle yes, because used in queries and in
    joins
  • salaryGrade no because few entries in table
Write a Comment
User Comments (0)
About PowerShow.com