The%20Relational%20Model%20(cont - PowerPoint PPT Presentation

About This Presentation
Title:

The%20Relational%20Model%20(cont

Description:

... manager should do a decent job. Buffer Management. Query Optimization ... free frame. Page Requests from Higher Levels. BUFFER POOL. choice of frame dictated ... – PowerPoint PPT presentation

Number of Views:45
Avg rating:3.0/5.0
Slides: 32
Provided by: wwwinstEe
Category:

less

Transcript and Presenter's Notes

Title: The%20Relational%20Model%20(cont


1
The Relational Model (contd) Introduction to
Disks and Storage
  • CS 186, Spring 2007, Lecture 3
  • Cow book Section 1.5, Chapter 3 (contd)
  • Cow book Chapter 9
  • Mary Roth

2
Administrivia
  • Homework 0 due today 10 p.m.!
  • Nathan and Erinaios posted their office hours on
    class homepage
  • Homework 1 available today from class web site
  • Submit team members online
  • Read thru description well talk more about it
    after todays lecture
  • Questions from last time?

3
Outline
  • What we learned last time
  • Components of a DBMS
  • Relational Data Model
  • New stuff
  • Storage, Disks and Files

4
Review Components of a DBMS
  • A DBMS is like an ogre it has layers

5
Review Relational Data Model
  • Most widely used data model today.
  • Relations
  • Schema specifies name of relation, plus name
    and type of each column.
  • Instance a table, with rows and columns that
    contain data.
  • SQL is a query language for relational data model
  • DDL To define/modify/change schemas
  • DML To query data in table.
  • Keys are a way to associate tuples in different
    relations

6
Lets return to our bank
  • Can we apply a relational model to our bank
    spreadsheet?

CREATE TABLE CUSTOMERS (CustomerID INTEGER,
Name VARCHAR(128),
Address VARCHAR(256),
AccountID INTEGER)
CREATE TABLE ACCOUNTS (AccountID INTEGER,
Balance Double)
CustomerID Name Address AccountID
Account ID Balance
314159 Frodo Baggins BagEnd 112358
271828 Sam Gamgee BagShot Row 132124
42 Bilbo Baggins Rivendell 112358
112358 4500.00
132124 2000.00
7
Primary Keys
  • A set of fields is a superkey if
  • No two distinct tuples can have same values in
    all key fields
  • A set of fields is a key for a relation if
  • It is a superkey
  • No subset of the fields is a superkey
  • what if gt1 key for a relation?
  • One of the keys is chosen (by DBA) to be the
    primary key. Other keys are called candidate
    keys.
  • e.g.
  • sid, gpa is an example of a superkey.
  • sid is a key for Students.
  • what about name? login?

Students
sid
name
login
age
gpa
53666
Jones
jones_at_cs
18
3.4
53688
Smith
smith_at_eecs
18
3.2
53650
Smith
smith_at_math
19
3.8
8
Primary and Candidate Keys in SQL
  • Keys must be chosen and defined carefully!
  • They imply semantics!
  • What does this set of key definitions imply about
    students?

CREATE TABLE Enrolled (sid CHAR(20) cid
CHAR(20), grade CHAR(2), PRIMARY KEY
(sid), UNIQUE (cid, grade))
Students can take only one course, and no two
students in a course receive the same grade.
9
Primary and Candidate Keys in SQL
  • Better definition

CREATE TABLE Enrolled (sid CHAR(20) cid
CHAR(20), grade CHAR(2), PRIMARY KEY
(sid,cid))
For a given student and course, there is a
single grade.
10
Foreign Keys, Referential Integrity
  • Foreign key Set of fields in one relation that
    is used to refer to a tuple in another
    relation.
  • Must correspond to the primary key of the other
    relation.
  • Like a logical pointer.
  • Plays the same role as the physical pointer in
    IMS
  • If all foreign keys in a table refer to tuples in
    the other, referential integrity is achieved
    (i.e., no dangling references.)

11
Foreign Keys in SQL
  • E.g. Only students listed in the Students
    relation should be allowed to enroll for courses.
  • sid is a foreign key referring to Students

CREATE TABLE Enrolled (sid CHAR(20),cid
CHAR(20),grade CHAR(2), PRIMARY KEY (sid,cid),
FOREIGN KEY (sid) REFERENCES Students )
Enrolled
Students
sid
cid
grade
sid
name
login
age
gpa
53666
Carnatic101
C
53666
Jones
jones_at_cs
18
3.4
53666
Reggae203
B
53688
Smith
smith_at_eecs
18
3.2
53650
Topology112
A
53650
Smith
smith_at_math
19
3.8
53666
History105
B
12
Lets return to our bank
  • Can we define keys for our relations?

CREATE TABLE CUSTOMERS (CustomerID INTEGER NOT
NULL, Name VARCHAR(128),
Address VARCHAR(256),
AccountID
INTEGER, PRIMARY KEY(CustomerID), FOREIGN
KEY(accountid) references ACCOUNTS)
CREATE TABLE ACCOUNTS (AccountID INTEGER NOT
NULL, Balance Double,
PRIMARY KEY (AccountID))
  • Why do we need NOT NULL?
  • What would happen if I executed these commands in
    this order?

13
Lets return to our bank
  • Well come back to these later
  • Write a SQL query (DML) that returns the names
    and account balances for all customers that have
    an account balance gt 2500.
  • Write a SQL query (DML) that withdraws 300 from
    Frodos account.

14
Intermission
15
Disks, Memory, and Files
You are here
16
Disks and Files
  • DBMS stores information on disks.
  • Data must be transferred to and from disk and RAM
  • READ transfer data from disk to main memory
    (RAM).
  • WRITE transfer data from RAM to disk.
  • READ and WRITE are expensive and must be planned
    carefully!
  • DBMS architecture is designed to minimize both

17
Why Not Store Everything in Main Memory?
  • Costs too much. For 300, PCConnection will
    sell you
  • 1GB of RAM
  • 30GB of flash
  • 1 TB of disk
  • Main memory is volatile. We want data to be
    saved between runs. (Obviously!)

18
The Storage Hierarchy
Smaller, Faster
  • Main memory (RAM) for currently used data.
  • Disk for the main database (secondary storage).
  • Tapes for archiving older versions of the data
    (tertiary storage).

Bigger, Slower
Source Operating Systems Concepts 5th Edition
19
Jim Grays Storage Latency Analogy How Far
Away is the Data?
20
Disks
  • Secondary storage device of choice.
  • Main advantage over tapes
  • faster time to retrieve
  • random access vs. sequential.
  • Data is stored and retrieved in units called disk
    blocks or pages.
  • Unlike RAM, time to retrieve a disk block varies
    depending upon location on disk.
  • Therefore, relative placement of blocks on disk
    has major impact on DBMS performance!

21
Components of a Disk
Spindle
Disk head
The platters spin (say, 120 rps).
The arm assembly is moved in or out to position
a head on a desired track. Tracks under heads
make a cylinder (imaginary!).
Sector
Platters
Only one head reads/writes at any one time.
  • Block size is a multiple of sector size (which
    is fixed).

22
Accessing a Disk Page
Transfer time
Seek time
  • Time to access (read/write) a disk block
  • seek time (moving arms to position disk head on
    track)
  • rotational delay (waiting for block to rotate
    under head)
  • transfer time (actually moving data to/from disk
    surface)

Rotational delay
23
Accessing a Disk Page
  • Seek time and rotational delay dominate.
  • Seek time varies between about 0.3 and 10msec
  • Rotational delay varies from 0 to 4msec
  • Transfer rate around .08msec per 8K block
  • Key to lower I/O cost reduce seek/rotation
    delays!

24
Arranging Pages on Disk
  • Next block concept
  • blocks on same track, followed by
  • blocks on same cylinder, followed by
  • blocks on adjacent cylinder
  • Blocks in a file should be arranged sequentially
    on disk (by next), to minimize seek and
    rotational delay.
  • For a sequential scan, pre-fetching several pages
    at a time is a big win!

25
Summary Disk Space Manager
  • Lowest layer of DBMS software manages space on
    disk (using OS file system or not?).
  • Higher levels call upon this layer to
  • allocate/de-allocate a page
  • read/write a page
  • Best if a request for a sequence of pages is
    satisfied by pages stored sequentially on disk!
  • Responsibility of disk space manager.
  • Higher levels dont know how this is done, or how
    free space is managed.
  • Though they may make performance assumptions!
  • Hence disk space manager should do a decent job.

26
Buffer Management
You are here
  • Data must be in RAM for DBMS to operate on it!
  • Buffer Mgr hides the fact that not all data is in
    RAM

27
Buffer Management in a DBMS
Page Requests from Higher Levels
BUFFER POOL
disk page
free frame
MAIN MEMORY
DISK
choice of frame dictated by replacement policy
  • Buffer pool information table contains
    ltframe,
    pageid, pin_count, dirtygt

28
Requesting a page
I need page 3
Higher level DBMS component
BUFFER POOL
Buf Mgr
22
disk page
3
I need page 3
3
free frames
MAIN MEMORY
Disk Mgr
DISK
1
2
3
22
90

  • If requests can be predicted (e.g., sequential
    scans) pages can be pre-fetched several pages at
    a time!

29
Releasing a page
I read page 3 and Im done with it
Higher level DBMS component
BUFFER POOL
Buf Mgr
22
disk page
3
free frames
MAIN MEMORY
Disk Mgr
DISK
1
2
3
22
90


30
Releasing a page
I wrote on page 3 and Im done with it
Higher level DBMS component
BUFFER POOL
Buf Mgr
22
disk page
3
free frames
3
MAIN MEMORY
Disk Mgr
DISK
3
1
2
3
22
90


31
More on Buffer Management
  • Requestor of page must eventually unpin it, and
    indicate whether page has been modified
  • dirty bit is used for this.
  • Page in pool may be requested many times,
  • a pin count is used.
  • To pin a page, pin_count
  • A page is a candidate for replacement iff pin
    count 0 (unpinned)
  • CC recovery may entail additional I/O when a
    frame is chosen for replacement.
  • Write-Ahead Log protocol more later!
Write a Comment
User Comments (0)
About PowerShow.com