Storing Data: Disks and Files - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
Storing Data: Disks and Files
  • Chapter 7

"Yea, from the table of my memory I'll wipe away
all trivial fond records." -- Shakespeare, Hamlet
2
Disks and Files
  • DBMS stores information on (hard) disks.
  • This has major implications for DBMS design!
  • READ: transfer data from disk to main memory
    (RAM).
  • WRITE: transfer data from RAM to disk.
  • Both are high-cost operations, relative to
    in-memory operations, so must be planned
    carefully!

3
Why Not Store Everything in Main Memory?
  • Costs too much. $1000 will buy you either 128MB
    of RAM or 7.5GB of disk today.
  • Main memory is volatile. We want data to be
    saved between runs. (Obviously!)
  • Typical storage hierarchy:
  • Main memory (RAM) for currently used data.
  • Disk for the main database (secondary storage).
  • Tapes for archiving older versions of the data
    (tertiary storage).

4
Disks
  • Secondary storage device of choice.
  • Main advantage over tapes: random access vs.
    sequential access.
  • Data is stored and retrieved in units called disk
    blocks or pages.
  • Unlike RAM, time to retrieve a disk page varies
    depending upon location on disk.
  • Therefore, relative placement of pages on disk
    has major impact on DBMS performance!

5
Components of a Disk
(Diagram: spindle, disk head, arm assembly, sector, platters)
  • The platters spin (say, 90 rps).
  • The arm assembly is moved in or out to position
    a head on a desired track. Tracks under heads
    make a cylinder (imaginary!).
  • Only one head reads/writes at any one time.
  • Block size is a multiple of sector
    size (which is fixed).

6
Accessing a Disk Page
  • Time to access (read/write) a disk block:
  • seek time (moving arms to position disk head on
    track)
  • rotational delay (waiting for block to rotate
    under head)
  • transfer time (actually moving data to/from disk
    surface)
  • Seek time and rotational delay dominate.
  • Seek time varies from about 1 to 20 msec
  • Rotational delay varies from 0 to 10 msec
  • Transfer rate is about 1 msec per 4KB page
  • Key to lower I/O cost: reduce seek/rotation
    delays! Hardware vs. software solutions?
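As a back-of-the-envelope sketch (my own, using the slide's illustrative figures, not measurements), the access-time model above can be written as:

```python
# Disk access time = seek + rotational delay + transfer.
# Figures are the slide's illustrative ranges, not real measurements.

def access_time_ms(seek_ms, rotational_ms, pages, transfer_ms_per_page=1.0):
    """Time to read `pages` consecutive 4KB pages after one positioning."""
    return seek_ms + rotational_ms + pages * transfer_ms_per_page

# Worst case from the slide: 20 ms seek, 10 ms rotation, one 4KB page.
print(access_time_ms(20, 10, 1))    # 31.0 ms, dominated by seek + rotation
# Reading 10 sequential pages amortizes the positioning cost:
print(access_time_ms(20, 10, 10))   # 40.0 ms total, i.e. 4 ms per page
```

This is why the next slide argues for sequential placement: positioning is paid once per run of contiguous pages, not once per page.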

7
Arranging Pages on Disk
  • "Next" block concept:
  • blocks on same track, followed by
  • blocks on same cylinder, followed by
  • blocks on adjacent cylinder
  • Blocks in a file should be arranged sequentially
    on disk (by next), to minimize seek and
    rotational delay.
  • For a sequential scan, pre-fetching several pages
    at a time is a big win!

8
RAID
  • Disk array: arrangement of several disks that
    gives abstraction of a single, large disk.
  • Goals: increase performance and reliability.
  • Two main techniques:
  • Data striping: data is partitioned; the size of a
    partition is called the striping unit. Partitions
    (of equal size, the striping unit) are
    distributed over several disks.
  • Redundancy: more disks -> more failures.
    Redundant information allows reconstruction of
    data if a disk fails.

9
RAID Levels
  • Level 0: No redundancy
  • Level 1: Mirrored (two identical copies)
  • Each disk has a mirror image (check disk)
  • Parallel reads; a write involves two disks.
  • Maximum transfer rate = transfer rate of one disk
  • Level 0+1: Striping and Mirroring
  • Parallel reads; a write involves two disks.
  • Maximum transfer rate = aggregate bandwidth

10
RAID Levels (Contd.)
  • Level 2: Error-Correcting Codes
  • Striping unit: one bit.
  • Redundancy scheme: Hamming Code
  • Keeps more redundant information than necessary

11
RAID Levels (Contd.)
  • Level 3: Bit-Interleaved Parity
  • Striping unit: one bit. One check disk.
  • Each read and write request involves all disks;
    the disk array can process one request at a time.
  • Level 4: Block-Interleaved Parity
  • Striping unit: one disk block. One check disk.
  • Parallel reads possible for small requests; large
    requests can utilize full bandwidth
  • Writes involve modified block and check disk
  • Level 5: Block-Interleaved Distributed Parity
  • Similar to RAID Level 4, but parity blocks are
    distributed over all disks

12
Disk Space Management
  • Lowest layer of DBMS software manages space on
    disk.
  • Supports concept of page as unit of data:
  • size of page chosen as size of disk block
  • pages stored as disk blocks
  • reading/writing a page can be done in one disk I/O
  • Higher levels call upon this layer to:
  • allocate/de-allocate a page
  • read/write a page
  • Request for a sequence of pages must be satisfied
    by allocating the pages sequentially on disk!
    Higher levels don't need to know how this is
    done, or how free space is managed.

13
Practice Chapter 7
  • What is the most important difference between a
    disk and a tape?
  • Tapes are sequential devices; disks support direct
    access to a desired page.
  • Explain the terms seek time, rotational delay,
    and transfer time.
  • Time to access (read/write) a disk block:
  • seek time: time to move disk heads to the track on
    which the desired block is located
  • rotational delay: waiting for the block to rotate
    under the disk head; it is the time required for
    half a rotation on average, and is usually less
    than seek time.
  • transfer time: actually reading/writing data
    to/from the block once the head is positioned,
    i.e., the time for the disk to rotate over the block.

14
Practice Chapter 7 (Cont.)
  • Both disks and main memory support direct access
    to any desired location (page). On average, main
    memory accesses are faster, of course. What is
    the other important difference (from the
    perspective of the time required to access a
    desired page)?
  • The time to access a disk page is not constant.
    It depends on the location of the data: accessing
    some data might be much faster than accessing
    other data. This differs from main memory, where
    access time is uniform for most computer systems.
  • If you have a large file that is frequently
    scanned sequentially, explain how you would store
    the pages in the file on a disk.
  • The pages in the file should be stored
    sequentially on a disk. We should put two
    logically adjacent pages as close as possible.
    In decreasing order of closeness, they could be
    on the same track, the same cylinder, or an
    adjacent cylinder.

15
Components of a Disk
(Diagram: spindle, disk head, arm assembly, sector, platters)
  • The platters spin (say, 90 rps).
  • The arm assembly is moved in or out to position
    a head on a desired track. Tracks under heads
    make a cylinder (imaginary!).
  • Only one head reads/writes at any one time.
  • Block size is a multiple of sector
    size (which is fixed).

16
Practice Chapter 7 (Cont.)
  • Consider a disk with sector size 512 bytes, 2,000
    tracks per surface, 50 sectors per track, 5
    double-sided platters, average seek time of 10
    msec.
  • 1. What is the capacity of a track in bytes?
  • >> Bytes/track = bytes/sector x sectors/track
    = 512 x 50 = 25,600 bytes = 25K
  • 2. What is the capacity of each surface?
  • >> Bytes/surface = bytes/track x tracks/surface
    = 25K x 2,000 = 50,000K
  • 3. What is the capacity of the disk?
  • >> Bytes/disk = bytes/surface x surfaces/disk
    = 50,000K x 10 = 500,000K (about 500MB)
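The arithmetic can be checked with a short script (my addition, using the exercise's figures; here "K" means 1,024 bytes):

```python
# Capacity of the exercise's disk: 512-byte sectors, 50 sectors/track,
# 2,000 tracks/surface, 5 double-sided platters => 10 surfaces.
bytes_per_sector = 512
sectors_per_track = 50
tracks_per_surface = 2000
surfaces = 5 * 2  # double-sided platters

bytes_per_track = bytes_per_sector * sectors_per_track    # 25,600 bytes
bytes_per_surface = bytes_per_track * tracks_per_surface  # 51,200,000 bytes
bytes_per_disk = bytes_per_surface * surfaces             # 512,000,000 bytes

print(bytes_per_track // 1024, "K per track")      # 25 K
print(bytes_per_surface // 1024, "K per surface")  # 50000 K
print(bytes_per_disk // 1024, "K per disk")        # 500000 K
```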

17
Practice Chapter 7 (Cont.)
  • Consider a disk with sector size 512 bytes, 2,000
    tracks per surface, 50 sectors per track, 5
    double-sided platters, average seek time of 10
    msec.
  • 4. How many cylinders does the disk have?
  • 5. Give examples of valid block sizes. Is 256
    bytes a valid block size? 2,048? 51,200?
  • 4. >> The number of cylinders is the same as the
    number of tracks on each platter, which is 2,000.
  • 5. >> Block size should be a multiple of the sector
    size: 256 is not a valid block size, but 2,048 and
    51,200 are.

18
Practice Chapter 7 (Cont.)
  • Consider a disk with sector size 512 bytes, 2,000
    tracks per surface, 50 sectors per track, 5
    double-sided platters, average seek time of 10
    msec.
  • 6. If the disk platters rotate at 5,400 rpm
    (revolutions per minute), what is the maximum
    rotational delay?

6. >> If the disk platters rotate at 5,400 rpm,
the time required for a rotation, which is the
maximum rotational delay, is (1/5,400) x 60 =
0.011 seconds. The average rotational delay is
half of the rotation time, about 0.006 seconds.
19
Practice Chapter 7 (Cont.)
  • Consider a disk with sector size 512 bytes, 2,000
    tracks per surface, 50 sectors per track, 5
    double-sided platters, average seek time of 10
    msec.
  • 7. Assuming that one track of data can be
    transferred per revolution, what is the transfer
    rate?

7. >> The capacity of a track is 25K bytes.
Since one track of data can be transferred per
revolution, the data transfer rate is 25K / 0.011
= 2,250K bytes per second.
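A quick check of the rotational-delay and transfer-rate arithmetic above (my addition, same figures as the exercise):

```python
# Rotational delay and transfer rate for the 5,400 rpm disk above.
rpm = 5400
rotation_time_s = 60 / rpm                    # ~0.0111 s: maximum rotational delay
avg_rotational_delay_s = rotation_time_s / 2  # ~0.0056 s on average

track_bytes = 512 * 50                        # 25K bytes per track
# One full track transfers per revolution:
transfer_rate = track_bytes / rotation_time_s # 2,304,000 bytes/s (~2,250K/s)
print(round(rotation_time_s, 4), round(transfer_rate))
```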
20
Buffer Management in a DBMS
(Diagram: page requests from higher levels arrive at the BUFFER POOL
in MAIN MEMORY; the pool holds disk pages in frames, some free, and
exchanges pages with the DISK. The choice of frame is dictated by the
replacement policy.)
  • Data must be in RAM for DBMS to operate on it!
  • Table of <frame, pageid> pairs is maintained.

21
When a Page is Requested ...
  • If requested page is not in pool:
  • Choose a frame for replacement
  • If frame is dirty, write it to disk
  • Read requested page into chosen frame
  • Pin the page and return its address.

  • Requires two variables for buffer manager:
  • pin_count (initially set to zero for each frame)
  • dirty (initially off for each frame)
  • If requests can be predicted (e.g., sequential
    scans)
  • pages can be pre-fetched several pages at a
    time!
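The request-handling steps above can be sketched as a toy buffer manager. The class and method names, the `writes` counter, and the victim-selection order are illustrative inventions, not the book's design:

```python
# Toy sketch of the page-request steps on this slide (illustrative only).

class Frame:
    def __init__(self):
        self.page_id = None
        self.pin_count = 0     # initially zero for each frame
        self.dirty = False     # initially off for each frame

class BufferManager:
    def __init__(self, num_frames):
        self.frames = [Frame() for _ in range(num_frames)]
        self.page_table = {}   # the <frame, pageid> table: page_id -> frame
        self.writes = 0        # number of dirty pages flushed to disk

    def request_page(self, page_id):
        if page_id not in self.page_table:
            frame = self._choose_victim()      # choose a frame for replacement
            if frame.dirty:                    # if frame is dirty, write to disk
                self.writes += 1
                frame.dirty = False
            if frame.page_id is not None:
                del self.page_table[frame.page_id]
            frame.page_id = page_id            # read requested page into frame
            self.page_table[page_id] = frame
        frame = self.page_table[page_id]
        frame.pin_count += 1                   # pin the page and return it
        return frame

    def unpin(self, page_id, dirty=False):
        frame = self.page_table[page_id]
        frame.pin_count -= 1                   # requestor must unpin the page
        frame.dirty = frame.dirty or dirty     # ...and set dirty bit if modified

    def _choose_victim(self):
        for f in self.frames:                  # prefer a free frame
            if f.page_id is None:
                return f
        for f in self.frames:                  # else: any unpinned frame
            if f.pin_count == 0:               # (a real DBMS uses LRU/Clock/etc.)
                return f
        raise RuntimeError("all pages pinned")
```

For example, with two frames, requesting pages A, B, then C evicts one page, and the eviction costs a disk write only if that page was unpinned dirty.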

22
More on Buffer Management
  • Requestor of page must unpin it, and indicate
    whether the page has been modified:
  • a dirty bit is used for this.
  • Page in pool may be requested many times:
  • a pin count is used. A page is a candidate for
    replacement iff pin count = 0.
  • CC & recovery may entail additional I/O when a
    frame is chosen for replacement. (Write-Ahead Log
    protocol; more later.)

23
Buffer Replacement Policy
  • Frame is chosen for replacement by a replacement
    policy:
  • Least-recently-used (LRU), Clock, MRU, etc.
  • Policy can have big impact on # of I/Os; depends
    on the access pattern.
  • Sequential flooding: nasty situation caused by
    LRU + repeated sequential scans.
  • # buffer frames < # pages in file means each page
    request causes an I/O. MRU is much better in this
    situation (but not in all situations, of course).
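Sequential flooding can be demonstrated with a toy cache simulation (my own sketch, not from the book), counting page I/Os for repeated scans:

```python
# Why LRU is worst-case for repeated sequential scans when
# # buffer frames < # pages in file: count page I/Os per policy.

def simulate(policy, num_frames, num_pages, num_scans):
    pool, ios = [], 0                  # pool holds page ids, most recent last
    for _ in range(num_scans):
        for page in range(num_pages):
            if page in pool:
                pool.remove(page)      # hit: just refresh recency
            else:
                ios += 1               # miss: one disk I/O
                if len(pool) == num_frames:
                    if policy == "LRU":
                        pool.pop(0)    # evict least recently used
                    else:              # "MRU"
                        pool.pop()     # evict most recently used
            pool.append(page)
    return ios

# 10 frames, 11-page file, 5 scans:
print(simulate("LRU", 10, 11, 5))  # 55: every page request is a miss
print(simulate("MRU", 10, 11, 5))  # 15: 11 cold misses, then 1 miss per scan
```

With LRU, each page is evicted just before the next scan needs it again, so all 55 requests go to disk; MRU keeps most of the file resident after the first scan.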

24
DBMS vs. OS File System
  • OS does disk space & buffer mgmt: why not let
    the OS manage these tasks?
  • Differences in OS support: portability issues.
  • DBMS can predict access patterns in typical DB
    operations,
  • because most page references are generated by
    higher-level opns (e.g., seq. scans or
    implementations of rel. algebra operators)
  • and therefore it can adjust the replacement policy,
    and pre-fetch pages based on those access
    patterns.
  • Buffer management in DBMS requires ability to:
  • pin a page in buffer pool, force a page to disk
    (important for implementing CC & recovery).

25
Unordered (Heap) Files
  • Simplest file structure: contains records in no
    particular order.
  • Only guarantee: one can retrieve all records in the
    file by repeated requests for the next record.
  • As file grows and shrinks, disk pages are
    allocated and de-allocated.
  • Every record in a file has a unique record id, rid,
    and every page in a file is of the same size.
  • To support record-level operations, we must:
  • keep track of the pages in a file
  • keep track of free space on pages
  • keep track of the records on a page
  • Supported operations: create and destroy files;
    insert, delete, and get a record with a given rid;
    scan all records.
  • There are many alternatives for keeping track of
    this.

26
Heap File Implemented as a List
(Diagram: header page points to a linked list of full pages and a
linked list of pages with free space; each data page links to its
neighbors.)
  • The <header page id, heap file name> pair must be
    stored someplace; DBMS maintains a table of such
    pairs.
  • Each page contains 2 pointers plus data.

27
Heap File Using a Page Directory
  • The entry for a page can include the number of
    free bytes on the page.
  • The directory is a collection of pages; a linked
    list implementation is just one alternative.
  • Much smaller than linked list of all HF pages!

28
Indexes
  • A heap file allows us to retrieve records:
  • by specifying the rid, or
  • by scanning all records sequentially
  • Sometimes, we want to retrieve records by
    specifying the values in one or more fields,
    e.g.,
  • Find all students in the CS department
  • Find all students with a gpa > 3
  • Find all books by Asimov (-> index by Author)
  • Find Foundation (-> index by Title)
  • Indexes are file structures that enable us to
    answer such value-based queries efficiently.
  • Implementation: just another kind of file
    containing records that direct traffic on
    requests for data records.

29
Indexes (Cont.)
  • Each index has an associated search key:
  • a collection of one or more fields of the file of
    records for which we are building the index
  • Any subset of the fields can be a search key
  • Sometimes we refer to the file of records as the
    indexed file.
  • Each index is designed to speed up equality or
    range selections on the search key.
  • E.g., to build an index to improve efficiency
    of queries about employees of a given age ->
    build the index on the age attribute of the
    employee dataset.
  • Records stored in the index file (called entries,
    vs. data records) allow us to find data records
    with a given search key value:
  • <age, rid>, where rid identifies a data record
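As a toy illustration (my own, not the book's code), an index file of <age, rid> entries kept in sorted order already supports equality and range selections; the sample entries are made up:

```python
import bisect

# An "index file" as a sorted list of <age, rid> entries (illustrative).
entries = sorted([(31, "rid3"), (25, "rid1"), (40, "rid7"), (31, "rid5")])
keys = [age for age, _ in entries]

def equality_search(age):
    """All rids whose record has the given age."""
    lo = bisect.bisect_left(keys, age)
    hi = bisect.bisect_right(keys, age)
    return [rid for _, rid in entries[lo:hi]]

def range_search(lo_age, hi_age):
    """All rids with lo_age <= age <= hi_age."""
    lo = bisect.bisect_left(keys, lo_age)
    hi = bisect.bisect_right(keys, hi_age)
    return [rid for _, rid in entries[lo:hi]]

print(equality_search(31))   # ['rid3', 'rid5']
print(range_search(30, 45))  # ['rid3', 'rid5', 'rid7']
```

The rids returned here would then be used to fetch the actual data records; real indexes (e.g., B+ trees, covered later in the book) organize the entries in pages rather than one in-memory list.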

30
Summary
  • Disks provide cheap, non-volatile storage.
  • Unit of Xfer from disk into main memory is called
    block or page.
  • Blocks are arranged on tracks on several
    platters.
  • Random access, but cost depends on location of
    page on disk; important to arrange data
    sequentially to minimize seek and rotation
    delays.
  • Buffer manager brings pages into RAM.
  • Page stays in RAM until released by requestor.
  • Written to disk when frame chosen for replacement
    (which is sometime after requestor releases the
    page).
  • Choice of frame to replace based on replacement
    policy.
  • Tries to pre-fetch several pages at a time.

31
Summary (Contd.)
  • DBMS vs. OS File Support
  • DBMS needs features not found in many OSs, e.g.,
    forcing a page to disk, controlling the order of
    page writes to disk, files spanning disks,
    ability to control pre-fetching and page
    replacement policy based on predictable access
    patterns, etc.
  • Variable length record format with field offset
    directory offers support for direct access to
    ith field and null values.
  • Slotted page format supports variable length
    records and allows records to move on page.

32
Summary (Contd.)
  • File layer keeps track of pages in a file, and
    supports abstraction of a collection of records.
  • Pages with free space identified using linked
    list or directory structure (similar to how pages
    in file are kept track of).
  • Indexes support efficient retrieval of records
    based on the values in some fields.
  • Catalog relations store information about
    relations, indexes and views. (Information that
    is common to all records in a given collection.)

33
Practice Chapter 7 (Cont.)
  • Consider a disk with sector size 512 bytes, 2,000
    tracks per surface, 50 sectors per track, 5
    double-sided platters, average seek time of 10
    msec., and suppose a block size of 1,024 is chosen.
    Suppose that a file containing 100,000 records
    of 100 bytes each is to be stored on such a disk
    and that no record is allowed to span two
    blocks.
  • How many records fit onto a block?
  • block size / bytes per record
    = 1024 / 100 = 10.24. We can have at most 10
    records in a block.
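The same arithmetic in code (the block count at the end is my addition beyond the exercise's question):

```python
# Records per block when no record may span two blocks, plus the
# number of blocks the whole file needs (the latter is an extra step).
block_size, record_size, num_records = 1024, 100, 100_000

records_per_block = block_size // record_size         # floor(10.24) = 10
blocks_needed = -(-num_records // records_per_block)  # ceiling division

print(records_per_block, blocks_needed)  # 10 10000
```

The 24 leftover bytes per block are wasted: with fixed-size records and no spanning, per-block space utilization is 1000/1024.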

34
Chapter 7 Practice (Cont.)
  • Explain what the buffer manager must do to
    process a read request for a page. What happens
    if the requested page is in the pool but not
    pinned?
  • When a page is requested, the buffer manager does
    the following:
  • The buffer pool is checked to see if it contains
    the requested page. If the page is not in the
    pool, it is brought in as follows:
  • A frame is chosen for replacement, using the
    replacement policy.
  • If the frame chosen for replacement is dirty (has
    been written to), it is flushed (the page it
    contains is written out to disk).
  • The requested page is read into the frame chosen
    for replacement.
  • The requested page is pinned (pin_count of its
    frame is incremented), and its address is
    returned to the requestor.
  • Note that if the page is not pinned, it could be
    removed from the buffer pool even if it is
    actually needed in main memory.

35
Chapter 7 Practice (Cont.)
  • When does a buffer manager write a page to disk?
  • If a page in the buffer pool is chosen to be
    replaced and this page is dirty, the buffer
    manager must write the page to disk; this is
    called flushing the page to disk.
  • Sometimes buffer mgr can also force a page to
    disk for recovery-related purposes (to ensure
    that log records corresponding to a modified page
    are written to disk before modified page itself
    is written to disk).

36
Chapter 7 Practice (Cont.)
  • What does it mean to say that a page is pinned in
    the buffer pool? Who is responsible for pinning
    pages? Who is responsible for unpinning pages?
  • Pinning a page means the pin_count of its frame
    is incremented. Pinning a page guarantees
    higher-level DBMS s/w that the page will not be
    removed from the buffer pool by the buffer
    manager. That is, another file page will not be
    read into the frame containing this page until it
    is unpinned by this requestor.
  • It is the buffer manager's responsibility to pin
    a page.
  • It is the responsibility of the requestor of that
    page to tell the buffer manager to unpin a page.

37
Chapter 7 Practice (Cont.)
  • When a page in the buffer pool is modified, how
    does the DBMS ensure that this change is
    propagated to disk? (Explain role of buffer mgr
    as well as modifier of the page).
  • The modifier of the page tells the buffer manager
    that the page is modified by setting the dirty
    bit of the page.
  • The buffer manager flushes the page to disk when
    necessary.

38
Chapter 7 Practice (Cont.)
  • What happens if there is a page request when all
    pages in the buffer pool are dirty?
  • If there are some unpinned pages, the buffer
    manager chooses one by using the replacement
    policy, flushes this page, and then replaces it
    with the requested page.
  • If there are no unpinned pages, the buffer
    manager has to wait until an unpinned page is
    available (or signal an error condition to the
    page requestor).

39
Chapter 7 Practice (Cont.)
  • What is sequential flooding of the buffer pool?
  • Some DB operations (e.g. certain implementations
    of the join relational algebra operator) require
    repeated sequential scans of a relation. Suppose
    that there are 10 frames available in the buffer
    pool, and the file to be scanned has 11 or more
    pages (i.e., at least one more than the number of
    available pages in the buffer pool). Using LRU,
    every scan of the file will result in reading in
    every page of the file! In this situation,
    called sequential flooding, LRU is the worst
    possible replacement strategy.

40
Chapter 7 Practice (Cont.)
  • Name an important capability of a DBMS buffer mgr
    that is not supported by a typical OS's buffer
    mgr.
  • Pinning a page to prevent it from being replaced.
  • Ability to explicitly force a single page to
    disk.
  • Explain the term prefetching. Why is it
    important?
  • Because most page references in a DBMS environment
    follow a known reference pattern, the buffer mgr
    can anticipate the next several page requests and
    fetch the corresponding pages into memory before
    the pages are requested. This is prefetching.
  • Benefits: 1) pages are available in the buffer pool
    when they are requested; 2) reading in a
    contiguous block of pages is much faster than
    reading the same pages at different times in
    response to distinct requests.