A Fast File System for Unix
  • Marshall K. McKusick, William N. Joy,
  • Samuel J. Leffler and Robert S. Fabry
  • Computer Systems Research Group, UCB

Presented By Parang Saraf
CS 5204 Operating Systems, Virginia Tech

About the Paper
  • Considered one of the most fundamental papers
    in operating systems
  • Has been cited around 930 times
  • Describes a new file system

Traditional File System
  • File System developed at Bell Laboratories
  • A file system is described by its Super-Block
  • Number of Data Blocks
  • Count of maximum number of files
  • Pointer to free list (linked list of all free
    blocks)
  • Disk drive is divided into partitions
  • Each disk partition may contain one file system
  • A file system never spans multiple partitions

Traditional File System: Inode
  • Each file has a descriptor associated with it
  • Information includes
  • Ownership of the file
  • Time stamps marking last modification and access
  • Array of indices pointing to the data blocks
  • Direct blocks: 8
  • Indirect blocks: singly, doubly and triply indirect
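
The inode's block array can be illustrated with a minimal sketch that reports how many indirect blocks must be read to reach a given logical block of a file. The count of 8 direct blocks comes from the slide; the figure of 128 pointers per indirect block (a 512-byte block holding 4-byte pointers) and the function name are assumptions made for illustration only.

```python
DIRECT_BLOCKS = 8  # from the slide: 8 direct block pointers in the inode


def indirection_level(block_index, ptrs_per_block=128):
    """Return how many indirect blocks are read to reach a logical block.

    0 = direct, 1 = singly, 2 = doubly, 3 = triply indirect.
    ptrs_per_block=128 is an assumption (512-byte block, 4-byte pointers).
    """
    if block_index < 0:
        raise ValueError("block index must be non-negative")
    if block_index < DIRECT_BLOCKS:
        return 0
    remaining = block_index - DIRECT_BLOCKS
    span = ptrs_per_block  # blocks reachable at the current level
    for level in (1, 2, 3):
        if remaining < span:
            return level
        remaining -= span
        span *= ptrs_per_block
    raise ValueError("block index beyond the triply indirect range")
```

Under these assumptions, only the first 8 blocks (4 KB with 512-byte blocks) are reachable without an extra disk read, which is one reason small blocks hurt throughput.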

Traditional File System: Problem
  • Inode information is segregated from data
  • Long seek time from an inode to its data
  • Files in a single directory are not typically
    allocated consecutive slots for inode information
  • Many non-consecutive blocks of inodes are
    accessed when executing operations on inodes of
    several files in a directory
  • Sub-optimal allocation of data blocks
  • Small block size: 512 bytes
  • Many seeks: the next sequential block is often not on
    the same cylinder
  • Limited read-ahead

Old File System
  • Developed at Berkeley
  • Increased Throughput
  • Changing the basic block size from 512 bytes to
    1024 bytes
  • Each disk transfer accessed twice as much data
  • Fewer indirect blocks needed
  • Increased Reliability
  • Staging modifications to critical file system
    information so that they could either be
    completed or repaired cleanly after a crash

Old File System: Problem
  • Old file system was still using only about 4% of the
    disk bandwidth
  • Main problem: scrambled free list
  • Initially ordered for optimal access
  • Scrambled as files were created and removed
  • Eventually becomes entirely random, so blocks are
    allocated randomly
  • On creation, provides transfer rates up to 175
    kilobytes per second
  • Rate deteriorates to 30 kilobytes per second after a
    few weeks of moderate use
  • Possible solution: dump, rebuild and restore the
    file system

New File System
  • Each disk drive contains one or more file systems
  • A File System is described by its super-block,
    located at the beginning of the disk partition
  • Super-block is replicated to protect against
    catastrophic loss
  • Block size is any power of two greater than or
    equal to 4096 bytes
  • Decided at the time of file system creation and
    cannot be changed
  • File Systems can have different block sizes

New File System: Cylinder Groups
  • A cylinder group comprises one or more consecutive
    cylinders
  • Disk partition is divided into one or more
    cylinder groups
  • Each group has associated book-keeping information
  • A redundant copy of the super-block
  • Space for inodes
  • A bit map describing available blocks (replaces the
    free list)
  • Summary information describing usage of data
    blocks

New File System: Cylinder Groups
  • Each cylinder group contains a static number of inodes
  • Allocated at file system creation time
  • Default policy: one inode for each 2048 bytes of
    space in the cylinder group
  • Book-keeping information begins at a varying offset
    from the beginning of each cylinder group
  • Redundant information thus spirals down into the
    pack
  • Any single track, cylinder or platter can be lost
    without losing all copies of the super-block

New File System: Structure

New File System: Key Contributions
  • Optimizing storage utilization
  • File System Parameterization
  • Layout Policies

Optimizing Storage Utilization
  • New 4096-byte blocks transfer 4 times more data per
    disk transaction than 1024-byte blocks
  • Problem with large blocks
  • Wasted space due to small files

Optimizing Storage Utilization
  • Solution
  • Divide each 4096-byte block into 2, 4 or 8 fragments
    to accommodate small files
  • Fragment size is specified at the time the file
    system is created
  • Block map records the space available at the
    fragment level
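
A fragment-level block map can be sketched as a list of booleans, one per fragment. The function below is a hypothetical illustration, not the file system's own routine: it looks for a run of free fragments that lies entirely inside one block, because free fragments in adjoining blocks cannot be combined. The default of 4 fragments per block (4096/1024) is an assumed configuration.

```python
def find_free_fragments(bitmap, needed, frags_per_block=4):
    """Return the index of the first run of `needed` free fragments that
    fits inside a single block, or None if no block has such a run.

    bitmap[i] is True when fragment i is free. frags_per_block=4 models
    an assumed 4096-byte block split into 1024-byte fragments.
    """
    if not 1 <= needed <= frags_per_block:
        raise ValueError("needed must be between 1 and frags_per_block")
    last_start = len(bitmap) - frags_per_block
    for block_start in range(0, last_start + 1, frags_per_block):
        run = 0  # length of the current run of free fragments in this block
        for offset in range(frags_per_block):
            if bitmap[block_start + offset]:
                run += 1
                if run == needed:
                    return block_start + offset - needed + 1
            else:
                run = 0
    return None
```

Asking for `frags_per_block` fragments is the same as asking for a whole free block, which is how a single map can serve both block and fragment allocation.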

Optimizing Storage Utilization
  • Free List vs Bitmap

Optimizing Storage Utilization
  • Space allocation
  • Space is allocated when a program does a write
    system call
  • Three possible conditions
  • Enough space left in an already allocated block
    or fragment
  • File contains no fragmented blocks: allocate new
    blocks and fragments
  • File contains one or more fragmented blocks but
    has insufficient space to hold new data: a new
    block is allocated, old fragments are copied and
    new fragments are appended
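
The three conditions above can be sketched as a small decision function. This is a simplified model under stated assumptions: a file is laid out as full blocks plus at most one partial block made of fragments, sizes are 4096/1024 bytes, and the function and label names are invented for illustration.

```python
FITS = "fits in already allocated space"
ALLOCATE = "allocate new blocks and fragments"
COPY = "allocate new space, copy old fragments, append new data"


def layout(size, block_size=4096, frag_size=1024):
    """Return (full_blocks, tail_fragments) needed to hold `size` bytes."""
    full_blocks, tail = divmod(size, block_size)
    tail_frags = -(-tail // frag_size)  # ceiling division
    if tail_frags == block_size // frag_size:
        # A tail that needs every fragment is simply a full block.
        return full_blocks + 1, 0
    return full_blocks, tail_frags


def growth_case(old_size, new_size, block_size=4096, frag_size=1024):
    """Classify a write that grows a file from old_size to new_size bytes."""
    old_blocks, old_frags = layout(old_size, block_size, frag_size)
    allocated = old_blocks * block_size + old_frags * frag_size
    if new_size <= allocated:
        return FITS
    if old_frags == 0:
        return ALLOCATE
    return COPY
```

The copy case is the expensive one, which is why programs that write in full-block units avoid repeated fragment reallocation.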

Optimizing Storage Utilization
  • Free space reserve
  • Minimum acceptable percentage of file system
    blocks that should be free: typically 10%, i.e. the
    file system is kept at most 90% full
  • Only the system administrator can allocate blocks
    once free space falls below the reserve
  • Important for the layout policies to be effective
  • If free space runs out entirely, file system
    throughput is cut in half because of the inability
    to localize blocks in a file

Optimizing Storage Utilization
  • Wasted space comparison
  • Space wasted by the 4096/1024 byte new file system is
    about the same as by the 1024 byte old file system
  • New file system uses less space for indexing
    large files
  • Uses the same amount of space for small files
  • Free space reserve should also be counted as
    wasted space
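
The first comparison follows from simple arithmetic: with fragments, the tail of a file is rounded up to the fragment size rather than the block size. A minimal sketch (the function name is hypothetical) makes this concrete.

```python
def wasted_bytes(file_size, allocation_unit):
    """Bytes allocated but unused when a file's space is rounded up
    to a multiple of allocation_unit."""
    return -file_size % allocation_unit


# A 1000-byte file in a 4096-byte block with no fragments:
no_fragments = wasted_bytes(1000, 4096)

# The same file with 1024-byte fragments, or in a 1024-byte block
# file system: the rounding unit is 1024 bytes in both cases.
with_fragments = wasted_bytes(1000, 1024)
```

Because the rounding unit is the same, a 4096/1024 file system and a 1024-byte block file system waste the same space inside each file; the difference lies in indexing overhead and in the free space reserve.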

File System Parameterization
  • Optimum block allocation based on hardware
  • Speed of Processor
  • Hardware support for mass storage transfers
  • Characteristics of the mass storage devices
  • Blocks are allocated on the same cylinder
  • Block allocation depends on whether the processor
    has an input/output channel or not

File System Parameterization
  • Accessing which data is faster?
  • Depends on whether the processor has an I/O channel
    or not

File System Parameterization
  • Rotationally Optimal Blocks
  • Processors without I/O channels must field an
    interrupt and then prepare for a new disk
    transfer
  • Disk rotates during this time
  • Place blocks such that disk rotation is taken
    into account before the start of a new disk
    transfer operation
  • Cylinder group summary information includes a count
    of available blocks at different rotational
    positions (8 positions)
  • Super-block contains a vector of lists called
    rotational layout tables, used by the system when
    allocating new blocks
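
The idea of a rotationally optimal block can be sketched numerically. The example below estimates how many block positions pass under the head while the processor prepares the next transfer, then picks the next position accordingly. The rotation speed (3600 rpm), the use of 8 block positions per track, and both function names are assumptions for illustration; real values depend on the drive and processor.

```python
import math


def blocks_to_skip(delay_ms, rpm=3600, blocks_per_track=8):
    """Block positions that rotate past the head during delay_ms.

    rpm and blocks_per_track are assumed example values.
    """
    ms_per_revolution = 60_000 / rpm
    ms_per_block = ms_per_revolution / blocks_per_track
    return math.ceil(delay_ms / ms_per_block)


def next_rotational_position(current, delay_ms, rpm=3600, blocks_per_track=8):
    """Position of the block to allocate after `current` on the same track."""
    skip = blocks_to_skip(delay_ms, rpm, blocks_per_track)
    return (current + 1 + skip) % blocks_per_track
```

With a delay of zero (a processor with an I/O channel) the next block is simply the contiguous one; with a 4 ms delay under these assumed numbers, two positions are skipped.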

Layout Policies
  • Layout policies divided into two distinct parts
  • Global Policies
  • Local Allocation Routines
  • Two allocable resources
  • Inodes
  • Data Blocks

Layout Policies
  • Global Policies
  • Uses file system wide summary information to make
    decisions regarding the placement of new inodes
    and data blocks
  • Tries to localize data that is concurrently
    accessed while spreading out unrelated data
  • Inodes
  • Places all inodes of files in a directory in the
    same cylinder group
  • A new directory is placed in a cylinder group
    that has a greater than average number of free
    inodes and the smallest number of directories
    already in it; this ensures that files are
    distributed throughout the disk
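
The directory placement rule can be written as a short selection function. This is a sketch of the stated policy only; the `CylinderGroup` record, its field names, and the fallback used when no group is above average are assumptions.

```python
from dataclasses import dataclass


@dataclass
class CylinderGroup:
    number: int
    free_inodes: int
    directories: int


def pick_group_for_directory(groups):
    """Choose the cylinder group for a new directory.

    Among groups with more free inodes than average, return the number
    of the one holding the fewest directories.
    """
    average = sum(g.free_inodes for g in groups) / len(groups)
    candidates = [g for g in groups if g.free_inodes > average]
    if not candidates:
        # Assumed fallback: every group has exactly the average count.
        candidates = list(groups)
    return min(candidates, key=lambda g: g.directories).number
```

For example, with groups holding 100, 300, 250 and 50 free inodes, only the second and third are above the average of 175, and the one with fewer directories wins.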

Layout Policies
  • Global Policies
  • Data Blocks
  • Tries to place all data blocks for a file in the
    same cylinder group
  • None of the cylinder groups should ever become
    completely full
  • Heuristic solution: redirect block allocation to
    a different cylinder group when a file exceeds 48
    kilobytes and at every megabyte thereafter
  • Ensures that the cost of one long seek per megabyte
    is small
  • New cylinder groups are chosen from those
    cylinder groups that have a greater than average
    number of free blocks left
  • Finally it calls Local Allocation Routines for
    block allocation
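
The data block heuristic can be sketched in two parts: where in a file the allocation is redirected, and which group is chosen next. Two points are assumptions here, since the slide does not spell them out: "every megabyte thereafter" is read as each whole-megabyte file offset, and the search for the next group scans forward from the current one. Function names are hypothetical.

```python
KB = 1024
MB = 1024 * KB
FIRST_SWITCH = 48 * KB  # from the slide: redirect once a file exceeds 48 KB


def switch_offsets(file_size):
    """File offsets at which allocation moves to another cylinder group."""
    points = []
    if file_size > FIRST_SWITCH:
        points.append(FIRST_SWITCH)
    offset = MB
    while offset < file_size:
        points.append(offset)
        offset += MB
    return points


def pick_next_group(free_blocks, current):
    """Return the next group with more free blocks than average.

    free_blocks[cg] is the free block count of group cg. Scanning
    forward from `current` is an assumed search order.
    """
    count = len(free_blocks)
    average = sum(free_blocks) / count
    for step in range(1, count):
        cg = (current + step) % count
        if free_blocks[cg] > average:
            return cg
    return current
```

Under this reading, a file of just over 2 MB is spread over four cylinder groups, paying one long seek at each switch point.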

Layout Policies
  • Local Allocation Routines
  • Allocates a free block as requested by the Global
    layout policies
  • Uses a four-level allocation strategy
  • First level: use the next free block that is
    rotationally closest to the requested block on
    the same cylinder

Layout Policies
  • Local Allocation Routines
  • Second level: if there are no free blocks on the
    same cylinder, a free block in the same cylinder
    group is selected

Layout Policies
  • Local Allocation Routines
  • Third level: if the cylinder group is full, use
    a quadratic hash function to hash the cylinder
    group number to find another cylinder group to
    look for a free block
  • Fourth level: if the hash fails, use an
    exhaustive search on all cylinder groups
  • Quadratic hash is used because of its speed in
    finding unused slots in nearly full hash tables
  • File systems parameterized to maintain 10% free
    space rarely use this
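
The four levels can be sketched in two functions: one for the search inside a cylinder group (levels one and two) and one for choosing another group (levels three and four). This is an illustrative model, not the file system's code: "rotationally closest" is approximated by scanning forward from the requested position, and textbook quadratic probing (offsets 1, 4, 9, ...) stands in for the hash, whose exact probe sequence the slides do not give.

```python
def pick_block_in_group(group, cylinder, position):
    """Levels 1 and 2. group[c][p] is True when the block at rotational
    position p of cylinder c is free. Returns (cylinder, position) or None.
    """
    positions = len(group[cylinder])
    # Level 1: next free block on the same cylinder, scanning forward
    # from the requested rotational position.
    for step in range(positions):
        p = (position + step) % positions
        if group[cylinder][p]:
            return cylinder, p
    # Level 2: any free block in the same cylinder group.
    for c, blocks in enumerate(group):
        for p, free in enumerate(blocks):
            if free:
                return c, p
    return None


def find_group_with_space(free_blocks, preferred):
    """Levels 3 and 4. free_blocks[cg] is the free block count of group cg."""
    count = len(free_blocks)
    if free_blocks[preferred] > 0:
        return preferred
    # Level 3: quadratic rehash of the cylinder group number.
    for i in range(1, count):
        cg = (preferred + i * i) % count
        if free_blocks[cg] > 0:
            return cg
    # Level 4: exhaustive search over all cylinder groups.
    for cg in range(count):
        if free_blocks[cg] > 0:
            return cg
    return None
```

Quadratic probing does not visit every group, which is why the exhaustive fourth level is still needed as a last resort.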

Performance
  • Measured throughput
  • List directory command performance
  • For large directories containing many
    directories, disk accesses for inodes are cut by a
    factor of two
  • For large directories containing only files, disk
    accesses for inodes are cut by a factor of eight
  • Both reads and writes are faster in the new file
    system
  • Because larger block sizes are used
  • The overhead of allocating a block is higher, but
    the cost per byte allocated is the same
  • Reading rate is always at least as fast as
    writing rate
  • Writes are slower with 4096-byte blocks than
    with 8192-byte blocks
  • In the old file system, writing was 50% faster than
    reading

New File System - Limitations
  • Limited by memory-to-memory copy operations
    required to move data from disk buffers in the
    system's address space to data buffers in the
    user's address space
  • Copying could be avoided if the buffers in both
    address spaces were properly aligned
  • One block is allocated to a file at a time
  • Possible improvement: pre-allocate several blocks at
    once and release unused ones when the file is closed

Functional Enhancements
  • Long File Name
  • File Locking
  • Symbolic Links
  • Rename
  • Quotas

Long File Name
  • Maximum length of a file name is 255 characters
  • Directories are allocated in 512-byte units called
    chunks
  • Chunks are broken into directory entries
  • Each entry contains the information necessary to
    map the name of a file to its inode
  • First three fields are fixed length: inode
    number, size of entry and length of file name
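
A variable-length directory entry of this shape can be sketched with Python's `struct` module. The three fixed fields come from the slide; the field widths (4, 2 and 2 bytes), the little-endian byte order, and the rule of NUL-terminating the name and padding it to a 4-byte boundary are assumptions made for this example.

```python
import struct

# inode number, entry size, name length (widths and byte order assumed)
HEADER = struct.Struct("<IHH")
MAX_NAME = 255  # from the slide


def pack_entry(inode, name):
    """Build one directory entry as bytes."""
    raw = name.encode("ascii")
    if not 1 <= len(raw) <= MAX_NAME:
        raise ValueError("name must be 1 to 255 characters")
    # Room for the name plus a NUL terminator, rounded up to 4 bytes.
    padded_len = (len(raw) + 1 + 3) & ~3
    entry_size = HEADER.size + padded_len
    return HEADER.pack(inode, entry_size, len(raw)) + raw.ljust(padded_len, b"\0")


def unpack_entry(buf, offset=0):
    """Return (inode, name, offset_of_next_entry)."""
    inode, entry_size, name_len = HEADER.unpack_from(buf, offset)
    start = offset + HEADER.size
    name = buf[start:start + name_len].decode("ascii")
    return inode, name, offset + entry_size
```

Storing the entry size in each entry lets a reader step from one entry to the next without knowing the name lengths in advance.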

File Locking
  • Hard locks: always enforced when a program tries
    to access a file
  • Advisory locks: shared or exclusive locks requested
    by the programs
  • System administrator privilege can override locks
  • No deadlock detection is attempted
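
Advisory locking can be tried out on a modern Unix system through Python's `fcntl.flock`, which wraps the `flock` system call. The sketch below is only a demonstration of advisory behavior on a temporary file; the helper names are invented, and it assumes a Unix platform and a local file system where `flock` is supported.

```python
import fcntl
import os
import tempfile


def try_lock(fd, exclusive=True):
    """Request a lock without blocking; return True if it was granted."""
    mode = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    try:
        fcntl.flock(fd, mode | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False


def demo():
    """Open the same file twice and compete for an exclusive lock."""
    with tempfile.NamedTemporaryFile() as tmp:
        first = os.open(tmp.name, os.O_RDWR)
        second = os.open(tmp.name, os.O_RDWR)
        try:
            got_first = try_lock(first)
            got_second = try_lock(second)     # conflicts with the first lock
            written = os.write(second, b"x")  # advisory: the write still works
            fcntl.flock(first, fcntl.LOCK_UN)
            got_after_release = try_lock(second)
            return got_first, got_second, written, got_after_release
        finally:
            os.close(first)
            os.close(second)
```

The write through the second descriptor succeeds even while the first holds an exclusive lock, which is the defining property of advisory locks: they coordinate only programs that ask for them.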

Symbolic Links
  • A symbolic link is implemented as a file that
    contains a pathname
  • Pathname can be relative or absolute
  • On encountering a symbolic link while
    interpreting a component of a pathname, the
    contents of the symbolic link are prepended to the
    rest of the pathname
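
Pathname interpretation with symbolic links can be sketched with an in-memory table of links. This is a toy model under stated assumptions: paths are absolute, `links` maps the path of each symbolic link to its stored pathname, and the limit of 8 links per lookup is an assumed guard against loops. None of these names come from the slides.

```python
def resolve(path, links, max_links=8):
    """Expand symbolic links in an absolute path.

    links: dict mapping a link's absolute path to the pathname it contains.
    max_links: assumed limit on links followed in one lookup.
    """
    pending = [c for c in path.split("/") if c]
    resolved = []
    followed = 0
    while pending:
        component = pending.pop(0)
        if component == ".":
            continue
        if component == "..":
            if resolved:
                resolved.pop()
            continue
        candidate = "/" + "/".join(resolved + [component])
        target = links.get(candidate)
        if target is None:
            resolved.append(component)
            continue
        followed += 1
        if followed > max_links:
            raise OSError("too many levels of symbolic links")
        if target.startswith("/"):
            resolved = []  # absolute link: restart from the root
        # Prepend the link's contents to the rest of the pathname.
        pending = [c for c in target.split("/") if c] + pending
    return "/" + "/".join(resolved)
```

A relative link is interpreted from the directory that contains it, because the link's own name is never added to the resolved prefix.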

Rename
  • Old file system required three system calls to
    rename a file
  • Target file could be left with a temporary name
    after a crash
  • New rename system call added that guarantees the
    existence of the target name
  • Renaming works on both directories and files

Quotas
  • Old file system: any single user can allocate
    all the available space in the file system
  • Quota restricts the amount of file system
    resources that a user can obtain
  • Sets limits on both the number of inodes and the
    number of disk blocks
  • Hard and soft limits

Key Take-Away points
  • Substantially higher throughput rates thanks to the
    large block size
  • Flexible allocation policies
  • Better locality of reference
  • Less wastage
  • Adapted to a wide range of peripheral and processor
    characteristics
