Chapter Eight : File Management - PowerPoint PPT Presentation

Provided by: Terri108

1
Chapter Eight File Management
  • The File Manager
  • Interacting With File Manager
  • File Organization
  • Physical Storage Allocation
  • Data Compression
  • Access Methods
  • Levels in File Management System
  • Access Control Verification Module
  • Fixed-Length Records
  • Variable-Length Records
  • Contiguous Storage
  • Noncontiguous Storage
  • Indexed Storage
  • Sequential or Direct File Access

2
The File Manager
  • File Manager controls every file in system, which
    is a complex job.
  • Efficiency depends on
  • how system's files are organized (sequential,
    direct, or indexed sequential).
  • how they're stored (contiguously,
    noncontiguously, or indexed).
  • how each file's records are structured
    (fixed-length or variable-length).
  • how access to these files is controlled.

3
Responsibilities of File Manager
  • Track where each file is stored.
  • Determine where and how files will be stored.
  • Efficiently use available storage space.
  • Provide efficient access to files.
  • Allocate each file when a user has been cleared
    for access to it, then record its use.
  • Deallocate file when it is returned to storage.
  • Communicate its availability to others waiting
    for it.

4
Important Definitions
  • Field -- group of related bytes that can be
    identified by user with name, type, and size.
  • Record -- group of related fields.
  • File (flat file) -- group of related records that
    contains info used by specific application
    programs to generate reports.
  • Database -- groups of related files that are
    interconnected at various levels to give flexible
    access to users.
  • Appears to File Manager to be a type of file.

5
Definitions - 2
  • Program files contain instructions.
  • Data files contain data.
  • Directories -- listings of file names and their
    attributes.
  • Every program and data file accessed by computer
    system, and every piece of computer software, is
    treated as a file.
  • File Manager treats all files exactly same way as
    far as storage is concerned.

6
Interacting With File Manager
  • Users communicate with File Manager via specific
    commands that may be either embedded in user's
    program or submitted interactively by user.
  • Embedded commands
  • OPEN and CLOSE pertain to availability of file for
    program invoking it.
  • READ and WRITE are I/O commands.
  • MODIFY -- specialized WRITE command for existing
    data files that allows for appending/rewriting
    records.

7
Interactive Commands
  • CREATE and DELETE -- deal with system's knowledge
    of file.
  • SAVE -- first time used, a file is actually
    created.
  • OPEN NEW -- within a program indicates file must
    be created.
  • OPEN...FOR OUTPUT -- creates file by making entry
    for it in directory and finding space for it in
    secondary storage.
  • RENAME -- allows users to change name of existing
    file.
  • COPY -- allows user to make duplicate copies of
    existing files.

8
Commands Are Device-Independent
  • Interface commands designed to be as simple as
    possible to use.
  • Lack detailed instructions to run device where
    file is stored.
  • Device independent.
  • To access a file, user doesn't need to know its
    exact physical location on disk pack or storage
    medium.
  • Each logical command broken down into sequence of
    low-level signals that
  • Trigger step-by-step actions performed by device.
  • Supervise progress of operation by testing
    device's status.

9
Typical Volume Configuration
  • Each secondary storage unit (removable or
    non-removable) is considered a volume.
  • A volume that contains several files is called a
    multifile volume.
  • Some files are extremely large and span several
    volumes; these are called multivolume files.
  • Generally, each volume in system is given name.
  • File Manager writes name and other descriptive
    info on easy-to-access place on each unit.

10
Master File Directory (MFD)
  • MFD stored immediately after volume descriptor.
  • Lists names and characteristics of every file
    contained in volume.
  • File names refer to program files, data files,
    and/or system files.
  • Subdirectories, if supported.
  • Remainder of volume is used for file storage.
  • Early OS supported only a single directory per
    volume.
  • Created by File Manager.
  • Contains names of files, usually organized in
    alphabetical, spatial, or chronological order.
  • Simple to implement and maintain.
  • Some major disadvantages

11
Volume Descriptor

12
Some Major Disadvantages of Single Directory Per
Volume
  • Takes long time to search for an individual file,
    especially if MFD was organized in an arbitrary
    order.
  • If user has many small files stored in volume,
    directory space fills before disk storage space
    fills. User is told 'disk full' when only the
    directory is full.
  • Users can't create subdirectories to group
    related files.
  • Multiple users can't safeguard files from other
    users browsing file lists, because entire
    directory is listed on request.
  • Each program in entire directory needs unique
    name.
  • E.g., Only 1 person using directory can name
    program PROG1.

13
About Subdirectories
  • Semi-sophisticated File Managers create MFD for
    each volume with entries for files and
    subdirectories.
  • Subdirectory created when user opens account to
    access computer.
  • MFD entry flagged to indicate subdirectory with
    unique properties.
  • Improvement from single directory scheme.
  • Still can't group files in a logical order to
    improve accessibility and efficiency of system.

14
Subdirectories Can Be Implemented As an
Upside-down Tree
  • Today's File Managers allow users to create
    subdirectories so related files are grouped
    together.
  • Extension of previous two-level directory
    structure.
  • Tree structures allow system to efficiently
    search individual directories due to fewer
    entries in each.
  • Path to requested file may lead through several
    directories.
  • When user wants to access specific file, file
    name is sent to File Manager. File Manager
    searches MFD for user's directory, then searches
    user's directory and any subdirectories for
    requested file location.
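
The search described above can be sketched as a walk down nested directories. This is only an illustration under stated assumptions: the `mfd` dictionary, the user names, the block locations, and the `locate` helper are all hypothetical and do not describe any particular File Manager.

```python
# Hypothetical in-memory directory tree.
# A dict stands for a directory; a string stands for a file's location.
mfd = {
    "alice": {
        "PROG1": "block 120",
        "taxes": {"TAXES2001": "block 340"},
    },
    "bob": {"PROG1": "block 77"},
}


def locate(root, path):
    """Follow each directory named in path, starting at the MFD."""
    node = root
    for name in path.split("/"):
        # Each step searches only one (small) directory.
        if not isinstance(node, dict) or name not in node:
            raise FileNotFoundError(path)
        node = node[name]
    return node


print(locate(mfd, "alice/taxes/TAXES2001"))
print(locate(mfd, "bob/PROG1"))
```

Because each user has a separate directory, both users can own a file named PROG1 without a clash, which the single-directory scheme did not allow.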

15
File Descriptor
  • Each file entry in every directory contains info
    describing file.
  • File name -- usually represented in ASCII code.
  • File type -- organization and usage that are
    dependent on system (e.g., files and
    directories).
  • File size -- size is kept here for convenience.
  • File location -- identification of first physical
    block (or all blocks) where file is stored.
  • Date and time of creation.
  • Owner.
  • Protection information -- access restrictions
    based on who is allowed to access file and what
    type of access is allowed.
  • Record size -- its fixed size or its maximum size,
    depending on type of record.

16
File Names
  • Absolute file name (complete file name) -- long
    name that includes all path info.
  • Relative file name -- short name seen in
    directory listings.
  • Selected by user when file is created.
  • E.g., ACCOUNT ADDRESSES, TAXES 2001, or AUTOEXEC.
  • Extension -- 2-3 character name used to identify
    type of file or its contents.
  • Separated from relative name by a period.
  • E.g., CPP, BAS, BAT, COB, EXE -- signal to system
    to use specific compiler or program to run these
    files.
  • E.g., TXT, DOC, OUT, MIC, KEY -- created by
    applications or by users for own identification.

17
File Naming Conventions
  • Can vary in length from 1 to many characters.
  • Can include letters of alphabet and digits.
  • Every OS has specific rules that affect length of
    relative name and types of characters allowed.
  • E.g., MS-DOS allows 1-8 alphanumeric character
    names without spaces.
  • More modern OS allow names with dozens of
    characters including spaces.
  • Try to select descriptive relative names that
    readily identify file contents/purpose of file.

18
Base and Current Directories Used by File Manager
to Locate Files
  • File Manager selects base directory for user when
    interactive session begins.
  • All file operations requested by that user start
    here.
  • Then, user selects subdirectory (current
    directory or working directory).
  • Thereafter, files presumed to be located in
    current directory.
  • Whenever file accessed, user types in relative
    name and File Manager adds proper prefix.
  • As long as users refer to files in working
    directory, they can access them without entering
    complete name.
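
A minimal sketch of how a relative name might be turned into a complete name by adding the current directory as prefix. It uses POSIX-style paths purely for illustration; the `complete_name` helper and the example directories are assumptions, not part of the original slides.

```python
import posixpath


def complete_name(current_dir, name):
    """Return the absolute (complete) file name for a user-supplied name."""
    if posixpath.isabs(name):
        # User already typed the complete name: no prefix needed.
        return posixpath.normpath(name)
    # Relative name: prefix it with the current (working) directory.
    return posixpath.normpath(posixpath.join(current_dir, name))


current = "/home/alice/taxes"  # hypothetical working directory
print(complete_name(current, "TAXES2001.TXT"))
print(complete_name(current, "/home/bob/PROG1.BAS"))
```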

19
File Organization: Record Format
  • Fixed-length records -- easiest to access
    directly.
  • Most common type; ideal for data files.
  • Record size critical (too small: data truncated;
    too large: wastes space).
  • Variable-length records -- difficult to access
    directly because hard to calculate exactly where
    record is located.
  • Don't leave empty storage space and don't truncate
    any characters.
  • Frequently used in files accessed sequentially
    (e.g., text files, program files) or files using
    index to access records.
  • File descriptor stores record format, how it's
    blocked, and other related info.

20
Physical File Organization
  • Concerned with how records are arranged and
    characteristics of medium used to store them.
  • On magnetic disks, files can be organized as
  • Sequential
  • Direct
  • Indexed sequential

21
Characteristics Considered When Selecting File
Organization
  • Volatility of data -- frequency with which
    additions and deletions are made.
  • Activity of file -- percentage of records
    processed during a given run.
  • Size of file.
  • Response time -- amount of time user is willing to
    wait before requested operation is completed.

22
Sequential Record Organization
  • Easiest to implement because records are stored
    and retrieved serially, one after other.
  • To speed process some optimization features may
    be built into system.
  • E.g., select a key field from record and then
    sort records by that field before storing them.
  • Aids search process.
  • Complicates maintenance algorithms because
    original order must be preserved every time
    records added or deleted.

23
Direct Record Organization (Random Organization)
  • Uses direct access files which can be implemented
    only on direct access storage devices.
  • Give users flexibility of accessing any record in
    any order without having to begin search from
    beginning of file.
  • Records are identified by their relative
    addresses (their addresses relative to beginning
    of file).
  • Logical addresses computed when records are
    stored and again when records are retrieved.
  • Use hashing algorithms.

24
Advantages of Direct Access Organization
  • Fast access to records.
  • Can be accessed sequentially by starting at first
    relative address and incrementing it by one to
    get to next record.
  • Can be updated more quickly than sequential files
    because records quickly rewritten to original
    addresses after modifications.
  • No need to preserve order of the records, so
    adding or deleting them takes very little time.

25
Collisions Are a Problem With Direct Access
Organization
  • Several records with unique keys may generate
    same logical address (collision).
  • Program generates another logical address before
    presenting it to File Manager for storage.
  • Colliding records stored in overflow area via
    links.
  • File Manager handles physical allocation of
    space.
  • Maximum file size established when file is
    created; eventually either file is full or too
    many records are stored in overflow area.
  • Programmer must reorganize and rewrite file.
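
The collision problem can be sketched in a few lines. Everything here is an assumption made for illustration: the division hash (`key % size`), the `DirectFile` class, and the plain list standing in for the linked overflow area.

```python
class DirectFile:
    """Toy direct-access file with one slot per logical address."""

    def __init__(self, size):
        self.main = [None] * size  # fixed maximum size, set at creation
        self.overflow = []         # simplified stand-in for linked overflow area

    def address(self, key):
        # Hypothetical hashing algorithm: remainder after division.
        return key % len(self.main)

    def store(self, key, record):
        addr = self.address(key)
        slot = self.main[addr]
        if slot is None or slot[0] == key:
            self.main[addr] = (key, record)
            return
        # Collision: a different key already owns this logical address.
        for i, (k, _) in enumerate(self.overflow):
            if k == key:
                self.overflow[i] = (key, record)
                return
        self.overflow.append((key, record))

    def fetch(self, key):
        slot = self.main[self.address(key)]
        if slot is not None and slot[0] == key:
            return slot[1]
        # Not at its home address: search the overflow area.
        for k, record in self.overflow:
            if k == key:
                return record
        raise KeyError(key)


f = DirectFile(7)
f.store(10, "ADAMS")  # 10 % 7 == 3
f.store(17, "BAKER")  # 17 % 7 == 3, collides with key 10
print(f.fetch(17), len(f.overflow))
```

As the overflow list grows, every fetch of a colliding record degrades toward a sequential search, which is why the file eventually has to be reorganized.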

26
Indexed Sequential Record Organization
  • Combines best of sequential and direct access.
  • Created and maintained through Indexed Sequential
    Access Method (ISAM) software package.
  • Doesn't create collisions because it doesn't use
    result of hashing algorithm to generate a
    record's address.
  • Uses key field info to generate index file
    through which records are retrieved.
  • Divides ordered sequential file into blocks of
    equal size.
  • Size determined by File Manager to take advantage
    of physical storage devices and to optimize
    retrieval strategies.
  • Each entry in index file contains highest record
    key and physical location of data block where
    this record, and records with smaller keys, are
    stored.

27
Indexed Sequential - 2
  • To access any record in file, system begins by
    searching index file and then goes to physical
    location indicated at that entry.
  • Overflow areas are spread throughout file.
  • Existing records can expand and new records can
    be placed in close physical and logical sequence.
  • Last-resort overflow area is located apart from
    main data area but is used only when the other
    overflow areas are completely filled.
  • When retrieval time becomes too slow, file has to
    be reorganized.
  • Allows both direct access to a few requested
    records and sequential access to many records for
    most dynamic files.
  • A variation of indexed sequential files is
    B-tree.
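
A minimal sketch of an indexed sequential lookup, assuming each index entry holds the highest key in a block and that block's number. The sample keys, the block size of three, and the `find` helper are hypothetical; overflow areas are left out for brevity.

```python
from bisect import bisect_left

# Ordered sequential file divided into equal-size blocks of (key, record).
blocks = [
    [(5, "a"), (12, "b"), (20, "c")],
    [(27, "d"), (33, "e"), (41, "f")],
    [(48, "g"), (56, "h"), (63, "i")],
]

# Index file: one (highest key in block, block number) entry per block.
index = [(block[-1][0], number) for number, block in enumerate(blocks)]
high_keys = [key for key, _ in index]


def find(key):
    """Search the index first, then only the one block it points to."""
    pos = bisect_left(high_keys, key)  # first entry whose high key >= key
    if pos == len(index):
        raise KeyError(key)
    _, block_number = index[pos]
    for k, record in blocks[block_number]:
        if k == key:
            return record
    raise KeyError(key)


print(find(33))
```

No hashing is involved, so two different keys can never compete for the same address.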

28
Physical Storage Allocation
  • File Manager must work with files not just as
    whole units but also as logical units or records.
  • Records within file must have same format but can
    vary in length.
  • Records are subdivided into fields.
  • Structure usually managed by application
    programs, not OS.
  • When we talk about file storage, we're actually
    referring to record storage.

29
  • Unblocked, fixed-length records
  • Blocked, fixed length records
  • Unblocked, variable-length records
  • Blocked, variable-length records

30
Contiguous Storage
  • Records stored one after other.
  • Any record can be found and read once starting
    address and size are known, so directory is very
    streamlined.
  • Direct access is easy because every part of file
    is stored in same compact area.
  • Files can't be expanded unless there's empty
    space available immediately following them.
  • Room for expansion must be provided when file is
    created.
  • Fragmentation occurs (slivers of unused storage
    space).
  • Can compact and rearrange files.
  • Files can't be accessed while compaction is
    taking place.

31
Noncontiguous Storage
  • Allows files to use any storage space available
    on disk.
  • File's records are stored in a contiguous manner
    if enough empty space is available.
  • Any remaining records, and all other additions to
    file, are stored in other sections of disk
    (extents).
  • Linked together with pointers.
  • Physical size of each extent is determined by OS
    (e.g., 256 bytes).

32
Linking File Extents
  • Linking at storage level -- each extent points to
    next one in sequence.
  • Directory entry consists of file name, storage
    location of first extent, location of last
    extent, and total number of extents, not counting
    first.
  • Linking at directory level -- each extent listed
    with its physical address, size, and pointer to
    next extent.
  • A null pointer indicates that it's last one.
  • Eliminate external storage fragmentation and need
    for compaction.
  • Don't support direct access because there is no
    easy way to determine exact location of specific
    record.
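
Linking at storage level can be sketched as a chain of extents. The `disk` dictionary, the extent addresses, and the `read_record` helper are hypothetical; the point is only that reaching a given record means following the chain from the first extent.

```python
# Hypothetical disk: extent address -> (records in extent, address of next extent).
disk = {
    40: (["rec0", "rec1"], 12),
    12: (["rec2", "rec3"], 95),
    95: (["rec4"], None),  # null pointer marks the last extent
}

# Directory entry as described above: name, first extent, last extent,
# and number of extents not counting the first.
directory_entry = {"name": "PAYROLL", "first": 40, "last": 95, "extents": 2}


def read_record(entry, n):
    """Return (record, pointers followed) for the n-th record, counting from 0."""
    address = entry["first"]
    hops = 0
    while address is not None:
        records, next_address = disk[address]
        if n < len(records):
            return records[n], hops
        n -= len(records)       # skip this extent's records
        address = next_address  # follow the link
        hops += 1
    raise IndexError("record number past end of file")


print(read_record(directory_entry, 4))
```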

33
Indexed Storage
  • Allows direct record access by bringing pointers
    linking every extent of that file into index
    block.
  • Every file has its own index block (addresses of
    each disk sector that makes up the file).
  • Lists each entry in same order in which sectors
    are linked.
  • When a file is created, pointers in index block
    are set to null.
  • As each sector is filled, pointer is set to
    appropriate sector address.
  • Address is removed from empty space list and
    copied into its position in index block.

34
Indexed Storage - 2
  • Supports both sequential and direct access.
  • Doesn't necessarily improve use of storage space
    because each file must have index block.
  • For larger files with more entries, several
    levels of indexes can be generated.
  • To find a desired record, File Manager accesses
    first index (highest level), which points to a
    second index (lower level), which points to an
    even lower level index and eventually to data
    record.
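
A single-level index block can be sketched as a list of sector addresses. The sector addresses, the two-records-per-sector figure, and the `read_via_index` helper are assumptions for illustration; a multi-level index would add one more lookup per level.

```python
RECORDS_PER_SECTOR = 2  # assumed fixed number of records per sector

# Hypothetical sectors scattered across the disk.
sectors = {40: ["rec0", "rec1"], 12: ["rec2", "rec3"], 95: ["rec4"]}

# Index block: sector addresses in file order; None means pointer still null.
index_block = [40, 12, 95, None]


def read_via_index(n):
    """Go straight to the n-th record (counting from 0) through the index block."""
    slot, offset = divmod(n, RECORDS_PER_SECTOR)
    address = index_block[slot]
    if address is None:
        raise IndexError(n)
    return sectors[address][offset]


print(read_via_index(4))
```

One index lookup replaces the chain of pointer hops needed with linked extents, which is what makes direct access practical.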

35
Data Compression
  • Several techniques (3) used to save space in
    files.
  • System must be able to distinguish between
    compressed and uncompressed data.
  • Trade-off -- storage space gained, but processing
    time lost.
  • Records with repeated characters can be
    abbreviated.
  • E.g., fixed-length field with short name and many
    blank characters replaced with variable-length
    field and special code to indicate blanks
    truncated.
  • ADAMSbbbbbbbbbb -> ADAMSb10
  • 300000000 -> 3#8
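
A minimal sketch of abbreviating repeated trailing characters. The marker characters ("b" for blanks, "#" for zeros) and both helper names are assumptions; a real scheme would also need a marker that can never appear in the data itself.

```python
def compress_trailing(text, char, marker):
    """Replace a run of trailing `char` with marker + run length."""
    stripped = text.rstrip(char)
    count = len(text) - len(stripped)
    return f"{stripped}{marker}{count}" if count else stripped


def expand_trailing(code, char, marker):
    """Reverse compress_trailing (assumes marker does not occur in the data)."""
    head, found, count = code.rpartition(marker)
    if not found:
        return code  # nothing was compressed
    return head + char * int(count)


name_field = "ADAMS" + " " * 10  # 15-character fixed-length field
print(compress_trailing(name_field, " ", "b"))
print(compress_trailing("300000000", "0", "#"))
```

The processing cost shows up in `expand_trailing`: every read of a compressed field has to rebuild the original before it can be used.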

36
Data Compression: Repeated Terms
  • Repeated terms compressed by using symbols to
    represent each of most commonly used words in the
    database.
  • E.g., in a university's student database common
    words like student, course, teacher, classroom,
    grade, and department could each be represented
    with single character.

37
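Data Compression: Front-end Compression
  • Front-end compression -- used for index
    compression.
  • For example, student database where the students'
    names are kept in alphabetical order could be
    compressed by storing, for each name, only the
    number of leading characters it shares with the
    previous entry plus the remaining characters.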
38
Access Methods
  • Access methods dictated by a file's organization.
  • Most flexibility is allowed with indexed
    sequential files and least with sequential.
  • File organized in sequential fashion can support
    only sequential access to its records; these
    records can be of fixed or variable length.
  • File Manager uses the address of last byte read
    to access the next sequential record.
  • Current byte address (CBA) must be updated every
    time a record is accessed.

39
Sequential Access
  • For sequential access of fixed-length records,
    CBA updated by incrementing it by record length
    (RL), which is constant.
  • CBA = CBA + RL
  • For sequential access of variable-length records,
    File Manager adds length of record (RLk) plus
    number of bytes used to hold record length (N) to
    CBA.
  • CBA = CBA + N + RLk
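
The variable-length update rule can be traced on a small byte string. This sketch assumes each record is preceded by a 2-byte big-endian length field (so N = 2); that layout and the helper names are illustrative choices, not taken from the slides.

```python
import struct

N = 2  # assumed number of bytes used to hold each record's length


def pack_records(records):
    """Lay records out one after another, each preceded by its length."""
    return b"".join(struct.pack(">H", len(r)) + r for r in records)


def read_sequential(data):
    """Read every record in order, updating CBA after each one."""
    cba = 0
    records = []
    while cba < len(data):
        (rl_k,) = struct.unpack_from(">H", data, cba)  # length of record k
        records.append(data[cba + N : cba + N + rl_k])
        cba = cba + N + rl_k  # CBA = CBA + N + RLk
    return records


data = pack_records([b"ADAMS", b"LI", b"FITZGERALD"])
print(read_sequential(data))
```

With fixed-length records the length field disappears and the update reduces to CBA = CBA + RL.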

40
Direct Access: Fixed-Length Records
  • If file is organized in direct fashion, it can be
    accessed easily in direct or sequential order if
    it has fixed-length records.
  • For direct access with fixed-length records, CBA
    computed directly from record length and desired
    record number RN (info provided through READ
    command) minus one.
  • CBA = (RN - 1) * RL
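
A minimal sketch of the formula in use, with an in-memory byte stream standing in for the file. The record length of 8 bytes, the sample names, and the helper names are assumptions for illustration.

```python
import io

RL = 8  # assumed fixed record length in bytes


def cba_for(rn, rl=RL):
    """CBA = (RN - 1) * RL, with records numbered from 1."""
    return (rn - 1) * rl


def read_record(f, rn, rl=RL):
    """Jump straight to record RN without reading the ones before it."""
    f.seek(cba_for(rn, rl))
    return f.read(rl)


# Three records, each padded with blanks to exactly RL bytes.
f = io.BytesIO(b"".join(name.ljust(RL).encode() for name in ["ADAMS", "BAKER", "CHEN"]))
print(cba_for(3), read_record(f, 3))
```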

41
Direct Access: Variable-Length Records
  • Virtually impossible to access a record directly
    because address of desired record can't be easily
    computed.
  • To access a record, File Manager must do
    sequential search through records.
  • If File Manager saves address of last record
    accessed, can do half-sequential read through
    file. When next request arrives it could search
    forward from CBA.
  • Or File Manager can keep table of record numbers
    and their CBAs. Search table for exact storage
    location of desired record.
  • To avoid this problem, many systems force users
    to have files organized for fixed-length records
    if they want direct access to records.
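
The table of record numbers and CBAs mentioned above could be built in one sequential pass. This sketch assumes a length field of N = 2 bytes in front of each record and uses made-up record lengths; `build_cba_table` is a hypothetical helper.

```python
def build_cba_table(record_lengths, n=2):
    """Map each record number (from 1) to the CBA where that record starts."""
    table = {}
    cba = 0
    for rn, rl_k in enumerate(record_lengths, start=1):
        table[rn] = cba
        cba += n + rl_k  # same update rule as sequential access
    return table


# Hypothetical file with records of 5, 2, and 10 bytes.
table = build_cba_table([5, 2, 10])
print(table[3])  # starting address of record 3
```

The table trades extra storage and upkeep for a lookup that no longer scans the records themselves.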

42
Access of Records in Indexed Sequential File
  • Accessed either sequentially or directly.
  • Either CBA computation applies, but with one
    extra step.
  • Index file must be searched for pointer to block
    where data is stored.
  • Because index file is smaller than data file, it
    can be kept in main memory and searched quickly
    to locate block where desired record is located.
  • Block retrieved from secondary storage and
    beginning byte address of record calculated.
  • In systems with several levels of indexing, index
    at each level must be searched before computing
    CBA.
  • Entry point to this type of data file is usually
    through index file.

43
Levels in a File Management System
  • Efficient management of files can't be separated
    from efficient management of devices that house
    them.
  • A wide range of functions must be organized for
    I/O system to perform efficiently.
  • Each level implemented by using structured
    modular programming techniques, which also set up
    a hierarchy.
  • Basic File System
  • Access Control Module
  • Logical File System
  • Physical File System
  • Device Interface Module
  • Device

44
Basic File System
  • Highest level module that passes info to logical
    file system, which notifies physical file system,
    which works with Device Manager.
  • Activates access control verification module to
    verify that this user is permitted to perform
    this operation with this file.

45
Access Control Verification Module
  • Any file can be shared.
  • Saves space and allows for synchronization of
    data updates.
  • Improves efficiency of system's resources,
    because if files are shared in main memory, I/O
    operations are reduced.
  • However, integrity of each file must be
    safeguarded.
  • Control over who is allowed to access file and
    what type of access is permitted.
  • READ only, WRITE only, EXECUTE only, DELETE only,
    or some combination.

46
File Access Control Methods
  • Each file management system has own file access
    control method.
  • Most common methods
  • Access control matrix
  • Access control lists
  • Capability lists
  • Lockword control

47
Access Control Matrix
  • Intuitively appealing and easy to implement.
  • Works well only for systems with few files and
    few users.
  • In matrix each column identifies a user and each
    row identifies a file.
  • Intersection of row and column has access rights
    for that user to that file.

48
Access Control Lists
  • Modification of access control matrix technique.
  • Each file is entered in list, which contains
    names of users allowed to access it and type of
    access permitted.
  • To shorten list, only those who may use file are
    named; those denied any access are grouped under
    global heading such as WORLD.
  • Or shorten by putting every user into a category
  • SYSTEM -- system personnel with unlimited access
    to all files.
  • OWNER -- absolute control over all files created
    in own account.
  • GROUP -- all users belonging to appropriate group
    have access.
  • WORLD -- all other users in system; default
    access types given by File Manager.

49
Access Control List Example

50
Capability Lists
  • Lists every user and files to which each has
    access.
  • Requires less storage space than an access
    control matrix.
  • Easier to maintain than an access control list
    when users are added to or deleted from system.
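
The access control matrix, access control lists, and capability lists can be viewed as three layouts of the same information. In this sketch the file names, user names, rights, and helper functions are all hypothetical; rights are held as Python sets for simplicity.

```python
# Access control matrix: one row per file, one column per user.
# An empty set means that user has no access to that file.
matrix = {
    "PAYROLL": {"alice": {"read", "write"}, "bob": {"read"}, "carol": set()},
    "PROG1": {"alice": {"execute"}, "bob": set(), "carol": {"read", "execute"}},
}


def access_control_lists(m):
    """Per file: only the users who have some access (empty cells dropped)."""
    return {f: {u: r for u, r in row.items() if r} for f, row in m.items()}


def capability_lists(m):
    """Per user: only the files that user can access."""
    caps = {}
    for f, row in m.items():
        for u, rights in row.items():
            if rights:
                caps.setdefault(u, {})[f] = rights
    return caps


def allowed(m, user, file, right):
    """Check one intersection of row (file) and column (user)."""
    return right in m.get(file, {}).get(user, set())


print(access_control_lists(matrix)["PAYROLL"])
print(capability_lists(matrix)["carol"])
print(allowed(matrix, "bob", "PAYROLL", "write"))
```

Both list forms save space by omitting empty cells. With capability lists, removing a user means deleting a single entry rather than editing every file's list.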

51
Lockword Control
  • Lockword is similar to a password but protects a
    single file.
  • When file created, owner protects it via
    lockword.
  • Stored in directory but isn't revealed with
    directory listing.
  • User must provide correct lockword to access
    protected file.
  • Requires smallest amount of storage for file
    protection.
  • Can be guessed by hackers or passed on to
    unauthorized users.
  • Generally doesn't control type of access to file.
  • Anyone who knows lockword can read, write,
    execute, or delete file.

52
Terminology
  • access control list
  • access control matrix
  • capability list
  • complete file name
  • current byte address (CBA)
  • current directory
  • data compression
  • data file
  • database
  • device independent
  • direct access files
  • direct record organization
  • directory
  • extension
  • extents
  • file
  • file descriptor
  • fixed-length record
  • hashing algorithm
  • indexed sequential record organization
  • key field
  • lockword
  • logical address

53
Terminology - 2
  • master file directory (MFD)
  • relative address
  • relative file name
  • sequential record organization
  • subdirectory
  • variable-length record
  • volume
  • working directory