File System - PowerPoint PPT Presentation

Provided by: smf0

Title: File System


1
Lecture 13
  • File System and Databases

2
File System
  • Used for storing long-term information on disk
    and other external media in units called files
  • Information stored in files must be persistent
  • It is not affected by process creation and
    termination
  • Processes can read and write files when they need to

3
File System
  • Files
  • When a process creates a file, it gives the file a
    name
  • When the process terminates, the file continues
    to exist, and can be accessed by other processes
    using its name.
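
The persistence described above can be sketched in a few lines. This is a minimal illustration, assuming a Python interpreter is available to launch as a child process; the file name `report.txt` and the child script are made up for the example.

```python
import os
import subprocess
import sys
import tempfile

# The child process creates the file, gives it a name, then terminates.
CHILD_CODE = (
    "import sys\n"
    "with open(sys.argv[1], 'w') as f:\n"
    "    f.write('written by child')\n"
)

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "report.txt")  # hypothetical file name
    subprocess.run([sys.executable, "-c", CHILD_CODE, path], check=True)

    # The child is gone, but the file still exists and is reachable by name.
    with open(path) as f:
        content = f.read()

print(content)
```

The parent never shares memory with the child; the only thing connecting them is the file's name.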

4
File System
  • Contains two parts
  • A collection of files
  • - Each stores related data
  • A directory structure
  • - Organizes and provides information about all the
    files in the system

5
File Concept
  • The OS abstracts from the physical properties of its
    storage devices to define a logical storage unit, the file.
  • Files are mapped by the OS onto physical devices.
  • These devices are usually nonvolatile, so the
    contents are persistent through power failures
    and system reboots.

6
File Concept
  • Files may be free-form, such as text files, or
  • Files may be rigidly formatted
  • So in general
  • A file is a sequence of bits, bytes, lines, or
    records whose meaning is defined by the file's
    creator and user.

7
File Structure
  • Text file: a sequence of characters organized
    into lines/pages
  • Source file: a sequence of functions and
    subroutines
  • Object file: a sequence of bytes organized into
    blocks understandable by the system's linker
  • Executable file: a series of code sections that
    the loader can bring into memory and execute

8
File Attributes
  • Name: usually a string of characters
  • Type: information needed for those systems that
    support different types
  • Location: a pointer to a device and to the
    location of the file on that device
  • Size: current size of the file in bytes, words,
    or blocks
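
As a rough illustration of these attributes, the sketch below reads them back with `os.stat`. It assumes a POSIX-style view in which the device id and inode number together stand in for the "location" attribute; the helper name `file_attributes` and the file `notes.txt` are invented for the example.

```python
import os
import tempfile


def file_attributes(path):
    """Return a few of the attributes the OS keeps for a file."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "type": os.path.splitext(path)[1],        # extension, used as a type hint
        "location": (info.st_dev, info.st_ino),   # device id and inode number
        "size": info.st_size,                     # current size in bytes
    }


with tempfile.TemporaryDirectory() as workdir:
    demo_path = os.path.join(workdir, "notes.txt")
    with open(demo_path, "wb") as f:
        f.write(b"hello")
    attrs = file_attributes(demo_path)

print(attrs["name"], attrs["type"], attrs["size"])
```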

9
File Operations
  • The OS provides system calls to create, write, read,
    reposition, delete, and truncate files (see the UNIX
    system calls for examples).
  • Creating a file takes two steps
  • Find space for the file in the file system
  • Make an entry for the new file in the directory,
    recording file name and the location in the file
    system.
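
The operations listed above map closely onto UNIX-style system calls, which Python exposes through its `os` module. The following is a sketch only; the function name `exercise_file_operations` and the file `demo.dat` are assumptions made for the example.

```python
import os
import tempfile


def exercise_file_operations(path):
    """Create, write, reposition, read, truncate, and delete one file."""
    # Create: the OS allocates space and adds a directory entry for `path`.
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        os.write(fd, b"hello, file system")   # write
        os.lseek(fd, 7, os.SEEK_SET)          # reposition to byte offset 7
        tail = os.read(fd, 4)                 # read 4 bytes from that offset
        os.ftruncate(fd, 5)                   # truncate to the first 5 bytes
        size_after_truncate = os.fstat(fd).st_size
    finally:
        os.close(fd)
    os.remove(path)                           # delete the directory entry
    return tail, size_after_truncate, os.path.exists(path)


with tempfile.TemporaryDirectory() as workdir:
    result = exercise_file_operations(os.path.join(workdir, "demo.dat"))

print(result)
```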

10
File Open
  • Some systems implicitly open a file when the first
    reference is made to it.
  • The file is automatically closed when the
    job/program that opened the file terminates
  • Most systems require that a file be opened
    explicitly by the programmer with a system call
    (open) before that file can be used.
  • The open system call will return a pointer to the
    entry in the open-file table.

11
Information Associated with an Open File
  • File pointer
  • Unique to each process operating on the file
  • File open count
  • Tracks the number of opens and closes; reaches
    zero on the last close.
  • Disk location of the file
  • Information needed to locate the file on disk is kept
    in memory to avoid having to read it from disk
    for each operation.
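
The per-open file pointer can be observed directly. In this sketch, two separate opens of the same file within one process stand in for two processes, which is a simplification; the function name `independent_offsets` is invented for the example.

```python
import os
import tempfile


def independent_offsets(path):
    """Open the same file twice and show each open has its own file pointer."""
    with open(path, "wb") as f:
        f.write(b"abcdefgh")
    fd_a = os.open(path, os.O_RDONLY)
    fd_b = os.open(path, os.O_RDONLY)
    try:
        first_a = os.read(fd_a, 3)               # advances only fd_a's pointer
        first_b = os.read(fd_b, 2)               # fd_b still starts at offset 0
        pos_a = os.lseek(fd_a, 0, os.SEEK_CUR)   # query current offsets
        pos_b = os.lseek(fd_b, 0, os.SEEK_CUR)
    finally:
        os.close(fd_a)
        os.close(fd_b)
    return first_a, first_b, pos_a, pos_b


with tempfile.TemporaryDirectory() as workdir:
    offsets = independent_offsets(os.path.join(workdir, "data.bin"))

print(offsets)
```

Each descriptor reads from its own position even though both refer to the same data on disk.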

12
Typical Open File Table

13
File Types
  • If an OS recognizes the type of a file, it can
    then operate on the file in reasonable ways.
  • A common technique for implementing file types is to
    include the type as part of the file name.
  • The name is split into a name and an extension.
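
A small sketch of the name-plus-extension convention follows. The `KNOWN_TYPES` table and the `classify` helper are hypothetical; a real OS or application would keep its own mapping.

```python
import os

# Hypothetical extension-to-type table, for illustration only.
KNOWN_TYPES = {
    ".txt": "text",
    ".c": "source",
    ".o": "object",
    ".exe": "executable",
}


def classify(filename):
    """Split a file name into name and extension, then look up the type."""
    name, extension = os.path.splitext(filename)
    return name, extension, KNOWN_TYPES.get(extension.lower(), "unknown")


typed = classify("report.TXT")
untyped = classify("Makefile")
print(typed, untyped)
```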

14
Common File Types

15
File System Organization
  • Organizing thousands of files on hundreds of
    gigabytes of disk is done in two parts
  • 1. The file system is broken into partitions.
  • - Typically each disk on a system contains at
    least one partition, where directories and data
    reside.
  • 2. Each partition contains information about
    the files within it.
  • - This information includes the name, location,
    size, and type of each file on that partition.

16
Directory structure
  • Single-level directory
  • Containing all the user files
  • Naming uniqueness problem
  • Used only by the most primitive microcomputer OSes

17
Directory structure
  • Two-level directory
  • Each user has her own user file directory (UFD)
  • Eliminate name conflicts among users

18
Directory structure
  • Tree-structured directory
  • The most common directory structure.
  • Each user can have as many directories as needed.
  • - Files can be grouped together in a natural way.
  • Every file in the system has a unique path name.
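
The unique-path-name property can be sketched with a nested dictionary standing in for the directory tree. The tree contents and the helper `path_names` are invented for the example; note that two files may share the name `notes.txt` and still have distinct path names.

```python
def path_names(tree, prefix=""):
    """Yield the absolute path name of every file in a nested-dict tree."""
    for name, child in sorted(tree.items()):
        path = f"{prefix}/{name}"
        if isinstance(child, dict):      # a directory: recurse into it
            yield from path_names(child, path)
        else:                            # a file: its path name is complete
            yield path


# Directories are dicts; files are leaves (None).
tree = {
    "alice": {"notes.txt": None, "src": {"main.c": None}},
    "bob": {"notes.txt": None},
}

paths = list(path_names(tree))
print(paths)
```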

19
Tree-structured directory

20
Acyclic-Graph Directory
  • Allow directories to have shared subdirectories
    and files.
  • All the files to be shared can be put together
    into one directory.
  • Implementation of shared subdirectories and files
  • Create a link in the directory
  • Duplicate all information about the shared files.

21
Acyclic-Graph directory structure

22
General Graph directory structure

23
Acyclic-Graph Directory
  • When can the space allocated to a shared file be
    deallocated and reused?
  • Preserve the file until all references to it are
    deleted, by keeping a count of references.
  • UNIX uses this approach
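
The reference count can be watched through the link count reported by `os.stat`. This sketch assumes a file system that supports hard links (as POSIX file systems generally do); the function name `link_counts` and the file names are made up for the example.

```python
import os
import tempfile


def link_counts(workdir):
    """Track the link count of a file as directory entries come and go."""
    original = os.path.join(workdir, "shared.txt")
    alias = os.path.join(workdir, "alias.txt")
    with open(original, "w") as f:
        f.write("shared data")
    counts = [os.stat(original).st_nlink]       # one reference
    os.link(original, alias)                    # add a second directory entry
    counts.append(os.stat(original).st_nlink)   # two references
    os.remove(original)                         # drop one reference
    counts.append(os.stat(alias).st_nlink)      # file survives with one left
    with open(alias) as f:
        data = f.read()
    return counts, data


with tempfile.TemporaryDirectory() as workdir:
    counts, data = link_counts(workdir)

print(counts, data)
```

The space is only reclaimed once the last remaining entry is removed and the count reaches zero.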

24
Art of Engineering
  • Database design and development involves both art
    and engineering
  • Gathering and organizing user requirements is an
    art
  • Transforming the resulting designs into physical
    applications involves engineering

25
Types of Data Stored
  • Today, most newer databases are able to store a
    large variety of data including
  • Scalar data
  • - Names, dates, phone numbers
  • Pictures
  • Audio
  • Video

26
Introducing the Database
  • Data versus information
  • Data constitute building blocks of information
  • Information produced by processing data
  • Information reveals meaning of data
  • Good, timely, relevant information key to
    decision making
  • Good decision making key to organizational
    survival

27
Historical Roots of Database
  • First applications focused on clerical tasks
  • Requests for information quickly followed
  • File systems developed to address needs
  • Data organized according to expected use
  • Data processing specialists computerized manual
    file systems

28
File-Processing System

29
File-Processing System
  • Data
  • Raw Facts.
  • Field
  • Group of characters with a specific meaning.
  • Record
  • Logically connected fields.
  • File
  • Collection of related records

30
File-Processing System
  • Data separated and isolated
  • Makes ad hoc queries impossible
  • Data often duplicated
  • Different and conflicting versions of same data
  • Results of uncontrolled data redundancy
  • - Data inconsistency

31
File-Processing System
  • Application program dependent
  • Data Dependence
  • - A change in a file's data characteristics requires
    modification of data access programs
  • - Makes file system cumbersome from programming
    and data management views
  • Structural Dependence
  • - Change in file structure requires modification
    of related programs
  • Results in incompatible data files
  • Difficult to understand
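
Structural dependence is easy to reproduce. In this sketch, a hypothetical program `parse_customer` hard-codes the record layout (name, then phone); once a city field is added to the file, the unchanged program can no longer read it.

```python
def parse_customer(line):
    """Parse a record whose layout is hard-coded: name, then phone."""
    name, phone = line.split(",")
    return {"name": name, "phone": phone}


old_record = "Dan,555-0100"
new_record = "Dan,Springfield,555-0100"   # a city field was added to the file

parsed = parse_customer(old_record)

try:
    parse_customer(new_record)
    still_works = True
except ValueError:
    # The program must be modified because the file structure changed.
    still_works = False

print(parsed, still_works)
```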

32
File-Processing System
  • When storing the same data in multiple locations,
    the likelihood of inconsistency is very high.
  • What is my real name?
  • File 1: my name is Dan
  • File 2: my name is Danielle
  • File 3: my name is Daniel
  • File 4: my name is Don

33
Database
  • A self-describing collection of integrated
    records.
  • Self-describing
  • In addition to source data, it contains a
    description of its own structure.
  • - Called data dictionary or metadata
  • Why is this important ?
  • - Can determine the structure and content of the
    database by examining the DB itself
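
As one concrete case of a self-describing database, SQLite stores its schema in a catalog table that can be queried like any other. The sketch below uses Python's built-in `sqlite3` module; the `customer` table is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)"
)

# The catalog table sqlite_master describes the tables in the database itself.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(customer)")]

conn.close()
print(tables, columns)
```

No external documentation was consulted: the structure came from the database itself.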

34
Database
  • The Hierarchy of Data

35
Database
  • Database vs. File System

36
Database
  • Benefits of DBMS
  • Data is integrated
  • Data duplication is reduced
  • Data is program independent
  • Data is easy to understand

37
Database
  • Why Use a Database ?
  • Data independence
  • Shared data
  • Avoid redundancy
  • Consistency
  • Centralized security system
  • Hardware independence

38
Database
  • Database System Types ?
  • Single-user vs. Multi-user database
  • Desktop
  • Workgroup
  • Enterprise
  • Centralized vs. Distributed
  • Use
  • Production or transactional
  • Decision support or data warehouse

39
Database
  • Comparison among Database

40
Database Models
  • Collection of logical constructs used to
    represent data structure and relationships within
    the database.
  • Conceptual models: focus on the logical nature of data
    representation
  • Implementation models: emphasis on how the data
    are represented in the database
  • Hierarchical
  • Network
  • Relational
  • Object-oriented

41
Hierarchical Database Models
  • Logically represented by an upside down tree
  • Each parent can have many children
  • Each child has only one parent
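
The one-parent rule can be sketched as a small data structure. The class `Hierarchy` and the record names are hypothetical and only illustrate the constraint; they are not how any particular hierarchical DBMS stores data.

```python
class Hierarchy:
    """A tree of records in which each child has exactly one parent."""

    def __init__(self):
        self.parent = {}      # child -> its single parent
        self.children = {}    # parent -> list of children

    def add(self, parent, child):
        if child in self.parent:
            raise ValueError(f"{child} already has parent {self.parent[child]}")
        self.parent[child] = parent
        self.children.setdefault(parent, []).append(child)


h = Hierarchy()
h.add("customer", "order-1")   # one parent, many children
h.add("customer", "order-2")

try:
    h.add("supplier", "order-1")   # a second parent is rejected
    second_parent_allowed = True
except ValueError:
    second_parent_allowed = False

print(h.children["customer"], second_parent_allowed)
```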

42
Hierarchical Database Models
  • Hierarchical DB.
  • IBM's IMS/VS
  • Suitable for one-to-many relationships
  • Not suitable for many-to-many relationships
  • Difficult to manage, and lacks standards
  • Lacks structural independence

43
Network Database Models
  • Extends the hierarchical database model
  • Each record can have multiple parents
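
By contrast with a strict tree, a network-style structure lets one record be owned by several parents. A minimal sketch, with invented record names and a hypothetical `connect` helper:

```python
from collections import defaultdict

owners = defaultdict(set)   # member record -> set of owner (parent) records


def connect(owner, member):
    """Record that `member` belongs to a set owned by `owner`."""
    owners[member].add(owner)


# The same invoice is owned by both a customer and a sales representative.
connect("customer:Dan", "invoice:10")
connect("salesrep:Ana", "invoice:10")

parents_of_invoice = sorted(owners["invoice:10"])
print(parents_of_invoice)
```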

44
Network Database Models
  • The original network model was presented in the CODASYL
    Data Base Task Group's 1971 report.
  • Standard DDL and DML
  • Disadvantages
  • System complexity
  • Lack of structural independence

45
Relational Database Models
  • Perceived by user as a collection of tables for
    data storage
  • Tables are a series of row/column intersections
  • Tables related by sharing common entity
    characteristic(s)
  • One big drawback: compute-intensive
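
Tables related through a shared attribute can be sketched with Python's built-in `sqlite3` module. The `customer` and `invoice` tables and their contents are invented for the example; the common column `customer_id` is what relates them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoice  (invoice_id  INTEGER PRIMARY KEY,
                           customer_id INTEGER REFERENCES customer(customer_id),
                           total       REAL);
    INSERT INTO customer VALUES (1, 'Dan'), (2, 'Ana');
    INSERT INTO invoice  VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# Join the two tables on the shared column and total each customer's invoices.
rows = conn.execute("""
    SELECT c.name, SUM(i.total)
    FROM customer AS c
    JOIN invoice AS i ON i.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()

conn.close()
print(rows)
```

The join is computed at query time rather than followed through stored pointers, which is one source of the compute cost noted above.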

46
Object-Oriented Database Models
  • OODB products are designed to work well with
    object programming languages such as C++, C#,
    and Java.
  • Makes database objects appear as programming
    language objects in one or more object
    programming languages

47
Object-Oriented Database Models
  • Extends the language with transparently
    persistent data, concurrency control, data
    recovery, associative queries, and other
    capabilities.

48
Database History
  • 1960s
  • Two main data models were developed: the network
    model (CODASYL) and the hierarchical model (IMS).
  • Access to database is through low-level pointer
    operations linking records.
  • Storage details depended on the type of data to
    be stored.
  • Thus adding an extra field to your database
    requires rewriting the underlying
    access/modification scheme.

49
Database History
  • 1960s
  • Emphasis was on records to be processed, not
    overall structure of the system.
  • A user would need to know the physical structure
    of the database in order to query for
    information.
  • One major commercial success was SABRE system
    from IBM and American Airlines.

50
Database History
  • 1970s
  • Several camps of proponents argue about the merits
    of these competing systems while the theory of
    databases leads to mainstream research projects.
  • Two main prototypes for relational systems were
    developed during 1974-1977.
  • These provide a nice example of how theory leads to
    best practice.

51
Database History
  • Early 1980s
  • Commercialization of relational systems begins as
    a boom in computer purchasing fuels DB market for
    business.

52
Database History
  • Mid and late-1980s
  • SQL (Structured Query Language) becomes the
    "intergalactic standard".
  • DB2 becomes IBM's flagship product.
  • Network and hierarchical models fade into the
    background, with essentially no development of
    these systems today, but some legacy systems are
    still in use.

53
Database History
  • Mid and late-1980s
  • Development of the IBM PC gives rise to many DB
    companies and products such as RIM, RBASE 5000,
    PARADOX, OS/2 Database manager, Dbase III, IV
    (later Foxbase, even later Visual FoxPro), Watcom
    SQL.

54
Database History
  • Early 1990s
  • An industry shakeout begins with fewer surviving
    companies offering increasingly complex products
    at higher prices.
  • Much development during this period centers on
    client tools for application development such as
    PowerBuilder (Sybase), Oracle Developer, VB
    (Microsoft), etc.

55
Database History
  • Early 1990s
  • Client-server model for computing becomes the
    norm for future business decisions.
  • Development of personal productivity tools such
    as Excel/Access (MS) and ODBC.
  • This also marks the beginning of Object Database
    Management Systems (ODBMS) prototypes.

56
Database History
  • Mid-1990s
  • The usable internet/www appears.
  • A mad scramble ensues to allow remote access to
    computer systems with legacy data.
  • Client-server frenzy reaches the desktop of
    average users with little patience for complexity
    while Web/DB grows exponentially.

57
Database History
  • Late-1990s
  • The large investment of internet companies fuels
    tools market boom for Web/Internet/DB connectors.
  • Active Server Pages, FrontPage, Java Servlets,
    JDBC, Enterprise JavaBeans, ColdFusion,
    Dreamweaver, Oracle Developer 2000, etc. are examples
    of such offerings.

58
Database History
  • Late-1990s
  • Open source solutions come online with widespread
    use of gcc, CGI, Apache, MySQL, etc.
  • Online transaction processing (OLTP) and online analytic
    processing (OLAP) come of age, with many
    merchants using point-of-sale (POS) technology on
    a daily basis.

59
Database History
  • Early 21st century
  • Decline of the internet industry as a whole but
    solid growth of DB applications continues.
  • More interactive applications appear with use of
    PDAs, POS transactions, consolidation of vendors,
    etc.
  • Three main (western) companies predominate in the
    large DB market: IBM (buys Informix), Microsoft,
    and Oracle.

60
Database History
  • Future trends
  • Huge (terabyte) systems are appearing and will
    require novel means of handling and analyzing
    data.
  • Large science databases such as genome project,
    geological, national security, and space
    exploration data.
  • Clickstream analysis is happening now.
  • Data mining, data warehousing, and data marts are
    commonly used techniques today.