CSC 370 - PowerPoint PPT Presentation

About This Presentation
Title:

CSC 370

Description:

If the value of the name field is 'Smith' then get the value of the dept_code field. ... Many agents will be accessing parts of the data at any given time. ... – PowerPoint PPT presentation

Number of Views:22
Avg rating:3.0/5.0
Slides: 19
Provided by: scie232
Category:

less

Transcript and Presenter's Notes

Title: CSC 370


1
CSC 370 Database Systems Introduction
2
Whats a database?
  • In essence a database is nothing more than a
    collection of information that exists over a long
    period of time.
  • Databases are empowered by a body of knowledge
    and technology embodied in specialized software
    called a database management system, or DBMS.
  • A DBMS is a powerful tool for creating and
    managing large amounts of data efficiently and
    allowing it to persist over long periods of time,
    safely.
  • Among the most complex types of software
    available.

3
The database management system
  • Allows users to create new databases and specify
    their schema (logical structure of the data),
    using a data-definition language.
  • Enables users to query and modify the data, using
    a query language and data-manipulation language.
  • Supports intelligent storage of very large
    amounts of data.
  • Protects data from accident or not proper use.
  • Example We can require from the DBMS to not
    allow the insertion of two different employees
    with the same SIN.
  • Allows efficient access to the data for queries
    and modifications.
  • Example Indexes over a specified fields
  • Controls access to data from many users at once
    (concurrency), without allowing bad
    interactions that can corrupt the data
    accidentally.
  • Recovers from software failures and crashes.

4
Early database systems and file syst.
  • First commercial database systems evolved from
    file systems.
  • File systems allow the storage of big amounts of
    data (not safely though).
  • But file systems do not provide a query language
    for the data in files.
  • People had to write programs in order to extract
    even the most elementary information from a set
    of files.
  •  
  • Example Suppose we have stored in a file called
    Employees records having the fields
  • (emp_code, name, dept_code)
  • and in another file called Departments records
    having the fields
  • (dept_code, dept_name)
  •  
  • Suppose now that given an employee, for
    instance with name Smith, we want to find out
    what department is he working for.

5
Continued
  • In the absence of a query language we have to
    write a program which will
  • open the file Employees
  • declare a variable of the same type as the
    records stored in the file
  • scan the file
  • while the end of the file is not yet
    encountered, assign the current record to above
    variable.
  • If the value of the name field is Smith then
    get the value of the dept_code field. Suppose it
    is 100
  • Search in a similar way for a record with 100
    for the dept_code in the Department file.
  • Print the dept_name when successfully finding the
    dept_code.
  • Very painful procedure even for the simplest
    queries.
  • Compare it to the short and elegant SQL query  
  • SELECT dept_name
  • FROM Employees, Department
  • WHERE Employees.name"Smith" AND
    Employees.dept_code Department.dept_code

6
First important applications of DBMSs
  • The ones where the data was composed of
  • many small items, and
  • many queries or modifications were made.
  • Examples
  • Airline reservation systems
  • Banking systems
  • Corporate records

7
Airline Reservation Systems
  • Here the items of data include
  • Reservations by a single customer on a single
    flight, including such information as assigned
    seat
  • Flights information the airport they fly from
    and to, their departure and arrival times
  • Ticket information prices, requirements, and
    availability.
  • Typical queries ask for
  • Flights leaving about a certain time from one
    given city to another, what seats are available,
    and at what prices.
  • Typical data modifications include
  • Making a reservation in a flight for a customer,
    assign a seat etc.
  • Many agents will be accessing parts of the data
    at any given time.
  • The DBMS must allow concurrent accesses
    preventing problems such as two agents assigning
    the same seat simultaneously. 
  • Also, the DBMS should protect against loss of
    records if the system suddenly fails.

8
Banking Systems
  • Data items include
  • Customers, their names, addresses etc.
  • Accounts, and their balances
  • Loans, and their balances
  • Connections between customers and their accounts
    and loans.
  • Typical queries are those for account and loan
    balances.
  • Typical modifications are those representing a
    payment from or deposit to an account.
  • In banking systems failures cannot be tolerated.
  • E.g, once the money has been ejected from an ATM
    machine, the bank must record the debit, even if
    the power immediately fails.
  • On the other hand, it is not permissible for the
    bank to record the debit and then not to deliver
    the money because the power fails.
  • The proper way to handle this operation is far
    from obvious and is one of the significant
    achievements in DBMS architecture.

9
Early DBMSs (1960s)
  • They encouraged the user to view the data much as
    it was stored.
  • The chief models were the Hierarchical and
    Network.
  • The main characteristic of these models was the
    possibility of easy jumping or navigating from
    one object to another through pointers.
  • E.g. From one employee to his department.
  • However these models didnt provide a high-level
    query language for the data.
  • So, one had still to write programs for querying
    the data.
  • Also they didnt allow on-line schema
    modifications.

10
Relational databases
  • Codd (1970)
  • A database system should present the user with a
    view of data organized as tables (also called
    relations).
  • Behind the scene there could be a complex data
    structure that allows rapid response to a variety
    of queries.
  • But the user would not be concerned with the
    storage structure.
  • Queries could be expressed in a very high-level
    language, which greatly increases the efficiency
    of database programmers.
  • This high-level query language for relational
    databases is called Structured Query Language
    (SQL)

11
Example of a Relational DB
  • Relations Tables. Columns are headed by
    attribute names.
  • Rows Tuples
  • A relation Accounts might be
  • SQL Examples
  • Whats the balance of account 67890 ?
  • Which are the savings accounts with negative
    balances?
  1. Examine all tuples of the relation Accounts in
    FROM-clause.
  2. Pick out those tuples that satisfy some criterion
    in the WHERE-clause,
  3. Output the attributes of those tuples which are
    asked for by SELECT-clause.

12
Architecture of a DBMS
  • The cylindrical component contains not only
    data, but also metadata, i.e. info about the
    structure of data.
  • If DBMS is relational, metadata includes
  • names of relations,
  • names of attributes of those relations, and
  • data types for those attributes (e.g., integer or
    character string).
  • A database also maintains indexes for the data.
  • Indexes are part of the stored data.
  • Description of which attributes have indexes is
    part of the metadata.

13
Few words about indexes
  • Similar to book indexes.
  • A book index associates words with page numbers
    where they appear.
  • A database index associates values of some object
    field(s) with the physical address of the
    corresponding objects in the disk.
  • Main property of an index Its size is much
    smaller than the record set being indexed.
  • Hence, searching in an index is much faster than
    searching in the corresponding record set.

14
Storage Manager
  • The job of the Storage Manager is to
  • obtain data from the data storage, and
  • modify the data to the data storage when
    requested.
  • Storage Manager has two components
  • File Manager handles files.
  • Keeps track of the location of files
  • Obtains block(s) of a file on request from the
    buffer manager.
  • Buffer Manager handles main memory.
  • Obtains and returns blocks of data from/to the
    file manager
  • Stores blocks temporarily in main memory pages.
  • 1 block 1 page 4,000 to 16,000 bytes.  
  • Smallest unit of data that is read/written
    from/to disk.

15
Query Processor
  • Query Processor handles queries and
    modifications to the data.
  • Finds the best way to carry out a requested
    operation and
  • Issues commands to the storage manager which will
    carry them out.
  • E.g. A bank has a DB with two relat.  
  • Customers (name, SIN, address),
  • Accounts (accountNo, balance, SIN)
  • Query Find the balances of all accounts of
    which Sally is the owner.
  • SELECT Accounts.balance
  • FROM Customers, Accounts
  • WHERE Customers.SIN Accounts.SIN AND
    Customers.name 'Sally'

16
Query Processor (Cont.)
  • What this query logically says is
  • Make Cartesian product R of tables specified in
    the FROM-clause,
  • Choose from R the tuples satisfying the condition
    in the WHERE clause.
  • Produce as answer only the values of attributes
    in SELECT-clause.
  • If answer this query as it says the performance
    would be terrible.
  • Because of the usually enormous Cartesian
    product.
  • Suppose we have
  • Index on name of Customer and
  • Index on SIN of Accounts.
  • Then, query processor will cleverly create a plan
    which inexpensively
  • Retrieves the tuple for Sally and gets the SIN
    number.
  • Retrieves the account tuples for this SIN number.

17
Transaction Manager
  • Transaction Manager is responsible for the
    integrity of the system. It must assure that
  • several queries running simultaneously do not
    interfere with each other and that,
  • the system will not lose data even if there is a
    power failure.
  • Transaction Manager interacts with
  • Query Manager,
  • Because it may need to delay certain query
    operations to avoid conflicts.
  • Storage Manager
  • Because schemes for protecting data involve
    storing a log of changes to the data.

18
Database Studies
  • Design of databases.
  • What kinds of information go into the database?
  • How is the information structured?
  • How do data items connect?
  • Database programming.
  • How does one express queries on the database?
  • How does one use other capabilities of a DBMS,
    such as transactions or constraints, in an
    application?
  • How is database programming combined with
    conventional programming?
  • Database system implementation.
  • How does one build a DBMS, including such matters
    as query processing, transaction processing and
    organizing storage for efficient access?
Write a Comment
User Comments (0)
About PowerShow.com