Organizing Data and Information - PowerPoint PPT Presentation

1 / 61
About This Presentation
Title:

Organizing Data and Information

Description:

Physical access path (PAP) DBMS accesses a storage device to retrieve data. Schematic ... Physical access path (PAP) Logical access path (LAP) ITEC 1010 ... – PowerPoint PPT presentation

Number of Views:94
Avg rating:3.0/5.0
Slides: 62
Provided by: ianscottm
Category:

less

Transcript and Presenter's Notes

Title: Organizing Data and Information


1
Chapter 5
Organizing Data and Information
2
Data
  • Data
  • A necessity for almost any enterprise to carry
    out its business. Consists of raw facts, and when
    organized may be transformed into information
  • Database
  • A collection of data organized to meet users
    needs
  • Database management system (DBMS)
  • A group of programs that manipulate the database
    and provide an interface between the database and
    the user of the database or other application
    programs

3
DBMS Discussion (1)
  • A collection of programs that enables you to
    store, modify, and
  • extract information from a database. There are
    many different types
  • of DBMSs, ranging from small systems that run
    on personal
  • computers to huge systems that run on
    mainframes. The following
  • are examples of database applications
  • computerized library systems
  • automated teller machines
  • flight reservation systems
  • computerized parts inventory systems
  • From a technical standpoint, DBMSs can differ
    widely. The terms

4
DBMS Discussion (2)
  • relational, network, flat, and hierarchical all
    refer to the way a
  • DBMS organizes information internally. The
    internal organization can
  • affect how quickly and flexibly you can extract
    information.
  • Requests for information from a database are
    made in the form of a
  • query, which is a stylized question. For
    example, the query
  • SELECT ALL WHERE NAME "SMITH" AND AGE gt 35
  • requests all records in which the NAME field is
    SMITH and the AGE

5
DBMS Discussion (3)
  • field is greater than 35. The set of rules for
    constructing queries is
  • known as a query language. Different DBMSs
    support different query
  • languages, although there is a
    semi-standardized query language
  • called SQL (structured query language).
    Sophisticated languages for
  • managing database systems are called
    fourth-generation languages,
  • or 4GLs for short.
  • The information from a database can be
    presented in a variety of
  • formats. Most DBMSs include a report writer
    program that enables
  • you to output data in the form of a report.
    Many DBMSs also include

6
DBMS Discussion (4)
  • a graphics component that enables you to output
    information in the
  • form of graphs and charts.

7
Hierarchy of Data
Schematic
8
Hierarchy of data
Example
Database
Personel file
(Project database)
Department file
Payroll file
Files
005-10-6321 Johns Francine 10-7-65 549-77-1001
Buckley Bill 2-17-79 098-40-1370 Fiske Steven
1-5-85
(Personnel file)
Records
(Record containing SSN, last name, first name,
date of hire)
098-40-1370 Fiske Steven 1-5-85 598
Fields
Fiske
(Last name field)
Characters(bytes)
1000100
(Letter F in ASCII)
9
Terminology
  • Database
  • A collection of integrated and related files
  • File
  • A collection of related records
  • Record
  • A collection of related fields
  • Field
  • A group of characters
  • Character
  • Basic building block of information, represented
    by a byte

10
Data Entities, Attributes, and Keys
  • Entity
  • A generalized class of people, places, or things
    (objects) for which data are collected, stored,
    and maintained
  • E.g., Customer, Employee
  • Attribute
  • A characteristic of an entity something the
    entity is identified by
  • E.g., Customer name, Employee name
  • Keys
  • A field or set of fields in a record that is used
    to identify the record
  • E.g, A field or set of fields that uniquely
    identifies the record

11
Keys and Attributes
Entities(records)
Key field
Attributes (fields)
12
The Traditional Approach
  • The traditional approach
  • Separate files are created and stored for each
    application program

Schematic
13
Application programs
Data
Files
Users
Payroll
Reports
Payrollprograms
Invoicing
Reports
Invoicingprograms
Inventorycontrol
Inventorycontrolprograms
Reports
Managementinquiries
Reports
Managementinquiriesprograms
14
Drawbacks
  • Data redundancy
  • Duplication of data in separate files
  • Lack of data integrity
  • The degree to which the data in any one file is
    accurate
  • Program-data dependence
  • A situation in which program and data organized
    for one application are incompatible with
    programs and data organized differently for
    another application

15
Database Approach
  • The database approach
  • A pool of related data is shared by multiple
    application programs
  • Rather than having separate data files, each
    application uses a collection of data that is
    either joined or related in the database

Schematic
16
Payroll program
Reports
Payroll data Inventorydata Invoicing Data Othe
rdata
Inventory program
Reports
Database management system
Invoicing program
Reports
Other programs
Reports
Database
Interface
Applications programs
Users
17
Advantages
  • Improved strategic use of corporate data
  • Reduced data redundancy
  • Improved data integrity
  • Easier modification and updating
  • Data and program independence
  • Better access to data and information
  • Standardization of data access
  • A framework for program development
  • Better overall protection of the data
  • Shared data and information resources

18
Disadvantages
  • Relatively high cost of purchasing and operating
    a DBMS in a mainframe operating environment
  • Increased cost of specialized staff
  • Increased vulnerability

19
Data Modeling and Database Models (1)
  • Planned data redundancy
  • A way of organizing data in which the logical
    database design is altered so that certain data
    entities are combined
  • Summary totals are carried in the data records
    rather than calculated from elemental data
  • Some data attributes are repeated in more than
    one data entity to improve database performance

20
Data Modeling and Database Models (2)
  • Data model
  • A map or diagram of entities and their
    relationships
  • Enterprise data modeling
  • Data modeling done at the level of the entire
    organization
  • Entity-relationship (ER) diagrams
  • A data model that uses basic graphical symbols to
    show the organization of and relationships
    between data

21
ExampleEntity Relationship (ER) Diagram for a
Customer Ordering Database
Schematic
22
Last name
Colour
Attributes
Entities
Order
Name
First name
Customer
Product
1
N
1N one-to-many relationship
Identificationnumber
Identificationnumber
23
Hierarchical Database Model
  • Hierarchical database model
  • A data model in which data are organized in a
    top-down, or inverted tree structure

Schematic
24
Project 1
Department C
Department B
Department A
Employee 1
Employee 2
Employee 3
Employee 4
Employee 5
Employee 6
25
Network Data Model
  • Network data model
  • An expansion of the hierarchical database model
    with an owner-member relationship in which a
    member may have many owners

Project 1
Project 2
Department A
Department B
Department C
26
Relational Data Model
  • Relational data model
  • All data elements are placed in two-dimensional
    tables, called relations, that are the logical
    equivalent of files

Schematic
27
Data Table 2 Department Table
Data Table 1 Project Table
Data Table 3 Manager Table
28
Relational Database Terminology
  • Selecting
  • Data manipulation that eliminates rows according
    to certain criteria
  • Projecting
  • Data manipulation that eliminates columns in a
    table
  • Joining
  • Data manipulation that combines two or more
    tables
  • Linked
  • Relating tables in a relational database together

29
Linking Data Tables to Answer an Inquiry
Schematic
30
(No Transcript)
31
Building and Modifying a Relational Database
  • Using Microsoft Access

Screen snap
32
(No Transcript)
33
Schemas and Subschemas
  • Schema
  • A description of the entire database
  • Subschema
  • A file that contains a description of a subset of
    the database and identifies which users can
    perform modifications on the data items in that
    subset

Schematic
34
DBMS
Schema
SubschemaB
SubschemaA
SubschemaC
User1
User2
User3
User4
User5
35
Schema Discussion
  • Pronounced skee-ma, the structure of a database
    system, described
  • in a formal language supported by the database
    management
  • system (DBMS). In a relational database, the
    schema defines the
  • tables, the fields in each table, and the
    relationships between fields
  • and tables.
  • Schemas are generally stored in a data
    dictionary. Although a
  • schema is defined in text database language,
    the term is often used
  • to refer to a graphical depiction of the
    database structure.

36
Data Definition Language
  • Data Definition Language (DDL)
  • A collection of instructions and commands used to
    define and describe data and data relationships
    in a specific database

Schematic
37
SCHEMA DESCRIPTION SCHEMA NAME IS
XXXX AUTHOR XXXX DATE XXXX FILE
DESCRIPTION FILE NAME IS XXXX ASSIGN
XXXX FILE NAME IS XXXX ASSIGN XXXX AREA
DESCRIPTION AREA NAME IS XXXX RECORD
DESCRIPTION RECORD NAME ISXXXX RECORD ID IS
XXXX LOCATION MODE ISXXXX WITHIN XXX AREA FROM
XXXX THRU XXXX SET DESCRIPTION SET NAME IS
XXXX ORDER IS XXXX MODE IS XXXX MEMBER IS
XXXX . . .
38
Data Dictionary
  • Data Dictionary
  • A detailed description of all data used in the
    database

Schematic
39
NORTHWESTERN MANUFACTURING PREPARED BY D.
BORDWELL DATE 04 AUGUST APPROVED BY J.
EDWARDS DATE 13 OCTOBER VERSION 3.1 PAGE 1
OF 1 DATA ELEMENT NAME PARTNO DESCRIPTION INVE
NTORY PART NUMER OTHER NAMES PTNO VALUE
RANGE 100 TO 5000 DATA TYPE NUMERIC POSITIONS
4 POSITIONS OR COLUMNS
40
Data Dictionary Features
  • Provide a standard definition of terms and data
    elements
  • Assist programmers in designing and writing
    programs
  • Simplify database modification
  • Reduce data redundancy
  • Increase data reliability
  • Faster program development
  • Easier modification of data and information

41
Logical and PhysicalAccess Paths
  • Logical access path (LAP)
  • Application requires information from the DBMS
  • Physical access path (PAP)
  • DBMS accesses a storage device to retrieve data

Schematic
42
Dataon storage devices
Physical access path (PAP)
DBMS
Logical access path (LAP)
Management inquiries
Other software
Application programs
43
Manipulating Data
  • Concurrency control
  • A method of dealing with a situation in which two
    or more people need to access the same record in
    a database at the same time
  • Data manipulation language (DML)
  • The commands that are used to manipulate the data
    in a database
  • Structured query language (SQL)
  • A standardized data manipulation language

44
Structured Query Language (SQL)
  • Invented at IBMs Almaden Research Centre (San
    Jose, CA) in the 1970s
  • E.g.,

SELECT FROM EMPLOYEE WHERE JOB_CLASSIFICATION
C2
Select all () columns from the EMPLOYEE table
in which the JOB_CLASSIFICATION field is equal to
C2
45
SQL Discussion (1)
  • Abbreviation of structured query language, and
    pronounced either
  • see-kwell or as separate letters. SQL is a
    standardized query
  • language for requesting information from a
    database. The original
  • version called SEQUEL (structured English query
    language) was
  • designed by an IBM research center in 1974 and
    1975. SQL was first
  • introduced as a commercial database system in
    1979 by Oracle
  • Corporation.
  • Historically, SQL has been the favorite query
    language for database
  • management systems running on minicomputers and
    mainframes.

46
SQL Discussion (2)
  • Increasingly, however, SQL is being supported
    by PC database
  • systems because it supports distributed
    databases (databases that
  • are spread out over several computer systems).
    This enables
  • several users on a local-area network to access
    the same database
  • simultaneously.
  • Although there are different dialects of SQL,
    it is nevertheless the
  • closest thing to a standard query language that
    currently exists. In
  • 1986, ANSI approved a rudimentary version of
    SQL as the official
  • standard, but most versions of SQL since then
    have included many

47
SQL Discussion (3)
  • extensions to the ANSI standard. In 1991, ANSI
    updated the
  • standard. The new standard is known as SAG SQL.

48
Database Output
Screen snap
49
(No Transcript)
50
Popular Database Management Systems for End Users
  • Microsoft Access 98
  • Lotus Approach 98
  • Inprise (formerly Borland) dBASE
  • DBMS Selection Criteria
  • Database size
  • Number of concurrent users
  • Performance
  • Integration
  • Features
  • The vendor
  • Cost

51
Distributed Databases
  • Distributed database
  • A database in which the actual data may be spread
    across several smaller databases connected via
    telecommunications devices

Pretty picture
52
(No Transcript)
53
Data Warehouse
  • Data warehouse
  • A relational database management system designed
    specifically to support management decision
    making
  • Current evolution of Decision Support Systems
    (DSSs)
  • Data mart
  • A subset of a data warehouse for small and
    medium-size businesses or departments within
    larger companies

Schematic
54
Relationaldatabases Hierarchicaldatabases Netw
orkdatabases Flat files Spreadsheets
Dataextractionprocess
Datacleanupprocess
Data wharehouse
End user access
Query andanalysistools
55
Designing a Customer Data Warehouse
  • Sharply define your goals and objectives before
    you build the warehouse
  • Choose the software that best fits your goals
  • Determine who/what should be in the database
  • Develop a plan
  • Measure results

56
Data Mining Applications
  • Data mining
  • The automated discovery of patterns and
    relationships in a data warehouse
  • Data mining applications
  • Market segmentation
  • Customer queries
  • Fraud detection
  • Direct marketing
  • Market basket analysis
  • Trend analysis

57
On-Line AnalyticalProcessing (OLAP)
  • On-line analytic processing (OLAP)
  • Access to multidimensional databases providing
    managerially useful display techniques
  • Now used to store and deliver data warehouse
    information
  • Data warehouse and OLAP
  • Provides top-down, query-driven analysis
  • Data mining
  • Provides bottom-up, discovery-driven analysis

58
Open DatabaseConnectivity (ODBC)
  • Open database connectivity (ODBC)
  • A set of standards that ensures software written
    to comply with these standards can be used with
    any ODBC-compliant database

Schematic
59
dBASE
ODBC Import
ACCESSdatabase
Paradoxdatabase
ODBC Export
ODBC Link
Lotus 1-2-3 spreadsheet
60
Object-Relational Database Management Systems
(ORDBMS)
  • Object-relational database management system
    (ORDBMS)
  • A DBMS capable of manipulating audio, video, and
    graphical data.
  • Hypertext
  • Users can search and manipulate alphanumeric data
    in an unstructured way
  • Hypermedia
  • Allows businesses to search and manipulate
    multimedia forms of data
  • Spatial data technology
  • Use of an object-relational database to store and
    access data according to the location it
    describes and to permit spatial queries and
    analysis

61
End of Chapter 5
Chapter 6
Write a Comment
User Comments (0)
About PowerShow.com