Principles of Database Design - PowerPoint PPT Presentation

1 / 49
About This Presentation
Title:

Principles of Database Design

Description:

findings No infiltrate, cardiac shadow. not enlarged... /findings ... Chest X-ray Left upper lobe infiltrate, left ventricular hypertrophy ... – PowerPoint PPT presentation

Number of Views:57
Avg rating:3.0/5.0
Slides: 50
Provided by: dean179
Category:

less

Transcript and Presenter's Notes

Title: Principles of Database Design


1
Principles of Database Design
  • James J. Cimino
  • NIH Clinical Center

2
Outline
  • Definition
  • Motivation
  • History and evolution
  • Design principles
  • Design methods
  • Exercises
  • Take-Home Messages

3
Database Definition
  • A collection of data that
  • is organized
  • usually computer-based
  • represents repetitive information implicitly
  • supports retrieval

4
Information
  • Content
  • Name
  • Date
  • Diagnosis
  • Medication
  • Problem
  • Procedure
  • Visit
  • Structure
  • Field
  • Record
  • Table
  • Database

5
Paper Database as Expert System
6
Motivation
  • Power and flexibility depend on data model
  • Database is the realization of data model
  • Evaluation of commercial products
  • Communicating with vendors and IT staff
  • Building your own databases

7
In the beginning, there were punch cards
8
Sequential Files
  • Stored on magnetic tape
  • Longer (or shorter) than 80 characters
  • 8-bit characters (256 characters)
  • Variable-length records

Len21 Data Len16
Data
0211234567SandiegoCarmen0161234568CiminoJim
  • Random access possible

ID Loc ID Loc
12345678901234567890123
SandiegoCarmenCiminoJim
1234567000112345680015
  • Slowwwwww..

9
Random Access Files
  • Disk storage with moving heads
  • Larger capacity (MB!)
  • Addressable records and fields using pointers
  • Indexes created as lists of pointers
  • Separation of physical and logical models
  • Can be difficult to recover if index corrupted

10
Random Access Files
  • 0000 Name1000Addr2000DOB 6000
  • 1000 Sandiego, Carmen
  • 2000 123 Main Street
  • 3000
  • 4000
  • 5000
  • 6000 Feb 1, 1948

11
Indexed Files
0000 0000 0004 4000 0008 2000
0000 Cimino, James 1000 2000 Norton,
Cathy 3000 4000 Lindberg, Don
12
Hierarchical Databases
1234567
Sandiego, Carmen
123 Main Street
Labs
Chem7
Chem7
K 3.9
Na142
K 4.3
Na136
13
Hierarchical Databases
  • Easy to use
  • Efficient storage
  • Tree walking is fast
  • Queries across trees are slow
  • Flexible
  • Too flexible chaos is allowed
  • Too easy to modify
  • Difficult to document complex structures

14
Hierarchical Chaos
1234567
Admissions
Admission 1
Admit Date 2/2/02
Primary DX CHF
Other DX
AODM
A Fib
Flag S
Flag P
15
Hierarchical Databases
  • EMR(1234567)Sandiego, Carmen
  • EMR(1234567, Address)123 Main Street
  • EMR(1234567, Chem7, 2/2/02, Na)136
  • EMR(1234567, Chem7, 2/2/02, K)4.3
  • EMR(1234567, Chem7, 2/3/02, Na)142
  • EMR(1234567, Chem7, 2/3/02, K)3.9

16
Network Databases
1234567
Gyn Clinic
2 Main St.
Sandiego
305-2500
Secretary
Gyn Clinic
8AM-5PM
Ms Smith
305-1000
Service
Pap
Dr. Jones
Gyn Visit
Beeper 34
17
Extensible Markup Language (XML) Databases
  • SGML is a metalanguage
  • SGML is used to write Document Type Definitions
    (DTDs) that define languages
  • HTML is a language with an SGML DTD
  • Tags are for formatting/presentation syntax
  • XML is a proper subset of SGML
  • XML defines tags that convey semantics
  • We could write Health Markup Language (HML)
    in XML (if we could agree on the semantics and
    tags)
  • Tags may or may not be stored with data

18
ltdocumentgt lt/documentgt
ltdocument.idgtCXR001lt/document.idgt ltdoc.
dategt19991101lt/doc. dategt ltdocument.typegt lt/d
ocument.typegt ltdocument.bodygt ltdocument.bod
ygt
ltidentifiergtP5-00010lt/identifiergt
lttextgtChest X-Raylt/textgt
ltfindingsgtNo infiltrate, cardiac shadow not
enlarged...lt/findingsgt ltimpressiongtNormal
X-raylt/impressiongt
19
ltpatientgt lt/patientgt
ltpatient.idgt lt/patient.idgt ltpatient.namegt lt/pa
tient.namegt ltpatient.dobgt19230113lt/patient.dobgt ltp
atient.sex value"male"/gt ltinpatient/gt
ltid.valuegt1234789lt/id.valuegt
ltfamily.namegtSandiegolt/family.namegt ltgiven.namegtCa
rmenlt/given.namegt ltsuffixgtM.D.lt/suffixgt
20
Extensible Markup Language (XML) Databases
  • Strengths
  • Flexibility to represent wide range of data
  • Data carries its field assignment
  • Sparse data handled compactly
  • Tags can have platform-specific display
  • Weaknesses
  • Immature database tools
  • Verbose
  • I/O intensive
  • A trade-off of decreased efficiency for increased
    flexibility ? scalability

21
Relational Databases - Features
  • Tables with columns and rows
  • Logical vs. physical representation
  • Multiple indexes
  • Inter-table relationships
  • Virtual sequential files (with simultaneous
    update)

22
Relational Databases
From table Patient, get Pt_UI where
LnameJones and FnameCasey
and then Get Tname and Date from table
Lab_test for the same Pt_UI
23
Normalization
  • Efficient database organization
  • Eliminate redundant data
  • Ensure data dependencies make sense
  • E.F. Codd, 1970 five normal forms

24
Normalization
  • First Normal Form
  • Create tables for related data
  • Eliminate duplicative columns
  • Identify a primary key

25
Normalization
26
Normalization
What is the primary key? (hint it must uniquely
identify each row)
27
Normalization
  • First Normal Form
  • Create tables for related data
  • Identify a primary key
  • Eliminate duplicative columns
  • Second normal form
  • Move duplicative data to new tables
  • Relate data through foreign keys

28
Normalization
29
Normalization
30
Normalization
  • First Normal Form
  • Create tables for related data
  • Identify a primary key
  • Eliminate duplicative columns
  • Second normal form
  • Move duplicative data to new tables
  • Relate data through foreign keys
  • Third normal form
  • Move columns not dependent upon the primary key

31
Relational Databases - Advantages
  • Comprehensible
  • Multiple views possible
  • Easy to modify
  • New elements dont break programs
  • Database management systems (DBMS)
  • Referential integrity
  • Reorg for efficiency
  • Access control
  • Locking for multiple simultaneous use

32
Relational Databases - Disadvantages
  • Storage overhead
  • I/O-intense
  • Cost

33
Systems Design Fundamentals
  • The data model is the most critical aspect
  • Data model should reflect real world objects and
    relationships to ensure durability
  • A correct data model outlasts applications,
    including many not anticipated at system start-up

34
System Design Basic Concepts
  • The world contains things
  • Develop abstractions called objects
  • Group objects by criteria which represent the
    abstract object as an empty table

35
Types of Objects (Tables)
  • Tangible things (book, person)
  • Roles (doctor, patient, supervisor)
  • Events (ordering of a lab test)
  • Interactions - bind two or more other objects via
    a transaction (purchase relates buyer to seller)

36
Objects
  • All of the real-world things in the set (the
    instances) have the same characteristics
  • All instances conform to the same rules

37
Basic Concepts (continued)
  • Empty tables can be filled in to represent the
    real world things from which the object was
    abstracted

38
Formalisms for Tables
  • Rule 1 One instance of an object has
  • exactly one value for each attribute
  • only one element per row-column intersection
  • no repeating groups
  • no true holes in table
  • Rule 2 Attributes contain no internal structure

39
Formalisms for Tables
40
Formalisms for Tables
  • Rule 1 One instance of an object has
  • exactly one value for each attribute
  • only one element per row-column intersection
  • no repeating groups
  • no true holes in table
  • Rule 2 Attributes contain no internal structure
  • Rule 3 Every attribute should represent a
    characteristic of the entire object, not a
    characteristic of a limited part of the object

41
Formalisms for Tables
42
Formalisms for Tables
  • Rule 1 One instance of an object has
  • exactly one value for each attribute
  • only one element per row-column intersection
  • no repeating groups
  • no true holes in table
  • Rule 2 Attributes contain no internal structure
  • Rule 3 Every attribute should represent a
    characteristic of the entire object, not a
    characteristic of a limited part of the object

43
Relationships
  • Relationship an abstraction of an association
    between real world things
  • Patient OCCUPIES Bed
  • Library CONTAINS Books
  • Specimen IS ASSAYED by Lab Method
  • Inverse relationships
  • Bed is OCCUPIED BY Patient
  • Book IS LENT BY Library

44
Basic Concepts (continued)
  • Relationships between objects are attributes of
    those objects

Relationship Has-Doc Patient Has-Doc
Physician
45
Relationship Types
Patient
Many-to-Many
Physician
46
Modeling Many-to-Many Relationships
47
Exercise Devise a Relational Model for MEDLINE
citations
PMID- 2405204 TI- Medical informatics. An
emerging academic discipline and institutional
priority. AB- Information management constitutes
a major activity of the health care\ AD- Departme
nt of Radiology, Brigham and Women's Hospital,
Boston, MA 02115 AU- Greenes RA AU- Shortliffe
EH LA- eng PT- Journal Article PT- Review JT- JAMA
the journal of the American Medical
Association SO- JAMA. 1990 Feb 23263(8)1114-20.
MH- Career Choice MH- Hospital Information
Systems MH- Information Systems MH- Medical
Informatics/education/organization
administration/trends MH- Medical Informatics
Applications MH- National Library of Medicine
(U.S.) MH- Research MH- Training
Support MH- United States
48
Case Presentation
  • The patient is a 50 year old, Native American
    female who present to the emergency room (ER)
    with the chief complaint of lip numbness, nausea
    and chest pain.
  • The patient was generally well until about one
    half hour prior to arrival in the ER, while
    eating dinner at as seafood restaurant in Rock
    Harbor, MA. She was finishing a dinner of New
    England clam chowder, lobster, steamed clams, and
    corn on the cob when she noted onset of symptoms.
    Others in her party ate fish and chips, although
    two other people ate the clam chowder none at
    the steamers.
  • She gives a history of hypertension and states
    that she was getting a "capsule, half green, half
    blue-green" from her private doctor. She also
    reports that she was treated in the past for
    tuberculosis while she was pregnant, but doesn't
    remember what she was treated with or for how
    long. She reports that she was at another
    hospital on the other side of town, where she had
    a liver biopsy. She reports that he thinks the
    diagnosis was "hemachromatosis". The patient
    reports an allergy to Bufferin.
  • Physical examination revealed a well-developed,
    well-nourished diaphoretic female in moderate
    respiratory distress. Vital signs showed a pulse
    of 110, a respiratory rate of 8, an oral
    temperature of 100.3, and a blood pressure of
    150/100. Examination revealed rales over both
    lower lung fields. Abdominal exam revealed a
    tender, palpable liver edge. Neurologic exam
    reveals dysarthria, diffuse muscle weakness, and
    hyperreflexia.
  • Chem7 (serum) Glucose 100 (70-105) Chem7
    (plasma) Glucose 150 (75-110)
  • CBC Hgb 15 (12.0-15.8), Hct 45 (42.4-48.0), WBC
    11,000 (3,540-9,060), Platelets 145K (165-415K)
  • A fingerstick blood sugar was 80
  • Urinalysis showed protein of 1 and glucose of 0
  • A blood culture was positive for
    methicillin-resistant Staphylococcus aureus
    (MRSA)
  • ECG - Sinus Rhythm, 74BPM, Axis -30 degrees, ST
    segment 2mm elevated and
  • T-waves down in leads I, L, V5 and V6
  • Chest X-ray Left upper lobe infiltrate, left
    ventricular hypertrophy
  • The patient's nurse reported that the patient
    seemed more worried about who would care for her
    elderly father if anything happened to her.
  • A medical student reviewing the case wonders
    whether paralytic shellfish poisoning could cause
    a mycoardial infarction is concerned she decides
    to do a literature search.
  • The patient was treated with activated charcoal
    and stomach lavage , followed by enteric-coated
    aspirin. Due to worsening respiratory
    insufficiency, she was intubated and placed on
    mechanical ventilation.

49
Take-Home Messages
  • Data model is the most critical aspect of system
    design and function
  • Data models should reflect real world objects and
    their relationships to ensure durability
  • A correct data model outlasts applications,
    including many not anticipated at system start-up
Write a Comment
User Comments (0)
About PowerShow.com