Database Integration - PowerPoint PPT Presentation

About This Presentation
Title:

Database Integration

Description:

Title: Database Integration Author: stefano spaccapietra Last modified by: IDEA RAYANE Created Date: 1/21/2003 12:44:24 PM Document presentation format – PowerPoint PPT presentation

Slides: 45
Provided by: stefan217

Transcript and Presenter's Notes

Title: Database Integration


1
??????? ???? ???? ??? ???????? ???? ??????
  • ??????? ?????
  • ??????? ???- ??????? ??? ? ????????

???? ???? ????? ????? ?????? ???? ????? ?????
2
????? ?????
  • ????? ???? ????????
  • ????? ??????
  • ????? ?????
  • ????? ??????
  • ????? ????
  • ?????? ??????? ???? ????
  • ??????? ???? ??? ??
  • ?????? ? ??????? ??? ????? ?? ???? ?? (Data
    interoperability)
  • ??????? ???? ??? ???????
  • ?????? ????? ??? ???? ???????? ?????????
  • ?????? ???? ???? ??
  • ??? ??? ??????? ????? (Data alignment)
  • ???????? ???? ??
  • ????? ????????? ?? ???? ??
  • ??????
  • ???

3
????? ????? (?????)
  • ???? ??? ?????? ???? ??? ???????? ????????????
  • ?????? ??? ??????? ???? ???? ??? ???????? ????
    ??????

4
????? ???? ????????
?????? ?? ????? ???? ???????? ?? ???? ?? ????
????? ????? ?? ???? ????? ?? ???? ???????? ??????
?? ???. ?? ?? ???? ???????? ????? ??? ???? ?????
????? ?????? ???? ??? ????? ?? ?? ????? ? ??????
??? ??? ???? ?? ? ????? ?? ?? ???? ????.
??????? ???? ???????? ????? ?? ??? ???? ????
???? ???????? ????? ?? ???. ????? ?? ????? ????
???????? ?? ???? ?? ???? ???????? ????? ?? ????
??? ??? ???? ????? ?? ???? ??????? ?????? ????
???????? ????? ?? ???.
5
????? ???? ???????? (?????)
  • ???? ????? ????? ???? ????????
  • ????? ??????? ?? ???? ?? ?????? ? ????? ?????
    ????
  • ????? ?????? ?? ???? ?? ??????? ?? ? ???? ???? ?
    ????? ????? ????
  • ????? ??????? ?? ????? ???? ??? ?? ????? ?????
  • ????? ????? ????? ????? ????? ?? Oracle ?? SQL
    Server

6
????? ??????
?? ????? ?????? ?? ?? ?? ??? ???? ?? ?????? ???
???? ?? ??? ?? ???? ???? ?????? ? ????? ???
??????. ??? ???? ????? ?????? ??? ????? ????? ??
?????? ?? ????? ?? ??? ? ??? ???? ?? ?? ????
????? ?? ???? ?? ???. ?? ???? ??? ??? ???? ????
??? ??????? ??? ????? ????? ? ????? ?? ????? ??
???. ?? ???? ?? ??? ??? ?? ???? ????? ??? ??????
?? ?????? ?? ???? ?? ??? ? ?? ????? ?? ???????
??? ???????? ??? ???? ?? ???? ??????? ???? ??
???. ??? ???? ?? ????? ?????? ? ?????? ??? ????
?? ????? ?? ???. ??? ??? ???? ?? ?? ???? ????????
?? ??? ?? ????? ???? ????? ?? ??? ? ??????? ???
????? ????. ??? ?? ????? ?????? ????? ?? ??????
???? ???? ???????????? ?? ????. ??? ???? ??????
????? ?? ???????? ?? ???. ?? ???? ????? ?????? ??
?? ???? ????? ????? ????? ????? ?? ?? ?? ?? ????
? ?? ?????? ???? ????? ???? ??? ?????.
7
????? ?????
?? ??????? ????? ?? ??? ???? ?? ???? ?? ????? ?
???? ??? ?? ?? ???? ?? ?????? ?? ? ??????? ??? ??
???????? ?????? ???? ?? ???? ???? ???. ??? ???
???? ???? ????? ?? ?? ????? ??? ?????? ????
???????? ?? ????? ????? ???? ?? ???????? ????
??????? ???? ??? ?? ??? ???? ????? ????? ????? ??
? ????? ???? ??? ????? ?? ??? ??? ?? ???? ? ?? ??
??? XML ??????? ???. ??? ?????? ?? ???? ??? ????
?? ?????? ?? ?? ?? ??? ????? ?? ?? ???????? ????
??????? ??? ???? ? ?? ?? ???? ??? ???? ?? ??????
?? ?? ?? ?????? ????? ? ??????? ??????? ?? ????
?????? ???? ??? ???? ?? ????? ?? ????.
8
????? ??????
??? ?????? ?? ???? ????? ?? ??? ?? ???? ?? ?????
?? ??? ????? ?????? ????? ???? ????? ?? ????. ???
????? ???? ?? ????? ?? ????? ?????? ? ?????? ? ??
????? ????? ??? ???? ????. ???? ??? ?????? ??
???? ????? ????? ?????? ?? ???? ? ????? ???? ?
???? ?????? ?? ?? ????
9
????? ????
?? ?? ???? ?? ?????? ????? ??????? ??????? ???.
?? ?? ???? ????? ?? ?? ?? ?? ?? ???? ??? ??????
??? ???? ?? ??????? ???? ????? ?? ???? ?????.
???? ?? Oracle ??????? ?? ?? ???? ???? ??? ??
?? ?? ??? ???? ????? ?? ????? Oracle ???????
????. ?? Oracle ????? ?? ????? ????? ????? ??
???? ?? ?? ?? ?????? ???? ????? ?????. ??????
???? ??? ?????? ???????? ???? ??? ??????????
???????? ?????? ? ... ????? ???? ?? ??????? ??
?????
10
???? ??????
  • ?????? ??????? ???? ????
  • ??????? ???? ??? ??
  • ?????? ? ??????? ??? ????? ?? ???? ?? (Data
    interoperability)
  • ??????? ???? ??? ???????
  • ?????? ???? ????? ????? ??? ???? ????????
    ?????????
  • ?????? ???? ???? ??
  • ??? ??? ??????? ????? (Data alignment)

11
??????? ???? ????? (1965-75)
  • ???? ?? ???? ??? ????? ?????? ??? ??????? ?? ???
    ????? ???? ????????

???? ??? ?????? ??????? 1
???? ??? ?????? ??????? 2
???? ??? ?????? ??????? 3
???? ??? ?????? n ???????

????? ?????? ??????? ???? ???
12
?????? ? ??????? ??? ?????? ?? ??????? (1975-80)

INTEGRATION
  • ???????? ???? ???? ?? ???? (Location
    Transparency)
  • ? ????? ??????
  • ?????? ?????? ???? ??? ???????? ????? ???
  • ?????? ?????? ?????? ????
    ???? ??? ???????? ?????????
  • ??? ???????? ???? ???? ?? ???? ( LOCATION
    VISIBILITY)
  • ? ??? ???? ????? ??????
  • multiDB views, multi DB access language

  • MULTIDATABASE SYSTEMS
  • ???? ??? ?????? ?????? ? ?? ???? ?????? ?????
  • (files, repositories, knowledge bases,
    spreadsheets, )
  • information exchange protocols / languages

13
??????? ???? ??? ??????? (80-95)
  • Federated Databases

????? Interoperability ??????? ?????? ? ?????
??????? ????? ?? ??? ??? ????? ? ??????? ?? ???
???????
14
??????? ???? ??? ??????? (?????)
Courtesy Oracle
15
?????? ????? ??? ???? ???????? ?????????

Filtering
Integration
Translation Wrappers / Mediators
local export schemas
16
?????? ???? ???? ?? (1995-2000)
  • ?????? ?? ?????? ???? ????

17
????? ???????? ??????????? ?????? ?? ?????? ????
???? ??
  • ????? ??????? ???? ???? ??? ???? ????
  • ??????? ????

18
??????? ???? ?? ??? ??
  • ????? ???????
  • ???? ???? ?? ???? ???? ????? ?? ????? ????? ????
    (????? eBay) ?????? ??? ?? ??? ????? ??????? ???
    ?? ???? ? ??????? ??? ????? ?? ????? ????? ??????
    ??? ???? ???? ?? ???? ???.
  • ?? ?? ????? ??? ?????? ??? ?? ????? ????? ???
    ???? ????? ????? ???? ??? ?? ?? ???? ????????
    ???? ?? ?? ?? ??????? ??? ????? ????? ? ??
    ??????? ???? ?? ??????? ?? ?? ??????? ?? ????
    ?????? ?? ????.
  • ?? ?? ?????? ??? ??????? ?? ??????? ?? ???? ????
    ? ??????? ?? ?????? ?? ???? ???? ???? ??? ???
    ?????? ?????? ????.

19
????? ???????? ???????
20
?? ???????? 4 ???? ???? ??????? ???? ???? ???
???????? ??????
21
???? ???????? (????)
  • ??????? ?????? ?????? ??? ??????

Schema 1
Schema 2
22
???? ????? ???? ???????? (??????)
[Figure: schema diagrams. Schema S1 (OO): Person (Pin, Name), Student (GPA), Faculty (Rank). Schema S2 (relational): Thesis (Phd-advisor, Phd-student, title). The integrated schema (OO): Person (Pin, Name), Student (GPA), Faculty (Rank), PhD Student, Thesis (Title), with Adv. and Student links.]
  • ??? ??? ???? ?? ??????? ??????? ????? ????????

23
??? ?????? ???? ??
  • ???? ???? ???? ??
  • ????? ??? ?????? ??????? ??? ????? ??? ? ??
    ????? ???? ??? ???? ?? ? ??? ????????? ?? ? ?????
    ??
  • ??????? ???? ???? ??
  • ??????? ???? ???? ??? ???????? ?? ???? ???
    ???????? ? ?? ???? ???????
  • ????? ???? ?? (Data transformation)
  • ?????? ? ?? ????? ???? ???? ?? ( Normalization
    and aggregation)
  • ???? ??? ???? ?? (Data reduction)
  • ???????? ???? ?? ?? ???? ???? ?? ???? ?? ?? ?????
    ?????? ????? ???? ????? ???? ???? ?????? ????
    (????? ???? ????? ????? ?????)

24
????? ????????? ?? ?? ??????? ????
  • ????????? ??? ??????
  • ????????? ??? ???

25
????????? ??? ??????
  • Naming Conflicts
  • In any data model, the schemata incorporate
    names for various entities/objects represented by
    them. Since these schemata are designed
    independently, the designer of each schema uses
    his or her own vocabulary to name these objects.
    Objects in different schemata representing the
    same real world concept may contain dissimilar
    names.

26
Semantic Incompatibilities (cont.)
  • Naming Conflicts (cont.)
  • Homonyms: This inconsistency arises when the same name is used for
    two different concepts. For example, 'SALARY' may mean weekly salary
    in one database and monthly salary in another.
  • Synonyms: This type of naming conflict arises when the same concept
    is identified by two or more names. For example, the term
    'DOMESTIC CUSTOMER' in one database may refer to the same concept as
    the term 'BUYERS' in another database.
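
A minimal sketch of how such naming conflicts are often handled: each source gets its own renaming table into a shared vocabulary. The source names (`sales_db`, `payroll_a`, and so on) and the global attribute names are hypothetical and chosen only for illustration.

```python
# Hypothetical per-source renaming tables (not from the slides).
# Synonyms: two local names map to one shared name.
SYNONYMS = {
    "sales_db": {"DOMESTIC CUSTOMER": "customer"},
    "orders_db": {"BUYERS": "customer"},
}

# Homonyms: the same local name maps to different, unambiguous shared names.
HOMONYMS = {
    "payroll_a": {"SALARY": "weekly_salary"},
    "payroll_b": {"SALARY": "monthly_salary"},
}


def to_global(source, record, mapping):
    """Rename the attributes of one record using the table for its source."""
    renames = mapping.get(source, {})
    return {renames.get(name, name): value for name, value in record.items()}


a = to_global("sales_db", {"DOMESTIC CUSTOMER": "Acme"}, SYNONYMS)
b = to_global("orders_db", {"BUYERS": "Acme"}, SYNONYMS)
```

After renaming, `a` and `b` use the same attribute name, while the two 'SALARY' attributes end up under different names and can no longer be confused.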

27
Semantic Incompatibilities (cont.)
Type Conflicts: These conflicts arise when the
same concept is represented by different coding
constructs in different schemata. For example, an
object may be represented as an entity in one
schema and as an attribute in another schema.
28
Semantic Incompatibilities (cont.)
Key Conflicts: Different keys may be assigned
to the same concept in different schemata [15,
46]. For example, ss and EMP-ID may be keys for
employees in two component schemata.
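
One common way to reconcile two keys, sketched here under assumptions, is a cross-reference table from one key to the other. The function name, the sample identifiers, and the table itself are hypothetical. Building such a table requires matching records across the two databases, which is not shown.

```python
def merge_on_shared_key(keyed_by_ss, keyed_by_emp_id, emp_id_to_ss):
    """Merge employee records from two schemata onto the ss key.

    emp_id_to_ss is an assumed cross-reference table (EMP-ID -> ss).
    Records whose EMP-ID has no known ss are reported, not guessed.
    """
    merged = {ss: dict(rec) for ss, rec in keyed_by_ss.items()}
    unmatched = []
    for emp_id, rec in keyed_by_emp_id.items():
        ss = emp_id_to_ss.get(emp_id)
        if ss is None:
            unmatched.append(emp_id)
            continue
        merged.setdefault(ss, {}).update(rec)
    return merged, unmatched


merged, unmatched = merge_on_shared_key(
    {"111": {"NAME": "Ann"}},
    {"E-1": {"DEPT": "R&D"}, "E-2": {"DEPT": "Sales"}},
    {"E-1": "111"},
)
```

Here the record for `E-1` is folded into the employee keyed by `111`, and `E-2` is returned as unmatched because no correspondence is known.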
29
Semantic Incompatibilities (cont.)
Behavioral Conflicts: These conflicts arise
when different insertion/deletion policies are
associated with the same class of objects in
different schemata. For example, in one database,
the relation DEPT may exist without any
employee records being associated with it, whereas
in another database, the deletion of the last
employee record may also delete the relation DEPT
from the database.
30
Semantic Incompatibilities (cont.)
Missing Data: Different attributes may be
defined for the same concept in different
schemata. For example, EMP1(SSN, NAME, AGE) and
EMP2(SSN, NAME, ADDRESS) may represent the same
concept in two database schemata. Attribute 'AGE'
is missing in EMP2, and attribute 'ADDRESS' is
missing in EMP1.
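
A minimal sketch of one way to combine the two employee relations: an outer union on SSN that keeps every attribute and leaves unknown values as `None`. The function name and the sample rows are illustrative assumptions, not part of the slides.

```python
def outer_union(rows_a, rows_b, key="SSN"):
    """Merge two lists of dict rows on `key`, filling absent attributes with None."""
    all_rows = list(rows_a) + list(rows_b)

    # Collect every attribute name seen in either relation, in first-seen order.
    columns = []
    for row in all_rows:
        for name in row:
            if name not in columns:
                columns.append(name)

    merged = {}
    for row in all_rows:
        # Start each entity with all attributes set to None, then fill in.
        target = merged.setdefault(row[key], dict.fromkeys(columns))
        target.update(row)
    return list(merged.values())


emp1 = [{"SSN": "1", "NAME": "Ann", "AGE": 30}]
emp2 = [
    {"SSN": "1", "NAME": "Ann", "ADDRESS": "Main St"},
    {"SSN": "2", "NAME": "Bob", "ADDRESS": "Elm St"},
]
employees = outer_union(emp1, emp2)
```

An employee present in only one relation keeps `None` for the attribute the other relation would have supplied, which makes the missing data explicit instead of hiding it.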
31
Semantic Incompatibilities (cont.)
Levels of Abstraction: This incompatibility is
encountered when information about an entity is
stored at dissimilar levels of detail in two
databases. For example, 'LABOR-COST' and
'MATERIAL-COST' may be stored separately in one
database and combined together as 'TOTAL-COST' in
a second database.
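
As a sketch, the two databases can be brought to the coarser of the two levels by summing the detailed costs. The function name is hypothetical. The reverse direction (splitting a total back into labor and material) is not possible without extra information.

```python
def to_total_cost(record):
    """Return the record's cost at the coarser 'TOTAL-COST' level."""
    if "TOTAL-COST" in record:
        return record["TOTAL-COST"]
    # Detailed source: aggregate the two components.
    return record["LABOR-COST"] + record["MATERIAL-COST"]


detailed = {"LABOR-COST": 120, "MATERIAL-COST": 80}
combined = {"TOTAL-COST": 200}
```

Both sample records yield the same total, so they can be compared or merged at that level.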
32
Semantic Incompatibilities (cont.)
Identification of Related Concepts: For concepts
in the component schemata that are not the same
but are related, one needs to discover all the
inter-schema properties that relate them. For
example, two entities belonging to two different
databases may not be equivalent, but one entity
may be a generalization of the other entity.
33
Semantic Incompatibilities (cont.)
Scaling Conflicts: This incompatibility arises
when the same attribute of an entity is stored in
dissimilar units in different databases. For
example, the attribute 'LENGTH' of an entity may
be stored in centimeters in one database
and in inches in another database.
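
A minimal sketch of resolving the 'LENGTH' example by converting every value to one agreed unit. The unit codes (`"cm"`, `"in"`) and the function name are assumptions. The factor 2.54 cm per inch is the exact definition of the inch.

```python
# Conversion factors to centimeters; the unit codes are illustrative.
CM_PER_UNIT = {"cm": 1.0, "in": 2.54}


def length_in_cm(value, unit):
    """Convert a LENGTH value from its source unit to centimeters."""
    return value * CM_PER_UNIT[unit]


from_db_a = length_in_cm(25.4, "cm")
from_db_b = length_in_cm(10, "in")
```

Because floating-point arithmetic is inexact, converted values are best compared with a small tolerance and not with strict equality.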
34
Quantitative Data Incompatibilities
Different Levels of Precision: Different
databases may store an attribute at
dissimilar levels of precision. For example, one
database may contain the weight of a particular
part to a precision of a milligram,
whereas another database may specify precision
only up to a gram.
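
A sketch of comparing the two weights at the coarser precision. It assumes weights are stored as whole milligrams in one database and whole grams in the other, and that the gram database rounds half up. Both are assumptions, since the slide does not say how the coarser value was obtained.

```python
def mg_to_nearest_gram(weight_mg):
    """Round a non-negative milligram weight to the nearest gram (half up)."""
    return (weight_mg + 500) // 1000


def same_weight(weight_mg, weight_g):
    """Compare a milligram value and a gram value at gram precision."""
    return mg_to_nearest_gram(weight_mg) == weight_g
```

For instance, `same_weight(1499, 1)` and `same_weight(1500, 2)` both hold under the rounding rule assumed here. If the gram database truncates instead of rounding, the helper would need to change accordingly.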
35
Quantitative Data Incompatibilities (Cont.)
Asynchronous Updates: Since each database is
managed independently, the databases may not all
update a value at the same time.
36
Quantitative Data Incompatibilities (Cont.)
Lack of Security: Due to a lack of information
security at component databases, unauthorized
users may have changed the data.
37
Challenges of Bioinformatics Database Management
  • Bioinformatics database formats
  • Flat files: GenBank, EMBL, DDBJ, PDB.
  • Relational databases: HGMD, MGMD.
  • Object-oriented databases: AceDB.
  • XML databases: PIR, SwissProt, InterPro.
  • Characteristics
  • The diversity/variety of data.
  • The representational heterogeneity.
  • Autonomous and web-based sources.
  • Varied interface and query capabilities

38
Motivation
  • Very large heterogeneous databases.
  • Need to link, integrate, and handle complex relations.

39
Volume and Variety
  • Two interacting issues in generating
    information
  • 1. The volume is large:
  • we need automation
  • 2. The data is varied (heterogeneous):
  • many autonomous sources
  • many distinct objectives
  • many incompatibilities, errors

40
Diversity Heterogeneity
  • A wide variety of knowledge is needed to
    interpret the data
  • A large variety of experts is developing this
    knowledge
  • The scope of interests differs among those
    experts
  • The knowledge is expressed in diverse ways
  • The terms differ in precise meaning (semantics)
  • A large variety of data types is needed
  • A wide variety of representations is used
  • The database and file schemas differ
  • The openness and accessibility of the
    information differ

41
Heterogeneity inhibits Integration
  • An essential feature of science
  • autonomy of fields
  • differing granularity and scope of focus
  • growth of fields requires new terms
  • A feature of technological process
  • standards require stability
  • yesterday's innovations are today's
    infrastructure
  • Must be dealt with explicitly
  • sharing, integration, and aggregation are
    essential
  • Large quantities of data require precision

42
Heterogeneity among domains is natural
  • Interoperation creates mismatch
  • Autonomy conflicts with consistency,
  • - Local Needs have Priority,
  • - Outside uses are a Byproduct
  • Heterogeneity must be addressed
  • Platform and Operating Systems
  • Data Representation and Access Conventions
  • Metadata Annotations, Naming, and Ontology
  • needed to share data from distinct sources

43
Obstacles to Integration
  • Data spread over multiple, heterogeneous databases
  • Not all are easily queried
  • flat-file sequence databases, web sites, BLAST
    alignments
  • Some are not even easily parsed!
  • Not all represent biology optimally
  • GenBank is sequence-centric, not gene-centric
  • SwissProt is sequence-centric, not
    domain-centric
  • Hard to keep results up-to-date
  • Non-traditional query approaches are needed to
    exclude extraneous results

44
What are the Data Sources?
  • Flat Files
  • URLs
  • Proprietary Databases
  • Public Databases
  • Data Marts
  • Spreadsheets
  • Emails