Reflections on "The Challenge of Environmental Data Interoperability on the Global Information Grid"

1
Reflections on "The Challenge of Environmental
Data Interoperability on the Global Information
Grid" by Dobey and Eirich (05S-SIW-133)
  • Dr. Dale D. Miller
  • Geo-Spatial Technologies, Inc.
  • Seattle, WA
  • dmiller_at_gsti3d.com
  • Dr. Paul A. Birkel
  • The MITRE Corporation
  • McLean, VA
  • pbirkel_at_mitre.org

2
Dobey / Eirich Tenets
  • Paradigm shift: TPED → TPPU
  • Interposed mediating software layer and LDM
    for data conversion between schemas
  • LDM to consist of atomic concepts
  • Three schema architecture

TPED: Task-Process-Exploit-Disseminate
TPPU: Task-Post-Process-Use
LDM: Logical Data Model
3
Outline
  • Recent Relevant Efforts
  • Mappings and Semantic Nuances
  • Losslessness and Correctness
  • Normalization: the Silver Bullet?
  • Dynamic Semantic Content Update: Implications of
    the GIG
  • Conclusions and Recommendations

4
Recent Relevant Efforts
  • U.S. Army Geospatial Data Integrated Master Plan
    (AGDIMP)
  • Geospatial Intelligence Database Integration
    (GIDI)
  • Multilateral Interoperability Programme (MIP)
    C2IEDM
  • GDI FACT REDM, DREDM and the Dobey / Eirich
    Mediating Layer Environmental Data Model (MLEDM)

5
U.S. Army Geospatial Data Integrated Master Plan
(AGDIMP)
  • Developed in preparation for the FCS and the
    Joint Geospatial Enterprise Services (J-GES)
    Initial Capabilities Document (ICD)
  • Vision
  • Soldier as a sensor with a two-way data flow
    between the area of operations (AO) and
    geospatial databases at data repositories
  • Key recommendations
  • Joint geospatial data dictionary
  • Standard ontology
  • Integrated, end-to-end geospatial process
  • Real-time, on-site geospatial data updates
    broadly and rapidly disseminated
  • Brilliant push dissemination
  • Machine readable metadata
  • Procedures and standards for fusion, conflation,
    filtering, transformation, etc.

6
Similarities and Differences: AGDIMP vs. Dobey /
Eirich
  • Both utilize TPPU paradigm
  • Both utilize 3-schema architecture and mediating
    layer
  • Implicit in AGDIMP
  • AGDIMP envisions geospatial data repositories
    based upon emerging NSG standards
  • The AGDIMP is the Army's implementation of the
    National System for Geospatial Intelligence
    Geospatial Transition Plan (GTP). The
    GTP provides an overarching vision, concept of
    operations, and an implementation plan to assist
    the NSG in providing for global geospatial
    readiness and responsiveness that is current,
    accurate, relevant, and interoperable, and fully
    supports the Common Operational Picture.
  • AGDIMP does not reduce environmental concepts to
    atomic forms

NSG: National System for Geospatial-Intelligence
7
Geospatial Intelligence Database Integration
(GIDI)
  • Three-schema architecture to integrate existing
    NGA databases and production tools
  • Fielded in 2002
  • Continuous use by GIG-based Operations community
    via NGA Gateway
  • Evolved to meet Homeland Security and other
    requirements
  • Being integrated into the Geospatial-Intelligence
    Knowledge Base (GKB) in early-FY06
  • The Geospatial Intelligence Feature Database
    (GIFD) serves the role of a MLEDM
  • Supports data exchange between ESRI-based
    and Intergraph-based geospatial data
    environments through mappings and a common
    datastore
  • Data dictionary is FACC Ed. 2.1 with US National
    extensions
  • No atomic data elements or canonical forms for
    environmental concepts

8
Multilateral Interoperability Programme (MIP)
C2IEDM
  • Actual implementation of a three-schema
    architecture
  • Here, the C2IEDM serves the role of a MLEDM
  • Producers and consumers are national C2
    Information Systems (C2IS)
  • No attempt at atomistic reduction
  • Semantic interoperability
  • Proposal for a data access stack with multiple
    abstraction layers, access control and
    notification services

"In order to ensure true semantic
interoperability, far-reaching modifications to
the core of national C2ISs are necessary rather
than just the addition of mapping adapters as new
interfaces to the existing systems." -- M.
Schmitt, "Integration of the MIP C2IEDM into
National Systems"
9
GDI FACT REDM, DREDM and the MLEDM
  • DREDM: an adjudicated union of EDMs
  • Losslessly supports representation of concepts of
    all constituents
  • Uses a common data dictionary
  • Approach taken by NGA GIDI
  • REDM: an adjudicated intersection of EDMs
  • Intended to express the common semantics to
    support the closely-coupled deep interoperation
    of multiple systems
  • Advantages of the atomic concept MLEDM over the
    DREDM
  • MLEDM very powerful if uniqueness of
    representation can be attained (an open question)
  • Issues remain with mapping composite concepts

10
Lessons Learned from Previous Efforts
  • Army has established a technical vision and
    recommended policies to foster the interchange
    and interoperability of geospatial data in the
    context of the GIG and GES (i.e., the AGDIMP)
  • While advocating a Joint geospatial data
    dictionary and data model, it does not envision
    success as predicated on atomic decomposition
  • The three-schema architecture has been
    implemented previously in the environmental
    domain and has aided in the interchange and
    interoperability of environmental data (e.g., the
    GIDI)
  • Successful while operating above the atomic
    level
  • Rigorous normalization of a rich data model is a
    complex undertaking (e.g., the C2IEDM)
  • No existing implementation (of which we are
    aware) has attempted to develop or leverage an
    environmental logical data model composed
    entirely of irreducible (atomic) conceptual
    elements

GIG: Global Information Grid
GES: GIG Enterprise Services
11
Mappings and Semantic Nuances
  • Dobey and Eirich state:
  • "Another potential tradeoff exists in cases
    where use of a mediating layer does not provide
    for a lossless representational match for a
    source data item. In this case, there may be
    nuances of semantics that are lost in the
    translation."
  • On mapping complexity, Schmitt states (C2IEDM
    context):
  • "The required mapping rules can be very complex
    in practice. In particular, this holds in cases
    in which there is no clear 1:1 mapping of
    concepts. For instance, n attributes of the ODB
    [operational data base] might have to be mapped
    onto m attributes in the C2IEDM where the
    attributes may be distributed over several
    entities."

12
Nuance Examples Abound (especially for aggregate
concepts)
13
Losslessness and Correctness
  • Assertion: All lossless mappings are correct,
    but the converse is false
  • RIVER → WATERCOURSE is correct but lossy
  • RECYCLING_SITE → AB010 Wrecking Yard/Scrap Yard
    is incorrect
  • Dobey / Eirich analogy:
  • "Another analogy for this transfer might be a
    chemical reaction, where atoms contained in
    molecules from one or more substances are
    exchanged and reassembled into molecules of one
    or more different substances. The common
    interchange hub provides a mechanism wherein the
    disassembly and reassembly of data objects can
    take place."
  • The concepts of losslessness and correctness
    elucidate flaws in the analogy

14
Mapping Aggregate to Atomic Concepts
[Figure: FACC Composite Concept → ? → MLEDM Atomic Concepts]
  • How does instance data actually map?
  • Chemical reaction analogy fallacy
  • Molecules are the conjunction of their elements
    while aggregate environmental features are the
    disjunction of possible specific types
  • When two hydrogen atoms and one oxygen atom
    combine to form one H2O molecule, both sides of
    the equation still have three atoms: two hydrogen
    and one oxygen
  • But an aggregate feature instance is only one of
    the possible feature types comprising the feature
    type (concept) definition
  • There are simply no building blocks to
    disassemble and recombine

15
Normalization: the Silver Bullet?
  • While there are rigorous definitions of nth
    normal forms in relational database theory, an
    intuitive description is:
  • Every expressible semantic can be reduced to a
    unique canonical form
  • A stated goal of the C2IEDM; however, Schmitt
    states:
  • "The MIP community is continually improving the
    model but there will always be some unresolved
    problems."
  • Is normalizing a logical data model of fine
    grained, atomic environmental concepts tractable?
  • Take, e.g., the atomic concept of RIVER

16
John Sowa's Example: river and stream vs. fleuve
and rivière
  • In English, size is the feature that
    distinguishes river from stream
  • In French, a fleuve is a river that flows into
    the sea, and a rivière is either a river or a
    stream that flows into another river
  • Life experiences color our interpretation of
    words, and, no matter how precise and rigorous
    the definitions in an environmental data
    dictionary, not everyone will agree on all
    nuances of their meanings
  • Reducing a rich environmental data model to
    normal form is a monumental task which may not
    even be possible
  • Objective determination of the atomicity of
    environmental concepts may not be possible
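
The river/stream versus fleuve/rivière example can be sketched as a pair of translation functions. This is a rough illustration using only the distinctions stated on the slide; the function names and parameters are hypothetical, and cases the slide does not define return `None`.

```python
from typing import Optional


def english_to_french(term: str, flows_into_sea: Optional[bool]) -> Optional[str]:
    """Translate using drainage, the feature French cares about."""
    if term == "river" and flows_into_sea is True:
        return "fleuve"
    if term in ("river", "stream") and flows_into_sea is False:
        return "rivière"
    return None  # drainage unknown, or a case the slide does not define


def french_to_english(term: str, is_large: Optional[bool]) -> Optional[str]:
    """Translate using size, the feature English cares about."""
    if term == "fleuve":
        return "river"
    if term == "rivière" and is_large is not None:
        return "river" if is_large else "stream"
    return None  # size unknown: river and stream cannot be told apart


print(english_to_french("river", flows_into_sea=True))
print(english_to_french("river", flows_into_sea=None))
print(french_to_english("rivière", is_large=None))
```

Neither direction is a term-for-term mapping: each needs an attribute that the other language's term does not carry, so instance data lacking that attribute cannot be translated with certainty.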

Many issues will require thought and adjudication
by thoughtful people to arrive at a good solution
for a particular context. But there is probably
no perfect one.
17
Dynamic Semantic Content Update: Implications of
the GIG
  • NCOW tenet: Allow communication of all
    information of interest to all interested
    parties all the time
  • What about a new semantic concept?
  • Must be incorporated in MLEDM
  • Example:
  • Consumer declares interest in geospatial feature
    SHOE_FOOTPRINT with attributes MANUFACTURER and
    MODEL and respective enumerants BRUNO_MAGLI and
    LORENZO
  • If not in MLEDM, how does consumer specify
    requirement?
  • Autonomic updates to MLEDM judged infeasible for
    foreseeable future
  • Human in the loop at a central hub of the data
    interchange design process
  • Contrary to design philosophy of the GIG, but no
    available alternatives
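
The SHOE_FOOTPRINT scenario can be sketched as a dictionary lookup with a human review queue. This is a minimal illustration only: the MLEDM fragment, the RIVER entry and its enumerants, and all function names are hypothetical, and the analyst step is simulated by a plain function call.

```python
# Hypothetical MLEDM fragment: feature -> attribute -> allowed enumerants.
mledm = {
    "RIVER": {"NAVIGABILITY": {"NAVIGABLE", "NOT_NAVIGABLE"}},
}

pending_review = []  # requests waiting for a human analyst


def is_expressible(model, feature, attributes):
    """True if the feature, its attributes and its enumerants are all in the model."""
    known = model.get(feature)
    if known is None:
        return False
    return all(
        attr in known and enums <= known[attr]
        for attr, enums in attributes.items()
    )


def declare_interest(model, feature, attributes):
    """Accept the request if expressible; otherwise queue it for adjudication."""
    if is_expressible(model, feature, attributes):
        return True
    pending_review.append((feature, attributes))
    return False


def analyst_approves(model, feature, attributes):
    """Stand-in for the human in the loop: add the new concept to the model."""
    entry = model.setdefault(feature, {})
    for attr, enums in attributes.items():
        entry.setdefault(attr, set()).update(enums)


request = ("SHOE_FOOTPRINT", {"MANUFACTURER": {"BRUNO_MAGLI"}, "MODEL": {"LORENZO"}})

print(declare_interest(mledm, *request))  # concept not yet in the model
analyst_approves(mledm, *request)         # manual step at the central hub
print(declare_interest(mledm, *request))  # now expressible
```

The consumer cannot even state the requirement until someone extends the model, which is the bottleneck the slide describes.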

Autonomic: "An approach to self-managed computing systems
with a minimum of human interference. The term
derives from the body's autonomic nervous system,
which controls key functions without conscious
awareness or involvement." IBM Corp, Autonomic
Computing Glossary,
http://www.research.ibm.com/autonomic/glossary.html
http://www.cnn.com/US/9701/23/shoe.sales/
18
Conclusions
  • Three-schema architecture
  • Tried-and-true and also well established in the
    environmental domain
  • Mappings from aggregate concepts to aggregate
    concepts
  • Inherently problematic, usually lossy and often
    incorrect
  • Analogy between chemical recombinations and
    aggregate-level data model mappings
  • Fundamentally flawed
  • Chemical recombinations are represented by
    conjunctions of atomic elements
  • Aggregate-level environmental concepts are
    represented by disjunctions of their atomic
    building blocks (subtypes)
  • Logical data model reduction to a maximally
    normalized (nth normal) form
  • Feasible in constrained (one might say
    artificial) domains
  • E.g., banking, inventory control, and shipping
    and receiving
  • Problematical for a real-world environmental data
    model
  • Ultimately suffers the ambiguities, redundancies
    and nuances of natural language
  • Significant programs (e.g., the MIP C2IEDM) have
    not yet demonstrated success
  • Unable to eliminate all ambiguities and
    redundancies
  • Unable (yet, anyway) to guarantee the mapping of
    an arbitrary real world concept to a (unique)
    canonical form
  • GIG tenet (all information of interest, to all
    interested parties, all the time)
  • When a producer or consumer needs a new semantic
    concept, can it autonomically publish its
    description and interested parties make immediate
    use of it?
  • We think not; human analysts will be employed in
    this regard for some time

19
Recommendations
  • Basic tenets sound
  • Three-schema architecture, mediating layer LDM in
    normal form, and well constructed ontology
  • But no substitute for hard work by SMEs to work
    out and document the semantics of the domains of
    interoperating COIs
  • GES mediation services can implement the results,
    but hard work to reach cross-COI agreement
    remains the necessary precondition
  • Aggregate concepts should continue to be included
    in EDMs
  • But with explicit, instance-based relationships
    to their more specific subtypes
  • Further research and development investment is
    required to meet the environmental data
    interoperability challenge
  • Environmental data model atomization
  • COI-driven EDM mappings
  • Computational aspects of linguistics modeling
  • Data fusion and conflation

COI: Community of Interest