XMDR Prototype Progress Report

1
XMDR Prototype Progress Report
  • John McCarthy and Kevin D. Keck
  • XMDR Project Quarterly Meeting
  • 15 March, 2005
  • UC Berkeley Faculty Club

2
Purposes of XMDR Prototype for ISO/IEC 11179
Registry Standard
  • Extend semantics management capabilities
  • Explore uses of terminologies and ontologies
  • Systematize representation of relationships
  • Adapt & test emerging semantic technologies
  • Help resolve registration & harmonization issues
    for different metadata standards
  • Propose revisions to 11179 Parts 2 & 3 (Ver. 3)
  • Show how proposed revisions to metadata registry
    standards can be implemented
  • Demonstrate Reference Implementation (RI)

3
XMDR Prototype Architecture: Initial Implemented
Modules
  • Subversion
  • Java
  • Jena, Xerces
  • Lucene
  • Jena, OWI KS
4
XML files for each metadata object were extracted
from EDR via scripts
  • Extract script: works on the underlying HTML,
    follows each link, and gets the HTML file for each
    linked object
  • Conversion script: works on the HTML files and creates
    an XML file for each
  • First step: all xx items linked from Countries of
    the World
  • Next step: bulk load all metadata item
    instances from EDR
5
HTML to XML file conversion example
  • Show HTML file and then its XML transform
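Since the slide's own example is not included in the transcript, the following is only a guess at what such a conversion might look like. It assumes, hypothetically, that an EDR page presents each attribute as a table row with a label cell and a value cell; the page layout, the `RowParser` class, and the `html_to_xml` function are illustrative names, not taken from the project.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class RowParser(HTMLParser):
    """Pulls (label, value) pairs out of <tr><th>label</th><td>value</td></tr> rows."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cell = None   # "th" or "td" while inside a cell
        self._label = ""

    def handle_starttag(self, tag, attrs):
        if tag in ("th", "td"):
            self._cell = tag

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._cell = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._cell == "th":
            self._label = text
        elif self._cell == "td":
            self.rows.append((self._label, text))


def html_to_xml(html, root_tag):
    """Turn the label/value rows of one page into one XML element per row."""
    parser = RowParser()
    parser.feed(html)
    root = ET.Element(root_tag)
    for label, value in parser.rows:
        # "Registration Status" becomes the tag "registrationStatus"
        words = label.split()
        tag = words[0].lower() + "".join(w.capitalize() for w in words[1:])
        ET.SubElement(root, tag).text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical page content.
HTML_PAGE = """
<table>
  <tr><th>Name</th><td>Country Name</td></tr>
  <tr><th>Registration Status</th><td>Standard</td></tr>
</table>
"""

xml_text = html_to_xml(HTML_PAGE, "DataElement")
print(xml_text)
```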

6
XMDR Prototype contains an XML file for each
metadata item instance
  • Administered Items
  • Classification Schemes
  • Conceptual Domains
  • Contexts (for Administered Items)
  • Data Elements
  • Data Element Concepts
  • Object Classes
  • Properties
  • Representation Classes
  • Value Domains

Other items? Relationships? What else?
7
XMDR Prototype example: dual-purpose RDF/XML
file DC-Country_Label (extract)
<DataElement rdf:about=""
    xmlns:mdr="http://hpcrd.lbl.gov/SDM/XMDR/ont/iso11179-3v2.owl#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    mdr:meaning="DC-Country_Label.xml"
    mdr:registrationAuthority="http://oaspub.epa.gov/edr">
  <identifier>
    <registrationAuthorityIdentifier>
      <internationalCodeDesignator>en-US</internationalCodeDesignator>
      <organizationIdentifier>EPA</organizationIdentifier>
    </registrationAuthorityIdentifier>
  </identifier>
  <administrationRecord>
    <registrationStatus>Standard</registrationStatus>
    <administrativeStatus>Final</administrativeStatus>
    <creationDate>????</creationDate>
  </administrationRecord>
  <steward mdr:organization="ORG-1044"
      mdr:contact="CON-20068"/>
  <terminologicalEntry mdr:entryContext="CXT-Standard">
    <component>
      <sectionLanguage>
        <language>eng</language>
        <countryIdentifier>USA</countryIdentifier>
      </sectionLanguage>
      <designation>
        <name>Country Name</name>
      </designation>
      <definition>
        <text>The name that represents a primary geopolitical unit of the world.</text>
      </definition>
    </component>
  </terminologicalEntry>
  <terminologicalEntry mdr:entryContext="CXT-XML_Tag_Final">
    <component>
      <sectionLanguage>
        <language>eng</language>
        <countryIdentifier>USA</countryIdentifier>
      </sectionLanguage>
      <designation>
        <name>CountryName</name>
      </designation>
    </component>
  </terminologicalEntry>
</DataElement>
boil down contents and add annotations
8
XMDR files serve dual purpose: XML and
OWL-compatible RDF
  • Well-formed XML
  • XML serialization of RDF
  • Conforms with 11179 OWL ontology
  • Base tag includes rdf:about attribute
  • Literals encoded as element content
  • URIs encoded as attribute values
  • Striped syntax (resource, property, resource, ...); uses
    abbreviated form for anonymous nodes
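A rough way to see the two readings of one file is sketched below. The document is a cut-down, hypothetical variant of the DC-Country_Label extract, and the code is not a conforming RDF/XML parser: it only reads the file as ordinary XML and then sorts values according to the two encoding rules in the list above (element content for literals, `mdr:` attributes for references).

```python
import xml.etree.ElementTree as ET

MDR = "http://hpcrd.lbl.gov/SDM/XMDR/ont/iso11179-3v2.owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Cut-down, hypothetical variant of the DC-Country_Label file.
DOC = f"""
<DataElement rdf:about="" xmlns:mdr="{MDR}" xmlns:rdf="{RDF}"
    mdr:registrationAuthority="http://oaspub.epa.gov/edr">
  <administrationRecord>
    <registrationStatus>Standard</registrationStatus>
  </administrationRecord>
  <steward mdr:organization="ORG-1044" mdr:contact="CON-20068"/>
</DataElement>
"""

root = ET.fromstring(DOC)

# View 1: ordinary XML access by element path.
status = root.findtext("administrationRecord/registrationStatus")

# View 2: separate literals (element content) from references
# (attribute values in the mdr namespace).
literals, references = [], []
for elem in root.iter():
    if elem.text and elem.text.strip():
        literals.append((elem.tag, elem.text.strip()))
    for name, value in elem.attrib.items():
        if name.startswith("{" + MDR + "}"):
            references.append((name.split("}")[1], value))

print(status)
print(literals)
print(references)
```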

9
XML schema specifies constraints
  • RELAX NG schema used to make XML files
  • Enforces constraints for 11179 OWL

10
Relationships are implemented as LINKS to other
XML files
DataElement file:
  <entity-type>DataElement</entity-type>
  <name>Country of Birth</name>
  <conceptual domain>Country</conceptual domain>
  <value domain>Countries of the World</value domain>
ConceptualDomain file (link target):
  <entity-type>ConceptualDomain</entity-type>
  <name>Country</name>
ValueDomain file (link target):
  <entity-type>ValueDomain</entity-type>
  <name>Countries of the World</name>
Metadata schema includes relationships that
specify which attributes can or must link to
other entity-types
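One possible way to follow such links and check them against the schema's rules is sketched here. The file names, the `conceptualDomain` and `valueDomain` element names, and the `LINK_RULES` table are all hypothetical; the slides do not say how link targets are actually written, so the sketch simply assumes each link element holds the target file's name.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: file name -> XML content.
FILES = {
    "DE-country_of_birth.xml":
        "<DataElement><name>Country of Birth</name>"
        "<conceptualDomain>CD-country.xml</conceptualDomain>"
        "<valueDomain>VD-countries.xml</valueDomain></DataElement>",
    "CD-country.xml":
        "<ConceptualDomain><name>Country</name></ConceptualDomain>",
    "VD-countries.xml":
        "<ValueDomain><name>Countries of the World</name></ValueDomain>",
}

# Assumed schema rule: which link elements an entity type may carry,
# and which entity type each one must point to.
LINK_RULES = {
    "DataElement": {"conceptualDomain": "ConceptualDomain",
                    "valueDomain": "ValueDomain"},
}


def resolve_links(file_name):
    """Return {link element: name of the linked object} for one file."""
    root = ET.fromstring(FILES[file_name])
    resolved = {}
    for link_tag, expected_type in LINK_RULES.get(root.tag, {}).items():
        target_file = root.findtext(link_tag)
        if target_file is None:
            continue  # optional link not present
        target = ET.fromstring(FILES[target_file])
        if target.tag != expected_type:
            raise ValueError(f"{link_tag} must link to a {expected_type}")
        resolved[link_tag] = target.findtext("name")
    return resolved


links = resolve_links("DE-country_of_birth.xml")
print(links)
```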
11
How can terminologies and ontologies help manage
metadata?
  • At the level of metadata instances in a registry,
    connect metadata entities via shared terms
  • via automatic indexing of metadata words
  • via text values from specific metadata elements
  • At the level of the 11179 (or other) metamodel,
    ontologies can help specify formal relationships
  • is-a and part-of hierarchies, etc.
  • Inheritance, aggregation, etc.
  • for automatic searching of sub-classes & inverses
  • to specify semantic pathways for indexing

12
Protégé Editor was used to create OWL ontology
for 11179 metamodel
  • insert screenshot from Protégé

13
Protégé OWLViz Plug-in Displays OWL Class
Hierarchy for 11179
Built on top of GraphViz
14
Class hierarchy includes simple and union
is-a relationships
15
Is-a and union hierarchies can nest to arbitrary
depth in OWL
16
ISO/IEC 11179 fragment is expressed as an OWL
ontology using RDF syntax
<?xml version="1.0" encoding="ISO-8859-1"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns="http://www.owl-ontologies.com/unnamed.owl#"
    xml:base="http://www.owl-ontologies.com/unnamed.owl">
  <owl:Ontology rdf:about=""/>
  <owl:Class rdf:ID="Registrar">
    <rdfs:subClassOf rdf:resource="http://www.w3.org/2002/07/owl#Thing"/>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:cardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#int">1</owl:cardinality>
        <owl:onProperty>
          <owl:ObjectProperty rdf:ID="contact"/>
        </owl:onProperty>
      </owl:Restriction>
    </rdfs:subClassOf>
    <rdfs:subClassOf>
      <owl:Restriction>
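To show what the restriction in this fragment says, the sketch below reads exact-cardinality restrictions out of OWL written in RDF/XML. It uses a shortened copy of the fragment with closing tags supplied so that it parses; those closing tags and the `cardinalities` helper are assumptions made for the illustration. It treats the file as plain XML and is not an OWL parser.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# Shortened copy of the slide's fragment, closed off so it is well-formed.
ONTOLOGY = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:ID="Registrar">
    <rdfs:subClassOf rdf:resource="{OWL}Thing"/>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:cardinality
            rdf:datatype="http://www.w3.org/2001/XMLSchema#int">1</owl:cardinality>
        <owl:onProperty>
          <owl:ObjectProperty rdf:ID="contact"/>
        </owl:onProperty>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>
"""


def cardinalities(rdf_xml):
    """Return {(class id, property id): n} for exact-cardinality restrictions."""
    found = {}
    root = ET.fromstring(rdf_xml)
    for cls in root.findall(f"{{{OWL}}}Class"):
        class_id = cls.get(f"{{{RDF}}}ID")
        path = f"{{{RDFS}}}subClassOf/{{{OWL}}}Restriction"
        for restriction in cls.findall(path):
            count = restriction.findtext(f"{{{OWL}}}cardinality")
            prop = restriction.find(f"{{{OWL}}}onProperty/{{{OWL}}}ObjectProperty")
            if count is not None and prop is not None:
                found[(class_id, prop.get(f"{{{RDF}}}ID"))] = int(count)
    return found


restrictions = cardinalities(ONTOLOGY)
print(restrictions)
```

Read this way, the fragment states that every Registrar has exactly one contact.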
17
Lucene facilitates text indexing to search on
words & phrases
  • Word Index: birth, country, world
  • Name Index: birth, country
  • Other Indexes
  • Phrase searches done on results
Indexed objects:
  <entity-type>Terminology</entity-type>
  <name>United Nations XXXX</name>

  <entity-type>ValueDomain</entity-type>
  <name>Countries of the World</name>

  <entity-type>ConceptualDomain</entity-type>
  <name>Country</name>

  <entity-type>DataElement</entity-type>
  <name>Country of Birth</name>
  <conceptual domain>Country</conceptual domain>
  <value domain>Countries of the World</value domain>
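The idea of a word index, a field-specific name index, and phrase checks run on the index results can be illustrated with a toy inverted index. This is plain Python, not Lucene, and the record ids and field layout are hypothetical; it only shows the shape of the technique.

```python
import re
from collections import defaultdict

# Hypothetical records modelled on the objects listed in the slide.
RECORDS = {
    "DE-1": {"entity-type": "DataElement", "name": "Country of Birth",
             "conceptual domain": "Country",
             "value domain": "Countries of the World"},
    "CD-1": {"entity-type": "ConceptualDomain", "name": "Country"},
    "VD-1": {"entity-type": "ValueDomain", "name": "Countries of the World"},
}


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def build_index(records, fields=None):
    """Map each word to the ids of records whose chosen fields contain it."""
    index = defaultdict(set)
    for rec_id, record in records.items():
        for field, value in record.items():
            if fields is None or field in fields:
                for word in tokenize(value):
                    index[word].add(rec_id)
    return index


def phrase_search(phrase, index, records):
    """Intersect the word hits, then check the exact phrase on those results."""
    words = tokenize(phrase)
    if not words:
        return set()
    candidates = set.intersection(*(index.get(w, set()) for w in words))
    needle = " ".join(words)
    return {rec_id for rec_id in candidates
            if any(needle in " ".join(tokenize(value))
                   for value in records[rec_id].values())}


word_index = build_index(RECORDS)                   # every field
name_index = build_index(RECORDS, fields={"name"})  # name field only

country_hits = sorted(word_index["country"])
phrase_hits = sorted(phrase_search("countries of the world", word_index, RECORDS))
print(country_hits)
print(phrase_hits)
```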
18
Lucene text search capabilities, with examples
  • Simple word search: country
  • Wild-card word search: coun*
  • Fuzzy or stem search (~): country~
  • Search specified field: name:coun*
  • Search for links: link:http*
  • Distance (in words): "country world"~4
  • Phrases: "countries of the world"
  • Boolean operators: AND, "+", OR, NOT, "-"

http://lucene.apache.org/java/docs/queryparsersyntax.html
19
More text search capabilities
  • Range search
  • Boosting a term
  • Grouping
  • Field grouping
  • Escape special characters

20
Advanced search interface can automatically
generate syntax
  • Field/Column
  • Operator
  • Value
  • Conjunctions
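A form with those four inputs could generate Lucene query syntax along the lines sketched below. The operator names, the `build_query` and `escape` functions, and the example rows are hypothetical; only the generated syntax (field prefix, trailing `*` and `~`, quoted phrases, backslash escaping, AND/OR/NOT) follows the Lucene query parser syntax.

```python
import re

# Characters with special meaning in Lucene query syntax; a backslash
# in front of each one makes the parser treat it literally.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\&|])')


def escape(term):
    return _SPECIAL.sub(r"\\\1", term)


# Hypothetical operator names offered by the form, mapped to query templates.
TEMPLATES = {
    "contains": "{field}:{value}",
    "starts with": "{field}:{value}*",
    "fuzzy": "{field}:{value}~",
    "phrase": '{field}:"{value}"',
}


def build_query(rows):
    """rows: (conjunction, field, operator, value); the first conjunction is ignored."""
    parts = []
    for conjunction, field, operator, value in rows:
        clause = TEMPLATES[operator].format(field=field, value=escape(value))
        if parts:
            parts.append(conjunction)
        parts.append(clause)
    return " ".join(parts)


query = build_query([
    ("", "name", "starts with", "coun"),
    ("AND", "name", "phrase", "countries of the world"),
    ("NOT", "link", "contains", "http://example.org"),
])
print(query)
```

Escaping the value before it is placed in the template keeps user input such as a URL from being read as a field separator.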
21
show Text Queries here
  • show Text Query examples here
  • Queries and results

22
Distinguish text vs RDF queries
  • What text queries do
  • What RDF queries do
  • What are the major differences?

23
Reasoners use OWL ontologies to augment RDF
graph queries
  • Input: RDF Query (rdql/ndql)
  • Reasoner: Jena or Racer (memory)
  • Reasoner draws on: OWL 11179 Ontology, OWL built-in
    rules, 11179 metadata (xml/rdf files)
  • Result set includes subclasses, inverses, etc.
24
Uses of RDF queries
  • Expand queries to other classes in a hierarchy
  • E.g., all data elements that have to do with
    infectious diseases
  • E.g., what attributes are there for images of
    carcinomas vs basal cell
  • Go from concepts back to data elements (DEs) even if
    only have concepts in DEs (do index)
25
RDF Queries are very different from text search
or SQL!
QUERY
  SELECT ?t
  WHERE (<http://erdos.lbl.gov/xmdr/data/VDALL.1.15147.1.xml> rdf:type ?t)
  USING mdr FOR <http://hpcrd.lbl.gov/SDM/XMDR/ont/iso11179-3v2.owl#>
RESULT
  ?t
  <http://hpcrd.lbl.gov/SDM/XMDR/ont/iso11179-3v2.owl#EnumeratedValueDomain>
  <http://hpcrd.lbl.gov/SDM/XMDR/ont/iso11179-3v2.owl#ValueDomain>
  <http://hpcrd.lbl.gov/SDM/XMDR/ont/iso11179-3v2.owl#AdministeredItem>
  <owl:Thing>
  <rdfs:Resource>
search includes sub-classes and
inverses
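Why one asserted type yields several result rows can be shown with a small closure over rdfs:subClassOf. The triples below are assumptions chosen to reproduce the result listing (the slide does not show the ontology's actual subclass statements), and the `types_of` function is a stand-in for what a reasoner does, not Jena or Racer code. The rdfs:Resource row is left out to keep the sketch short.

```python
MDR = "http://hpcrd.lbl.gov/SDM/XMDR/ont/iso11179-3v2.owl#"
ITEM = "http://erdos.lbl.gov/xmdr/data/VDALL.1.15147.1.xml"

# Assumed facts as (subject, predicate, object) triples.
TRIPLES = {
    (ITEM, "rdf:type", MDR + "EnumeratedValueDomain"),
    (MDR + "EnumeratedValueDomain", "rdfs:subClassOf", MDR + "ValueDomain"),
    (MDR + "ValueDomain", "rdfs:subClassOf", MDR + "AdministeredItem"),
    (MDR + "AdministeredItem", "rdfs:subClassOf", "owl:Thing"),
}


def types_of(subject, triples):
    """Asserted types plus every superclass reachable via rdfs:subClassOf."""
    superclasses = {}
    for s, p, o in triples:
        if p == "rdfs:subClassOf":
            superclasses.setdefault(s, set()).add(o)
    pending = [o for s, p, o in triples if s == subject and p == "rdf:type"]
    result = []
    while pending:
        cls = pending.pop(0)
        if cls not in result:
            result.append(cls)
            pending.extend(sorted(superclasses.get(cls, ())))
    return result


asserted = [o for s, p, o in TRIPLES if s == ITEM and p == "rdf:type"]
inferred = types_of(ITEM, TRIPLES)
print(asserted)
print(inferred)
```

Without the closure step the query would return only the single asserted type, which is roughly the difference between querying the raw RDF graph and querying through a reasoner.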
26
Registering/Loading Metadata
  • Browse and upload file(s)
  • Form interface

27
Lessons Learned to Date
  • XML-RDF files can serve dual purpose (may be
    first time this has been done?)
  • Independent modules facilitate testing many
    possible new tools, BUT
  • take more time to implement & maintain
  • State of art for open-source OWL reasoners not
    very advanced (none yet for OWL-DL)

28
Unresolved Issues
  • Performance using files for objects
  • Relationship representation
  • XML objects display & browsing
  • User-friendly interface for RDF queries
  • External data sources, ontologies, terminologies
    (maybe via indexing?)
  • Harmonization with MMR
  • Others??

29
Next Steps
  • Load data from EDR & other sources
  • Implement advanced text search UI
  • Use style sheets to display xml as html
  • Add validation and mapping modules
  • Connect terminology/ontology to items