Terminology and the Semantic MediaWiki - Ecoterm IV Vienna

Provided by: hsol

1
Terminology Curation with the Semantic MediaWiki
  • Harold Solbrig
  • Informatics Architect
  • Apelon, Inc.

2
(No Transcript)
3
The Primary Task
  • Evaluate the roles, categories and organization
    of the National Cancer Institute (NCI)'s Cancer
    Thesaurus with respect to:
  • Upper Level Ontological Principles
  • ISO TC37 Related principles
  • As with Ontology construction, it was understood
    by all parties that this was a process, not a
    goal.

4
Approach
  • Gather appropriate upper level ontologies (BFO,
    DOLCE, Top Bio, UMLS Semantic Network and OBO
    Relations Ontology) into a single, readily
    referenced format
  • Load NCI Thesaurus into same format
  • Multiple parties review, annotate, recommend and
    categorize
  • Publish, analyze and evaluate results

5
Solution
  • By using the Semantic MediaWiki (SMW), we were
    able to accomplish all of the goals in a (very)
    reasonable period of time

6
Discussion
  • We also discovered that, with some extensions,
    the SMW could be useful for publishing,
    annotating and cross-referencing other
    terminological (and other) resources.

7
Questions?
  • just kidding.

8
Wikis
  • Community developed
  • Collaborative
  • Organic to the very core
  • Primary focus (to date) is human consumption
  • Traceable, provenance automatically recorded,
    differences, undo and redo.

9
MediaWiki
  • http://en.wikipedia.org/wiki/Wiki
  • Base for Wikipedia and many others
  • Key characteristics:
  • Web based editing
  • Page links
  • Categories
  • Templates

10
MediaWiki
  • Fully documented using (surprise!) MediaWiki
  • Rich mechanisms for discussion, curation, export,
    etc.

11
(No Transcript)
12
Common constructs
  • [[Train Transport]]: hyperlink to page named
    Train_Transport
  • ''Italic'', '''Bold'''
  • * Bullet point
  • [http://www.w3c.org/ The W3C]: external hyperlink
  • and much more

13
Templates
14
Templates
15
Sample Template
Extension call
Parameter
16
Semantic MediaWiki
17
Semantic MediaWiki
  • Three key extensions to MediaWiki:
  • Categories = Class
  • PageA [[Category:X]] ⇒ pageA rdf:type
    categoryX
  • CategoryY [[Category:X]] ⇒ categoryY
    rdfs:subClassOf categoryX
  • Links = Role
  • PageA [[PageB]] ⇒ PageA [[hasPart::PageB]]
  • Attributes = DataProperty
  • [[population:=32,154,773]]
  • Includes datatypes
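
The mapping on this slide, from categories, typed links and attributes to RDF-style statements, can be sketched in a few lines of Python. This is an illustration only, not how SMW itself is implemented. It assumes the older annotation syntax ([[Category:X]], [[relation::Target]], [[attribute:=value]]); the function name page_to_triples and the regular expressions are hypothetical.

```python
import re

# Assumed annotation patterns (older SMW-style syntax; hypothetical for this sketch).
CATEGORY = re.compile(r"\[\[Category:([^\]|]+)\]\]")
RELATION = re.compile(r"\[\[([^\]:|]+)::([^\]|]+)\]\]")
ATTRIBUTE = re.compile(r"\[\[([^\]:|]+):=([^\]|]+)\]\]")


def page_to_triples(title, wikitext):
    """Return (subject, predicate, object) triples for one wiki page."""
    # A category page placed in a category is a subclass;
    # any other page placed in a category is an instance.
    is_category = title.startswith("Category:")
    membership = "rdfs:subClassOf" if is_category else "rdf:type"
    triples = [(title, membership, "Category:" + c.strip())
               for c in CATEGORY.findall(wikitext)]
    # Typed links become relations between pages.
    triples += [(title, rel.strip(), target.strip())
                for rel, target in RELATION.findall(wikitext)]
    # Attributes become data properties with literal values.
    triples += [(title, attr.strip(), value.strip())
                for attr, value in ATTRIBUTE.findall(wikitext)]
    return triples
```

Calling page_to_triples("PageA", "[[Category:X]] [[hasPart::PageB]]") yields one rdf:type statement and one hasPart statement; the same category tag on a page titled "Category:Y" yields rdfs:subClassOf instead.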

18
Categories and Relations
19
Attributes
20
Semantic Rendering
RDF (!)
Relation
Attribute Value
Type (or superClass)
21
Thesaurus Content
22
Templates?
Gene_Product_Is_Biomarker_Type
The role is used to designate the type of
Kind [[Category:NCI_Kind]]
Semantic Type [[NCI_Semantic_Type::Category:SN_Conceptual_Entity|Conceptual Entity]]
Brittle, not readily changed
23
Templates?
{{OntylogDescription|ns=NCI|text=The role is used to designate}}
{{Kind|ns=NCI|target=Kind}}
{{ResourceRef|name=Semantic_Type|ns=NCI|target=Conceptual_Entity|targetns=SN}}
Can readily be updated via template
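
Why a template call is easier to update than hand-written markup can be sketched as follows. The sketch assumes the calls on this slide use standard MediaWiki {{Name|key=value}} syntax, for example {{ResourceRef|name=Semantic_Type|ns=NCI|target=Conceptual_Entity|targetns=SN}}. The function render_resource_ref is a hypothetical stand-in for a template body; the real template definitions are not shown in the slides.

```python
def parse_template_call(call):
    """Split '{{Name|key=value|...}}' into (name, {key: value})."""
    if not (call.startswith("{{") and call.endswith("}}")):
        raise ValueError("not a template call: " + call)
    name, *parts = call[2:-2].split("|")
    params = {}
    for part in parts:
        key, _, value = part.partition("=")
        params[key.strip()] = value.strip()
    return name.strip(), params


def render_resource_ref(params):
    """Hypothetical template body: emit a typed link to a category page.

    Changing this one function changes the markup generated for every
    page that uses the template, which is the point of the slide.
    """
    target_ns = params.get("targetns", params["ns"])
    page = f"{target_ns}_{params['target']}"
    label = params["target"].replace("_", " ")
    return f"[[{params['ns']}_{params['name']}::Category:{page}|{label}]]"
```

A page then stores only the short call; the generated annotation can be changed in one place without touching the pages themselves.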
24
Link to another NCI comment
Link to external Ontology
Categorization in external Ontology
Commentary
25
Computed
26
How is it Working?
  • Very well!

27
What can we do to improve it?
28
Terminology
  • Centrally curated
  • Central to the practice of medicine
  • Insurance and reporting
  • Regulatory
  • Research
  • Clinical Practice
  • Information Sharing
  • ICD-9, CPT-4, SNOMED,

29
Clinical Terminology
  • Quality and content are important
  • Needs central vetting, integration, QA
  • Central model doesn't scale
  • Need input from (many) experts
  • Need visible, active feedback loop

30
Terminology Workflow 1995
(Diagram labels: Books, PDF; Distribution; Controlled Terminology; Lists and Tables; Curation; steps numbered 1 to 4)
31
Terminology Workflow 1995
(Diagram labels: Books, PDF; Distribution; Controlled Terminology B; Lists and Tables; Curation; steps numbered 1 to 3)
32
Terminology Workflow 2008
(Diagram labels: Common Distribution Model; Distribution; Controlled Terminology; Online Services; Curation; steps numbered 1 to 5)
33
Terminology Workflow 2008
(Diagram labels: Controlled Terminology B; Common Distribution Model; Distribution; Controlled Terminology; Online Services; Curation; steps numbered 1 to 5)
34
Common Distribution Model
  • LexGrid
  • (a little bit of) OWL
  • NCI Thesaurus and SNOMED CT
  • Still requires LexGrid-like additions
  • Pushing the envelope
  • UMLS RRF
  • Although underspecified as a model

35
Online Services
  • OMG Terminology Query Services
  • Not heavily used
  • Perceived (incorrectly) as CORBA specific
  • Perceived as too complex
  • Object oriented and stateful
  • ANSI Common Terminology Services
  • Being adopted
  • Necessary but not sufficient
  • Stateless
  • CTS-2
  • Co-development beginning with HL7 and OMG

36
Online Services
  • LexBIG
  • LexGrid for the Bio Informatics Grid
  • Robust query specification
  • Meets many end-user (developer) requirements
  • Not simple to implement; it actually adds value
  • Not a standard - but will be used to guide CTS-2

37
Workflow and Feedback
(Diagram labels: Common Distribution Model; Distribution; Controlled Terminology; Online Services; Curation; steps numbered 1 to 5)
38
The Feedback Component
Curation
39
The Feedback Component
(Diagram labels: Common Distribution Model; Semantic MediaWiki; Distribution; Annotations and Change Requests; Online Services; Community Review; Version Staging; Curation)
40
Issues and Next Steps
  • (1) SHARED Semantics
  • Definition
  • Synonym
  • References
  • DLSome
  • DLAll
  • ISO 12620, anyone?

41
Issues and Next Steps
  • (2) Figure out namespaces
  • NCI:Activity, AgroVoc:Fish,
  • NCI_Activity, AgroVoc_Fish
  • ???
  • (2a) Identifiers (Activity vs. C12345)
  • (2b) Versions
  • (2c) URIs (vs. URLs)
  • Internal
  • External
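
The open questions in item (2) can be made concrete with a small sketch: one function builds a wiki page title from a namespace and a name, another builds a URI from a namespace, an identifier and an optional version. Everything here is an assumption for illustration. The base address http://example.org/terminology, the path layout and both function names are hypothetical, and the slides do not settle on any of these choices.

```python
from urllib.parse import quote

# Hypothetical base address; the slides do not name a resolver.
BASE = "http://example.org/terminology"


def wiki_page_name(namespace, name, separator="_"):
    """Build a wiki page title such as 'NCI_Activity' from a namespace and a name."""
    return f"{namespace}{separator}{name.replace(' ', '_')}"


def concept_uri(namespace, identifier, version=None, base=BASE):
    """Build a URI for a concept, optionally pinned to a terminology version."""
    parts = [namespace] + ([version] if version else []) + [identifier]
    # Percent-encode each path segment so odd characters cannot break the URI.
    return base + "/" + "/".join(quote(p, safe="") for p in parts)
```

Keeping the separator and the version as parameters makes the trade-offs visible: a page can be titled by name (Activity) for readers while its URI uses the stable code (C12345).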

42
Certification and Sanctioning
  • Who can edit?
  • Who can validate?
  • Who selects updates?
  • (see http://en.citizendium.org/wiki/Main_Page)

43
Automatic Export
  • Selecting sets of updates
  • Formatting update recommendations for target
    curators, etc

44
Synchronization
  • Changes implemented in terminology
  • Update wiki pages
  • Say what changed
  • What changes are incorporated by value? By
    reference?
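
The "say what changed" step can be sketched as a comparison of two releases of a terminology. The sketch assumes each release is available as a simple mapping from concept code to preferred label; that representation, the function name diff_terminology and the sample codes in the usage note are hypothetical.

```python
def diff_terminology(old, new):
    """Compare two releases given as {code: label} dictionaries.

    Returns the codes that were added, removed, or relabelled, each as a
    sorted list, so the wiki pages that need updating can be identified.
    """
    old_codes, new_codes = set(old), set(new)
    return {
        "added": sorted(new_codes - old_codes),
        "removed": sorted(old_codes - new_codes),
        "changed": sorted(c for c in old_codes & new_codes if old[c] != new[c]),
    }
```

For example, comparing {"C1": "Activity", "C3": "Kind"} with {"C1": "Activities", "C4": "Role"} reports C4 as added, C3 as removed and C1 as changed.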

45
Approach and Responsible Parties
  • Shared Semantics
  • Core set based on LexGrid and OWL
  • Post on wiki and link on SMW site
  • Assigned to Apelon, Mayo, NCI, ???
  • Extend to OBO, SKOS (?), XMDR
  • Connections to ISO 12620

46
Time Frame and Assignments
  • URIs, namespaces, naming
  • UK NCR (CancerGrid) looking at unAPI and
    servers
  • (Hopefully) can provide URI resolver service
  • Short term: use templates / extensions

47
Content
  • SNOMED-CT, ICD-9-CM, many, many others are
    already available via Apelon DTS Services
  • Available soon
  • FMA, HL7 Version 3 Terminology, OBO Foundry (GO,
    PATO, etc.) as time permits
  • Others as needed (and funded)

48
What we've got to date
  • Apelon DTS Server Extension
  • Includes both defined and classified view (!)
  • Export in RESTful XML (currently Apelon, soon to
    be LexGrid)
  • XMDR Export Format
  • Protégé (Native and OWL 3.2) prototype
  • Done by Mayo
  • Both import and export
  • Still needs templates

49
Questions?
  • This time for real
  • Note: SMW will be made externally available (with
    a simple password) once we get contract-specific
    info cleaned up (NCI will probably publish
    shortly). Contact hsolbrig_at_apelon.com for
    access.