Formal Aspects Of Protege - PowerPoint PPT Presentation

1
Formal Aspects Of Protege
  • William Grosso
  • Stanford Medical Informatics
  • Stanford University

2
Overview
  • Interoperability is important
  • HPKB: DARPA project with many participants
  • Protégé-2000: lots of developers in many
    locations
  • Ray can't write code fast enough!
  • Interoperability requires common ground
  • Shared semantics for common constructs
  • The new Knowledge Model

3
Proposed HPKB Scenario
[Diagram labels: PSM (three times); SRI; MIT; SMI; Knowledge Base(s) in a KB Server; Shared Ontologies; Situation Data]
4
Knowledge Bases in HPKB
  • Ontologies are ways to share well-defined
    information
  • Define knowledge structure
  • Useful as a coupling mechanism
  • Knowledge Bases serve multiple roles
  • Repositories of shared knowledge
  • Community blackboards (with semantics).

5
Interoperability requires Semantics
  • As long as all the developers are in the same
    building, things can be underspecified
  • Rely on group knowledge and established
    practice
  • Larger working groups (over time, space, or in
    numbers of people), can require more precise
    specifications

6
Knowledge Models
  • Formal specification of the way knowledge is
    represented
  • Precise, human-readable definitions of structures
    in a language
  • Frequently unwritten
  • Implied by the documentation
  • Deduced via experience

7
Knowledge Models at SMI
  • Work spurred by the OKBC Specification
  • Defining the Protégé Knowledge Model
  • Comparing it to other knowledge models
  • Goal: enable Protégé tools to interoperate with
    knowledge-based systems from other labs
  • Goal is knowledge reuse
  • Implicit hypothesis: understanding knowledge
    models will facilitate interoperation

8
Example: Protégé and Loom
  • Protégé: a suite of tools to simplify knowledge
    base design and construction
  • Design ontologies, create KA tools to acquire
    instances
  • Explicitly adopts notion of external PSMs in
    order to focus on KA
  • Loom: an environment for knowledge-based system
    construction
  • Everything done inside the Loom environment

9
Frame-Based Knowledge Models
  • Both Protégé and Loom use frame-based knowledge
    models
  • Classes, instances, slots, facets, ...
  • We expect differences over things like default
    values and models of time
  • But the knowledge models differ on more mundane
    notions as well

10
What's a Slot?
  • Protégé/Win
  • Slots are not part of the global namespace
  • Define attributes of a frame
  • Cannot be referred to independently of either a
    class or an instance
  • Which slots are attached to an instance is part
    of the class definition
  • Loom
  • Slots are part of the global namespace
  • Defined by defrelation construct
  • Have attributes
  • domain, range, ...
  • Slots can be reified
  • Instances of a slot class correspond to a
    specific relation (between two instances)

11
What's an Instance?
  • Protégé/Win
  • Every instance is a direct instance of a single
    specified class
  • Automatically has the own slots defined by the
    class
  • No other slots allowed
  • Direct instance typing cannot change.
  • To change type at all, need to do explicit
    operations on the class
  • Loom
  • Type of an instance does not have to be specified
  • Classifier deduces instance types
  • Types of instances can change (without being
    explicitly set)
  • Instances can be direct instances of more than
    one class

12
Interoperation ?
  • Two different development environments
  • Two different user models
  • Two different approaches to KA
  • Two different knowledge models
  • Both frame based
  • Disagree on the definitions of commonly used
    structures
  • Solution: adopt the OKBC knowledge model

13
Protégé-2000 Is Like HPKB
  • Ray can't write the code fast enough
  • Therefore someone else has to write it
  • Protégé-2000 allows everyone to customize it
    using Java components
  • If we glue together components written at
    multiple labs, and knowledge bases produced by
    many different people, we might inadvertently
    introduce the same issues

14
Components
[Diagram labels: Central Framework; Widget (several); Storage Model (two)]
  • Central Framework: provided by SMI. Plumbing that
    cannot be replaced or augmented.
  • Widgets: mediate between the knowledge base and
    the user. They display small pieces of the
    knowledge base in a way that the user can
    understand and manipulate. SMI provides a generic
    set of default widgets.
  • Storage Model: every running application uses a
    storage model for persistence. SMI currently
    provides two (CLIPS format and RDBMS format).

15
Widgets
  • Widgets can be added to the platform (using
    JavaBeans)
  • There is a well-defined Widget API for building
    new widgets and adding them to a project
  • Widgets can now be arbitrarily complex
  • Dialogs are used to configure widgets
  • State is stored into a separate knowledge base
    (the project knowledge base)

16
Storage Models
  • Protege/Win stored knowledge bases in a
    CLIPS-compatible format
  • The goal for Protege-2000 is to use a
    wide variety of persistence mechanisms
  • CLIPS-format is still useful
  • OKBC servers are important
  • Relational databases could be useful
  • To do this, we need to isolate out the
    persistence mechanism as a component

17
Axioms and Constraints
  • Protege/Win used a frame-based language
  • Protege-2000 keeps the emphasis on frames, but
    adds in a constraint language
  • Based on KIF
  • Compatible with OKBC

18
The Actual Knowledge Model
19
Knowledge Models
  • Formal specification of the way knowledge is
    represented
  • Precise, human-readable definitions of structures
    in a language
  • Gives guarantees of what must hold in the
    knowledge base
  • Other things may be true, in addition to what the
    knowledge model guarantees
  • Protege adopts the OKBC knowledge model

20
The Role of Logic
  • Frames are intuitive for humans
  • Concept / instance distinction dates back to
    Plato
  • But they're not very well-defined
  • What Minsky meant by frame is not what Winograd
    meant by frame (and is certainly not what Plato
    meant by form)
  • We use logic to formalize the definitions
  • Make the underlying assumptions explicit

21
KIF
  • Knowledge Interchange Format
  • Developed in early 1990s as a standard syntax
    for first order logic
  • entirely ASCII and somewhat LISPy
  • (forall ?x (exists ?y (......)))
  • Currently a draft standard
  • http://logic.stanford.edu/kif/dpans.html
  • Slight peculiarity: relations can have multiple
    arities
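
Because KIF syntax is parenthesized prefix notation, a sentence can be read into a nested list with very little code. The following is a minimal sketch only, not a conforming KIF reader: it ignores strings, comments, and quoting, and the relation name parent-of is a hypothetical example that does not come from the slides.

```python
def tokenize(text):
    """Split a KIF-like string into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Parse one expression from the front of the token list (consumed in place)."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # drop the closing parenthesis
        return items
    return token


def read_kif(text):
    """Read exactly one expression and reject anything left over."""
    tokens = tokenize(text)
    expr = parse(tokens)
    if tokens:
        raise ValueError("unexpected trailing tokens: %r" % tokens)
    return expr


# "parent-of" is a made-up relation used only for illustration.
sentence = read_kif("(forall ?x (exists ?y (parent-of ?y ?x)))")
print(sentence)
```

The nested-list form keeps the quantifier, the bound variable, and the body as separate elements, which is convenient for any later processing.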

22
Frames
  • A Frame is simply a symbol
  • A symbol is simply a 0-ary relation
  • That is, it can be an argument to a function or a
    predicate
  • That is, it is something we can make assertions
    about
  • Types of frames include most of the traditional
    modelling constructs (classes, instances, slots ,
    ...)

23
Classes
  • Classes are frames (are symbols ....)
  • Classes are also unary predicates
  • KIF allows multiple arity predicates
  • That is, classes are sets (the set of instances)
  • Members of the set = instances of the class.
  • You can assert things about the class (using the
    fact that the class is a frame)
  • You can reason about the elements of the
    associated set

24
Defining Subclasses
  • Subclass usually means two things
  • All instances of the subclass are instances of
    the superclass
  • Anything that is true of the superclass (as a
    class) is true of the subclass
  • The first of these is simply subset

(=> (subclass-of ?S ?P)
    (forall ?F (=> (?S ?F) (?P ?F))))
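
The set reading of subclass can be tried out on a toy knowledge base. This is a sketch under assumptions: the class names and instances are invented, classes are modeled as plain Python sets of their instances, and the check only tests the subset condition that subclass-of implies on this slide (it does not decide whether subclass-of was actually asserted).

```python
# Hypothetical toy knowledge base: each class name maps to its set of instances.
extension = {
    "Person": {"alice", "bob", "carol"},
    "Patient": {"alice", "bob"},
    "Drug": {"lasix"},
}


def holds_class(cls, frame):
    """A class read as a unary predicate: true when frame is an instance."""
    return frame in extension[cls]


def subset_condition(sub, sup):
    """Check (forall ?F (=> (?S ?F) (?P ?F))) over the finite universe."""
    universe = set().union(*extension.values())
    return all((not holds_class(sub, f)) or holds_class(sup, f) for f in universe)


print(subset_condition("Patient", "Person"))
print(subset_condition("Person", "Patient"))
```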
25
Multiple Inheritance
  • Easy to define in this model
  • For set aspects, simply use subclass = subset
  • A set can be a subset of more than one class
  • As frames, enforce substitutability
  • Any sentence that can be asserted about the
    superclass, as a class, ought to be true of the
    subclass
  • Winds up being union of logical statements

26
Slots
  • Slots are frames (are symbols ...)
  • Slots are also binary predicates (taking a frame
    and a value)
  • Slots also have associated predicates
  • binary (take a slot and a frame, formalize the
    notion of attachment)
  • ternary (take a slot, a frame, and a value)

Binary: template-slot-of, slot-of
Ternary: template-slot-value, slot-value
27
Attaching a Slot
  • Slots are frames that get attached to other
    frames
  • Attaching a slot to a class, for example
  • You can attach a slot as either a template slot
    or an own slot
  • template slots define information that can be
    propagated to elements of a class (and via
    inheritance)
  • own slots are strictly local information

28
Slots Propagation
[Diagram: propagation of template (T) and own (O) slots along instance-of and subclass-of links; some paths end in /dev/null]
29
Restating this in KIF
(=> (template-slot-value ?S ?C ?V)
    (and (template-slot-of ?S ?C)
         (=> (instance-of ?I ?C) (holds ?S ?I ?V))
         (=> (subclass-of ?X ?C)
             (template-slot-value ?S ?X ?V))))
30
Restating this in English
If V is a template slot value of S on the class
C, then we know the following three things:
1. S has been attached to C as a template slot
2. V is an own slot value for all instances I of C
3. V is a template slot value for all subclasses X
   of C
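
The three consequences can be sketched as a small propagation routine. This is an illustration under assumptions, not Protégé code: the class hierarchy, the instances, and the slot name species are all hypothetical, and the knowledge base is held in plain Python containers.

```python
from collections import defaultdict

# Hypothetical hierarchy: direct subclass-of and direct instance-of links.
subclasses = {"Person": ["Patient"], "Patient": []}
instances = {"Person": ["carol"], "Patient": ["alice"]}

template_slot_of = set()           # (slot, class) attachments
template_slot_value = set()        # (slot, class, value) triples
own_slot_value = defaultdict(set)  # (slot, instance) -> values, i.e. (holds ?S ?I ?V)


def assert_template_value(slot, cls, value):
    """Record a template slot value and everything it implies."""
    template_slot_of.add((slot, cls))            # 1. attachment
    template_slot_value.add((slot, cls, value))
    for inst in instances[cls]:                  # 2. own value on instances
        own_slot_value[(slot, inst)].add(value)
    for sub in subclasses[cls]:                  # 3. template value on subclasses
        assert_template_value(slot, sub, value)


assert_template_value("species", "Person", "human")
print(sorted(template_slot_value))
print(dict(own_slot_value))
```

Recursing into subclasses means instances of a subclass also receive the value, which matches reading instance-of as covering indirect instances.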
31
Restating this in Swedish
Om V är värdet på en mallegenskap S på klassen
C, så vet vi följande tre saker:
1. S har kopplats till C som en mallegenskap
2. V är ett eget värde på egenskapen för alla
   instanser I av C
3. V är värdet på mallegenskapen för alla
   underklasser X av C
32
Instances
  • An instance is a frame
  • The idea of instance is, more or less, a GUI
    notion (and has no implications for the knowledge
    model)

33
Facets
  • Facets are frames (and symbols ...)
  • Facets are also ternary predicates (taking a
    frame, a slot, and a value)
  • Facets also have associated predicates
  • ternary (take a slot, a frame, and a facet,
    formalize the notion of attachment)
  • 4-ary (take a slot, a frame, a facet and a value)

template-facet-of facet-of
34
Facet Restrictions
  • Template facets can only be attached to template
    slots
  • Having a value implies attachment
  • Similarly for own slots

(=> (template-facet-of ?F ?S ?C)
    (template-slot-of ?S ?C))
(=> (template-facet-value ?F ?S ?C ?V)
    (template-facet-of ?F ?S ?C))
35
Facet Propagation
  • Facets are attached to (frame, slot) pairs
  • Whenever a slot propagates, from one frame to
    another, the facets are carried along

[Diagram: facet propagation along subclass-of for template (T) and own (O) slots; one path ends in /dev/null]
36
Canonical Facets
  • The standard facets are local (e.g. at a single
    (frame,slot) pair) constraints

VALUE-TYPE, CARDINALITY, NUMERIC-MINIMUM, NUMERIC-MAXIMUM
(=> (VALUE-TYPE ?S ?F ?C)
    (and (class ?C)
         (=> (holds ?S ?F ?V)
             (instance-of ?V ?C))))
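
Because a facet such as VALUE-TYPE is local to one (frame, slot) pair, checking it only needs the values stored at that pair. The sketch below assumes a dictionary-based knowledge base; the slot takes, the frames, and the deliberately wrong value bob are all made up for illustration.

```python
# Hypothetical data: class extensions, slot values, and VALUE-TYPE facet values.
class_members = {"Drug": {"lasix", "aspirin"}, "Person": {"alice", "bob"}}
slot_values = {("takes", "alice"): ["lasix", "bob"]}  # (slot, frame) -> values
value_type = {("takes", "alice"): "Drug"}             # (slot, frame) -> class


def value_type_violations(slot, frame):
    """Return the values at (slot, frame) that are not instances of the facet's class."""
    cls = value_type.get((slot, frame))
    if cls is None:
        return []  # no VALUE-TYPE facet attached, so nothing to check
    return [v for v in slot_values.get((slot, frame), [])
            if v not in class_members[cls]]


print(value_type_violations("takes", "alice"))
```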
37
OKBC Revisited
  • Protégé-2000 knowledge-bases are OKBC-compliant
  • Protégé-2000 is not OKBC generic
  • There are OKBC knowledge bases that Protégé-2000
    cannot handle
  • It's close, though!
  • Differences are KA-related
  • Instances are instances of exactly one class
  • The role slot

38
Desiderata for a Constraint Language
  • William Grosso
  • Stanford Medical Informatics
  • Stanford University

39
Overview
  • Examples of Constraints
  • Design Desiderata
  • The Constraint Language
  • Implementation Decisions
  • The Default Implementation
  • Dimensions for Evolution

40
Desiderata for the Language
41
The Big Modular Picture of Protege
[Diagram labels: Widgets (six); Core Protege Framework; Storage Model; Constraint Engine; Actual KB]
42
Full and formal semantics
  • Widgets can include widgets for acquiring
    specific types of constraints
  • Multiple constraint engines are possible
  • Performing different checks at different times
  • Replacing one engine with another
  • The entire kb gets stored out to some server
  • Without formal semantics (a logical theory), this
    is just not possible

43
Compatibility with the OKBC knowledge model
  • OKBC does not specify an axiom language
  • OKBC is specified as a set of relations in KIF
  • Classes are unary predicates, slots are binary
    predicates, ...
  • All of these relations should immediately be
    accessible from within the constraint language
  • And the constraint engine should give them the
    right semantics

44
Ease of Translation
  • Important goal: we want to be able to use Protege
    as a front-end to a wide variety of knowledge
    base servers
  • This means that the constraint language ought to
    be easily translated into a wide variety of
    constraint languages
  • At the very least, figuring out what can be
    translated ought to be easy

45
Supported by a reasonable default implementation
  • KMG will provide a default implementation of the
    constraint language
  • Not very efficient
  • But good semantics for KA
  • Good enough to bootstrap the process
  • As we learn more about constraints, and how they
    are used, we hope that people with real expertise
    will step forward

46
A Deficient Syllogism
Major Premise: Interoperability requires formal
semantics (and knowledge models based on
mathematical logic)
Minor Premise: Humans don't easily adapt to formal
languages
Conclusion: Widgets!!!!!!!
47
Human Readability is a Red Herring
  • The casual user interacts with forms
  • The expert user knows about classes and instances
  • Very few users know about the underlying logical
    formalism
  • If we design widgets for acquiring constraints,
    then the user will never see the constraint
    language

48
The Constraint Language
49
A Single Constraint Language
  • Constraint language is really an interlingua for
    communication
  • Between widgets and the framework
  • Between the framework and the storage model
  • If we want all the components to evolve
    independently and communicate gracefully, we need
    to fix a single constraint language

50
Logic
  • We decided on a variant of KIF
  • We use the KIF connectives and the KIF syntax
  • Not all the KIF constants and predicates are
    included
  • Our theory of arithmetic is much smaller
  • (defrelation ...) is omitted
  • For now ?

51
Sorted Logic
  • Two new constructs in the language
  • defset: allows the user to define a bag of
    values.
  • Similar to notion of class, but with no support
    in the ontology tab
  • Useful for enumerated types
  • defrange: all variables must have their types
    declared
  • Types can include things like "is a target of
    slot name"

52
Reified Constraints
  • There is a knowledge-base for constraints
  • Acquiring a constraint is really acquiring an
    instance of Constraint
  • You can annotate sentences and relations with
    useful information
  • You can store constraints out to a vanilla
    frame-based system
  • To a simple KB server, a constraint is just
    another frame

53
The Constraint KB
  • To use constraints, you must include the
    constraint knowledge base
  • Will also contain default implementation of
    engine (as a tab widget)
  • Will also include java code for the standard
    relations
  • Will also include widgets for constraint
    acquisition
  • Won't include any instances

54
(No Transcript)
55
Constraints and Axioms
  • Constraints and Axioms use the syntax of logic
    but have different semantics
  • Axioms can be used to assert new knowledge
  • Constraints are restrictions on existing
    knowledge
  • (forall ?x (exists ?y (rel-name ?x ?y)))
  • Asserted as an axiom: it's reasonable to create a
    Skolem constant and bind it to ?y
  • Asserted as a constraint: we might not want to
    Skolemize
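
The two readings of the same sentence can be contrasted on a finite set of facts. This is only a sketch under assumptions: the universe and the relation pairs are invented, and the axiom reading is simplified to inventing one fresh constant (named sk1, sk2, ...) per unmatched ?x.

```python
import itertools


def check_constraint(universe, rel):
    """Constraint reading: report every x that has no y with rel(x, y)."""
    related = {x for (x, _) in rel}
    return sorted(x for x in universe if x not in related)


def assert_axiom(universe, rel):
    """Axiom reading: add a fresh Skolem constant for every x lacking a y."""
    counter = itertools.count(1)
    new_rel = set(rel)
    related = {x for (x, _) in rel}
    for x in sorted(universe):
        if x not in related:
            new_rel.add((x, "sk%d" % next(counter)))
    return new_rel


universe = ["a", "b", "c"]   # hypothetical frames
rel = {("a", "b")}           # hypothetical facts for rel-name

print(check_constraint(universe, rel))
print(sorted(assert_axiom(universe, rel)))
```

The constraint reading leaves the knowledge base untouched and produces a list of problems for the user, while the axiom reading changes the knowledge base so that the sentence becomes true.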

56
Multiple Interpretations of a Single Theory
  • No engine can return true when OKBC would
    return false
  • In model-theoretic terms: if an engine thinks
    there is a model, then there must be one
  • But engines are free to overlook models

57
New functions and predicates are implemented
procedurally
  • KIF has the (defrelation ...) construct to define
    new relations
  • Our point of view: a relation is, almost always,
    something that should be defined in the ontology
  • The exceptions (mostly n-ary relations) should be
    annotated explicitly and defined procedurally

58
(No Transcript)
59
Universal Implementation Decisions
60
The Language is defined in a Knowledge-Base
  • PAL: Protege Axiom Language
  • The PAL knowledge-base contains
  • The constraint ontology
  • The default relations
  • And the java code that implements them
  • The default implementation
  • Once again, taking advantage of knowledge-base
    inclusion

61
Enforcement of constraints is not necessarily
real-time
  • When the user loads (or saves) a knowledge-base,
    it should be consistent
  • It's not always possible for the user to
    have a consistent KB while editing
  • And, even if it were possible, it might be
    inconvenient.
  • Therefore, the user should decide when to check
    constraints

62
Enforcement via plug-ins (and tabs)
  • The basic way users will interact with constraint
    engines will be via tabs and widgets
  • We want to enable special types and categories of
    constraints to be annotated
  • Basic mechanism: subclassing Constraint
  • We want to have multiple possible engines,
    depending on context and user preference
  • Constraint tabs are just another way of
    interacting with the KB.

63
Two Important Consequences of these Decisions
64
What is a knowledge base ?
  • Used to be classes and instances
  • Now also includes widgets
  • Java code !
  • Now also includes constraints
  • Instances with an interpretation beyond the
    standard meaning associated to frames
  • Custom pieces of java code that implement new
    relations (possibly domain specific) for the
    constraint language

65
We have evolved from OKBC to some extent
  • If we use the ontology as a type system, it is
    convenient to have the types be mutually
    exclusive (instances are instances of a single
    class)
  • The role predicate

66
The Default Implementation
67
Model-checking, rather than theorem proving
  • Make strong closed world assumptions
  • Main goals
  • Detect incomplete entry of information
  • Check entered information for inconsistencies
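
Model checking under a closed world assumption can be sketched as a recursive evaluator over a finite set of facts: anything not recorded counts as false, and quantifiers range over the known frames. This is an assumption-laden illustration, not the default engine; the sentence representation (nested lists), the relation names Patient and has-age, and the data are all hypothetical.

```python
def evaluate(sentence, facts, universe, env=None):
    """Return True if the sentence holds in the finite model (facts, universe)."""
    env = env or {}
    op = sentence[0]
    if op in ("forall", "exists"):
        _, var, body = sentence
        results = [evaluate(body, facts, universe, {**env, var: value})
                   for value in universe]
        return all(results) if op == "forall" else any(results)
    if op == "and":
        return all(evaluate(part, facts, universe, env) for part in sentence[1:])
    if op == "or":
        return any(evaluate(part, facts, universe, env) for part in sentence[1:])
    if op == "not":
        return not evaluate(sentence[1], facts, universe, env)
    if op == "=>":
        premise, conclusion = sentence[1], sentence[2]
        return (not evaluate(premise, facts, universe, env)
                or evaluate(conclusion, facts, universe, env))
    # Atomic sentence: substitute bound variables, then look the fact up.
    # Closed world assumption: a fact that is not recorded is false.
    atom = tuple(env.get(term, term) for term in sentence)
    return atom in facts


# Hypothetical knowledge base with an incomplete entry: bob has no age.
universe = ["alice", "bob", "42"]
facts = {("Patient", "alice"), ("Patient", "bob"), ("has-age", "alice", "42")}
every_patient_has_age = [
    "forall", "?p",
    ["=>", ["Patient", "?p"], ["exists", "?a", ["has-age", "?p", "?a"]]],
]

print(evaluate(every_patient_has_age, facts, universe))
print(evaluate(every_patient_has_age, facts | {("has-age", "bob", "42")}, universe))
```

The first call reports the incomplete entry; once the missing fact is added, the same sentence holds.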

68
Envisioned Constraints are mostly Local
  • The more false this assumption is, the worse
    the engine will perform (the better a traditional
    theorem prover would perform?)

69
Dimensions for Evolution
70
Richer axiom ontology
  • Subclassing our ontology to provide more detailed
    information
  • Hints to enforcement engines
  • "This is best validated using subroutine x" or
    "This statement is complexity level gamma"
  • Statement could be generated by a widget
  • Your widget, in your domain, generating PAL
    statements for my engine to check
  • Formal Semantics necessary
  • Engines might let the user check a subset of the
    theory

71
More Predicates and Functions
  • Not many are included in the default
    implementation
  • Mostly for reasoning about types, arithmetic, and
    slot values (taking transitive closures)
  • Over time, we hope that people will implement
    predicates and pass the code to us (for inclusion
    as part of the Protege distribution)
  • Note also that relations don't have to be general
    -- you can add knowledge-base specific relations

72
Other engines
  • In particular, a theorem prover ?
  • Can GSAT be used as a preprocessing step ?
  • How about the work on ALL ?

73
Support for Knowledge-Acquisition
  • The knowledge-model is done
  • The axiom language is done (as a spec)
  • Engines are a mere matter of programming
    (similar things have been done for 25 years now)
  • What's left?

74
Subclassing the PAL Ontology to provide hooks for
widgets ?
  • CONSTRAINT only provides two slots (pragmatics
    and sentence)
  • How about other slots
  • Evaluation cost (for different engines) ?
  • Evaluation hints ?
  • What widget generated the axiom ?
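
One way to picture the subclassing idea is as a record type with the two existing slots and a subclass that adds the proposed ones. This is a sketch under assumptions: only the slot names pragmatics and sentence come from the slides; the subclass name, the extra field names, and the widget name are hypothetical.

```python
from dataclasses import dataclass, asdict


@dataclass
class Constraint:
    """The two slots named on this slide."""
    sentence: str
    pragmatics: str


@dataclass
class AnnotatedConstraint(Constraint):
    """Hypothetical subclass carrying the extra slots asked about above."""
    evaluation_cost: str = "unknown"
    evaluation_hints: str = ""
    generated_by: str = "unknown"


constraint = AnnotatedConstraint(
    sentence="(forall ?x (exists ?y (rel-name ?x ?y)))",
    pragmatics="Every frame needs at least one rel-name value.",
    evaluation_cost="local",
    generated_by="ExampleWidget",  # made-up widget name
)

# To a simple frame store, the constraint is just a frame with slot values.
frame = asdict(constraint)
print(frame)
```

An engine that only knows about Constraint can still read sentence and pragmatics, while a more specialized engine can use the extra slots.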

75
No A is a B
  • A statement that is often enforced by defining
    separate classes
  • But often not
  • No hemophiliac should be taking Lasix
  • Do we really want Hemophiliac as a subclass of
    Person ?
  • Do we really want Lasix_Taker as a subclass of
    Patient ?

76
Let's write it in PAL
(forall ?P (=> (and (Person ?P)
                    (has-disease ?P Hemophilia))
               (not (taking-drug ?P Lasix))))
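
Read as a constraint, this statement is a check over the existing Person instances rather than a new class definition. The sketch below assumes instances stored as dictionaries of slot values; the people are invented, and the drug is taken to be Lasix as on the previous slide.

```python
# Hypothetical Person instances with their slot values.
persons = {
    "alice": {"has-disease": {"Hemophilia"}, "taking-drug": {"Lasix"}},
    "bob": {"has-disease": {"Hemophilia"}, "taking-drug": set()},
    "carol": {"has-disease": set(), "taking-drug": {"Lasix"}},
}


def violations(people):
    """Return the persons who have Hemophilia and are taking Lasix."""
    return sorted(
        name for name, slots in people.items()
        if "Hemophilia" in slots["has-disease"] and "Lasix" in slots["taking-drug"]
    )


print(violations(persons))
```

No Hemophiliac or Lasix_Taker class is needed; the constraint picks out the offending instances directly.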
77
This is really a Venn Diagram
[Venn diagram: two sets labeled Person with an empty intersection; each is captioned "Partially filled out instance defines matching"]
78
Widgets play a role here
  • Widget is placed on screen to mediate between
    humans and KB
  • Widget generates PAL statements
  • Engine interprets PAL statements
  • User may or may not ever see PAL

79
Things that are done
  • The knowledge model is done
  • The constraint language is done
  • The default implementation is designed (and
    partially implemented)

80
Things that we will do
  • Finish the default implementation
  • Publish a full spec (as a Tech Report) ?
  • Serve as a clearinghouse for engines and widgets