Ontology Basics

1
Ontology Basics
  • Email: chingyeh_at_cse.ttu.edu.tw
  • URL: http://www.cse.ttu.edu.tw/chingyeh

2
Selected From
  • Natalya Fridman Noy and Deborah L. McGuinness.
    "Ontology Development 101: A Guide to Creating
    Your First Ontology". Stanford Knowledge Systems
    Laboratory Technical Report KSL-01-05 and
    Stanford Medical Informatics Technical Report
    SMI-2001-0880, March 2001.
  • Deborah L. McGuinness. "Ontologies Come of Age".
    In Dieter Fensel, Jim Hendler, Henry Lieberman,
    and Wolfgang Wahlster, editors. Spinning the
    Semantic Web: Bringing the World Wide Web to Its
    Full Potential. MIT Press, 2002.
  • Ora Lassila and Deborah L. McGuinness. "The Role
    of Frame-Based Representation on the Semantic
    Web". KSL Tech Report Number KSL-01-02.
    Submitted for publication, January, 2001.

3
Ontologies Come of Age
4
Abstract
  • Ontologies have moved beyond the domains of
    library science, philosophy, and knowledge
    representation, and become the concerns of
    marketing departments, CEOs, and mainstream
    business.
  • Ontologies play critical roles in
  • support of browsing and search for e-commerce,
    and in
  • support of interoperability for facilitation of
    knowledge management and configuration (Forrester
    Research report).
  • Ontologies are used as central controlled
    vocabularies that are integrated into catalogues,
    databases, web publications, knowledge management
    applications, etc.

5
Abstract
  • Large ontologies are essential components in many
    online applications including
  • search (such as Yahoo and Lycos),
  • e-commerce (such as Amazon and eBay),
  • configuration (such as Dell and PC-Order), etc.
  • Ontologies have long life spans, sometimes in
    multiple projects (such as UMLS, SIC codes,
    etc.).
  • Such diverse usage generates many implications
    for ontology environments.

6
The Web's Growing Needs
  • HTML pages contain information about how to
    present content on a page; they target human
    readers rather than programs or automatic
    readers.
  • For the current search engines such as Google,
    finding the exact information is not as easy as
    one would hope.
  • The result is a rank-ordered list of pages rather
    than a direct answer to the query.
  • Web pages typically do not contain markup
    information about the contents of the page.

7
The Web's Growing Needs
  • The next generation of the web aims at pages for
    consumption by machines or programs.
  • Its markup languages are aimed at marking up
    content and services instead of just presentation
    information.
  • XML, RDF, RDFS, DAML, etc. are becoming more
    accepted as users and application developers see
    the need for more understanding of what is
    available from web pages.

8
The Web's Growing Needs
Berners-Lee's Architecture
9
The Web's Growing Needs
  • The markup languages at the base (just above
    Unicode) are used for term specification (or, in
    web speak, resource definition).
  • In the ontology layer, we can define terms and
    their relationships to other terms.
  • In the logic layer, we can deduce information,
    thereby allowing us to deduce implications of the
    term definitions and relationships.

10
Ontologies
  • Merriam-Webster (which dates the word to circa
    1721) provides two definitions:
  • a branch of metaphysics concerned with the nature
    and relations of being and
  • a particular theory about the nature of being or
    the kinds of existents.
  • While ontologies (even formal ontologies) have
    had a long history, they remained largely the
    topic of academic interest among philosophers,
    linguists, librarians, and knowledge
    representation researchers until somewhat
    recently.

11
Ontologies
  • Ontologies have been gaining interest and
    acceptance in computational audiences (in
    addition to philosophical audiences).
  • Fields embracing ontologies [Guarino 1998]:
  • knowledge engineering, knowledge representation,
    qualitative modeling, language engineering,
    database design, information retrieval and
    extraction, and knowledge management and
    organization
  • Also including areas of
  • library science [Dublin Core 1999],
    ontology-enhanced search (e.g., eCyc
    (http://www.e-Cyc.com/) and FindUR [McGuinness
    1998]), possibly the largest one, e-commerce
    (e.g., Amazon.com, Yahoo Shopping, etc.), and
    configuration.

12
Ontologies
  • Here we will be restricting our sense of
    ontologies to those we see emerging on the web.
  • One widely cited definition of an ontology is
    Gruber's [Gruber 1993]: "A specification of a
    conceptualization."

13
Ontologies
  • Ontologies can be used to provide a concrete
    specification of term names and term meanings.

14
Ontology Spectrum
  • One of the simplest notions of a possible
    ontology may be a controlled vocabulary, i.e., a
    finite list of terms.
  • Catalogs are an example of this category.
  • Catalogs can provide an unambiguous
    interpretation of terms; for example, every use
    of a term, say "car", will denote exactly the
    same identifier, say 25.
  • Another potential ontology specification is a
    glossary (a list of terms and meanings).
  • The meanings are specified typically as natural
    language statements.
  • This provides a kind of semantics or meaning
    since humans can read the natural language
    statements and interpret them.
  • Typically, however, interpretations are not
    unambiguous, so these specifications are not
    adequate for computer agents and do not meet the
    criterion of being machine processable.

15
Ontology Spectrum
  • Thesauri provide some additional semantics in
    their relations between terms.
  • They provide information such as synonym
    relationships.
  • In many cases their relationships may be
    interpreted unambiguously by agents.
  • Typically thesauri do not provide an explicit
    hierarchy (although with narrower and broader
    term specifications, one could deduce a simple
    hierarchy).

16
Ontology Spectrum
  • Informal is-a
  • Early web specifications of term hierarchies,
    such as Yahoo's, provide a basic notion of
    generalization and specialization.
  • The hierarchy is not a strict subclass or is-a
    hierarchy, however.
  • This mixing of categories such as accessories in
    web classification schemes is not unique to
    Yahoo; it appears in many web classification
    schemes.
  • Without true subclass (or true is-a)
    relationships, we will see that certain kinds of
    deductive uses of ontologies become problematic.

17
Ontology Spectrum
  • Formal is-a
  • If A is a superclass of B, then if an object is
    an instance of B it necessarily follows that the
    object is an instance of A.
  • For example, if Dress is a subclass of
    Apparel and MyFavoriteDress is an instance of
    Dress, then it follows that MyFavoriteDress
    is an instance of Apparel.
  • Strict subclass hierarchies are necessary for
    exploitation of inheritance.
  • Formal instance relationships
  • Some classification schemes only include class
    names while others include ground individual
    content.
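
The formal is-a inference described on this slide can be sketched in a few lines of Python. This is only an illustration, not something from the original slides: the dictionaries SUBCLASS_OF and INSTANCE_OF, the helper names, and the extra ConsumerProduct class are assumptions, and single inheritance is assumed for brevity.

```python
# Hypothetical toy taxonomy: each class maps to its direct superclass.
# "ConsumerProduct" is an assumed extra level, added to show transitivity.
SUBCLASS_OF = {"Dress": "Apparel", "Apparel": "ConsumerProduct"}
# Each individual maps to the class it was directly asserted to belong to.
INSTANCE_OF = {"MyFavoriteDress": "Dress"}


def superclasses(cls):
    """Return cls followed by all of its ancestors, nearest first."""
    chain = [cls]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def is_instance_of(individual, cls):
    """True if the individual belongs to cls directly or via a strict is-a chain."""
    return cls in superclasses(INSTANCE_OF[individual])


print(is_instance_of("MyFavoriteDress", "Apparel"))  # True
```

Because every link is a strict subclass link, the membership test only has to walk upward; with an informal hierarchy the same walk could produce wrong conclusions.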

18
Ontology Spectrum
  • Frames
  • Here classes include property information.
  • For example,
  • the Apparel class may include properties of
    price and isMadeFrom.
  • My specific dress may have a price of 100 and
    may be made from cotton.
  • Properties become more useful when they are
    specified at a general class level and then
    inherited consistently by subclasses and
    instances.
  • In a consumer hierarchy, a general category like
    consumer product might have a price property
    associated with it.
  • Frames were introduced by Minsky [Minsky 1975]
    and have been widely adopted.

19
Ontology Spectrum
  • A more expressive point in the ontology spectrum
    includes value restrictions.
  • Here we may place restrictions on what can fill a
    property.
  • For example, a price property might be
    restricted to have a filler that is a number (or
    a number in a certain range) and isMadeFrom may
    be restricted to have fillers that are a kind of
    material.
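
A value restriction can be pictured as a predicate attached to a property. The sketch below is a minimal illustration under stated assumptions: the MATERIALS set, the price range, and the names RESTRICTIONS and violations are all hypothetical and do not come from the slides.

```python
# Assumed set of known materials, for illustration only.
MATERIALS = {"cotton", "wool", "silk"}

# Hypothetical value restrictions on two properties of the Apparel class.
RESTRICTIONS = {
    # price must be a number inside an assumed range
    "price": lambda v: isinstance(v, (int, float)) and 0 <= v <= 10_000,
    # isMadeFrom must be filled with a kind of material
    "isMadeFrom": lambda v: v in MATERIALS,
}


def violations(individual):
    """Return the names of properties whose fillers break a restriction."""
    return sorted(
        prop for prop, value in individual.items()
        if prop in RESTRICTIONS and not RESTRICTIONS[prop](value)
    )


my_dress = {"price": 100, "isMadeFrom": "cotton"}
bad_dress = {"price": "cheap", "isMadeFrom": "plastic"}
print(violations(my_dress))   # []
print(violations(bad_dress))  # ['isMadeFrom', 'price']
```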

20
Ontology Spectrum
  • As ontologies need to express more information,
    their expressive requirements grow.
  • For example, we may want to fill in the value of
    one property based on a mathematical equation
    using values from other properties.
  • Some languages allow ontologists to state
    arbitrary logical statements.
  • Very expressive ontology languages such as that
    seen in Ontolingua [Farquhar et al. 1997] or CycL
    allow ontologists to specify first order logic
    constraints between terms and more detailed
    relationships such as disjoint classes, disjoint
    coverings, inverse relationships, part-whole
    relationships, etc.

21
Ontology Spectrum
  • Here we will require the following properties to
    hold in order to consider something an ontology:
  • Finite controlled (extensible) vocabulary
  • Unambiguous interpretation of classes and term
    relationships
  • Strict hierarchical subclass relationships
    between classes
  • Properties that are typical but not mandatory:
  • Property specification on a per-class basis
  • Individual inclusion in the ontology
  • Value restriction specification on a per-class
    basis
  • Properties that may be desirable but are neither
    mandatory nor typical:
  • Specification of disjoint classes
  • Specification of arbitrary logical relationships
    between terms
  • Distinguished relationships such as inverse and
    part-whole

22
Simple Ontologies and Their Uses
  • Simple ontologies are not as costly to build and,
    potentially more importantly, many are available.
  • Examples:
  • DMOZ (www.dmoz.org) leverages over 35,000
    volunteer editors and, at publication time, had
    over 360,000 classes in a taxonomy.
  • The Unified Medical Language System (UMLS,
    http://www.nlm.nih.gov/research/umls/), developed
    by the National Library of Medicine, is a large
    sophisticated ontology about medical terminology.
  • Some companies such as Cycorp (www.cyc.com) are
    making available portions of large, detailed
    ontologies.

23
Simple Ontologies and Their Uses
  • They provide a controlled vocabulary. Common term
    usage is a start for interoperability.
  • A simple taxonomy may be used for site
    organization and navigation support.
  • Taxonomies may be used to support expectation
    setting. It is an important user interface
    feature that users be able to have realistic
    expectations of a site.
  • Taxonomies may be used as umbrella structures
    from which to extend content, e.g., UNSPSC
    (Universal Standard Products and Services
    Classification, www.unspsc.org).
  • Taxonomies may provide browsing support.
  • Taxonomies may be used to provide search support.
  • Taxonomies may be used to provide sense
    disambiguation support.

24
Structured Ontologies and Their Uses
  • They can be used for simple kinds of consistency
    checking.
  • Ontologies may be used to provide completion.
  • Ontologies may be able to provide
    interoperability support.
  • Ontologies may be used to support validation and
    verification testing of data (and schemas).
  • Ontologies containing markup information may
    encode entire test suites.
  • Ontologies can provide the foundation for
    configuration support.
  • Ontologies can support structured, comparative,
    and customized search.
  • Ontologies may be used to exploit
    generalization/specialization information.

25
Ontologies provide completion
  • For example, a user may say she is looking for a
    high-resolution screen on a PC, and then have the
    ontology expand the term into the exact pixel
    range that is to be expected.
  • This is done simply by defining the term
    HighResolutionPc with respect to a particular
    pixel range on two roles, verticalResolution and
    horizontalResolution.
  • For example, a medical system may obtain
    information from an ontology that if a patient is
    stated to be a man, then the gender of the
    patient is male and
  • that information may be used to determine that a
    question concerning whether or not the patient is
    pregnant should not be asked since there could be
    information in the system that things whose
    gender is male are disjoint from things that are
    pregnant.
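
The HighResolutionPc example can be sketched as a lookup that expands a defined term into ranges on its roles. This is a rough illustration only: the pixel thresholds are invented, and DEFINITIONS, expand, and satisfies are hypothetical names rather than part of any ontology system mentioned in the slides.

```python
# Hypothetical term definitions: each term expands to (min, max) ranges on roles.
# The pixel thresholds are invented for illustration; None means "no upper bound".
DEFINITIONS = {
    "HighResolutionPc": {
        "verticalResolution": (1024, None),
        "horizontalResolution": (1280, None),
    },
}


def expand(term):
    """Replace a defined term with the role constraints it stands for."""
    return DEFINITIONS.get(term, {})


def satisfies(item, term):
    """Check whether an item's role values fall inside the term's ranges."""
    for role, (low, high) in expand(term).items():
        value = item.get(role)
        if value is None or value < low or (high is not None and value > high):
            return False
    return True


pc = {"verticalResolution": 1200, "horizontalResolution": 1600}
print(expand("HighResolutionPc"))
print(satisfies(pc, "HighResolutionPc"))  # True
```

The user only supplies the term; the exact ranges come from the ontology, which is the completion the slide describes.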

26
Ontologies provide interoperability support
  • In the case of controlled vocabularies, there is
    enhanced interoperability support since different
    users/applications are using the same set of
    terms.
  • In simple taxonomies, we can recognize when one
    application is using a term that is more general
    or more specific than another term and thereby
    better facilitate interoperability.
  • In more expressive ontologies, we may have a
    complete operational definition for how one term
    relates to another term and thus, we can use
    equality axioms or mappings to express one term
    precisely in terms of another and thereby support
    more intelligent interoperability.
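
The three levels of interoperability support above can be sketched with two small tables: equality mappings between vocabularies and a broader-term taxonomy. Everything here is assumed for illustration, including the terms, the table names EQUIVALENT and BROADER, and the relation helper.

```python
# Hypothetical equality mappings: application A's term -> shared vocabulary term.
EQUIVALENT = {"auto": "car", "tv": "television"}
# Hypothetical taxonomy of the shared vocabulary: term -> more general term.
BROADER = {"car": "vehicle", "television": "electronics"}


def translate(term):
    """Map a term from application A into the shared vocabulary."""
    return EQUIVALENT.get(term, term)


def relation(term_a, term_b):
    """Classify how A's term relates to B's term: same, narrower, or unrelated."""
    t = translate(term_a)
    if t == term_b:
        return "same"
    # Walk up the taxonomy to see whether B's term is more general.
    while t in BROADER:
        t = BROADER[t]
        if t == term_b:
            return "narrower"
    return "unrelated"


print(relation("auto", "car"))      # same
print(relation("auto", "vehicle"))  # narrower
```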

27
Ontologies support validation and verification
testing
  • If an ontology contains class descriptions, such
    as StanfordEmployee, these definitions may be
    used as queries to databases to discover what
    kind of coverage currently exists in datasets.
  • For example, if one was going to expose the class
    StanfordEmployee on an interface to some
    application, it would be useful to know first if
    the dataset contained any instances of Person
    whose employer property was filled with the
    value Stanford.
  • Similarly, checks could be done to see if there
    were currently Persons in the dataset that were
    known to be Employees yet did not have a value
    for the employer property (thereby showing that
    the dataset is not complete).
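
Both checks on this slide amount to simple queries over the dataset. The sketch below assumes a hypothetical in-memory list of records; the record layout, the names, and the helper functions are illustrative and not taken from the slides.

```python
# Hypothetical dataset: each record lists its asserted classes and properties.
DATASET = [
    {"name": "alice", "classes": {"Person", "Employee"}, "employer": "Stanford"},
    {"name": "bob", "classes": {"Person", "Employee"}},
    {"name": "carol", "classes": {"Person"}},
]


def stanford_employees(records):
    """Coverage check: Persons whose employer property is filled with Stanford."""
    return [r["name"] for r in records
            if "Person" in r["classes"] and r.get("employer") == "Stanford"]


def employees_missing_employer(records):
    """Completeness check: known Employees with no value for employer."""
    return [r["name"] for r in records
            if "Employee" in r["classes"] and "employer" not in r]


print(stanford_employees(DATASET))          # ['alice']
print(employees_missing_employer(DATASET))  # ['bob']
```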

28
Ontologies may encode entire test suites
  • An ontology may contain a number of definitions
    of terms, some instance definitions, and then
    include a term definition that is considered to
    be a query: find all terms that meet the
    following conditions.
  • Markup information could be encoded with this
    query to include what the answer should be, thus
    providing enough information to encode regression
    testing data.
  • Example ontology:
  • http://ksl.stanford.edu/projects/DAML/chimaera-jtp-cardinality-test1.daml

29
Ontologies provide for configuration support
  • Class terms may be defined so that they contain
    descriptions of what kinds of parts may be in a
    system.
  • Additionally interactions between properties can
    be defined so that filling in a value for one
    property can cause another value to be filled in
    for another slot.
  • For example, one may generate an ontology of
    information about home theatre products as is
    done in a small configurator example using a
    simple description logic-based system.
  • Terms such as television, amplifier, tuner, etc.
    are defined.
  • Additionally, information connecting the terms
    together is included.
  • A class of HighQualityTelevisions is defined so
    that users may choose from this class and the
    configurator will automatically fill in limited
    sets of manufacturers to choose from, minimum
    diagonal values, minimum price ranges, etc.
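
A rough sketch of this kind of configurator follows. It is not the description-logic system the slide refers to; the class table, manufacturer names, and numeric limits are invented, and configure and allowed are hypothetical helpers.

```python
# Hypothetical class definitions for a home-theatre configurator.
# Manufacturer names and numeric limits are invented for illustration.
CLASS_CONSTRAINTS = {
    "Television": {
        "manufacturers": {"Acme", "Bolt", "Crest"},
        "min_diagonal": 13,
    },
    "HighQualityTelevision": {
        "manufacturers": {"Acme", "Crest"},
        "min_diagonal": 27,
        "min_price": 1000,
    },
}


def configure(chosen_class):
    """Selecting a class fills in the remaining choices it implies."""
    return dict(CLASS_CONSTRAINTS[chosen_class])


def allowed(chosen_class, manufacturer, diagonal):
    """Check a concrete selection against the constraints of the chosen class."""
    c = CLASS_CONSTRAINTS[chosen_class]
    return manufacturer in c["manufacturers"] and diagonal >= c["min_diagonal"]


print(sorted(configure("HighQualityTelevision")["manufacturers"]))  # ['Acme', 'Crest']
print(allowed("HighQualityTelevision", "Bolt", 32))  # False
```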

30
Ontologies support structured, comparative, and
customized search
  • For example, if one is looking for televisions, a
    class description for television may be obtained
    from an ontology, its properties may be obtained
    (such as diagonal, price, manufacturer, etc.), and
    then a comparative presentation may be made of
    televisions by presenting the values of each of
    the properties.
  • Those properties can also be used to provide a
    form for users to fill in so that they may
    provide a detailed set of specifications about
    the items they are looking to find.
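
The two uses on this slide, a comparative presentation and a fill-in form, can both be driven by the property list of a class. The following is a minimal sketch; the property list, the product data, and the function names are assumptions made for illustration.

```python
# Hypothetical class description obtained from an ontology.
TELEVISION_PROPERTIES = ["diagonal", "price", "manufacturer"]

# Hypothetical product data; values are invented for illustration.
televisions = [
    {"name": "TV-A", "diagonal": 32, "price": 499, "manufacturer": "Acme"},
    {"name": "TV-B", "diagonal": 27, "price": 299, "manufacturer": "Bolt"},
]


def comparison_rows(items, properties):
    """Build a header plus one row per item, one column per ontology property."""
    header = ["name"] + properties
    rows = [[str(item.get(col, "")) for col in header] for item in items]
    return [header] + rows


def matches(item, spec):
    """Structured search: every property the user filled in must match."""
    return all(item.get(prop) == value for prop, value in spec.items())


for row in comparison_rows(televisions, TELEVISION_PROPERTIES):
    print("\t".join(row))
print([t["name"] for t in televisions if matches(t, {"manufacturer": "Acme"})])
```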

31
Ontologies exploit generalization/specialization
information
  • If a search application finds that a user's query
    generates too many answers, one may dissect the
    query to see if any terms in it appear in an
    ontology, and if so, then the search application
    may suggest specializing that term.
  • For example, if one did a search for concerts in
    the San Francisco Bay area and got too many
    answers, a search engine might look up concert in
    an ontology and discover that there are
    subclasses of concert (and it may also discover
    that there are specific concert locations in the
    Bay area).
  • The search engine could then choose to present
    the user with the option of looking for a
    particular kind of concert (say rock concert).
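
The query-refinement idea can be sketched as follows. The subclass table, the result-count threshold, and the suggest_specializations helper are assumptions for illustration; a real search engine would use its own matching and ranking.

```python
# Hypothetical fragment of an ontology: class name -> direct subclasses.
SUBCLASSES = {
    "concert": ["rock concert", "jazz concert", "classical concert"],
}


def suggest_specializations(query, result_count, max_results=50):
    """If a query returns too many results, propose narrower queries."""
    if result_count <= max_results:
        return []
    lowered = query.lower()
    suggestions = []
    for term, narrower in SUBCLASSES.items():
        if term in lowered:
            # Swap the general term for each of its subclasses.
            suggestions.extend(lowered.replace(term, n) for n in narrower)
    return suggestions


print(suggest_specializations("concert in San Francisco Bay area", 1200))
```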

32
Ontology Acquisition
  • One methodology for obtaining ontologies is to
    begin with an industry standard ontology and then
    modify or extend it.
  • Another methodology is to semi-automatically
    generate a starting point for an ontology.
  • Many taxonomic structures exist on the web or in
    the table of contents of documents.
  • One might crawl certain sites to obtain a
    starting taxonomic structure and then analyze,
    modify, and extend that.

33
Ontology Acquisition
  • Q: Where to look for existing ontologies or
    sources of information to be crawled?
  • Standards organizations, such as NIST (the
    National Institute of Standards and Technology,
    http://www.nist.gov/), support efforts in
    producing controlled vocabularies and ontologies.
  • Consortiums are forming to generate ontologies.
  • For example, RosettaNet
    (http://www.rosettanet.org) works in the areas of
    information technology, electronic technology,
    electronic components, and semiconductor
    manufacturing.
  • They are creating industry-wide open e-business
    standards and providing a language for business
    processes.
  • Trade organizations provide class hierarchies on
    their sites that can also be used as a standard
    structured controlled vocabulary.
  • Every e-commerce site today encodes at least a
    taxonomic organization of terms, like Amazon in
    organizing their book and music information.

34
Ontology Acquisition
  • Another emerging trend is the use of markup
    languages.
  • Some pages are being annotated using markup
    languages such as XML, RDF, DAML, etc. The
    pages including the annotations may be using
    markup terms from controlled vocabularies.
  • Libraries of ontologies potentially of use for
    markup are emerging. For example, the DAML
    program maintains a library of DAML ontologies
    at http://www.daml.org/ontologies/.
  • Much of this section has introduced the idea of
    obtaining either a simple or complex ontology as
    a starting point and then analyzing, modifying,
    and maintaining it over time.

35
Ontology-related Implications and Needs
  • When starting an ontology-based application, the
    two major concerns will be
  • language and
  • environment.

36
Language Issues
  • If one is using a simple ontology, few issues
    arise.
  • However, if one is considering a more complex
    ontology, expressive power of a representation
    and reasoning language needs to be considered.
  • For example, if one wants to do range checking in
    an e-commerce application, then it would be
    unwise to choose just a simple language that only
    contains subclass and instance relationships and
    does not include property specification with
    value restrictions.
  • Candidate ontology languages:
  • Standard specification languages (such as the
    KRSS effort, the Knowledge Representation System
    Specification effort),
  • Interchange formats (such as KIF, the Knowledge
    Interchange Format, which is now a proposed ANSI
    standard), and
  • Common application programming interface
    standards (such as OKBC, Open Knowledge Base
    Connectivity).

37
Language Issues
  • One does not just want to consider
    representational constructs in a language; one
    also wants to consider the reasoning that may be
    supported in the language.
  • Some fields, such as description logics
    (www.dl.kr.org), make this a central focus in
    language design.
  • Also, a language should be usable with existing
    platforms and should be something that
    non-experts can use to do their conceptual
    modeling.
  • The web is clearly the most important platform
    with which to be compatible today, thus any
    language choice should be able to leverage the
    web.
  • Additionally, frame-based systems have had a long
    history of being thought of as conceptually easy
    to use, thus a frame paradigm may be worth
    considering.

38
Language Issues
  • The DARPA Agent Markup Language program, for
    example, attempted to take the emerging web
    languages of today such as XML and RDF and create
    a language that is web compatible but draws on
    the 20-year history of description logics in
    choosing language constructs along with reasoning
    paradigms.
  • The resulting language, DAML+OIL, attempts to
    merge the best of existing web languages,
    description logics, and frame reasoning systems.

39
Environment Issues
  • How to analyze, modify, and maintain an ontology
    over time?
  • If the ontology is to be maintained by subject
    matter experts (and not by knowledge experts),
    most likely some ontology tools will be needed.
  • Verity, an information retrieval company, has
    provided a topic editor which will support
    users in generating taxonomies and utilizing them
    in search queries.
  • Research efforts have existed for many years in
    producing ontology toolkits:
  • Stanford University's previously mentioned tools
    Ontolingua and Chimaera,
  • OilEd (http://img.cs.man.ac.uk/oil/) from
    Manchester University, and
  • Protégé from Stanford Medical Informatics.

40
Environment Issues
  • Some companies with extensive ontology needs such
    as VerticalNet (http://www.verticalnet.com/) have
    developed or are developing their own ontology
    tools in
    order to build ontologies that meet the needs of
    a sophisticated commercial ontologist.
  • Their tools were built after analyzing existing
    research prototypes and were then designed to
    meet the commercial standards required in
    diverse, collaborative, e-commerce applications
    of today.

41
Environment Issues
  • When choosing to use or build an ontology
    environment, the following issues should be
    considered
  • Collaboration and distributed workforce support.
  • Platform interconnectivity.
  • Scale.
  • Versioning.
  • Security.
  • Analysis.
  • Lifecycle issues.
  • Ease of use.
  • Diverse user support.
  • Presentation Style.
  • Extensibility.

42
Conclusions
  • The emergence of ontologies from academic
    obscurity into mainstream business and practice
    on the web
  • Ontology along with a spectrum of properties
  • Criteria (necessary, prototypical, and desirable)
    for simple and complex ontologies
  • Ways that ontologies are being and may be used to
    provide value in many types of applications.
  • Issues of acquiring ontologies and then
    maintaining and evolving ontologies
  • Ontology-related issues that arise from the
    emergence of ontologies focusing on ontology
    language and environment

43
Ontology Development 101: A Guide to Creating
Your First Ontology
  • From Natalya Fridman Noy and Deborah L.
    McGuinness. "Ontology Development 101: A Guide
    to Creating Your First Ontology". Stanford
    Knowledge Systems Laboratory Technical Report
    KSL-01-05 and Stanford Medical Informatics
    Technical Report SMI-2001-0880, March 2001.
    Available at
    http://www.ksl.stanford.edu/people/dlm/papers/ontology101/ontology101-noy-mcguinness.html

44
Why Develop an Ontology
  • The development of ontologies has been moving
    from the realm of Artificial-Intelligence
    laboratories to the desktops of domain experts.
  • Ontologies have become common on the World-Wide
    Web.
  • The ontologies on the Web range from large
    taxonomies categorizing Web sites (such as on
    Yahoo!) to categorizations of products for sale
    and their features (such as on Amazon.com).
  • The WWW Consortium (W3C) is developing the
    Resource Description Framework (RDF), a language
    for encoding knowledge on Web pages to make it
    understandable to electronic agents searching for
    information.
  • The Defense Advanced Research Projects Agency
    (DARPA), in conjunction with the W3C, is
    developing DARPA Agent Markup Language (DAML) by
    extending RDF with more expressive constructs
    aimed at facilitating agent interaction on the
    Web.

45
Why Develop an Ontology
  • Many disciplines now develop standardized
    ontologies that domain experts can use to share
    and annotate information in their fields.
  • Medicine, for example, has produced large,
    standardized, structured vocabularies such as
    SNOMED and the semantic network of the Unified
    Medical Language System.
  • Broad general-purpose ontologies are emerging as
    well. For example, the UNSPSC ontology provides
    terminology for products and services.

46
Why Develop an Ontology
  • Reasons for developing an ontology:
  • To share common understanding of the structure of
    information among people or software agents
  • To enable reuse of domain knowledge
  • To make domain assumptions explicit
  • To separate domain knowledge from the operational
    knowledge
  • To analyze domain knowledge

47
What Is in an Ontology?
  • For the purposes of this guide an ontology is a
    formal explicit description of
  • concepts in a domain of discourse (classes
    (sometimes called concepts)),
  • properties of each concept describing various
    features and attributes of the concept (slots
    (sometimes called roles or properties)), and
  • restrictions on slots (facets (sometimes called
    role restrictions)).
  • An ontology together with a set of individual
    instances of classes constitutes a knowledge
    base.

48
What Is in an Ontology?
  • Classes describe concepts in the domain.
  • Specific wines are instances of the class of
    wines.
  • The Bordeaux wine in the glass in front of you
    while you read this document is an instance of
    the class of Bordeaux wines.
  • A class can have subclasses that represent
    concepts that are more specific than the
    superclass.
  • For example, we can divide the class of all wines
    into red, white, and rosé wines. Alternatively,
    we can divide a class of all wines into sparkling
    and non-sparkling wines.

49
What Is in an Ontology?
  • Slots describe properties of classes and
    instances
  • Château Lafite Rothschild Pauillac wine has a
    full body
  • it is produced by the Château Lafite Rothschild
    winery.
  • We have two slots describing the wine in this
    example
  • the slot body with the value full and
  • the slot maker with the value Château Lafite
    Rothschild winery.
  • At the class level, we can say that instances of
    the class Wine will have slots describing their
    flavor, body, sugar level, the maker of the wine
    and so on.

50
What Is in an Ontology?
  • In practical terms, developing an ontology
    includes
  • defining classes in the ontology,
  • arranging the classes in a taxonomic
    (subclass-superclass) hierarchy,
  • defining slots and describing allowed values for
    these slots,
  • filling in the values for slots for instances.
  • We can then create a knowledge base by defining
    individual instances of these classes filling in
    specific slot value information and additional
    slot restrictions.
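
The pieces listed above (classes, a subclass-superclass hierarchy, slots with allowed values, and instances with slot values) can be modeled with a few small data structures. This is a sketch under stated assumptions, not the representation used by any particular ontology tool: the class names Slot, OntClass, and Instance are hypothetical, and the allowed values for body are assumed.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """A property of a class, with a facet restricting its allowed values."""
    name: str
    allowed_values: frozenset = frozenset()  # empty set means "unrestricted"


@dataclass
class OntClass:
    """A concept in the domain, placed in a subclass-superclass hierarchy."""
    name: str
    superclass: "OntClass | None" = None
    slots: dict = field(default_factory=dict)

    def all_slots(self):
        """Slots defined here plus those inherited from superclasses."""
        inherited = self.superclass.all_slots() if self.superclass else {}
        return {**inherited, **self.slots}


@dataclass
class Instance:
    """An individual: a class plus concrete slot values."""
    name: str
    cls: OntClass
    values: dict = field(default_factory=dict)

    def is_valid(self):
        """Every filled slot must exist on the class and satisfy its facet."""
        slots = self.cls.all_slots()
        return all(
            s in slots
            and (not slots[s].allowed_values or v in slots[s].allowed_values)
            for s, v in self.values.items()
        )


# Assumed allowed values for the body slot.
wine = OntClass("Wine", slots={
    "body": Slot("body", frozenset({"light", "medium", "full"})),
    "maker": Slot("maker"),
})
bordeaux = OntClass("Bordeaux", superclass=wine)
lafite = Instance("Château Lafite Rothschild Pauillac", bordeaux,
                  {"body": "full", "maker": "Château Lafite Rothschild winery"})
print(lafite.is_valid())  # True
```

The ontology is the set of OntClass and Slot objects; adding Instance objects on top of it yields what the slides call a knowledge base.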

51
Some classes, instances, and relations among them
in the wine domain
52
A Simple Knowledge-Engineering Methodology
  • There is no one correct way or methodology for
    developing ontologies.
  • Here we discuss general issues to consider and
    offer one possible process for developing an
    ontology.
  • An iterative approach to ontology development:
  • we start with a rough first pass at the ontology.
  • We then revise and refine the evolving ontology
    and fill in the details.
  • Along the way, we discuss the modeling decisions
    that a designer needs to make, as well as the
    pros, cons, and implications of different
    solutions.

53
A Simple Knowledge-Engineering Methodology
  • Some fundamental rules in ontology design
  • There is no one correct way to model a domain;
    there are always viable alternatives. The best
    solution almost always depends on the application
    that you have in mind and the extensions that you
    anticipate.
  • Ontology development is necessarily an iterative
    process.
  • Concepts in the ontology should be close to
    objects (physical or logical) and relationships
    in your domain of interest. These are most likely
    to be nouns (objects) or verbs (relationships) in
    sentences that describe your domain.

54
Step 1: Determine the domain and scope of the
ontology
  • Answer several basic questions:
  • What is the domain that the ontology will cover?
  • For what are we going to use the ontology?
  • For what types of questions should the
    information in the ontology provide answers?
  • Who will use and maintain the ontology?
  • The answers to these questions may change during
    the ontology-design process, but at any given
    time they help limit the scope of the model.

55
Step 1: Determine the domain and scope of the
ontology
  • For the ontology of wine and food, representation
    of food and wines is the domain of the ontology.
  • We plan to use this ontology for the applications
    that suggest good combinations of wines and food.
  • Naturally, the concepts describing different
    types of wines, main food types, the notion of a
    good combination of wine and food and a bad
    combination will figure into our ontology.
  • At the same time, it is unlikely that the
    ontology will include concepts for managing
    inventory in a winery or employees in a
    restaurant even though these concepts are
    somewhat related to the notions of wine and food.

56
Step 1: Determine the domain and scope of the
ontology
  • The uses of the ontology
  • If the ontology will be used to assist in
    natural-language processing of articles in wine
    magazines, it may be important to include
    synonyms and part-of-speech information for
    concepts in the ontology.
  • If it will be used to help restaurant customers
    decide which wine to order, we need to include
    retail-pricing information.
  • If it is used by wine buyers in stocking a wine
    cellar, wholesale pricing and availability may be
    necessary.
  • If the people who will maintain the ontology
    describe the domain in a language that is
    different from the language of the ontology
    users, we may need to provide the mapping between
    the languages.

57
Step 1: Determine the domain and scope of the
ontology
  • Competency questions
  • Sketch a list of questions that a knowledge base
    based on the ontology should be able to answer
  • These questions will serve as the litmus test
    later
  • Does the ontology contain enough information to
    answer these types of questions?
  • Do the answers require a particular level of
    detail or representation of a particular area?
  • These competency questions are just a sketch and
    do not need to be exhaustive.

58
Step 1: Determine the domain and scope of the
ontology
  • In the wine and food domain, the following are
    possible competency questions:
  • Which wine characteristics should I consider when
    choosing a wine?
  • Is Bordeaux a red or white wine?
  • Does Cabernet Sauvignon go well with seafood?
    What is the best choice of wine for grilled meat?
  • Which characteristics of a wine affect its
    appropriateness for a dish?
  • Does a bouquet or body of a specific wine change
    with vintage year?
  • What were good vintages for Napa Zinfandel?

59
Step 2 Consider reusing existing ontologies
  • It is almost always worth considering what
    someone else has done and checking if we can
    refine and extend existing sources for our
    particular domain and task.
  • Reusing existing ontologies may be a requirement
    if our system needs to interact with other
    applications that have already committed to
    particular ontologies or controlled vocabularies.
  • Many ontologies are already available in
    electronic form and can be imported into an
    ontology-development environment that you are
    using.
  • The formalism in which an ontology is expressed
    often does not matter, since many
    knowledge-representation systems can import and
    export ontologies.
  • Even if a knowledge-representation system cannot
    work directly with a particular formalism, the
    task of translating an ontology from one
    formalism to another is usually not a difficult
    one.

60
Step 2 Consider reusing existing ontologies
  • Libraries of reusable ontologies are available on
    the Web and in the literature.
  • For example, the Ontolingua ontology library
    (http://www.ksl.stanford.edu/software/ontolingua/)
    or the DAML ontology library
    (http://www.daml.org/ontologies/).
  • There are also a number of publicly available
    commercial ontologies (e.g., UNSPSC
    (www.unspsc.org), RosettaNet (www.rosettanet.org),
    DMOZ (www.dmoz.org)).
  • For example, a knowledge base of French wines may
    already exist.
  • If we can import this knowledge base and the
    ontology on which it is based, we will have not
    only the classification of French wines but also
    a first pass at the classification of wine
    characteristics used to distinguish and describe
    the wines.
  • Lists of wine properties that customers use when
    buying wines may already be available from
    commercial Web sites such as www.wines.com.
  • Here we will assume that no relevant ontologies
    already exist and start developing the ontology
    from scratch.

61
Step 3 Enumerate important terms in the ontology
  • It is useful to write down a list of all terms we
    would like either to make statements about or to
    explain to a user.
  • What are the terms we would like to talk about?
  • What properties do those terms have?
  • What would we like to say about those terms?
  • For example, important wine-related terms will
    include:
  • wine, grape, winery, location, a wine's color,
    body, flavor and sugar content
  • different types of food, such as fish and red
    meat
  • subtypes of wine such as white wine, and so on.
  • Initially, it is important to get a comprehensive
    list of terms without worrying about overlap
    between concepts they represent, relations among
    the terms, or any properties that the concepts
    may have, or whether the concepts are classes or
    slots.

62
Step 3 Enumerate important terms in the ontology
  • The next two steps are closely intertwined. It is
    hard to do one of them first and then do the
    other.
  • developing the class hierarchy and
  • defining properties of concepts (slots)
  • Typically, we create a few definitions of the
    concepts in the hierarchy and then continue by
    describing properties of these concepts and so
    on.
  • These two steps are also the most important steps
    in the ontology-design process.

63
Step 4 Define the classes and the class hierarchy
  • Approaches in developing a class hierarchy
  • A top-down development process starts with the
    definition of the most general concepts in the
    domain and subsequent specialization of the
    concepts.
  • A bottom-up development process starts with the
    definition of the most specific classes, the
    leaves of the hierarchy, with subsequent grouping
    of these classes into more general concepts.
  • A combination development process is a
    combination of the top-down and bottom-up
    approaches: we define the more salient concepts
    first and then generalize and specialize them
    appropriately.

64
Different levels of generality
65
Step 4 Define the classes and the class hierarchy
  • None of these three methods is inherently better
    than any of the others. The approach to take
    depends strongly on the personal view of the
    domain.
  • Whichever approach we choose, we usually start by
    defining classes.
  • From the list created in Step 3, we select the
    terms that describe objects having independent
    existence rather than terms that describe these
    objects.
  • We organize the classes into a hierarchical
    taxonomy by asking if by being an instance of one
    class, the object will necessarily (i.e., by
    definition) be an instance of some other class.
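
The subclass test in the last bullet can be sketched in Python, where `isinstance` and `issubclass` stand in for the is-a relation. The class names follow the wine example; encoding them as Python classes is an assumption made for illustration, not how a frame-based system stores an ontology.

```python
# Hypothetical encoding of part of the wine hierarchy as Python classes.
class Wine: ...
class RedWine(Wine): ...
class WhiteWine(Wine): ...
class Beaujolais(RedWine): ...

# An instance of Beaujolais is, by definition, also an instance of
# RedWine and of Wine, so Beaujolais belongs below both in the taxonomy.
morgon = Beaujolais()
print(isinstance(morgon, RedWine), isinstance(morgon, Wine))  # True True

# Being a Beaujolais does not make something a WhiteWine,
# so there is no subclass link between the two.
print(issubclass(Beaujolais, WhiteWine))  # False
```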

66
Step 5 Define the properties of classes (slots)
  • We have already selected classes from the list of
    terms we created in Step 3.
  • Most of the remaining terms are likely to be
    properties of these classes.
  • These terms include, for example, a wine's color,
    body, flavor and sugar content, and the location
    of a winery.
  • For each property in the list, we must determine
    which class it describes.
  • These properties become slots attached to
    classes.
  • Thus, the Wine class will have the following
    slots: color, body, flavor, and sugar. The class
    Winery will have a location slot.

67
Step 5 Define the properties of classes (slots)
  • In general, there are several types of object
    properties that can become slots in an ontology:
  • intrinsic properties, such as the flavor of a
    wine
  • extrinsic properties, such as a wine's name and
    the area it comes from
  • parts, if the object is structured; these can be
    both physical and abstract parts (e.g., the
    courses of a meal)
  • relationships to other individuals; these are the
    relationships between individual members of the
    class and other items (e.g., the maker of a wine,
    representing a relationship between a wine and a
    winery, and the grape the wine is made from)

68
Step 5 Define the properties of classes (slots)
  • All subclasses of a class inherit the slots of
    that class.
  • For example, all the slots of the class Wine will
    be inherited by all subclasses of Wine, including
    Red Wine and White Wine.
  • We will add an additional slot, tannin level
    (low, moderate, or high), to the Red Wine class.
  • The tannin level slot will be inherited by all
    the classes representing red wines (such as
    Bordeaux and Beaujolais).
  • A slot should be attached at the most general
    class that can have that property.
  • For instance, body and color of a wine should be
    attached at the class Wine, since it is the most
    general class whose instances will have body and
    color.
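
Slot inheritance can be illustrated with a small frame store. This is a minimal sketch: the `CLASSES` dictionary and the `all_slots` helper are hypothetical names introduced here, and only the class and slot names come from the slides.

```python
# Hypothetical frame store: each class records its parent and the
# slots attached directly to it (not the inherited ones).
CLASSES = {
    "Wine": {"parent": None, "slots": ["color", "body", "flavor", "sugar"]},
    "Red Wine": {"parent": "Wine", "slots": ["tannin level"]},
    "White Wine": {"parent": "Wine", "slots": []},
    "Beaujolais": {"parent": "Red Wine", "slots": []},
}

def all_slots(name):
    """Collect the slots of a class plus those of every ancestor."""
    slots = []
    while name is not None:
        # Prepend so that the most general slots come first.
        slots = CLASSES[name]["slots"] + slots
        name = CLASSES[name]["parent"]
    return slots

print(all_slots("Beaujolais"))
# ['color', 'body', 'flavor', 'sugar', 'tannin level']
print(all_slots("White Wine"))
# ['color', 'body', 'flavor', 'sugar']
```

Attaching color and body at Wine rather than at each subclass means the slot is declared once and every subclass picks it up through the parent chain.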

69
Step 6 Define the facets of the slots
  • Slots can have different facets describing
  • the value type,
  • allowed values,
  • the number of the values (cardinality), and
  • other features of the values the slot can take.
  • For example,
  • the value of a name slot (as in the name of a
    wine) is one string.
  • A slot produces (as in a winery produces these
    wines) can have multiple values and the values
    are instances of the class Wine.

70
Step 6 Define the facets of the slots
  • Slot cardinality defines how many values a slot
    can have.
  • Some systems distinguish only between single
    cardinality and multiple cardinality.
  • A body of a wine will be a single cardinality
    slot (a wine can have only one body).
  • Wines produced by a particular winery fill in a
    multiple-cardinality slot produces for a Winery
    class.
  • Some systems allow specification of a minimum and
    maximum cardinality to describe the number of
    slot values more precisely.
  • The grape slot of a Wine has a minimum
    cardinality of 1: each wine is made of at least
    one variety of grape.
  • The maximum cardinality for the grape slot for
    single varietal wines is 1: these wines are made
    from only one variety of grape.
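
A minimum/maximum cardinality facet might be checked as follows. The `Cardinality` class and its `allows` method are assumptions made for this sketch; they are not part of any particular ontology tool.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Cardinality:
    """Hypothetical cardinality facet for a slot."""
    minimum: int = 0
    maximum: Optional[int] = None  # None means no upper bound

    def allows(self, values):
        """Return True if the number of values fits the facet."""
        n = len(values)
        return n >= self.minimum and (self.maximum is None or n <= self.maximum)

# grape slot of Wine: at least one variety of grape
grape_of_wine = Cardinality(minimum=1)
# grape slot of a single varietal wine: exactly one variety
grape_of_single_varietal = Cardinality(minimum=1, maximum=1)

blend = ["Cabernet Sauvignon", "Merlot"]
print(grape_of_wine.allows(blend))             # True
print(grape_of_single_varietal.allows(blend))  # False
print(grape_of_wine.allows([]))                # False
```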

71
Step 6 Define the facets of the slots
  • Slot-value type describes what types of values
    can fill in the slot. Here is a list of the more
    common value types
  • String
  • Number
  • Boolean
  • Enumerated
  • Instance-type

72
(No Transcript)
73
Step 6 Define the facets of the slots
  • Domain and range of a slot
  • Allowed classes for slots of type Instance are
    often called a range of a slot.
  • For example, the class Wine is the range of the
    produces slot.
  • The classes to which a slot is attached, or the
    classes whose property a slot describes, are
    called the domain of the slot.
  • The Winery class is the domain of the produces
    slot.
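
Domain and range can be read as a pair of type checks on the two ends of a slot. In the sketch below, the `PRODUCES` dictionary and the `check_slot` function are hypothetical; only the produces slot and the Wine and Winery classes come from the slides.

```python
class Wine: ...
class Winery: ...

# Hypothetical slot description: the domain lists the classes the slot
# is attached to, the range lists the allowed classes for its values.
PRODUCES = {"domain": (Winery,), "range": (Wine,)}

def check_slot(slot, subject, value):
    """Return True if subject is in the slot's domain and value in its range."""
    return isinstance(subject, slot["domain"]) and isinstance(value, slot["range"])

print(check_slot(PRODUCES, Winery(), Wine()))  # True
print(check_slot(PRODUCES, Wine(), Winery()))  # False: domain and range swapped
```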

74
Step 6 Define the facets of the slots
  • Basic rules for determining a domain and a range
    of a slot
  • When defining a domain or a range for a slot,
    find the most general class or classes that can
    be, respectively, the domain or the range for the
    slot.
  • If a list of classes defining a range or a
    domain of a slot includes a class and its
    subclass, remove the subclass.
  • If a list of classes defining a range or a
    domain of a slot contains all subclasses of a
    class A, but not the class A itself, the range
    should contain only the class A and not the
    subclasses.
  • If a list of classes defining a range or a
    domain of a slot contains all but a few
    subclasses of a class A, consider if the class A
    would make a more appropriate range definition.
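
The second rule (drop a subclass when its superclass is already listed) is mechanical enough to sketch. The function name `prune_subclasses` is an assumption, and Python classes stand in for ontology classes.

```python
class Wine: ...
class RedWine(Wine): ...
class Winery: ...

def prune_subclasses(classes):
    """Remove every class whose superclass is also in the list."""
    return [
        c for c in classes
        if not any(other is not c and issubclass(c, other) for other in classes)
    ]

# RedWine is redundant because Wine already covers it.
pruned = prune_subclasses([Wine, RedWine, Winery])
print([c.__name__ for c in pruned])  # ['Wine', 'Winery']
```

The other two rules involve judgment about whether a superclass is an appropriate range, so they are left to the modeler here.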

75
Step 7 Create instances
  • Defining an individual instance of a class
    requires
  • choosing a class,
  • creating an individual instance of that class,
    and
  • filling in the slot values.
  • For example, we can create an individual instance
    Chateau-Morgon-Beaujolais to represent a specific
    type of Beaujolais wine.
  • Chateau-Morgon-Beaujolais is an instance of the
    class Beaujolais representing all Beaujolais
    wines.
  • This instance has the following slot values
    defined:
  • Body: Light
  • Color: Red
  • Flavor: Delicate
  • Tannin level: Low
  • Grape: Gamay (instance of the Wine grape class)
  • Maker: Chateau-Morgon (instance of the Winery
    class)
  • Region: Beaujolais (instance of the Wine-Region
    class)
  • Sugar: Dry
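
The three steps (choose a class, create an instance, fill in the slot values) might look like this in Python. The dataclass encoding and the field names are assumptions made for illustration; the slot values are the ones listed on the slide.

```python
from dataclasses import dataclass

@dataclass
class WineGrape:
    name: str

@dataclass
class Winery:
    name: str

@dataclass
class WineRegion:
    name: str

@dataclass
class Beaujolais:
    """Step 1: the chosen class, with one field per slot."""
    body: str
    color: str
    flavor: str
    tannin_level: str
    grape: WineGrape    # instance-type slot
    maker: Winery       # instance-type slot
    region: WineRegion  # instance-type slot
    sugar: str

# Steps 2 and 3: create the individual and fill in its slot values.
chateau_morgon_beaujolais = Beaujolais(
    body="Light",
    color="Red",
    flavor="Delicate",
    tannin_level="Low",
    grape=WineGrape("Gamay"),
    maker=Winery("Chateau-Morgon"),
    region=WineRegion("Beaujolais"),
    sugar="Dry",
)
print(chateau_morgon_beaujolais.maker.name)  # Chateau-Morgon
```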

76
Defining Classes and a Class Hierarchy
  • As we have mentioned before, there is no single
    correct class hierarchy for any given domain.
  • The hierarchy depends on
  • the possible uses of the ontology,
  • the level of the detail that is necessary for the
    application,
  • personal preferences, and sometimes
  • requirements for compatibility with other models.
  • Here, we discuss several guidelines to keep in
    mind when developing a class hierarchy.

77
Ensuring that the class hierarchy is correct
  • An is-a relation
  • A single wine is not a subclass of all wines.
  • For example, it is wrong to define a class Wines
    and a class Wine as a subclass of Wines.
  • Transitivity of the hierarchical relations
  • If B is a subclass of A and C is a subclass of B,
    then C is a subclass of A
  • Evolution of a class hierarchy
  • Maintaining a consistent class hierarchy may
    become challenging as domains evolve.
  • Classes and their names
  • Classes represent concepts in the domain and not
    the words that denote these concepts.
  • Synonyms for the same concept do not represent
    different classes
  • Avoiding class cycles
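
Transitivity and cycle avoidance can both be checked by walking the subclass links. This is a sketch under the assumption that each class has a single parent; `SUBCLASS_OF` and `ancestors` are hypothetical names.

```python
# Hypothetical single-parent hierarchy: class name -> direct superclass.
SUBCLASS_OF = {"Beaujolais": "Red Wine", "Red Wine": "Wine", "Wine": None}

def ancestors(name, parents):
    """Follow parent links upward; raise if the walk revisits a class."""
    seen = []
    current = parents[name]
    while current is not None:
        if current == name or current in seen:
            raise ValueError(f"class cycle involving {current!r}")
        seen.append(current)
        current = parents[current]
    return seen

# Transitivity: Beaujolais is a subclass of Red Wine, Red Wine of Wine,
# so Wine shows up among the ancestors of Beaujolais.
print(ancestors("Beaujolais", SUBCLASS_OF))  # ['Red Wine', 'Wine']

# A cycle (A below B and B below A) is reported as an error.
try:
    ancestors("A", {"A": "B", "B": "A"})
except ValueError as err:
    print("rejected:", err)
```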

78
Analyzing siblings in a class hierarchy
  • Siblings in a class hierarchy
  • Siblings in the hierarchy are classes that are
    direct subclasses of the same class.
  • All the siblings in the hierarchy (except for the
    ones at the root) must be at the same level of
    generality.
  • How many is too many and how few is too few?
  • If a class has only one direct subclass there may
    be a modeling problem or the ontology is not
    complete.
  • If there are more than a dozen subclasses for a
    given class then additional intermediate
    categories may be necessary.

79
(No Transcript)
80
Multiple inheritance
  • A class can be a subclass of several classes.
  • Suppose we would like to create a separate class
    of dessert wines, the Dessert wine class.
  • Port wine is both a red wine and a dessert
    wine.
  • All instances of the Port class will be instances
    of both the Red wine class and the Dessert wine
    class.
  • Thus, the Port class will inherit the value SWEET
    for the slot Sugar from the Dessert wine class,
    and the tannin level slot and the value for its
    color slot from the Red wine class.
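
Python's own multiple inheritance gives a rough picture of the Port example. Class attributes stand in for slots here, which is an assumption of this sketch; the value "red" for color is inferred from the class name rather than stated on the slide.

```python
class Wine:
    sugar = None
    color = None

class RedWine(Wine):
    color = "red"        # assumed slot value for all red wines
    tannin_level = None  # slot introduced at Red wine

class DessertWine(Wine):
    sugar = "SWEET"

class Port(RedWine, DessertWine):
    pass

print(Port.sugar)                     # SWEET, from DessertWine
print(Port.color)                     # red, from RedWine
print(hasattr(Port, "tannin_level"))  # True, slot inherited from RedWine
```

Python resolves each attribute along a fixed lookup order (Port, RedWine, DessertWine, Wine). An ontology system may resolve conflicting inherited values differently, so the lookup order is a property of this sketch only.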

81
When to introduce a new class (or not)
  • Rules of thumb
  • Subclasses of a class usually (1) have additional
    properties that the superclass does not have, or
    (2) restrictions different from those of the
    superclass, or (3) participate in different
    relationships than the superclasses
  • In other words, we introduce a new class in the
    hierarchy usually only when there is something
    that we can say about this class that we cannot
    say about the superclass.
  • Classes in terminological hierarchies do not have
    to introduce new properties
  • For example, an ontology underlying an electronic
    medical-record system may include a
    classification of various diseases. This
    classification may be just that: a hierarchy of
    terms, without properties (or with the same set
    of properties). In that case, it is still useful
    to organize the terms in a hierarchy rather than
    a flat list because it will (1) allow easier
    exploration and navigation and (2) enable a
    doctor to choose easily a level of generality of
    the term that is appropriate for the situation.
  • We should not create subclasses of a class for
    each additional restriction.

82
A new class or a property value?
  • When modeling a domain, we often need to decide
    whether to model a specific distinction (such as
    white, red, or rosé wine) as a property value or
    as a set of classes. The decision again depends
    on the scope of the domain and the task at hand.
  • Do we create a class White wine or do we simply
    create a class Wine and fill in different values
    for the slot color?
  • The answer usually lies in the scope that we
    defined for the ontology.

83
A new class or a property value?
  • If the concepts with different slot values become
    restrictions for different slots in other
    classes, then we represent the distinction as
    classes. Otherwise, we represent the distinction
    in a slot value.
  • If a distinction is important in the domain and
    we think of the objects with different values for
    the distinction as different kinds of objects,
    then we should create a new class for the
    distinction.
  • Our wine ontology has such classes as Red Merlot
    and White Merlot, rather than a single class for
    all Merlot wines.
  • A class to which an individual instance belongs
    should not change often.
  • Chilled wine should not be a class in an ontology
    describing wine bottles in a restaurant.
  • Usually numbers, colors, locations are slot
    values and do not cause the creation of new
    classes. Wine, however, is a notable exception
    since the color of the wine is so paramount to
    the description of wine.

84
A new class or a property value?
  • Consider, for example, the human-anatomy
    ontology. When we represent ribs,
  • do we create a class for each of the 1st left
    rib, 2nd left rib, and so on? Or
  • do we have a class Rib with slots for the order
    and the lateral position (left-right)?
  • If the information about each of the ribs that we
    represent in the ontology is significantly
    different, then we should indeed create a class
    for each of the ribs.
  • If we are modeling anatomy at a slightly lesser
    level of generality, and all ribs are very
    similar as far as our potential applications are
    concerned, we may want to simplify our hierarchy
    and have just the class Rib, with two slots:
    lateral position and order.

85
An instance or a class?
  • Deciding where classes end and individual
    instances begin starts with deciding what is the
    lowest level of granularity in the
    representation.
  • The level of granularity is in turn determined by
    a potential application of the ontology.
  • Individual instances are the most specific
    concepts represented in a knowledge base.
  • If concepts form a natural hierarchy, then we
    should represent them as classes

86
Hierarchy of wine regions. The "A" icons next to
class names indicate that the classes are
abstract and cannot have any direct instances.
87
Limiting the scope
  • Helpful rules in deciding when an ontology
    definition is complete
  • The ontology should not contain all the possible
    information about the domain: you do not need to
    specialize (or generalize) more than you need for
    your application (at most one extra level each
    way).
  • In our ontology, we certainly do not include all
    the properties that a wine or food could have.
    We represented the most salient properties of the
    classes of items in our ontology.
  • Even though wine books would tell us the size of
    grapes, we have not included this knowledge.
  • Similarly, we have not added all relationships
    that one could imagine among all the terms in our
    system.
  • For example, we do not include relationships such
    as favorite wine and favorite food in the
    ontology just to allow a more complete
    representation of all of the interconnections
    between the terms we have defined.

88
Disjoint subclasses
  • Classes are disjoint if they cannot have any
    instances in common.
  • For example, the Dessert wine and the White wine
    classes in our ontology are not disjoint: there
    are many wines that are instances of both.
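
If each class is viewed as the set of its instances, disjointness is a plain set test. The instance names below are illustrative assumptions, apart from Port and Chateau-Morgon-Beaujolais, which appear elsewhere in the slides.

```python
# Hypothetical instance sets for three classes.
dessert_wines = {"Sauternes", "Port"}
white_wines = {"Sauternes", "Chablis"}
red_wines = {"Port", "Chateau-Morgon-Beaujolais"}

# Dessert wine and White wine share an instance, so they are not disjoint.
print(dessert_wines.isdisjoint(white_wines))  # False
# Red wine and White wine have no instance in common.
print(red_wines.isdisjoint(white_wines))      # True
```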

89
Defining Properties: More Details
  • We discuss inverse slots and default values for a
    slot.

90
Inverse slots
  • The two relations, maker and produces, are called
    inverse relations.
  • If a wine was produced by a winery, then the
    winery produces that wine.
  • Storing the information in both directions is
    redundant: knowing the value of one relation, an
    application can always infer the value for the
    inverse relation.
  • However, from the knowledge-acquisition
    perspective it is convenient to have both pieces
    of information explicitly available.

91
Inverse slots
  • Example of inverse slots
  • the maker slot of the Wine class and the produces
    slot of the Winery class.
  • When a user creates an instance of the Wine class
    and fills in the value for the maker slot, the
    system automatically adds the newly created
    instance to the produces slot of the
    corresponding Winery instance.
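
The automatic update described above can be sketched by having the Wine constructor maintain the inverse slot. This is one possible design under the assumption that instances are plain Python objects; a real knowledge-base system would handle this internally.

```python
class Winery:
    def __init__(self, name):
        self.name = name
        self.produces = []  # inverse of Wine.maker

class Wine:
    def __init__(self, name, maker):
        self.name = name
        self.maker = maker
        # Filling in maker also updates the inverse slot on the winery.
        maker.produces.append(self)

chateau_morgon = Winery("Chateau-Morgon")
beaujolais = Wine("Chateau-Morgon-Beaujolais", maker=chateau_morgon)

print([w.name for w in chateau_morgon.produces])
# ['Chateau-Morgon-Beaujolais']
```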

92
Default values
  • If a particular slot value is the same for most
    instances of a class, we can define this value to
    be a default value for the slot.
  • Then, when each new instance of a class
    containing this slot is created, the system fills
    in the default value automatically.
  • We can then change the value to any other value
    that the facets will allow.
  • That is, default values are there for
    convenience: they do not enforce any new
    restrictions on the model or change the model in
    any way.
  • For example, if the majority of wines we are
    going to discuss are full-bodied wines, we can
    have full as a default value for the body of
    the wine. Then, unless we say otherwise, all
    wines we define would be full-bodied.
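
A default value behaves much like a default field value in a Python dataclass. In this sketch the set `ALLOWED_BODY` is an assumed allowed-values facet; the slides name only "full" as the default.

```python
from dataclasses import dataclass

# Assumed allowed values for the body slot (not listed on the slide).
ALLOWED_BODY = {"light", "medium", "full"}

@dataclass
class Wine:
    name: str
    body: str = "full"  # default value, filled in unless we say otherwise

    def __post_init__(self):
        # The facet, not the default, is what restricts the values.
        if self.body not in ALLOWED_BODY:
            raise ValueError(f"body must be one of {sorted(ALLOWED_BODY)}")

print(Wine("Example Cabernet").body)               # full
print(Wine("Example Riesling", body="light").body)  # light
```

The default can be overridden with any value the facet allows, which matches the point that defaults add convenience without adding restrictions.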

93
What's in a Name?
  • Defining naming conventions for concepts in an
    ontology and then strictly adhering to these
    conventions not only makes the ontology easier to
    understand but also helps avoid some common
    modeling mistakes.
  • We need to
  • Define a naming convention for classes and slots
    and adhere to it.
  • Features of a knowledge representation system
    affect the choice of naming conventions
  • Does the system have the same name space for
    classes, slots, and instances?
  • Is the system case-sensitive?
  • What delimiters does the system allow in the
    names?

94
Capitalization and delimiters
  • We can greatly improve the readability of an
    ontology if we use consistent capitalization for
    concept names.
  • For example, it is common to capitalize class
    names and use lower case for slot names (assuming
    the system is case-sensitive).
  • When a concept name contains more than one word
    we need to delimit the words. Here are some
    possible choices.
  • Use a space: Meal course
  • Run the words together and capitalize each new
    word: MealCourse
  • Use an underscore or dash or other delimiter in
    the name: Meal_Course, Meal_course, Meal-Course,
    Meal-course.

95
Singular or plural
  • A class name represents a collection of objects.
  • For example, a class Wine actually represents all
    wines.
  • Therefore, i