Knowledge Representations - PowerPoint PPT Presentation


Slides: 61
Provided by: foxr

Transcript and Presenter's Notes

Title: Knowledge Representations


1
Knowledge Representations
  • One large distinction between an AI system and a
    normal piece of software is that an AI system
    must reason using worldly knowledge
  • What types of knowledge?
  • Facts
  • Axioms
  • Statements (which may or may not be true)
  • Rules
  • Cases
  • Experiences
  • Associations (which may not be truth preserving)
  • Descriptions
  • Probabilities and Statistics

2
Types of Representations
  • Early systems used either
  • semantic networks or predicate calculus to
    represent knowledge
  • or used simple search spaces if the
    domain/problem had very limited amounts of
    knowledge (e.g., simple planning as in blocks
    world)
  • With the early expert systems in the 70s, a
    significant shift took place to production
    systems, which combined representation and
    process (chaining) and even uncertainty handling
    (certainty factors)
  • later, frames (an early version of OOP) were
    introduced
  • Problem-specific approaches were introduced such
    as scripts and CDs for language representation
  • In the 1980s, there was a shift from rules to
    model-based approaches
  • Since the 1990s, Bayesian networks and hidden
    Markov Models have become popular
  • First, we will take a brief look at some of the
    representations

3
Search Spaces
  • Given a problem expressed as a state space
    (whether explicitly or implicitly)
  • Formally, we define a search space as (N, A, S,
    GD)
  • N: the set of nodes or states of a graph
  • A: the set of arcs (edges) between nodes that
    correspond to the steps in the problem (the legal
    actions or operators)
  • S: a nonempty subset of N that represents start
    states
  • GD: a nonempty subset of N that represents goal
    states
  • Our problem becomes one of traversing the graph
    from a node in S to a node in GD
  • Example
  • 3 missionaries and 3 cannibals are on one side of
    the river with a boat that can take at most 2
    people across the river
  • how can we move the 3 missionaries and 3
    cannibals across the river such that the
    cannibals never outnumber the missionaries on
    either side of the river (lest the cannibals
    start eating the missionaries)?

4
M/C Solution
  • We can represent a state as a 6-item tuple
    (a, b, c, d, e, f)
  • a/b: number of missionaries/cannibals on left
    shore
  • c/d: number of missionaries/cannibals in boat
  • e/f: number of missionaries/cannibals on right
    shore
  • where a + b + c + d + e + f = 6
  • a >= b (unless a = 0), c >= d (unless c = 0), and
    e >= f (unless e = 0)
  • Legal operations (moves) are
  • 0, 1, 2 missionaries get into boat
  • 0, 1, 2 missionaries get out of boat
  • 0, 1, 2 cannibals get into boat
  • 0, 1, 2 cannibals get out of boat
  • boat sails from left shore to right shore
  • boat sails from right shore to left shore
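
The search described above can be sketched as a breadth-first traversal of the state graph. This is a minimal sketch, not the slide's own formulation: it assumes a simplified state (missionaries on the left, cannibals on the left, boat side) instead of the 6-item tuple, folds "get in, sail, get out" into one crossing, and uses hypothetical names (safe, solve).

```python
from collections import deque

def safe(m, c):
    # A shore is safe if it has no missionaries or they are not outnumbered.
    return m == 0 or m >= c

def solve(total=3, capacity=2):
    # Simplified state (an assumption of this sketch):
    # (missionaries on left, cannibals on left, boat is on left shore)
    start, goal = (total, total, True), (0, 0, False)
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        m, c, left = state
        for dm in range(capacity + 1):
            for dc in range(capacity + 1 - dm):
                if dm + dc == 0:
                    continue  # someone has to row the boat
                nm, nc = (m - dm, c - dc) if left else (m + dm, c + dc)
                if not (0 <= nm <= total and 0 <= nc <= total):
                    continue
                # Both shores must be safe after the crossing.
                if safe(nm, nc) and safe(total - nm, total - nc):
                    nxt = (nm, nc, not left)
                    if nxt not in parent:
                        parent[nxt] = state
                        queue.append(nxt)
    return None

path = solve()
for step in path:
    print(step)
```

Because breadth-first search expands states in order of distance from the start, the first path found uses the fewest crossings.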

5
Relationships
  • We often know stuff about objects (whether
    physical or abstract)
  • These objects have attributes (components,
    values) and/or relationships with other things
  • So, one way to represent knowledge is to
    enumerate the objects and describe them through
    their attributes and relationships
  • Common forms of such relationship representations
    are
  • semantic networks: a network consists of nodes,
    which are objects and values, and edges
    (links/arcs), which are annotated to indicate how
    the nodes are related
  • predicate calculus: predicates are often
    relationships, and arguments for the predicates
    are objects
  • frames: in essence, objects (from
    object-oriented programming) where attributes are
    the data members and the values are the specific
    values stored in those members; in some cases,
    they are pointers to other objects

6
Representations With Relationships
Here, we see the same information
being represented using two different
representational techniques: a semantic network
(above) and predicates (to the left)
7
Another Example: Blocks World
Here we see a real-world situation of three
blocks and a predicate calculus representation
for expressing this knowledge. We equip our
system with rules, such as the rule below, to
draw conclusions about and manipulate this
blocks world.
This rule says: if there does not exist a Y that
is on X, then X is clear
8
Semantic Networks
  • Collins and Quillian were the first to use
    semantic networks in AI by storing in the network
    the objects and their relationships
  • their intention was to represent English
    sentences
  • edges would typically be annotated with these
    descriptors or relations
  • isa: class/subclass
  • instance: the first object is an instance of the
    class
  • has: contains or has this as a physical property
  • can: has the ability to
  • made of, color, texture, etc.

A semantic network to represent the sentences "a
canary can sing/fly", "a canary is a
bird/animal", "a canary is a canary", and "a canary
has skin"
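
A minimal sketch of how such a network might be stored and queried. The figure is not part of this transcript, so the placement of each property (sing on canary, fly on bird, skin on animal) is an assumption, as are the names network, ancestors and holds.

```python
# Edges are stored as (node, relation) -> list of target nodes.
# Which node carries which property is assumed for illustration.
network = {
    ("canary", "isa"): ["bird"],
    ("bird", "isa"): ["animal"],
    ("canary", "can"): ["sing"],
    ("bird", "can"): ["fly"],
    ("animal", "has"): ["skin"],
}

def ancestors(node):
    """Yield the node itself, then every class reachable through isa links."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        yield current
        stack.extend(network.get((current, "isa"), []))

def holds(node, relation, value):
    """True if the node, or any class it inherits from, has the edge."""
    return any(value in network.get((a, relation), []) for a in ancestors(node))

print(holds("canary", "can", "sing"))
print(holds("canary", "can", "fly"))
print(holds("canary", "has", "skin"))
print(holds("bird", "can", "sing"))
```

Storing a property once at the most general class and following isa links to find it is what lets "a canary has skin" be answered without storing that fact on the canary node.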
9
Representing Word Meanings
  • Quillian demonstrated how to use the semantic
    network to represent word meanings
  • each word would have one or more networks, with
    links that attach words to their definition
    planes
  • the word "plant" is represented as three planes,
    each of which has links to additional word planes

10
Frames
  • The semantic network requires a graph
    representation which may not be a very efficient
    use of memory
  • Another representation is the frame
  • the idea behind a frame was originally that it
    would represent a "frame of memory", for
    instance, by capturing the objects and their
    attributes for a given situation or moment in
    time
  • a frame would contain slots where a slot could
    contain
  • identification information (including whether
    this frame is a subclass of another frame)
  • relationships to other frames
  • descriptors of this frame
  • procedural information on how to use this frame
    (code to be executed)
  • defaults for slots
  • instance information (or an identification of
    whether the frame represents a class or an
    instance)

11
Frame Example
Here is a partial frame representing a hotel
room. The room contains a chair, bed, and phone,
where the bed contains a mattress and a bed
frame (not shown)
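
A minimal sketch of a frame with slots, defaults inherited from a parent frame, and a procedural slot. The Frame class and every slot name here (walls, use, number, describe) are hypothetical; only the hotel room contents come from the slide.

```python
class Frame:
    """A frame is a named bundle of slots with an optional parent frame."""

    def __init__(self, name, parent=None, **slots):
        self.name = name
        self.parent = parent
        self.slots = dict(slots)

    def get(self, slot):
        # Look in this frame first, then fall back to inherited defaults.
        frame = self
        while frame is not None:
            if slot in frame.slots:
                value = frame.slots[slot]
                # Procedural slots hold code that runs when the slot is read.
                return value(self) if callable(value) else value
            frame = frame.parent
        raise KeyError(slot)

room = Frame("room", walls=4)
bed = Frame("bed", contains=["mattress", "bed frame"])
hotel_room = Frame("hotel room", parent=room,
                   contains=["chair", bed, "phone"], use="sleeping")
room_101 = Frame("room 101", parent=hotel_room, number=101,
                 describe=lambda f: f"Room {f.get('number')} ({f.get('use')})")

print(room_101.get("walls"))       # default inherited from room
print(room_101.get("describe"))    # procedural slot
print(hotel_room.get("contains")[1].get("contains"))
```

The bed appears inside the hotel room's contains slot as a reference to another frame, which is the "pointer to other objects" case.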
12
Production Systems
  • A production system is
  • a set of rules (if-then or condition-action
    statements)
  • working memory
  • the current state of the problem solving, which
    includes new pieces of information created by
    previously applied rules
  • inference engine (the author calls this a
    recognize-act cycle)
  • forward-chaining, backward-chaining, a
    combination, or some other form of reasoning such
    as a sponsor-selector, or agenda-driven scheduler
  • conflict resolution strategy
  • when it comes to selecting a rule, there may be
    several applicable rules; which one should we
    select? The choice may be based on a conflict
    resolution strategy such as "first rule", "most
    specific rule", "most salient rule", "rule with
    most actions", "random", etc.

13
Chaining
  • The idea behind a production system's reasoning
    is that rules will describe steps in the problem
    solving space where a rule might
  • be an operation in a game like a chess move
  • translate a piece of input data into an
    intermediate conclusion
  • piece together several intermediate conclusions
    into a specific conclusion
  • translate a goal into substeps
  • So a solution using a production system is a
    collection of rules that are chained together
  • forward chaining: reasoning from data to
    conclusions, where working memory is searched for
    conditions that match the left-hand side of the
    given rules
  • backward chaining: reasoning from goals to
    operations, where an initial goal is unfolded into
    the steps needed to solve that goal; that is, the
    process is one of subgoaling
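
The two chaining directions can be sketched over a toy rule base. This is a minimal sketch under simplifying assumptions: rules are propositional (no variables), each rule is a (conditions, conclusion) pair, and the rule contents and function names are invented for illustration.

```python
def forward_chain(rules, facts):
    """Data-driven: keep firing rules whose conditions are all in memory."""
    memory = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= memory and conclusion not in memory:
                memory.add(conclusion)
                changed = True
    return memory

def backward_chain(rules, facts, goal, pending=frozenset()):
    """Goal-driven: unfold the goal into subgoals until facts are reached."""
    if goal in facts:
        return True
    if goal in pending:
        return False  # already trying to prove this goal; avoid looping
    return any(
        all(backward_chain(rules, facts, c, pending | {goal}) for c in conditions)
        for conditions, conclusion in rules
        if conclusion == goal
    )

# Hypothetical rule base: ({conditions}, conclusion)
rules = [
    ({"a"}, "b"),
    ({"b", "g"}, "e"),
    ({"e"}, "h"),
]
facts = {"a", "g"}

print(sorted(forward_chain(rules, facts)))
print(backward_chain(rules, facts, "h"))
print(backward_chain(rules, facts, "z"))
```

Forward chaining derives everything that follows from the data, whether or not it is needed; backward chaining only touches the rules that could contribute to the goal being asked about.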

14
Two Example Production Systems
15
Example System: Water Jugs
  • Problem: given a 4-gallon jug (X) and a 3-gallon
    jug (Y), fill X with exactly 2 gallons of water
  • assume an infinite amount of water is available
  • Rules/operators
  • 1. If X = 0 then X = 4 (fill X)
  • 2. If Y = 0 then Y = 3 (fill Y)
  • 3. If X > 0 then X = 0 (empty X)
  • 4. If Y > 0 then Y = 0 (empty Y)
  • 5. If X + Y >= 3 and X > 0 then X = X - (3 - Y)
    and Y = 3 (fill Y from X)
  • 6. If X + Y >= 4 and Y > 0 then X = 4 and Y = Y -
    (4 - X) (fill X from Y)
  • 7. If X + Y <= 3 and X > 0 then X = 0 and Y = X +
    Y (empty X into Y)
  • 8. If X + Y <= 4 and Y > 0 then X = X + Y and Y =
    0 (empty Y into X)
  • rule numbers used on the next slide
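
A minimal sketch of these eight operators as a tiny production system, searched breadth-first. It assumes the comparisons on X + Y read as >= and <= and that each rule updates X and Y simultaneously; the names RULES and solve are hypothetical.

```python
from collections import deque

# rule number -> (condition, action); state is (X, Y) in gallons.
RULES = {
    1: (lambda x, y: x == 0, lambda x, y: (4, y)),                # fill X
    2: (lambda x, y: y == 0, lambda x, y: (x, 3)),                # fill Y
    3: (lambda x, y: x > 0, lambda x, y: (0, y)),                 # empty X
    4: (lambda x, y: y > 0, lambda x, y: (x, 0)),                 # empty Y
    5: (lambda x, y: x + y >= 3 and x > 0,
        lambda x, y: (x - (3 - y), 3)),                           # fill Y from X
    6: (lambda x, y: x + y >= 4 and y > 0,
        lambda x, y: (4, y - (4 - x))),                           # fill X from Y
    7: (lambda x, y: x + y <= 3 and x > 0,
        lambda x, y: (0, x + y)),                                 # empty X into Y
    8: (lambda x, y: x + y <= 4 and y > 0,
        lambda x, y: (x + y, 0)),                                 # empty Y into X
}

def solve(start=(0, 0), goal_x=2):
    """Return a list of (rule number, resulting state) reaching X == goal_x."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state[0] == goal_x:
            steps = []
            while parent[state] is not None:
                previous, rule = parent[state]
                steps.append((rule, state))
                state = previous
            return steps[::-1]
        for number, (condition, action) in RULES.items():
            if condition(*state):
                nxt = action(*state)
                if nxt not in parent:
                    parent[nxt] = (state, number)
                    queue.append(nxt)
    return None

steps = solve()
for rule, state in steps:
    print(f"rule {rule} -> {state}")
```

Here the search strategy stands in for conflict resolution: every applicable rule is tried, and the shortest rule sequence that leaves 2 gallons in X is reported.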

16
Conflict Resolution Strategies
  • In a production system, what happens when more
    than one rule matches?
  • a conflict resolution strategy dictates how to
    select from between multiple matching rules
  • Simple conflict resolution strategies include
  • random
  • first match
  • most/least recently matched rule
  • rule which has matched for the longest/shortest
    number of cycles (refractoriness)
  • most salient rule (each rule is given a salience
    before you run the production system)
  • More complex resolution strategies might
  • select the rule with the most/least number of
    conditions (specificity/generality)
  • or most/least number of actions (biggest/smallest
    change to the state)
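
A minimal sketch of a few of these strategies as interchangeable selection functions. The rule records and their field names (conditions, actions, salience) are assumptions made for illustration only.

```python
import random

# Hypothetical matching rules; all three are assumed to match working memory.
matches = [
    {"name": "r1", "conditions": ["a"], "actions": ["x", "y"], "salience": 1},
    {"name": "r2", "conditions": ["a", "b"], "actions": ["x"], "salience": 5},
    {"name": "r3", "conditions": ["a", "b", "c"], "actions": ["x"], "salience": 2},
]

STRATEGIES = {
    "first": lambda ms: ms[0],
    "random": lambda ms: random.choice(ms),
    "most_salient": lambda ms: max(ms, key=lambda r: r["salience"]),
    "most_specific": lambda ms: max(ms, key=lambda r: len(r["conditions"])),
    "most_actions": lambda ms: max(ms, key=lambda r: len(r["actions"])),
}

def select(matching_rules, strategy):
    """Pick the single rule to fire from the set of matching rules."""
    return STRATEGIES[strategy](matching_rules)

for name in ("first", "most_salient", "most_specific", "most_actions"):
    print(name, "->", select(matches, name)["name"])
```

The same conflict set yields a different rule under each strategy, which is why the choice of strategy can change the behavior of the whole system.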

17
MYCIN
  • By the early 1970s, the production system
    approach was found to be more than adequate for
    constructing large scale expert systems
  • in 1971, researchers at Stanford began
    constructing MYCIN, a medical diagnostic system
  • it contained a very large rule base
  • it used backward chaining
  • to deal with the uncertainty of medical
    knowledge, it introduced certainty factors (sort
    of like probabilities)
  • in 1975, it was tested against medical experts
    and performed as well as or better than the
    doctors it was compared to

(defrule 52
  if (site culture is blood)
     (gram organism is neg)
     (morphology organism is rod)
     (burn patient is serious)
  then .4 (identity organism is pseudomonas))
If the culture was taken from the patient's
blood and the gram of the organism is negative
and the morphology of the organism is rods and
the patient is a serious burn patient, then
conclude that the identity of the organism is
pseudomonas (.4 certainty)
18
MYCIN in Operation
  • Mycin's process starts with diagnose-and-treat
  • repeat
  • identify all rules that can provide the
    conclusion currently sought
  • match right hand sides (that is, search for rules
    whose right hand sides match anything in working
    memory)
  • use conflict resolution to identify a single rule
  • fire that rule
  • find and remove a piece of knowledge which is no
    longer needed
  • find and modify a piece of knowledge now that
    more specific information is known
  • add a new subgoal (left-hand side conditions that
    need to be proved)
  • until the action "done" is added to working memory
  • Mycin would first identify the illness, possibly
    ordering more tests to be performed, and then
    given the illness, generate a treatment
  • Mycin consisted of about 600 rules

19
R1/XCON
  • Another success story is DEC's R1
  • later renamed XCON
  • This system would take customer orders and
    configure specific VAX computers for those orders
    including
  • completing the order if the order was incomplete
  • how the various components (drive and tape units,
    mother board(s), etc.) would be placed inside the
    mainframe cabinet
  • how the wiring would take place among the various
    components
  • R1 would perform forward chaining over about
    10,000 rules
  • over a 6 year period, it configured some 80,000
    orders with a 95-98% accuracy rating
  • ironically, whereas planning/design is viewed as
    a backward chaining task, R1 used forward
    chaining because, in this particular case, the
    problem is data driven, starting with user input
    of the computer system's specifications
  • R1's solutions were similar in quality to human
    solutions

20
R1 Sample Rules
  • Constraint rules
  • if device requires battery then select battery
    for device
  • if select battery for device then pick battery
    with voltage(battery) = voltage(device)
  • Configuration rules
  • if we are in the floor plan stage and there is
    space for a power supply and there is no power
    supply available then add a power supply to the
    order
  • if step is configuring, propose alternatives and
    there is an unconfigured device and no container
    was chosen and no other device that can hold it
    was chosen and selecting a container wasn't
    proposed yet and no problems for selecting
    containers were identified then propose selecting
    a container
  • if the step is distributing a massbus device and
    there is a single port disk drive that has not
    been assigned to a massbus and there are no
    unassigned dual port disk drives and the number
    of devices that each massbus should support is
    known and there is a massbus that has been
    assigned at least one disk drive and that should
    support additional disk drives and the type of
    cable needed to connect the disk drive is known,
    then assign the disk drive to this massbus

21
Strong Slot-n-Filler Structures
  • To avoid the difficulties with Frames and Nets,
    Schank and Rieger offered two network-like
    representations that would have implied uses and
    built-in semantics: conceptual dependencies and
    scripts
  • the conceptual dependency was derived as a form
    of semantic network that would have specific
    types of links to be used for representing
    specific pieces of information in English
    sentences
  • the action of the sentence
  • the objects affected by the action or that
    brought about the action
  • modifiers of both actions and objects
  • they defined 11 primitive actions, called ACTs
  • every possible action can be categorized as one
    of these 11
  • an ACT would form the center of the CD, with
    links attaching the objects and modifiers

22
Example CD
  • The sentence is "John ate the egg"
  • The INGEST act means to ingest an object (eat,
    drink, swallow)
  • the P above the double arrow indicates past tense
  • the INGEST action must have an object (the O
    indicates it was the object Egg) and a direction
    (the object went from John's mouth to John's
    insides)
  • we might infer that it was "an egg" instead of
    "the egg" as there is nothing specific to
    indicate which egg was eaten
  • we might also infer that John swallowed the egg
    whole as there is nothing to indicate that John
    chewed the egg!

23
The CD Theory ACTs
  • Is this list complete?
  • what actions are missing?
  • Could we reduce this list to make it more
    concise?
  • other researchers have developed other lists of
    primitive actions, including one with just 3:
    physical actions, mental actions and abstract
    actions

24
Example CD Links
25
Example CDs
26
More Examples
27
Complex Example
  • The sentence is "John prevented Mary from giving
    a book to Bill"
  • This sentence has two ACTs, DO and ATRANS
  • DO was not in the list of 11, but can be thought
    of as "caused to happen"
  • The c/ means a negative conditional; in this case
    it means that John caused this not to happen
  • The ATRANS is a giving relationship with the
    object being a Book and the action being from
    Mary to Bill: "Mary gave a book to Bill"
  • as with the previous example, there is no way
    of telling whether it is "a book" or "the book"

28
Scripts
  • The other structured representation developed by
    Schank (along with Abelson) is the script
  • a description of the typical actions that are
    involved in a typical situation
  • they defined a script for going to a restaurant
  • scripts provide an ability for default reasoning
    when information is not available that directly
    states that an action occurred
  • so we may assume, unless otherwise stated, that a
    diner at a restaurant was served food, that the
    diner paid for the food, and that the diner was
    served by a waiter/waitress
  • A script would contain
  • entry condition(s) and results (exit conditions)
  • actors (the people involved)
  • props (physical items at the location used by the
    actors)
  • scenes (individual events that take place)
  • The script would use the 11 ACTs from CD theory

29
Restaurant Script
  • The script does not contain atypical actions
  • although there are options such as whether the
    customer was pleased or not
  • There are multiple paths through the scenes to
    make for a robust script
  • what would a "going to the movies" script look
    like? would it have similar props, actors,
    scenes? how about "going to class"?

30
Knowledge Groups
  • One of the drawbacks of the knowledge
    representations demonstrated thus far is that all
    knowledge is grouped into a single, large
    collection of representations
  • the rules taken as a whole, for instance, don't
    denote which rules should be used in which
    circumstance
  • Another approach is to divide the representations
    into logical groupings
  • this permits easier design, implementation,
    testing and debugging because you know what that
    particular group is supposed to do and what
    knowledge should go into it
  • it should be noted that by distributing the
    knowledge, we might use different problem solving
    agents for each set of knowledge so that the
    knowledge is stored using different
    representations

31
Knowledge Sources and Agents
  • Which leads us to the idea of having multiple
    problem solving agents
  • each agent is responsible for solving some
    specialized type of problem(s) and knows where to
    obtain its own input
  • each agent has its own knowledge sources, some
    internal, some external
  • since external agents may have their own forms of
    representation, the agent must know
  • how to find the proper agents
  • how to properly communicate with these other
    agents
  • how to interpret the information that it receives
    from these agents
  • how to recover from a situation where the
    expected agent(s) is/are not available

32
What is an Agent?
  • Agents are interactive problem solvers that have
    these properties
  • situated: the agent is part of the problem
    solving environment; it can obtain its own input
    from its environment and it can affect its
    environment through its output
  • autonomous: the agent operates independently of
    other agents and can control its own actions and
    internal states
  • flexible: the agent is both responsive and
    proactive; it can go out and find what it needs
    to solve its problem(s)
  • social: the agent can interact with other agents
    including humans
  • Some researchers also insist that agents have
  • mobility: the ability to move from their
    current environment to a new environment (e.g.,
    migrate to another processor)
  • delegation: the ability to hand off portions of
    the problem to other agents
  • cooperation: if multiple agents are tasked with
    the same problem, can their solutions be combined?

33
The Semantic Web
  • The WWW is a collection of data and knowledge in
    an unstructured format
  • Humans often can take knowledge from disparate
    sources and put together a coherent picture; can
    problem solving agents?
  • Agents on the semantic web all have their own
    capabilities and know where to look for knowledge
  • whether from a static source, from an agent that
    can provide the needed information through its
    own processing, or from a human
  • The common approach is to model the knowledge of
    a web site using an ontology
  • ontologies give agents the ability to translate
    the results of another agent, or the data
    provided from a website, into a version of
    knowledge that they can understand and use

34
Knowledge Acquisition and Modeling
  • Expert System construction used to be a
    trial-and-error sort of approach with the
    knowledge engineers
  • once they had knowledge from the experts, they
    would fill in their knowledge base and test it
    out
  • By the end of the 80s, it was discovered that
    creating an actual domain model was the way to go:
    build a model of the knowledge before
    implementing anything
  • A model might be
  • a dependency graph of what can cause what to
    happen
  • or an associational model which is a collection
    of malfunctions and the manifestations we would
    expect to see from those malfunctions
  • or a functional model where component parts are
    enumerated and described by function and behavior
  • The emphasis changed to knowledge acquisition
    tools (KADS)
  • domain experts enter their knowledge as a
    graphical model that contains the component parts
    of the item being diagnosed/designed, their
    functions, and rules for deciding how to diagnose
    or design each one

35
A NASA Example
  • Here is a model developed by NASA for a
    Livingston propulsion system for rockets
  • a reactive self-configuring autonomous system
  • knowledge modeled using propositional calc
    (instead of predicate calc: there are a finite
    number of elements, and each will be modeled by
    its own proposition)

Helium is the fuel tank. Oxidizer is mixed to
cause the fuel to burn. Acc is the accelerometer
which, along with sensors in the valves, is used
as input to control the system. Pyro valves are
used as control: once they change state, they
stay in that state, so they are used to change
the flow of fuel when an error is detected,
opening or closing a new pathway from tank to
engine
36
Model (Architecture) for the System
  • The idea is that the configuration manager tries
    to keep the spacecraft moving but at the lowest
    cost configuration
  • Sensors feed into the ME (mode estimator) to
    determine if the system is functioning and in the
    lowest configuration
  • If not, the MR (mode reconfiguration) plans a new
    mode by determining what valves to open and close
  • Since this is a spacecraft, the output of the MR
    is a set of actions that cause valves to open or
    close directly

The high level planner generates a sequence of
hardware configuration goals, such as the amount
of propellant that should be used; it is the
configuration manager that must translate these
goals into actions
37
VT: an Elevator's Design
The design of an elevator can be used to
generate a diagnostic system for elevator
problems, or in VT's case, a system that
can design new elevators
38
Reasoning with Uncertainty
  • Representations generally represent knowledge as
    fact
  • However, often, knowledge and the use of the
    knowledge brings with it a degree of uncertainty
  • how can we represent and reason with uncertainty?
  • We find two forms of uncertainty
  • unsure input
  • unknown: do not know the answer so you have to
    say unknown
  • unclear: answer doesn't fit question (e.g., not
    yes but 80% yes)
  • vague data: is a 100 degree temp a high fever
    or just a fever?
  • ambiguous/noisy data: data may not be easily
    interpretable
  • non-truth preserving knowledge (most rules are
    associational, not truth preserving)
  • unlike "if you are a man then you are mortal", a
    doctor might reason from symptoms to diseases
  • "all men are mortal" denotes a class/subclass
    relationship, which is truth preserving
  • but the symptom to disease reasoning is based on
    associations and is not guaranteed to be true

39
Certainty Factors
  • First used in the Mycin system, the idea is that
    we will attribute a measure of belief to any
    conclusion that we draw
  • CF(H | E) = MB(H | E) - MD(H | E)
  • certainty factor for hypothesis H given evidence
    E is the measure of belief we have for H minus
    measure of disbelief we have for H
  • CFs are applied to hypotheses that are drawn from
    rules
  • CFs can be combined as we associate a CF with
    each condition and each conclusion of each rule
  • To use CFs, we need
  • to annotate every rule with a CF value (this
    comes from the expert)
  • ways to combine CFs when we use AND, OR, →
  • Combining rules are straightforward
  • for AND use min
  • for OR use max
  • for → use * (multiplication)

40
CF Example
  • Assume we have the following rules
  • A → B (.7)
  • A → C (.4)
  • D → F (.6)
  • B AND G → E (.8)
  • C OR F → H (.5)
  • We know A, D and G are true (so each has a value
    of 1.0)
  • B is .7 (A is 1.0, the rule is true at .7, so B
    is true at 1.0 * .7 = .7)
  • C is .4
  • F is .6
  • B AND G is min(.7, 1.0) = .7 (G is 1.0, B is .7)
  • E is .7 * .8 = .56
  • C OR F is max(.4, .6) = .6
  • H is .6 * .5 = .30
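
The arithmetic above can be checked with a short sketch. The helper names (cf_and, cf_or, cf_rule) are hypothetical; the operations are the min, max and multiplication rules stated on the previous slide.

```python
def cf_and(*cfs):
    # Conjunction of conditions: take the weakest.
    return min(cfs)

def cf_or(*cfs):
    # Disjunction of conditions: take the strongest.
    return max(cfs)

def cf_rule(premise_cf, rule_cf):
    # Conclusion CF = CF of the premise times the CF attached to the rule.
    return premise_cf * rule_cf

A = D = G = 1.0                      # known to be true
B = cf_rule(A, 0.7)                  # A -> B (.7)
C = cf_rule(A, 0.4)                  # A -> C (.4)
F = cf_rule(D, 0.6)                  # D -> F (.6)
E = cf_rule(cf_and(B, G), 0.8)       # B AND G -> E (.8)
H = cf_rule(cf_or(C, F), 0.5)        # C OR F -> H (.5)

print(f"B={B:.2f} C={C:.2f} F={F:.2f} E={E:.2f} H={H:.2f}")
```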

41
Continued
  • Another combining rule is needed when we can
    conclude the same hypothesis from two or more
    rules
  • we already used C OR F → H (.5) to conclude H
    with a CF of .30
  • let's assume that we also have the rule E → H
    (.5)
  • since E is .56, we have H at .56 * .5 = .28
  • We now believe H at .30 and at .28, which is
    true?
  • the two rules both support H, so we want to draw
    a stronger conclusion in H since we have two
    independent means of support for H
  • We will use the formula CF1 + CF2 - CF1 * CF2
  • CF(H) = .30 + .28 - .30 * .28 = .496
  • our belief in H has been strengthened through two
    different chains of logic
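
A minimal sketch of this combining step, using the values from the example. The function names are hypothetical, and the formula is applied here only to positive CFs that support the same hypothesis, which is the case the slide covers.

```python
from functools import reduce

def cf_combine(cf1, cf2):
    """Combine two supporting CFs for one hypothesis: CF1 + CF2 - CF1 * CF2."""
    return cf1 + cf2 - cf1 * cf2

def cf_combine_all(cfs):
    # Folding pairwise handles three or more supporting rules.
    return reduce(cf_combine, cfs)

h = cf_combine(0.30, 0.28)
print(round(h, 3))
```

The result is always at least as large as either input and never exceeds 1.0, and the order in which the supporting rules are combined does not matter.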

42
Fuzzy Logic
  • Prior to CFs, Zadeh introduced fuzzy logic to
    introduce shades of grey into logic
  • other logics are two-valued, true or false only
  • Here, any proposition can take on a value in the
    interval [0, 1]
  • Being a logic, Zadeh introduced the algebra to
    support logical operators of AND, OR, NOT, →
  • X AND Y = min(X, Y)
  • X OR Y = max(X, Y)
  • NOT X = (1 - X)
  • X → Y = X * Y
  • Where the values of X, Y are determined by where
    they fall in the interval [0, 1]
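
These operators are small enough to sketch directly. The function names and the sample truth values are assumptions for illustration; the implication operator follows the slide (multiplication), though other definitions of fuzzy implication exist.

```python
def f_and(x, y):
    return min(x, y)

def f_or(x, y):
    return max(x, y)

def f_not(x):
    return 1 - x

def f_implies(x, y):
    # Product form, as on the slide.
    return x * y

warm, dry = 0.3, 0.6   # hypothetical truth values in [0, 1]
print(f_and(warm, dry))
print(f_or(warm, dry))
print(round(f_not(dry), 2))
print(round(f_implies(warm, dry), 2))
```

With values restricted to 0 and 1, the AND, OR and NOT definitions reduce to ordinary two-valued logic.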

43
Fuzzy Set Theory
  • Fuzzy sets are to normal sets what fuzzy logic is
    to logic
  • fuzzy set theory is based on fuzzy values from
    fuzzy logic but includes set operations instead
    of logic operations
  • The basis for fuzzy sets is defining a fuzzy
    membership function for a set
  • a fuzzy set is a set of items along with their
    membership values in the set where the membership
    value defines how closely that item is to being
    in that set
  • Example: the set tall might be denoted as
  • tall = { x | f(x) = 1.0 if x > 6'2", .8 if x >
    6', .6 if x > 5'10", .4 if x > 5'8", .2 if x >
    5'6", 0 otherwise }
  • so we can say that a person is tall at .8 if they
    are 6'1", or we can say that the set of tall
    people is { Anne/.2, Bill/1.0, Chuck/.6, Fred/.8,
    Sue/.6 }
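
A minimal sketch of this step-shaped membership function. It assumes the thresholds are heights in feet and inches (6'2", 6', 5'10", 5'8", 5'6") with strict comparisons, and the people's heights below are invented so that they reproduce the listed memberships.

```python
def inches(feet, extra=0):
    return 12 * feet + extra

# (threshold in inches, membership) pairs, tallest first.
TALL_STEPS = [
    (inches(6, 2), 1.0),
    (inches(6), 0.8),
    (inches(5, 10), 0.6),
    (inches(5, 8), 0.4),
    (inches(5, 6), 0.2),
]

def tall(height_in):
    """Membership of a height (in inches) in the fuzzy set 'tall'."""
    for threshold, degree in TALL_STEPS:
        if height_in > threshold:
            return degree
    return 0.0

# Hypothetical heights, chosen only to illustrate the set notation.
people = {"Anne": inches(5, 7), "Bill": inches(6, 4), "Chuck": inches(5, 11),
          "Fred": inches(6, 1), "Sue": inches(6)}
tall_set = {name: tall(h) for name, h in people.items()}
print(tall_set)
```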

44
Fuzzy Membership Function
  • Typically, a membership function is a continuous
    function (often represented in a graph form like
    above)
  • given a value y, the membership value for y is
    u(y), determined by tracing the curve and seeing
    where it falls on the u(x) axis
  • How do we define a membership function?
  • this is an open question

45
Using Fuzzy Logic/Sets
  • 1. fuzzify the input(s) using fuzzy membership
    functions
  • 2. apply fuzzy logic rules to draw conclusions
  • we use the previous rules for AND, OR, NOT, →
  • 3. if conclusions are supported by multiple
    rules, combine the conclusions
  • like CF, we need a combining function, this may
    be done by computing a center of gravity using
    calculus
  • 4. defuzzify conclusions to get specific
    conclusions
  • defuzzification requires translating a numeric
    value into an actionable item
  • Fuzzy logic is often applied to domains where we
    can easily derive fuzzy membership functions and
    have a few rules but not a lot
  • fuzzy logic begins to break down when we have
    more than a dozen or two rules

46
Example
  • We have an atmospheric controller which can
    increase or decrease the temperature of the air
    and can increase or decrease the fan based on
    these simple rules
  • if air is warm and dry, decrease the fan and
    increase the coolant
  • if air is warm and not dry, increase the fan
  • if air is hot and dry, increase the fan and the
    increase the coolant slightly
  • if air is hot and not dry, increase the fan and
    coolant
  • if air is cold, turn off the fan and decrease the
    coolant
  • Our input obviously requires the air temperature
    and the humidity, the membership function for air
    temperature is shown to the right

If it is 60, it would be considered cold 0,
warm 1, hot 0; if it is 85, it would be cold 0,
warm .3 and hot .7
47
Continued
  • Temperature = 85, humidity indicates dry = .6
  • hot = .7, warm = .3, cold = 0, dry = .6, not dry =
    .4 (not dry = 1 - dry = 1 - .6)
  • Rule 1 has warm and dry
  • warm is .3, dry is .6, so warm and dry =
    min(.3, .6) = .3
  • Rule 2 has warm and not dry
  • min(.3, .4) = .3
  • Rule 3 has hot and dry: min(.7, .6) = .6
  • Rule 4 has hot and not dry: min(.7, .4) = .4
  • our fifth rule gives us 0 since cold is 0
  • Our conclusions from the first four rules are to
  • decrease the fan and increase the coolant at
    levels of .3
  • increase the fan at level of .3
  • increase the fan at .6 and increase the coolant
    slightly
  • increase the fan and the coolant at .4
  • To combine our results, we might increase the fan
    by .3 + .6 + .4 - .3 = 1.0 and increase the coolant
    (assume increase slightly means increase by ¼) by
    .3 + .6/4 + .4 = .85
  • Finally, we defuzzify increase the fan by 1.0 and
    increase the coolant by .85 to actionable amounts

48
Using Fuzzy Logic
  • The most common applications for fuzzy logic are
    for controllers
  • devices that, based on input, make minor
    modifications to their settings for instance
  • air conditioner controller that uses the current
    temperature, the desired temperature, and the
    number of open vents to determine how much to
    turn up or down the blower
  • camera aperture control (up/down, focus, negate a
    shaky hand)
  • a subway car for braking and acceleration
  • Fuzzy logic has been used for expert systems
  • but the systems tend to perform poorly when more
    than just a few rules are chained together
  • in our previous example, we just had 5
    stand-alone rules
  • when we chain rules, the fuzzy values are
    multiplied (e.g., .5 from one rule * .3 from
    another rule * .4 from another rule, our result
    is .06)

49
Dempster-Shafer Theory
  • The D-S Theory goes beyond CF and Fuzzy Logic by
    providing us two values to indicate the utility
    of a hypothesis
  • belief: as before, like the CF or fuzzy
    membership value
  • plausibility: adds to our belief by determining
    if there is any evidence (belief) for opposing
    the hypothesis
  • We want to know if h is a reasonable hypothesis
  • we have evidence in favor of h giving us a belief
    of .7
  • we have no evidence against h, this would imply
    that the plausibility is greater than the belief
  • p(h) = 1 - b(¬h) = 1 (since we have no evidence
    against h, b(¬h) = 0)
  • Consider two hypotheses, h1 and h2, where we have
    no evidence in favor of either, so b(h1) = b(h2)
    = .5
  • we have evidence that suggests ¬h2 is less
    believable than ¬h1 so that b(¬h2) = .3 and
    b(¬h1) = .5
  • h1 = [.5, .5] and h2 = [.5, .7] so h2 is more
    believable
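
A minimal sketch of the belief/plausibility pair, reading the plausibility as 1 minus the belief in the negated hypothesis. The function name and argument names are hypothetical.

```python
def belief_interval(belief_for, belief_against):
    """Return (belief, plausibility), where plausibility = 1 - b(not h)."""
    return (belief_for, 1 - belief_against)

h = belief_interval(0.7, 0.0)    # evidence for h, none against it
h1 = belief_interval(0.5, 0.5)
h2 = belief_interval(0.5, 0.3)

for name, (b, p) in (("h", h), ("h1", h1), ("h2", h2)):
    print(f"{name}: [{b:.1f}, {p:.1f}]")
```

The width of the interval is the amount of belief that is committed neither to the hypothesis nor to its negation.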

50
Computing Multiple Beliefs
  • D-S theory gives us a way to compute the belief
    for any number of subsets of the hypotheses, and
    modify the beliefs as new evidence is introduced
  • the formula to compute belief (given below) is a
    bit complex
  • so we present an example to better understand it
  • but the basic idea is this: we have a belief
    value for how well some piece of evidence
    supports a group (subset) of hypotheses
  • we introduce a new evidence and multiply the
    belief from the first with the belief in support
    of the new evidence for those hypotheses that are
    in the intersection of the two subsets
  • the denominator is used to normalize the computed
    beliefs, and is 1 unless the intersection
    includes some null subsets
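
For reference, the standard form of Dempster's rule of combination matches the description above. The symbols X, Y and Z are names chosen here for subsets of hypotheses; m1 and m2 are the two belief assignments being combined into m3.

```latex
m_3(Z) \;=\;
\frac{\displaystyle\sum_{X \cap Y = Z} m_1(X)\, m_2(Y)}
     {1 - \displaystyle\sum_{X \cap Y = \emptyset} m_1(X)\, m_2(Y)},
\qquad Z \neq \emptyset
```

The numerator multiplies the two beliefs for every pair of subsets whose intersection is Z. The denominator subtracts the products that landed on empty intersections, so it equals 1 when there are none.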

51
Example
  • There are four possible hypotheses for a given
    patient, cold (C), flu (F), migraine (H),
    meningitis (M)
  • we introduce a piece of evidence, m1 = fever,
    which supports {C, F, M} at .6
  • we also have Q (the entire set) with support 1
    - .6 = .4
  • now we add the evidence m2 = nausea, which can
    support {C, F, H} at .7 so that Q = .3
  • we combine the two sets of beliefs into m3 as
    follows

Since m3 has no empty sets, the denominator is 1,
so the set of values in m3 is already normalized
and we do not have to do anything else
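
Working the four products out from the supports stated above gives the following values. This is a reconstruction from the numbers in the bullets, assuming each pair of sets is multiplied and assigned to its intersection.

```latex
\begin{aligned}
m_3(\{C, F\})    &= m_1(\{C, F, M\}) \cdot m_2(\{C, F, H\}) = .6 \times .7 = .42 \\
m_3(\{C, F, M\}) &= m_1(\{C, F, M\}) \cdot m_2(Q)           = .6 \times .3 = .18 \\
m_3(\{C, F, H\}) &= m_1(Q) \cdot m_2(\{C, F, H\})           = .4 \times .7 = .28 \\
m_3(Q)           &= m_1(Q) \cdot m_2(Q)                     = .4 \times .3 = .12
\end{aligned}
```

The four values sum to 1, which agrees with the denominator being 1.
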
52
Continued
  • When we had m1, we had two sets, {C, F, M} and
    Q
  • When we combined it with m2 (with two sets of its
    own, {C, F, H} and Q), the result was four sets
  • the intersection of {C, F, M} and {C, F, H} = {C,
    F}
  • the intersection of {C, F, M} and Q = {C, F, M}
  • the intersection of {C, F, H} and Q = {C, F, H}
  • the intersection of Q and Q = Q
  • We now add evidence m4 = lab culture result that
    suggests Meningitis, with belief .8
  • m4{M} = .8 and m4{Q} = .2
  • In adding m4, with {M} and Q, we intersect
    these with the four intersected sets above which
    results in 8 sets
  • shown on the next slide, with some empty sets so
    our denominator will no longer be 1 and we will
    have to compute it after computing the numerators

53
End of Example
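Sum of empty sets = .336 + .224 = .56, so the
denominator is 1 - .56 = .44
m5{M} = (.096 + .144) / .44 = .545
m5{C, F, M} = .036 / .44 = .082
m5{C, F} = .084 / .44 = .191
m5{C, F, H} = .056 / .44 = .127
m5{Q} = .024 / .44 = .055
m5{} = 0 (the .336 + .224 = .56 that fell on empty
sets is removed by the normalization)
The most plausible explanation is Meningitis, but
more than half of the combined mass (.56) fell on
empty sets because the evidence tends to contradict
(some symptoms indicate Meningitis, another symptom
indicates no Meningitis)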
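
The whole example can be replayed with a short sketch of the combination step. It assumes the rule described earlier: multiply masses, pool them by set intersection, and divide by 1 minus the mass on empty sets. The function name combine and the use of frozenset are choices made here for illustration, and the sketch does not handle total conflict, where the denominator would be 0.

```python
from itertools import product

def combine(m_a, m_b):
    """Combine two mass assignments; return (normalized masses, conflict)."""
    pooled = {}
    for (x, mass_x), (y, mass_y) in product(m_a.items(), m_b.items()):
        z = x & y  # intersection of the two subsets
        pooled[z] = pooled.get(z, 0.0) + mass_x * mass_y
    conflict = pooled.pop(frozenset(), 0.0)  # mass that fell on empty sets
    normalized = {z: mass / (1.0 - conflict) for z, mass in pooled.items()}
    return normalized, conflict

Q = frozenset("CFHM")                      # the entire set of hypotheses
m1 = {frozenset("CFM"): 0.6, Q: 0.4}       # fever
m2 = {frozenset("CFH"): 0.7, Q: 0.3}       # nausea
m4 = {frozenset("M"): 0.8, Q: 0.2}         # lab culture

m3, conflict3 = combine(m1, m2)
m5, conflict5 = combine(m3, m4)

for subset, mass in sorted(m5.items(), key=lambda kv: -kv[1]):
    print("".join(sorted(subset)), round(mass, 3))
```

Here conflict3 is 0 (no empty intersections for m3) and conflict5 is .56, matching the denominators of 1 and .44 in the example.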
54
Bayesian Probabilities
  • Bayes derived the following formula
  • p(h | E) = p(E | h) * p(h) / sum for all i (p(E
    | hi) * p(hi))
  • the probability that h is true given evidence E
  • p(h | E): conditional probability
  • what is the probability that h is true given the
    evidence E
  • p(E | h): evidential probability
  • what is the probability that evidence E will
    appear if h is true?
  • p(h): prior probability (or a priori
    probability)
  • what is the probability that h is true in general
    without any evidence?
  • the denominator normalizes the conditional
    probabilities to add up to 1
  • To solve a problem with Bayesian probabilities
  • to compute p(h1 | E), p(h2 | E), p(h3 | E), ...
    for all hypotheses h1, h2, h3, ..., we need to
    accumulate p(E | h1), p(E | h2), p(E | h3), ...
    and p(h1), p(h2), p(h3), ..., and then it's just
    a straightforward series of calculations

55
Example
  • The sidewalk is wet, we want to determine the
    most likely cause
  • it rained overnight (h1)
  • we ran the sprinkler overnight (h2)
  • wet sidewalk (E)
  • Assume the following
  • there was a 50% chance of rain: p(h1) = .5
  • sprinkler is run two nights a week: p(h2) = 2/7
    = .28
  • p(wet sidewalk | rain overnight) = .8
  • p(wet sidewalk | sprinkler) = .9
  • Now we compute the two conditional probabilities
  • p(h1 | E) = (.5 * .8) / (.5 * .8 + .28 * .9)
    = .61
  • p(h2 | E) = (.28 * .9) / (.5 * .8 + .28 * .9)
    = .39
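
The same calculation as a small sketch, using the priors and evidential probabilities assumed in the slide. The function name posteriors and the dictionary keys are made up for this illustration.

```python
def posteriors(priors, likelihoods):
    """Apply Bayes' formula to every hypothesis and normalize."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())  # the normalizing denominator
    return {h: value / total for h, value in joint.items()}

result = posteriors(
    priors={"rain": 0.5, "sprinkler": 0.28},
    likelihoods={"rain": 0.8, "sprinkler": 0.9},  # p(wet sidewalk | cause)
)
print({h: round(p, 2) for h, p in result.items()})
# {'rain': 0.61, 'sprinkler': 0.39}
```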

56
Independent Events
  • There is a flaw with our previous example
  • if it is likely that it will rain, we will
    probably not run the sprinkler even if it is the
    night we usually run it, and if it does not rain,
    we will probably be more likely to run the
    sprinkler the next night
  • So we have to be aware of whether events are
    independent or not
  • two events are independent if P(A ∩ B) = P(A) *
    P(B)
  • where ∩ means intersect
  • when P(B) ≠ 0, then P(A) = P(A | B)
  • knowing B is true does not affect the probability
    of A being true
  • We can also modify our computation by using the
    formula for conditionally independent events
  • P(A ∩ B | C) = P(A | C) * P(B | C)
  • again, ∩ is used to mean intersection
  • we will expand on this shortly

57
Multiple Pieces of Evidence
  • In our wet sidewalk example, E consisted of one
    piece of evidence, wet sidewalk
  • what if we have many pieces of evidence?
  • Consider a diagnostic case where there are 10
    possible symptoms that we might look for to
    determine whether a patient has a cold (h1), flu
    (h2) or sinus infection (h3)
  • E is some subset of {e1, e2, e3, e4, e5, e6, e7,
    e8, e9, e10}
  • To use Bayes' formula, we need to know
  • p(h1), p(h2), p(h3) as well as
  • p(e1 | h1), p(e1 | h2), p(e1 | h3)
  • p(e2 | h1), p(e2 | h2), p(e2 | h3)
  • p(e3 | h1), p(e3 | h2), p(e3 | h3)

58
Continued
  • But our patient may have several symptoms
  • So we also need
  • p(e1, e2 | h1), p(e1, e2 | h2), p(e1, e2 | h3)
  • p(e1, e3 | h1), p(e1, e3 | h2), p(e1, e3 | h3)
  • p(e2, e3 | h1), p(e2, e3 | h2), p(e2, e3 | h3)
  • p(e1, e2, e3 | h1), p(e1, e2, e3 | h2), p(e1, e2,
    e3 | h3)
  • How many different probabilities will we need?
  • with 10 pieces of evidence, there are 2^10 = 1024
    different combinations for E, so we will need 3
    * 1024 = 3072 evidential probabilities (to go
    along with the 3 prior probabilities, one for
    each hypothesis)
  • imagine if E comprised a set of 50 pieces of
    evidence instead!
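
To get a feel for the growth, the count can be computed directly. This is a small sketch of the arithmetic above; the function name evidential_probabilities is hypothetical.

```python
def evidential_probabilities(num_evidence, num_hypotheses):
    """One probability per (combination of evidence, hypothesis) pair."""
    return num_hypotheses * 2 ** num_evidence

print(evidential_probabilities(10, 3))  # 3072
print(evidential_probabilities(50, 3))  # 3377699720527872
```

With 50 pieces of evidence and the same 3 hypotheses, the table would need over 3 quadrillion entries.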

59
Bayesian Net
  • We can apply the Bayesian formulas for
    independent and conditionally dependent events in
    a network form
  • we want to determine the likely cause for seeing
    orange barrels, flashing lights and bad traffic
    on the highway
  • two hypotheses: construction and accident (see
    the figure below)
  • notice T (bad traffic) can be caused by either
    construction or an accident, orange barrels are
    only evidence of construction and flashing lights
    are only evidence of an accident (although it
    could also be that a driver has been pulled over)
  • construction and accident are not directly
    related to each other; this will help simplify
    the problem
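
A rough sketch of how such a network could be evaluated by enumerating the two causes. The structure follows the bullets above (construction causes barrels and traffic, accident causes lights and traffic, and the two causes are independent). Every number in the tables is hypothetical and chosen only for illustration, as are all the names.

```python
from itertools import product

# Hypothetical probabilities, not taken from the slides.
P_CONSTRUCTION = 0.1
P_ACCIDENT = 0.05
P_BARRELS = {True: 0.9, False: 0.05}   # P(orange barrels | construction)
P_LIGHTS = {True: 0.8, False: 0.1}     # P(flashing lights | accident)
P_TRAFFIC = {                          # P(bad traffic | construction, accident)
    (True, True): 0.95, (True, False): 0.6,
    (False, True): 0.7, (False, False): 0.1,
}

def prob(p_true, value):
    """Probability of a boolean value given the probability of True."""
    return p_true if value else 1.0 - p_true

def joint(c, a, barrels, lights, traffic):
    """Joint probability of one full assignment of the five variables."""
    return (prob(P_CONSTRUCTION, c) * prob(P_ACCIDENT, a)
            * prob(P_BARRELS[c], barrels) * prob(P_LIGHTS[a], lights)
            * prob(P_TRAFFIC[(c, a)], traffic))

def cause_posteriors(barrels, lights, traffic):
    """Return (P(construction | evidence), P(accident | evidence))."""
    weights = {(c, a): joint(c, a, barrels, lights, traffic)
               for c, a in product([True, False], repeat=2)}
    total = sum(weights.values())
    p_c = sum(w for (c, _), w in weights.items() if c) / total
    p_a = sum(w for (_, a), w in weights.items() if a) / total
    return p_c, p_a

# Barrels and bad traffic, but no flashing lights.
p_construction, p_accident = cause_posteriors(True, False, True)
print(p_construction > p_accident)  # True
```

Because each effect depends only on its own causes, the sketch needs 2 priors and 8 conditional entries instead of a probability for every combination of evidence.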

60
Dynamic Bayesian Networks
  • Cause-effect situations are temporal
  • at time i, an event arises and causes an event at
    time i+1
  • the Bayesian belief network is static; it
    captures a situation at a single point in time
  • we need a dynamic network instead
  • The dynamic Bayesian network is similar to our
    previous networks except that each edge
    represents not merely a dependency, but a
    temporal change
  • when you take the branch from state i to state
    i+1, you are not only indicating that state i can
    cause i+1 but that i was at a time prior to i+1

Here is a state diagram that represents possible
utterances for the word "tomato". Each node
represents both a sound and a segment of time.