Title: CS4018 Formal Models of Computation, weeks 20-23: Computability and Complexity
Slide 1: CS4018 Formal Models of Computation, weeks 20-23
Computability and Complexity
- Kees van Deemter
- (partly based on lecture notes by Dirk Nikodem)
Slide 2: Fourth set of slides: Generating Referring Expressions
- The GRE game; GRE as part of NLG
- Grice's maxims
- (Complexity of) Full Brevity
- (Complexity of) the Incremental Algorithm
- Complexity can be measured in different ways
- N.B. This topic is not covered in the Lecture
Notes
Slide 3: Let's play a game
- Desks: b, c; Chairs: a, e; Sofas: f
- Leather: b; Wood: a, f
- Blue: c, d; Red: a, e
- Please write down how a speaker of English might describe each of a, b, …, f (six Noun Phrases).
Slide 4: For example
- Desks: b, c; Chairs: a, e; Sofas: f
- Leather: b; Wood: a, f
- Blue: c, d; Red: a, e
- a: the wooden chair, the red chair, the red wooden thing
- b: the leather desk, the leather (?), the leather object
- c: the blue desk
- d: ? the blue thing that's not a desk
- e: the red chair
- f: the sofa
Slide 5: The game is called Generation of Referring Expressions
- Referring Expression: a string of words that identifies an object uniquely
- Also called a distinguishing description of the target referent r
- The other elements of the domain are distractors
- Formally, a distinguishing description is a set of properties P1, …, Pn such that P1 ∩ … ∩ Pn = {r}
- Assumption: speaker and hearer share the same facts (a shared knowledge base)
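- As an illustration (not part of the original slides), here is a minimal Python sketch of this formal condition, using the furniture domain from slide 3. The names PROPERTIES and distinguishes() are illustrative choices, not from the slides: a description distinguishes r iff the intersection of the property extensions is exactly {r}.

    from functools import reduce

    # Extensions of the properties in the slide-3 domain.
    PROPERTIES = {
        "desk":    {"b", "c"},
        "chair":   {"a", "e"},
        "sofa":    {"f"},
        "leather": {"b"},
        "wood":    {"a", "f"},
        "blue":    {"c", "d"},
        "red":     {"a", "e"},
    }

    def distinguishes(description, r, properties):
        """True iff the properties in `description` jointly identify r uniquely."""
        extensions = [properties[p] for p in description]
        return reduce(set.intersection, extensions) == {r}

    print(distinguishes({"wood", "chair"}, "a", PROPERTIES))  # True: "the wooden chair"
    print(distinguishes({"red"}, "a", PROPERTIES))            # False: also fits e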
Slide 6: Generation of Referring Expressions
- Part of a larger area of research and applications: Natural Language Generation (NLG)
- Natural Language = ordinary language (e.g., English, Dutch, …)
- We want NLG to produce natural English, that is, the kind of English that a native speaker would use
Slide 7: What's the most natural referring expression?
- We've seen that this is not always easy
- An important linguist, Paul Grice, formulated principles underlying conversation, called the Gricean maxims. E.g., don't use more words than necessary
- For this and what follows, read the early sections of Dale & Reiter 1995
- Grice's work was informal, and can be understood/applied in different ways
Slide 8: Most literal interpretation of Grice
- Full Brevity algorithm: use the shortest description of r that is still a distinguishing description of r
- N.B. This is a slight simplification, since we'll be counting properties, not words
Slide 9: Most literal interpretation of Grice
- Full Brevity algorithm: use the shortest description
- Search space: all sets of properties
- If the language has properties P1, …, Pm, then the search space is Powerset({P1, …, Pm})
- This search space grows exponentially in m, since |Powerset(X)| = 2^|X|. In this case, |Powerset({P1, …, Pm})| = 2^m
Slide 10: Full Brevity
- Any algorithm achieving Full Brevity has to find the solution in this exponentially growing search space
- It does not automatically follow that such an algorithm must have exponential time complexity (in the worst case): maybe there exist smart algorithms that can skip some parts of the search space
Slide 11: Full Brevity
- First published algorithm (by R. Dale)
Slide 12:
- List all properties P1, P2, …, Pm. Go through the list until a distinguishing description is found (then the shortest has been found!) or until the end of the list is reached
- List all sets of two properties {Pi, Pj}. Go through the list until a distinguishing description is found (then the shortest has been found!) or until the end of the list is reached
- … and so on …
- Finally, list the set containing all of P1, P2, …, Pm. If a distinguishing description is found, the shortest has been found! If the end is reached without success, no distinguishing description exists
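- A minimal Python sketch of this search (an illustration, not Dale's original code), reusing PROPERTIES and distinguishes() from the sketch after slide 5. Enumerating property sets in order of increasing size guarantees that the first distinguishing set found is a shortest one; in the worst case all 2^m − 1 non-empty subsets are inspected.

    from itertools import combinations

    def full_brevity(r, properties):
        """Return a shortest distinguishing description of r, or None."""
        names = list(properties)
        for size in range(1, len(names) + 1):           # smallest sets first
            for subset in combinations(names, size):
                if distinguishes(set(subset), r, properties):
                    return set(subset)                  # first hit = shortest
        return None                                     # none exists

    print(full_brevity("a", PROPERTIES))                # e.g. {'chair', 'wood'}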
Slide 13: What do you think of this algorithm?
Slide 14: What do you think of this algorithm?
- It seems smart, only trying larger descriptions if shorter descriptions don't lead to a distinguishing description
- Yet, the algorithm is exponential: smartness does not affect the worst case
- In fact, this is very easy to see: the worst case arises if {P1, P2, …, Pm} is the only description that distinguishes r. In this case, all sets of properties are visited, hence we have our |Powerset({P1, …, Pm})| = 2^m again
Slide 15: Let's look at the complexity assessment in Dale & Reiter
- Choosing x out of m is a familiar problem in combinatorics: C(m, x) = m! / (x! · (m−x)!)
- We divide by (m−x)! because you're interested in only the first x factors of m!
- We divide by x! because otherwise you're counting all permutations of the set of properties as distinct
Slide 16: Time-complexity of the entire algorithm
- Dale & Reiter do not directly calculate the worst-case complexity, but the complexity when the shortest distinguishing description contains x properties
- If m >> x then this is roughly proportional to m^x,
- so this is still exponential (in x)
Slide 17: For example
- If x = 3 and m = 10, then check 175 combinations
- If x = 4 and m = 20, then check about 6,000 combinations
- If x = 5 and m = 50, then check about 2,000,000 combinations
- (By assuming that x << m, D&R assume that the worst case never arises)
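- These figures can be reproduced with a short calculation (a sketch; the slide's last two figures are rounded). At stage x, the algorithm has inspected every set of at most x properties, i.e. C(m,1) + C(m,2) + … + C(m,x) sets:

    from math import comb   # comb(m, x) = m! / (x! * (m - x)!)

    def sets_inspected(m, x):
        """Number of property sets of size <= x drawn from m properties."""
        return sum(comb(m, size) for size in range(1, x + 1))

    print(sets_inspected(10, 3))   # 175
    print(sets_inspected(20, 4))   # 6195     (slide rounds to 6,000)
    print(sets_inspected(50, 5))   # 2369935  (slide rounds to 2,000,000)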
Slide 18:
- This was Dale & Reiter's first finding. What would you conclude?
Slide 19: Some options
- Find a faster algorithm. This is probably impossible: a proof by reduction shows that the problem is NP-complete. (Please take this on faith)
- Give up and make do with this algorithm. After all, if no really faster algorithm exists, why exert yourself looking for one?
- (Any ideas?)
Slide 20: Dale and Reiter's response
- The experimental literature had shown that human speakers do not adhere to Full Brevity
- One example: 'the leather …'
- Some properties are so striking that they are always tried first
- Once a property has been recognized as useful (because it removes some distractors), it is included
- Putting it simply: people talk before they are finished thinking
Slide 21: Their algorithm makes use of these insights
- They don't say Grice was wrong, but: let's understand Grice differently
- The idea is to approximate brevity, without always achieving it
- Approximation is always possible, but in this case the facts about natural language seem to say that the approximation is the real thing!
Slide 22: Sketch of the Incremental Algorithm
- Let Prop be a list of properties, going from most striking to least striking
- You go through the list, asking of each property: does it remove any distractors?
- If a property P removes distractors, then include it in the description set
- When adding a property to the description set, keep track of how large a set of referents you're describing. (This set gets smaller)
- If, at any stage, the set of described referents = {r}, then success. If the end of Prop is reached, then fail.
Slide 23:
- r = target referent; L = description set
- C = set of described referents
- Prop = ordered list of properties
- D = Domain
- C := D
- For each P ∈ Prop do
-     If r ∈ P and not (C ⊆ P)   (P is useful!)
-     Then L := L ∪ {P}   (add P to the description)
-          C := C ∩ P   (reduce the set of described referents)
-          If C = {r} then return L
- Return failure
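- A runnable Python version of this pseudocode (a sketch under the same assumptions, reusing PROPERTIES from the slide-5 sketch; the preference order below is an arbitrary choice for illustration):

    def incremental(r, prop, domain, properties):
        """Dale & Reiter's Incremental Algorithm (sketch)."""
        L = set()            # description under construction
        C = set(domain)      # referents the description so far could describe
        for P in prop:       # prop: most preferred property first
            ext = properties[P]
            if r in ext and not C <= ext:   # useful: true of r, removes distractors
                L.add(P)
                C = C & ext
                if C == {r}:
                    return L                # success: r uniquely identified
        return None                         # failure: end of prop reached

    DOMAIN = {"a", "b", "c", "d", "e", "f"}
    ORDER = ["red", "blue", "leather", "wood", "desk", "chair", "sofa"]
    print(incremental("a", ORDER, DOMAIN, PROPERTIES))  # {'red', 'wood'}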
Slide 24: Properties of the algorithm
- Hill-climbing algorithms are well known in AI
- No more properties are included than needed, but there is no backtracking, so descriptions are not always minimal
- Can you analyse the algorithm in terms of time-complexity?
- The key operation is the usefulness check
Slide 25:
- Worst-case time-complexity is good!
- Worst case, every property in Prop has to be checked (that's m times)
- For every property, you have to check its behaviour with respect to every element of C (i.e., r and all remaining distractors)
- The remaining distractors get fewer and fewer. Worst case, you remove one distractor at a time, so …
Slide 26: Complexity of the Incremental Algorithm
- m·d + m·(d−1) + m·(d−2) + … + m·1
- The number of checks is therefore m·d·(d+1)/2, roughly ½·m·d², which is clearly polynomial: O(m·d²)
- Dale & Reiter arrive at a slightly different figure. Reason: they want to steer away from worst-case complexity
- Instead, they calculate expected complexity. The basic conclusion is the same: the algorithm is polynomial
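- A quick numerical check of this sum (illustration only): m·d + m·(d−1) + … + m·1 equals m·d·(d+1)/2, i.e. roughly ½·m·d².

    def worst_case_checks(m, d):
        """m*d + m*(d-1) + ... + m*1: one distractor removed per pass."""
        return sum(m * remaining for remaining in range(d, 0, -1))

    m, d = 7, 100
    assert worst_case_checks(m, d) == m * d * (d + 1) // 2
    print(worst_case_checks(m, d))   # 35350, close to (1/2) * 7 * 100**2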
Slide 27: Alternative analyses
- This analysis assumes that the time needed for checking whether x ∈ P is constant
- If one can check in constant time whether or not C ⊆ P, then an even simpler analysis is possible: O(m)
- Which analysis is best is often a difficult question (but note that, in this case, one is a refinement of the other)
Slide 28: End of the NLG example
- The Incremental Algorithm has proven to be quite seminal: harder problems have been attacked along similar lines
- Assessments of the time-complexity of algorithms are not only provided; they have played a key role for researchers in preferring one algorithm over another
- Do have a read!
Slide 29: Issues
- An algorithm is made for a purpose. Whether an approximation is acceptable depends on this purpose
- There is not necessarily always one correct way of measuring complexity
- Worst-case, best-case, average, and expected complexity
- Which factors deserve to be modelled as variables? For example, …
Slide 30: Issues
- Suppose human speakers never utter descriptions containing more than 4 properties. Then the parameter x in m^x is replaced by the constant 4, and the problem becomes polynomial!
- The same holds for x = 100
- So, is the algorithm polynomial or exponential? It depends on what exactly you try to model. If you want to cover descriptions of any length, then length is a variable
- Never take complexity assessments at face value!
Slide 31: Complexity
- Back to the cartoons in Garey & Johnson (1979):
- (1) 'I can't find an efficient solution. I guess I'm too dumb.'
- (2) 'I can't find an efficient solution. No efficient solution exists.'
- (3) 'I can't find an efficient solution, but neither can all these famous people.'
- Closing observations:
- (2) is beyond the present state of the art in computer science; (3) is only a substitute
- The cartoons are easily adapted to illustrate computability too. For computability, (2) is often possible!