Colin de la Higuera - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
Grammatical inference techniques and algorithms
Colin de la Higuera
2
Acknowledgements
  • Laurent Miclet, Tim Oates, Jose Oncina, Rafael
    Carrasco, Paco Casacuberta, Pedro Cruz, Rémi
    Eyraud, Philippe Ezequel, Henning Fernau,
    Jean-Christophe Janodet, Thierry Murgue, Frédéric
    Tantini, Franck Thollard, Enrique Vidal,...
  • and a lot of other people to whom I am grateful

3
Outline
  • 1 An introductory example
  • 2 About grammatical inference
  • 3 Some specificities of the task
  • 4 Some techniques and algorithms
  • 5 Open issues and questions

4
1 How do we learn languages?
  • A very simple example

5
The problem
  • You are in an unknown city and have to eat.
  • You therefore go to some selected restaurants.
  • Your goal is therefore to build a model of the
    city (a map).

6
The data
  • Up Down Right Left Left → Restaurant
  • Down Down Right → Not a restaurant
  • Left Down → Restaurant

7
Hopefully something like this
[Automaton diagram: states labeled R (restaurant) and N (not a restaurant), with transitions on u, d, l, r]
9
Further arguments (1)
  • How did we get hold of the data?
  • Random walks
  • Following someone
  • someone knowledgeable
  • Someone trying to lose us
  • Someone on a diet
  • Exploring

10
Further arguments (2)
  • Can we not have better information (for example
    the names of the restaurants)?
  • But then we may only have the information about
    the routes to restaurants (not to the non
    restaurants)

11
Further arguments (3)
  • What if instead of getting the information
    "Elimo" or "restaurant", I get the information
    "good meal" or "7/10"?

Reinforcement learning / POMDPs
12
Further arguments (4)
  • Where is my algorithm to learn these things?
  • Should I perhaps consider several algorithms for
    the different types of data?

13
Further arguments (5)
  • What can I say about the result?
  • What can I say about the algorithm?

14
Further arguments (6)
  • What if I want something richer than an
    automaton?
  • A context-free grammar
  • A transducer
  • A tree automaton

15
Further arguments (7)
  • Why do I want something as rich as an automaton?
  • What about
  • A simple pattern?
  • Some SVM obtained from features over the strings?
  • A neural network that would allow me to know if
    some path will bring me or not to a restaurant,
    with high probability?

16
Our goal/idea
  • The ancient Greeks
  • The whole is more than the sum of its parts
  • Gestalt theory
  • The whole is different from the sum of its parts

17
Better said
  • There are cases where the data cannot be analyzed
    by considering it in bits
  • There are cases where intelligibility of the
    pattern is important

18
What do people know about formal language theory?
Nothing
Lots
19
A small reminder on formal language theory
  • The Chomsky hierarchy of languages
  • and of grammars

20
A crash course in Formal language theory
  • Symbols
  • Strings
  • Languages
  • Chomsky hierarchy
  • Stochastic languages

21
Symbols
  • are taken from some alphabet Σ

Strings
are sequences of symbols from Σ
22
Languages
  • are sets of strings over Σ

Languages
are subsets of Σ*
23
Special languages
  • Are recognised by finite state automata
  • Are generated by grammars

24
DFA Deterministic Finite State Automaton
25
[DFA diagram over the alphabet {a, b}; the string abab is accepted: abab ∈ L]
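The acceptance test a DFA performs can be sketched in a few lines (a generic illustration; the two-state automaton below, accepting strings with an even number of a's, is an assumed example, not the one drawn on the slide):

```python
def accepts(delta, start, finals, w):
    """Return True iff the DFA accepts string w."""
    q = start
    for c in w:
        q = delta.get((q, c))
        if q is None:          # missing transition: reject
            return False
    return q in finals

# Hypothetical example: even number of a's over {a, b}.
delta = {(0, 'a'): 1, (0, 'b'): 0, (1, 'a'): 0, (1, 'b'): 1}
print(accepts(delta, 0, {0}, "abab"))  # True: "abab" contains two a's
```

The run is linear in the length of the string, which is the "Level 3: linear" membership bound mentioned later in the talk.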
26
What is a context free grammar?
  • A 4-tuple (Σ, S, V, P) such that
  • Σ is the alphabet
  • V is a finite set of non-terminals
  • S ∈ V is the start symbol
  • P ⊂ V × (V∪Σ)* is a finite set of rules.

27
Example of a grammar
  • The Dyck1 grammar
  • (Σ, S, V, P)
  • Σ = {a, b}
  • V = {S}
  • P = {S → aSbS, S → ε}
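For this particular grammar membership can be decided with a simple counter, exploiting the bracket structure of Dyck1 rather than general context-free parsing (a sketch, not part of the original slides):

```python
def in_dyck1(w):
    """Membership test for the language of S -> aSbS | eps:
    well-balanced strings with 'a' as open and 'b' as close bracket."""
    depth = 0
    for c in w:
        depth += 1 if c == 'a' else -1
        if depth < 0:          # a 'b' with no matching 'a'
            return False
    return depth == 0

print([w for w in ("", "ab", "aabb", "abab", "ba", "aab") if in_dyck1(w)])
# ['', 'ab', 'aabb', 'abab']
```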

28
Derivations and derivation trees
S
  • S ⇒ aSbS
  • ⇒ aaSbSbS
  • ⇒ aabSbS
  • ⇒ aabbS
  • ⇒ aabb

[Derivation tree: the root S has children a, S, b, S; the inner S again expands to a, S, b, S, and the remaining occurrences of S derive ε]
29
Chomsky Hierarchy
  • Level 0 no restriction
  • Level 1 context-sensitive
  • Level 2 context-free
  • Level 3 regular

30
Chomsky Hierarchy
  • Level 0 whatever Turing machines can do
  • Level 1
  • {aⁿbⁿcⁿ : n ∈ ℕ}
  • {aⁿbᵐcⁿdᵐ : n,m ∈ ℕ}
  • {uu : u ∈ Σ*}
  • Level 2 context-free
  • {aⁿbⁿ : n ∈ ℕ}
  • brackets
  • Level 3 regular
  • Regular expressions (GREP)

31
The membership problem
  • Level 0 undecidable
  • Level 1 decidable
  • Level 2 polynomial
  • Level 3 linear

32
The equivalence problem
  • Level 0 undecidable
  • Level 1 undecidable
  • Level 2 undecidable
  • Level 3 polynomial, but only when the
    representation is a DFA.

33
[PFA diagram over {a, b}]
PFA Probabilistic Finite (state) Automaton
34
[DPFA diagram over {a, b}, with transition and stopping probabilities]
DPFA Deterministic Probabilistic Finite (state)
Automaton
35
What is nice with grammars?
  • Compact representation
  • Recursivity
  • Says how a string belongs, not just if it belongs
  • Graphical representations (automata, parse trees)

36
What is not so nice with grammars?
  • Even the easiest class (level 3) contains SAT,
    Boolean functions, parity functions
  • Noise is very harmful
  • Think about adding edit noise to a parity
    language such as {w : |w|a ≡ 0 mod 2}

37
2 Specificities of grammatical inference
  • Grammatical inference consists (roughly) in
    finding the (a) grammar or automaton that has
    produced a given set of strings (sequences,
    trees, terms, graphs).

38
The field
Inductive Inference
Pattern Recognition
Machine Learning
Grammatical Inference
Computational linguistics
Computational biology
Web technologies
39
The data
  • Strings, trees, terms, graphs
  • Structural objects
  • Basically the same gap of information as in
    programming between tables/arrays and data
    structures

40
Alternatives to grammatical inference
  • 2 steps
  • Extract features from the strings
  • Use a very good method over ℝⁿ.

41
Examples of strings
  • A string in Gaelic and its translation to
    English
  • Tha thu cho duaichnidh ri èarr àirde de a
    coisich deas damh
  • You are as ugly as the north end of a southward
    traveling ox

44
>A BAC41M14 LIBRARY CITB_978_SKB AAGCTTATTCAATAGT
TTATTAAACAGCTTCTTAAATAGGATATAAGGCAGTGCCATGTA GTGGA
TAAAAGTAATAATCATTATAATATTAAGAACTAATACATACTGAACACTT
TCAAT GGCACTTTACATGCACGGTCCCTTTAATCCTGAAAAAATGCTAT
TGCCATCTTTATTTCA GAGACCAGGGTGCTAAGGCTTGAGAGTGAAGCC
ACTTTCCCCAAGCTCACACAGCAAAGA CACGGGGACACCAGGACTCCAT
CTACTGCAGGTTGTCTGACTGGGAACCCCCATGCACCT GGCAGGTGACA
GAAATAGGAGGCATGTGCTGGGTTTGGAAGAGACACCTGGTGGGAGAGG
GCCCTGTGGAGCCAGATGGGGCTGAAAACAAATGTTGAATGCAAGAAAAG
TCGAGTTCCA GGGGCATTACATGCAGCAGGATATGCTTTTTAGAAAAAG
TCCAAAAACACTAAACTTCAA CAATATGTTCTTTTGGCTTGCATTTGTG
TATAACCGTAATTAAAAAGCAAGGGGACAACA CACAGTAGATTCAGGAT
AGGGGTCCCCTCTAGAAAGAAGGAGAAGGGGCAGGAGACAGGA TGGGGA
GGAGCACATAAGTAGATGTAAATTGCTGCTAATTTTTCTAGTCCTTGGTT
TGAA TGATAGGTTCATCAAGGGTCCATTACAAAAACATGTGTTAAGTTT
TTTAAAAATATAATA AAGGAGCCAGGTGTAGTTTGTCTTGAACCACAGT
TATGAAAAAAATTCCAACTTTGTGCA TCCAAGGACCAGATTTTTTTTAA
AATAAAGGATAAAAGGAATAAGAAATGAACAGCCAAG TATTCACTATCA
AATTTGAGGAATAATAGCCTGGCCAACATGGTGAAACTCCATCTCTAC T
AAAAATACAAAAATTAGCCAGGTGTGGTGGCTCATGCCTGTAGTCCCAGC
TACTTGCGA GGCTGAGGCAGGCTGAGAATCTCTTGAACCCAGGAAGTAG
AGGTTGCAGTAGGCCAAGAT GGCGCCACTGCACTCCAGCCTGGGTGACA
GAGCAAGACCCTATGTCCAAAAAAAAAAAAA AAAAAAAGGAAAAGAAAA
AGAAAGAAAACAGTGTATATATAGTATATAGCTGAAGCTCCC TGTGTAC
CCATCCCCAATTCCATTTCCCTTTTTTGTCCCAGAGAACACCCCATTCCT
GAC TAGTGTTTTATGTTCCTTTGCTTCTCTTTTTAAAAACTTCAATGCA
CACATATGCATCCA TGAACAACAGATAGTGGTTTTTGCATGACCTGAAA
CATTAATGAAATTGTATGATTCTAT
48
<book> <part> <chapter>
<sect1/> <sect1>
<orderedlist numeration="arabic">
<listitem/> <ffragbody/>
</orderedlist> </sect1>
</chapter> </part> </book>
49
<?xml version="1.0"?>
<?xml-stylesheet href="carmen.xsl" type="text/xsl"?>
<?cocoon-process type="xslt"?>
<!DOCTYPE pagina [
  <!ELEMENT pagina (titulus?, poema)>
  <!ELEMENT titulus (#PCDATA)>
  <!ELEMENT auctor (praenomen, cognomen, nomen)>
  <!ELEMENT praenomen (#PCDATA)>
  <!ELEMENT nomen (#PCDATA)>
  <!ELEMENT cognomen (#PCDATA)>
  <!ELEMENT poema (versus)>
  <!ELEMENT versus (#PCDATA)>
]>
<pagina><titulus>Catullus II</titulus>
<auctor><praenomen>Gaius</praenomen><nomen>Valerius</nomen><cognomen>Catullus</cognomen></auctor>
51
A logic program learned by GIFT
color_blind(Arg1) :- start(Arg1,X), p11(Arg1,X).
start(X,X).
p11(Arg1,P) :- mother(M,P), p4(Arg1,M).
p4(Arg1,X) :- woman(X), father(F,X), p3(Arg1,F).
p4(Arg1,X) :- woman(X), mother(M,X), p4(Arg1,M).
p3(Arg1,X) :- man(X), color_blind(X).
52
3 Hardness of the task
  • One thing is to build algorithms; another is to
    be able to state that they work.
  • Some questions
  • Does this algorithm work?
  • Do I have enough learning data?
  • Do I need some extra bias?
  • Is this algorithm better than the other?
  • Is this problem easier than the other?

53
Alternatives to answer these questions
  • Use widely accepted benchmarks
  • Build your own benchmarks
  • Solve a real problem
  • Prove things

54
Use widely accepted benchmarks
  • yes : allows us to compare
  • no : many parameters
  • problem : difficult to do better (also, in GI,
    there are not that many benchmarks about!)

55
Build your own benchmarks
  • yes : allows us to progress
  • no : playing against oneself
  • problem : one invents the benchmark where one is
    best!

56
Solve a real problem
  • yes : it is the final goal
  • no : we don't always know why things work
  • problem : how much pre-processing?

57
Theory
  • Because you may want to be able to say something
    more than "it seems to work in practice".

58
Identification in the limit
[Diagram: a class of languages 𝓛 and a class of grammars 𝓖; a presentation of L is a function f : ℕ → X with yields(f) = L; a learner α maps presentations to grammars, and the naming function satisfies L(α(f)) = yields(f), with f(ℕ) = g(ℕ) ⇒ yields(f) = yields(g)]
f(?)g(?) ?yields(f)yields(g)
59
L is identifiable in the limit in terms of G from
Pres iff ∀L ∈ 𝓛, ∀f ∈ Pres(L), the sequence of
hypotheses h1, h2, ..., hn, ... produced on
f1, f2, ..., fn, ... converges: ∃n such that
∀i ≥ n, hi = hn and L(hi) = L
60
  •      He did not want to compose another Quixote
    (which would be easy) but the Quixote. Needless
    to add that he never contemplated a mechanical
    transcription of the original; he did not
    propose to copy it. His admirable ambition was
    to produce pages that would coincide, word for
    word and line for line, with those of Miguel de
    Cervantes.
  •
  • "My undertaking is not essentially difficult",
    I read elsewhere in the letter. "It would be
    enough for me to be immortal to carry it out."
  • Jorge Luis Borges (1899-1986)
  • Pierre Menard, autor del Quijote (El jardín de
    senderos que se bifurcan), Ficciones

61
4 Algorithmic ideas
62
The space of GI problems
  • Type of input (strings)
  • Presentation of input (batch)
  • Hypothesis space (subset of the regular grammars)
  • Success criteria (identification in the limit)

63
Types of input
64
Types of input - oracles
  • Membership queries
  • Is string S in the target language?
  • Equivalence queries
  • Is my hypothesis correct?
  • If not, provide counter example
  • Subset queries
  • Is the language of my hypothesis a subset of the
    target language?

65
Presentation of input
  • Arbitrary order
  • Shortest to longest
  • All positive and negative examples up to some
    length
  • Sampled according to some probability distribution

66
Presentation of input
  • Text presentation
  • A presentation of all strings in the target
    language
  • Complete presentation (informant)
  • A presentation of all strings over the alphabet
    of the target language labeled as + or −

67
Hypothesis space
  • Regular grammars
  • A welter of subclasses
  • Context free grammars
  • Fewer subclasses
  • Hyper-edge replacement graph grammars

68
Success criteria
  • Identification in the limit
  • Text or informant presentation
  • After each example, learner guesses language
  • At some point, guess is correct and never changes
  • PAC learning

69
Theorems due to Gold
  • The good news
  • Any recursively enumerable class of languages can
    be learned in the limit from an informant (Gold,
    1967)
  • The bad news
  • A language class is superfinite if it includes
    all finite languages and at least one infinite
    language
  • No superfinite class of languages can be learned
    in the limit from a text (Gold, 1967)
  • That includes regular and context-free

70
A picture
[Diagram ordering settings by information available, from most to least: mildly context-sensitive from queries; DFA from queries; DFA from pos + neg; context-free from pos; sub-classes of regular from pos. The axis runs from rich languages with a lot of information down to poor languages with little information]
71
Algorithms
RPNI
K-Reversible
L*
SEQUITUR
GRIDS
72
4.1 RPNI
  • Regular Positive and Negative Grammatical
    Inference
  • Identifying regular languages in polynomial time
  • Jose Oncina & Pedro García, 1992

73
  • It is a state-merging algorithm
  • It identifies any regular language in the limit
  • It works in polynomial time
  • It admits polynomial characteristic sets.

74
The algorithm
  • function rmerge(A,p,q)
  • A := merge(A,p,q)
  • while ∃a ∈ Σ, ∃r : p,q ∈ δA(r,a), p ≠ q do
  • rmerge(A,p,q)

75
  • A := APTA(X+) ; Fr := {δ(q0,a) : a ∈ Σ}
  • K := {q0}
  • While Fr ≠ ∅ do
  • choose q from Fr
  • if ∃p ∈ K : L(rmerge(A,p,q)) ∩ X− = ∅ then A :=
    rmerge(A,p,q)
  • else K := K ∪ {q}
  • Fr := {δ(q,a) : q ∈ K} − K
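The merge-and-fold idea above can be sketched compactly (an illustration of the idea, not the authors' exact implementation; the even-number-of-a's sample at the end is an assumed toy target):

```python
def merge(delta, rep, a, b):
    """Merge the classes of a and b, then fold recursively so that two
    edges with the same letter leaving one class share a single target."""
    a, b = rep[a], rep[b]
    if a == b:
        return
    for q in rep:                        # move b's class into a's class
        if rep[q] == b:
            rep[q] = a
    for (p1, c1), q1 in list(delta.items()):
        for (p2, c2), q2 in list(delta.items()):
            if rep[p1] == rep[p2] and c1 == c2 and rep[q1] != rep[q2]:
                merge(delta, rep, q1, q2)

def accepts(delta, rep, finals, w):
    """Run the merged automaton; reject when a transition is missing."""
    q = rep['']
    for c in w:
        q = next((rep[t] for (p, cc), t in delta.items()
                  if rep[p] == q and cc == c), None)
        if q is None:
            return False
    return any(rep[f] == q for f in finals)

def rpni_sketch(pos, neg):
    """States are the prefixes of X+; try merges in length-lex order and
    keep a merge only if no string of X- becomes accepted."""
    states = sorted({w[:i] for w in pos for i in range(len(w) + 1)},
                    key=lambda s: (len(s), s))
    delta = {(p[:-1], p[-1]): p for p in states if p}   # prefix tree
    finals, rep = set(pos), {q: q for q in states}
    for q in states[1:]:
        if rep[q] != q:
            continue                     # q was merged away earlier
        for p in states:
            if (len(p), p) >= (len(q), q) or rep[p] != p:
                continue
            saved = dict(rep)
            merge(delta, rep, p, q)
            if any(accepts(delta, rep, finals, w) for w in neg):
                rep = saved              # a negative string got in: undo
            else:
                break                    # keep the merge
    return delta, rep, finals

delta, rep, finals = rpni_sketch(["", "aa", "aaaa"], ["a", "aaa"])
print(accepts(delta, rep, finals, "aaaaaa"))  # True: generalizes to even a-counts
```

On the toy sample the sketch collapses the prefix tree into the two-state DFA for strings with an even number of a's, exactly the kind of generalization the worked example on the next slides performs by hand.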

76
X+ = {ε, aaa, aaba, ababa, bb, bbaaa}
[Prefix tree acceptor APTA(X+), with states numbered 1 to 15]
X− = {aa, ab, aaaa, ba}
77
Try to merge 2 and 1
[automaton diagram]
78
Needs more merging for determinization
[automaton diagram]
79
But now string aaaa is accepted, so the merge
must be rejected
[automaton diagram]
80
Try to merge 3 and 1
[automaton diagram]
81
This requires merging 6 with 1,3
[automaton diagram]
82
And now to merge 2 with 10
[automaton diagram]
83
And now to merge 4 with 13
[automaton diagram]
84
And finally to merge 7 with 15
[automaton diagram]
85
No counter example is accepted so the merges are
kept
[automaton diagram]
86
Next possible merge to be checked is 4,13 with
1,3,6
[automaton diagram]
87
More merging for determinization is needed
[automaton diagram]
88
But now aa is accepted
[automaton diagram]
89
So we try 4,13 with 2,10
[automaton diagram]
90
After determinizing, negative string aa is again
accepted
[automaton diagram]
91
So we try 5 with 1,3,6
[automaton diagram]
92
But again we accept ab
[automaton diagram]
93
So we try 5 with 2,10
[automaton diagram]
94
Which is OK. So next possible merge is 7,15
with 1,3,6
[automaton diagram]
95
Which is OK. Now try to merge 8,12 with
1,3,6,7,15
[automaton diagram]
96
And ab is accepted
[automaton diagram]
97
Now try to merge 8,12 with 4,9,13
[automaton diagram]
98
This is OK and no more merge is possible so the
algorithm halts.
[Final automaton with states {1,3,6,7,11,14,15}, {2,5,10} and {4,8,9,12,13}]
99
Definitions
  • Let ≤ be the length-lex ordering over Σ*
  • Let Pref(L) be the set of all prefixes of strings
    in some language L.

100
Short prefixes
  • Sp(L) = {u ∈ Pref(L) : ∀v,
    δ(q0,u) = δ(q0,v) ⇒ u ≤ v}
  • There is one short prefix per useful state

101
Kernel-sets
  • N(L) = {ua ∈ Pref(L) : u ∈ Sp(L)} ∪ {ε}
  • There is an element in the Kernel-set for each
    useful transition

102
A characteristic sample
  • A sample is characteristic (for RPNI) if
  • ∀x ∈ Sp(L), ∃u : xu ∈ X+
  • ∀x ∈ Sp(L), ∀y ∈ N(L),
  • δ(q0,x) ≠ δ(q0,y) ⇒
  • ∃z ∈ Σ* : (xz ∈ X+ ∧ yz ∈ X−) ∨
  • (xz ∈ X− ∧ yz ∈ X+)

103
About characteristic samples
  • If you add more strings to a characteristic
    sample it still is characteristic
  • There can be many different characteristic
    samples
  • Change the ordering (or the exploring function in
    RPNI) and the characteristic sample will change.

104
Conclusion
  • RPNI identifies any regular language in the
    limit
  • RPNI works in polynomial time. Complexity is in
    O(‖X+‖³·‖X−‖)
  • There are many significant variants of RPNI
  • RPNI can be extended to other classes of
    grammars.

105
Open problems
  • RPNI's complexity is not a tight upper bound.
    Find the correct complexity.
  • The definition of the characteristic set is not
    tight either. Find a better definition.

106
Algorithms
RPNI
K-Reversible
L*
SEQUITUR
GRIDS
107
4.2 The k-reversible languages
  • The class was proposed by Angluin (1982).
  • The class is identifiable in the limit from text.
  • The class is composed of the regular languages
    that can be accepted by a DFA whose reversal is
    deterministic with a look-ahead of k.

108
  • Let A = (Σ, Q, δ, I, F) be an NFA;
  • we denote by A^T = (Σ, Q, δ^T, F, I) the
    reversal automaton, with
  • δ^T(q,a) = {q′ ∈ Q : q ∈ δ(q′,a)}

109
[Diagram: an automaton A over {a, b} with states 0 to 4, and its reversal A^T]
110
Some definitions
  • u is a k-successor of q if |u| = k and
    δ(q,u) ≠ ∅.
  • u is a k-predecessor of q if |u| = k and
    δ^T(q,u^T) ≠ ∅.
  • ε is a 0-successor and 0-predecessor of any
    state.

111
[The automaton A from the earlier slide]
  • aa is a 2-successor of 0 and 1 but not of 3.
  • a is a 1-successor of 3.
  • aa is a 2-predecessor of 3 but not of 1.

112
  • An NFA is deterministic with look-ahead k iff
    ∀q,q′ ∈ Q, q ≠ q′ :
  • (q,q′ ∈ I) ∨ (q,q′ ∈ δ(q″,a))
  • ⇒

(u is a k-successor of q) ∧ (v is a k-successor
of q′)
⇒
u ≠ v
113
Prohibited
[Diagram: two distinct states 1 and 2, both reached on letter a, sharing a k-successor u with |u| = k]
114
Example
[The automaton A from the earlier slide]
  • This automaton is not deterministic with
    look-ahead 1 but is deterministic with look-ahead
    2.

115
K-reversible automata
  • A is k-reversible if A is deterministic and AT is
    deterministic with look-ahead k.
  • Example

[Two example automata over {a, b}: one deterministic with look-ahead 1, the other deterministic]
116
Violation of k-reversibility
  • Two states q, q′ violate the k-reversibility
    condition iff
  • they violate the determinism condition :
    q,q′ ∈ δ(q″,a)
  • or they violate the look-ahead condition :
  • q,q′ ∈ F and ∃u ∈ Σᵏ : u is a k-predecessor of
    both,
  • or δ(q,a) ∩ δ(q′,a) ≠ ∅ and ∃u ∈ Σᵏ : u is a
    k-predecessor of both q and q′.

117
Learning k-reversible automata
  • Key idea the order in which the merges are
    performed does not matter!
  • Just merge states that do not comply with the
    conditions for k-reversibility.

118
K-RL Algorithm (αk-RL)
  • Data : k ∈ ℕ, X a sample of a k-RL language L
  • A := PTA(X)
  • While ∃q,q′ violating k-reversibility do
  • A := merge(A,q,q′)
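For the special case k = 0 the violation test is simplest, and the order-independence of the merges makes the sketch very short (an illustration under that k = 0 assumption, not the general algorithm): merge any two final states, any two states reached from one class by the same letter, and any two states reaching one class by the same letter.

```python
def zero_reversible(sample):
    """Infer the smallest zero-reversible language containing the sample,
    by merging states of the prefix tree acceptor until no violation of
    determinism (forward or reverse) remains."""
    states = {w[:i] for w in sample for i in range(len(w) + 1)}
    edges = [(p[:-1], p[-1], p) for p in states if p]
    rep = {q: q for q in states}

    def union(a, b):
        a, b = rep[a], rep[b]
        if a != b:
            for q in rep:
                if rep[q] == b:
                    rep[q] = a

    finals = sorted(sample)
    changed = True
    while changed:
        changed = False
        for w in finals[1:]:             # all final states form one class
            if rep[finals[0]] != rep[w]:
                union(finals[0], w); changed = True
        for (p1, c1, q1) in edges:
            for (p2, c2, q2) in edges:
                if c1 != c2:
                    continue
                if rep[p1] == rep[p2] and rep[q1] != rep[q2]:
                    union(q1, q2); changed = True   # forward determinism
                if rep[q1] == rep[q2] and rep[p1] != rep[p2]:
                    union(p1, p2); changed = True   # reverse determinism
    return rep

# On {a, aa, aaa} every state collapses into one class:
# the inferred zero-reversible language is a*.
rep = zero_reversible(["a", "aa", "aaa"])
print(len(set(rep.values())))  # 1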

119
Let X = {a, aa, abba, abbbba}, k = 2
[Prefix tree acceptor with states ε, a, aa, ab, abb, abba, abbb, abbbb, abbbba]
Violators, for u = ba
120
Let X = {a, aa, abba, abbbba}, k = 2
[Automaton after merging the previous violators]
Violators, for u = bb
121
Let X = {a, aa, abba, abbbba}, k = 2
[Resulting automaton after all merges]
122
Properties (1)
  • ∀k ≥ 0, ∀X, αk-RL(X) is a k-reversible
    automaton.
  • L(αk-RL(X)) is the smallest k-reversible
    language that contains X.
  • The class 𝓛k-RL is identifiable in the limit
    from text.

123
Properties (2)
  • A regular language L is k-reversible iff
  • (u1v)⁻¹L ≠ ∅ ∧ (u2v)⁻¹L ≠ ∅ and |v| = k
  • ⇒
  • (u1v)⁻¹L = (u2v)⁻¹L
  • (if two prefixes of strings of L end with the
    same suffix of length k, then the prefixes are
    Nerode-equivalent)

124
Properties (3)
  • Lk-RL(X) ⊇ L(k+1)-RL(X)
  • Lk-TSS(X) ⊇ L(k-1)-RL(X)

125
Properties (4)
  • The time complexity is O(k·‖X‖³).
  • The space complexity is O(‖X‖).
  • The algorithm is not incremental.

126
Properties (4) Polynomial aspects
  • Polynomial characteristic sets
  • Polynomial update time
  • But not necessarily a polynomial number of mind
    changes

127
Extensions
  • Sakakibara built an extension for context-free
    grammars whose tree language is k-reversible
  • Marion & Besombes propose an extension to tree
    languages.
  • Different authors propose to learn these automata
    and then estimate the probabilities as an
    alternative to learning stochastic automata.

128
Exercises
  • Construct a language L that is not k-reversible,
    for any k ≥ 0.
  • Prove that the class of k-reversible languages
    is not in TxtEx.
  • Run αk-RL on X = {aa, aba, abb, abaaba, baaba}
    for k = 0, 1, 2, 3

129
Solution (idea)
  • Lk = {aⁱ : i ≤ k}
  • Then each Lk is k-reversible but not
    (k-1)-reversible.
  • And ∪k Lk = a*
  • So there is an accumulation point

130
Algorithms
RPNI
K-Reversible
L*
SEQUITUR
GRIDS
131
4.4 Active learning : learning DFA from
membership and equivalence queries (the L*
algorithm)
132
The classes C and H
  • sets of examples
  • representations of these sets
  • the computation of L(x) (and h(x)) must take
    place in time polynomial in |x|.

133
Correct learning
  • A class C is identifiable with a polynomial
    number of queries of type T if there exists an
    algorithm α that
  • ∀L ∈ C identifies L with a polynomial number of
    queries of type T
  • does each update in time polynomial in ‖f‖ and
    in Σ‖xi‖, the xi being the counter-examples seen
    so far.

134
Algorithm L*
  • Angluin's papers
  • Some talks by Rivest
  • Kearns and Vazirani
  • Balcázar, Díaz, Gavaldà & Watanabe

135
Some references
  • Learning regular sets from queries and
    counter-examples, D. Angluin, Information and
    computation, 75, 87-106, 1987.
  • Queries and Concept learning, D. Angluin, Machine
    Learning, 2, 319-342, 1988.
  • Negative results for Equivalence Queries, D.
    Angluin, Machine Learning, 5, 121-150, 1990.

136
The Minimal Adequate Teacher
  • You are allowed
  • strong equivalence queries
  • membership queries.

137
General idea of L*
  • find a consistent table (representing a DFA)
  • submit it as an equivalence query
  • use counterexample to update the table
  • submit membership queries to make the table
    complete
  • Iterate.
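The two table tests the loop relies on can be sketched as follows (illustration only; the even-number-of-a's target and the variable names are assumptions, not from the slides). `T` maps each string in (S ∪ SΣ)·E to the oracle's membership answer, and a row is the tuple of answers for one prefix:

```python
def row(T, u, E):
    return tuple(T[u + e] for e in E)

def is_closed(T, S, E, alphabet):
    """Closed: every row of a one-letter extension already appears in S."""
    rows_S = {row(T, s, E) for s in S}
    return all(row(T, s + a, E) in rows_S for s in S for a in alphabet)

def is_consistent(T, S, E, alphabet):
    """Consistent: equal rows of S stay equal after appending any symbol."""
    return all(row(T, s1 + a, E) == row(T, s2 + a, E)
               for s1 in S for s2 in S
               if row(T, s1, E) == row(T, s2, E)
               for a in alphabet)

# A hypothetical target: strings with an even number of a's.
mem = lambda w: w.count('a') % 2 == 0
S, E, A = ['', 'a'], [''], 'ab'
T = {u + e: mem(u + e)
     for s in S for u in [s] + [s + a for a in A] for e in E}
print(is_closed(T, S, E, A), is_consistent(T, S, E, A))  # True True
```

When either test fails, the algorithm extends S (closure failure) or E (consistency failure) exactly as described on the following slides.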

138
An observation table
[Observation table: rows indexed by the prefixes ε, a, b, aa, ab; columns indexed by the experiments ε and a; each cell is 1 iff the concatenation of row label and column label is in the language]
139
The experiments (E)
[The same table: the columns are the experiments (E); the upper rows are the states (S), or test set; the lower rows are the transitions (T)]
140
Meaning
δ(q0, ε·ε) ∈ F ⇔ ε ∈ L
[observation table]
141
δ(q0, ab·a) ∈ F ⇔ aba ∈ L
[observation table]
142
Equivalent prefixes
[Observation table in which two rows are equal, hence δ(q0,ε) = δ(q0,ab)]
143
Building a DFA from a table
[observation table]
144
[The DFA built from the table: one state per distinct row of S]
145
Some rules
The experiment set E is suffix-closed; the state
set S is prefix-closed; T = S·Σ \ S
[observation table]
146
An incomplete table
[observation table with some cells not yet filled in]
147
Good idea
  • We can complete the table by making membership
    queries...

[Diagram: a membership query on the concatenation u·v : is uv ∈ L ?]
148
A table is
  • closed if every row of T also appears as a row
    in S

[example of a table that is not closed]
149
And a table that is not closed
[another table that is not closed]
150
What do we do when we have a table that is not
closed?
  • Let s be the row (of T) that does not appear in
    S.
  • Add s to S, and ?a?? sa to T.

151
An inconsistent table
Are a and b equivalent?
[Observation table in which two rows of S are equal, but their one-letter extensions differ]
152
A table is consistent if
  • Every pair of equal rows in S remains equal
    after appending any symbol
  • row(s1) = row(s2)
  • ⇒
  • ∀a ∈ Σ, row(s1a) = row(s2a)

153
What do we do when we have an inconsistent table?
  • Let a ∈ Σ be such that row(s1) = row(s2) but
    row(s1a) ≠ row(s2a)
  • If row(s1a) ≠ row(s2a), it is so for some
    experiment e
  • Then add experiment ae to the table

154
What do we do when we have a closed and
consistent table ?
  • We build the corresponding DFA
  • We make an equivalence query!!!

155
What do we do if we get a counter-example?
  • Let u be this counter-example
  • ∀w ∈ Pref(u) do
  • add w to S
  • ∀a ∈ Σ such that wa ∉ Pref(u), add wa to T

156
Run of the algorithm
[initial observation table, S = {ε}, E = {ε}]
Table is now closed and consistent
157
An equivalence query is made!
Counter-example baa is returned
158
[table extended with the prefixes of baa and their one-letter extensions]
159
[table with experiment set E = {ε, a}]
Table is now closed and consistent
160
Proof of the algorithm
Sketch only. Understanding the proof is important
for further algorithms; Balcázar et al. is a good
place for that.
161
Termination / Correctness
  • For every regular language there is a unique
    minimal DFA that recognizes it.
  • Given a closed and consistent table, one can
    generate a consistent DFA.
  • A DFA consistent with a table has at least as
    many states as different rows in S.
  • If the algorithm has built a table with n
    different rows in S, then it is the target.

162
Finiteness
  • Each closure failure adds one different row to S.
  • Each inconsistency failure adds one experiment,
    which also creates a new row in S.
  • Each counterexample adds one different row to S.

163
Polynomial
  • |E| ≤ n
  • at most n-1 equivalence queries
  • at most n(n-1)m membership queries, where m is
    the length of the longest counter-example
    returned by the oracle

164
Conclusion
  • With a MAT you can learn DFA
  • but also a variety of other classes of grammars
  • it is difficult to see how powerful a MAT really
    is
  • probably as powerful as PAC learning.
  • It is easy to find a class and a set of queries
    and to provide an algorithm that learns with
    them
  • it is more difficult for the result to be
    meaningful.
  • Discussion : why are these queries meaningful?

165
Algorithms
RPNI
K-Reversible
L*
SEQUITUR
GRIDS
166
4.5 SEQUITUR
  • http://sequence.rutgers.edu/sequitur/
  • (Nevill-Manning & Witten, 97)
  • Idea : construct a context-free grammar from a
    very long string w, such that L(G) = {w}
  • No generalization
  • Linear time (+/-)
  • Good compression rates

167
Principle
  • Invariants of the grammar with respect to the
    string
  • Each rule has to be used at least twice
  • No sub-string of length 2 appears twice.
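The two invariants suggest a much-simplified offline sketch: scan for a repeated non-overlapping digram and rewrite every occurrence with a fresh non-terminal (real SEQUITUR works online in linear time and also enforces rule utility explicitly; the rule names A0, A1, ... are an assumption of this sketch):

```python
def compress(s):
    """Repeatedly replace a repeated adjacent pair by a new non-terminal."""
    rules, seq, next_id = {}, list(s), 0
    while True:
        seen = {}
        for i in range(len(seq) - 1):
            d = (seq[i], seq[i + 1])
            if d in seen and seen[d] + 1 < i:    # non-overlapping repeat
                nt = "A%d" % next_id; next_id += 1
                rules[nt] = d
                out, j = [], 0
                while j < len(seq):              # rewrite every occurrence
                    if j < len(seq) - 1 and (seq[j], seq[j + 1]) == d:
                        out.append(nt); j += 2
                    else:
                        out.append(seq[j]); j += 1
                seq = out
                break
            seen.setdefault(d, i)
        else:
            return seq, rules

seq, rules = compress("abcabdabcabd")
print(seq)  # ['A3', 'A3']
```

On the string abcabdabcabd used on a later slide, the sketch ends with a start rule of two identical non-terminals, each of which expands back to abcabd.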

168
Examples
S → aAdA, A → bc generates abcdbc
S → AaA, A → aab generates aabaaab
S → AbAab, A → aa also generates aabaaab
169
abcabdabcabd
170
  • In the beginning, God created the heavens and the
    earth.
  • And the earth was without form, and void; and
    darkness was upon the face of the deep. And the
    Spirit of God moved upon the face of the waters.
  • And God said, Let there be light: and there was
    light.
  • And God saw the light, that it was good: and God
    divided the light from the darkness.
  • And God called the light Day, and the darkness he
    called Night. And the evening and the morning
    were the first day.
  • And God said, Let there be a firmament in the
    midst of the waters, and let it divide the waters
    from the waters.
  • And God made the firmament, and divided the
    waters which were under the firmament from the
    waters which were above the firmament and it was
    so.
  • And God called the firmament Heaven. And the
    evening and the morning were the second day.

172
Sequitur options
  • appending a symbol to rule S
  • using an existing rule
  • creating a new rule
  • and deleting a rule.

173
Results
  • On text
  • SEQUITUR : 2.82 bpc
  • compress : 3.46 bpc
  • gzip : 3.25 bpc
  • PPMC : 2.52 bpc

174
Algorithms
RPNI
K-Reversible
L*
SEQUITUR
GRIDS
175
4.6 Using a simplicity bias
  • (Langley & Stromsten, 00)
  • Based on algorithm GRIDS (Wolff, 82)
  • Main characteristics
  • MDL principle
  • Not characterizable
  • Not tested on large benchmarks.

176
Two learning operators
  • Creation of non terminals and rules

Before : NP → ART ADJ NOUN ; NP → ART ADJ ADJ NOUN
After : NP → ART AP1 ; NP → ART ADJ AP1 ;
AP1 → ADJ NOUN
177
  • Merging two non terminals

Before : NP → ART AP1 ; NP → ART AP2 ;
AP1 → ADJ NOUN ; AP2 → ADJ AP1
After : NP → ART AP1 ; AP1 → ADJ NOUN ;
AP1 → ADJ AP1
178
  • Scoring function : MDL principle,
    score = ‖G‖ + Σw∈T ‖d(w)‖
  • Algorithm
  • find the best merge that improves the current
    grammar
  • if no such merge exists, find the best creation
  • halt when there is no improvement
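As a toy illustration of the score (the exact coding scheme is an assumption; here grammar size is a plain symbol count, as the ‖G‖ + Σ ‖d(w)‖ formula suggests), the creation operator from the earlier slide trades grammar size against derivation length:

```python
def mdl_score(grammar, derivations):
    """Toy MDL score: symbols in the grammar plus symbols in the stored
    derivations (the real coding scheme in GRIDS may differ)."""
    g = sum(1 + len(rhs) for _, rhs in grammar)       # ||G||
    d = sum(len(steps) for steps in derivations)      # sum of ||d(w)||
    return g + d

before = [("NP", ["ART", "ADJ", "NOUN"]),
          ("NP", ["ART", "ADJ", "ADJ", "NOUN"])]
after = [("NP", ["ART", "AP1"]),
         ("NP", ["ART", "ADJ", "AP1"]),
         ("AP1", ["ADJ", "NOUN"])]
print(mdl_score(before, []), mdl_score(after, []))  # 9 10
```

Note that creation alone makes the grammar slightly larger here (9 vs 10); the operator pays off only when the shared non-terminal shortens the stored derivations of the training strings, which is exactly what the MDL score arbitrates.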

179
Results
  • On subsets of English grammars (15 rules, 8
    non-terminals, 9 terminals) : 120 sentences to
    converge
  • on (ab)* : all (15) strings of length ≤ 30
  • on Dyck1 : all (65) strings of length ≤ 12

180
Algorithms
RPNI
K-Reversible
L*
SEQUITUR
GRIDS
181
5 Open questions and conclusions
  • dealing with noise
  • classes of languages that adequately mix
    Chomsky's hierarchy with edit distance and
    compactness
  • stochastic context-free grammars
  • polynomial learning from text
  • learning POMDPs
  • fast algorithms

182
ERNESTO SÁBATO, EL TÚNEL 
  •    I sensed that I had fallen into a trap and
    wanted to flee. I made an enormous effort, but
    it was too late: my body no longer obeyed me. I
    resigned myself to witnessing what was going to
    happen, as if it were an event foreign to my own
    person. The man began to transform me into a
    bird, a bird of human size. He began with the
    feet: I saw how they slowly turned into
    something like rooster's feet. Then the
    transformation continued over the whole body,
    upward, the way water rises in a pond. My only
    hope now lay in my friends, who inexplicably had
    not arrived. When at last they came, something
    happened that horrified me: they did not notice
    my transformation. They treated me as usual,
    which proved that they saw me as usual. Thinking
    that the magician was deluding them into seeing
    me as a normal person, I decided to tell them
    what he had done to me. Although my intention
    was to relate the phenomenon calmly, so as not
    to make matters worse by irritating the magician
    with too violent a reaction (which might lead
    him to do something worse still), I began
    shouting out the whole story. Then I observed
    two astonishing facts: the sentence I wanted to
    utter came out as a harsh bird screech, a
    desperate, strange screech, perhaps because of
    what remained human in it; and, what was
    infinitely worse, my friends did not hear that
    screech, just as they had not seen my great
    bird's body; on the contrary, they seemed to
    hear my usual voice saying the usual things, for
    at no moment did they show the slightest
    surprise. I fell silent, terrified. The master
    of the house then looked at me with a sarcastic
    glint in his eyes, almost imperceptible and in
    any case noticed only by me. Then I understood
    that nobody, ever, would know that I had been
    transformed into a bird. I was lost forever, and
    the secret would go with me to the grave.