Title: Two-Stage Constraint Based Hindi Parser
LTRC, IIIT Hyderabad
2Brief Recap
- Broad coverage parser
- Dependency
- Paninian framework
- vibhakti-karaka correspondence
- karaka frames (basic transformation)
- Source groups, demand groups
- Constraints
- Three basic constraints
- Constraints as Integer programming equations
3Parser
- Two stage strategy
- Appropriate constraints formed
- Stage I (Intra-clausal relations)
- Dependency relations marked
- Relations such as k1, k2, k3, etc. for each verb
- Stage II (Inter-clausal relations and conjunct relations)
- Conjuncts, relative clauses, kriya mula, etc.
- In certain cases, separates syntax from semantics (e.g. kriya mula); in others, reduces the complexity.
4Steps in Parsing
SENTENCE -> Morph, POS tagging, Chunking -> Identify Demand Groups -> Load Frames & Transform -> Find Candidates -> Apply Constraints & Solve -> Is Complex?
- NO: Final Parse
- YES: STAGE - II
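The "Find Candidates" step can be pictured as matching each source group's vibhakti against the rows of the verb's demand frame. The sketch below is illustrative only: the `FrameRow` fields mirror frame columns that appear later in the slides (arc-label, necessity, vibhakti, lextype), while the frame contents for xe, the "d" necessity code, the "0" marker for an unmarked noun, and the `find_candidates` helper are assumptions, not the parser's actual data or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameRow:
    arc_label: str  # karaka relation, e.g. k1, k2, k4
    necessity: str  # "m" = mandatory; "d" = optional (assumed coding)
    vibhakti: str   # case marker expected on the source group; "0" = unmarked (assumed)
    lextype: str    # expected category of the source group, e.g. "n"


@dataclass(frozen=True)
class Chunk:
    head: str
    vibhakti: str
    lextype: str


# Hypothetical, already-transformed frame for xe (give); contents are illustrative.
XE_FRAME = [
    FrameRow("k1", "m", "ne", "n"),
    FrameRow("k2", "m", "0", "n"),
    FrameRow("k4", "d", "ko", "n"),
]


def find_candidates(frame, chunks):
    """Return (arc_label, chunk head) pairs whose vibhakti and type match a frame row."""
    return [
        (row.arc_label, chunk.head)
        for row in frame
        for chunk in chunks
        if row.vibhakti == chunk.vibhakti and row.lextype == chunk.lextype
    ]


chunks = [Chunk("rAma", "ne", "n"), Chunk("mohana", "ko", "n"), Chunk("jo", "0", "n")]
candidates = find_candidates(XE_FRAME, chunks)
print(candidates)
```

When several chunks match the same row, all of them are returned as candidates; choosing among them is left to the constraint-solving step.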
5Stage I Types being handled
- Simple Sentences (finite verbs)
- Clausal arguments
- Non-finite verbs
- wA_huA
- wA_hI
- nA
- kara
- 0_rahe, etc.
- Copula
- Genitive
6Stage - II
- Handles
- Conjuncts
- Subordinating, coordinating
- Relative clauses
- Complex predicates
- Basic constraints similar to Stage-I
- Some additional constraints
- New demand groups
- New candidates
7Steps (Stage II)
Output of STAGE - I -> Identify New Demand Groups -> Load Frames & Transform -> Find Candidates -> Repair -> Apply Constraints & Solve -> FINAL PARSE
8Example Relative Clause
- vaha puswaka jo rAma ne mohana ko xI hE prasixXa hE
- that book which Ram ERG. Mohana DAT. gave is famous is
- The book which Ram gave to Mohana is famous
9Output after Stage - I
_ROOT_
  main: hE
    k1: vaha puswaka
    k1s: prasixXa
  main: xI
    k1: rAma
    k2: jo
    k4: mohana
10Identify the demand group
- xiyA (give)
- Main verb of the relative clause
11Identify the demand group,Load and Transform DF
- jo (which) transformation (special)
- Transforms the demand frame of the main verb of the relative clause

arc-label   necessity  vibhakti  lextype  src-pos  arc-dir  oprt
----------  ---------  --------  -------  -------  -------  ------
nmod__relc  m          any       n        rl       p        insert
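The effect of the jo transformation on a demand frame can be sketched as follows. The transformation row is copied from the table above; the base frame for the relative-clause verb, the values "l" and "c" used for its src-pos and arc-dir columns, and the `apply_transformation` helper are hypothetical and only serve to show an "insert" operation adding a row.

```python
# Column names follow the transformation table on the slide.
COLUMNS = ("arc_label", "necessity", "vibhakti", "lextype", "src_pos", "arc_dir", "oprt")

# The single row of the jo transformation, as given on the slide.
JO_TRANSFORMATION = [
    dict(zip(COLUMNS, ("nmod__relc", "m", "any", "n", "rl", "p", "insert"))),
]


def apply_transformation(frame, transformation):
    """Return a new frame with every 'insert' row of the transformation appended."""
    new_frame = [dict(row) for row in frame]  # copy so the base frame is untouched
    for row in transformation:
        if row["oprt"] != "insert":
            raise ValueError(f"unsupported operation: {row['oprt']}")
        new_frame.append({k: v for k, v in row.items() if k != "oprt"})
    return new_frame


# Hypothetical base frame for the relative-clause verb (illustrative values only).
base_frame = [
    {"arc_label": "k1", "necessity": "m", "vibhakti": "ne", "lextype": "n",
     "src_pos": "l", "arc_dir": "c"},
    {"arc_label": "k2", "necessity": "m", "vibhakti": "0", "lextype": "n",
     "src_pos": "l", "arc_dir": "c"},
]

transformed = apply_transformation(base_frame, JO_TRANSFORMATION)
for row in transformed:
    print(row)
```

Only the "insert" operation is handled because it is the only one the slide shows; other operations would need their own branches.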
12Karaka Frame
Main verb of relative clause
- vaha puswaka jo rAma ne mohana ko xI prasixXa hE
- that book which Ram ERG. Mohana DAT. gave famous is
- The book which Ram gave to Mohana is famous
Transformed frame for xe after applying the jo transformation

arc-label   necessity  vibhakti  lextype  src-pos  arc-dir  oprt
----------  ---------  --------  -------  -------  -------  ------
nmod__relc  m          any       n        rl       p        insert

New row inserted after transformation
13Possible candidates
nmod__relc
- vaha puswaka jo rAma ne mohana ko xI hE prasixXa hE
14Output after Stage - II
_ROOT_
  main: hE
    k1: vaha puswaka
      nmod__relc: xiyA hE
        k1: rAma
        k2: jo
        k4: mohana
    k1s: prasixXa
15Example II Coordination
- rAma Ora siwA kala Aye
- Ram and Sita yesterday came
- Ram and Sita came yesterday
16Output of Stage - I
_ROOT_
  dummy: rAma
  dummy: Ora
  main: Aye
    k1: siwA
    k7t: kala
17For Stage II (Constraint Graph)
[Figure: constraint graph over the nodes _ROOT_, rAma, Ora, siwA, kala and Aye, with arcs labeled main, k1, k7t, ccof and ccof]
18Candidate Arcs
[Figure: candidate arcs over the nodes _ROOT_, rAma, Ora, siwA, kala and Aye: one main arc, three k1 arcs and two ccof arcs]
19Solution Graph
_ROOT_
  main: Aye
    k1: Ora
      ccof: rAma
      ccof: siwA
    k7t: kala
20Parse tree
_ROOT_
  main: Aye
    k1: Ora
      ccof: rAma
      ccof: siwA
    k7t: kala
Output after Stage II
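A parse such as the one above can be stored as a list of (head, label, dependent) arcs and checked for well-formedness. The sketch below is a generic tree check, not something described in the slides; the arc list is read off the Stage II output for rAma Ora siwA kala Aye, and `is_tree` is a hypothetical helper name.

```python
def is_tree(arcs, root="_ROOT_"):
    """True if every dependent has exactly one head and reaches the root without a cycle."""
    heads = {}
    for head, _label, dependent in arcs:
        if dependent in heads:  # a second incoming arc
            return False
        heads[dependent] = head
    if root in heads:           # the root must not have a head
        return False
    for node in heads:
        seen = set()
        current = node
        while current != root:
            if current in seen or current not in heads:
                return False    # cycle, or a chain that never reaches the root
            seen.add(current)
            current = heads[current]
    return True


FINAL_PARSE = [
    ("_ROOT_", "main", "Aye"),
    ("Aye", "k1", "Ora"),
    ("Aye", "k7t", "kala"),
    ("Ora", "ccof", "rAma"),
    ("Ora", "ccof", "siwA"),
]
print(is_tree(FINAL_PARSE))
```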
21Finite Verb Coordination
- rAma Gara gayA Ora vaha so gayA
- Ram home went and he sleep went
- Ram went home and slept
_ROOT_
  main: gayA
    k1: rAma
    k2: Gara
  dummy: Ora
  main: so
    k1: vaha
Output after Stage I
22Karaka Frame - Ora
[Figure: karaka frame for Ora, finite case. Ora takes two ccof children of type v_fin; in the example these are gayA and so.]
23Finite Verb Coordination (Parse Tree)
_ROOT_
  main: Ora
    ccof: gayA
      k1: rAma
      k2: Gara
    ccof: so
      k1: vaha
Output after Stage II
24Relative Clause Coordination
- rAma ne vaha puswaka KarIxI jo prasixXa hE Ora jo saswI hE
- Ram purchased the book which is famous and which is cheap
_ROOT_
  main: KarIxI
    k1: rAma
    k2: puswaka
  main: hE
    k1: jo
    k1s: prasixXa
  dummy: Ora
  main: hE
    k1: jo
    k1s: saswI
Output after Stage I
25Karaka Frame - Ora
[Figure: karaka frame for Ora, relative-clause case. Ora attaches to a noun (n; here puswaka) via nmod__relc and takes two ccof children of type v_rel; in the example both are hE.]
26Relative Clause Coordination (Parse Tree)
_ROOT_
  main: KarIxI
    k1: rAma
    k2: puswaka
      nmod__relc: Ora
        ccof: hE
          k1: jo
          k1s: prasixXa
        ccof: hE
          k1: jo
          k1s: saswI
Output after Stage II
27Non-Finite Verb Coordination
- rAma Kelakara Ora KAnA KAkara so gayA
- Ram having played and food having eaten sleep went
- Ram slept after playing and eating food
_ROOT_
  main: so
    k1: rAma
    vmod: Kelakara
    vmod: KAkara
      k2: KAnA
  dummy: Ora
Output after Stage I
28Karaka Frame - Ora
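[Figure: karaka frame for Ora, non-finite case. Ora attaches to a finite verb (v_fin; here so) and takes two ccof children of type v_nfin; in the example these are Kelakara and KAkara.]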
29Non-Finite Verb Coordination (Parse Tree)
_ROOT_
  main: so
    k1: rAma
    vmod: Ora
      ccof: Kelakara
      ccof: KAkara
        k2: KAnA
Output after Stage II
30Nominal Coordination
- rAma Ora siwA kala Aye
- Ram and Sita yesterday came
- Ram and Sita came yesterday
_ROOT_
  dummy: rAma
  dummy: Ora
  main: Aye
    k1: siwA
    k7t: kala
Output after Stage I
31Karaka Frame - Ora
[Figure: karaka frame for Ora, nominal case. Ora takes two ccof children of type n; in the example these are rAma and siwA.]
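Across the four Ora frames (finite, relative clause, non-finite, nominal), the two ccof children share one type: v_fin, v_rel, v_nfin or n. A minimal sketch of that idea follows. The type names come from the slides; treating both ccof slots as mandatory ("m"), and the helper names `ora_frame` and `satisfies`, are assumptions made for illustration.

```python
CONJUNCT_TYPES = ("v_fin", "v_rel", "v_nfin", "n")


def ora_frame(conjunct_type):
    """Build a two-slot ccof frame for Ora whose children must share one type."""
    if conjunct_type not in CONJUNCT_TYPES:
        raise ValueError(f"unknown conjunct type: {conjunct_type}")
    # Necessity "m" is an assumption; the slides only show the ccof labels and types.
    return [
        {"arc_label": "ccof", "necessity": "m", "lextype": conjunct_type}
        for _ in range(2)
    ]


def satisfies(frame, children):
    """children is a list of (head, lextype); each must fill one slot of matching type."""
    return len(children) == len(frame) and all(
        row["lextype"] == lextype for row, (_head, lextype) in zip(frame, children)
    )


print(satisfies(ora_frame("n"), [("rAma", "n"), ("siwA", "n")]))
print(satisfies(ora_frame("v_fin"), [("gayA", "v_fin"), ("KAkara", "v_nfin")]))
```

The second call mixes a finite and a non-finite verb, so it does not fit any single Ora frame.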
32Nominal Coordination (Parse Tree)
_ROOT_
  main: Aye
    k1: Ora
      ccof: rAma
      ccof: siwA
    k7t: kala
Output after Stage II
33Example
_ROOT_
  dummy: rAma
  dummy: Ora
  main: Aye
    k1: siwA
    k7t: kala
34Steps (Stage II)
Output of STAGE - I -> Identify Nodes -> Identify New Demand Groups -> Load Frames & Transform -> Find Candidates -> Repair -> Apply Constraints & Solve -> FINAL PARSE
35Constraint Graph Nodes (Stage II)
- Selected from the intermediate parse tree (Stage I)
- Set-I (demand nodes)
- (1) Conjuncts
- (2) Nearest verbal ancestor of jo (usually just the parent)
- (3) _ROOT_
- (4) Children of _ROOT_ other than (1) and (2)
- (5) Other nodes which are added due to nodes in Set 2
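The first four Set-I rules can be sketched over a Stage I tree stored as a child-to-parent map. The example uses the relative-clause sentence from earlier in the deck. The coarse type tags ("v", "n", "adj", "rel", "conj"), the tree encoding and the `select_set1` helper are assumptions for illustration; the fifth rule (nodes added because of Set 2) is not covered.

```python
ROOT = "_ROOT_"


def nearest_verbal_ancestor(node, parent, lextype):
    """Walk up from node and return the first ancestor typed as a verb, or None."""
    current = parent.get(node)
    while current is not None and current != ROOT:
        if lextype.get(current) == "v":
            return current
        current = parent.get(current)
    return None


def select_set1(parent, lextype):
    """Group the Set-I demand nodes by the rule that selects them."""
    conjuncts = {n for n, t in lextype.items() if t == "conj"}        # rule (1)
    jo_verbs = set()
    for node, tag in lextype.items():                                 # rule (2)
        if tag == "rel":
            verb = nearest_verbal_ancestor(node, parent, lextype)
            if verb is not None:
                jo_verbs.add(verb)
    root_children = {n for n, p in parent.items() if p == ROOT}
    return {
        "conjuncts": conjuncts,
        "jo_verbs": jo_verbs,
        "root": {ROOT},                                               # rule (3)
        "other_root_children": root_children - conjuncts - jo_verbs,  # rule (4)
    }


# Stage I tree for: vaha puswaka jo rAma ne mohana ko xI hE prasixXa hE
parent = {"hE": ROOT, "xI": ROOT, "vaha puswaka": "hE", "prasixXa": "hE",
          "jo": "xI", "rAma": "xI", "mohana": "xI"}
lextype = {"hE": "v", "xI": "v", "vaha puswaka": "n", "prasixXa": "adj",
           "jo": "rel", "rAma": "n", "mohana": "n"}

set1 = select_set1(parent, lextype)
print(set1)
```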
36Constraint Graph Nodes (Stage II)
- Set-II (source nodes)
- Possible children and parents of conjuncts
- Possible heads of the relative clause.
- Identification of nodes in Set-II will generally trigger the repair.
37Steps (Stage II)
Output of STAGE - I -> Identify Nodes -> Identify New Demand Groups -> Load Frames & Transform -> Find Candidates -> Repair -> Apply Constraints & Solve -> FINAL PARSE
38Identify the demand group
39Steps (Stage II)
Output of STAGE - I -> Identify Nodes -> Identify New Demand Groups -> Load Frames & Transform -> Find Candidates -> Repair -> Apply Constraints & Solve -> FINAL PARSE
40General Principles
- Repair/Revision
- If a node becomes a potential child in Stage II, the arc to its existing parent is open to revision
- rAma Ora siwA kala Aye
- Node 4 becomes a potential child (of node 1)
- Its parent (node 2) is open to revision
41General Principles
- Repair/Revision after the Stage I parse
- Any node which becomes a potential parent must be re-examined
- rAma Ora siwA kala Aye
- Node 2 becomes a potential parent (of node 1)
- Its child (node 4) is open to revision
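The first repair principle (a node that becomes a potential child has its existing parent arc opened for revision) can be sketched as a set filter. The Stage I arcs are read off the Stage I output for rAma Ora siwA kala Aye; the Stage II candidate arcs and the helper name `open_to_revision` are assumptions for illustration. The second principle (potential parents) is not implemented here.

```python
def open_to_revision(stage1_arcs, stage2_candidates):
    """Return the Stage I arcs whose dependent has a new candidate parent in Stage II."""
    potential_children = {child for _head, _label, child in stage2_candidates}
    return {arc for arc in stage1_arcs if arc[2] in potential_children}


# Stage I output for: rAma Ora siwA kala Aye
STAGE1_ARCS = {
    ("_ROOT_", "dummy", "rAma"),
    ("_ROOT_", "dummy", "Ora"),
    ("_ROOT_", "main", "Aye"),
    ("Aye", "k1", "siwA"),
    ("Aye", "k7t", "kala"),
}

# Assumed Stage II candidates: Ora takes the conjuncts, Aye takes Ora as k1.
STAGE2_CANDIDATES = {
    ("Ora", "ccof", "rAma"),
    ("Ora", "ccof", "siwA"),
    ("Aye", "k1", "Ora"),
}

revisable = open_to_revision(STAGE1_ARCS, STAGE2_CANDIDATES)
print(sorted(revisable))
```

Under these assumed candidates the arc from Aye to siwA is opened for revision, while the arc from Aye to kala is left alone.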
42Algorithm
- Identify nodes of the constraint graph
- From Set 1, and
- From Set 2
- Remove all outgoing edges from _ROOT_.
- Find possible candidates for demand nodes present in Set 1 from Set 2
- Parent candidate for finite verb
- Parent and children for conjuncts
- Children of _ROOT_
- Convert the formed constraint graph into an integer programming (IP) problem.
- Solve the IP equations to get the possible solution parse.
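The last two steps can be illustrated with one 0/1 variable per candidate arc. The slides mention three basic constraints without listing them; the sketch assumes a common formulation: each mandatory demand is filled by exactly one arc, each optional demand by at most one, and each source node has exactly one incoming arc. It enumerates all assignments by brute force instead of calling an IP solver, and the candidate arcs and demand bounds (including two ccof slots for Ora) are assumptions for the sentence rAma Ora siwA kala Aye, with kala left out because it stays attached from Stage I.

```python
from itertools import product

ROOT = "_ROOT_"

# Assumed candidate arcs (head, label, dependent) of the Stage II constraint graph.
ARCS = [
    (ROOT, "main", "Aye"),
    ("Aye", "k1", "rAma"),
    ("Aye", "k1", "Ora"),
    ("Aye", "k1", "siwA"),
    ("Ora", "ccof", "rAma"),
    ("Ora", "ccof", "siwA"),
]

# (head, label) -> (min, max) number of selected arcs; bounds are assumptions.
DEMANDS = {
    (ROOT, "main"): (1, 1),
    ("Aye", "k1"): (1, 1),
    ("Ora", "ccof"): (2, 2),
}


def solve(arcs, demands):
    """Enumerate 0/1 assignments and keep those satisfying every constraint."""
    sources = {dependent for _h, _l, dependent in arcs}
    solutions = []
    for assignment in product((0, 1), repeat=len(arcs)):
        chosen = [arc for arc, x in zip(arcs, assignment) if x]
        demands_ok = all(
            low <= sum(1 for h, l, _d in chosen if (h, l) == key) <= high
            for key, (low, high) in demands.items()
        )
        sources_ok = all(
            sum(1 for _h, _l, d in chosen if d == source) == 1 for source in sources
        )
        if demands_ok and sources_ok:
            solutions.append(chosen)
    return solutions


solutions = solve(ARCS, DEMANDS)
for solution in solutions:
    print(solution)
```

Brute force is only practical for a handful of arcs; it is used here to keep the constraints visible without depending on a particular solver.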
43An example
- rAma Ora siwA kala Aye
- Ram and Sita yesterday came
- Ram and Sita came yesterday
_ROOT_
  dummy: rAma
  dummy: Ora
  main: Aye
    k1: siwA
    k7t: kala
44Identify Nodes
_ROOT_
  dummy: rAma
  dummy: Ora
  main: Aye
    k1: siwA
    k7t: kala
45Constraint Graph
- New Constraint Graph
- Ora, Aye and _ROOT_ are the demand groups
- Note: kala remains attached to its parent Aye (does not show up in Stage 2)
[Figure: constraint graph over the nodes _ROOT_, Aye, Ora, rAma and siwA, with arcs labeled main, k1, k1, ccof and ccof]
46Example
_ROOT_
  main: Aye
    k1: Ora
      ccof: rAma
      ccof: siwA
    k7t: kala
47Types of complex sentences
- Relative clauses
- Initial
- Final
- Medial
- Conjuncts (Coordination)
- Simple clause
- Relative clause
- Non-finite
- Nominal, adjectival, adverbial
48Some other examples
- rAma ne vaha puswaka KarIxI jo saswI hE Ora jo bAjZAra meM prasixXa hE
- samIra Ora aBay ne vaha puswaka KarIxI jo saswI hE Ora jo bAjZAra meM prasixXa hE
- rAma Ora mohana ke xoswa kI baccI Aye
- Only baccI came, or
- Both rAma and baccI came
- The gender-number-person (gnp) agreement of the main verb disambiguates: Aye vs. AI
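The Aye vs. AI contrast can be sketched as an agreement check between the verb and each candidate subject. The sketch assumes that Aye is masculine plural and AI is feminine singular, and that a coordinated subject is plural and masculine unless every conjunct is feminine; person is omitted because all forms are third person. The feature tables and helper names are hypothetical.

```python
VERB_GNP = {"Aye": ("m", "pl"), "AI": ("f", "sg")}
NOUN_GNP = {"rAma": ("m", "sg"), "baccI": ("f", "sg")}

# The two readings of: rAma Ora mohana ke xoswa kI baccI Aye
READINGS = {
    "only baccI came": ["baccI"],
    "both rAma and baccI came": ["rAma", "baccI"],
}


def subject_gnp(nouns):
    """Agreement features of a subject made of one noun or a coordination of nouns."""
    features = [NOUN_GNP[noun] for noun in nouns]
    if len(features) == 1:
        return features[0]
    # Assumed resolution rule for coordination: plural, feminine only if all are feminine.
    gender = "f" if all(g == "f" for g, _number in features) else "m"
    return (gender, "pl")


def compatible_readings(verb_form):
    """Return the readings whose subject agrees with the given verb form."""
    return [
        name for name, nouns in READINGS.items()
        if subject_gnp(nouns) == VERB_GNP[verb_form]
    ]


print(compatible_readings("Aye"))
print(compatible_readings("AI"))
```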