1
Tree Kernel-based Semantic Relation Extraction
using Unified Dynamic Relation Tree
  • Reporter: Longhua Qian
  • School of Computer Science and Technology
  • Soochow University, Suzhou, China
  • 2008.07.23
  • ALPIT 2008, Dalian, China

2
Outline
  • 1. Introduction
  • 2. Dynamic Relation Tree
  • 3. Unified Dynamic Relation Tree
  • 4. Experimental results
  • 5. Conclusion and Future Work

3
1. Introduction
  • Information extraction is an important research
    topic in NLP.
  • It attempts to find relevant information in the
    large volume of text documents available in
    digital archives and on the WWW.
  • Information extraction tasks defined by NIST ACE:
  • Entity Detection and Tracking (EDT)
  • Relation Detection and Characterization (RDC)
  • Event Detection and Characterization (EDC)

4
RDC
  • Function
  • RDC detects and classifies semantic relationships
    (usually of predefined types) between pairs of
    entities. Relation extraction is very useful for
    a wide range of advanced NLP applications, such
    as question answering and text summarization.
  • E.g.
  • The sentence "Microsoft Corp. is based in
    Redmond, WA" conveys the relation GPE-AFF.Based
    between "Microsoft Corp." (ORG) and "Redmond"
    (GPE). A minimal sketch of such a relation
    instance follows below.
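A minimal sketch (not from the paper) of how such a relation instance could be represented in code; the class and field names are illustrative assumptions only.

from dataclasses import dataclass

@dataclass
class EntityMention:
    text: str    # surface string, e.g. "Microsoft Corp."
    etype: str   # ACE entity type, e.g. "ORG" or "GPE"

@dataclass
class RelationInstance:
    sentence: str
    e1: EntityMention
    e2: EntityMention
    label: str   # relation type, e.g. "GPE-AFF.Based"

example = RelationInstance(
    sentence="Microsoft Corp. is based in Redmond, WA",
    e1=EntityMention("Microsoft Corp.", "ORG"),
    e2=EntityMention("Redmond", "GPE"),
    label="GPE-AFF.Based",
)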

5
Two approaches
  • Feature-based methods
  • have dominated the research in relation
    extraction over the past years. However, related
    work shows that it is difficult to extract new
    effective features and further improve the
    performance.
  • Kernel-based methods
  • compute the similarity of two objects (e.g. parse
    trees) directly. The key problem is how to
    represent and capture the structured information
    in complex structures, such as the syntactic
    information in the parse tree, for relation
    extraction.

6
Kernel-based related work
  • Zelenko et al. (2003), Culotta and Sorensen
    (2004), and Bunescu and Mooney (2005) described
    several kernels over shallow parse trees or
    dependency trees to extract semantic relations.
  • Zhang et al. (2006) and Zhou et al. (2007)
    proposed composite kernels consisting of a linear
    kernel and a convolution parse tree kernel; the
    latter can effectively capture the structured
    syntactic information inherent in parse trees. A
    sketch of such a convolution tree kernel follows
    below.
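The convolution parse tree kernel of Collins and Duffy (2001) counts the common subtrees of two parse trees, with a decay factor penalizing larger subtrees. Below is a minimal sketch over nltk trees; it is an illustration only, not the implementation used in this work (the experiments rely on Moschitti's toolkit).

from nltk import Tree

LAMBDA = 0.4  # decay factor; 0.4 is also the value used in the experiments below

def production(node):
    # The CFG production rooted at this node, as a hashable tuple.
    children = tuple(c.label() if isinstance(c, Tree) else c for c in node)
    return (node.label(), children)

def is_preterminal(node):
    return all(not isinstance(c, Tree) for c in node)

def delta(n1, n2):
    # Decayed count of common subtrees rooted at n1 and n2 (Collins-Duffy).
    if production(n1) != production(n2):
        return 0.0
    if is_preterminal(n1):
        return LAMBDA
    score = LAMBDA
    for c1, c2 in zip(n1, n2):
        if isinstance(c1, Tree):
            score *= 1.0 + delta(c1, c2)
    return score

def tree_kernel(t1, t2):
    # K(T1, T2) = sum of delta over all pairs of nodes.
    return sum(delta(a, b) for a in t1.subtrees() for b in t2.subtrees())

t1 = Tree.fromstring("(S (NP (NNP John)) (VP (VBD bought) (NP (DT a) (NN car))))")
t2 = Tree.fromstring("(S (NP (NNP Mary)) (VP (VBD bought) (NP (DT a) (NN book))))")
print(tree_kernel(t1, t2))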

7
Structured syntactic information
  • A tree span for a relation instance
  • a part of the parse tree used to represent the
    structured syntactic information for relation
    extraction.
  • Two currently used tree spans
  • PT (Path-enclosed Tree): the sub-tree enclosed by
    the shortest path linking the two entities in the
    parse tree. A sketch of PT extraction follows
    this list.
  • CSPT (Context-Sensitive Path-enclosed Tree):
    dynamically determined by further extending PT
    with the necessary predicate-linked path
    information outside it.
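A minimal sketch of PT extraction over an nltk parse tree, assuming the two entity mentions are given as leaf spans; this is a simplified reimplementation for illustration, not the authors' code.

from nltk import Tree

def _keep_span(node, lo, hi, start):
    # Return (pruned copy or None, number of leaves under this node),
    # keeping only the leaves whose index falls inside [lo, hi).
    if not isinstance(node, Tree):                 # a leaf token
        return (node if lo <= start < hi else None), 1
    kept, width = [], 0
    for child in node:
        new_child, w = _keep_span(child, lo, hi, start + width)
        width += w
        if new_child is not None:
            kept.append(new_child)
    return (Tree(node.label(), kept) if kept else None), width

def path_enclosed_tree(tree, e1_span, e2_span):
    # PT: the smallest subtree covering both entity mentions, with all
    # constituents outside their combined leaf span pruned away.
    # e1_span / e2_span are (start, end) leaf indices, end exclusive.
    lo = min(e1_span[0], e2_span[0])
    hi = max(e1_span[1], e2_span[1])
    pruned, _ = _keep_span(tree, lo, hi, 0)
    # Drop the unary chain left above the lowest node covering the span.
    while isinstance(pruned, Tree) and len(pruned) == 1 and isinstance(pruned[0], Tree):
        pruned = pruned[0]
    return pruned

sent = Tree.fromstring(
    "(S (NP (DT The) (NN company)) (VP (VBD said) (SBAR (S (NP (NNP Microsoft))"
    " (VP (VBZ is) (VP (VBN based) (PP (IN in) (NP (NNP Redmond)))))))) (. .))")
print(path_enclosed_tree(sent, (3, 4), (7, 8)))
# -> (S (NP (NNP Microsoft)) (VP (VBZ is) (VP (VBN based) (PP (IN in) (NP (NNP Redmond))))))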

8
Current problems
  • Noisy information
  • Both PT and CSPT may still contain noisy
    information; in other words, more noise should be
    pruned away from the tree span.
  • Useful information
  • CSPT captures only the part of the
    context-sensitive information that relates to the
    predicate-linked path. That is to say, more
    information outside PT/CSPT could be recovered to
    help discern the relationship between the two
    entities.

9
Our solution
  • Dynamic Relation Tree (DRT)
  • Based on PT, we apply a variety of
    linguistics-driven rules to dynamically prune out
    noisy information from a syntactic parse tree and
    include necessary contextual information.
  • Unified Dynamic Relation Tree (UDRT)
  • Instead of constructing composite kernels,
    various kinds of entity-related semantic
    information, including entity types, subtypes,
    mention levels, etc., are unified into the
    Dynamic Relation Tree.

10
2. Dynamic Relation Tree
  • Generation of DRT
  • Starting from PT, we further apply three kinds of
    operations (i.e. Remove, Compress, and Expansion)
    sequentially to reshape PT, finally giving rise
    to a Dynamic Relation Tree.
  • Remove operations
  • DEL_ENT2_PRE: removing all the constituents
    (except the headword) of the 2nd entity (a sketch
    of this rule follows below)
  • DEL_PATH_ADVP/PP: removing adverbial or
    prepositional phrases along the path
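A minimal sketch of the DEL_ENT2_PRE rule on an nltk tree. The rightmost-noun head heuristic used here is an assumption for illustration, not necessarily the head finder used in the paper.

from nltk import Tree

def del_ent2_pre(entity_np):
    # Keep only the headword of the 2nd entity's NP, dropping its other
    # constituents; the head is approximated as the rightmost NN* child.
    head = None
    for child in reversed(list(entity_np)):
        label = child.label() if isinstance(child, Tree) else ""
        if label.startswith("NN"):        # NN, NNS, NNP, NNPS
            head = child
            break
    if head is None and len(entity_np) > 0:
        head = entity_np[-1]              # fall back to the last child
    entity_np[:] = [head]                 # prune in place
    return entity_np

np2 = Tree.fromstring("(NP (DT the) (JJ giant) (NN software) (NN maker))")
print(del_ent2_pre(np2))                  # (NP (NN maker))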

11
DRT (cont.)
  • Compress operations
  • CMP_NP_CC_NP: compressing noun phrase
    coordination conjunctions
  • CMP_VP_CC_VP: compressing verb phrase
    coordination conjunctions
  • CMP_SINGLE_INOUT: compressing single in-and-out
    nodes (a sketch of this rule follows below)
  • Expansion operations
  • EXP_ENT2_POS: expanding the possessive structure
    after the 2nd entity
  • EXP_ENT2_COREF: expanding an entity coreferential
    mention before the 2nd entity
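A minimal sketch of the CMP_SINGLE_INOUT rule: a node with a single incoming and a single outgoing edge adds little structural information, so such unary chains are collapsed. Which end of the chain keeps its label is an assumption made here for illustration.

from nltk import Tree

def compress_single_inout(node):
    if not isinstance(node, Tree):
        return node
    # Compress the children first, bottom-up.
    children = [compress_single_inout(c) for c in node]
    # Collapse this node if its only child is itself a phrase node.
    if (len(children) == 1 and isinstance(children[0], Tree)
            and any(isinstance(g, Tree) for g in children[0])):
        return children[0]
    return Tree(node.label(), children)

t = Tree.fromstring("(NP (NP (NP (NNP Redmond))))")
print(compress_single_inout(t))           # (NP (NNP Redmond))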

12
Some examples of DRT
13
3. Unified Dynamic Relation Tree
  • T1: DRT
  • T2: UDRT-Bottom
  • T3: UDRT-Entity
  • T4: UDRT-Top

14
Four UDRT setups
  • T1: DRT
  • there is no entity-related information except
    the entity order (i.e. E1 and E2).
  • T2: UDRT-Bottom
  • the DRT with entity-related information attached
    below the two entity nodes
  • T3: UDRT-Entity
  • the DRT with entity-related information attached
    to the entity nodes themselves
  • T4: UDRT-Top
  • the DRT with entity-related features attached at
    the top node of the tree. The three attachment
    positions are sketched below.
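A minimal sketch of the three attachment positions, unifying a single entity-related feature (the entity type) into the tree; the TYPE node label and the E1/E2 node names are illustrative assumptions.

from nltk import Tree

def attach_bottom(drt, entity_positions, feats):
    # UDRT-Bottom: hang a feature node under each entity node.
    for pos, feat in zip(entity_positions, feats):
        drt[pos].append(Tree("TYPE", [feat]))
    return drt

def attach_entity(drt, entity_positions, feats):
    # UDRT-Entity: fold the feature into the entity node's label.
    for pos, feat in zip(entity_positions, feats):
        drt[pos].set_label(drt[pos].label() + "-" + feat)
    return drt

def attach_top(drt, feats):
    # UDRT-Top: add the features as extra children of the top node.
    for feat in feats:
        drt.append(Tree("TYPE", [feat]))
    return drt

drt = Tree.fromstring(
    "(NP (E1 (NNP Microsoft)) (VP (VBN based) (PP (IN in) (E2 (NNP Redmond)))))")
print(attach_top(drt.copy(deep=True), ["ORG", "GPE"]))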

15
4. Experimental results
  • Corpus Statistics
  • The ACE RDC 2004 data contains 451 documents and
    5702 relation instances. It defines 7 major
    entity types, 7 major relation types and 23
    relation subtypes.
  • Evaluation is done on 347 (nwire/bnews) documents
    and 4307 relation instances using 5-fold
    cross-validation.
  • Corpus processing
  • parsed using Charniak's parser (Charniak, 2001)
  • Relation instances are generated by iterating
    over all pairs of entity mentions occurring in
    the same sentence, as sketched below.
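A minimal sketch of this candidate-generation step; the data layout (mention dicts with an id, a gold-relation lookup table, and a NONE class for unrelated pairs) is assumed for illustration.

from itertools import combinations

def generate_instances(sentences, gold_relations):
    # sentences: list of (tokens, mentions); each mention is a dict with an "id".
    # gold_relations maps (sentence_id, id1, id2) -> relation type.
    instances = []
    for sid, (tokens, mentions) in enumerate(sentences):
        for m1, m2 in combinations(mentions, 2):       # every pair in the sentence
            label = gold_relations.get((sid, m1["id"], m2["id"]), "NONE")
            instances.append(
                {"sentence": tokens, "e1": m1, "e2": m2, "label": label})
    return instances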

16
Classifier
  • Tools
  • SVMLight (Joachims, 1998)
  • Tree Kernel Toolkit (Moschitti, 2004)
  • The training parameters C (SVM) and λ (tree
    kernel) are set to 2.4 and 0.4 respectively.
  • One vs. others strategy
  • which builds K basic binary classifiers so as to
    separate one class from all the others, as
    sketched below.
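A minimal sketch of the one-vs-others strategy on top of a precomputed tree-kernel Gram matrix; scikit-learn stands in for SVMLight and the Tree Kernel Toolkit here purely for illustration.

import numpy as np
from sklearn.svm import SVC

def train_one_vs_others(gram, labels, C=2.4):
    # One binary SVM per relation type: that type vs. all the others.
    classifiers = {}
    for rel in sorted(set(labels)):
        y = np.array([1 if l == rel else -1 for l in labels])
        clf = SVC(kernel="precomputed", C=C)
        clf.fit(gram, y)
        classifiers[rel] = clf
    return classifiers

def predict(classifiers, gram_test):
    # gram_test holds kernel values between test and training instances;
    # pick the class whose binary classifier returns the largest margin.
    rels = list(classifiers)
    scores = np.vstack([classifiers[r].decision_function(gram_test) for r in rels])
    return [rels[i] for i in scores.argmax(axis=0)]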

17
Contribution of various operation rules
  • Each operation rule is applied incrementally on
    top of the previously derived tree span.
  • A plus sign preceding a rule indicates that the
    rule is useful and is retained automatically in
    the next round.
  • Otherwise, the performance is unavailable ('-'
    in the table).
  • A sketch of this greedy selection follows the
    table.

Operation rules P R F
PT (baseline) 76.3 59.8 67.1
DEL_ENT2_PRE 76.3 62.1 68.5
DEL_PATH_PP - - -
DEL_PATH_ADVP - - -
CMP_SINGLE_INOUT 76.4 63.1 69.1
CMP_NP_CC_NP 76.1 63.3 69.1
CMP_VP_CC_VP - - -
EXP_ENT2_POS 76.6 63.8 69.6
EXP_ENT2_COREF 77.1 64.3 70.1
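A minimal sketch of the greedy procedure behind this table: each rule is tried on top of the rules kept so far and is retained only if it improves the F-measure. The evaluate() callback stands in for a full 5-fold cross-validation run.

RULES = ["DEL_ENT2_PRE", "DEL_PATH_PP", "DEL_PATH_ADVP",
         "CMP_SINGLE_INOUT", "CMP_NP_CC_NP", "CMP_VP_CC_VP",
         "EXP_ENT2_POS", "EXP_ENT2_COREF"]

def select_rules(evaluate, rules=RULES):
    # evaluate(active_rules) -> F-measure; assumed to be provided elsewhere.
    active = []
    best_f = evaluate(active)                 # the PT baseline
    for rule in rules:
        f = evaluate(active + [rule])
        if f > best_f:                        # keep only rules that help
            active.append(rule)
            best_f = f
    return active, best_f
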
18
Comparison of different UDRT setups
Tree Setups P R F
DRT 68.7 53.5 60.1
UDRT-Bottom 76.2 64.4 69.8
UDRT-Entity 77.1 64.3 70.1
UDRT-Top 76.4 65.2 70.4
  • Compared with DRT, the Unified Dynamic Relation
    Trees (UDRTs) with only entity type information
    significantly improve the F-measure, by about 10
    units on average, owing to increases in both
    precision and recall.
  • Among the three UDRTs, UDRT-Top achieves slightly
    better performance than the other two.

19
Improvements of different tree setups over PT
Tree Setups P R F
CSPT over PT 1.5 1.1 1.3
DRT over PT 0.1 5.4 3.3
UDRT-Top over PT 3.9 9.4 7.2
  • The Dynamic Relation Tree (DRT) performs better
    than the CSPT/PT setups.
  • The Unified Dynamic Relation Tree with
    entity-related semantic features attached at the
    top node of the parse tree performs best.

20
Comparison with best-reported systems
Systems P R F
Zhou et al.: Composite kernel 82.2 70.2 75.8
Zhang et al.: Composite kernel 76.1 68.4 72.1
Zhao and Grishman: Composite kernel 69.2 70.5 70.4
Ours: CTK with UDRT-Top 80.2 69.2 74.3
Zhou et al.: CS-CTK with CSPT 81.1 66.7 73.2
Zhang et al.: CTK with PT 74.1 62.4 67.7
  • This shows that our UDRT-Top performs best among
    the tree setups using a single kernel, and even
    outperforms two of the previously reported
    composite kernels.

21
5. Conclusion
  • Dynamic Relation Tree (DRT), which is generated
    by applying various linguistics-driven rules, can
    significantly improve the performance over
    currently used tree spans for relation
    extraction.
  • Integrating entity-related semantic information
    into the DRT can further improve the performance,
    especially when it is attached at the top node of
    the tree.

22
Future Work
  • We will focus on semantic matching in computing
    the similarity between two parse trees, where
    semantic similarity between content words (such
    as "hire" and "employ") would be considered to
    achieve better generalization; a small
    WordNet-based sketch follows.
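As an illustration of the kind of lexical-semantic matching envisaged here (not the method proposed in this work), WordNet similarity between content words could be computed as follows.

from nltk.corpus import wordnet as wn

def word_similarity(w1, w2, pos=wn.VERB):
    # Best WordNet path similarity over all synset pairs (0.0 if none found).
    best = 0.0
    for s1 in wn.synsets(w1, pos=pos):
        for s2 in wn.synsets(w2, pos=pos):
            sim = s1.path_similarity(s2)
            if sim is not None and sim > best:
                best = sim
    return best

print(word_similarity("hire", "employ"))      # high for near-synonyms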

23
References
  • Bunescu R. C. and Mooney R. J. 2005. A Shortest
    Path Dependency Kernel for Relation Extraction.
    EMNLP-2005.
  • Charniak E. 2001. Immediate-Head Parsing for
    Language Models. ACL-2001.
  • Collins M. and Duffy N. 2001. Convolution Kernels
    for Natural Language. NIPS-2001.
  • Collins M. and Duffy N. 2002. New Ranking
    Algorithms for Parsing and Tagging: Kernels over
    Discrete Structures, and the Voted Perceptron.
    ACL-2002.
  • Culotta A. and Sorensen J. 2004. Dependency Tree
    Kernels for Relation Extraction. ACL-2004.
  • Joachims T. 1998. Text Categorization with
    Support Vector Machines: Learning with Many
    Relevant Features. ECML-1998.
  • Moschitti A. 2004. A Study on Convolution Kernels
    for Shallow Semantic Parsing. ACL-2004.
  • Zelenko D., Aone C. and Richardella A. 2003.
    Kernel Methods for Relation Extraction. Journal
    of Machine Learning Research, 2003(2): 1083-1106.
  • Zhang M., Zhang J., Su J. and Zhou G.D. 2006. A
    Composite Kernel to Extract Relations between
    Entities with both Flat and Structured Features.
    COLING-ACL-2006.
  • Zhao S.B. and Grishman R. 2005. Extracting
    Relations with Integrated Information Using
    Kernel Methods. ACL-2005.
  • Zhou G.D., Su J., Zhang J. and Zhang M. 2005.
    Exploring Various Knowledge in Relation
    Extraction. ACL-2005.

24
End
Thank You!