Hindi POS tagging and chunking: An MEMM approach

About This Presentation

Slides: 19

Provided by: cseIi8

Transcript and Presenter's Notes

1
Hindi POS tagging and chunking: An MEMM approach

2
Introduction

Lexical analysis
Part-of-speech (POS) tagging: assigning a part of speech
to each word, e.g. noun, verb.
Syntactic analysis
Chunking: identifying and labeling phrases as verb
phrase, noun phrase, etc.

3
Outline

4
Maximum Entropy Markov Model

Maximum entropy principle
The least biased model which considers all known
information is the one which maximizes entropy.
Mathematical formulation
Maximize entropy
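A common way to write this objective, following Berger et al. (1996), is sketched below. The notation is an assumption, since the slide's own equation is not preserved in this transcript: h is a history (context), t is a tag, \tilde{p} is the empirical distribution estimated from the training corpus, and f_j is the j-th feature.

```latex
\begin{aligned}
\max_{p}\quad & H(p) = -\sum_{h,t} \tilde{p}(h)\, p(t \mid h)\, \log p(t \mid h) \\
\text{subject to}\quad & \sum_{h,t} \tilde{p}(h)\, p(t \mid h)\, f_j(h,t)
  = \sum_{h,t} \tilde{p}(h,t)\, f_j(h,t) \quad \text{for all } j, \\
& \sum_{t} p(t \mid h) = 1 \quad \text{for all } h.
\end{aligned}
```

The first constraint says that the model's expected value of each feature must match its observed value in the training data; entropy is maximized among all distributions that satisfy it.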

5
Maximum Entropy Markov Model

The distribution with the maximum entropy (p) is
equivalent to
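In the standard result (Berger et al., 1996; Ratnaparkhi, 1996), the maximum entropy distribution is the log-linear model below, and an MEMM scores a tag sequence as a product of such per-position distributions. The symbols (history h_i, tag t_i, weights \lambda_j, features f_j) are assumed notation, not taken from the slide.

```latex
\begin{aligned}
p^{*}(t \mid h) &= \frac{1}{Z(h)} \exp\Big(\sum_{j} \lambda_j f_j(h,t)\Big), \qquad
Z(h) = \sum_{t'} \exp\Big(\sum_{j} \lambda_j f_j(h,t')\Big) \\
P(t_1 \dots t_n \mid w_1 \dots w_n) &\approx \prod_{i=1}^{n} p^{*}(t_i \mid h_i)
\end{aligned}
```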

6
System overview

Parameter estimation and classification
GIS (Generalized Iterative Scaling)
finds the model parameters that define the
maximum entropy classifier for a given feature
set and training corpus
Beam search
a heuristic search algorithm; an optimization of
best-first search
expands only the m most promising nodes at each
depth
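The two components above can be sketched on a toy example. This is a minimal illustration, not the presenters' system: the tagset, the two feature templates, the training events, and the names train_gis, tag_probs, and beam_search are all assumptions made for the example. Every (history, tag) pair fires exactly two binary features, so the feature sum is the constant C = 2 that GIS requires without a correction feature.

```python
import math
from collections import defaultdict

TAGS = ["N", "V", "D"]  # toy tagset (assumption, not the Hindi tagset)
C = 2                   # every (history, tag) pair fires exactly C features


def features(word, prev_tag, tag):
    # Hypothetical feature templates: current word + tag, previous tag + tag.
    return [f"word={word}|tag={tag}", f"prev={prev_tag}|tag={tag}"]


def tag_probs(weights, word, prev_tag):
    # Log-linear model: p(tag | history) is proportional to exp(sum of weights).
    scores = {t: math.exp(sum(weights[f] for f in features(word, prev_tag, t)))
              for t in TAGS}
    z = sum(scores.values())
    return {t: s / z for t, s in scores.items()}


def train_gis(events, iterations=100):
    # events: list of (word, prev_tag, tag) triples from a tagged corpus.
    weights = defaultdict(float)
    empirical = defaultdict(float)
    for word, prev_tag, tag in events:
        for f in features(word, prev_tag, tag):
            empirical[f] += 1.0
    for _ in range(iterations):
        expected = defaultdict(float)
        for word, prev_tag, _tag in events:
            probs = tag_probs(weights, word, prev_tag)
            for t in TAGS:
                for f in features(word, prev_tag, t):
                    expected[f] += probs[t]
        # GIS update: move each weight by (1/C) * log(observed / expected).
        # Simplification: features never observed in training keep weight 0.
        for f, count in empirical.items():
            weights[f] += math.log(count / expected[f]) / C
    return weights


def beam_search(weights, words, beam_width=2):
    # Each beam entry is (log-probability, tag sequence so far).
    beam = [(0.0, ["<s>"])]
    for word in words:
        candidates = []
        for logp, seq in beam:
            for t, p in tag_probs(weights, word, seq[-1]).items():
                candidates.append((logp + math.log(p), seq + [t]))
        # Keep only the beam_width most promising partial sequences.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_width]
    return beam[0][1][1:]  # drop the start symbol


toy_events = [
    ("the", "<s>", "D"), ("dog", "D", "N"), ("runs", "N", "V"),
    ("the", "<s>", "D"), ("cat", "D", "N"), ("sleeps", "N", "V"),
]
toy_weights = train_gis(toy_events)
print(beam_search(toy_weights, ["the", "dog", "runs"]))
```

Here beam_width plays the role of m: a width of 1 reduces to greedy tagging, while a larger width trades speed for a better chance of recovering the highest-scoring tag sequence.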

7
What are features?
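In maximum entropy taggers such as Ratnaparkhi (1996), a feature is typically a binary indicator defined on a (history, tag) pair. The example below is purely illustrative; the suffix condition and the tag name are hypothetical and are not taken from the slides.

```latex
f_j(h, t) =
\begin{cases}
1 & \text{if the current word in } h \text{ ends in a given suffix and } t = \text{VERB} \\
0 & \text{otherwise}
\end{cases}
```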

8
POS tagging features

9
POS tagging features

10
Chunking features

11
Results

12
Accuracy across runs

13
Error Analysis

14
Future Work

15
References

Adwait Ratnaparkhi. 1996. A maximum entropy
model for part-of-speech tagging. In Eric Brill
and Kenneth Church, editors, Proceedings of the
Conference on Empirical Methods in NLP, pages
133-142. ACL. Somerset, New Jersey.
Adwait Ratnaparkhi. 1997. A simple introduction
to maximum entropy models for natural language
processing. Technical Report 97-08, Institute for
Research in Cognitive Science, University of
Pennsylvania.

16
References

Adam L. Berger, Vincent J. Della Pietra, and
Stephen A. Della Pietra. 1996. A maximum entropy
approach to natural language processing.
Computational Linguistics, 22(1), pages 39-71.
Akshay Singh, Sushma Bendre, and Rajeev Sangal.
2005. HMM based chunker for Hindi. In Proceedings
of IJCNLP-05. Jeju Island, Republic of Korea.

17
References