September 23, 2004

Title: Prague Arabic Dependency Treebank

Prague Arabic Dependency Treebank
Development in Data and Tools
Jan Hajič, Otakar Smrž, Petr Zemánek, Jan Šnaidauf, Emanuel Beška
Faculty of Mathematics and Physics, Faculty of Philosophy and Arts
Charles University in Prague

Project Release PADT 1.0
  • December 2004, Linguistic Data Consortium
  • 148 000 Morpho, 113 500 Syntax

Code   Syntax   Morpho   Source and origin
AFP    13 000   N/A      France Presse Penn ATB 1
UMH    38 500   N/A      Ummah Press Penn ATB 2
XIN    13 500   N/A      Xinhua News A Gigaword
ALH    10 000   73 500   Al-Hayat News A Gigaword
ANN    12 500   25 500   An-Nahar News A Gigaword
XIA    26 500   49 500   Xinhua News A Gigaword

Open-Source Tools
  • TrEd Tree Editor
      • Multi-purpose annotation environment
      • Suite of programming utilities
  • Netgraph Search Engine
      • Server/Client system architecture
      • Easy-to-learn query language
  • EncodeArabic Perl Module
      • Extension for processing of Arabic script
      • ArabTeX, Buckwalter, Unicode,
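
To illustrate the kind of conversion such an encoding module performs, here is a minimal Python sketch of a mapping between Buckwalter transliteration and Unicode Arabic. It is not the EncodeArabic API (which is a Perl module). The function names are hypothetical, and characters outside the table are simply passed through unchanged.

```python
# Buckwalter transliteration table (ASCII symbol -> Arabic code point).
BUCKWALTER_TO_UNICODE = {
    "'": "\u0621", "|": "\u0622", ">": "\u0623", "&": "\u0624",
    "<": "\u0625", "}": "\u0626", "A": "\u0627", "b": "\u0628",
    "p": "\u0629", "t": "\u062A", "v": "\u062B", "j": "\u062C",
    "H": "\u062D", "x": "\u062E", "d": "\u062F", "*": "\u0630",
    "r": "\u0631", "z": "\u0632", "s": "\u0633", "$": "\u0634",
    "S": "\u0635", "D": "\u0636", "T": "\u0637", "Z": "\u0638",
    "E": "\u0639", "g": "\u063A", "_": "\u0640", "f": "\u0641",
    "q": "\u0642", "k": "\u0643", "l": "\u0644", "m": "\u0645",
    "n": "\u0646", "h": "\u0647", "w": "\u0648", "Y": "\u0649",
    "y": "\u064A", "F": "\u064B", "N": "\u064C", "K": "\u064D",
    "a": "\u064E", "u": "\u064F", "i": "\u0650", "~": "\u0651",
    "o": "\u0652", "`": "\u0670", "{": "\u0671",
}

# The table is one-to-one, so it can be inverted directly.
UNICODE_TO_BUCKWALTER = {v: k for k, v in BUCKWALTER_TO_UNICODE.items()}


def buckwalter_to_unicode(text: str) -> str:
    """Map each Buckwalter symbol to its Arabic character (hypothetical helper)."""
    return "".join(BUCKWALTER_TO_UNICODE.get(ch, ch) for ch in text)


def unicode_to_buckwalter(text: str) -> str:
    """Inverse mapping; unknown characters are passed through unchanged."""
    return "".join(UNICODE_TO_BUCKWALTER.get(ch, ch) for ch in text)
```

A character-by-character table is enough here because Buckwalter transliteration assigns exactly one ASCII symbol to each Arabic character. Notations such as ArabTeX need context-sensitive rules and are outside the scope of this sketch.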

PADT Functional Views
  • Functional Generative Description
      • Theory of linguistic meaning and its expression
      • Prague Dependency Treebank for Czech
  • Independence of representation levels
      • Tectogrammatical: linguistic meaning
      • Analytical: surface dependency syntax
      • Morphological: categories and lexical units
  • Abstraction of the relations across levels
      • Strict distinction between form and function
      • Different units of description on each level

Functional Morphology
  • Provides syntax levels with their abstract
    language, not just giving letters in tokens
  • Revives multiple senses of categories
  • Completeness of generation
  • Strict modeling of grammatical control
  • MorphoTrees human tagging
  • Successful prototype feature-based tagger

Syntactic Levels of Description
  • Analytical level
      • Pragmatically motivated, close to surface syntax
      • Every single token resulting from the morphological
        level forms one node
      • Tree-like dependency structure for every sentence
  • Tectogrammatical level
      • Linguistic (literal) meaning, deep relations, TFA
        (topic-focus articulation)
      • Initial structures transformed from the analytical level
      • Nodes for autosemantic words only
      • Decisive role of valency frames

Logic of Analytical Trees
  • Concepts of dependency and valency
  • Reduction: a sentence must remain grammatically
    correct if leaves (terminal nodes) are chopped off
  • Trees: clause components → clauses → sentences →
    paragraphs etc.
  • Subtrees of clauses are exchangeable for non-clauses
  • Nodes: words, tokenized parts of words, and
    punctuation marks, marked by functions
  • Edges: syntactic relations, governing node →
    dependent node/subtree
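
The node, edge and reduction notions above can be sketched as a small data structure. This is only an illustration in Python, not the TrEd data model: the class name `Node`, its fields and the helper functions are hypothetical, and attaching the object directly under the predicate in the example is an assumption.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(eq=False)  # nodes are compared by identity, not by field values
class Node:
    form: str                    # word form (token)
    function: str                # analytical function label, e.g. "Pred", "Obj"
    gloss: str = ""
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list)

    def attach(self, child: "Node") -> "Node":
        """Add an edge from this governing node to a dependent node."""
        child.parent = self
        self.children.append(child)
        return child


def iter_nodes(root: Node) -> Iterator[Node]:
    """Yield the root and then every node of its subtree, depth first."""
    yield root
    for child in root.children:
        yield from iter_nodes(child)


def chop_leaf(leaf: Node) -> None:
    """Reduction step: remove one terminal node from the tree."""
    if leaf.children:
        raise ValueError("only leaves may be chopped off")
    if leaf.parent is None:
        raise ValueError("the root cannot be chopped off")
    leaf.parent.children.remove(leaf)
    leaf.parent = None


def build_example() -> Node:
    """Two-node tree: kataba (Pred, he-wrote) governs kitAban (Obj, a-book)."""
    root = Node("kataba", "Pred", "he-wrote")
    root.attach(Node("kitAban", "Obj", "a-book"))
    return root
```

Calling `chop_leaf` on the `kitAban` node leaves a one-node tree headed by `kataba`. Under the reduction principle, what remains should still be a grammatical sentence.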

Some Syntax Issues of Arabic
  • Non-verbal predication of several types
  • Subordinate non-verbal clauses / modification
  • Verb-like behavior of many nominal forms
  • Mostly VSO in verbal sentences, but
      • vice versa in non-verbal clauses
      • different, depending on context boundness
  • Compound verbs, fixed composite prepositions
  • Grammatical co-reference, accusative of inner
    object, complex referencing, etc.

Problem I: Predication
  • Head node of tree: PREDICATE
  • Why? Steady role in sentence, cannot be omitted
  • Verbal predicate: I-go to school
  • Non-verbal predicate
      • Nominal: The-house a-big (the house is big)
      • Existential: There a-city (there is a city)
      • Prepositional
          • Possessive: For him a-house (he has a house)
          • Adverbial: The-mosque in the-city (is)
      • Conjunctional: The-problem that (is that)

Predication Types in Trees
[Figure: dependency trees for the predication types; node labels listed in extraction order]
Panel labels: Prepositional (adverbial, locative); Verb-like behavior (object of noun?)

Form            Function   Gloss
dAma            Pred       lasted
kabIrun         Pnom       a-big nom.
iqtirAHu        Sb         proposal
sAEatayni       Adv        two-hours acc.
al-baytu        Sb         the-house nom.
-hu             Atr        his
al-EamalIyata   Obj        the-operation acc.
EalA            AuxP       on
vamata          PredE      there-is
zumalAi         Obj        colleagues
la-             PredP      for
madInatun       Sb         a-city nom.
-hi             Atr        his
fI              PredP      in
-hu             Obj        him
baytun          Sb         a-house nom.
al-madInati     Adv        the-city gen.
al-jAmiEu       Sb         the-mosque nom.
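
The function labels on the predicate nodes above (Pred, Pnom, PredE, PredP) suggest a simple lookup from the head node's label to the predication type. The Python sketch below assumes that reading. The mapping table and the function name are hypothetical, and the conjunctional type is omitted because no label for it appears in the slides.

```python
from typing import Optional

# Assumed correspondence between head-node labels and predication types,
# read off the node labels shown in the trees.
PREDICATION_BY_FUNCTION = {
    "Pred": "verbal",
    "Pnom": "nominal",
    "PredE": "existential",
    "PredP": "prepositional",
}


def predication_type(head_function: str) -> Optional[str]:
    """Return the predication type for a head-node label, or None if unknown."""
    return PREDICATION_BY_FUNCTION.get(head_function)


# Predicate nodes taken from the node listing (form, function).
HEADS = [
    ("dAma", "Pred"),
    ("kabIrun", "Pnom"),
    ("vamata", "PredE"),
    ("la-", "PredP"),
    ("fI", "PredP"),
]

for form, function in HEADS:
    print(f"{form:10s} {function:6s} {predication_type(function)}")
```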

Problem II: Clauses and Co-reference
  • Recursiveness: a subordinate clause is contained
    as a subtree in place of a simple element
      • Head node of clause gets the same function
      • Problem: non-verbal structures, clauses or not?
      • Compound verbs (mA zAla etc.) treated equally
  • Grammatical co-reference: personal pronoun
    formally required by another element
      • Pronoun must be marked to be treated as such
      • Target of reference is unambiguously identifiable
      • Often in subordinate clauses, mostly attributive
      • Ex.: He-wrote a-book number its-pages

Clauses and Co-reference in Trees
[Figure: dependency trees for clauses and co-reference; node labels listed in extraction order]
Panel labels:
  • Compound verb, formed as main verb and its
  • Attributive clause, prepositional predicate
  • Objective clause, verbal predicate
  • Attributive clause, nominal predicate
  • Referencing pronoun, as attribute in clause
  • Referencing pronoun, as adverbial in clause

Form        Function    Gloss
zAlat       Pred        she-stopped
kataba      Pred        he-wrote
kitAban     Obj         a-book
mA          AuxM        not
tuHisu      Atv         she-feels
al-rajulu   Sb          the-man nom.
fI          Atr_PredP   in
zaynabu     Sb          Zaynab
miatu       Sb          hundred nom.
anna        AuxC        that
-hi         Adv_Ref     it
tuEjibu     Obj_Pred    they-impress
SafHatin    Atr         pages gen.
jumalan     Sb          sentences acc.
wADiHun     Atr_Pnom    clear nom.
naHwu       Sb          grammar nom.
-hA         Obj         her
-hA         Atr_Ref     their
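
Several labels in these trees join two parts with an underscore (Atr_PredP, Obj_Pred, Atr_Pnom, Adv_Ref, Atr_Ref). One plausible reading, in line with the head node of a clause taking over the function of the element it replaces, is that the first part is the function within the governing structure and the second part is either the predicate type of the clause or a co-reference marker. The Python sketch below assumes that reading; the names `Label`, `parse_label` and `referencing_pronouns` are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Label(NamedTuple):
    function: str            # function within the governing structure, e.g. "Atr"
    suffix: Optional[str]    # predicate type ("PredP", "Pnom", "Pred") or "Ref"

    @property
    def is_reference(self) -> bool:
        """True for pronouns marked as grammatically co-referring."""
        return self.suffix == "Ref"


def parse_label(label: str) -> Label:
    """Split a label such as 'Atr_PredP' at the first underscore."""
    function, _, suffix = label.partition("_")
    return Label(function, suffix or None)


def referencing_pronouns(nodes: List[Tuple[str, str]]) -> List[str]:
    """Return the forms of all nodes whose label carries the Ref marker."""
    return [form for form, label in nodes if parse_label(label).is_reference]


# A few (form, label) pairs taken from the node listing.
NODES = [
    ("fI", "Atr_PredP"),
    ("-hi", "Adv_Ref"),
    ("tuEjibu", "Obj_Pred"),
    ("wADiHun", "Atr_Pnom"),
    ("-hA", "Obj"),
    ("-hA", "Atr_Ref"),
]

print(referencing_pronouns(NODES))
```

Keeping the two parts separate lets a query ask for all attributive clauses regardless of predicate type, or for all marked pronouns regardless of their function.
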
Future Prospects
  • Implementation of Functional Morphology
  • Tectogrammatical annotation
  • Lexicons of valency frames
  • Re-training the feature-based tagger on
  • Machine-learning on the treebank data for various

Thank you
  • Questions welcome!
  • http//