11682: Introduction to IR, NLP, MT and Speech - PowerPoint PPT Presentation

Provided by: csC76
1
11-682: Introduction to IR, NLP, MT and Speech
Natural Language Generation: Overview
2
Today's Topics
  • An overview of practical issues in building
    natural language generation (NLG) systems
  • Based on Reiter & Dale, 1997
  • Goal: produce understandable texts in a human
    language from some underlying representation

3
NLG Ingredients
  • A representation of the input (probably not
    human-friendly)
  • Knowledge of the domain
  • Knowledge of the target language
  • A human-friendly output format:
    documents, reports, explanations, help messages,
    technical instructions, etc.

4
Example NLG Applications
  • Forecasts from weather maps
  • Summarize results of DB queries
  • Explain complex (e.g. medical) information
  • Describe a chain of reasoning in an expert system
  • Answer questions about an object in a
    knowledge base

5
Authoring Aids
  • Template-based generation of routine documents
  • Examples:
  • discharge summaries, referral letters
  • letters to customers
  • management summaries
  • job descriptions
  • technical manuals

6
When Is NLG Appropriate?
  • Are graphics more useful?
  • Is human-quality output required?
  • How much stylistic variation?
  • Any legal liabilities / requirements?
  • Constraints posed by the problem domain? (e.g.
    bandwidth)

7
Templates (mail-merge)
  • Insert input data into pre-defined slots in a
    template document
  • More complex systems vary structure based on
    input
  • More limited than NLG
  • NLG can achieve higher quality
  • NLG is easier to adapt to changes
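
A minimal sketch of the mail-merge idea on this slide, using Python's standard `string.Template`. The letter wording, the field names and the `fill` helper are invented for illustration; the pluralisation step stands in for "vary structure based on input".

```python
from string import Template

# Hypothetical template document with pre-defined slots.
LETTER = Template("Dear $name, your order of $count $item has shipped.")

def fill(record: dict) -> str:
    """Insert input data into the template's slots (plain mail-merge)."""
    # A slightly more complex system varies the text based on the input:
    # here the noun is pluralised when count != 1.
    item = record["item"] if record["count"] == 1 else record["item"] + "s"
    return LETTER.substitute(name=record["name"], count=record["count"], item=item)

print(fill({"name": "Ms. Lee", "count": 2, "item": "lamp"}))
```

Anything the template author did not anticipate (irregular plurals, reordering, omitted sentences) needs another special case, which is the sense in which templates are more limited than NLG.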

8
Human vs. Machine
  • Is NLG a cost-effective solution?
  • Economics of NLG development
  • Systems are expensive
  • A large volume of output necessary to justify the
    expenditure
  • The cost / quality threshold
  • Can NLG provide the necessary quality at an
    acceptable price? (or at all?)

9
Requirements Analysis
  • NLG is an evolving technology...
  • ...so iterative prototyping is the most
    appropriate software engineering technique
  • Corpus-based methods:
  • Identify a sample of target texts
  • Associate each text with its internal
    representation (the input to NLG)
  • Specify the required NLG algorithms and data

10
Gathering A Corpus
  • Archived examples of human texts
  • Cover a full range of texts
  • If no corpus, ask experts to create one
    (associated costs & conflicts)
  • Document Table:
  • rows: domain categories (e.g., product lines,
    business areas, ...)
  • columns: document types (installation, user,
    maintenance, etc.)

11
Example Document Table
12
Analyzing the Information Content
  • Which parts convey information that isn't
    available to the NLG system? E.g., "When is the
    next train to Glasgow?" (requires external DB)
  • Analysis: classify sentences according to the
    information required:
  • unchanging text, direct data, computed data,
    unavailable data

13
Sentence Types
  • Unchanging Text: "Thank you for flying US Airways"
  • Directly-Available Data: "Scheduled departure is
    6:30pm"
  • Computable Data: "There are 20 flights to Boston"
  • Unavailable Data: "Due to ground delay in Pittsburgh"

Difficulty ranges from Easy to Hard or Impossible
(Rely on Humans for Unavailable Data)
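
The four sentence types can be sketched as four small functions over a flight record. The record layout, the flight numbers and the function names are assumptions made for this example; only the output sentences come from the slide.

```python
from typing import Optional

# Hypothetical flight record; field names and values are illustrative only.
flight = {"departure": "6:30pm", "boston_flights": ["US101", "US205", "US330"]}

def unchanging_text() -> str:
    # Fixed wording, independent of the input.
    return "Thank you for flying US Airways."

def direct_data(rec: dict) -> str:
    # The value is copied straight from the input.
    return f"Scheduled departure is {rec['departure']}."

def computed_data(rec: dict) -> str:
    # The value is derived from the input (here, a count).
    return f"There are {len(rec['boston_flights'])} flights to Boston."

def unavailable_data(human_note: Optional[str]) -> Optional[str]:
    # The system cannot know the reason for a delay; a human must supply it.
    return human_note

print(unchanging_text(), direct_data(flight), computed_data(flight))
```

The first three functions need nothing beyond the record, while the last one can only pass on what a person has typed in.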
14
6 Basic NLG Tasks
  • 1. Content Determination: what information should
    be conveyed?
  • 2. Discourse Planning: order & structure of
    message set
  • 3. Sentence Aggregation: grouping messages into
    sentences
  • 4. Lexicalization: words & phrases for concepts,
    relations
  • 5. Referring Expression Generation: words &
    phrases for entities
  • 6. Linguistic Realisation: syntax, morphology,
    orthography

15
Typical 3-Module Architecture
Q: How should these be represented?
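
The slide's diagram is not preserved in this transcript. The sketch below assumes the usual pipeline of text planner, sentence planner and linguistic realiser, with the six tasks grouped as noted in the docstrings. All function names, data fields and wording are hypothetical.

```python
# Toy three-stage pipeline; all data and function names are hypothetical.

def text_planner(data: dict) -> list:
    """Content determination + discourse planning: pick and order messages."""
    messages = [("count", data["destination"], data["trains_per_day"])]
    if "next_departure" in data:
        messages.append(("next", data["destination"], data["next_departure"]))
    return messages

def sentence_planner(text_plan: list) -> list:
    """Aggregation, lexicalization, referring expressions: one plan per message."""
    plans = []
    for kind, dest, value in text_plan:
        if kind == "count":
            plans.append({"subject": f"{value} trains", "verb": "run",
                          "rest": f"to {dest} each day"})
        else:
            plans.append({"subject": "the next one", "verb": "leaves",
                          "rest": f"at {value}"})
    return plans

def realizer(sentence_plans: list) -> str:
    """Linguistic realisation: word order, capitalisation, punctuation."""
    sentences = []
    for p in sentence_plans:
        s = f"{p['subject']} {p['verb']} {p['rest']}."
        sentences.append(s[0].upper() + s[1:])
    return " ".join(sentences)

data = {"destination": "Glasgow", "trains_per_day": 20, "next_departure": "10am"}
print(realizer(sentence_planner(text_planner(data))))
```

The list of tuples plays the role of a text plan and the list of dicts the role of sentence plans; a real system would need richer structures for both.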
16
Text Plans
  • Common representation: tree
  • Leaf nodes: messages
  • Internal nodes: message groupings
  • Simple text plans: templates OK
  • Complex text plans: require full representation
    language (e.g., TAMERLAN, DIOGENES)
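
A minimal sketch of a tree-shaped text plan, assuming two node classes named `Message` and `Group` (both names are invented here). Leaves hold messages, internal nodes hold groupings, and a depth-first walk gives the order in which messages would be realised.

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Message:
    """Leaf node: one unit of content."""
    content: str

@dataclass
class Group:
    """Internal node: a named grouping of messages or sub-groups."""
    label: str
    children: List[Union["Group", Message]] = field(default_factory=list)

def leaves(node) -> List[str]:
    """Return message contents in left-to-right (depth-first) order."""
    if isinstance(node, Message):
        return [node.content]
    out = []
    for child in node.children:
        out.extend(leaves(child))
    return out

plan = Group("report", [
    Message("summary"),
    Group("details", [Message("departure time"), Message("platform")]),
])
print(leaves(plan))
```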

17
Sentence Plans
  • Simple: templates (select & fill)
  • Complex: abstract representation (SPL: Sentence
    Planning Language)

18
Example SPL Expression
(S1/exist
  :object (O1/train
            :cardinality 20
            :relations ((R1/period :value daily)
                        (R2/source :value Aberdeen)
                        (R3/destination :value Glasgow))))

"There will be 20 trains to Glasgow"
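
To make the structure of the expression concrete, here is the same content held as a nested Python dict together with a toy realiser. The dict encoding and the `realise_exist` function are assumptions for illustration, not a real SPL processor, and the output wording differs slightly from the slide's sentence because it also uses the period and source relations.

```python
# The SPL-like expression, held as a nested dict (an assumed encoding).
spl = {
    "type": "exist",
    "object": {
        "type": "train",
        "cardinality": 20,
        "relations": {"period": "daily", "source": "Aberdeen",
                      "destination": "Glasgow"},
    },
}

def realise_exist(expr: dict) -> str:
    """Toy realiser for 'exist' expressions only."""
    obj = expr["object"]
    rel = obj["relations"]
    # Simple number agreement on the head noun.
    noun = obj["type"] + ("s" if obj["cardinality"] != 1 else "")
    return (f"There will be {obj['cardinality']} {noun} {rel['period']} "
            f"from {rel['source']} to {rel['destination']}.")

print(realise_exist(spl))
```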
19
Content Determination
  • Messages (raw content)
  • User Model (influences content)
  • Is Reasoning Required? "Find a train from Aberdeen
    to Leeds" (it requires two trains to get there)
  • Deep Reasoning Systems:
  • represent the user's goals as well as any
    immediate query
  • utilize plan recognition & reasoning
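
The Aberdeen to Leeds query shows why content determination may need reasoning: no single train answers it. A minimal sketch, assuming a made-up table of direct services (the slides do not give the timetable) and a breadth-first search for the route with the fewest trains:

```python
from collections import deque

# Hypothetical direct services; the real timetable is not given in the slides.
DIRECT = {
    "Aberdeen": ["Edinburgh", "Glasgow"],
    "Edinburgh": ["Leeds"],
    "Glasgow": [],
    "Leeds": [],
}

def find_route(start: str, goal: str):
    """Breadth-first search: returns the station list with the fewest trains."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in DIRECT.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no connection at all

route = find_route("Aberdeen", "Leeds")
print(route, "trains needed:", len(route) - 1)
```

The messages handed to the rest of the generator would then describe both legs of the journey rather than a single departure.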

20
Discourse Planning
  • Structure messages into a coherent text
  • Example: start with a summary, then give details
  • Discourse relations, e.g.:
  • elaboration: "More specifically, X"
  • exemplification: "For example, X"
  • contrast / exception: "However, X"
  • Rhetorical Structure Theory (RST)
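
A small sketch of how a discourse relation between two messages might surface as a cue phrase. The cue phrases are the ones listed on the slide; the `link` function, its argument names and the example messages are assumptions.

```python
# Cue phrases taken from the slide; the plan format is an assumption.
CUES = {
    "elaboration": "More specifically,",
    "exemplification": "For example,",
    "contrast": "However,",
}

def link(first: str, relation: str, second: str) -> str:
    """Join two messages, marking the discourse relation with a cue phrase.

    `second` is expected to start in lower case, since it follows the cue.
    """
    return f"{first} {CUES[relation]} {second}"

print(link("Most trains are on time.", "contrast", "the 10am express is late."))
```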

21
Sentence Aggregation
  • No aggregation (1 sentence / message)
  • Relative Clause: "...which leaves at 10am"
  • Conjunction: "...and the next train is the express"
  • Combinations: "...and the next train is the express
    which leaves at 10am"
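
The aggregation options can be sketched as string-level operations. This is only an illustration: a real sentence planner would combine abstract sentence plans, not strings, and the helper names and the "local at 9am" message are invented.

```python
def conjoin(a: str, b: str) -> str:
    """Conjunction: join two clauses with 'and'."""
    return f"{a} and {b}"

def relativise(main: str, clause: str) -> str:
    """Relative clause: attach extra information about the last noun phrase."""
    return f"{main}, which {clause}"

def sentence(clause: str) -> str:
    """Capitalise the first letter and add the final full stop."""
    return clause[0].upper() + clause[1:] + "."

express = "the next train is the express"

# No aggregation: one sentence per message.
print(sentence(express), sentence("it leaves at 10am"))
# Relative clause.
print(sentence(relativise(express, "leaves at 10am")))
# Combination of conjunction and relative clause.
print(sentence(conjoin("the local leaves at 9am",
                       relativise(express, "leaves at 10am"))))
```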

22
Lexicalization
  • Choosing words to realize concepts or relations
  • Example:
    (action/change (measure outside_temperature)
                   (delta (quantity/deg_F -10)))
    "The temperature dropped 10 degrees"
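
A minimal sketch of lexical choice for the temperature example: the verb is picked from the sign of the change. The function name, the noun table and the alternative wordings ("rose", "did not change") are assumptions added for illustration.

```python
def lexicalize_change(measure: str, delta_deg_f: int) -> str:
    """Pick a verb from the sign of the change; word lists are illustrative."""
    nouns = {"outside_temperature": "temperature"}
    if delta_deg_f < 0:
        verb = "dropped"
    elif delta_deg_f > 0:
        verb = "rose"
    else:
        return f"The {nouns[measure]} did not change"
    amount = abs(delta_deg_f)
    unit = "degree" if amount == 1 else "degrees"
    return f"The {nouns[measure]} {verb} {amount} {unit}"

print(lexicalize_change("outside_temperature", -10))
```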

23
Lexical Selection Rules
24
Case Creation
  • Additional structure is required to realize the
    meaning of the semantic representation

(A-KICK (AGENT O-JOHN) (PATIENT O-BALL))
"John propelled the ball with his foot"
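
One way to picture case creation is a lexicon entry that carries the extra phrase its verb needs. The lexicon layout, the `created` field and the `realise` function below are hypothetical; the fixed phrase "with his foot" is a simplification that ignores agreement with the agent.

```python
# Hypothetical lexicon: realising A-KICK with "propelled" needs an extra
# instrument phrase that has no counterpart in the semantic input.
LEXICON = {
    "A-KICK": {"verb": "propelled", "created": "with his foot"},
}
NAMES = {"O-JOHN": "John", "O-BALL": "the ball"}

def realise(head: str, roles: dict) -> str:
    """Build agent + verb + patient, then append any created structure."""
    entry = LEXICON[head]
    parts = [NAMES[roles["AGENT"]], entry["verb"], NAMES[roles["PATIENT"]]]
    if entry.get("created"):
        parts.append(entry["created"])  # structure added by the lexical choice
    return " ".join(parts)

print(realise("A-KICK", {"AGENT": "O-JOHN", "PATIENT": "O-BALL"}))
```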
25
Case Absorption
  • Word chosen to realize a semantic head also
    implies the meaning conveyed by a semantic role

(A-FILE-LEGAL-ACTION (AGENT O-BOB) (PATIENT O-SUIT)
                     (RECIPIENT O-ACME))
"Bob sued Acme"
26
Referring Expression Generation
  • Initial introduction: "A man in the park looked up"
  • Pronouns: "He saw a bird fly over"
  • Definite Descriptions: "The man covered his head
    with a newspaper"
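
The three forms of reference can be chosen from a simple discourse history. The rules below are a toy approximation, and the entity fields and the `refer` function are assumptions: use an indefinite description on first mention, a pronoun if the entity was the most recent one mentioned, and a definite description otherwise.

```python
def refer(entity: dict, mentioned: set, last_mentioned) -> str:
    """Choose a form of reference from the discourse history (toy rules)."""
    if entity["id"] not in mentioned:
        # Initial introduction: indefinite description with a modifier.
        form = f"a {entity['noun']} {entity['modifier']}"
    elif last_mentioned == entity["id"]:
        # Still the most recent entity: a pronoun is unambiguous.
        form = entity["pronoun"]
    else:
        # Another entity intervened: fall back to a definite description.
        form = f"the {entity['noun']}"
    mentioned.add(entity["id"])
    return form

man = {"id": "m1", "noun": "man", "modifier": "in the park", "pronoun": "he"}
history = set()
print(refer(man, history, None))   # first mention
print(refer(man, history, "m1"))   # man was mentioned last
print(refer(man, history, "b1"))   # a bird (b1) was mentioned in between
```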

27
Fixing Robot Text
  • Start the engine_i and run the engine_i until
    the engine_i reaches normal operating
    temperature
  • Start Ø_i and run the engine_i until it_i
    reaches normal operating temperature
  • The second example introduces ellipsis and anaphora

28
Journalistic Style
A dissident Spanish priest was charged here today
with attempting to murder the Pope. Juan Fernandez
Krohn, aged 32, was arrested after a man armed with
a bayonet approached the Pope while he was saying
prayers at Fatima on Wednesday night. According to
the police, Fernandez told the investigating
magistrates today, he trained for the past six
months for the assault. If found guilty, the
Spaniard faces a prison sentence of 15-20 years.
(Brown and Yule, 1983)
29
Summary
  • 6 Basic Steps in NLG
  • Architectures group those steps into different
    modules
  • Input / output / approach depend on the domain
  • Design of internal data structures depends on
    complexity of task