Introduction to Computational Linguistics

Transcript and Presenter's Notes

Title: Introduction to Computational Linguistics


1
Introduction to Computational Linguistics
  • Misty Azara

2
Agenda
  • Introduction to Computational Linguistics (CL)
  • Common CL applications
  • Using CL in theoretical linguistics
    (computational modeling)

3
What is Computational Linguistics?
  • CL is interdisciplinary
  • Linguistics
  • Computer Science
  • Mathematics
  • Electrical Engineering
  • Psychology
  • Speech and Hearing Science

4
What is Computational Linguistics?
  • Computational Linguistics covers many areas
  • Essentially, CL is any task, model, algorithm,
    etc. that attempts to place any type of language
    processing (syntax, phonology, morphology, etc.)
    in a computational setting

5
Core Areas of CL
  • Machine Translation
  • Speech Recognition
  • Text-to-Speech
  • Natural Language Generation
  • Human-Computer Dialogs
  • Information Retrieval
  • Computational Modeling

6
Machine Translation
  • Using computers to automate some or all of the
    process of translating from one language to another

7
  • Three general kinds of task where MT is useful
  • Tasks for which a rough translation is adequate
  • Tasks where a human post-editor can be used to
    improve the output
  • Tasks limited to a small sublanguage

8
Machine Translation (cont.)
  • Linguistic knowledge is extremely useful in this
    area of CL
  • MT benefits from knowledge of language typology
    and language-specific linguistic information

9
Speech Recognition
  • Taking spoken language as input and outputting
    the corresponding text

10
Architecture
  • SR takes the source speech and produces guesses
    as to which words could correspond to the source
    via some type of acoustic model
  • The word with the highest probability is selected
    as the optimal candidate
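
The selection step described above can be sketched in a few lines of Python. This is only an illustration, not the architecture of any particular recognizer: the candidate words and their probabilities are invented, and `pick_best` is a hypothetical helper name.

```python
# Hypothetical acoustic-model output: a probability for each candidate
# transcription of one stretch of speech. All values are invented.
candidate_scores = {
    "recognize": 0.61,
    "wreck a nice": 0.27,
    "reckon eyes": 0.12,
}

def pick_best(scores):
    """Return the candidate with the highest probability."""
    return max(scores, key=scores.get)

print(pick_best(candidate_scores))
```

Real systems typically score whole word sequences rather than isolated words, so this shows only the final "take the most probable candidate" decision.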

11
Why use SR?
  • Allow for hands-free human-computer interaction

12
Text-to-Speech
  • Taking text as input and outputting the
    corresponding spoken language

13
Three types of TTS
  • Articulatory: models the physiological
    characteristics of the vocal tract
  • Concatenative: uses pre-recorded segments to
    construct the utterance(s)

14
Three types of TTS (cont.)
  • Parametric/Formant: models the formant
    transitions of speech

15
Why is TTS so difficult?
  • Spelling: the same letters can map to different
    sounds (through vs. rough)
  • Homographs: PERmit (n) vs. perMIT (v)
  • Prosody: pitch, duration of segments, phrasing of
    segments, intonational tune, emotion
  • Example: "I am so angry at you. I have never been
    more enraged in my life!!"
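
One way a TTS front end might handle words such as PERmit and perMIT, which share a spelling but differ in stress, is to key the pronunciation lexicon on part of speech. The sketch below assumes a part-of-speech tag is already available; `LEXICON` and `pronounce` are hypothetical names, and the stress notation simply follows the slide.

```python
# Hypothetical pronunciation lexicon keyed by (spelling, part of speech).
# Capital letters mark the stressed syllable, as on the slide.
LEXICON = {
    ("permit", "noun"): "PERmit",
    ("permit", "verb"): "perMIT",
}

def pronounce(word, pos):
    """Look up stress by part of speech; fall back to the plain spelling."""
    return LEXICON.get((word.lower(), pos), word)

print(pronounce("permit", "noun"))
print(pronounce("permit", "verb"))
```

The hard part in practice is obtaining the part-of-speech tag in the first place, which this sketch takes as given.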

16
Why use TTS?
  • Allows for text to be read automatically
  • Extremely useful for the visually impaired

17
Natural Language Generation
  • Constructing linguistic outputs from
    non-linguistic inputs

18
Natural Language Generation
  • Maps meaning to text
  • The nature of the input varies greatly from one
    application to another (e.g., documenting the
    structure of a computer program)
  • The job of the NLG system is to extract the
    necessary information to drive the generation
    process

19
NLG systems have to make choices
  • Content selection: the system must choose the
    appropriate content from the input, basing its
    decision on a pre-specified communicative goal
  • Lexical selection: the system must choose the
    lexical item most appropriate for expressing a
    concept

20
  • Sentence structure
  • Aggregation: the system must apportion the
    content into phrase-, clause-, and sentence-sized
    chunks
  • Referring expressions: the system must determine
    how to refer to the objects under discussion (not
    a trivial task)

21
  • Discourse structure: many NLG systems have to
    deal with multi-sentence discourses, which must
    have a coherent structure

22
Sample NLG output
  • To save a file
  • 1. Choose save from the file menu
  • 2. Choose the appropriate folder
  • 3. Type the file name
  • 4. Click the save button
  • The system will save the document.
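
Output like the sample above could come from a very simple template-based generator. The sketch below is an assumption about how such a system might be organized, not the system that produced the sample: the `task` dictionary stands in for the non-linguistic input, and `generate` is a hypothetical function. It skips content selection, lexical selection, and referring expressions entirely and only fixes the discourse structure (goal, numbered steps, result).

```python
# Hypothetical non-linguistic input: a structured description of a task.
task = {
    "goal": "save a file",
    "steps": [
        "Choose save from the file menu",
        "Choose the appropriate folder",
        "Type the file name",
        "Click the save button",
    ],
    "result": "The system will save the document.",
}

def generate(task):
    """Render the task as a goal line, numbered steps, and a result line."""
    lines = [f"To {task['goal']}"]
    lines += [f"{i}. {step}" for i, step in enumerate(task["steps"], start=1)]
    lines.append(task["result"])
    return "\n".join(lines)

print(generate(task))
```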

23
Human-Computer Dialogs
  • Uses a mix of SR, TTS, and pre-recorded prompts
    to achieve some goal

24
Human-Computer Dialogs
  • Uses speech recognition, or a combination of SR
    and touch tone as input to the system
  • The system processes the spoken information and
    outputs appropriate TTS or pre-recorded prompts

25
  • Dialog systems have specific tasks, which limit
    the domain of conversation
  • This makes the SR problem much easier, as the
    potential responses become very constrained

26
Sample dialog system for banking
  • Sys: Would you like information for checking or
    savings?
  • User: Checking, please.
  • Sys: Your current balance is 2,568.92. Would you
    like another transaction?
  • User: Yes, has check 2431 cleared?
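
Because each prompt in a dialog like this expects only a handful of answers, interpreting the recognized text can be reduced to matching against a small vocabulary. The following is a minimal sketch under that assumption; `ACCOUNT_VOCAB` and `interpret` are hypothetical names, and the input is assumed to be text already produced by the recognizer.

```python
# Hypothetical vocabulary for the first prompt: the only answers
# the system listens for at this point in the dialog.
ACCOUNT_VOCAB = {"checking", "savings"}

def interpret(utterance, vocabulary):
    """Return the first in-vocabulary word in the utterance, or None."""
    for token in utterance.lower().split():
        word = token.strip(".,!?")
        if word in vocabulary:
            return word
    return None

print(interpret("Checking, please.", ACCOUNT_VOCAB))
```

When `interpret` returns `None`, a real system would typically re-prompt the caller rather than guess.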

27
Linguistic knowledge in dialog systems
  • Discourse structure: ensuring naturally flowing
    discourse interaction
  • Building appropriate vocabularies/lexicons for
    the tasks
  • Ensuring prosodic consistency (e.g., questions
    sound like questions and spliced prompts sound
    continuous)

28
Why use human-computer systems?
  • Automate simple tasks: no need for a teller to be
    on the other end of the line!
  • Allow access to system information from anywhere,
    via the telephone

29
Information Retrieval
  • Storage, analysis, and retrieval of text documents

30
Information Retrieval
  • Most current IR systems are based on some
    interpretation of compositional semantics
  • IR is the core of web-based searching, e.g.,
    Google, Altavista, etc.

31
Information Retrieval Architecture
  • User inputs a word or string of words
  • System processes the words and retrieves
    documents corresponding to the request
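
The two steps above can be sketched with an inverted index, a common data structure for this kind of lookup. This is an illustrative assumption rather than a description of any specific search engine: the three documents are invented, `build_index` and `search` are hypothetical names, and the matching rule (a document must contain every query word) is just one simple choice.

```python
from collections import defaultdict

# Toy corpus, invented for illustration.
docs = {
    1: "speech recognition maps audio to text",
    2: "text to speech maps text to audio",
    3: "machine translation maps text to text",
}

def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

index = build_index(docs)
print(search(index, "speech"))
```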

32
Bag of Words
  • The dominant approach to IR systems is to ignore
    syntactic information and process the meaning of
    individual words only
  • Thus, "I see what I eat" and "I eat what I see"
    would mean exactly the same thing to the system!
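
The point can be checked directly by counting words and discarding their order. The sketch below assumes whitespace tokenization and lowercasing; `bag_of_words` is a hypothetical helper.

```python
from collections import Counter

def bag_of_words(text):
    """Count each word, discarding word order."""
    return Counter(text.lower().split())

a = bag_of_words("I see what I eat")
b = bag_of_words("I eat what I see")

# Same words with the same counts, so the two bags are equal.
print(a == b)
```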

33
Linguistic Knowledge in IR
  • Semantics
  • Compositional
  • Lexical
  • Syntax (depending on the model used)

34
Computational Modeling
  • Computational approaches to problem solving,
    modeling, and development of theories

35
How can we use computational modeling?
  • Test our theories of language change (synchronic
    or diachronic)
  • Develop working models of language evolution
  • Model speech perception, production, and
    processing
  • Almost any theoretical model can have a
    computational counterpart

36
Why Use Computational Modeling?
  • Forces explicitness: no black boxes or
    behind-the-scenes magic
  • Allows for modeling that would otherwise be
    impossible
  • Allows for modeling that would otherwise be
    unethical

37
Conclusions
  • CL applications utilize linguistic knowledge from
    all of the major subfields of theoretical
    linguistics
  • Computational modeling can aid linguists'
    theories of language processing and structure