Computational Linguistics - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Computational Linguistics


1
Computational Linguistics
  • What is it and what (if any) are its unifying
    themes?

2
Computational linguistics
3
I often agree with XKCD
4
[Figure: fields placed along a rigor spectrum,
from "more rigorous" to "less rigorous/more
flakey": physics, chemistry, biology, psychology,
neuropsychology, computational linguistics,
"linguistics?", literary criticism]
5
What defines the rigor of a field?
  • Whether results are reproducible
  • Whether theories are testable/falsifiable
  • Whether there is a common set of methods for
    similar problems
  • Whether approaches to problems can yield
    interesting new questions/answers

6
Linguistics
7
[Figure: engineering, linguistics, sociology and
literary criticism placed along the same
more-rigorous-to-less-rigorous spectrum]
8
The true situation with linguistics
[Figure: subfields of linguistics placed at
various points along the rigor spectrum:
experimental phonetics, historical linguistics,
psycholinguistics, some areas of sociolinguistics
(e.g. Bill Labov), theoretical linguistics (e.g.
lexical-functional grammar), theoretical
linguistics (e.g. minimalist syntax), other areas
of sociolinguistics (e.g. Deborah Tannen)]
9
Okay, enough already. What is computational
linguistics?
  • Text normalization/segmentation
  • Morphological analysis
  • Automatic word pronunciation prediction
  • Transliteration
  • Word-class prediction, e.g. part-of-speech
    tagging
  • Parsing
  • Semantic role labeling
  • Machine translation
  • Dialog systems
  • Topic detection
  • Summarization
  • Text retrieval
  • Bioinformatics
  • Language modeling for automatic speech
    recognition
  • Computer-aided language learning (CALL)

10
Computational linguistics
  • Often thought of as natural language engineering
  • But there is also a serious scientific component
    to it.

11
Why CL may seem ad hoc
  • Wide variety of areas (as in linguistics)
  • If it's natural language engineering, the goal is
    often just to build something that works
  • Techniques tend to change in somewhat faddish
    ways
  • For example, machine learning approaches fall in
    and out of favor

12
(No Transcript)
13
(No Transcript)
14
(No Transcript)
15
(No Transcript)
16
Machine learning in CL
  • In general it's a plus, since it has meant that
    evaluation has become more rigorous
  • But it's important that the field not turn into
    applied machine learning
  • For this to be avoided, people need to continue
    to focus on what linguistic features are
    important
  • Fortunately, this seems to be happening

17
Some interesting themes
  • Finite-state methods
  • Many application areas
  • Raises interesting questions about how much of
    language is regular (in the sense of finite
    state)
  • Grammar induction
  • Linguists have done a poor job at their stated
    goal of explaining how humans learn grammar
  • Computational models of language change
  • Historical evidence for language change is only
    partial. There are many changes in language for
    which we have no direct evidence.

18
Finite state methods
  • Used from the 1950s onwards
  • Went out of fashion a bit during the 1980s
  • Then a revival in the 1990s with the advent of
    weighted finite-state methods

19
Some applications
  • Analysis of word structure (morphology)
  • Analysis of sentence structure
  • Part of speech tagging
  • Parsing
  • Speech recognition
  • Text normalization
  • Computational biology

20
Regular languages
  • A regular language is a language over a finite
    alphabet that can be constructed out of one or
    more of the following operations:
  • Set union
  • Concatenation
  • Transitive closure (Kleene star)

21
Finite-state automata: formal definition
Every regular language can be recognized by a
finite-state automaton, and every finite-state
automaton recognizes a regular language
(Kleene's theorem).
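Kleene's theorem is easy to make concrete. Below is a
minimal sketch (plain Python; the automaton and all
names are my own illustration, not from the slides) of
a DFA recognizing the regular language (ab)*:

    # A DFA for the regular language (ab)*, built from
    # concatenation and Kleene star of {a} and {b}.
    DELTA = {            # transition function: (state, symbol) -> state
        (0, "a"): 1,
        (1, "b"): 0,
    }
    START, ACCEPTING = 0, {0}

    def accepts(s):
        """Run the DFA, rejecting on any undefined transition."""
        state = START
        for ch in s:
            if (state, ch) not in DELTA:
                return False
            state = DELTA[(state, ch)]
        return state in ACCEPTING

    assert accepts("")          # zero copies of "ab"
    assert accepts("abab")
    assert not accepts("aba")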
22
Representation of FSAs: state diagram
23
Regular relations: formal definition
24
Finite-state transducers
25
An FST
26
Composition
  • In addition to union, concatenation and Kleene
    closure, regular relations are closed under
    composition
  • Composition is to be understood here the same way
    as composition in algebra
  • R1 ∘ R2 means take the output of R1 and feed it to
    the input of R2 (a small code sketch follows)
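A minimal illustration of composition, with ordinary
functions standing in for (deterministic) transducers;
both rewrite rules here are invented examples:

    def r1(s):
        """R1: map each digit to its English name (toy coverage)."""
        names = {"1": "one", "2": "two", "3": "three"}
        return " ".join(names[ch] for ch in s)

    def r2(s):
        """R2: uppercase the input."""
        return s.upper()

    def compose(f, g):
        """R1 ∘ R2: feed the output of the first relation to the second."""
        return lambda s: g(f(s))

    print(compose(r1, r2)("123"))   # ONE TWO THREE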

27
Composition an illustration
28
R1 as a transducer
29
R2 as a transducer
30
R1 ∘ R2
31
Some things you can do with FSTs
  • Text analysis/normalization
  • Word segmentation
  • Abbreviation expansion
  • Digit-to-number-name mappings
  • i.e. mapping from writing to language
  • Morphological analysis
  • Syntactic analysis
  • E.g. part-of-speech tagging
  • (With weights) pronunciation modeling and
    language modeling for speech recognition

32
That's fine for engineering, but...
  • Does it really account for the facts?
  • Is morphology really regular?
  • Is the mapping between writing and speech really
    regular?

33
What is morphology?
  • scripserunt is third person, plural, perfect,
    active of scribo ("I write")
  • Morphology relates word forms
  • the lemma of scripserunt is scribo
  • Morphology analyzes the structure of word forms
  • scripserunt has the structure scrib+s+erunt

34
Morphology is a relation
  • Imagine you have a Latin morphological analyzer
    comprising
  • D a relation that maps between surface form and
    decomposed form
  • L a relation that maps between decomposed form
    and lemma
  • Then
  • scripserunt ? D scribserunt
  • scripserunt ? D ? L scribo
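A minimal sketch of this pipeline in plain Python.
The dictionaries stand in for the relations D and L;
only the decomposition from the slide is real, the
rest is illustrative:

    D = {"scripserunt": "scrib+s+erunt"}   # surface form -> decomposed form
    L = {"scrib+s+erunt": "scribo"}        # decomposed form -> lemma

    def apply_relation(rel, s):
        """Apply a relation represented as a finite dictionary."""
        return rel.get(s)

    decomposed = apply_relation(D, "scripserunt")   # scrib+s+erunt
    lemma = apply_relation(L, decomposed)           # scribo, i.e. D ∘ L
    print(decomposed, lemma)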

35
English regular plurals
  • cat + s → cats /s/
  • dog + s → dogs /z/
  • spouse + s → spouses /əz/
  • This can be implemented by a rule that composes
    with the base word, inserting the relevant form
    of the affix at the end (a toy version follows)
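A toy version of the allomorphy rule, keyed off the
spelling of the base. A real system would condition on
the final phoneme; the ending lists below are my rough
orthographic proxies:

    # Choose the surface form of the English plural affix.
    SIBILANT_ENDINGS = ("s", "z", "x", "sh", "ch", "se", "ce")
    VOICELESS_FINALS = set("ptkf")

    def plural_affix(base):
        if base.endswith(SIBILANT_ENDINGS):
            return "/əz/"   # spouse + s -> spouses
        if base[-1] in VOICELESS_FINALS:
            return "/s/"    # cat + s -> cats
        return "/z/"        # dog + s -> dogs

    for w in ("cat", "dog", "spouse"):
        print(w, plural_affix(w))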

36
Templatic affixes in Yowlumne
A transducer for each affix transforms the base
into the required templatic form and appends the
relevant string.
37
Subtractive morphology
A transducer deletes the final VC of the base
38
Bontoc infixation
  • Insert a marker > after the first consonant (if
    any)
  • Change > into the infix -um- (a regex sketch
    follows)
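A two-step regex sketch of this. The Bontoc form fikas
→ fumikas 'strong' is the standard textbook example;
treat the vowel inventory here as an assumption:

    import re

    def infix_um(word):
        # Step 1: insert a marker '>' after the first consonant, if any.
        marked = re.sub(r"^([^aeiou])", r"\1>", word, count=1)
        # Step 2: rewrite the marker as the infix 'um'.
        return marked.replace(">", "um")

    print(infix_um("fikas"))   # fumikas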

39
Side note: infixation in English
[Figure: expletive infixation into Kalamazoo,
i.e. Kalama-...-zoo]
40
Reduplication: Gothic
Problem: mapping w to ww is not a regular
relation (copying a string of unbounded length
requires unbounded memory)
41
Factoring Reduplication
  • Prosodic constraints
  • Copy verification transducer C

42
Non-Exact Copies
  • Dakota (Inkelas & Zoll, 1999)

43
Non-Exact Copies
  • Basic and modified stems in Sye (Inkelas & Zoll,
    1999)

44
Morphological Doubling Theory (Inkelas & Zoll, 1999)
  • Most linguistic accounts of reduplication assume
    that the copying is done as part of morphology
  • In MDT
  • Reduplication involves doubling at the
    morphosyntactic level, i.e. one is actually
    simply repeating words or morphemes
  • Phonological doubling is thus expected, but not
    required

45
Gothic Reduplication under Morphological Doubling
Theory
46
Summary
  • If Inkelas & Zoll are right, then all morphology
    can be computed using regular relations
  • This in turn suggests that computational
    morphology has picked the right tool for the job

47
Another Example: Linguistic analysis of text
  • Maps the stuff you see on the page, e.g. text
    written in the standard orthography of a
    language, into linguistic units (words,
    morphemes, phonemes)
  • For example:
  • I ate a 25kg bass
  • aɪ eɪt ə twɛnti faɪv kɪləɡræm bæs
  • This can be done using transducers
  • But is the mapping between writing and language
    really regular (finite-state)?

48
Linguistic analysis of text
  • Abbreviation expansion
  • Disambiguation
  • Number expansion
  • Morphological analysis of words
  • Word pronunciation

49
A transducer for number names
Consider a machine that maps between digit
strings and their reading as number names in
English: 30,294,005,179,018,903.56 → thirty
quadrillion, two hundred and ninety-four
trillion, five billion, one hundred seventy-nine
million, eighteen thousand, nine hundred three,
point five six
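A small fragment of this mapping, for numbers below a
thousand. A full transducer would factor the digit
string into groups of three and attach the scale words
(thousand, million, ...); this sketch is mine, not the
slide's machine:

    # Read a number below 1000 as an English number name.
    UNITS = ["", "one", "two", "three", "four", "five", "six", "seven",
             "eight", "nine", "ten", "eleven", "twelve", "thirteen",
             "fourteen", "fifteen", "sixteen", "seventeen", "eighteen",
             "nineteen"]
    TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty",
            "seventy", "eighty", "ninety"]

    def name(n):
        if n < 20:
            return UNITS[n]
        if n < 100:
            tail = UNITS[n % 10]
            return TENS[n // 10] + ("-" + tail if tail else "")
        tail = name(n % 100)
        return UNITS[n // 100] + " hundred" + (" " + tail if tail else "")

    print(name(903))   # nine hundred three
    print(name(179))   # one hundred seventy-nine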
50
Mapping between speech and writing
  • It seems obvious on the face of it that the
    mapping between speech and its written form is
    regular. After all, the words are ordered in the
    same way as speech. Even the letters tend to be
    ordered in the same way as the sounds they
    represent.
51
Some examples where it isn't
honorific inversion: in Egyptian hieroglyphic
writing, signs referring to gods or kings are
written first regardless of where they are read
[Figure: transliterated hieroglyphic example, not
recoverable]
52
Finite state methods
  • In morphology they seem almost exactly correct as
    characterizations of the natural phenomenon
  • In the mapping from writing to language, again,
    finite-state models seem almost exactly correct

53
Grammar induction
The common nativist view in linguistics, from
Gilbert Harman's review of Chomsky's New Horizons
in the Study of Language and Mind (Journal of
Philosophy, 98(5), May 2001):

"Further reflection along these lines and a great
deal of empirical study of particular languages
has led to the 'principles and parameters'
framework which has dominated linguistics in the
last few decades. The idea is that languages are
basically the same in structure, up to certain
parameters, for example, whether the head of a
phrase goes at the beginning of a phrase or at
the end. Children do not have to learn the basic
principles, they only need to set the parameters.
Linguistics aims at stating the basic principles
and parameters by considering how languages
differ in certain more or less subtle respects.
The result of this approach has been a truly
amazing outpouring of discoveries about how
languages are the same yet different."
54
Similarly
Cedric Boeckx and Norbert Hornstein. 2003. The
Varying Aims of Linguistic Theory.
"Children come equipped with a set of principles
of grammar construction (i.e. Universal Grammar
(UG)). The principles of UG have open parameters.
Specific grammars arise once values for these
open parameters are specified. Parameter values
are determined on the basis of the primary
linguistic data. A language specific grammar,
then, is simply a specification of the values
that the principles of UG leave open."
55
My challenge with Shalom Lappin
56

57
Automatic induction of grammars from unannotated
text
  • Klein, Dan and Manning, Christopher. 2004.
    Corpus-based induction of syntactic
    structure: models of dependency and
    constituency. Proceedings of the 42nd Annual
    Meeting of the Association for Computational
    Linguistics.
  • Lots of subsequent work

58
Different syntactic representations
59
Dependency Model with Valence (DMV)
  • Each head generates a set of non-STOP arguments
    to one side, then a STOP argument, then does the
    same on the other side (a toy sampler follows)
  • Trained using expectation maximization
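A toy sampler for this head-outward story. The tag
inventory and all probabilities below are invented,
and the real model conditions its STOP decisions on
direction and adjacency, learning the parameters with
EM:

    import random

    # Sample the terminal yield of a dependency subtree, head-outward.
    P_STOP = 0.6                            # toy stopping probability
    P_ARG = {"V": {"N": 0.8, "V": 0.2},     # toy argument distributions
             "N": {"N": 0.5, "V": 0.5}}

    def generate(head, depth=0):
        """Return the terminal yield of the subtree headed by `head`."""
        def one_side():
            out = []
            while depth < 3 and random.random() > P_STOP:
                tags, probs = zip(*P_ARG[head].items())
                arg = random.choices(tags, weights=probs)[0]
                out.extend(generate(arg, depth + 1))
            return out
        return one_side() + [head] + one_side()   # left args, head, right args

    print(generate("V"))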

60
Performance
61
Improvements
  • Constituent structure can be induced in a similar
    way to inducing word classes (e.g. parts of
    speech), by considering the environments in
    which the putative constituent finds itself.
  • In Klein & Manning's constituent-context model
    (CCM) the probability of a bracketing is computed
    as follows (reconstructed below)
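The formula itself did not survive the transcript.
Reconstructing from Klein & Manning (2004), the CCM
scores a sentence S with bracketing B roughly as:

    P(S, B) = P(B) \prod_{\langle i,j \rangle}
              P(\alpha_{ij} \mid B_{ij}) \,
              P(\beta_{ij} \mid B_{ij})

where α_ij is the yield of span ⟨i,j⟩, β_ij is its
context (the terminals immediately before and after
the span), and B_ij records whether the span is
bracketed as a constituent.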

62
Combined DMV+CCM
Subsequent work, e.g. Rens Bod's 2006
Unsupervised Data-Oriented Parsing, reports
F-scores close to 83.0. For comparison, the best
supervised parsers get about 91.0.
63
Some objections and a synopsis
  • Children do not learn grammars from unannotated
    text corpora; they get a lot of guidance from the
    environmental situation
  • Sure
  • Performance of automatic induction algorithms is
    still far from human performance, so they do not
    constitute evidence that we can do away with
    (nativist) linguistic theories of language
    acquisition
  • They do not show this. But the argument would
    have more weight if nativist theories had already
    been demonstrated to contribute to a working
    model of grammar induction
  • But Computational Linguistics is starting to make
    some serious contributions to this 50-year-old
    debate

64
The evolution of complex structure in language
Examples from Stump, Gregory (2001) Inflectional
Morphology: A Theory of Paradigm Structure.
Cambridge University Press.
65
Evolutionary Modeling (A tiny sample)
  • Hare, M. and Elman, J. L. (1995) Learning and
    morphological change. Cognition, 56(1): 61-98.
  • Kirby, S. (1999) Function, Selection, and
    Innateness: The Emergence of Language Universals.
    Oxford University Press.
  • Nettle, D. (1999) Using Social Impact Theory to
    simulate language change. Lingua, 108(2-3):
    95-117.
  • de Boer, B. (2001) The Origins of Vowel Systems.
    Oxford University Press.
  • Niyogi, P. (2006) The Computational Nature of
    Language Learning and Evolution. Cambridge, MA:
    MIT Press.

66
A multi-agent simulation
  • The system is seeded with a grammar and a small
    number of agents (a skeletal version of the loop
    is sketched after this list)
  • Each agent randomly selects a set of phonetic
    rules to apply to forms
  • Agents are assigned to one of a small number of
    social groups
  • Two parents beget child agents
  • Children are exposed to a predetermined number of
    training forms combined from both parents
  • Forms are presented proportional to their
    underlying frequency
  • Children must learn to generalize to unseen slots
    for words
  • Learning algorithm is similar to:
  • David Yarowsky and Richard Wicentowski (2000)
    "Minimally supervised morphological analysis by
    multimodal alignment." Proceedings of ACL-2000,
    Hong Kong, pages 207-216.
  • Features include the last n characters of the
    input form, plus semantic class
  • Learners select the optimal surface form to
    derive other forms from ("optimal" meaning the
    simplest resulting ruleset: a Minimum
    Description Length criterion)
  • Forms are periodically pooled among all agents
    and the n best forms are kept for each word and
    each slot
  • The population grows, but is kept in check by
    natural disasters and a quasi-Malthusian model
    of resource limitations
  • Agents age and die according to reasonably
    realistic mortality statistics
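A skeletal version of this loop. Every name, number,
and class here is invented for illustration, and the
MDL learner itself is elided:

    import random

    class Agent:
        def __init__(self, lexicon, group):
            self.lexicon = dict(lexicon)    # slot -> surface form
            self.group = group              # social group label
            self.age = 0

    def child_of(p1, p2, n_examples=50):
        """Expose a child to training forms pooled from both parents."""
        pool = list(p1.lexicon.items()) + list(p2.lexicon.items())
        sample = random.sample(pool, min(n_examples, len(pool)))
        child = Agent(dict(sample), group=p1.group)
        # generalize to unseen slots here (MDL-style learner, elided)
        return child

    def step(population):
        """One generation: age, breed, then cull."""
        for agent in population:
            agent.age += 1
        pairs = list(zip(population[::2], population[1::2]))
        population = population + [child_of(a, b) for a, b in pairs]
        # mortality plus quasi-Malthusian resource limits check growth
        return [a for a in population
                if a.age < 70 and random.random() > 0.05]

    population = [Agent({"slot0": "da"}, group=0) for _ in range(10)]
    for _ in range(5):
        population = step(population)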

67
Final states for a given initial state
68
Another example
  • Kirby, Simon. 2001. Spontaneous evolution of
    linguistic structure: an iterated learning model
    of the emergence of regularity and irregularity.
    IEEE Transactions on Evolutionary Computation,
    5(2): 102-110.
  • Assumes two meaning components, each with 5
    values, for 25 possible words
  • The initial speaker randomly selects examples from
    the 25, producing random strings for each, and
    teaches them to the hearer
  • Not all of the slots are filled, thus producing a
    bottleneck: the hearer must compute forms for
    the missing slots (a minimal loop is sketched
    after this list)
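A minimal iterated-learning loop in this spirit. The
syllable inventory, the bottleneck size, and the
deliberately crude learner are all my own
placeholders:

    import itertools
    import random

    MEANINGS = list(itertools.product(range(5), range(5)))   # 5 x 5 = 25
    SYLLABLES = ["ba", "di", "ku", "no", "ze"]

    def random_word():
        return "".join(random.choices(SYLLABLES, k=random.randint(1, 3)))

    def learn(observed):
        """Crude learner: keep observed forms, compose guesses for the rest."""
        lexicon = dict(observed)
        part = {}                       # meaning component value -> substring
        for (a, b), word in observed.items():
            part.setdefault(("A", a), word[:2])
            part.setdefault(("B", b), word[-2:])
        for a, b in MEANINGS:
            if (a, b) not in lexicon:   # the bottleneck forces generalization
                lexicon[(a, b)] = (part.get(("A", a), random_word()) +
                                   part.get(("B", b), random_word()))
        return lexicon

    lexicon = {m: random_word() for m in MEANINGS}   # generation 0: random
    for generation in range(100):
        taught = dict(random.sample(sorted(lexicon.items()), 15))
        lexicon = learn(taught)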

69
The basic algorithm produces results that are too
regular
[Figure: the lexicon in its initial state vs. its
final state]
70
A more realistic result
  • Addition of other constraints, including
  • a random tendency for speakers to omit symbols,
  • a frequency distribution over the 25 possible
    meaning combinations

71
Summary
  • Evolutionary modeling is evolving slowly
  • We are a long way from being able to model the
    complexities of known language evolution
  • Nonetheless, computational approaches promise to
    lend insights into how complex social systems
    such as language change over time, and complement
    discoveries in historical linguistics

72
Final thoughts
  • Language is central to what it means to be human.
  • Language is used to
  • Communicate information
  • Communicate requests
  • Persuade, cajole
  • (In written form) record history
  • Deceive
  • Other animals do some or most of these things
    (cf. Anindya Sinha's work on bonnet macaques)
  • But humans are better at all of these

73
Final thoughts
  • So the scientific study of language ought to be
    more central than it is
  • We need to learn much more about how language
    works
  • How humans evolved language
  • How languages change over time
  • How humans learn language
  • Computational linguistics can contribute to all
    of these questions.

74
(No Transcript)