Large-scale Knowledge Resources in Speech and Language Research

Large-scale Knowledge Resources in Speech and Language Research. Mark Liberman, University of Pennsylvania, myl_at_cis.upenn.edu. LKR2004, 3/8/2004.

Transcript and Presenter's Notes


1
Large-scale Knowledge Resources in Speech and
Language Research
  • Mark Liberman
  • University of Pennsylvania
  • myl_at_cis.upenn.edu
  • LKR2004 3/8/2004

2
Outline
  • Glimpse of LKR in the U.S. landscape
  • What is the relationship between large-scale
    knowledge resources and research and
    development on speech and language?
  • What are some needs and opportunities?
  • What are the trends?
  • Illustrative examples

3
Glimpses of the U.S. LKR landscape
  • DARPA research areas
  • Human Language Technology
  • Cognitive Information Processing
  • NSF initiatives
  • Digital Libraries
  • ITR, Human Social Dynamics
  • terascale linguistics
  • Biomedical research
  • text, ontologies, databases, experiments
  • collaborations with Japan and Europe
  • Language documentation
  • Web archives in many disciplines
  • ...too many other things to list...

4
What is the relationship between large-scale
knowledge resources and research and development
on speech and language?
Speech and language R&D needs LKR
Modeling text: 10^4-10^6 words in 1975, 10^9-10^12 words today
Modeling speech: 1-10 hours in 1975, 10^3-10^4 hours today
lexicons, parallel text, DBs for entity tracking, etc.
a thousand languages and dialects
history, social variation, register and genre, ...
Speech and language R&D creates LKR
see above.
but also something entirely new...
5
Some needs and opportunities
  • Standards and tools for LKR
  • for creation, improvement, maintenance
  • for publication, distribution, archiving
  • for search, access and use
  • An academic culture that rewards production and
    distribution of LKR
  • most LKR are a side effect of individual and
    small-group research
  • virtual meta-resources from many sources
  • Part of the answer: integrate LKR into the
    system of (scientific and scholarly) publication

6
Themes and trends
  • A New Empiricism
  • focus on large-scale resources, because quantity
    (of data) → quality (of knowledge)
  • Language Life Meaning
  • something new emerges from large collections of
    symbols, signals, contexts, connections
  • People and machines: better together
  • cognitive prosthetics
  • interactive working, playing and learning
  • Failure is the basis for success
  • if we can measure error, we can learn to improve

7
Some illustrative examples...
8
A famous argument
  (1) Colorless green ideas sleep furiously.
  (2) Furiously sleep ideas green colorless.
. . . It is fair to assume that
neither sentence (1) nor (2) (nor indeed any
part of these sentences) has ever occurred in an
English discourse. Hence, in any statistical
model for grammaticalness, these sentences will
be ruled out on identical grounds as equally
remote from English. Yet (1), though
nonsensical, is grammatical, while (2) is not.
Noam Chomsky, Syntactic Structures (1957)
9
But is it true?
10
43 years later
  • someone finally checked...
  • Pereira, Formal grammar and information theory
    (2000)
  • simple aggregate bigram model using hidden
    class variables c
  • with C = 16, trained on 100M words of newswire data
  • the result
  • "Furiously sleep ideas green colorless" is more
    than 200,000 times less probable than "Colorless
    green ideas sleep furiously"
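
As a reading of "aggregate bigram model using hidden class variables c", one standard way to write such a model is given below. This is a sketch of the general form, not a transcription of Pereira's exact setup; how the first word and sentence boundaries are handled is left as an assumption.

```latex
% Each word predicts a hidden class c, and the class predicts the next word.
P(w_i \mid w_{i-1}) = \sum_{c=1}^{C} P(c \mid w_{i-1}) \, P(w_i \mid c)

% A sentence is scored as a product of these conditional probabilities.
P(w_1 \dots w_n) = P(w_1) \prod_{i=2}^{n} P(w_i \mid w_{i-1})
```

Because every word pair shares the same C classes, a pair that never occurred in training can still receive a nonzero probability, which is what lets the two Chomsky sentences be ranked at all.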

11
What changed?
  • Partly
  • new models and estimation methods
  • better computing resources
  • more accessible data
  • Mostly
  • willingness to look for solutions
  • opportunities to apply them
  • To be fair, this kind of modeling became a real
    option only about 1980
  • Now it can be done as an undergraduate term
    project ...

12
Social structure from conversation
  • Human social dynamics: model of
    conversational turn-taking
  • U.S. Supreme Court oral arguments
  • Modeling is simple and local
  • one session modeled at a time (250 turns)
  • data is just sequence of (250) speaker IDs
  • Undergraduate term project in intro course
    (credit to Chris Osborn)

13
CHIEF JUSTICE WILLIAM H. REHNQUIST: We'll hear
argument next in No. 01-298, Paul Lapides v. the
Board of Regents of the University System of
Georgia. Spectators are admonished, do not talk
until you get outside the courtroom. The court
remains in session. Mr. Bederman.
MR. DAVID J. BEDERMAN: Mr. Chief Justice, and may
it please the Court: When a State affirmatively
invokes the jurisdiction of the Federal court by
removing a case, that acts as a waiver of the
State's forum immunity to Federal jurisdiction
under the Eleventh Amendment. This principle ...
JUSTICE ANTONIN SCALIA: When you say as an actor
in any role, does it ever intervene as a defendant?
MR. BEDERMAN: Yes, Justice Scalia. This Court's
precedents seem to indicate that wherever the
State is cast in the role of plaintiff,
defendant, intervenor, or claimant, that the
entry into the Federal proceeding submits the
State to the jurisdiction of the Federal court.
CHIEF JUSTICE REHNQUIST: How about the Ford
Motor Company case?
MR. BEDERMAN: Well, of course, the authorization
requirement in Ford Motor -- and that's the
particular holding in Ford Motor that I think is
of concern to this Court -- need not be reached
here because, of course, ...
CHIEF JUSTICE REHNQUIST: So, you think a line can
be drawn between the State defendant being drawn
in as a respondent or involuntarily as opposed to
removing and thereby invoking Federal
jurisdiction.
... 254 turns ...
14
Two-class aggregate bigram model, trained on a
single one-hour argument (01-298):
highest-probability class for each speaker
class 1 (
  chief justice william h. rehnquist
  justice anthony kennedy
  justice antonin scalia
  justice john paul stevens
  justice ruth bader ginsburg
  justice sandra day o'connor
  justice stephen g. breyer
)
class 2 (
  mr. david j. bederman
  mr. irving l. gornstein
  ms. devon orland
  ms. julie c. parsley
)
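
A minimal sketch of how a two-class aggregate bigram model could be fit to nothing but a sequence of speaker IDs. It is not the original term-project code. It uses plain EM with random restarts, and the speaker labels J1-J3 and A1-A2 are a made-up toy session, not data from argument 01-298. All function names are hypothetical.

```python
import math
import random
from collections import Counter


def normalize(values):
    """Scale non-negative values to sum to 1 (uniform if they sum to 0)."""
    total = sum(values)
    if total == 0.0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def train_aggregate_bigram(sequence, num_classes=2, iterations=200,
                           restarts=10, seed=0):
    """Fit P(next | prev) = sum_c P(c | prev) * P(next | c) with EM.

    Returns (log_likelihood, class_given_prev, next_given_class) for the
    restart with the highest training log-likelihood.
    """
    bigrams = Counter(zip(sequence, sequence[1:]))
    vocab = sorted(set(sequence))
    classes = range(num_classes)
    rng = random.Random(seed)
    best = None
    for _ in range(restarts):
        # Random start: a fully symmetric start is a fixed point of EM.
        class_given_prev = {
            w: normalize([rng.random() + 0.1 for _ in classes]) for w in vocab
        }
        next_given_class = [
            dict(zip(vocab, normalize([rng.random() + 0.1 for _ in vocab])))
            for _ in classes
        ]
        for _ in range(iterations):
            prev_totals = {w: [0.0] * num_classes for w in vocab}
            next_totals = [dict.fromkeys(vocab, 0.0) for _ in classes]
            # E-step: share each bigram count among the hidden classes.
            for (prev, nxt), count in bigrams.items():
                joint = [class_given_prev[prev][c] * next_given_class[c][nxt]
                         for c in classes]
                total = sum(joint)
                if total == 0.0:
                    continue
                for c in classes:
                    share = count * joint[c] / total
                    prev_totals[prev][c] += share
                    next_totals[c][nxt] += share
            # M-step: renormalize the expected counts.
            class_given_prev = {w: normalize(prev_totals[w]) for w in vocab}
            next_given_class = [
                dict(zip(vocab, normalize([next_totals[c][w] for w in vocab])))
                for c in classes
            ]
        log_likelihood = sum(
            count * math.log(max(1e-300, sum(
                class_given_prev[prev][c] * next_given_class[c][nxt]
                for c in classes)))
            for (prev, nxt), count in bigrams.items()
        )
        if best is None or log_likelihood > best[0]:
            best = (log_likelihood, class_given_prev, next_given_class)
    return best


def most_likely_class(class_given_prev):
    """Pick the highest-probability class for each speaker."""
    return {w: max(range(len(p)), key=p.__getitem__)
            for w, p in class_given_prev.items()}


# Hypothetical toy session: justices J1-J3 alternate with advocates A1-A2.
justices, advocates = ["J1", "J2", "J3"], ["A1", "A2"]
turns = [s for _ in range(20) for j in justices for a in advocates
         for s in (j, a)]
_, class_given_prev, _ = train_aggregate_bigram(turns)
print(most_likely_class(class_given_prev))
```

The class numbers themselves are arbitrary; what matters is which speakers end up sharing a class. The grouping comes only from who tends to speak after whom.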
15
So human social roles can emerge from a
trivial statistical model of speaker sequencing
in a formal setting.
...and sometimes you don't need a lot of data.
...though in this case, it was crucial
that Jerry Goldman's Oyez Project is publishing
all Supreme Court oral arguments
(audio and transcripts)
In most cases the quantity of data is crucial:
Data quantity → knowledge quality ...
and available resources are just
starting to pass a threshold
16
A case where size matters...
  • English complex nominals: sequence of nouns and
    adjectives, e.g. Volume Feeding Management
    Success Formula Award
  • Part-of-speech string offers little help in
    parsing:
      stone traffic barrier   N N N
      job growth statistics   N N N
  • Apparently, parsing requires understanding

17
The MEDLINE corpus
  • U.S. National Library of Medicine
  • 12 million references and abstracts
  • biomedical journal articles
  • 1966 to present
  • 10^9 words

18
Parsing by counting (in MEDLINE)
NNN  sickle cell anemia               10561   2422
NNN  rat bile duct                      203  22366
NAN  information theoretic criterion    112      5
NAN  monkey temporal lobe                16  10154
ANN  giant cell tumour                 7272   1345
ANN  cellular drug transport            262    746
AAN  small intestinal activity         8723    120
AAN  inadequate topical cooling           4    195
19
Parsing by counting (google hits)
N N N  stone traffic barrier        338    7,010
N N N  job growth statistics    349,000   11,600
First attempt at this idea: for AT&T TTS in 1987
First real success: 15 years later
The difference:
It doesn't really work with 10^7-10^8 tokens
It works pretty well with 10^9-10^12 tokens
"You can observe a lot just by watching." - Yogi Berra
here... "You can analyze a lot just by counting."
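
A small sketch of the counting idea, under one assumption the slides do not state: the first number after each phrase is taken to be the corpus count of the first two words as an adjacent pair, and the second number the count of the last two words. The function name `bracket` is hypothetical.

```python
def bracket(words, left_count, right_count):
    """Bracket a three-word nominal by comparing adjacent-pair counts.

    left_count is the corpus count of (w1 w2); right_count that of (w2 w3).
    """
    w1, w2, w3 = words.split()
    if left_count > right_count:
        return f"[[{w1} {w2}] {w3}]"
    if right_count > left_count:
        return f"[{w1} [{w2} {w3}]]"
    return f"[{w1} {w2} {w3}]"  # tie: leave the phrase flat


# Counts copied from the MEDLINE and Google examples on the slides.
examples = [
    ("sickle cell anemia", 10561, 2422),
    ("rat bile duct", 203, 22366),
    ("monkey temporal lobe", 16, 10154),
    ("stone traffic barrier", 338, 7010),
    ("job growth statistics", 349000, 11600),
]
for words, left, right in examples:
    print(bracket(words, left, right))
```

With small corpora, both counts are often zero or nearly equal, so the comparison says little. That is one way to read the remark that the method only starts to work at 10^9 tokens and above.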
20
As the SCOTUS example suggests, large-scale
is not just the number of words or
hours. Structure, context and external
relationships can also be
crucial: here it was the
sequence of speaker identities.
Here's a simple but compelling example
of how symbol-like structure emerges
as zebra finches practice a song...
This is research by Ofer Tchernichovski (CCNY),
Partha Mitra and others
21
Zebra finch song learning
Ofer Tchernichovski (CCNY)
22
Song motifs vary across individuals
23
Song imitation: young birds imitate adults
Tutor's song
Pupil's song
24
Song imitation
Can be very accurate
Critical period: developmental learning
Song template: memory traces of a model
Learning requires auditory feedback
Sensory-motor phase
Sensory phase
25
(No Transcript)
26
The training system
Laboratory of Animal Behavior, CCNY
27
(No Transcript)
28
(No Transcript)
29
Real-time calculation of acoustic features
4 simple acoustic features with articulatory
correlates
30
(No Transcript)
31
5733 66 0.295980722 802.5073242 -2.626851082 33.58778763 0.804081738
6756 66 0.152581334 704.6381836 -2.524046659 27.59897423 0.802883089
7297 53 0.167008847 812.2409058 -1.880394816 45.26642609 0.73422879
7876 62 0.219140843 744.0402222 -2.562429667 34.36729431 0.77498275
8253 76 0.261799634 1212.450928 -2.24555397 48.8947258 0.649886608
8393 121 0.825781465 663.1687012 -2.535212278 20.65950394 0.749277711
8589 61 0.383003145 719.1973877 -2.427448273 29.89187622 0.67703712
8760 65 0.261223316 1119.903198 -2.556747913 45.04622269 0.633399487
8840 92 0.391378433 980.5782471 -2.776203156 29.98022079 0.742950559
9579 50 0.070019156 1089.148315 -2.479059219 29.93981934 0.839425206
10523 70 0.166663319 811.1593628 -2.734509706 27.13637352 0.836294293
10733 51 0.176689878 763.8659058 -1.616189003 45.17594528 0.496240675
10874 36 0.076791681 1103.130981 -1.929902196 58.78096008 0.811875403
10972 62 0.10109444 2110.150879 -2.650181532 46.28370285 0.830607355
11042 44 0.221805096 2779.580322 -3.222234249 60.9871254 0.79437232
11136 53 0.203947186 878.0430298 -1.2962991 46.85206223 0.485266626
11465 53 0.14567025 811.8573608 -1.186548352 41.14878082 0.42596662
11521 65 0.139529422 868.633667 -1.330822468 42.92938232 0.542328238
12355 81 0.536730945 982.7991333 -2.679917574 37.7701149 0.523121655
13481 55 0.185585603 733.9207764 -2.271656036 39.42351151 0.816531181
13669 72 0.342740119 772.1679077 -2.455365419 30.38383102 0.765049458
14466 53 0.276962578 699.7897949 -2.140806913 40.342556 0.822018743
14612 47 0.078976907 1122.309326 -1.729982138 48.15994644 0.823718846
16304 55 0.143629089 769.4672852 -1.626844049 34.90858841 0.711382151
16454 76 0.216472968 769.9150391 -2.356431723 39.29466629 0.794104338
16571 54 0.52569139 687.6394043 -1.956387162 37.81315613 0.616944551
17000 58 0.135118335 864.5578613 -2.363121986 31.00643349 0.858065724
17189 51 0.124977574 752.3527222 -1.94250226 36.36558151 0.691144586
17761 58 0.144002378 1021.027527 -2.258356094 40.53672409 0.708231866
17873 47 0.066938281 1339.068604 -1.668018103 46.29984665 0.69986397
18051 38 0.066276349 1847.560913 -2.551876307 38.55633545 0.805839062
18092 81 0.200010121 2080.408936 -3.075473547 50.34065247 0.776402116
18219 66 0.335276693 858.1080933 -1.750756502 46.40740204 0.511499882
18536 69 0.261755675 890.3964233 -1.860459447 42.50422668 0.500995994
19446 46 0.15915972 993.3217773 -1.601477981 43.11263275 0.527124286
20405 51 0.193706796 800.2883911 -1.413753867 41.22149277 0.428571522
20644 65 0.24410592 802.0982666 -1.589150429 39.50386429 0.429761887
20729 61 0.166723967 901.6841431 -1.771348119 47.49161148 0.556119919
20847 51 0.198818251 852.6430664 -1.053611994 48.11198425 0.44106108
23287 68 0.178408563 784.8914185 -2.134843588 41.99195862 0.656920671
24243 70 0.185866207 990.8589478 -2.562700748 39.49663925 0.763919473
Columns: Start on, Duration, Mean Amp, Mean Pitch, Mean Entropy, Mean FM, Mean Continuity
32
Dynamic Vocal Development maps
Duration Mean Pitch Mean Entropy Mean FM
66 802.5073242 -2.626851082 33.58778763
66 704.6381836 -2.524046659 27.59897423
53 812.2409058 -1.880394816 45.26642609
62 744.0402222 -2.562429667 34.36729431
76 1212.450928 -2.24555397 48.8947258
121 663.1687012 -2.535212278 20.65950394
61 719.1973877 -2.427448273 29.89187622
65 1119.903198 -2.556747913 45.04622269
92 980.5782471 -2.776203156 29.98022079
50 1089.148315 -2.479059219 29.93981934
70 811.1593628 -2.734509706 27.13637352
33
Dynamic Vocal Development (DVD) Map of a single
bird
Day 85
Day 75
Day 65
Day 55
Day 45
Onset of training
Day 35
34
(No Transcript)
35
Language Life Meaning
  • Text (and speech) structured by
  • conversational context
  • time, place, sequence, participants, ...
  • content
  • types and identities of referenced entities
  • explicit links (anaphora, references, hyperlinks)
  • implicit links (quotation, imitation, opposition)
  • other contextual data
  • e.g. neurological, gene expression data in
    birdsong learning
  • gaze, gesture, posture, physiological data in
    conversation

36
A small application: real conversational
transcription
  • Perfect automatic speech-to-text (STT) yields

ew very nice yes thats thats the ah first car
uh well my first ownership of something major
thats cool i had to buy my car my other car
burned down so it was my first brand new car
uh-huh but i love it so i am very happy
  • STT + metadata yields Rich Transcription

Speaker 1: Very nice.
Speaker 2: Yes. That's my first ownership of something major.
Speaker 1: That's cool. I had to buy my car. My other
car burned down. It was my first brand new car.
Speaker 2: Uh-huh.
Speaker 1: But I love it. I am very happy.
37
One aspect of conversational metadata: Diarization
  • Goal: Label acoustic sources and their
    attributes
  • speakers, music, noise, DTMF, background events

38
Interactive annotation
  • Supervised learning: human annotates, machine
    learns
  • Unsupervised learning: machine looks for
    structure in raw data
  • Semi-supervised learning: human annotates a few
    examples, machine tries to generalize
  • Active learning: machine selects cases
    that are interesting or uncertain, asks
    for human judgments
  • Sampling experiments: human checks machine
    annotation of selected cases, apply sample
    confusion matrix to estimate overall statistics
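
The last bullet can be sketched as follows. This is one plausible reading of "apply sample confusion matrix to estimate overall statistics", not a description of any particular tool. The function name and the label counts in the usage lines are hypothetical.

```python
def estimate_true_counts(machine_counts, sample_confusion):
    """Estimate corpus-wide counts of true labels.

    machine_counts[m]       how often the machine assigned label m overall
    sample_confusion[m][t]  in the human-checked sample, how often items the
                            machine labeled m truly had label t
    """
    estimate = {}
    for machine_label, corpus_count in machine_counts.items():
        row = sample_confusion[machine_label]
        checked = sum(row.values())
        if checked == 0:
            raise ValueError(f"no checked samples for label {machine_label!r}")
        # Spread the corpus count over true labels in the sampled proportions.
        for true_label, hits in row.items():
            estimate[true_label] = (estimate.get(true_label, 0.0)
                                    + corpus_count * hits / checked)
    return estimate


# Hypothetical numbers: 1500 machine-tagged tokens, 150 of them hand-checked.
machine_counts = {"N": 1000, "V": 500}
sample_confusion = {"N": {"N": 90, "V": 10}, "V": {"N": 5, "V": 45}}
print(estimate_true_counts(machine_counts, sample_confusion))
```

The estimate is only as good as the sample: it assumes the checked cases are representative of everything the machine labeled the same way.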

39
The cycle of interactive annotation
Hand Annotation
Hand Correction
Machine Learning
Automatic annotation
(Selective) Sampling/ Labeling
40
POS tagger trained on WSJ, applied to MEDLINE
41
Same tagger, after retraining... (200 MEDLINE
abstracts)
42
The key to success: learn to measure
failure...
Even a badly flawed measure can produce important
gains.
43
One year of quantitative evaluation...
Arabic to English
[Chart: Best Research System vs. Best COTS System; values shown: 89, 58, 57, 51]
44
Scoring Method
  • Percent of Human = (Machine Translation Score /
    Human Translation Score) x 100

Translation Score: weighted sum of n-gram
matches between the translation being scored
(human or machine) and three good reference
translations
Reference translation: The U.S. island of Guam
is maintaining a high state of alert after the
Guam airport and its offices both received an
e-mail from someone calling himself the Saudi
Arabian Osama bin Laden and threatening a
biological/chemical attack against public places
such as the airport.
Legend: Tri-gram match, Bi-gram match, Uni-gram match
Machine translation: The American ?
international airport and its the office all
receives one calls self the sand Arab rich
business ? and so on electronic mail , which
sends out The threat will be able after public
place and so on the airport to start the
biochemistry attack , ? highly alerts after the
maintenance.
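
To make the scoring idea concrete, here is a deliberately simplified sketch of a "weighted sum of n-gram matches" against several references. It is not the official evaluation formula: the equal weights, the clipping rule, and the function names are assumptions made for illustration, and the reference strings are invented.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def translation_score(candidate, references, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of 1-, 2- and 3-gram match rates against the references."""
    cand = candidate.lower().split()
    refs = [ref.lower().split() for ref in references]
    score = 0.0
    for n, weight in enumerate(weights, start=1):
        cand_counts = ngrams(cand, n)
        total = sum(cand_counts.values())
        if total == 0:
            continue  # candidate too short to contain any n-gram of this size
        matched = 0
        for gram, count in cand_counts.items():
            # Credit an n-gram at most as often as its best reference has it.
            best_ref = max(ngrams(ref, n)[gram] for ref in refs)
            matched += min(count, best_ref)
        score += weight * matched / total
    return score


def percent_of_human(machine_score, human_score):
    """Express a machine score as a percentage of the human score."""
    if human_score == 0:
        raise ValueError("human score must be nonzero")
    return 100.0 * machine_score / human_score


# Invented reference strings, for illustration only.
references = [
    "the airport is on high alert",
    "guam airport remains on alert",
    "the island is maintaining high alert",
]
machine = translation_score("the airport alert", references)
human = translation_score("the airport is on high alert", references)
print(percent_of_human(machine, human))
```

A measure like this ignores meaning and rewards surface overlap, which is the sense in which even a badly flawed measure can still drive improvement.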
45
Best System Outputs
2002 (first four bullets below), 2003 (last four bullets)
  • insistent Wednesday may recurred her trips to
    Libya tomorrow for flying
  • Cairo 6-4 ( AFP ) - an official announced
    today in the Egyptian lines company for flying
    Tuesday is a company " insistent for flying "
    may resumed a consideration of a day Wednesday
    tomorrow her trips to Libya of Security Council
    decision trace international the imposed ban
    comment .
  • And said the official " the institution sent a
    speech to Ministry of Foreign Affairs of lifting
    on Libya air , a situation her receiving replying
    are so a trip will pull to Libya a morning
    Wednesday " .
  • Certain are " the lines is air Libyan I will
    start also in of three trips running weekly to
    Cairo in the coordination with Egypt for flying "
    .
  • Egyptair Has Tomorrow to Resume Its Flights to
    Libya
  • Cairo 4-6 (AFP) - said an official at the
    Egyptian Aviation Company today that the company
    egyptair may resume as of tomorrow, Wednesday its
    flights to Libya after the International Security
    Council resolution to the suspension of the
    embargo imposed on Libya.
  • " The official said that the company had sent a
    letter to the Ministry of Foreign Affairs,
    information on the lifting of the air embargo on
    Libya, where it had received a response, the
    first take off a trip to Libya on Wednesday
    morning ".
  • The Libyan Arab Airways will also in the conduct
    of the three times a week in Cairo in
    coordination with egyptair ".

46
Human v. Machine
Human (first four bullets below), 2003 machine output (last four bullets)
  • Egypt Air May Resume its Flights to Libya
    Tomorrow
  • Cairo, April 6 (AFP) - An Egypt Air official
    announced, on Tuesday, that Egypt Air will resume
    its flights to Libya as of tomorrow, Wednesday,
    after the UN Security Council had announced the
    suspension of the embargo imposed on Libya.
  • The official said that, "the company sent a
    letter to the Ministry of Foreign Affairs to
    inquire about the lifting of the air embargo on
    Libya, and in the event that it receives a
    response, then the first flight to Libya, will
    take off, Wednesday morning."
  • He stressed that "the Libyan Airlines will begin
    scheduling three weekly flights to Cairo, in
    coordination with Egypt air."
  • Egyptair Has Tomorrow to Resume Its Flights to
    Libya
  • Cairo 4-6 (AFP) - said an official at the
    Egyptian Aviation Company today that the company
    egyptair may resume as of tomorrow, Wednesday its
    flights to Libya after the International Security
    Council resolution to the suspension of the
    embargo imposed on Libya.
  • " The official said that the company had sent a
    letter to the Ministry of Foreign Affairs,
    information on the lifting of the air embargo on
    Libya, where it had received a response, the
    first take off a trip to Libya on Wednesday
    morning ".
  • The Libyan Arab Airways will also in the conduct
    of the three times a week in Cairo in
    coordination with egyptair ".

47
Summary
  • Speech and Language Research
  • needs LKR
  • creates LKR
  • can help other disciplines deal with LKR
  • is helped by other disciplines, who provide
  • raw data as well as relevant LKR pieces
  • problems, algorithms, inspiration
  • The whole is greater than the sum of the parts
  • Types, sources and amounts of data
  • Collaboration within and across disciplines
  • Cooperation of humans and machines