Title: The Web in Theoretical Linguistics Research: Two Case Studies Using the Linguist
1. The Web in Theoretical Linguistics Research
Two Case Studies Using the Linguist's Search Engine
- Philip Resnik, Aaron Elkiss, Heather Taylor, and Ellen Lau
- University of Maryland
Berkeley Linguistics Society
February 20, 2005
2. Theje dberk eobbfid dbeonc kdoeb
Did that sound ok to you?
a small, imperfect experiment
3.
- Schütze (1996)
- Cowart (1997)
- Bard, Robertson, and Sorace (1996)
- Crocker and Keller (2005)
- Sorace and Keller (2005)
4.
- Corpora
- Part-of-speech taggers
- Treebanks
- Statistical parsers
- Semantic role labeling
- etc.
?
Nature of Elicitation
5. Manning (2003): "it remains fair to say that these tools have not yet made the transition to the Ordinary Working Linguist without considerable computer skills."
export TGREP_CORPUS=wsj_mrg.crp
tgrep -n __ | grep . | gzip > wsj_mrg.txt.gz
tgrep2 -C -p wsj_mrg.txt wsj_mrg.t2c.g
NP !<< PP > NP >> VP
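To make the last pattern on this slide concrete, here is a minimal sketch of what it asks for, reading it as `NP !<< PP > NP >> VP`: an NP that dominates no PP, is an immediate child of an NP, and sits somewhere under a VP. The `Node` class and the helper names are hypothetical and written only for this illustration; this is not how tgrep2 or the LSE is implemented.

```python
class Node:
    """A labeled tree node that remembers its parent (illustrative only)."""
    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.children = []


def parse(text):
    """Parse a bracketed tree such as (NP (DT the) (NN dog))."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read(parent):
        nonlocal pos
        pos += 1                              # skip "("
        node = Node(tokens[pos], parent)      # constituent label
        pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.children.append(read(node))
            else:
                node.children.append(Node(tokens[pos], node))  # word leaf
                pos += 1
        pos += 1                              # skip ")"
        return node

    return read(None)


def descendants(node):
    for child in node.children:
        yield child
        yield from descendants(child)


def ancestors(node):
    while node.parent is not None:
        node = node.parent
        yield node


def matches(node):
    """NP !<< PP > NP >> VP, with every relation anchored on the first NP."""
    return (
        node.label == "NP"
        and not any(d.label == "PP" for d in descendants(node))  # !<< PP
        and node.parent is not None
        and node.parent.label == "NP"                            # > NP
        and any(a.label == "VP" for a in ancestors(node))        # >> VP
    )


def find(root):
    return [n for n in [root, *descendants(root)] if matches(n)]


def words(node):
    if not node.children:
        return [node.label]
    return [w for child in node.children for w in words(child)]


tree = parse(
    "(S (NP (PRP She)) (VP (VBD saw) "
    "(NP (NP (DT the) (NN dog)) (PP (IN in) (NP (DT the) (NN park))))))"
)
print([" ".join(words(n)) for n in find(tree)])  # → ['the dog']
```

Even this toy version shows how much the user has to know about tree relations before asking a simple question of a treebank.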
6. Roadmap
- Motivations
- The Linguist's Search Engine
- Case Study 1: Psycholinguistics
- Case Study 2: Syntax
- Conclusions
7. A Brief Illustration of the LSE
- Pollard and Sag (1994), discussion in Manning (2003)
- (a) We consider Kim to be an acceptable candidate
- (b) We consider Kim an acceptable candidate
- (c) We consider Kim quite acceptable
- (d) We consider Kim among the most acceptable candidates
- (e) We consider Kim as an acceptable candidate
- (f) We consider Kim as quite acceptable
- (g) We consider Kim as among the most acceptable candidates
- (h) We consider Kim as being among the most acceptable candidates
8. Query By Example
12. You can choose to match all morphological forms of a word.
13. Hit search, and the LSE retrieves sentences whose analysis matches the structure you specified.
14. One more click to look at a sentence in context
15. ... or to see the entire Web page where it occurred.
16. Two Case Studies
- Focus in this talk
- What was the study about?
- How was the LSE useful?
- In both cases, my co-authors were naïve users of the Linguist's Search Engine. I didn't discover the LSE had been useful to them until after the fact.
17. Case Study I: Psycholinguistics
- Nina Kazanina, Ellen Lau, Moti Lieberman, Colin Phillips, and Masaya Yoshida. Active Dependency Formation in the Processing of Backwards Anaphora. 17th Annual CUNY Sentence Processing Conference, University of Maryland, College Park, March 2004.
- http://www.ling.umd.edu/ninaka/Papers/CUNY_2004_slides.pdf
18. Active Dependency Formation
The teacher asked what the team was laughing about __.
- Wh-word signals upcoming dependency formation
- Active processing of dependency observed
- → filled gap effect
- Dependency formation constrained by grammar
- → island constraints
19. Active Dependency Formation
Results looked good, but there was a confound!
She was cooking dinner while John listened to the radio.
Needed a construction where the target position is expected; otherwise the processor might simply have stopped looking for a target.
20. Active Dependency Formation
Possible solution: expletive constructions
- It was clear to his mother that John should go. (No Principle C)
- It was clear to him that John should go. (Principle C)
Question: does this construction really have the right properties?
- Is the second clause consistently expected?
- Is it consistently expletive rather than referential?
21. Query by example: "It was clear to him" becomes "It AUX clear to NP"
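The generalization step on this slide, from one example sentence to the template "It AUX clear to NP", can be sketched as slot matching over part-of-speech-tagged tokens. This is only a toy illustration: the LSE matches against parsed structure, whereas the tag sets, the flat NP approximation, and the function names below are all assumptions made for the example.

```python
# Illustrative assumptions: Penn-Treebank-style tags, and an "NP" that is
# approximated as a run of determiner/pronoun/adjective/noun tokens.
AUX_TAGS = {"VBZ", "VBD", "VBP", "MD"}
NP_TAGS = {"DT", "PRP", "PRP$", "JJ", "NN", "NNS", "NNP"}


def match_at(tokens, start):
    """Return the end index if 'It AUX clear to NP' matches at start, else None."""
    fixed_slots = [
        lambda word, tag: word.lower() == "it",     # It
        lambda word, tag: tag in AUX_TAGS,          # AUX (any form)
        lambda word, tag: word.lower() == "clear",  # clear
        lambda word, tag: word.lower() == "to",     # to
    ]
    i = start
    for slot in fixed_slots:
        if i >= len(tokens) or not slot(*tokens[i]):
            return None
        i += 1
    j = i
    while j < len(tokens) and tokens[j][1] in NP_TAGS:  # NP: one or more tokens
        j += 1
    return j if j > i else None


def find_matches(tokens):
    """Collect every span of the sentence that fits the template."""
    spans = []
    for start in range(len(tokens)):
        end = match_at(tokens, start)
        if end is not None:
            spans.append(" ".join(word for word, _ in tokens[start:end]))
    return spans


sentence = [
    ("It", "PRP"), ("was", "VBD"), ("clear", "JJ"), ("to", "TO"),
    ("his", "PRP$"), ("mother", "NN"), ("that", "IN"),
    ("John", "NNP"), ("should", "MD"), ("go", "VB"),
]
print(find_matches(sentence))  # → ['It was clear to his mother']
```

The point of the sketch is that the original example ("him") and the variant with a full noun phrase ("his mother") both satisfy the same template once the NP and AUX slots are left open.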
23. Active Dependency Formation
Result
- Verified that virtually all results of the search did involve expletive it with a following clause.
- Obtained reassurance in designing the follow-up study
- Later double-checked using an off-line completion study
The LSE made it easy to start with linguists' intuitions and find relevant evidence in naturally occurring text.
The LSE also makes it easy to look for additional relevant data that may not have occurred to the experimenter.
24. Query by example: It AUX Adj PP that
Any adjective
PP with any preposition
25. clear, important, vital, manifest, interesting, necessary, obvious
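One way to picture how a list of adjectives like this falls out of the broader query "It AUX Adj PP that" is to tally whatever fills the Adj slot across matching sentences. The sketch below does this with a regular expression over `word/TAG` strings; the tag set, the flat approximation of the PP, and the sample sentences are invented for illustration and are not LSE output.

```python
import re
from collections import Counter

# Assumed input format: space-separated word/TAG tokens (Penn-style tags).
PATTERN = re.compile(
    r"\bit/PRP\s+"                                   # It
    r"\S+/(?:VBZ|VBD|MD)\s+"                         # AUX
    r"(\S+)/JJ\s+"                                   # Adj (captured slot)
    r"\S+/(?:IN|TO)\s+"                              # PP: any preposition ...
    r"(?:\S+/(?:DT|PRP\$|PRP|JJ|NNS|NNP|NN)\s+)+"    # ... plus a flat NP
    r"that/IN",                                      # that
    re.IGNORECASE,
)


def adjective_counts(tagged_sentences):
    """Count which adjectives occupy the Adj slot of the template."""
    counts = Counter()
    for sentence in tagged_sentences:
        for match in PATTERN.finditer(sentence):
            counts[match.group(1).lower()] += 1
    return counts


# Hypothetical tagged sentences, made up for this example.
sample = [
    "It/PRP was/VBD clear/JJ to/TO his/PRP$ mother/NN that/IN John/NNP should/MD go/VB",
    "It/PRP is/VBZ important/JJ for/IN the/DT team/NN that/IN everyone/NN agrees/VBZ",
    "It/PRP is/VBZ clear/JJ to/TO me/PRP that/IN this/DT works/VBZ",
    "She/PRP was/VBD happy/JJ with/IN the/DT result/NN",
]
print(adjective_counts(sample).most_common())  # → [('clear', 2), ('important', 1)]
```

Opening the adjective and preposition slots is what surfaces candidates the experimenter had not thought of in advance.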
26. Case Study II: Syntax
- Heather Taylor, Interclausal (co)dependency: the case of the comparative correlative, Proc. Michigan Linguistics Society, October 2004.
- http://www.ling.umd.edu/events/syntax/abstracts/heather1.PDF
27. Comparative Correlatives
- The Xer , the Yer
- Highlighted in recent debates about the UG approach
- Central question: are these constructions amenable to an analysis based on UG principles, or do they present a challenge to the UG view?
- Central claim here: the LSE is useful regardless of which side of the debate you're on.
A.k.a. conditional correlatives, correlative conditionals, more-more constructions
28. Comparative Correlatives
[Tree diagrams: two analyses of "the more XPi (that) IP, the more XPj (that) IP", with traces ti and tj; node labels IP/CP and CP]
- Culicover and Jackendoff (1999): sui generis; interclausal relationships accounted for outside the syntax
- Taylor (2004): UG analysis relating CCs to conditionals
29. Comparative Correlatives
- McCawley's generalization (1988, 1998)
- Deletion of copular main verbs in CCs is sensitive to semantic properties of the subject (generic/specific)
- The better an advisor __, the more successful a student
- The more obnoxious Fred __, the less attention you should pay
- (the gap is filled by either "is" or Ø)
- But analysis of LSE data exposes the role of
- Phonological weight of the subject
- Parallelism (copula in both clauses, deletion in both clauses)
- casting doubt on the generalization's validity
30. Comparative Correlatives
- The more obnoxious Fred, the less attention you should pay to him.
- ?The more obnoxious Fred's younger brother, the less attention you should pay to him.
- ?The longer the day's activities are, the sleepier the campers.
- ?The longer the day's activities, the sleepier the campers are.
- ✓The longer the day's activities, the sleepier the campers.
Informant judgments confirm the tendencies indicated by naturally occurring data.
31. Comparative Correlatives
- Overt then?
- The hungrier Romeo gets, then the more pizza he eats.
- Cf. If Romeo gets hungrier, then he eats more pizza.
35. Comparative Correlatives
- Overt then
- The hungrier Romeo gets, then the more pizza he eats.
- Cf. If Romeo gets hungrier, then he eats more pizza.
- LSE searches suggest that overt then is not anomalous.
- Might this support a UG account that provides a unified treatment of CCs and conditionals?
- One more fact to add to the theoretical debate!
36. Conclusions
- The LSE is useful to traditional linguists
- Confirming/disconfirming intuitions
- (theory → data)
- Exposing a wider range of data
- (data → theory)
- The LSE complements new methodological trends
- Magnitude estimation, etc.
- The LSE is available for anyone to use
- http://lse.umiacs.umd.edu
38. Backup slides
39. Conclusions
- Chomsky (1979): "You can also collect butterflies and make many observations. If you like butterflies, that's fine; but such work must not be confounded with research, which is concerned to discover explanatory principles of some depth and fails if it does not do so."
- Einstein (1940): "Science is the attempt to make the chaotic diversity of our sense-experience correspond to a logically uniform system of thought in which experience must be correlated with the theoretical structure. What we call physics comprises that group of natural sciences which base their concepts on measurements"
40. A Web Search Tool for the Ordinary Working Linguist
- Must have linguist-friendly look and feel
- Must minimize learning/ramp-up time
- Must permit real-time interaction
- Must permit large-scale searches
- Must allow search on linguistic criteria
- Must be reliable
- Must evolve with real use
41. LSE Example: Text in Parallel Translation
Example: seeing how English completive particle usages (eat up versus simply eat, indicating a telic event) are rendered in different languages.
44. LSE Example: Implicit Objects
- Resnik (1993, 1996)
- Information-theoretic model of selectional constraints
- Model makes predictions with respect to implicit objects
- Implicit objects
- John ate Ø (John ate something edible)
- John found Ø (can't mean John found something findable).
- Question from audience
- Doesn't your model then predict that the verb titrate should permit implicit objects?
- Options
- Find informants for whom titrate is in the working vocabulary
- Slog through corpora looking for titrate used intransitively
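For readers who want the quantity behind "information-theoretic model of selectional constraints", the following is a sketch of how selectional preference strength and selectional association are usually stated for Resnik's model. The notation (v for the verb, c ranging over semantic classes of its direct object) is chosen here for convenience and is an assumption; the slides themselves do not give the formulas.

```latex
\begin{align}
S(v)    &= \sum_{c} P(c \mid v)\,\log\frac{P(c \mid v)}{P(c)} \\
A(v, c) &= \frac{1}{S(v)}\,P(c \mid v)\,\log\frac{P(c \mid v)}{P(c)}
\end{align}
```

Here S(v) is the relative entropy between the distribution of object classes given the verb and their prior distribution. On this reading a verb like eat, which constrains its object strongly, has a high S(v) and tolerates an implicit object, while find constrains its object weakly and does not. The audience question follows naturally: titrate presumably also constrains its object strongly, so the model would seem to predict that it allows implicit objects.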
47. Custom collection of sentences from the Web
52. Active Dependency Formation
Gender mismatch effect reveals active processing
Can grammatical information constrain the process?
- Principle C: a pronoun can't co-refer with an antecedent that it c-commands.
- Prediction: no gender mismatch effect with c-commanded positions
54. More on Comparative Correlatives (see Taylor, 2004)
- The two clauses behave like a subordinate and matrix clause, respectively
- Tag questions form on clause2 and not clause1
- Only clause2 can host subjunctive case
- In German, the word order is consistent with clause1 being subordinate to matrix clause2
- In Dutch there is flexibility in the word order of clause2 characteristic of matrix clauses
- NPI licensed in clause1 but not in clause2
- Extraction is equally permissible from both
55.
- Conditionals
- Presence of then
- Tag questions form on clause2 and not clause1
- NPI licensed in clause1 but not in clause2
- Extraction from both clauses
- Variable binding facts shadow each other
- Lack of Condition C binding between clauses
- Codependence
- Each clause depends on the presence of the other
- The licit values of X in the comparative strings are determined by each other
- Parallelism in copula deletion
56. the ADJer the ADJer