Corpus and Experimental Data as Corroborating Evidence: The Case of Preposition Placement in English Relative Clauses

1
Corpus and Experimental Data as Corroborating
Evidence: The Case of Preposition Placement in
English Relative Clauses
Thomas Hoffmann (University of Regensburg)
  • Linguistic Evidence: Empirical, Theoretical, and
    Computational Perspectives, University of
    Tübingen, 02.02.-04.02.2006

2
1. Introduction: Corpus vs. Introspection
  • "We do not need to use intuition in justifying
    our grammars, and as scientists, we must not use
    intuition in this way." (Sampson 2001: 135)
  • "You don't take a corpus, you ask questions.
    You can take as many texts as you like, you can
    take tape recordings, but you'll never get the
    answer." (Chomsky in Aarts 2000: 5-6)
  • → Which type of data are we left with then?

3
1. Introduction: Corpus vs. Introspection
  • "A corpus and an introspection-based approach to
    linguistics can be gainfully viewed as being
    complementary." (McEnery and Wilson 1996: 16)
  • → corpus and introspection data:
    corroborating evidence
  • → case study: P placement in English relative
    clauses

4
1. Introduction: What to Expect
  • corpora vs. introspection?
  • categorical corpus data (ICE-GB corpus)
  • Magnitude Estimation experiment
  • variable corpus data (ICE-GB corpus)
  • conclusion

5
2. Corpora and Introspection
  • Arguments against corpus data:
  • performance problem
  • negative data problem
  • homogeneity problem
  • → only use introspection

6
2. Corpora and Introspection
  • Arguments against corpus data ≠ no corpus
  • performance problem, yet: performance is the
    result of competence; modern corpora are
    representative
  • negative data problem, yet: only additional
    (different) data needed
  • homogeneity problem, yet: empirical claim that
    needs to be investigated
  • → use corpora: additional data type

7
2. Corpora and Introspection
  • Arguments against introspection data:
  • unnatural data problem
  • irrefutable data problem
  • illusion problem
  • stability problem
  • → only use corpora

8
2. Corpora and Introspection
  • Arguments against introspection data ≠ no
    introspection
  • unnatural data problem, yet: only additional
    (context) data needed
  • irrefutable data problem, yet: depends only on
    collection method
  • illusion problem, yet: only additional
    (natural) data needed
  • stability problem, yet: empirical claim that
    needs to be investigated
  • → use introspection: additional data type

9
2. Corpora and Introspection
  • Corpora and introspection are corroborating
    evidence

10
3. Case Study: Preposition Placement
  • I want a data source ...
  • (1) a. which I can rely on
  • stranded preposition
  • b. on which I can rely
  • pied-piped preposition
  • driving question:
  • data source for empirical analysis of (1a,b)?

11
4. Empirical Study I: Corpus Data
  • Corpus used:
  • International Corpus of English ICE-GB (Nelson
    et al. 2002) (educated Present-day BE, written
    and spoken)
  • Analysis tool:
  • GOLDVARB computer programme (logistic
    regression; Robinson et al. 2001)
  • relative influence of various contextual factors
    (weights < 0.5: inhibiting factors; > 0.5:
    favouring)
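
The reading of factor weights above can be illustrated with a minimal Python sketch. It assumes the usual logistic interpretation of such weights, in which a weight w contributes log(w / (1 - w)) to the log-odds, so that 0.5 is the neutral point. The function names and the three sample weights are hypothetical and are not taken from the study or from GOLDVARB itself.

```python
import math


def weight_to_log_odds(weight: float) -> float:
    """Log-odds contribution of a factor weight (assumed logistic reading)."""
    if not 0.0 < weight < 1.0:
        raise ValueError("weight must lie strictly between 0 and 1")
    return math.log(weight / (1.0 - weight))


def describe(weight: float) -> str:
    """Label a weight as on the slide: < 0.5 inhibiting, > 0.5 favouring."""
    if weight > 0.5:
        return "favouring"
    if weight < 0.5:
        return "inhibiting"
    return "neutral"


# Illustrative weights only (not results from the study).
for w in (0.25, 0.50, 0.75):
    print(f"weight {w:.2f}: {describe(w)}, log-odds {weight_to_log_odds(w):+.3f}")
```

Under this reading, weights equally far from 0.5 on either side have log-odds of equal size and opposite sign.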

12
4. Empirical Study I: Corpus Data I
  • Pstrand/pied-piped tokens tested for:
  • finiteness
  • restrictiveness
  • relativizer
  • XP contained in (V / N, e.g. entrance to sth. /
    Adj, e.g. afraid of sth.)
  • level of formality
  • X-PP relationship (Vprepositional, PPLoc_Adjunct,
    PPMan_Adjunct)
  • except 2, all factors discussed in literature
    before, but not w.r.t. interdependence (e.g.
    Bergh and Seppänen 2000; Trotta 2000)

13
4.1 Categorical corpus data
  • raw ICE-GB P-placement data:
  • 1074 finite relative clauses
  • 659 (61.4%) tokens pied piped
  • 415 (38.6%) tokens stranded
  • as expected: many categorical effects
  • → accidental vs. systematic gaps?
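
As a quick arithmetic check, the shares above can be recomputed from the raw counts. This is a minimal Python sketch; the variable and function names are illustrative.

```python
# Raw ICE-GB counts as given on the slide.
pied_piped = 659
stranded = 415
total = pied_piped + stranded


def share(count: int, whole: int) -> float:
    """Percentage of `whole` made up by `count`, rounded to one decimal."""
    return round(100 * count / whole, 1)


print(f"total finite relative clauses: {total}")
print(f"pied piped: {share(pied_piped, total)}%")
print(f"stranded:   {share(stranded, total)}%")
```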

14
4.2 Categorical corpus data: that/Ø ≠ WH-relatives
  • relativizer:
  • all that/Ø-tokens in ICE-GB stranded
  • 176 that+Pstranded tokens
  • (2) *a data source on that I can rely
  • 177 Ø+Pstranded tokens
  • (3) *a data source on Ø I can rely
  • → ICE-GB result expected
  • → implications: (2), (3)? / that ≠ WH-

15
4.3 Categorical corpus data: Constraints on Pstrand
  • 2. X-PP relationship
  • Literature (e.g. Bergh and Seppänen 2000;
    Trotta 2000):
  • Pstranding favoured with complement PP
  • disfavoured with adjunct PP
  • ICE-GB data:
  • Pstranding restricted to PPs which
    add thematic information to predicates/events

16
4.3 Categorical corpus data: Constraints on Pstrand
  • 2. X-PP relationship
  • categorical effect of WH-PPAdjuncts-tokens
  • a) just P+WH / no that/Ø+P in ICE-GB:
  • manner, degree, frequency and respect PPs,
    e.g.
  • a. the ways in which the satire is achieved
    <ICE-GB:S1B-014 51A>
  • b. *the ways which/that/Ø the satire is
    achieved in

17
4.3 Categorical corpus data: Constraints on Pstrand
  • 2. X-PP relationship
  • categorical effect of WH-PPAdjuncts-tokens
  • b) just P+WH / but that/Ø+P in ICE-GB:
  • subcat. PP (put sth. in/into/under)
  • locative, affected loc., direction PP
    adjuncts
  • a. the world that I was working in and
    studying in <ICE-GB:S1A-001 351B>
  • b. the world in which I was working and
    studying

18
4.3 Categorical corpus data: Constraints on Pstrand
  • Claim: comparison of WH- vs that/Ø shows:
  • P can only be stranded if PP adds thematic
    information to predicates/events
  • manner and degree adjuncts: compare events to
    other possible events of V-ing (Ernst 2002: 59)
  • frequency and respect adjuncts: have scope over
    temporal information (frequency) and truth value
    of entire clause (respect)
  • → don't add thematic participant → Pstrand
    with these: systematic gap

19
4.3 Categorical corpus data: Constraints on Pstrand
  • Claim: comparison of WH- vs that/Ø shows:
  • P can only be stranded if PP adds thematic
    information to predicates/events
  • subcat. PP; loc., affected loc., direction PP
    adjuncts
  • → add thematic participant → WH+P with these:
    accidental gap

20
4.3 Categorical corpus data: Constraints on Pstrand
  • Claim: comparison of WH- vs that/Ø shows:
  • P can only be stranded if PP adds thematic
    information to predicates/events
  • Comparison of WH- vs that/Ø: good evidence, but:
  • still negative data problem
  • further corroborating evidence needed
  • Introspection: Magnitude Estimation study

21
5. Empirical Study II: Magnitude Estimation
  • relative judgements (reference sentence)
  • informal, restrictive RCs tested for:
  • P-PLACEMENT (Pstrand, Ppied-piped) × RELATIVIZER
    (WH-, that-, Ø-) × X-PP (VPrep, PPTemp/Loc_Adjunct,
    PPManner/Degree_Adjunct)
  • tokens counterbalanced: 6 material groups of 18
    tokens each + 36 fillers = 54 tokens
  • tokens randomized (Web-Exp software)
  • N = 36 BE native speakers (sex: 18 m, 18 f / age:
    17-64)
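
The factorial design above can be enumerated in a few lines. This is a minimal Python sketch; it assumes that the 18 tokens in a material group correspond to one token per cell of the 2 × 3 × 3 design, which the slide suggests but does not state. The variable names are illustrative.

```python
from itertools import product

# Factor levels as listed on the slide.
P_PLACEMENT = ("Pstrand", "Ppied-piped")
RELATIVIZER = ("WH-", "that-", "Ø-")
X_PP = ("VPrep", "PPTemp/Loc_Adjunct", "PPManner/Degree_Adjunct")

# Every combination of factor levels is one experimental condition.
conditions = list(product(P_PLACEMENT, RELATIVIZER, X_PP))

# Assumption: one experimental token per condition in each material group.
experimental_tokens = len(conditions)
fillers = 36
tokens_per_participant = experimental_tokens + fillers

for placement, relativizer, pp_type in conditions:
    print(f"{placement:12s} {relativizer:6s} {pp_type}")
print(f"{experimental_tokens} conditions + {fillers} fillers "
      f"= {tokens_per_participant} tokens")
```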

22
5. Empirical Study II: Magnitude Estimation
  • 18 filler sentences ungrammatical:
  • a. That's a tape I sent them that done I've
    myself (word order violation; original source:
    <ICE-GB:S1A-033 074>)
  • b. There was lots of activity that goes on there
    (subject contact clause; original source:
    <ICE-GB:S1A-004 067>)
  • c. There are so many people who needs
    physiotherapy (subject-verb agreement error;
    original source: <ICE-GB:S1A-003 027>)

23
5. Empirical Study II: Magnitude Estimation
  • ANOVA: significant effects
  • P-PLACEMENT: F(1,33) = 4.536, p < 0.05
  • RELATIVIZER: F(2,66) = 17.149, p < 0.001
  • P-PLACEMENT × X-PP: F(2,66) = 9.740, p < 0.001
  • P-PLACEMENT × RELATIVIZER: F(2,66) = 4.217, p <
    0.02

24
5. Empirical Study II: Magnitude Estimation
  • ANOVA: not significant
  • AGE: F(1,33) = 2.760, p > 0.10
  • GENDER: F(1,33) = 1.495, p > 0.20
  • → indicates homogeneity of subjects

25
5. Empirical Study II: Magnitude Estimation
  • Post-hoc Tukey test: P-Place × Relativizer
  • Ppied-piped: WH- >> that (p < 0.001); WH-
    >> Ø (p < 0.001); that > Ø (p < 0.010)
  • Pstrand: no difference: WH- = that = Ø (p >>
    0.100)

26
5. Empirical Study II: Magnitude Estimation
  • Post-hoc Tukey test: P-Place × X-PP
  • Ppied-piped: PPMan/Deg > VPrep (p <
    0.010); PPMan/Deg = PPTemp/Loc (p = 0.100);
    VPrep = PPTemp/Loc (p > 0.100)
  • Pstrand: VPrep > PPTemp/Loc >
    PPMan/Deg (p < 0.001)

27
Fig. 1: Magnitude estimation result for P +
relativizer: P+WH >> P+that > P+Ø

28
Fig. 2: Magnitude estimation result for P +
relativizer compared with fillers: P+that = P+Ø =
ungrammatical fillers → violation of hard
constraint (Sorace and Keller 2005)

29
Fig. 3: Magnitude estimation result for
relativizer + P: WH+P = that+P = Ø+P; VPrep >
PPTemp/Loc > PPMan/Deg

30
Fig. 3: Magnitude estimation result for
relativizer + P: VPrep > PPTemp/Loc > PPMan/Deg >>
ungrammatical filler → violation of soft
constraint (Sorace and Keller 2005)

31
6. Corroborating Evidence
  • Corroborating evidence:
  • corpus: man/deg PPs: no Pstranded (not even with
    that/Ø) → semantic constraint on Pstranded
  • experiment: man/deg PPs: worst environment for
    Pstranded, yet better than ungrammatical fillers
    (soft constraint violation)

32
7. Empirical Study III: Corpus Data II
  • Constraints on variable corpus data (354 finite
    WH-tokens)
  • Goldvarb identified 3 independent factors (Log
    likelihood = -88.437; Significance = 0.004; Fit:
    X-square(27) = 27.977, accepted, p = 0.2040)
  • 1. level of formality (as expected)
  • 2. type of PP contained in (as expected)
  • 3. restrictiveness (unexpected)
  • restrictive RCs favour pied piping (weight:
    0.592)
  • nonrestrictive RCs clearly inhibit pied piping
    (i.e. favour stranding; weight: 0.248)
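
To give a feel for the size of the restrictiveness effect, the two reported weights can be compared on the odds scale. This is a minimal Python sketch under the assumption that the weights combine multiplicatively on the odds scale, as in the standard logistic model behind variable rule analysis. The helper name is hypothetical, and the comparison holds only with all other factors kept constant.

```python
def odds_multiplier(weight: float) -> float:
    """Factor by which a weight scales the odds of pied piping (assumed model)."""
    if not 0.0 < weight < 1.0:
        raise ValueError("weight must lie strictly between 0 and 1")
    return weight / (1.0 - weight)


# Weights reported on the slide for the restrictiveness factor.
RESTRICTIVE = 0.592
NONRESTRICTIVE = 0.248

# How much higher the odds of pied piping are in restrictive than in
# nonrestrictive relative clauses, other factors being equal.
odds_ratio = odds_multiplier(RESTRICTIVE) / odds_multiplier(NONRESTRICTIVE)
print(f"odds ratio restrictive vs nonrestrictive: {odds_ratio:.2f}")
```

On this assumption the odds of pied piping are roughly 4.4 times higher in restrictive than in nonrestrictive relative clauses.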

33
7. Empirical Study III: Corpus Data II
  • (6) And uhm he left me there with this packet of
    Durex which I hadn't got a clue what to do
    with to be totally honest <ICE-GB:S1B-049
    1671B>
  • reasons for restrictiveness effect:
  • 1. weaker semantic ties of non-restrictive
    clause with antecedent (pause/comma)
  • 2. Pied-piped P receives connective function
  • → functionalisation of preposition placement
    in WH-relative clause

34
8. Conclusion
  • corpus and introspection data: corroborating
    evidence
  • corpora: frequency/context effects (e.g. level of
    formality); unexpected patterns (e.g.
    restrictiveness); categorical data → require
    further investigation
  • →
  • introspection: differentiation of accidental
    gaps (WH+P with PPTemp/Loc) and systematic gaps
    (X+P with PPMan/Deg); detection of degrees of
    ungrammaticality

35
9. References
  • Aarts, B. 2000. "Corpus linguistics, Chomsky and
    Fuzzy Tree Fragments". In Christian Mair and
    Marianne Hundt, eds. 2000. Corpus Linguistics and
    Linguistic Theory. Amsterdam and Atlanta, GA:
    Rodopi, 5-13.
  • Bard, E.G. et al. 1996. "Magnitude Estimation of
    Linguistic acceptability". Language 72: 32-68.
  • Bergh, G. and A. Seppänen. 2000. "Preposition
    stranding with wh-relatives: A historical
    survey". English Language and Linguistics
    4: 295-316.
  • Cowart, W. 1997. Experimental Syntax: Applying
    Objective Methods to Sentence Judgements.
    Thousand Oaks: Sage.
  • Huddleston, R. et al. 2002. "Relative
    constructions and unbounded dependencies". In G.K.
    Pullum and R. Huddleston, eds. The Cambridge
    Grammar of the English Language. Cambridge:
    Cambridge University Press, 1031-1096.
  • Jackendoff, R. 2002. Foundations of Language:
    Brain, Meaning, Grammar, Evolution. Oxford:
    Oxford University Press.
  • Levine, R. and I.A. Sag. 2003. "WH-Nonmovement".
    <http://www-csli.stanford.edu/sag>, 04.07.2004.

36
9. References
  • Nelson, G. et al. 2002. Exploring Natural
    Language: Working with the British Component of
    the International Corpus of English. Amsterdam,
    Philadelphia: Benjamins.
  • McEnery, T. and A. Wilson. 1997. Corpus
    Linguistics. Edinburgh: Edinburgh University
    Press.
  • Pesetsky, D. 1998. "Some principles of sentence
    production". In Pilar Barbosa et al., eds. Is
    the Best Good Enough? Optimality and Competition
    in Syntax. Cambridge, MA: MIT Press, 337-83.
  • Penke, M. and A. Rosenbach. 2004. "What counts as
    evidence in linguistics? An introduction".
    Studies in Language 28,3: 480-526.
  • Pickering, M. and G. Barry. 1991. "Sentence
    processing without empty categories". Language
    and Cognitive Processes 6: 229-259.
  • Quirk, R. et al. 1985. A Comprehensive Grammar of
    the English Language. London: Longman.
  • Robinson, J. et al. 2001. GOLDVARB 2001: A
    Multivariate Analysis Application for Windows.
    <http://www.york.ac.uk/depts/lang/webstuff/goldvarb/manualOct2001>

37
9. References
  • Sag, I.A. 1997. "English relative constructions".
    Journal of Linguistics 33: 431-484.
  • Sampson, G. 2001. Empirical Linguistics. London,
    New York: Continuum.
  • Schütze, Carson T. 1996. The Empirical Base of
    Linguistics: Grammaticality Judgements and
    Linguistic Methodology. Chicago: Chicago
    University Press.
  • Sorace, Antonella and Frank Keller. 2005.
    "Gradience in linguistic data". Lingua 115,11:
    1497-1525.
  • Trotta, J. 2000. Wh-clauses in English: Aspects
    of Theory and Description. Amsterdam and
    Atlanta, GA: Rodopi.
  • Van der Auwera, J. 1985. "Relative that: a
    centennial dispute". Journal of Linguistics
    21: 149-179.