Magnitude estimation of linguistic acceptability: applications to research on developing grammars

1
Magnitude estimation of linguistic acceptability
applications to research on developing grammars
  • Antonella Sorace
  • Utrecht, 13 November 2003
  • antonella_at_ling.ed.ac.uk

2
Outline
  • Questions that are difficult to address with
    conventional acceptability judgment tests.
  • Magnitude Estimation: from psychophysics to
    linguistics.
  • What you can get with ME that you can't with
    other methods.
  • Applications of ME.
  • Web-based ME and demo.

3
The questions
  • Anyone who deals with DEVELOPING grammars (in
    language acquisition, language attrition,
    diachronic change) is confronted with the
    existence of gradience and optionality in
    linguistic data.

4
Optionality vs. gradedness
  • Optionality is characteristic of a grammar that
    allows different forms for the same meaning.
  • Gradedness is a manifestation of optionality: the
    likelihood with which optional variants appear or
    are preferred.

5
Differentiating among constraints
  • Sorace and Keller (2003): hard vs. soft
    constraints.
  • Duffield (2003): underlying vs. surface
    competence.
  • In both cases, the distinction is between
    narrow syntax and interface syntax: the
    former is categorical and consists of formal
    syntactic principles, the latter is determined by
    the interaction of formal principles and specific
    non-syntactic properties.

6
The problem
  • Can one capture this kind of data with
    conventional acceptability judgment tests?
  • The answer is NO, or not completely.
  • Judgments of linguistic acceptability are
    essential data in linguistic and language
    acquisition research but they are typically
    elicited in informal ways that limit their
    usefulness.

7
Conventional measurements of linguistic
acceptability
  • Judgments of linguistic acceptability are usually
    expressed on category scales (acceptable, *) or
    limited ordinal scales (acceptable, ?, ??, *).
  • These scales require absolute rating judgments,
    rather than relative ranking judgments.
  • Ordinal scales do not provide information about
    the relative distance between adjacent points on
    the scale.

8
Disadvantage of conventional scales
  • Measurements on these scales have several
    disadvantages:
  • They are limited in their range of values
  • They are inconsistent in application
  • They are not susceptible to analysis via
    parametric statistics
  • They are unsuited to comparisons between effects
    of different linguistic constraints, to estimates
    of systematic variability of judgments, etc.
  • They are difficult to interpret (what do the
    middle points on a rating scale mean?)

9
What these scales can't capture
  • relative strength of syntactic violations (are
    they the same for native and non-native
    speakers?)
  • lexical-semantic hierarchies within the domain of
    application of syntactic principles (are these
    acquired by L2ers?)
  • developmental optionality (do we find it in L2
    endstate grammars?).
  • (among many other things.)

10
In these cases, we want to measure
  • The precise differences in acceptability between
    sentences.
  • The strength of preferences expressed by subjects
    for one sentence over another.

11
ME in psychophysics
  • Magnitude estimation is an experimental technique
    used to quickly and easily determine how much of
    a given sensation a person is having.
  • Stevens was the first experimenter to suggest
    using magnitude estimations to quantitatively
    scale sensation.

12
  • In a magnitude estimation experiment subjects are
    presented with a standard stimulus (a modulus)
    and are asked to express its magnitude by a
    number.
  • The subjects are then presented with a series of
    stimuli that vary in intensity and are asked to
    assign each of the stimuli a number relative to
    the standard stimulus.

13
  • Subjects assign a number
  • to the first stimulus (the modulus), to reflect
    the magnitude of the pertinent characteristic
    (length, loudness, brightness, etc.);
  • to each successive stimulus, to indicate its
    apparent magnitude relative to the first.

14
Scaling
  • Scaling is not about absolute accuracy of
    judgments.
  • Scaling is about the relative relationships
    between judgments of stimuli of different
    intensities.

15
Different modalities
  • The numerical modality is the most common but
    other modalities are possible (e.g. line length).
  • Other modalities can be more user-friendly,
    particularly if you are testing people who (think
    they) are numerically-challenged.

16
How can you be sure subjects understand how to
perform magnitude estimations?
  • Many magnitude estimation experiments use a
    control condition in which subjects are asked to
    perform magnitude estimations of the length of a
    line.
  • Magnitude estimations of line length have been
    shown to be proportional to the actual length of
    the line.

17
  • If you can show that for a group of subjects
    magnitude estimations increased proportionally
    with the length of lines, you have established
    that the subjects do indeed understand the
    directions they have been given and can assign
    numbers to their sensations systematically.
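
One rough way to check this proportionality for a single subject is to look at how much the ratio of estimate to line length varies: under perfect proportionality the ratio is constant. The sketch below is only an illustration. The function name, the data, and the use of the coefficient of variation as a summary are assumptions, not part of the procedure described in the slides.

```python
from statistics import mean, pstdev

def proportionality_spread(lengths, estimates):
    """Return the coefficient of variation of the estimate/length ratios.

    If estimates are exactly proportional to lengths, every ratio is the
    same and the spread is 0; larger values mean less systematic use of
    numbers.
    """
    ratios = [e / l for l, e in zip(lengths, estimates)]
    return pstdev(ratios) / mean(ratios)

# Hypothetical data: line lengths in mm and two subjects' estimates.
lengths = [10, 20, 40, 80]
proportional = [5, 10, 20, 40]   # always half the length
erratic = [5, 30, 8, 50]         # no stable ratio

print(proportionality_spread(lengths, proportional))  # 0.0
print(proportionality_spread(lengths, erratic))
```

What counts as an acceptable spread is an empirical decision for the experimenter; the sketch only makes the notion of "increasing proportionally" concrete.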

18
Advantages of ME for physical dimensions
  • ME provides measurements of subjective
    impressions on a numerical scale which can be
    plotted against the objective measure of the
    physical stimuli giving rise to the impressions.
  • It does not restrict the number of values which
    can be used.
  • Linear regression of estimates against physical
    measures in log-log coordinates produces a
    straight line with a slope characteristic of the
    physical property being assessed: equal ratios on
    the physical dimension give rise to equal ratios
    of judgments (Stevens' Power Law).

19
The Power Law
  • The magnitude of sensation varies as the
    intensity of the physical stimulus raised to some
    power m:
  • $S = a I^{m}$, where S = sensation, a = constant,
    I = intensity, m = exponent for a particular
    sensation.
  • When plotted on log-log axes, the power law plots
    as a straight line with a slope of the exponent.
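
The straight line follows from taking logarithms of the power law S = a I^m, which gives log S = log a + m log I: the slope is the exponent m and the intercept is log a. A minimal sketch of recovering both by least squares on log-transformed values is shown below. The function name and the synthetic data are illustrative assumptions, not measurements from any experiment.

```python
import math

def fit_power_law(intensities, sensations):
    """Least-squares fit of log S = log a + m * log I; returns (a, m)."""
    xs = [math.log(i) for i in intensities]
    ys = [math.log(s) for s in sensations]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Slope of the regression line in log-log coordinates = exponent m.
    m = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    # Intercept is log a, so exponentiate to get a back.
    a = math.exp(mean_y - m * mean_x)
    return a, m

# Synthetic data generated from S = 2 * I ** 0.5 (illustrative only).
intensities = [1, 4, 16, 64]
sensations = [2 * i ** 0.5 for i in intensities]

a, m = fit_power_law(intensities, sensations)
print(round(a, 3), round(m, 3))  # 2.0 0.5
```

With real judgments the points scatter around the line, and the fitted slope is an estimate of the exponent for the dimension being judged.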

21
Examples of the Power Law for psychophysical
judgment tasks
22
What about linguistic acceptability?
  • Unlike other dimensions, linguistic acceptability
    has no obvious physical continuum to plot
    against the informants' impressions.

23
A psychophysical law for linguistic judgments?
  • Keller (2003) has recently argued that a power
    law of the same kind as that obtained in
    psychophysics can be derived by plotting
    estimated linguistic acceptability against the
    number of linguistic constraints violated in the
    stimuli.

24
Extensions to non-physical domains
  • Magnitude estimation has been adapted to judging
    psycho-social continua with no objective metric:
    prestige of occupations, support for political
    policies, etc.
  • Magnitude estimation was used on acceptability
    judgments for the first time by Sorace (1992), not
    to plot any function, but simply to compare the
    results with those obtained using more familiar
    techniques.

25
Typical instructions
  • Here's an example of what the instructions look
    like:

26
Instructions
  • The purpose of this exercise is to get you to
    judge the acceptability of some English
    sentences. You will see a series of sentences on
    the screen. These sentences are all different.
    Some will seem perfectly okay to you, but others
    will not. What we're after is not what you think
    of the meaning of the sentence, but what you
    think of the way it's constructed.

27
  • Your task is to judge how good or bad each
    sentence is by assigning a number to it.
  • You can use any number that seems appropriate to
    you. For each sentence after the first, assign a
    number to show how good or bad that sentence is
    in proportion to the reference sentence.

28
  • For example, if the first sentence was
    (1) cat the mat on sat the.
    and you gave it a 1, and if the next example
    (2) the dog the bone ate.
    seemed 20 times better, you'd give it twenty. If
    it seems half as good as the reference sentence,
    give it the number 0.5.

29
  • You can use any range of positive numbers you
    like including, if necessary, fractions or
    decimals.
  • You should not restrict your responses to, say,
    an academic marking scale.
  • You may not use minus numbers or zero, of course,
    because they aren't proper multiples or fractions
    of positive numbers.
  • If you forget the reference sentence, don't worry:
    if each of your judgments is in proportion to the
    first, you can judge the new sentence relative to
    any of them that you do remember.

30
  • There are no 'correct' answers, so whatever seems
    right to you is a valid response. Nor is there a
    'correct' range of answers or a correct place
    to start.
  • Any convenient positive number will do for the
    reference.
  • We are interested in your first impressions, so
    don't spend too long thinking about your
    judgment.

31
  • Remember:
  • Use any number you like for the first sentence.
  • Judge each sentence in proportion to the
    reference sentence.
  • Use any positive numbers you think appropriate.

32
Validation of ME
  • How can we be sure that people can reliably use
    magnitude estimation to judge linguistic
    acceptability given that there is no metric
    measurement?
  • Bard, Robertson and Sorace (1996) applied
    standard validation procedures (i.e.
    cross-modality matching and replication): people
    had to use one modality to judge the magnitude of
    the other.

33
Choices about the modulus and face validity
  • The experimenter has the option of assigning a
    fixed number to the modulus.
  • The other option is to leave the modulus in sight
    throughout the experiment.
  • This option has good face validity, but it
    doesn't affect the ultimate reliability of the
    estimates.
  • People don't need to remember the modulus: if
    they are making judgments proportionally, the
    reference point shifts as they move on.

34
Timed vs. untimed ME
  • Timing the intervals between sentences may reduce
    the likelihood that people consult metalinguistic
    or prescriptive knowledge.
  • Intervals have to be different for non-native
    speakers: they have to be piloted carefully.

35
Varying the instructions
  • There is a tendency in some people to use a fixed
    (usually 10-point) scale. This is possibly
    because of familiarity with school marking
    systems.
  • If the instructions contain an explicit warning
    against using a restricted range of numbers, the
    tendency is much reduced.
  • People are very sensitive to instructions: these
    have to be as explicit and clear as possible.

36
Applying ME to linguistic acceptability
  • ME yields interval scales, which allow the
    application of parametric statistics.
  • Mathematical operations can be applied to the
    estimates, allowing:
  • a direct indication of the speakers' ability
    to discriminate between grammatical and
    ungrammatical sentences;
  • a direct measure of the strength of
    speakers' preferences.

37
Data analysis
  • ME data need to be normalized. Two ways:
  • Transforming raw magnitude values into logarithms
    before carrying out any further operation.
  • Dividing each numerical value by the modulus that
    the subject had assigned to the reference
    sentence, then carrying out analyses on the
    log-transformed judgments.
  • Any statistical package can do this!
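
As an illustration of the second option, the sketch below divides each raw estimate by the subject's modulus and then takes base-10 logarithms. The function name, the choice of base 10, and the two hypothetical subjects are assumptions made for the example.

```python
import math

def normalize(raw_estimates, modulus):
    """Divide each estimate by the subject's modulus, then take log10."""
    return [math.log10(value / modulus) for value in raw_estimates]

# Two hypothetical subjects judging the same four sentences.
# Subject A gave the reference sentence a 10, subject B gave it a 1.
subject_a = normalize([10, 100, 5, 1], modulus=10)
subject_b = normalize([1, 10, 0.5, 0.1], modulus=1)

print(subject_a)
print(subject_b)

# Strength of preference for sentence 2 over sentence 3: a difference of
# logs, i.e. the log of the ratio between the two original estimates.
preference = subject_a[1] - subject_a[2]
print(round(preference, 3))  # 1.301
```

After normalization the two subjects have the same scores even though they used different number ranges, which is the point of the procedure. Because log(x / modulus) = log(x) - log(modulus), the two options differ only by a constant per subject.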

38
Do you need a lab to use ME?
  • No. ME is very adaptable and can be used with
    pencil and paper, an overhead projector,
    booklets, etc.

39
Who can do ME?
  • Any adults, although it may not be the technique
    of choice if you are doing fieldwork with
    low-literacy or low-numeracy people.

40
Lexical gradience: the Auxiliary Selection
Hierarchy (Sorace 2000)
  • CHANGE OF LOCATION 'BE'
  • CHANGE OF STATE
  • CONTINUATION OF A STATE
  • EXISTENCE OF STATE
  • UNCONTROLLED PROCESS
  • CONTROLLED PROCESSES (MOT)
  • CONTROLLED PROCESS (-MOT) 'HAVE'

41
Gradedness in Italian auxiliary selection (Bard,
Robertson and Sorace, 1996)
42
Other recent ME applications on language
development
  • L1 attrition in the use of referential pronouns
    (Tsimpli et al. 2003; Filiaci 2003).
  • Pronouns and clitics in L2 Spanish and Greek
    (Parodi 2002).
  • Focus in L2 Hungarian (Papp 1999).
  • Verb movement and null subject parameters in L2
    French and Spanish (Ayoun 2000).
  • Residual verb raising in Faroese (Heycock 2003).

43
WebExp
  • Keller et al. (1998) developed dedicated
    interactive software, WebExp, which can be used
    to collect acceptability judgments remotely over
    the Internet, as well as in standard experimental
    conditions.
  • The current version of WebExp is still available
    but will undergo substantial revision in order to
    improve compatibility.

44
Collecting data at a distance: WebExp
  • WebExp offers the following features for
    conducting web-based experiments:
  • Two experimental paradigms are supported:
    magnitude estimation and sentence completion.
    Both within-subject and between-subject designs
    can be used.
  • Automatic subject authentication is achieved by
    conducting basic plausibility checks on the
    subject's data and by verifying the subject's
    e-mail address.

45
  • WebExp automatically creates an individual
    randomization of the experimental materials for
    each subject. The experimenter can impose
    constraints on the randomization to prevent
    certain experimental items from occurring
    consecutively.
  • WebExp records the time a subject takes to
    respond to each experimental item. Automatic
    checks can be carried out on both onset times and
    completion times.
  • The response data are stored in a format that can
    be easily processed by standard statistics
    packages.
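
The idea of a per-subject randomization with a constraint on consecutive items can be sketched as follows. This is not WebExp's actual implementation: the function name, the item format, the seeding scheme, and the simple reshuffle-until-valid strategy are all assumptions chosen for illustration.

```python
import random

def randomize(items, clash, rng, max_tries=10_000):
    """Shuffle items until no two adjacent items clash.

    clash(a, b) returns True when a and b must not occur consecutively.
    """
    order = list(items)
    for _ in range(max_tries):
        rng.shuffle(order)
        if not any(clash(a, b) for a, b in zip(order, order[1:])):
            return order
    raise RuntimeError("no valid ordering found; relax the constraint")

# Hypothetical materials: (condition, sentence id).
items = [("A", 1), ("A", 2), ("B", 1), ("B", 2), ("C", 1), ("C", 2)]

# A separate generator per subject yields an individual order for each one.
rng = random.Random(2003)
order = randomize(items, lambda a, b: a[0] == b[0], rng)
print(order)
```

Here the constraint keeps two items from the same condition from appearing back to back. Reshuffling until the constraint holds is easy to reason about but can be slow for tightly constrained designs, where a constructive algorithm would be a better fit.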

46
  • WebExp has been subjected to standard validation
    procedures (Keller and Alexopoulou 2001), which
    suggest that the data it produces are comparable
    to lab-based data.

47
Future developments
  • We are going to test a non-numerical version of
    ME with older children (4 yr olds).
  • Older children should be able to understand the
    concept of proportionality.

48
Conclusions
  • Magnitude estimation can be used by naive
    informants to judge linguistic acceptability.
    Within-group estimates are consistent across
    response modalities and between-group comparisons
    show consistent statistically significant
    effects.
  • Magnitude estimation allows us to use the full
    power of experimental design and statistical
    analysis to test hypotheses derived from
    linguistic theory.

49
It works
  • Magnitude estimation is particularly suited to the
    investigation of developing/unstable grammars.
  • ME has now been used in a wide range of language
    studies on different topics and within different
    theoretical frameworks.

50
References
  • Bard, E.G., Robertson, D. and Sorace, A. 1996.
    Magnitude estimation of linguistic acceptability.
    Language 72: 32-68.
  • Keller, F. 2003. A psychophysical law for
    linguistic judgments. Proceedings of the 25th
    Annual Conference of the Cognitive Science
    Society. Mahwah: Lawrence Erlbaum.
  • Sorace, A. 1996. The use of acceptability
    judgments in second language research. In V. T.
    Bhatia and W. Ritchie (eds.) Handbook of Second
    Language Acquisition. New York: Academic Press,
    pp. 375-409.
  • Sorace, A. and Keller, F. in press. Gradience in
    linguistic data. To appear in Lingua.