Business Research Methods - PowerPoint PPT Presentation

About This Presentation
Title:

Business Research Methods

Description:

Business Research Methods Measurement and Scaling: Noncomparative Scaling Techniques – PowerPoint PPT presentation

Slides: 41
Provided by: dcom81

Transcript and Presenter's Notes

Title: Business Research Methods


1
Business Research Methods
  • Measurement and Scaling
  • Noncomparative Scaling Techniques

2
Noncomparative Scaling Techniques
  • Respondents evaluate only one object at a time,
    and for this reason noncomparative scales are
    often referred to as monadic scales.
  • Noncomparative techniques consist of continuous
    and itemized rating scales.

3
Continuous Rating Scale
  • Respondents rate the objects by placing a mark at
    the appropriate position on a line that runs from
    one extreme of the criterion variable to the other.
  • The form of the continuous scale may vary
    considerably.
  • How would you rate Sears as a department store?
  • Version 1
    Probably the worst -------I------------------------------- Probably the best
  • Version 2
    Probably the worst -------I------------------------------- Probably the best
    0   10   20   30   40   50   60   70   80   90   100
  • Version 3
    Very bad              Neither good nor bad              Very good
    Probably the worst -------I------------------------------- Probably the best
    0   10   20   30   40   50   60   70   80   90   100

4
Itemized Rating Scales
  • The respondents are provided with a scale that
    has a number or brief description associated with
    each category.
  • The categories are ordered in terms of scale
    position, and the respondents are required to
    select the specified category that best describes
    the object being rated.
  • The commonly used itemized rating scales are the
    Likert, semantic differential, and Stapel scales.

5
Likert Scale
  • The Likert scale requires the respondents to
    indicate a degree of agreement or disagreement
    with each of a series of statements about the
    stimulus objects.
  • Response categories: 1 = Strongly disagree,
    2 = Disagree, 3 = Neither agree nor disagree,
    4 = Agree, 5 = Strongly agree
  • 1. Sears sells high quality merchandise.   1  2X  3  4  5
  • 2. Sears has poor in-store service.        1  2X  3  4  5
  • 3. I like to shop at Sears.                1  2  3X  4  5
  • The analysis can be conducted on an item-by-item
    basis (profile analysis), or a total (summated)
    score can be calculated.
  • When arriving at a total score, the categories
    assigned to the negative statements by the
    respondents should be scored by reversing the
    scale.
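
The summated score with reverse scoring can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the responses are the three marked Sears items from this slide, and the function names and the choice of which statement counts as negative are assumptions made for the example.

```python
# Responses on the 1-5 agreement scale, as marked in the Sears example.
responses = {
    "Sears sells high quality merchandise.": 2,
    "Sears has poor in-store service.": 2,
    "I like to shop at Sears.": 3,
}

# Assumption: only the in-store service statement is negatively worded.
negative_items = {"Sears has poor in-store service."}

SCALE_MIN, SCALE_MAX = 1, 5


def reverse_score(score, low=SCALE_MIN, high=SCALE_MAX):
    """Reverse a score so that 1 <-> 5, 2 <-> 4, and 3 stays 3."""
    return low + high - score


def summated_score(responses, negative_items):
    """Sum the item scores after reversing the negatively worded items."""
    total = 0
    for statement, score in responses.items():
        if statement in negative_items:
            score = reverse_score(score)
        total += score
    return total


total = summated_score(responses, negative_items)
print(total)  # 2 + (6 - 2) + 3 = 9
```

After reversal, a higher total consistently indicates a more favorable attitude toward the store.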

6
Semantic Differential Scale
  • The semantic differential is a seven-point rating
    scale with end points associated with bipolar
    labels that have semantic meaning.
  • SEARS IS
  • Powerful ---------X----- Weak
  • Unreliable -----------X--- Reliable
  • Modern -------------X- Old-fashioned
  • The negative adjective or phrase sometimes
    appears at the left side of the scale and
    sometimes at the right.
  • This controls the tendency of some respondents,
    particularly those with very positive or very
    negative attitudes, to mark the right- or
    left-hand sides without reading the labels.
  • Individual items on a semantic differential scale
    may be scored on either a -3 to 3 or a 1 to 7
    scale.
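
The two scoring ranges are equivalent up to a shift of four points, which the following Python sketch illustrates. The marked positions are hypothetical (they are not read from the slide), and re-orienting each item so that the favorable adjective always scores high is an assumption commonly made before items are compared or summed.

```python
SCALE_POINTS = 7
MIDPOINT = (SCALE_POINTS + 1) // 2  # 4 on a seven-point scale


def to_centered(score):
    """Convert a 1..7 score to the equivalent -3..3 score."""
    if not 1 <= score <= SCALE_POINTS:
        raise ValueError("score must lie between 1 and 7")
    return score - MIDPOINT


def to_one_based(score):
    """Convert a -3..3 score back to the 1..7 range."""
    return score + MIDPOINT


def favorable_high(position, positive_on_left):
    """Re-orient a marked position (1 = leftmost) so 7 is always favorable."""
    return SCALE_POINTS + 1 - position if positive_on_left else position


# Hypothetical marks: (item, marked position, favorable adjective on the left?)
marks = [
    ("Powerful / Weak", 5, True),
    ("Unreliable / Reliable", 6, False),
    ("Modern / Old-fashioned", 7, True),
]

profile = {
    label: to_centered(favorable_high(position, left))
    for label, position, left in marks
}
print(profile)
```

Because the favorable adjective alternates sides, the re-orientation step is what makes the scores comparable across items.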

7
A Semantic Differential Scale for Measuring Self-Concepts, Person Concepts, and Product Concepts
1) Rugged --------------------- Delicate
2) Excitable --------------------- Calm
3) Uncomfortable --------------------- Comfortable
4) Dominating --------------------- Submissive
5) Thrifty --------------------- Indulgent
6) Pleasant --------------------- Unpleasant
7) Contemporary --------------------- Obsolete
8) Organized --------------------- Unorganized
9) Rational --------------------- Emotional
10) Youthful --------------------- Mature
11) Formal --------------------- Informal
12) Orthodox --------------------- Liberal
13) Complex --------------------- Simple
14) Colorless --------------------- Colorful
15) Modest --------------------- Vain
8
Stapel Scale
  • The Stapel scale is a unipolar rating scale with
    ten categories numbered from -5 to 5, without a
    neutral point (zero). This scale is usually
    presented vertically.
  • SEARS
          5                5
          4                4
          3                3
          2                2X
          1                1
    HIGH QUALITY     POOR SERVICE
         -1               -1
         -2               -2
         -3               -3
         -4X              -4
         -5               -5
  • The data obtained by using a Stapel scale can be
    analyzed in the

9
Basic Noncomparative Scales
  • Continuous Rating Scale
    Basic characteristics: Place a mark on a continuous line
    Examples: Reaction to TV commercials
    Advantages: Easy to construct
    Disadvantages: Scoring can be cumbersome unless computerized

Itemized Rating Scales
  • Likert Scale
    Basic characteristics: Degrees of agreement on a 1 (strongly disagree) to 5 (strongly agree) scale
    Examples: Measurement of attitudes
    Advantages: Easy to construct, administer, and understand
    Disadvantages: More time-consuming
  • Semantic Differential
    Basic characteristics: Seven-point scale with bipolar labels
    Examples: Brand, product, and company images
    Advantages: Versatile
    Disadvantages: Controversy as to whether the data are interval
  • Stapel Scale
    Basic characteristics: Unipolar ten-point scale, -5 to 5, without a neutral point (zero)
    Examples: Measurement of attitudes and images
    Advantages: Easy to construct, administer over telephone
    Disadvantages: Confusing and difficult to apply

10
Summary of Itemized Scale Decisions
  • 1) Number of categories: Although there is no
    single, optimal number, traditional guidelines
    suggest that there should be between five and
    nine categories.
  • 2) Balanced vs. unbalanced: In general, the
    scale should be balanced to obtain objective
    data.
  • 3) Odd/even number of categories: If a neutral or
    indifferent scale response is possible from
    at least some of the respondents, an odd
    number of categories should be used.
  • 4) Forced vs. non-forced: In situations where the
    respondents are expected to have no opinion,
    the accuracy of the data may be improved by a
    non-forced scale.
  • 5) Verbal description: An argument can be made
    for labeling all or many scale categories.
    The category descriptions should be located as
    close to the response categories as possible.
  • 6) Physical form: A number of options should be
    tried and the best selected.

11
Balanced and Unbalanced Scales
Figure 9.1
Balanced Scale
Jovan Musk for Men is:
  Extremely good
  Very good
  Good
  Bad
  Very bad
  Extremely bad
Unbalanced Scale
Jovan Musk for Men is:
  Extremely good
  Very good
  Good
  Somewhat good
  Bad
  Very bad
12
Rating Scale Configurations
A variety of scale configurations may be
employed to measure the gentleness of Cheer
detergent. Some examples include:
Cheer detergent is:
1) Very harsh --- --- --- --- --- --- --- Very gentle
2) Very harsh 1 2 3 4 5 6 7 Very gentle
3) . Very harsh
   .
   .
   . Neither harsh nor gentle
   .
   .
   . Very gentle
4) ____ Very harsh
   ____ Harsh
   ____ Somewhat harsh
   ____ Neither harsh nor gentle
   ____ Somewhat gentle
   ____ Gentle
   ____ Very gentle
5) -3 (Very harsh)  -2  -1  0 (Neither harsh nor gentle)  1  2  3 (Very gentle)
Figure 9.2
13
Some Unique Rating Scale Configurations
Thermometer Scale
Instructions: Please indicate how much you like
McDonald's hamburgers by coloring in the
thermometer. Start at the bottom and color up to
the temperature level that best indicates how
strong your preference is.
Form: a thermometer marked 100 (Like very much),
75, 50, 25, 0 (Dislike very much)
Smiling Face Scale
Instructions: Please point to the face that shows
how much you like the Barbie Doll. If you do not
like the Barbie Doll at all, you would point to
Face 1. If you liked it very much, you would
point to Face 5.
Form: five faces numbered 1 2 3 4 5
Figure 9.3
14
Thurstone Scale
  • It is a two-stage procedure.
  • In the first stage, the researcher selects 80 to
    100 items indicating different degrees of
    favourable attitude towards the concept under
    study.
  • They are given to a group of judges, who sort them
    from favourable to unfavourable, keeping equal
    intervals between categories.
  • All items that have consensus from the judges are
    selected and distributed uniformly on a scale of
    favourability.
  • This scale is then administered to respondents to
    measure their attitude towards a particular
    concept.
  • It is time-consuming and costly, and is rarely
    used in applied business research.

15
Measurement Accuracy
  • The true score model provides a framework for
    understanding the accuracy of measurement.
  • X_O = X_T + X_S + X_R
  • where
  • X_O = the observed score or measurement
  • X_T = the true score of the characteristic
  • X_S = systematic error
  • X_R = random error
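
A small simulation can make the decomposition concrete: the observed score is the sum of the true score, the systematic error, and the random error. The sketch below is only illustrative. The numeric values are invented, and treating random error as normally distributed around zero is an assumption of the example, not something the slide states.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

TRUE_SCORE = 4.0        # true score (assumed value)
SYSTEMATIC_ERROR = 0.5  # constant bias added to every measurement (assumed)
RANDOM_SD = 1.0         # spread of the random error (assumed)


def observe():
    """Return one observed score: true score + systematic + random error."""
    random_error = random.gauss(0.0, RANDOM_SD)
    return TRUE_SCORE + SYSTEMATIC_ERROR + random_error


observations = [observe() for _ in range(10_000)]

# Random error averages out over many measurements; systematic error does not.
bias = statistics.mean(observations) - TRUE_SCORE
spread = statistics.stdev(observations)
print(round(bias, 2), round(spread, 2))
```

Averaging many observations shrinks the influence of random error, but the estimated bias stays close to the systematic error no matter how many measurements are taken.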

16
Systematic Error
  • Lack of clarity of the scale, including the
    instructions or the items themselves.
  • Mechanical factors, such as poor printing,
    overcrowding items in the questionnaire, and poor
    design.

17
Random Error
  • Short-term or transient personal factors, such as
    health, emotions, and fatigue.
  • Situational factors, such as the presence of
    other people, noise, and distractions.

18
Criteria for evaluating measurement
  • The criteria for evaluating measurements are
  • Reliability
  • Validity
  • Sensitivity
  • Generalizability
  • Relevance

19
Reliability
  • The degree to which measures are free from random
    error and therefore yield consistent results
    across time or situations.
  • Perfect reliability requires that there is no
    random error:
  • X_R = 0

20
Validity
  • The ability of a scale to measure what was
    intended to be measured.
  • Perfect validity requires that there is no
    measurement error, either systematic or random:
  • X_R = 0, X_S = 0

21
Relationship between validity and reliability
  • If a measure is perfectly valid it is also
    perfectly reliable
  • However if a measure is perfectly reliable it may
    or may not be perfectly valid
  • If a measure is unreliable it will not be valid
  • Reliability is a necessary but not a sufficient
    condition for validity

22
THE GOAL OF MEASUREMENT VALIDITY and RELIABILITY
23
Reliability and Validity on Target
  • Target A: Old Rifle. Low Reliability, Low Validity
  • Target B: New Rifle. High Reliability, High Validity
  • Target C: New Rifle, Sunglare. Reliable but Not Valid
24
RELIABILITY
Of index measures
Repeatability
25
Types of Reliability
  • There are two dimensions of reliability:
    repeatability and internal consistency.
  • If the results of the research are the same even
    when it is conducted a second or third time, this
    confirms the repeatability aspect.
  • Test-retest method: An approach for assessing
    reliability in which respondents are administered
    identical sets of scale items at two different
    times under as nearly equivalent conditions as
    possible.
  • This measures repeatability, since the same scale
    or measure is administered to the same set of
    respondents at two separate points in time. If the
    measure is stable over time, it should obtain
    similar results (e.g., 40 satisfied with jobs both
    times).
  • However, it is difficult to locate all respondents
    for the second round, their attitudes may change
    over time, or the first measure may sensitize the
    respondents.
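
One common way to quantify test-retest reliability is to correlate the scores from the two administrations. The sketch below assumes that approach and uses the Pearson correlation coefficient; the two score lists are invented for illustration, and `pearson_r` is a helper written for this example rather than a library function.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical job-satisfaction scores for the same eight respondents.
time_1 = [4, 5, 3, 2, 4, 5, 1, 3]  # first administration
time_2 = [4, 4, 3, 2, 5, 5, 2, 3]  # second administration, some weeks later

test_retest_r = pearson_r(time_1, time_2)
print(round(test_retest_r, 2))
```

A coefficient close to 1 suggests the measure is stable over time; the caveats listed above (attrition, attitude change, sensitization) can all pull it down for reasons unrelated to the scale itself.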

26
Equivalent Forms Method
  • An approach to assessing reliability that requires
    two equivalent forms of the scale to be
    constructed and administered to the same
    respondents at two different times.
  • However, it is difficult, time-consuming, and
    expensive to construct two equivalent forms of a
    scale.

27
Internal Consistency
  • This measure of reliability focuses on internal
    consistency of the set of items forming the
    scale.
  • It is used to assess the reliability of a summated
    scale, where several items are summed to form a
    total score. Each item measures some aspect of
    the construct, and the items should be consistent
    in what they indicate about the characteristic.

28
Split half Method
  • Split-half method: A method of measuring
    internal consistency reliability in which the
    items constituting the scale are divided into two
    halves and the resulting scores of the two halves
    are correlated. High correlation indicates high
    consistency.
  • However, the results will depend on how the scale
    items are split.
  • Coefficient alpha (Cronbach's alpha): A measure of
    internal consistency reliability that is the
    average of all possible split-half coefficients
    resulting from different splittings of the scale
    items.
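
Both measures can be sketched in Python. Note the swap: instead of literally averaging every possible split-half coefficient, this example computes coefficient alpha with the standard variance-based formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The response data, the odd/even split, and the helper names are all assumptions made for illustration.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


def split_half_r(rows):
    """Correlate the odd-item half score with the even-item half score."""
    odd_half = [sum(row[0::2]) for row in rows]   # items 1, 3, ...
    even_half = [sum(row[1::2]) for row in rows]  # items 2, 4, ...
    return pearson_r(odd_half, even_half)


def cronbach_alpha(rows):
    """Coefficient alpha from item variances and total-score variance."""
    k = len(rows[0])
    item_variances = [statistics.variance(item) for item in zip(*rows)]
    total_variance = statistics.variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: six respondents (rows) answering four 1-5 items.
rows = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [4, 4, 3, 4],
]

half_r = split_half_r(rows)
alpha = cronbach_alpha(rows)
print(round(half_r, 2), round(alpha, 2))
```

Splitting the items differently (for example first half versus second half) would generally give a different split-half coefficient, which is the dependence noted above; alpha does not depend on any particular split.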

29
  • Some multi-item scales include several sets of
    items measuring different dimensions of a
    multidimensional construct. Since these
    dimensions are independent, a measure of internal
    consistency computed across dimensions would be
    inappropriate, so internal consistency
    reliability can be computed for each dimension.
  • Store image is a multidimensional construct that
    includes:
  • quality of goods,
  • variety of goods,
  • returns policy,
  • service,
  • price,
  • location,
  • layout,
  • billing and credit policy.

30
  • Face: Professional agreement that logically it
    appears valid (subjective).
  • Content: Depends on established theories for
    support (objective).
  • Criterion: Does it fit or correlate with other
    similar measures/constructs? Example, body fat:
    caliper, water displacement, electrical
    impedance, BMI.
    Concurrent: two measures, same time.
    Predictive: two measures at different times.
  • Construct: Confirmed with a network of
    hypotheses. Convergent validity (high
    relationship with similar concepts) and divergent
    or discriminant validity (low relationship with
    dissimilar concepts).
31
Face Validity
  • Face validity: Subjective agreement among
    professionals that a scale logically appears to
    accurately measure what it is intended to
    measure. It is the weakest form of validity,
    established without any analysis.
  • Face validity is concerned with how a measure or
    procedure appears. Does it seem like a reasonable
    way to gain the information the researchers are
    attempting to obtain? Does it seem well designed?
    Does it seem as though it will work reliably?
    Unlike content validity, face validity does not
    depend on established theories for support.

32
Content Validity
  • Content validity is based on the extent to
    which a measurement reflects the specific
    intended domain of content.
  • Example: researchers aim to study mathematical
    learning and create a survey to test for
    mathematical skill. If these researchers only
    tested for multiplication and then drew
    conclusions from that survey, their study would
    not show content validity because it excludes
    other mathematical functions.
  • Example: to measure the adequacy of facilities in
    schools:
  • Attractiveness of the school name, frequency of
    old students' meets, and eatables in the canteen
    are not relevant variables.
  • Number of classrooms, number of qualified
    teachers, playground, and library are relevant
    variables.

33
Criterion related Validity
  • Criterion related validity, also referred to as
    instrumental validity, is used to demonstrate the
    accuracy of a measure or procedure by comparing
    it with another measure or procedure which has
    been demonstrated to be valid.
  • For example, imagine a hands-on driving test has
    been shown to be an accurate test of driving
    skills. The written driving test can then be
    validated using a criterion-related strategy: its
    scores are compared with the scores from the
    hands-on driving test.
  • New measure correlates with criterion measure

34
Predictive Validity
  • Predictive validity: A type of criterion validity
    whereby a new measure correlates with a criterion
    measure administered at a later time.
  • In order for a test to be a valid screening
    device for some future behaviour, it must have
    predictive validity. The SAT is used by college
    screening committees as one way to predict
    college grades. The GMAT is used to predict
    success in business. Both uses depend on
    predictive validity.
  • We determine predictive validity by computing a
    correlation coefficient comparing, for example,
    SAT scores (the new measure, or independent
    variable) and college grades (the criterion, or
    dependent variable). If they are directly
    related, then we can make a prediction regarding
    college grades based on SAT score. We can show
    that students who score high on the SAT tend to
    receive high grades in college.

35
Concurrent Validity
  • A type of criterion validity whereby a new
    measure correlates with a criterion measure at
    the same time.
  • A new test of adult intelligence, for example,
    would have concurrent validity if it had a high
    positive correlation with the Wechsler Adult
    Intelligence Scale since the Wechsler is an
    accepted measure of the construct we call
    intelligence. An obvious concern relates to the
    validity of the test against which you are
    comparing your test.

36
Construct Validity
  • Construct validity seeks agreement between a
    theoretical concept and a specific measuring
    device or procedure. For example, a researcher
    inventing a new IQ test might spend a great deal
    of time attempting to "define" intelligence in
    order to reach an acceptable level of construct
    validity.
  • Construct validity can be broken down into two
    sub-categories: convergent validity and
    discriminant validity. Convergent validity is the
    actual general agreement among ratings, where
    measures should be theoretically related.
    Discriminant validity is the lack of a
    relationship among measures which theoretically
    should not be related.

37
  • Example: measuring the tendency to stay in
    low-cost hotels.
  • Four related personality variables: high level of
    self-confidence, low need for status, low need
    for distinctiveness, and high level of
    adaptability.
  • Not related: brand loyalty and high level of
    aggressiveness.
  • The scale can be said to have construct validity
    if it correlates highly with other measures of
    the tendency to stay in low-cost hotels, such as
    reported hotels patronised and social class
    (convergent).
  • It should also have a low correlation with the
    unrelated constructs of brand loyalty and high
    level of aggressiveness (divergent).

38
SENSITIVITY
  • A measurement instrument's ability to accurately
    measure variability in stimuli or responses.
  • Two-category scales such as "yes or no" and
    "agree or disagree" are not very sensitive.
  • Strongly agree, mildly agree, indifferent, mildly
    disagree, and strongly disagree are categories
    whose inclusion increases a scale's sensitivity.

39
Generalizability
  • It is the degree to which a study based on a
    sample applies to a universe of generalization.
  • The universe of generalization includes the set of
    all conditions of measurement: items,
    interviewers, modes of data collection, etc.
  • For example, a researcher may wish to generalize
    a scale developed for personal interviews to
    other modes of data collection, such as mail or
    telephone.
  • Similarly, a researcher may wish to generalize
    from a sample of items to the universe of items.

40
Relevance
  • It represents the appropriateness of using a
    particular scale for measuring a variable.
  • Relevance = Reliability × Validity
  • If either reliability or validity is low, then the
    scale will have little relevance.
  • If a correlation coefficient is used to analyse
    both reliability and validity, then the scale can
    have relevance from 0 to 1.
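
The product rule can be checked with a few lines of Python. This is a minimal sketch: the function name and the sample coefficients are invented, and the check that both inputs lie between 0 and 1 reflects the range stated on the slide rather than a property of correlation coefficients in general.

```python
def relevance(reliability, validity):
    """Relevance as the product of reliability and validity coefficients."""
    for name, value in (("reliability", reliability), ("validity", validity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1, got {value}")
    return reliability * validity


# Hypothetical coefficients for two scales.
print(relevance(0.9, 0.8))  # both fairly high: relevance stays fairly high
print(relevance(0.9, 0.2))  # low validity pulls relevance down sharply
```

Because the two coefficients are multiplied, a weakness in either one cannot be offset by strength in the other.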