Towards a typology of web registers: A multidimensional analysis


1
Towards a typology of web registers: A
multi-dimensional analysis
  • Douglas Biber
  • Northern Arizona University
  • (Collaborating researchers: Jerry Kurjian and
    James K. Jones)

2
Methodology for construction of the web corpus
  • Two Google categories chosen for analysis: Home
    and Science
  • Multiple Google sub-categories under each
    top-level category
  • Home
  • Apartment Living, Consumer Information,
    Cooking, Do-It-Yourself, Domestic Services,
    Emergency Preparation, Entertaining, Family,
    Gardens, Home Automation, Home Business, Home
    Buyers, Home Improvement, Homemaking, Homeowners,
    Moving and Relocating, News and Media, Personal
    Finance, Personal Organization, Pets, Rural
    Living, Seniors, Shopping, Software, Urban Living
  • Science
  • Agriculture, Anomalies and Alternative
    Science, Astronomy, Biology, Chats and Forums,
    Chemistry, Conferences, Earth Sciences,
    Educational Resources, Employment, Environment,
    History of Science, Institutions, Instruments and
    Supplies, Math, Methods and Techniques, Museums,
    News, Philosophy of Science, Physics,
    Publications, Reference, Science in Society,
    Social Sciences, Software, Technology, Women

3
Construction of the web corpus (cont.)
  • Download method
  • For each sub-category, webpages from two websites
    were saved
  • Each website contributed approximately 50
    webpages.
  • Thus, each sub-category contributed approximately
    100 webpages to its corpus.
  • The Science sub-corpus consists of webpages from
    81 websites
  • The Home sub-corpus from 63 websites

4
Construction of the web corpus (cont.)
  • Website selection
  • Used the Google list of linked websites, ordered
    by rank of relevance. Chose the first and last
    website from the list.
  • Sites near the top of the list were nearly always
    large "authoritative" commercial or governmental
    sites; sites toward the end of the list were
    often smaller, more personal sites.
  • Sampling method
  • Automatic browser downloaded c. 200 webpages per
    site.
  • Every 4th webpage was selected.
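
The sampling step above (roughly 200 downloaded pages per site, every 4th page kept) can be sketched in Python as follows. This is only an illustration: the function name `sample_every_nth` and the placeholder page names are hypothetical, and the slides do not say which page in each group of four was kept, so starting with the 4th page is an assumption.

```python
def sample_every_nth(pages, step=4):
    """Keep every `step`-th page from an ordered list of downloaded pages.

    Assumption: selection starts with the `step`-th page (index step - 1);
    the slides only say that every 4th page was selected.
    """
    return pages[step - 1::step]


# Hypothetical stand-ins for the c. 200 pages downloaded from one site.
downloaded = [f"page_{i:03d}.html" for i in range(1, 201)]
sampled = sample_every_nth(downloaded)

# 200 downloaded pages yield 50 sampled pages per site.
print(len(sampled), sampled[0])
```

With two sites per sub-category, this gives the approximately 100 webpages per sub-category reported earlier.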

5
Composition of the corpus of web documents
  • The corpus extracted from the Web

                                Home    Science
    total documents             2426       2678
    documents > 200 words       1765       1905
    unproblematic documents     1400       1576
    (i.e., adjectives)

  • The corpus used for subsequent analyses

               # of documents    # of words      average length of document
    Home       1400              1.68 million    1201 words
    Science    1576              2.06 million    1308 words
    Total      2976 documents    3.74 million words

6
Results of the factor analysis: 4-factor
solution, Promax rotation

                  Factor 1    Factor 2    Factor 3    Factor 4
  • Factor 1 Features
  • Positive
    mentalv        0.52866     0.13219     0.48879     0.15881
    that_del       0.50498    -0.05699     0.36928     0.07394
    pro3           0.49600    -0.08733    -0.00962    -0.07585
    pro1           0.48887    -0.08592     0.34927    -0.03674
    fact_vth       0.48566    -0.04446     0.13415     0.06093
    factadvl       0.45301     0.17506     0.10033     0.01065
    commv          0.43627     0.07929     0.09676     0.25554
    nonf_vth       0.40773     0.02291    -0.04169     0.11524
    perfects       0.39889    -0.06459    -0.14479    -0.05132
    lkly_vth       0.36778     0.03769     0.13411     0.12921
    sub_othr       0.32987     0.22566    -0.01789    -0.12808
    it             0.30174     0.28261     0.01546    -0.08610
    all_nth        0.29096     0.14375    -0.09966     0.19167
  • Negative
    (nouns)       -0.55720    -0.56705     0.05097    -0.00803
7
Table 2. Results of the factor analysis (cont.)

                  Factor 1    Factor 2    Factor 3    Factor 4
  • Factor 3 Features
  • Positive
    pro2          -0.20975     0.23060     0.67108    -0.08567
    vprogrsv       0.10331    -0.03972     0.40924     0.04331
    dsre_vto       0.13164     0.04493     0.40622     0.05607
    groupn         0.05023    -0.17888     0.36500     0.13152
    actv           0.13322     0.16240     0.32304    -0.15965
    wh_cl          0.16565     0.07859     0.28940     0.05465
  • Negative
    prep           0.18466    -0.05765    -0.48977     0.14476
    allpasv       -0.02260     0.22100    -0.45995     0.05585
  • Factor 4 Features
  • Positive
    n_nom         -0.12713    -0.09568     0.09601     0.80425
    abstrctn       0.00280     0.15802     0.08745     0.65036
    wrdlngth      -0.19665    -0.21485    -0.10344     0.66341
    cognitn        0.22940     0.17306    -0.03108     0.41358
8
  • Inter-Factor Correlations

                Factor1     Factor2     Factor3     Factor4
    Factor1     1.00000     0.30424     0.12491    -0.28968
    Factor2     0.30424     1.00000     0.40607    -0.23135
    Factor3     0.12491     0.40607     1.00000    -0.32306
    Factor4    -0.28968    -0.23135    -0.32306     1.00000

9
Table 3. Summary of the factorial structure
  • Dimension 1: Personal, Involved (Stance-focused)
    Narration
  • Features with positive loadings: past tense,
    mental verbs, that-deletions, 3rd person
    pronouns, 1st person pronouns, certainty/mental
    verb that-clause, certainty adverbials,
    communication verbs, communication verb
    that-clause, perfect aspect, likelihood/mental
    verb that-clause, other adverbial clause,
    pronoun it, indefinite pronouns, noun that-clause
  • Features with negative loadings: nouns

10
Table 3. Summary of the factorial structure
(cont.)
  • Dimension 2: Persuasive/argumentative discourse
  • Features with positive loadings: present tense,
    possibility modals, main verb be, predicative
    adjectives, conditional adverbial clauses,
    linking adverbials, necessity modals,
    demonstrative pronouns, prediction modals, split
    auxiliaries
  • Features with negative loadings: nouns, past
    tense

11
Table 3. Summary of the factorial structure
(cont.)
  • Dimension 3: Advice ??
  • Features with positive loadings: 2nd person
    pronouns, progressive verbs, desire verb
    to-clause, group nouns, activity verbs, WH
    clauses
  • Features with negative loadings: prepositions,
    passive verbs

12
Table 3. Summary of the factorial structure
(cont.)
  • Dimension 4: Abstract/technical discourse
  • Features with positive loadings:
    nominalizations, abstract nouns, long words,
    cognitive nouns, topic adjectives, attributive
    adjectives
  • Features with negative loadings: concrete nouns
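
The slides do not show how a document's score on a dimension is computed. A common approach in multidimensional analysis is to standardize each feature's rate across the corpus, then add the z-scores of the positive-loading features and subtract those of the negative-loading features. The sketch below assumes that approach. It uses only a subset of the Dimension 3 features, and the document names and rates are made up for illustration.

```python
from statistics import mean, pstdev

# Subset of the Dimension 3 features listed in Table 3 (Table 2 abbreviations).
POSITIVE = ["pro2", "actv"]      # 2nd person pronouns, activity verbs
NEGATIVE = ["prep", "allpasv"]   # prepositions, passive verbs

# Made-up rates per 1,000 words for three hypothetical documents.
DOCS = {
    "doc_a": {"pro2": 30.0, "actv": 40.0, "prep": 80.0, "allpasv": 5.0},
    "doc_b": {"pro2": 5.0, "actv": 25.0, "prep": 120.0, "allpasv": 15.0},
    "doc_c": {"pro2": 10.0, "actv": 25.0, "prep": 100.0, "allpasv": 10.0},
}


def z_scores(docs, feature):
    """Standardize one feature across all documents (population std. dev.)."""
    values = [rates[feature] for rates in docs.values()]
    mu, sigma = mean(values), pstdev(values)
    return {name: (rates[feature] - mu) / sigma for name, rates in docs.items()}


def dimension_scores(docs, positive, negative):
    """Sum z-scores of positive features and subtract those of negative ones."""
    scores = {name: 0.0 for name in docs}
    for feature in positive:
        for name, z in z_scores(docs, feature).items():
            scores[name] += z
    for feature in negative:
        for name, z in z_scores(docs, feature).items():
            scores[name] -= z
    return scores


scores = dimension_scores(DOCS, POSITIVE, NEGATIVE)
for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:+.2f}")
```

In this toy data, the document with many 2nd person pronouns and few prepositions and passives ranks highest on the dimension.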

13
Distribution of texts from two Google categories
on Dimension 1: Personal Narration

14
Distribution of texts from two Google categories
on Dimension 2: Persuasion

15
Distribution of texts from two Google categories
on Dimension 3: Advice

16
Distribution of texts from two Google categories
on Dimension 4: Technical Discourse

17
Distribution of texts from Google sub-categories
on Dimension 1

18
Duncan's test for Home subcategories: Dimension 1
(Means with the same letter are not
significantly different)

    Duncan Grouping       Mean      N    Category
    A                  -169.61     58    family
    A                  -177.58     62    seniors
    B A                -194.86     64    personalorg
    B A C              -201.70     96    urban
    B D C              -227.00    112    smallbiz
    E D C              -242.62     74    ruralliv
    E D F              -254.00     98    finance
    E D F              -260.13     54    domestic
    E D F              -261.62     57    shopping
    E G D F            -271.36     65    emergency
    E G H F            -280.42     85    pets
    I E G H F          -287.48     37    realest
    I J G H F          -293.83     60    cook
    I J G H F          -297.46     80    entertain
    I J G H F          -297.91     73    consuminfo
    I J G H            -312.85     48    homeowner
    I J H K            -321.90     49    diy
    I J H K            -328.79     16    moving
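
The grouping letters can be read mechanically: two sub-categories differ significantly only if they share no letter. A minimal sketch of that reading rule, using a few rows copied from the table; the function name `significantly_different` is hypothetical.

```python
# Duncan grouping letters for a few Home sub-categories, copied from the table.
GROUPING = {
    "family": "A",
    "seniors": "A",
    "personalorg": "BA",
    "urban": "BAC",
    "smallbiz": "BDC",
    "moving": "IJHK",
}


def significantly_different(cat1, cat2, grouping=GROUPING):
    """Two means differ significantly only if they share no grouping letter."""
    return not (set(grouping[cat1]) & set(grouping[cat2]))


print(significantly_different("family", "urban"))     # both carry the letter A
print(significantly_different("family", "smallbiz"))  # no letter in common
print(significantly_different("family", "moving"))    # no letter in common
```

So family and urban are not significantly different on Dimension 1, while family differs from both smallbiz and moving.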

19
Plot of web documents along Dimensions 1 and 4
  • [Figure: scatter plot of web documents; axis
    label: Dimension 1]

20
Table 4. Summary of the Cluster Analysis

                           Maximum Distance                Distance Between
                           from Seed           Nearest     Cluster
    Cluster    Frequency   to Observation      Cluster     Centroids
    -----------------------------------------------------------------------
    1          428         27.83               4           14.72
    2          490         22.09               3           17.24
    3          599         24.33               2           17.24
    4          503         25.36               1           14.72
    5          620         21.50               1           18.22
    6          244         30.02               3           18.06
    7           21         23.32               5           22.22
    8           71         24.09               6           19.52

21
Cluster means for each dimension

    Cluster    Dim. 1         Dim. 2        Dim. 3    Dim. 4
               Pers. Narr.    Persuasion    Advice    Technical
    -----------------------------------------------------------
    1           -3.84          -7.38         -9.03      -6.72
    2            4.20           5.64         -3.31       7.07
    3            5.44           6.93          5.77      -7.46
    4           -6.48          -4.76          4.29      -1.71
    5           -9.18          -9.56         -8.10      10.54
    6           14.18          17.48         17.49      -6.41
    7           -9.65          -9.44         -9.36      32.72
    8           28.30           7.04         11.51     -12.53
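
Given these cluster means, a document can be assigned to a text type by finding the nearest centroid in the four-dimensional space. The sketch below assumes Euclidean distance, which the slides do not state explicitly; under that assumption the distance between the centroids of clusters 1 and 4 comes out at about 14.71, close to the 14.72 reported in Table 4. The function name and the example scores are hypothetical.

```python
import math

# Cluster means on Dimensions 1-4, copied from the cluster means table.
CENTROIDS = {
    1: (-3.84, -7.38, -9.03, -6.72),
    2: (4.20, 5.64, -3.31, 7.07),
    3: (5.44, 6.93, 5.77, -7.46),
    4: (-6.48, -4.76, 4.29, -1.71),
    5: (-9.18, -9.56, -8.10, 10.54),
    6: (14.18, 17.48, 17.49, -6.41),
    7: (-9.65, -9.44, -9.36, 32.72),
    8: (28.30, 7.04, 11.51, -12.53),
}


def nearest_cluster(scores, centroids=CENTROIDS):
    """Return the cluster whose centroid is closest (Euclidean, an assumption)."""
    return min(centroids, key=lambda c: math.dist(scores, centroids[c]))


# Hypothetical dimension scores for two documents.
print(nearest_cluster((30.0, 5.0, 10.0, -12.0)))   # high on Dimension 1
print(nearest_cluster((-9.0, -9.0, -9.0, 30.0)))   # high on Dimension 4

# Sanity check against Table 4: distance between centroids 1 and 4.
print(round(math.dist(CENTROIDS[1], CENTROIDS[4]), 2))
```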

24
Breakdown of Home and Science Web documents
across the 8 text types

    CLUSTER                        Web Category
                                   Home    Science    Total
    -------------------------------------------------------
    1                               150        278      428
                                   10.7       17.6
    -------------------------------------------------------
    2                               149        341      490
                                   10.6       21.6
    -------------------------------------------------------
    3                               385        214      599
                                   27.5       13.5
    -------------------------------------------------------
    4                               344        159      503
                                   24.5       10.1
    -------------------------------------------------------
    5 Informational discourse       139        481      620
                                    9.9       30.5
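
The second figure in each cell appears to be the percentage of that category's documents falling in the cluster, using the category totals from the corpus composition slide (1400 Home, 1576 Science). The sketch below recomputes those column percentages under that assumption. The names are hypothetical, and one or two recomputed values may differ from the slide by 0.1, presumably because of rounding.

```python
# Document totals per category, from the corpus composition slide.
TOTALS = {"Home": 1400, "Science": 1576}

# Cluster counts copied from the breakdown table (clusters 1-5 only).
COUNTS = {
    1: {"Home": 150, "Science": 278},
    2: {"Home": 149, "Science": 341},
    3: {"Home": 385, "Science": 214},
    4: {"Home": 344, "Science": 159},
    5: {"Home": 139, "Science": 481},
}


def column_percent(count, total):
    """Share of a category's documents that fall in one cluster, in percent."""
    return 100.0 * count / total


for cluster, row in COUNTS.items():
    cells = ", ".join(
        f"{category} {column_percent(n, TOTALS[category]):.1f}%"
        for category, n in row.items()
    )
    print(f"Cluster {cluster}: {cells}")
```

Read this way, about 30.5% of Science documents but only about 9.9% of Home documents belong to cluster 5 (Informational discourse).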

25
Breakdown of selected Web sub-categories across
the 8 text types

    CLUSTER          Web Sub-categories (within Home and Science)
                     altsci  earthsci  family  finance    hist   home-  seniors    tech   Total
                                                                 owner                    (all docs)
    ------------------------------------------------------------------------------------------
    1                     6        13       3        1      38       0        1       4     428
                       11.1      23.2     5.1      1.0    54.3     0.0      1.6     5.3
    ------------------------------------------------------------------------------------------
    2                    15        11       0       20       7       2        6      12     490
                       27.7      19.6     0.0     20.4    10.0     4.1      9.6    16.0
    ------------------------------------------------------------------------------------------
    3                    13         4      16       37       8      25       11       9     599
                       24.0       7.1    27.5     37.7    11.4    52.0     17.7    12.0
    ------------------------------------------------------------------------------------------
    4                     1         8      12       21       8      21       14       7     503
                        1.8      14.2    20.6     21.4    11.4    43.7     22.5     9.3
    ------------------------------------------------------------------------------------------
    5                    10        17       6        5       8       0        4      39     620
    Informational      18.5      30.3    10.3      5.1    11.4     0.0      6.4    52.0

26
Home / Family Web Page
Text Type 5: Informational discourse
  • General Science Information
  • Amino Acids - Symbols, formulas and 3D images.
  • Bird Species - Pictures and scientific names will
    help improve your identification skills.
    Includes herons, sparrows, warblers, woodpeckers,
    owls and more.
  • Chemicool Periodic Table - Search and learn about
    the elements.
  • Entomology for Beginners - the basics of insect
    study.
  • Grasshopper - science links and a list of cool
    museums to visit.
  • Human Anatomy 1994 - These x-rays have labeled
    body parts.
  • K-12 WWW Links - links to sites for answers to
    any science question.
  • Mad Scientist Network,The - answers to science
    questions.
  • Microworlds - This interactive tour uses
    graphics, photos and text to explore the
    structure of materials.
  • SciEd
  • Science Bytes, from UT
  • Science Education Gate-Way - K-12 science
    education resource center for teachers and
    students with learning adventures in Earth and
    Space science from a NASA-sponsored partnership
    of museums, researchers and educators.
  • Science Learning Center - Access to exhibits,
    publications, museums and more.

27
Home / Family Web Page
Text Type 6: Persuasive Advice
  • What is the Mom Team??
  • The Mom Team is an organization that is
    dedicated to assisting, training and supporting
    others who would like to work from home with
    their own business.
  • What kind of business is it?
  • All members of the MOM Team are simply customers
    of a wonderful company where we save time, money,
    provide a safer environment for our homes and
    improve our health.
  • Everyone also has the option to own their own
    business to add to their income, replace an
    income or more depending on their own personal
    goals.
  • It was just announced this morning that for the
    month of June, you can join our awesome group and
    begin living the dream of working from home for
    only ONE DOLLAR!! This is incredible and we
    didn't want you miss the opportunity to take
    advantage of this awesome promotion.
  • How much income can I earn?
  • It's up to you. You can earn a few hundred
    dollars a month or even thousands each month
    depending on you and your own personal goals.
  • Do you have to sell products?
  • No. We don't sell, or stock any products. We
    don't have to deliver anything or collect any
    money.
  • What do I need to be able to run this business
    from my home?
  • You need a computer (or access to one), a
    telephone, and a willingness to become part of
    our team and use our proven system.
  • How much does this cost to get started?
  • You can get started for just 29.00 US.

28
Science / Alternative Science Web Page (Part 1)
Text Type 7: Technical discourse
  • Father Jerome's SPECIALIZED DICTIONARY of
    PSYCHOSOCIOLOGICAL KEYWORDS/PHRASES used in his
    QUALIA III Monograph.
  • This SPECIALIZED DICTIONARY (and the factual
    sociological content of the QUALIA III Monograph)
    is derived from a book by one of the 20th
    Century's greatest and foremost Sociologists, Dr.
    Pierre Bourdieu, of one of France's premiere
    graduate Institutes, the College de France,
    Paris, where Dr. Bourdieu is the President of the
    College of Sociology. His book is Reproduction in
    Education, Society and Culture, where
    'Reproduction' means the reproducing, producing,
    and continuing, of the existing 'status quo', or
    social means, or mechanisms, or, in reality, the
    underlying structure, of any society, by which
    the 'Rulers', the Nobility, the Aristocracy
    (i.e., the Rich), actually 'rule' and control
    that society, and continue their societal rule,
    from generation to generation, by 'training' and
    placing their progeny, their children, in the
    'positions of power' throughout society which
    enable them to inherit not only the 'riches' but
    also the 'power', as passed on to them by that
    ruling class.

29
Science / Alternative Science Web Page (Part 2)
Text Type 7: Technical discourse
  • DEFINITIONS of Keywords/Phrases used in the
    QUALIA III Monograph
  • a full account of the selection process
  • a negatively constructive societal system
  • absolute societal control
  • academia's essential internal function
  • academia's ideological function
  • academic autonomy and class relations
  • academic consecration
  • theodicy
  • theoretical construction
  • title
  • traditional economic conduct
  • tyrannical positivity
  • U.S. statistics
  • ultimate rationale
  • ultimate truth
  • unconscious sanctions anticipation
  • violence

30
Home / Family Web Page
Text Type 8: Personal Narrative
  • Shelters in My Storm
  • by Cyd
  • Shalece
  • The biggest help in my experience of foster
    care was a three year old. Shalece taught me so
    much about how to love truly and without asking
    anything in return. She taught me what it means
    to be family, when from the day I walked into her
    house, I was her big sister. She never let me
    forget that even when I had to leave. To this day
    she is excited when I come to see her. She has
    never let me down. I love and trust her more than
    anyone else. Her parents were also of great help
    to me, but they could never have reached me like
    that tiny little girl with the large heart did
    from day one. Some day when she is old enough to
    understand, I think I will show her this to let
    her know I really feel grateful to her. I think
    people have a hard time seeing that from me.