Title: Computational Models of Discourse Analysis
1Computational Models of Discourse Analysis
- Carolyn Penstein Rosé
- Language Technologies Institute /
Human-Computer Interaction Institute
2Warm-Up Discussion
- What is the distinction between personality,
identity, and perspective?
- Does the distinction matter computationally?
- How do they relate to one another as lenses for
understanding social media data?
- What do we take from today's readings for
assignment 4?
Identity
Personality
Perspective
4Student Comment
- At first the paper did not seem related to our
task of identifying gender but perhaps this paper
shows that the way we see ourselves is extremely
consistent. No matter how you ask the question a
subject will always give you an honest answer as
to how they see themselves. This could mean that
no matter how hard we try we will sooner or later
embed signals into our blog posts that indicate
our perceived gender.
5Student Comment
- It seems that the importance of "spiritual self"
in presentation is the most important takeaway
from this paper. 96% of users attempt to describe
themselves with aspects of their "spiritual self"
(i.e., perceived abilities). So focusing on these
instead of the material or the social might be
better (although, it's possible that a particular
gender uses one of these sub-types significantly
more than another, which could also be handy, but
we don't have that information).
- Is this personality or identity? How would you
expect it to relate to other online behavior?
6Semester Review
7Semester in Review
- In each Unit
- Readings from Discourse Analysis and
Sociolinguistics
- Readings from Language Technologies
- Hands-on assignment
- Implementation and corpus-based experiment
- Competitive error analysis
- Student Presentations
- Unit 1: Theoretical Foundation
- Unit 2: Linguistic Structure
- Unit 3: Sentiment
- Unit 4: Identity and Personality
- Unit 5: Social Positioning
8Building Tasks
- According to Gee's theory, whenever we speak or
write, we are constructing 7 areas of reality
- What we build: Significance, Practices,
Identities, Relationships, Politics, Connections,
Sign systems and knowledge
- How we build them: Social languages, Socially
situated identities, Discourses, Conversations,
Figured worlds, Intertextuality
9What we Build
- Significance: things and people made more or less
significant through the text
- Practices: ritualized activities and how they are
being enacted through the text (for example,
lecturing or mentoring)
- Identities: manner in which things and people are
being cast in a role through the text
- Relationships: style of social relationship, like
level of formality
- Politics: how social goods are being
distributed, who is responsible for the flow,
where is it going
- Connections: connections and disconnections
between things and people, e.g., what ideas are
related, how are things causally connected, what
is affecting what?
- Sign Systems and Knowledge: languages, social
languages, and ways of knowing; what ways of
communicating and knowing are treated as standard
and acceptable in the context, e.g., that you're
expected to speak in English in class
10Imagine an environmentalist commercial
Form-Function Correspondence: Range of meanings
for the word sustainability
Conversation: Global Warming
Discourse: Environmentalism
Discourse: Status Quo
Socially Situated Identity: Environmentalist
Social Language: Liberal rhetoric
Figured World: Expected structure of
Conservationist Commercial
Situated Meaning: Meaning of sustainability in
the commercial
11Computationalizing Gee?
- Challenge: not variationist
- Form-function correspondences can be modeled
naturally through rules
- Cells of table like feature extractors?
- Social Languages like topic models?
- Figured worlds related to social causality
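The suggestion that form-function correspondences can be modeled through rules, with each cell of the table acting like a feature extractor, could be sketched roughly as follows. This is only an illustration, not an implementation from the course or from Gee: the rule patterns, the function labels, and the name `extract_functions` are all hypothetical.

```python
import re

# Hypothetical rules: each pairs a surface form (a regex) with a candidate
# communicative function. The patterns and labels are illustrative only.
RULES = [
    (re.compile(r"\b(must|should|have to)\b", re.I), "obligation"),
    (re.compile(r"\b(maybe|perhaps|might)\b", re.I), "hedge"),
    (re.compile(r"\bsustainab\w+", re.I), "sustainability_mention"),
]


def extract_functions(text):
    """Return a dict of binary features, one for each rule that fires."""
    return {label: 1 for pattern, label in RULES if pattern.search(text)}


features = extract_functions("Perhaps we should invest in sustainable energy.")
print(sorted(features))
```

A rule set like this captures only the range of possible functions for a form; picking the situated meaning in a given context would still need something beyond the rules.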
13Metafunctions
14What is a system?
15Computationalizing SFL?
- See Elijah's ACL paper!
- We had to REALLY simplify to get there
- Not clear how to do that for Heteroglossia yet
16Computational Techniques
- Text entailment / similarity measures / paraphrase /
constraint relaxation
- Topic models
- Machine Learning
- Techniques: bootstrapping, HMMs, other
statistical modeling techniques
- Basic features: unigrams, bigrams, POS bigrams,
acoustic and prosodic features (speech)
- Created features: dictionaries, templates,
syntactic dependency relations
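As a concrete picture of the "basic features" above, the following minimal sketch counts unigrams and bigrams with the Python standard library. It assumes a naive whitespace tokenizer, and the names `tokenize` and `ngram_features` are hypothetical; POS bigrams would need a part-of-speech tagger, which is not shown here.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def ngram_features(text, n_values=(1, 2)):
    """Count n-grams for each n in n_values, joining tokens with '_'."""
    tokens = tokenize(text)
    counts = Counter()
    for n in n_values:
        for i in range(len(tokens) - n + 1):
            counts["_".join(tokens[i:i + n])] += 1
    return counts


feats = ngram_features("go team go team yay")
print(feats["go"], feats["go_team"], feats["team_yay"])
```

Keeping unigrams and bigrams in one counter is a simplification; a token that already contains an underscore would collide with a bigram key.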
17Basic Aspects of Discourse Structure are Easiest
to Model
- Turn taking
- Topic segments
- Speech acts (at least direct ones)
- More recent computational work focuses on more
challenging discoursey problems like sentiment
and stance
- Some recent work on metaphors (related to
frames), but not applied to discourse-level
problems
18Problems
- Labels in public datasets don't necessarily match
the theory
- Computational approaches embody variationist
assumptions, but much of the theory is grounded
in a more contextualized view of meaning making
- Lack of a fully satisfying operationalization of
style (style is hard to separate from content)
- Grammatical metaphor and other indirect
strategies
- Same effect can be achieved in so many ways:
each technique only captures one slice, so
you're always just grasping a glimpse of what's
there
- Overfitting: spurious correlations
- Subpopulations leading to problems with
generalization
- Similar variation arising due to numerous
different factors (gender, age, SES)
- Features at too low a level: words serving
multiple purposes simultaneously
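One common way to probe the overfitting and subpopulation problems listed above is to hold out an entire subpopulation at test time instead of sampling test instances at random. The sketch below is an assumption about how such a split could be set up, not a procedure from the course; the function name `leave_one_group_out` and the toy author labels are hypothetical.

```python
def leave_one_group_out(groups):
    """Yield (held_out, train_idx, test_idx), holding out one whole group per fold."""
    for held_out in sorted(set(groups)):
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        test_idx = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train_idx, test_idx


# Hypothetical example: each post is labeled with the author who wrote it.
authors = ["a", "a", "b", "c", "b"]
folds = list(leave_one_group_out(authors))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

If performance drops sharply on held-out groups, that may suggest the model has picked up group-specific cues instead of the variable of interest.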
19Engagement and Personality?
20How would you expect an Engagement style analysis
to relate to personality?
- What effect would you expect to see on
conversations?
- Are these necessarily connected?
21Freshman Engineering Study
- 131 freshman engineering students worked in
groups of 3 or 4 to design a better wrench
- Applying principles related to stress and
leverage
- Procedure
- Tutorial on computer-aided engineering
- Pretest
- Collaborative design activity
- Posttest
- Questionnaire
22Heteroglossia Manipulation
23Social Manipulation
24Tutor Agent Design
Tutor: One last thing on this topic, Does more (or less) stress in a wrench make it easier to use?
S95: no change?
Tutor: You are correct about that.. Stress doesn't determine ease of use.
Tutor: It's the moment achieved by the wrench that determines the ease of use.
S89: yay!
Tutor: It's good to have your contributions in the discussion Jackie :-)
Tutor: Go team :-)
S89: Go team yay
Tutor: I am happy to be working with our team
S89: Me too
S95: whoa the bot knows my name
Tutor: Unfortunately maximum stress (12800) in our design1 is way above the maximum allowed stress (i.e. 8750)
Tutor: This wrench cannot be safely used!
Kumar, R., & Rosé, C. P. (2011). Architecture for
building Conversational Agents that support
Collaborative Learning. IEEE Transactions on
Learning Technologies, special issue on
Intelligent and Innovative Support Systems for
Computer Supported Collaborative Learning.
25Results on Breadth of Coverage of Design Space
- Significant main effect of Heteroglossia on
number of ideas mentioned
- Heteroglossia was better than Monoglossia and
Neutral
- Significant interaction
- In the Social condition, Monoglossia was worse
than the other two
26Results on Perception
- Students were significantly happier with the
interaction in the Heteroglossia condition than
Neutral, with Monoglossia in the middle
- Students liked the Heteroglossic and Monoglossic
agents better than the Neutral agent
- Students in the Heteroglossia condition felt
marginally more successful than students in the
Monoglossia condition
- No effect on Personality indicators such as
Pushy, Wishy Washy, etc.
- Does that mean that impression of personality and
how you feel about an interaction with someone
are not linked?
27Student Comment
- I would also note that English is a very gender
neutral language, so gender performativity is
harder to classify.
28Engagement
- Already established: Positioning a proposition
- But can it also be primarily positioning between
people?
- Patterns of positioning propositions as having
the same or different alignment between speaker
and hearer could do this
- Is positioning in communication always
positioning by means of propositional content?
29Connection between Heteroglossia and Attitude
But is this really different from a disclaim?
And is this really different from a proclaim?
30Hedging and Occupation?
- And as such, I believe hedging is a much more
effective tool in showing generational or
occupational differences rather than gender
differences.
- For example, teenagers often use verbs such as
'like' and 'all' to report speech: he was all
'that's stupid' and then he was like 'but I'm
stupid too'. The occupational differences I would
attribute to the differences between people who
need exact values as opposed to people who can
accept generalizations or approximations.
32Questions?