Title: Analyzing Chat Dialogue with TagHelper Tools
1. Analyzing Chat Dialogue with TagHelper Tools
- Catherine Chase
- Stanford University
- PSLC Summer Institute
- June 22, 2007
2. Study Design (Kumar et al., 2007)
- 16 dyads of 6th grade students
- Math Talk tutor
- built with CTAT and TuTalk
- Two conditions
- Math Talk with social dialogue enhancers
- Math Talk alone
- Dependent measure analyzed
- Chat dialogue from collaborative Math Talk sessions
Kumar, R., Gweon, G., Joshi, M., Cui, Y., & Rosé, C. (2007). Supporting Students Working Together on Math with Social Dialogue.
3. Math Talk
4. Social Dialogue Agent
- Tutor asks Student 1: "Do you like tacos or hamburgers?"
- Tutor asks Student 2: "Do you like to hang out on the weekends or after school?"
- Tutor uses students' responses to generate a story problem
- "Alice thought she bought enough tacos for 11 of her friends one day on the weekend. But 8 of her friends finished 2/3 of the food. How many times more food should she have bought?"
5. Research Questions
- Do social dialogue agents affect students' sense of belonging to a group?
- Do social dialogue agents affect the attributions students make towards solutions to a math problem?
- Do social dialogue agents impact student affect?
6. The Process
- Data clean-up in Excel
- Developed and applied coding schemes
- Created coding models using TagHelper Tools
- Evaluated model performance
- Debugged model
- Altered features in model creation
- Coded more data or different types of data
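The "evaluated model performance" step can be illustrated with a small agreement calculation between human codes and model codes. The slides report reliability figures without naming the statistic, so the use of percent agreement and Cohen's kappa here is an assumption, and the function names and sample labels are hypothetical.

```python
from collections import Counter


def percent_agreement(human, model):
    """Fraction of items on which the human code and the model code match."""
    if len(human) != len(model) or not human:
        raise ValueError("need two non-empty code lists of equal length")
    matches = sum(h == m for h, m in zip(human, model))
    return matches / len(human)


def cohens_kappa(human, model):
    """Agreement corrected for the agreement expected by chance."""
    n = len(human)
    observed = percent_agreement(human, model)
    human_freq = Counter(human)
    model_freq = Counter(model)
    labels = set(human) | set(model)
    # Chance agreement: product of the two coders' marginal proportions per label.
    expected = sum(human_freq[c] * model_freq[c] for c in labels) / (n * n)
    if expected == 1.0:
        # Both coders used one identical label throughout; kappa is undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes for six chat turns (not data from the study).
human_codes = ["group", "self", "partner", "none", "none", "none"]
model_codes = ["group", "self", "none", "none", "none", "none"]
print(percent_agreement(human_codes, model_codes))
print(cohens_kappa(human_codes, model_codes))
```

Kappa falls below raw agreement whenever one code dominates the data, which is one reason a scheme with few positive instances can look worse than its raw accuracy suggests.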
7. Coding Scheme: Sense of Belonging
- Trained a model to code for group-, self-, and partner-referenced language using only simple user-defined rules
- Group: we, us, our, ours, ourselves
- Self: I, I am, I'm, Im, my, mine, me
- Partner: you, u, your, yours, you're
- Using a decision tree algorithm, the model achieved reliability of 94
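As a rough illustration of the keyword rules above, the sketch below counts matches per category and picks the category with the most hits. This is not TagHelper's implementation: the slide reports a decision tree trained on the rule features, while this sketch swaps in a simple most-hits rule. The keyword lists follow the slide (with apostrophes restored as an assumption), and the function names and the "none" fallback are hypothetical.

```python
import re

# Keyword lists follow the slide; the multiword entry "I am" is already
# covered by "i", so only single-word keywords are matched here.
RULES = {
    "group": ["we", "us", "our", "ours", "ourselves"],
    "self": ["i", "i'm", "im", "my", "mine", "me"],
    "partner": ["you", "u", "your", "yours", "you're"],
}


def tokenize(text):
    """Lowercase the turn and keep letters and apostrophes, so "I'm" is one token."""
    return re.findall(r"[a-z']+", text.lower())


def rule_features(text):
    """Count how many tokens in the turn match each category's keyword list."""
    tokens = tokenize(text)
    return {
        label: sum(1 for t in tokens if t in words)
        for label, words in RULES.items()
    }


def code_turn(text):
    """Assign the category with the most keyword hits, or "none" if nothing matches.

    Ties go to the first category in RULES order (an arbitrary assumption).
    """
    counts = rule_features(text)
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "none"


for turn in ["we did it!", "I figured it out", "u multiplied by the wrong number"]:
    print(turn, "->", code_turn(turn))
```

In a setup closer to the one described, the counts from `rule_features` would be passed to a trained classifier as features instead of being compared directly.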
8. Coding Scheme: Attributions
- Trained a model to code for attributions to any problem-solving step using a set of pre-coded data
- Group: "we did it!"
- Self: "I figured it out"
- Partner: "u multiplied by the wrong number"
- Only achieved reliability of 34
- Fewer instances of attributions in the data
- More ambiguous language made it more difficult for the program to find meaningful patterns
- Is "nice work!" an attribution to the partner?
9. Coding Scheme: Affect, Part 1
- Coded for specific types of affective states or behaviors
- Insult: "you are stoopid", "FOOL"
- Annoyance: "This is annoying"
- Boredom: "I am bored"
- Achieved low reliability level of 39
- The model used meaningless keywords like "it" and "I" to classify utterances into certain categories
- Sample of coded data for model training was not representative of the full distribution of examples within each category
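One way to see which words a word-based model has available for each code, and to keep uninformative words such as "it" and "I" out of the feature set, is to tally content words per category. The sketch below is only an illustration: the stop list is hypothetical apart from the two words named on the slide, and the function names are assumptions, not TagHelper options.

```python
import re
from collections import Counter

# "it" and "i" come from the slide; the remaining entries are assumed.
STOP_WORDS = {"it", "i", "the", "a", "is", "this", "am", "are"}


def content_words(text):
    """Tokenize a chat turn and drop words on the stop list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def word_counts_by_code(coded_examples):
    """Tally the remaining words separately for each code."""
    counts = {}
    for text, code in coded_examples:
        counts.setdefault(code, Counter()).update(content_words(text))
    return counts


# Example turns quoted on the slide, paired with their codes.
examples = [
    ("you are stoopid", "insult"),
    ("FOOL", "insult"),
    ("This is annoying", "annoyance"),
    ("I am bored", "boredom"),
]
for code, words in word_counts_by_code(examples).items():
    print(code, dict(words))
```

A stop list should be chosen per coding scheme: "I" is noise for affect codes but is the central cue for self-referenced language in the sense of belonging scheme.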
10. Coding Scheme: Affect, Part 2
- Coded broader categories of positive and negative emotion
- Positive: "sweet", "awesome!", "yeah!"
- Negative: "this is annoying", "you are stoopid", "stop it"
- Achieved high reliability level of 69; however, data coded by the trained model contained many incorrectly classified cases
- Coded as Negative by the model: "I know that"
- Coded as Positive by the model: "it should be 12!"
- Coded data for training was not representative of all instances of positive/negative affect
11. Next Steps
- Conduct statistical analyses on sense of belonging data
- Continue to debug training models for attribution and affect coding schemes
- Code more data
- Code more varied examples within data
- Test out different algorithm types and user-defined features
- Examine erroneous coding to determine problems with the model
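Examining erroneous coding can start with a list of the turns where the model and the human coder disagree, grouped by the kind of confusion. The following is a minimal sketch under stated assumptions: the record layout (text, human code, model code) and function names are hypothetical, and the human code "none" for the two misclassified turns is assumed, since the slides give only the model's codes for them.

```python
from collections import Counter


def confusion_counts(records):
    """Count each (human code, model code) pair across all records."""
    return Counter((human, model) for _, human, model in records)


def misclassified(records):
    """Return the records where the model's code differs from the human's."""
    return [(text, human, model) for text, human, model in records if human != model]


# Example turns from the slides; the human codes for the last two are assumed.
records = [
    ("sweet", "positive", "positive"),
    ("this is annoying", "negative", "negative"),
    ("I know that", "none", "negative"),
    ("it should be 12!", "none", "positive"),
]

for text, human, model in misclassified(records):
    print(f"human={human} model={model}: {text}")
print(confusion_counts(records))
```

Reading the disagreements by pair makes it easier to tell whether errors cluster in one category, which suggests coding more varied examples for that category, or are spread evenly, which points to the features or the algorithm.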
12. Lessons Learned
- Both typical and atypical coded examples are necessary to train an accurate model
- Experimentation with various features and algorithms often leads to successful models
- TagHelper is particularly useful for
- large data sets
- training a model to analyze multiple data sets
13. Special thanks to
- Carolyn Rosé
- Rohit Kumar
- Pittsburgh Science of Learning Center