Learning TFC Meeting, SRI March 2005: On the Collective Classification of Email

Slides: 17
Provided by: vit3
Learn more at: http://www.cs.cmu.edu
Transcript and Presenter's Notes

1
Learning TFC Meeting, SRI, March 2005: On the
Collective Classification of Email Speech Acts
Vitor R. Carvalho, William W. Cohen
Carnegie Mellon University
2
Classifying Email into Acts
  • From EMNLP-04: "Learning to Classify Email into
    Speech Acts" (Cohen, Carvalho, Mitchell)
  • An act is described as a verb-noun pair (e.g.,
    propose meeting, request information). Not all
    pairs make sense. A single email message may
    contain multiple acts.
  • The taxonomy tries to describe commonly observed
    behaviors, rather than all possible speech acts
    in English. It also includes non-linguistic uses
    of email (e.g., delivery of files).

[Figure: diagram of act Verbs and Nouns]
3
Idea: Predicting Acts from Surrounding Acts
  • Strong correlation with the acts of the previous
    and next messages
  • An act has little or no correlation with other
    acts of the same message

[Figure: Example of Email Sequence. Messages linked by <<In-ReplyTo>> and labeled with acts (Delivery, Request, Proposal, Commit).]
4
Related work on the Sequential Nature of
Negotiations
  • Winograd and Flores, 1986: Conversation for
    Action Structure
  • Murakoshi et al., 1999: Construction of
    Deliberation Structure in Email

5
Data: CSPACE Corpus
  • Few large, free, natural email corpora are
    available
  • CSPACE corpus (Kraut & Fussell)
  • Emails associated with a semester-long project
    for Carnegie Mellon MBA students in 1997
  • 15,000 messages from 277 students, divided into
    50 teams (4 to 6 students per team)
  • Rich in task negotiation
  • More than 1500 messages (from 4 teams) were
    labeled in terms of speech acts
  • One of the teams was double labeled, and the
    inter-annotator agreement ranges from 72% to 83%
    (Kappa) for the most frequent acts
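
Kappa is used throughout these slides as the agreement measure. As a reminder of how it is computed, here is a minimal sketch of Cohen's kappa for two annotators labeling one act (1 = act present, 0 = absent). The function name and the toy label lists are illustrative assumptions, not data from the CSPACE corpus.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items labeled identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1.0 - expected)


# Toy labels (hypothetical): does each of 8 messages contain a Request?
annotator_1 = [1, 1, 0, 0, 1, 0, 1, 0]
annotator_2 = [1, 1, 0, 0, 1, 0, 0, 1]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Here the annotators agree on 6 of 8 messages (0.75) while chance agreement is 0.5, so kappa is 0.5, which would be reported as 50% in the percentage convention.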

6
Evidence of Sequential Correlation of Acts
  • Transition diagram for most common verbs from
    CSPACE corpus
  • It is NOT a Probabilistic DFA
  • Act sequence patterns: (Request, Deliver),
    (Propose, Commit, Deliver), (Propose, Deliver);
    the most common act was Deliver
  • Less regularity than expected (considering
    previous deterministic negotiation state diagrams)
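
A transition diagram of this kind can be estimated by counting which act follows which in reply chains. The sketch below is a simplification that assumes one act per message (the slides note a message may carry several); the function name is hypothetical and the toy threads are just the three patterns listed above, not corpus counts.

```python
from collections import Counter, defaultdict


def transition_frequencies(threads):
    """Relative frequency of each next act given the previous act."""
    counts = defaultdict(Counter)
    for thread in threads:
        # Pair every message's act with the act of the message that follows.
        for prev_act, next_act in zip(thread, thread[1:]):
            counts[prev_act][next_act] += 1
    freqs = {}
    for prev_act, nexts in counts.items():
        total = sum(nexts.values())
        freqs[prev_act] = {act: c / total for act, c in nexts.items()}
    return freqs


# Toy threads: the act sequence patterns named on the slide.
toy_threads = [
    ["Request", "Deliver"],
    ["Propose", "Commit", "Deliver"],
    ["Propose", "Deliver"],
]
freqs = transition_frequencies(toy_threads)
```

Because a given act can be followed by several different acts with different frequencies, the result is a table of empirical frequencies rather than a deterministic state machine.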

7
Content versus Context
  • Content: bag-of-words features only
  • Context: parent and child features only (table
    below)
  • 8 MaxEnt classifiers, trained on the 3F2 and
    tested on the 1F3 team dataset
  • Only the 1st child message was considered (vast
    majority, more than 95%)

[Figure: a message with unknown acts (???) shown with its parent message and its child message; act labels in the figure: Request, Proposal, Delivery, Commit]
Set of Context Features (Relational)
Parent Boolean Features: Parent_Request, Parent_Deliver, Parent_Commit, Parent_Propose, Parent_Directive, Parent_Commissive, Parent_Meeting, Parent_dData
Child Boolean Features: Child_Request, Child_Deliver, Child_Commit, Child_Propose, Child_Directive, Child_Commissive, Child_Meeting, Child_dData
[Figure: Kappa values on 1F3 using relational (context) features and textual (content) features]
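
The 16 context features listed on this slide could be built from the acts of a message's parent and first child roughly as follows. This is a minimal sketch: the feature names come from the slide, while the function name and the use of Python sets to hold acts are assumptions.

```python
# Act names taken from the feature names on the slide.
ACTS = ["Request", "Deliver", "Commit", "Propose",
        "Directive", "Commissive", "Meeting", "dData"]


def context_features(parent_acts, child_acts):
    """Boolean Parent_* and Child_* features for one message."""
    features = {}
    for act in ACTS:
        features["Parent_" + act] = act in parent_acts
        features["Child_" + act] = act in child_acts
    return features


# A message whose parent is a Request and whose child delivers data.
feats = context_features(parent_acts={"Request"},
                         child_acts={"Deliver", "dData"})
```

A message without a parent or child can be represented by passing an empty set, which leaves all eight features for that side false.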
8
Collective Classification using Dependency
Networks
  • Dependency networks are probabilistic graphical
    models in which the full joint distribution of
    the network is approximated with a set of
    conditional distributions that can be learned
    independently. The conditional probability
    distributions in a DN are calculated for each
    node given its neighboring nodes (its Markov
    blanket).
  • No acyclicity constraint. Simple parameter
    estimation; approximate inference (Gibbs
    sampling)
  • In this case, the Markov blanket of a message is
    its parent message and its child message
  • Heckerman et al., JMLR-2000. Neville & Jensen,
    KDD-MRDM-2003.

9
Collective Classification algorithm (based on
Dependency Networks Model)
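
The algorithm figure for this slide is not part of the transcript. As a rough illustration of the general idea only, the sketch below initializes every message from content alone and then repeatedly re-predicts each message using the current predictions for its parent and child. It uses a deterministic update instead of Gibbs sampling, and the function names, data layout, and toy rule-based predictors are all assumptions rather than the authors' procedure.

```python
def collective_classify(messages, content_predict, context_predict,
                        n_iterations=10):
    """Iteratively relabel messages using their neighbors' current labels.

    messages: dict id -> {"parent": id or None, "child": id or None,
                          "text": str}
    content_predict(text) -> set of acts
    context_predict(text, parent_acts, child_acts) -> set of acts
    """
    # Step 1: initialize every message from its content only.
    labels = {mid: content_predict(m["text"]) for mid, m in messages.items()}
    # Step 2: re-predict with neighbor labels until nothing changes.
    for _ in range(n_iterations):
        changed = False
        for mid, m in messages.items():
            parent_acts = labels.get(m["parent"], set())
            child_acts = labels.get(m["child"], set())
            new_acts = context_predict(m["text"], parent_acts, child_acts)
            if new_acts != labels[mid]:
                labels[mid] = new_acts
                changed = True
        if not changed:
            break
    return labels


# Hypothetical stand-ins for trained classifiers.
def toy_content(text):
    return {"Request"} if "?" in text else set()


def toy_context(text, parent_acts, child_acts):
    acts = toy_content(text)
    if "Request" in parent_acts:
        acts = acts | {"Deliver"}  # a reply to a Request is likely a Deliver
    return acts


toy_messages = {
    "m1": {"parent": None, "child": "m2", "text": "Can you send the report?"},
    "m2": {"parent": "m1", "child": None, "text": "Here it is."},
}
result = collective_classify(toy_messages, toy_content, toy_context)
```

In this toy thread the reply carries no content cue, so it only receives the Deliver label once the Request prediction of its parent is taken into account.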
10
Agreement versus Iteration
  • Kappa versus iteration on 1F3 team dataset, using
    classifiers trained on 3F2 team data.

11
Leave-one-team-out Experiments
Kappa Values
  • 4 teams: 1f3 (170 msgs), 2f2 (137 msgs), 3f2 (249
    msgs), and 4f4 (165 msgs)
  • (x-axis) Bag-of-words only
  • (y-axis) Collective classification results
  • Different teams present different styles for
    negotiations and task delegation.
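
The leave-one-team-out protocol holds out one team for testing and trains on the remaining three, once per team. A minimal sketch of the split, using the team names and message counts from the slide (the function and variable names are assumptions):

```python
# Team names and labeled message counts as given on the slide.
TEAM_SIZES = {"1f3": 170, "2f2": 137, "3f2": 249, "4f4": 165}


def leave_one_team_out(teams):
    """Yield (training teams, held-out team), one pair per team."""
    teams = list(teams)
    for held_out in teams:
        train = [t for t in teams if t != held_out]
        yield train, held_out


splits = list(leave_one_team_out(TEAM_SIZES))
# Number of training messages available for each held-out team.
train_sizes = {test: sum(TEAM_SIZES[t] for t in train)
               for train, test in splits}
```

Holding out whole teams, rather than random messages, keeps each team's particular negotiation style out of the training data used for its own evaluation.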

12
Leave-one-team-out Experiments
Kappa Values
  • Consistent improvement of Commissive, Commit and
    Meet acts

13
Leave-one-team-out Experiments
  • Deliver and dData performance usually decreases
  • Associated with data distribution, FYI, file
    sharing, etc.
  • For non-delivery acts, the improvement in average
    Kappa is statistically significant (p=0.01 on a
    two-tailed t-test)

Kappa Values
14
Act by Act Comparative Results
Kappa values with and without collective
classification, averaged over the four test sets
in the leave-one-team-out experiment.
15
Discussion and Conclusion
  • Sequential patterns of email acts were observed
    in the CSPACE corpus.
  • These patterns, when studied in an artificial
    experiment, were shown to contain valuable
    information for the email-act classification
    problem.
  • Different teams present different styles for
    negotiations and task delegation.
  • We proposed a collective classification scheme
    for the email speech acts of messages (based on
    a Dependency Network model).

16
Conclusion
  • Modest improvements over the baseline (bag of
    words) were observed on acts related to
    negotiation (Request, Commit, Propose, Meet,
    etc.). A performance deterioration was observed
    for Delivery/dData (acts less associated with
    negotiations).
  • This agrees with the general intuition on the
    sequential nature of negotiation steps.
  • The degree of linkage in our dataset is small,
    which makes the observed results encouraging.