Summarization and Visualization of Digital Conversations
1. Summarization and Visualization of Digital Conversations
Vincenzo Pallotta
Joint work with
Rodolfo Delmonte, University of Venice, Italy
Marita Ailomaa, EPFL, Switzerland
2. Digital Conversations
• The Web
– Social Media
– Forums
– Blogs
• Meetings
• VoIP
• Call centers
• Help Desk
SPIM 2010 - Malta 2
8. 1st Hypothesis…
V. Pallotta, Content-based retrieval of distributed multimedia conversational data. In E.
Vargiu, A. Soro, G. Armano, G. Paddeu (eds.) Information Retrieval and Mining in
Distributed Environments, Springer Verlag, series: Studies in Computational Intelligence
(ISSN: 1860-949X), to appear, 2010.
9. Challenges for (spoken) conversation processing
• dealing with multiple speakers
• dealing with foreign language and associated
accents
• incorporating non-speech audio dialogue acts
– (e.g., clapping, laughter, silence?)
• conversational segmentation and summarization
• discourse analysis, such as:
– analyzing speaking rates
– turn taking (frequency, durations)
– concurrence/disagreement
• which often provides insights into:
– speakers' emotional state
– attitudes toward topics and other speakers
– roles/relationships
M. Maybury: Keynote at the SIGIR 2007 Workshop
Searching Spontaneous Conversational Speech
12. What type of content are users looking for in conversations?
• Users look for argumentative information
– Decision Making
– Conflict Resolution
• Information Retrieval is not sufficient
– Need for more context
– Answers not found in words spoken
[Chart: IM2 set and MS set by content type: Factual, Thematic, Process, Outcome]
[Chart: argumentative queries in the IM2 set and MS set: IR sufficient, IR irrelevant, IR insufficient]
Pallotta, Seretan, Ailomaa ACL 2007
18. Two reviews from ACL…
• "The idea of using argument structure
annotation to aid dialogue summarization
is very promising. For an abstractive
summary of dialogues this seems almost
like an inevitable step and I am always
glad to see people take on the hard task
of abstractive summarization."
• "I think the general approach of
detecting the argumentative structure is
the correct one to take and the authors
are laying groundwork for a solid
abstractive system."
19. Our Approach…
• Topic Segmentation
• Recognition of argumentative episodes:
– Based on the GETARUNS system
• Automatic recognition of argumentative
structure:
– Novel discourse parsing algorithm
• Retrieval through:
– Question Answering
– Abstractive summaries
– Visualization of arguments
20. Meeting Description Schema
DISCUSS(issue) <- PROPOSE(alternative)
1702.95 David: so - so my question is should we go ahead and get na- -
nine identical head mounted crown mikes ? {qy} 61a
REJECT(alternative)
1708.89 John: not before having one come here and have
some people try it out . {s^arp^co} 61b.62a
PROVIDE(justification)
1714.09 John: because there's no point in doing that if it's not going to be any better . {s} 61b+
ACCEPT(justification)
1712.69 David: okay . {s^bk} 62b
PROPOSE(alternative)
1716.85 John: so why don't we get one of these with the crown with a different headset ? {qw^cs} 63a
PROVIDE(justification)
1722.4 John: and - and see if that works . {s^cs} 63a+.64a
1723.53 Mark: and see if it's preferable and if it is then we'll get more . {s^cs^2} 64b
1725.47 Mark: comfort . {s}
ACCEPT(alternative)
1721.56 David: yeah . {s^bk} 63b
1726.05 Lucy: yeah . {b}
1727.34 John: yeah . {b}
Why was David’s proposal on microphones rejected?
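The question above can be answered by walking the argumentative links rather than by keyword search. The following is a minimal sketch under stated assumptions: the `Act` class, the `target` field, and `why_rejected` are hypothetical names introduced for illustration, the links between acts are entered by hand, and the garbled justification line is read as "not going to be any better". It is not the system's actual implementation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Act:
    label: str                    # argumentative label as shown on the slide
    speaker: str
    text: str                     # utterance with disfluencies and tags removed
    target: Optional[int] = None  # index of the act this one responds to (assumed link)


# First part of the episode from the slide, linked by hand for illustration.
episode = [
    Act("PROPOSE(alternative)", "David",
        "should we go ahead and get nine identical head mounted crown mikes"),
    Act("REJECT(alternative)", "John",
        "not before having one come here and have some people try it out", target=0),
    Act("PROVIDE(justification)", "John",
        "because there's no point in doing that if it's not going to be any better",
        target=1),
    Act("ACCEPT(justification)", "David", "okay", target=2),
]


def why_rejected(acts: List[Act], proposer: str) -> Optional[Tuple[str, List[str]]]:
    """Return (rejecter, reasons) for the first rejected proposal by `proposer`."""
    for i, act in enumerate(acts):
        if not (act.label.startswith("PROPOSE") and act.speaker == proposer):
            continue
        for j, reply in enumerate(acts):
            if reply.label.startswith("REJECT") and reply.target == i:
                # The rejection itself plus any justification attached to it.
                reasons = [reply.text]
                reasons += [a.text for a in acts
                            if a.label.startswith("PROVIDE") and a.target == j]
                return reply.speaker, reasons
    return None


answer = why_rejected(episode, "David")
if answer is not None:
    rejecter, reasons = answer
    print(f"{rejecter} rejected it: " + "; ".join(reasons))
```

The answer comes from the REJECT and PROVIDE(justification) acts linked to the proposal, which is the kind of context that plain term matching on the question would not recover.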
21. Abstractive Summary
[Annotated transcript: same episode as on slide 20]
• David's proposal was: “go ahead and get nine identical head mounted crown mikes”.
• David's proposal was rejected.
• John provided an alternative: “get one of these with crown with a different headset”. John's proposal was accepted by the majority of participants.
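A summary of this kind could be produced by filling sentence templates from the labeled acts. The sketch below is illustrative only: the tuple layout, the `summarize` function, and the majority rule (more than half of the participants other than the proposer accept) are assumptions, not the generation method used by the authors.

```python
# Each act: (label, speaker, quoted text, index of the act it responds to or None).
# Labels and speakers follow the slide; the response links are entered by hand.
acts = [
    ("PROPOSE", "David", "go ahead and get nine identical head mounted crown mikes", None),
    ("REJECT", "John", "", 0),
    ("PROPOSE", "John", "get one of these with crown with a different headset", None),
    ("ACCEPT", "David", "", 2),
    ("ACCEPT", "Lucy", "", 2),
    ("ACCEPT", "John", "", 2),
]
participants = {"David", "John", "Mark", "Lucy"}


def summarize(acts, participants):
    """Fill fixed sentence templates from PROPOSE / REJECT / ACCEPT acts."""
    lines = []
    for i, (label, speaker, text, _) in enumerate(acts):
        if label != "PROPOSE":
            continue
        lines.append(f"{speaker}'s proposal was: \"{text}\".")
        rejected = any(l == "REJECT" and tgt == i for l, _, _, tgt in acts)
        # Distinct accepting speakers, not counting the proposer (assumed rule).
        accepters = {s for l, s, _, tgt in acts
                     if l == "ACCEPT" and tgt == i and s != speaker}
        others = participants - {speaker}
        if rejected:
            lines.append(f"{speaker}'s proposal was rejected.")
        elif 2 * len(accepters) > len(others):
            lines.append(f"{speaker}'s proposal was accepted by the majority of participants.")
    return lines


summary = summarize(acts, participants)
for line in summary:
    print(line)
```

Because the sentences are generated from the argumentative structure rather than extracted from the transcript, the result is abstractive: none of the output sentences appears verbatim in the conversation.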
22. Argumentative Labeling with GETARUNS
• Primitive Discourse Relations labels:
– statement, narration, adverse, result,
cause, motivation, explanation, question,
hypothesis, elaboration, permission,
inception, circumstance, obligation,
evaluation, agreement, contrast, evidence,
hypoth, setting, prohibition.
• Mapped into Argumentative labels:
– ACCEPT, REJECT/DISAGREE, PROPOSE/
SUGGEST, EXPLAIN/JUSTIFY, REQUEST
EXPLANATION/JUSTIFICATION.
Delmonte R., Bistrot A., Pallotta V., Deep Linguistic Processing with GETARUNS for Spoken Dialogue
Understanding. Proceedings of LREC 2010 (P31 Dialogue Corpora).
23. Evaluation
ICSI corpus of meetings (Janin et al., 2003)
Precision: 81.26% Recall: 97.53%
Label        Correct  Incorrect  Total Found  Precision
Accept           662         16          678        98%
Reject            64         18           82        78%
Propose          321         74          395        81%
Request          180          1          181        99%
Explain          580        312          892        65%
Disfluency        19          0           19       100%
Total           1826        421         2247        81%
Delmonte R., Bistrot A., Pallotta V., Deep Linguistic Processing with GETARUNS for Spoken Dialogue
Understanding. Proceedings of LREC 2010 (P31 Dialogue Corpora).
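The precision figures in the table can be re-derived from the Correct and Incorrect counts, since precision = correct / (correct + incorrect). The short check below uses only the counts reported on the slide; the variable and function names are illustrative. Recall cannot be recomputed this way, because the table does not report the number of missed labels.

```python
# (correct, incorrect) counts per argumentative label, as reported in the table.
counts = {
    "Accept": (662, 16),
    "Reject": (64, 18),
    "Propose": (321, 74),
    "Request": (180, 1),
    "Explain": (580, 312),
    "Disfluency": (19, 0),
}


def precision(correct: int, incorrect: int) -> float:
    """Fraction of found labels that are correct."""
    found = correct + incorrect
    return correct / found if found else 0.0


per_label = {label: precision(c, i) for label, (c, i) in counts.items()}
total_correct = sum(c for c, _ in counts.values())
total_incorrect = sum(i for _, i in counts.values())
overall = precision(total_correct, total_incorrect)

for label, p in per_label.items():
    print(f"{label:<11}{p:6.1%}")
print(f"{'Total':<11}{overall:6.2%}")
```

The totals reproduce the reported 1826 correct out of 2247 found. Explain is both the largest and the least precise category, so it pulls the overall figure down the most.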
25. Conversational Graphs
[7:00] # Yes, uh, I've a question, uh, what's mean
exactly advance chip on print? What's the meaning
of that? [7:10] 7 5
[7:02] Yeah [7:2]
[7:10] I think it's um uh a multiple uh chip design uh
and it's maybe printed on to the (curcuit) board.
[7:20] 8 7
[7:21] Mm-hmm. [7:21]
[7:21] Uh I could find out more about that uh before
the next fi- next meeting. [7:26] 8.1 8
[7:24] Yeah, is it means it's on the - x#x is it on the
micro-processor based or uh - [7:30] 9 8
[7:32] I don't know, but I'll find out more on our next
meeting. [7:35] 10 11 11:09
[7:34] [O]okay, uh, that would be great, so if you find
out from the technology backgroud, okay, so that
would be good[.] [7:39] 12 10
[7:39] Sounds good. [7:40]
[7:41] Why was the plastic eliminated as a possible
material? [7:44] 13 3
[7:43] Because um it gets brittle - [7:46] 14 13 3
[7:47] cracks - [7:48] 14 13 3
[7:48] uh-huh [7:49]
[7:51] um [7:51] 14 13 3
[7:53] We want - we expect these um these remote
controls to be around for several hundred years.
[7:59] 14 13 3
[8:00] So $ we could $ (??) - good expression [8:6]
[8:02] (I would gi-) [8:2]
[8:02] Wow $ Good expression, (well) after us $
[8:12]
[8:05] Which - [8:6]
[8:12] Um, speak for yourself, I (??) $ - [8:16]
[8:13] Alth- I think - [8:15]
[8:14] $ [8:16]
[8:16] I think with the wood though you'd run into the
same types of problems (??) I mean it chips, it- if 15:14
you drop it, ehm, it's - I'm not su- $ [8:27] 15 16 (15:3?) 16:15
30. Conversation Memos (1)
GENERAL INFORMATION ON PARTICIPANTS
• The participants to the meeting are 7.
• Participants less actively involved are Ami and Don who
only intervened respectively for 38 and 68 turns.
LEVEL OF INTERACTIVITY IN THE DISCUSSION
• The speaker that has held the majority of turns is
Adam with a total of 722 turns, followed by Fey with a
total of 561.
• The speaker that has undergone the majority of
overlaps is Adam followed by Jane.
• The speaker that has done the majority of overlaps is
Jane followed by Fey.
• Jane is the participant that has been most competitive.
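Memo statements like those above can be filled in from per-speaker turn counts. The sketch below is a hedged illustration: it uses only the four counts stated on the slide (the other three participants are omitted because their counts are not given), and the `memo_lines` function and its sentence templates are hypothetical, not the system's own generator.

```python
# Turn counts stated on the slide; the remaining three participants are not listed.
turns = {"Adam": 722, "Fey": 561, "Don": 68, "Ami": 38}


def memo_lines(turns):
    """Produce two memo sentences: most active and least active speakers."""
    ranked = sorted(turns.items(), key=lambda kv: kv[1], reverse=True)
    (top, top_n), (second, second_n) = ranked[0], ranked[1]
    # Two least active speakers, lowest count first.
    (low1, low1_n), (low2, low2_n) = ranked[-1], ranked[-2]
    return [
        f"{top} held the most turns ({top_n}), followed by {second} ({second_n}).",
        f"The least active participants were {low1} and {low2} "
        f"({low1_n} and {low2_n} turns).",
    ]


memo = memo_lines(turns)
for line in memo:
    print(line)
```

Overlap and competitiveness statements would need turn start and end times, which the slide does not provide, so they are left out of this sketch.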
31. Conversation Memos (2)
DISCUSSION TOPICS
• The discussion was centered on the following topics:
schemas, action, things and domain.
• The main topics have been introduced by the most
important speaker of the meeting.
• The participant who introduced the main topics in the
meeting is: Adam.
• The most frequent entities in the whole dialogue partly
coincide with the best topics, and are the following:
action, schema, things, 'source-path-goal', person, spg, roles,
bakery, intention, specific, case, categories, information,
idea.
32. Conversation Memos (3)
ARGUMENTATIVE CONTENT
The following participants: Andreas, Dave, Don, Jane, Morgan
expressed their dissent 52 times. However Dave, Andreas and Morgan
expressed dissent in a consistently smaller percentage.
The following participants: Adam, Andreas, Dave, Don, Jane, Morgan
asked questions 55 times.
The remaining 1210 turns expressed positive content by proposing,
explaining or raising issues. However Adam, Dave and Andreas
suggested and raised new issues in a consistently smaller percentage.
The following participants: Adam, Andreas, Dave, Don, Jane, Morgan
expressed acceptance 213 times.
EPISODE ISSUE No. 7
In this episode we have the following argumentative exchanges
between the following speakers: Don, Morgan.
Morgan provides the following explanation:
[oh, that-s_, good, .]
then he, overlapped by Don, continues:
[because, we, have, a_lot, of, breath, noises, .]
Don accepts the previous explanation:
[yep, .]
then he provides the following explanation:
[test, .]
Morgan continues:
[in_fact, if, you, listen, to, just, the, channels, of, people, not, talking, it-s_, like, ..., .]
then he, overlapped by Don, disagrees with the previous explanation:
[it-s_, very, disgust, ..., .]
Don, overlapped by Morgan, asks the following question:
[did, you, see, hannibal, recently, or, something, ?]
Morgan provides the following positive answer:
[sorry, .]
then he provides the following explanation:
[exactly, .]
[it-s_, very, disconcerting, .]
[okay, .]
…
33. Conclusion
• Conversational Search and Condensation is extremely
challenging
– Classical approaches simply don’t work
– Sense-making is needed
• One possible “sense”:
– Argumentative structure
• Possible outputs:
– Question Answering
– Abstractive Summaries
– Conversation Graphs
• Future Work:
– Improving performance of the classifier
– Build the linking structure of arguments
– Approach generation