What is computer-aided summarisation and does it really work?
Constantin Orasan
http://clg.wlv.ac.uk/projects/CAST/
Structure
  • Introduction
  • CAST
  • Evaluation
  • Conclusions
Computer-aided summarisation
  • Combines automatic methods with human input
  • Relies on automatic methods to identify the important information
  • Humans can decide to include this information and/or additional information
  • Humans post-edit the information to produce a coherent summary
Automatic summarisation (AS)
  • Produces summaries automatically with the help of computers
  • Does not require human input
  • The quality is low (especially when compared to human summaries)
Computer-aided summarisation (CAS)
  • Uses automatic methods to produce summaries, but allows humans to post-edit the result
  • High quality, but with less effort
Why can CAS work?
  • Endres-Niggemeyer (1998) identifies three stages in human summarisation: document exploration, relevance assessment and summary production
  • We hypothesise that the first two stages can be replaced by automatic methods
  • Craven (1996) and Narita (2000) tried to help human summarisers using automatic means
Computer-aided summarisation tool (CAST)
  • Work funded by the Arts and Humanities Research Council
  • Work done together with Laura Hasler
  • The most important outcome of the project is the tool
  • It allows the user to run automatic methods to identify important sentences
  • To produce an abstract, the user can take these sentences and edit them
CAST: the tool (II)
  • At present CAST contains the following methods:
      • Keyword method
      • Indicating phrases
      • Surface clues
      • Lexical cohesion
  • These methods were chosen because they are highly customisable and domain independent
  • The user can select the settings most appropriate for a particular text or genre
Feedback from the user
  • We analysed the work of our human summariser
  • Term-based summarisation was used first to produce 30% summaries
  • Whenever a useful sentence was found, lexical chains were used to identify related sentences
  • The summariser avoided running too many automatic methods because the results become confusing
  • The summariser requested a way to see which sentences have already been included in the summary
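The first step of this workflow, a term-based extract at a 30% compression rate, can be sketched roughly as below. This is a minimal illustration, not CAST's implementation: the function names, the tiny stopword list, the naive sentence splitter and the plain term-frequency scoring are all assumptions made for the example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not the list used by CAST).
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "it",
             "that", "on", "for", "was", "were", "are", "with"}

def split_sentences(text):
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    # Lowercase alphabetic tokens with stopwords removed.
    return [w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOPWORDS]

def term_based_extract(text, rate=0.3):
    """Return roughly `rate` of the sentences, chosen by summed term frequency."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    # Document-wide term frequencies.
    freqs = Counter(w for s in sentences for w in tokenize(s))
    scores = [sum(freqs[w] for w in tokenize(s)) for s in sentences]
    n_keep = max(1, round(len(sentences) * rate))
    # Rank sentence indices by score; ties keep document order (stable sort).
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    # Present the selected sentences in their original order.
    return [sentences[i] for i in sorted(ranked[:n_keep])]
```

In a computer-aided setting the returned sentences are only candidates: the summariser decides which to keep and post-edits them into a coherent abstract.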
Evaluation
  • Our assumption about CAS is that it is possible to produce summaries in less time without any loss in quality
  • Two experiments were carried out:
      • We recorded the time needed to produce summaries with and without CAST
      • We showed pairs of summaries to humans and asked them to pick the better one
Experiment 1
  • One professional summariser was used
  • 69 texts from the CAST corpus were used
  • Summaries were produced with and without the tool, one year apart

                        Without CAST   With CAST   Reduction
  Newswire texts        498 s          382 s       23.29%
  New Scientist texts   771 s          623 s       19.19%
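The reduction figures for Experiment 1 can be rechecked from the two reported times per genre. The sketch below assumes the reduction is computed relative to the time without CAST; the function name and the dictionary are illustrative only.

```python
def reduction_percent(without_tool, with_tool):
    """Relative time saving, as a percentage of the time without the tool."""
    return 100.0 * (without_tool - with_tool) / without_tool

# Times in seconds as reported on the slide: (without CAST, with CAST).
reported_times = {
    "Newswire": (498, 382),
    "New Scientist": (771, 623),
}

for genre, (before, after) in reported_times.items():
    print(f"{genre}: {reduction_percent(before, after):.2f}% less time")
```

The recomputed values agree with the slide to within rounding: the New Scientist figure comes out at about 19.2%, which suggests the reported 19.19% was truncated or derived from unrounded times.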
Experiment 1
  • We evaluated the term-based summariser used in the process
  • We found a correlation between the success of the automatic summariser and the time reduction
Experiment 2
  • Turing-like experiment in which we asked humans to pick the better summary in a pair
  • Each pair contained one summary produced with CAST and one produced without CAST
  • 17 judges were each shown 4 randomly selected pairs
Experiment 2
  • In 41 pairs the summary produced with CAST was preferred
  • In 27 pairs the summary produced without CAST was preferred
  • Our assumption (the null hypothesis) was that there is no difference between them
  • A chi-square test shows no statistically significant difference at the 0.05 significance level
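The chi-square result can be reproduced from the two counts. The slides do not say exactly which variant of the test was run, so the sketch below assumes a goodness-of-fit test against equal expected preferences (34 each out of 68) with one degree of freedom, and it treats the 68 judgements as independent. The function name is illustrative.

```python
import math

def chi_square_equal_preference(count_a, count_b):
    """Chi-square goodness-of-fit test against a 50/50 split (1 degree of freedom)."""
    expected = (count_a + count_b) / 2
    stat = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # For 1 degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

stat, p_value = chi_square_equal_preference(41, 27)
print(f"chi-square = {stat:.3f}, p = {p_value:.3f}")
print("significant at 0.05" if p_value < 0.05 else "not significant at 0.05")
```

Under these assumptions the statistic is about 2.88, below the 3.84 critical value for one degree of freedom at the 0.05 level, which is consistent with the conclusion on the slide.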
Conclusions
  • Computer-aided summarisation really works for professional summarisers: it reduces the time needed to produce summaries by about 20%
  • It would be interesting to:
      • try it with non-professional summarisers
      • try it on other texts
      • compare it with other computer-aided methods

Editor's Notes

  • #6 In the first two stages the summariser identifies the overall structure of the text and the main topics. In the third stage, copy/paste operations followed by post-editing are used to produce the summary. The first two stages can be replaced by automatic methods.