Learning How Best to Summarize
Terry COPECK, Stan SZPAKOWICZ, Nathalie JAPKOWICZ
School of Information Technology & Engineering
University of Ottawa
800 King Edward, P.O. Box 450 Stn. A
Ottawa, Ontario, Canada K1N 6N5
{terry, szpak, nat}@site.uottawa.ca




Abstract

We present a summarizer that uses alternative modules to perform its key tasks. Such a configurable summarizer benefits from machine learning techniques that help discover which configuration performs best for documents in general, or for those characterized by particular lexical features. We determine the quality of an automatically generated summary by computing a measure of its coverage of the content phrases in a model summary known to be of good quality. Machine learning demonstrates significant improvement in this measure over the average value of summaries produced from all configuration parameter settings.

We perform text summarization by extracting the most representative sentences from a single document. Such summaries tend to be informative rather than indicative, and generic rather than biased towards the interests or purposes of any reader. This is how we situate our work in the gradually emerging classification of summary types recently recapitulated in Kan, Klavans and McKeown (2002).

1   System Design

Our work on summarization began several years ago in the context of a larger project with multiple objectives, in which summary generation would be one of the tools. We therefore came to this task for practical reasons and with few hypotheses in play. From the outset it was clear that the approach we envisioned could be implemented in large part by combining existing NLP research software that was either in the public domain or freely available from its authors. It suited our purposes not to reinvent any more wheels than really necessary. It also seemed a good idea to use software which is known to the NLP research community and has already been subject to its scrutiny. We therefore conceived of our summarization system as a testbed in which we could evaluate how well alternative black boxes developed by others cooperate as they each perform one of the key tasks involved in our method of summarization. The rationale inherent in this architecture has itself in due course provided a research agenda.

We assumed that a summary composed of neighbouring, thematically related sentences should read better, and perhaps also convey more information, than one in which each sentence is independently chosen from the document as a whole and thus likely bears no relation to its neighbours. This assumption led to a process in which the text to be summarized is first broken into segments: sequences of adjacent sentences talking about the same topic. In a separate operation, keyphrases are extracted from the whole text and ranked. Each sentence in the text is then rated according to the number of highly-ranked keyphrases it contains, and each segment is ranked according to the ratings of the sentences it contains. The summary is then constructed by picking sentences that exceed a threshold count of keyphrases from the most highly ranked segments, moving from the best segment to the next best until the required number of sentences has been accumulated. These sentences are put in document order and presented to the user as a summary of the document.
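The selection procedure can be made concrete with a short sketch. This is a minimal reconstruction of the step just described, not the authors' code; segmentation and keyphrase extraction are assumed to be done by the external components discussed in Section 2, so they appear here only as inputs.

    def summarize(segments, keyphrases, threshold=2, size=5):
        """Select sentences from the best segments, in document order.

        segments   -- list of segments, each a list of
                      (position, sentence) pairs in document order
        keyphrases -- highly-ranked keyphrases for the whole document
        threshold  -- keyphrase matches a sentence needs to qualify
        size       -- number of sentences wanted in the summary
        """
        def rate(sentence):
            # A sentence's rating is the number of keyphrases it contains.
            lowered = sentence.lower()
            return sum(kp.lower() in lowered for kp in keyphrases)

        # A segment's rank follows from the ratings of its sentences.
        ranked = sorted(segments,
                        key=lambda seg: sum(rate(s) for _, s in seg),
                        reverse=True)

        picked = []
        for segment in ranked:                      # best segment first
            for position, sentence in segment:
                if rate(sentence) >= threshold:
                    picked.append((position, sentence))
            if len(picked) >= size:
                break
        # Restore document order before presenting the summary.
        return [s for _, s in sorted(picked[:size])]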
2   Processing

Because our summarizer was at first designed to be a functional component of a larger project, it had to handle real-world texts which, although coded in plain ASCII, may have a variety of extraneous formatting: pagination, embedded figures and tables and, most crucially, columnar format. Although it took some effort to install the black boxes in wrappers that brought them to a common interface, most of our attention was spent on translating input text to the "NLP normal" form — one sentence per line, one extra line between paragraphs — required by most of the black boxes. This form is also made necessary by the task itself; sentence extraction presumes a text in which sentences are identified correctly.
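For illustration, a naive converter to this normal form might look like the sketch below; the regular-expression sentence splitter is an assumption made for brevity, and deciding sentence boundaries robustly is exactly the research issue noted next.

    import re

    def to_nlp_normal_form(paragraphs):
        """Emit one sentence per line with a blank line between
        paragraphs -- the "NLP normal" form described above."""
        lines = []
        for paragraph in paragraphs:
            text = " ".join(paragraph.split())     # collapse whitespace
            # Naive boundary rule: split after . ! or ? when followed
            # by whitespace and an upper-case letter.
            sentences = re.split(r"(?<=[.!?])\s+(?=[A-Z])", text)
            lines.extend(s for s in sentences if s)
            lines.append("")                       # paragraph separator
        return "\n".join(lines).rstrip("\n") + "\n"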


[Figure 1. Summarizer Program Architecture. Input text flows through four stages, each offering alternative black boxes: Segmenting (Segmenter, TextTiling, C99, or None), Keyphrase Extraction (Kea, Extractor, or NPSeek), Matching (Exact or Stem, the latter dimmed), and Sentence Selection (Locality), yielding the summary.]
Recognizing sentence and paragraph boundaries, and extraneous material such as tables interpolated into the flow of text, are legitimate research issues in themselves. They are, however, not inherently part of summarization, so we will not discuss them further here.

In the course of processing input text prior to summarization, we create a data model of the document which records its various lexical features on a per-sentence basis, as well as details of sentence and paragraph order. This record is the basis for the document characterization which plays a role in the summary evaluation discussed in Section 4.

The key tasks for which we used alternative external programs are document segmentation, keyphrase extraction, and phrase matching. The Columbia University Segmenter (Kan, Klavans and McKeown 1998), Hearst's TextTiling program (1997) and the C99 program (Choi 2000) were used to divide a document into segments. A fourth option, implemented recently, turns off segmentation and treats the document as a single segment. A moment's thought shows that this is the control case for our hypothesis about the benefit of locality in sentence selection: picking the highest-rated sentences from the entire text defeats locality. To extract keyphrases we used Waikato University's Kea (Witten, Paynter, Frank, Gutwin and Nevill-Manning 1999), the Extractor program from the Canadian NRC's Institute for Information Technology (Turney 2000), and NPSeek, a program developed at the University of Ottawa (Barker and Cornacchia 2000). The Kea program suite also provided modules to match phrases exactly and in stemmed form. Figure 1 shows the architecture of the summarization program constructed around these black boxes (stemmed matching is dimmed deliberately).
3   Operational Parameters

To run, our program must be told which black box to use for each of the three key tasks where we have a choice. Three other aspects of the process or the task have been judged important enough to require parametrization: 1) the size of the summary, in sentences, in words, or as a proportion of the document; 2) the number of keyphrases to use in rating sentences and segments; and 3) the threshold number of keyphrase matches needed for a sentence to qualify as a candidate for a summary. There are 24 unique specifications for black boxes alone; each of the three additional parameters takes a count rather than an enumerated value, so it is quite feasible to produce hundreds if not thousands of summaries for a single document.

This is computationally expensive. Results from DUC 2001 led us to shrink this space by dispensing with stemmed matching (which for this reason is shown dimmed in Figure 1), fixing the keyphrase threshold at two, and limiting the range of the other two count parameters, summary size and number of keyphrases.
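The size of the resulting configuration space is easy to see by enumeration. In the sketch below the three count ranges are illustrative assumptions, not the limits actually used:

    from itertools import product

    SEGMENTERS  = ["Segmenter", "TextTiling", "C99", "None"]
    KEYPHRASERS = ["Kea", "Extractor", "NPSeek"]
    MATCHERS    = ["Exact", "Stem"]

    # 4 segmenters x 3 keyphrasers x 2 matchers = 24 black-box specs.
    black_boxes = list(product(SEGMENTERS, KEYPHRASERS, MATCHERS))
    assert len(black_boxes) == 24

    # Each count parameter multiplies the space further (assumed ranges).
    summary_sizes    = range(3, 11)     # sentences per summary
    keyphrase_counts = range(5, 21)     # keyphrases used in rating
    thresholds       = range(1, 4)      # matches needed per sentence

    configurations = list(product(black_boxes, summary_sizes,
                                  keyphrase_counts, thresholds))
    print(len(configurations))          # 9216 candidate runs per document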
4   Evaluation

How good are any of these summaries? Answering that question has proven to be quite difficult (Goldstein, Kantrowitz, Mittal and Carbonell 1999), and it is currently the object of quite intense research interest, of which the two Document Understanding Conferences (DUC) sponsored by NIST are notable examples. Summary evaluation has become a research goal in itself.

A variety of statistical measures of summary content — kappa, relative utility, and the traditional IR metrics of precision and recall — are computed in systems like the MEAD Eval toolkit (Radev, Teufel, Lam and Saggion 2002) or SEE (Lin 2001), which automate or assist the assessment of summaries against the original texts on which they are based. We have, however, chosen to compare our summaries against the gold standard of summaries written by human authors. Such summaries are not easily come by. We therefore initially settled on academic papers as a subject genre [1] and have adopted the authors' abstracts which begin such papers as a practical approximation to the gold standard summary we seek, an approach proposed by Mittal, Kantrowitz, Goldstein and Carbonell (1999). Who better than the author knows the intended content of a paper?

[1] Initially we used a one-million-word corpus of 75 academic papers published between 1993 and 1996 in the online Journal of Artificial Intelligence Research.
[Figure 2. Text Summarization Learner Architecture. A Summary Generator takes the text to be summarized and parameter values; in the Generation Controller, a Summary Assessor compares each summary plus its parameter values against the abstract plus document features to produce a discretized, normalized rating, and a Parameter Assessor identifies the best parameters, with a future feedback loop back to the generator.]
We use a variant of the keyphrase extraction technique used in the summarization program to evaluate a summary against the reference summary. That gold standard is broken down into a set of unique content keyphrases by processing it against a stoplist of 980 closed-class and common words. The resulting keyphrases in abstract (KPiAs) differ from those used to produce the summary under evaluation in that they compose a comprehensive description of the abstract, rather than being limited to a small number of the most highly ranked phrases.

A summary is ranked according to the degree to which it covers the set of KPiAs. Its score is computed using the following formula:

    score = Σ ( 1.0 · KPiA_unique + 0.5 · KPiA_duplicate )

The formula computes a total over the sentences of the summary, assigning a weight of 1.0 to each KPiA which is added to the coverage set and 0.5 to each occurrence that repeats a KPiA already in the set.
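Read literally, the rule transcribes into a few lines of code. The substring test below is our assumption; the paper does not say exactly how a KPiA is matched inside a sentence.

    def kpia_score(summary_sentences, kpias):
        """Coverage score: 1.0 for each KPiA newly added to the
        coverage set, 0.5 for each repetition of one already in it."""
        covered = set()
        score = 0.0
        for sentence in summary_sentences:
            lowered = sentence.lower()
            for kpia in kpias:
                if kpia.lower() in lowered:        # assumed match test
                    score += 0.5 if kpia in covered else 1.0
                    covered.add(kpia)
        return score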
5   Learning

The combination of a wholly automated evaluation technique and a large number of instances to which to apply it is a textbook application for machine learning. Accordingly, we embedded the summarization program in a framework which automates its operation and the evaluation of its results, and then applies the C5.0 decision tree and rule classifier (Quinlan 2002) to infer rules about which parameter settings produce highly-rated or poorly-rated summaries. This architecture appears in Figure 2.

    FEATURE    DESCRIPTION
    chars      number of characters in the document
    words      number of words in the document
    sents      number of sentences in the document
    paras      number of paragraphs in the document
    kpiacnt    number of KPiA instances
    conncnt    number of (Marcu) connectives
    pncnt      number of proper nouns and acronyms
    contcnt    number of content phrases
    cphr       number of 1-instance content phrases
    cphr2      number of 2-instance content phrases
    cphr3      number of 3-instance content phrases
    cphr4      number of 4-or-more-instance content phrases
    cbig       number of 1-instance content bigrams
    cbig2      number of 2-instance content bigrams
    cbig3      number of 3-instance content bigrams
    cbig4      number of 4-or-more-instance content bigrams
    abig       number of 1-instance bigrams
    abig2      number of 2-instance bigrams
    abig3      number of 3-instance bigrams
    abig4      number of 4-or-more-instance bigrams

    Table 1. Document Features



To permit comparisons between documents, each summary KPiA rating is normalized by dividing it by the total count of unique KPiAs found in the same number of sentences ranked highest in the document on KPiA count. This establishes the highest possible rating for a summary of a given size. The set of ratings for summaries of all documents is then discretized into the five values verybad, bad, medium, good and verygood using MacQueen's (1967) k-means clustering technique [2]. C5.0 then induces rules based on these ratings and parameter values. Much of the learner's operation has been automated, most recently the identification of the best parameters in the C5.0 output and the application of k-means clustering, and the job of completely automating operation of the system is under way.

[2] C5.0 requires a discrete outcome variable. The use of five values was determined heuristically.
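A sketch of the normalization and discretization steps follows. scikit-learn's k-means stands in here for MacQueen's algorithm, and the interface is our assumption:

    import numpy as np
    from sklearn.cluster import KMeans

    LABELS = ("verybad", "bad", "medium", "good", "verygood")

    def normalize(raw_score, best_possible_score):
        # Divide by the score of the same number of sentences ranked
        # highest in the document on KPiA count.
        return raw_score / best_possible_score if best_possible_score else 0.0

    def discretize(ratings):
        """Cluster normalized ratings into five classes, ordering the
        clusters by centre so labels line up with summary quality."""
        data = np.asarray(ratings, dtype=float).reshape(-1, 1)
        km = KMeans(n_clusters=len(LABELS), n_init=10).fit(data)
        order = np.argsort(km.cluster_centers_.ravel())
        rank = {cluster: i for i, cluster in enumerate(order)}
        return [LABELS[rank[c]] for c in km.labels_]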
Although our first experiments involved a data vector containing only the settings of the six parameters that regulate the production of summaries, the availability of the document data model soon led us to add the features listed in Table 1 in order to characterize the document being summarized. These features are all counts: the number of characters, words, sentences and paragraphs in the document; the number of other syntactic elements, such as KPiAs, connectives and proper nouns; and the number of content phrases (substrings between stoplist entries), bigrams (word pairs) and content bigrams. For the latter three primitive lexical items we record how many occur once, twice, three times, or more than three times in the document.

Adding features has the effect of asking C5.0 to discover rules identifying which parameter values produce good summaries for documents with certain features, rather than for documents in general. Further active investigation over the last year has not yet identified additional features which might be used effectively to characterize documents.
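As an illustration of the instance-count features in Table 1, the sketch below computes the four bigram buckets (abig through abig4); the whitespace tokenization is an assumption:

    from collections import Counter

    def bigram_buckets(text):
        """How many distinct bigrams occur once, twice, three times,
        or four or more times in the document (abig..abig4)."""
        words = text.split()                       # assumed tokenization
        counts = Counter(zip(words, words[1:]))
        buckets = Counter(min(c, 4) for c in counts.values())
        return {"abig":  buckets[1], "abig2": buckets[2],
                "abig3": buckets[3], "abig4": buckets[4]}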
Our experiments clearly show the benefit of machine learning. Choosing parameter values randomly produces an average rating of 2.79, the medium value to be expected from a population of normalized values. When the best parameter settings for each text are chosen, however, the average rating increases two full grades to 4.78, or verygood.

6   Document Understanding Conference

We participated in the Document Understanding Conferences in 2001 (Copeck, Japkowicz and Szpakowicz 2002) and 2002. In DUC, NIST first provides participants with a corpus of training documents together with abstracts written by skilled editors, and then a set of similar test documents which each participant is to summarize in a manner consistent with that inferred from the training set.
[Figure 3. Methodology for DUC 2002. The TSL system induces rules from the training data; run on the test data, it produces parameter settings which the generator uses to create the submission.]
Our participation involved a slight modification to our learning system. Rather than constructing rules that relate features and parameters to KPiA ratings, C5.0 used the training data to build a classification tree recording these same relationships. The test data, which is composed of document features only, was then classified against this structure, and the parameters that gave the best summary for the training document most similar to each test document were used to produce the summaries submitted to the conference. Figure 3 depicts the process.
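A simplified stand-in for this step: since C5.0 is not reimplemented here, the sketch uses nearest-neighbour matching over the feature vectors to find the most similar training document and reuse its best parameters. Names and interface are our assumptions:

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def transfer_parameters(train_features, best_params, test_features):
        """For each test document, reuse the parameter settings that
        gave the best summary for the most similar training document."""
        nn = NearestNeighbors(n_neighbors=1).fit(np.asarray(train_features))
        _, indices = nn.kneighbors(np.asarray(test_features))
        return [best_params[i] for i in indices.ravel()]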
Ten-fold cross-validation on the training data shows a 16.3% classification error (std. dev. 0.01). The 5x5 confusion matrix further indicates that 80% of misclassifications were to the adjacent value in the scale. This suggests that the rules are reasonably trustworthy. The test run showed that they are productive, and that the ratings they seek to maximize are well-founded.
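The validation step can be reproduced along these lines; scikit-learn's cross-validation and decision tree stand in for C5.0's own, so treat this as an assumed rather than actual setup:

    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    def tenfold_error(feature_vectors, rating_labels):
        """Ten-fold cross-validated error of the features -> rating
        classifier, with the spread across folds."""
        scores = cross_val_score(DecisionTreeClassifier(),
                                 feature_vectors, rating_labels, cv=10)
        return 1.0 - scores.mean(), scores.std()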
For DUC 2002 we added adaptive boosting (Freund and Schapire 1996) to the learning process. Adaptive boosting, or ADABOOST, improves the classification process by iteratively generating a number of classifiers from the data, each optimized to classify correctly the cases most obviously misclassified on the previous pass. The votes of a population of these classifiers generally provide better outcomes than do the decisions of any single classifier.
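In modern terms the same effect is obtained with a boosted classifier; the sketch below uses scikit-learn's AdaBoost over decision stumps as a stand-in for boosted C5.0:

    from sklearn.ensemble import AdaBoostClassifier

    # Each boosting round reweights the cases the previous classifier
    # misclassified; the weighted votes of all rounds decide the class.
    model = AdaBoostClassifier(n_estimators=10)
    # model.fit(feature_vectors, rating_labels)     # features -> rating
    # predictions = model.predict(test_feature_vectors)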
NIST provides participants with a full suite of submissions and analysis material after each event, as an invitation to get involved in judging results. We used the human summaries of the 2001 test documents to augment the training data for 2002 from 308 to 590 documents. Our own evaluation of our results for 2001 varied from that computed by Paul Over (2001) for no obvious reason, so we held back from reporting results for DUC 2002 at the July workshop on automatic summarization held in conjunction with ACL-02. In 2001 we did well on single-document extractive summaries and worse than average on multi-document summaries. Because our technique is not particularly suited to summarizing more than one document at a time, and because we are not entirely comfortable with the notion of incorporating material from more than one document in a single summary, we chose to participate only in the single-document track in 2002.

In DUC 2002 our submission ranked somewhat below the middle of the pack of 13 contributors in the single-document track. This is lower than our performance the previous year, though ranking alone is not a clear indication of a relative decrease in performance.

A measure used in both the training phase and the test phase offers insights into the state of the overall process. We compute the discretized KPiA rating for all summaries of documents in the training set. We also infer from the classification tree a KPiA rating for each summary of every document in the test set. The distribution of the five rating values across any set of summaries establishes the limits of the range of possible outcomes. A comparison of KPiA distributions for DUC 2001 with their counterparts for DUC 2002 appears in Table 2.
                   2001                     2002
    RATING     TRAIN  TEST   SUB       TRAIN  TEST   SUB
    verygood    11%    14%   79%        13%    15%   77%
    good        12%    21%   21%        18%     1%    2%
    medium      20%    19%    —         24%     —     3%
    bad         18%    21%    —         26%     4%   12%
    verybad     39%    25%    —         19%    80%    7%
    AVERAGE     2.35   2.79  4.79       2.80   1.68  4.31

    Table 2. KPiA Summary Rating Distributions in DUC Corpora.
    (Gray arrows in the original mark the gain from machine learning:
    +2.00 from TEST to SUB in 2001, +2.63 in 2002.)


Rows show the percentage of summaries rated, or inferred to be, in the indicated class. Each column concludes with the average rating for all its summaries. In 2001, test and training data were similar, and machine learning, indicated by the gray arrows, was able to find settings for each test document that produced at least a good summary.

Contrast this with the situation in 2002. The training data ratings are akin to those for the 2001 data, but the ratings of the test data are widely at variance with those of the three other document collections. Fully 80% of the summaries of the test data collection are inferred to be verybad. Despite machine learning outperforming its 2001 counterpart, there were simply no well-rated summaries that could be produced for approximately 22% of the 2002 test documents. This situation almost guaranteed the low evaluation produced by the NIST judges.

The most obvious explanation for these circumstances would be an inconsistency between the 2002 training and test data. Our system would exhibit this in the values it computes for the 19 features used to model each document for machine learning. However, despite the presence of outliers in each set of documents — texts much longer or shorter than the average — inspection shows that the distribution of each feature in one collection resembles its counterpart in the other to a far greater degree than do the two problematic KPiA rating distributions. In addition, where individual features of a document are related to one another, distributions in the two collections also differ in consistent ways. In sum, the test and training collections appear to resemble one another to the degree one would expect.

We have therefore ruled out the most obvious explanation for our results. Although we have not yet been able to find a reason why our 2002 training summaries had such low ratings, we continue to investigate.

7   Conclusions and Future Work

We intend in future:

•   to complete automating the operation of the entire text summarization learning system;
•   to continue to look for lexical features which describe documents effectively. In this regard, commercial keyphrasers identify entities such as place, city, state and country; company, organization and group; day and year; percent and measure; person; and property. Some of these may merit recording;

•   to investigate the SEE and MEAD Eval summarization evaluators;

•   to merge the output sets of two or all three keyphrasers, providing new alternatives. To what degree do the keyphrasers agree in their selections? How would merging sets of keyphrases be accomplished? What effect would use of a merged set have on summary output?

•   to consider new techniques for sentence selection, which may involve factors other than keyphrase matching. For instance, words marking anaphora may make a sentence less well suited to a summary;

•   to continue looking to add new segmenters and keyphrasers to our system;

•   to investigate further the role of segmentation and locality in text summarization.

Acknowledgements

This work has been supported by the Natural Sciences and Engineering Research Council of Canada and by our University.

References

Barker, K. and N. Cornacchia. 2000. Using Noun Phrase Heads to Extract Document Keyphrases. Proc 13th Conf of the CSCSI, AI 2000 (LNAI 1822), Montreal, 40-52.

Choi, F. 2000. Advances in domain independent linear text segmentation. Proc ANLP/NAACL-00, 26-33.

Copeck, T., N. Japkowicz and S. Szpakowicz. 2002. Text Summarization as Controlled Search. Proc 15th Conf of the CSCSI, AI 2002 (LNAI 2338), Calgary, 268-280.

Freund, Y. and R. Schapire. 1996. Experiments with a new boosting algorithm. Proc 13th ICML, Morgan Kaufmann, 325-332.

Goldstein, J., M. Kantrowitz, V. Mittal and J. Carbonell. 1999. Summarizing Text Documents: Sentence Selection and Evaluation Metrics. Proc ACM/SIGIR-99, 121-128.

Hearst, M. 1997. TextTiling: Segmenting text into multi-paragraph subtopic passages. Computational Linguistics 23 (1), 33-64.

Kan, M.-Y., J. Klavans and K. McKeown. 1998. Linear Segmentation and Segment Significance. Proc WVLC-6, 197-205.

Kan, M.-Y., J. Klavans and K. McKeown. 2002. Using the Annotated Bibliography as a Resource for Indicative Summarization. Proc LREC 2002, Las Palmas, Spain, 1746-1752.

Lin, C.-Y. 2001. SEE: Summary Evaluation Environment. www.isi.edu/~cyl/SEE/.

MacQueen, J. 1967. Some methods for classification and analysis of multivariate observations. Proc Fifth Berkeley Symposium on Mathematical Statistics and Probability, (1), 281-297.

Mittal, V., M. Kantrowitz, J. Goldstein and J. Carbonell. 1999. Selecting Text Spans for Document Summaries: Heuristics and Metrics. Proc AAAI-99, 467-473.

Over, P. 2001. Introduction to DUC 2001: an Intrinsic Evaluation of Generic News Text Summarization Systems. www-nlpir.nist.gov/projects/duc/duc2001/pauls_slides/duc2001.ppt.

Quinlan, J.R. 2002. C5.0: An Informal Tutorial. www.rulequest.com/see5-unix.html.

Radev, D., S. Teufel, W. Lam and H. Saggion. 2002. Automatic Summarization of Multiple Documents (MEAD) Project. www.clsp.jhu.edu/ws2001/groups/asmd/.

Turney, P. 2000. Learning algorithms for keyphrase extraction. Information Retrieval, 2 (4), 303-336.

Witten, I.H., G. Paynter, E. Frank, C. Gutwin and C. Nevill-Manning. 1999. KEA: Practical automatic keyphrase extraction. Proc DL-99, 254-256.

More Related Content

What's hot

Understanding Natural Languange with Corpora-based Generation of Dependency G...
Understanding Natural Languange with Corpora-based Generation of Dependency G...Understanding Natural Languange with Corpora-based Generation of Dependency G...
Understanding Natural Languange with Corpora-based Generation of Dependency G...Edmond Lepedus
 
Improved text clustering with
Improved text clustering withImproved text clustering with
Improved text clustering withIJDKP
 
Bayesian distance metric learning and its application in automatic speaker re...
Bayesian distance metric learning and its application in automatic speaker re...Bayesian distance metric learning and its application in automatic speaker re...
Bayesian distance metric learning and its application in automatic speaker re...IJECEIAES
 
Applications of Error Control Coding
Applications of Error Control CodingApplications of Error Control Coding
Applications of Error Control Codinghamidiye
 
COMPARATIVE STUDY OF CAN, PASTRY, KADEMLIA AND CHORD DHTS
COMPARATIVE STUDY OF CAN, PASTRY, KADEMLIA  AND CHORD DHTS COMPARATIVE STUDY OF CAN, PASTRY, KADEMLIA  AND CHORD DHTS
COMPARATIVE STUDY OF CAN, PASTRY, KADEMLIA AND CHORD DHTS ijp2p
 
Secure multipath routing scheme using key
Secure multipath routing scheme using keySecure multipath routing scheme using key
Secure multipath routing scheme using keyijfcstjournal
 
Ballpark Figure Algorithms for Data Broadcast in Wireless Networks
Ballpark Figure Algorithms for Data Broadcast in Wireless NetworksBallpark Figure Algorithms for Data Broadcast in Wireless Networks
Ballpark Figure Algorithms for Data Broadcast in Wireless NetworksEditor IJCATR
 
ANALYSE THE PERFORMANCE OF MOBILE PEER TO PEER NETWORK USING ANT COLONY OPTIM...
ANALYSE THE PERFORMANCE OF MOBILE PEER TO PEER NETWORK USING ANT COLONY OPTIM...ANALYSE THE PERFORMANCE OF MOBILE PEER TO PEER NETWORK USING ANT COLONY OPTIM...
ANALYSE THE PERFORMANCE OF MOBILE PEER TO PEER NETWORK USING ANT COLONY OPTIM...ijcsity
 
THE EFFECTS OF THE LDA TOPIC MODEL ON SENTIMENT CLASSIFICATION
THE EFFECTS OF THE LDA TOPIC MODEL ON SENTIMENT CLASSIFICATIONTHE EFFECTS OF THE LDA TOPIC MODEL ON SENTIMENT CLASSIFICATION
THE EFFECTS OF THE LDA TOPIC MODEL ON SENTIMENT CLASSIFICATIONijscai
 
Effective Data Retrieval System with Bloom in a Unstructured p2p Network
Effective Data Retrieval System with Bloom in a Unstructured p2p NetworkEffective Data Retrieval System with Bloom in a Unstructured p2p Network
Effective Data Retrieval System with Bloom in a Unstructured p2p NetworkUvaraj Shan
 
Complexity analysis of multilayer perceptron neural network embedded into a w...
Complexity analysis of multilayer perceptron neural network embedded into a w...Complexity analysis of multilayer perceptron neural network embedded into a w...
Complexity analysis of multilayer perceptron neural network embedded into a w...Amir Shokri
 
network mining and representation learning
network mining and representation learningnetwork mining and representation learning
network mining and representation learningsun peiyuan
 
Multicasting in Delay Tolerant Networks: Implementation and Performance Analysis
Multicasting in Delay Tolerant Networks: Implementation and Performance AnalysisMulticasting in Delay Tolerant Networks: Implementation and Performance Analysis
Multicasting in Delay Tolerant Networks: Implementation and Performance AnalysisNagendra Posani
 

What's hot (18)

Understanding Natural Languange with Corpora-based Generation of Dependency G...
Understanding Natural Languange with Corpora-based Generation of Dependency G...Understanding Natural Languange with Corpora-based Generation of Dependency G...
Understanding Natural Languange with Corpora-based Generation of Dependency G...
 
Improved text clustering with
Improved text clustering withImproved text clustering with
Improved text clustering with
 
Bayesian distance metric learning and its application in automatic speaker re...
Bayesian distance metric learning and its application in automatic speaker re...Bayesian distance metric learning and its application in automatic speaker re...
Bayesian distance metric learning and its application in automatic speaker re...
 
Applications of Error Control Coding
Applications of Error Control CodingApplications of Error Control Coding
Applications of Error Control Coding
 
COMPARATIVE STUDY OF CAN, PASTRY, KADEMLIA AND CHORD DHTS
COMPARATIVE STUDY OF CAN, PASTRY, KADEMLIA  AND CHORD DHTS COMPARATIVE STUDY OF CAN, PASTRY, KADEMLIA  AND CHORD DHTS
COMPARATIVE STUDY OF CAN, PASTRY, KADEMLIA AND CHORD DHTS
 
Secure multipath routing scheme using key
Secure multipath routing scheme using keySecure multipath routing scheme using key
Secure multipath routing scheme using key
 
Ballpark Figure Algorithms for Data Broadcast in Wireless Networks
Ballpark Figure Algorithms for Data Broadcast in Wireless NetworksBallpark Figure Algorithms for Data Broadcast in Wireless Networks
Ballpark Figure Algorithms for Data Broadcast in Wireless Networks
 
Y24168171
Y24168171Y24168171
Y24168171
 
ANALYSE THE PERFORMANCE OF MOBILE PEER TO PEER NETWORK USING ANT COLONY OPTIM...
ANALYSE THE PERFORMANCE OF MOBILE PEER TO PEER NETWORK USING ANT COLONY OPTIM...ANALYSE THE PERFORMANCE OF MOBILE PEER TO PEER NETWORK USING ANT COLONY OPTIM...
ANALYSE THE PERFORMANCE OF MOBILE PEER TO PEER NETWORK USING ANT COLONY OPTIM...
 
THE EFFECTS OF THE LDA TOPIC MODEL ON SENTIMENT CLASSIFICATION
THE EFFECTS OF THE LDA TOPIC MODEL ON SENTIMENT CLASSIFICATIONTHE EFFECTS OF THE LDA TOPIC MODEL ON SENTIMENT CLASSIFICATION
THE EFFECTS OF THE LDA TOPIC MODEL ON SENTIMENT CLASSIFICATION
 
Project titles abstract_2012
Project titles abstract_2012Project titles abstract_2012
Project titles abstract_2012
 
Effective Data Retrieval System with Bloom in a Unstructured p2p Network
Effective Data Retrieval System with Bloom in a Unstructured p2p NetworkEffective Data Retrieval System with Bloom in a Unstructured p2p Network
Effective Data Retrieval System with Bloom in a Unstructured p2p Network
 
E42062126
E42062126E42062126
E42062126
 
Complexity analysis of multilayer perceptron neural network embedded into a w...
Complexity analysis of multilayer perceptron neural network embedded into a w...Complexity analysis of multilayer perceptron neural network embedded into a w...
Complexity analysis of multilayer perceptron neural network embedded into a w...
 
network mining and representation learning
network mining and representation learningnetwork mining and representation learning
network mining and representation learning
 
Multicasting in Delay Tolerant Networks: Implementation and Performance Analysis
Multicasting in Delay Tolerant Networks: Implementation and Performance AnalysisMulticasting in Delay Tolerant Networks: Implementation and Performance Analysis
Multicasting in Delay Tolerant Networks: Implementation and Performance Analysis
 
Dq24746750
Dq24746750Dq24746750
Dq24746750
 
Bv25430436
Bv25430436Bv25430436
Bv25430436
 

Viewers also liked

Certificate of Completion 6
Certificate of Completion 6Certificate of Completion 6
Certificate of Completion 6Carla Dessi
 
MoI present like...a surfer
MoI present like...a surferMoI present like...a surfer
MoI present like...a surferMartin Barnes
 
Ongoing Residential and Commercial Property in Chennai for Sale by Akshaya
Ongoing Residential and Commercial Property in Chennai for Sale by AkshayaOngoing Residential and Commercial Property in Chennai for Sale by Akshaya
Ongoing Residential and Commercial Property in Chennai for Sale by AkshayaAkshaya Construction
 
Отчет о работе фонда помощи хосписам «Вера» 2007-2014
Отчет о работе фонда помощи хосписам «Вера» 2007-2014Отчет о работе фонда помощи хосписам «Вера» 2007-2014
Отчет о работе фонда помощи хосписам «Вера» 2007-2014Фонд Вера
 

Viewers also liked (6)

Certificate of Completion 6
Certificate of Completion 6Certificate of Completion 6
Certificate of Completion 6
 
MoI present like...a surfer
MoI present like...a surferMoI present like...a surfer
MoI present like...a surfer
 
ตัวชี้วัด
ตัวชี้วัดตัวชี้วัด
ตัวชี้วัด
 
Ongoing Residential and Commercial Property in Chennai for Sale by Akshaya
Ongoing Residential and Commercial Property in Chennai for Sale by AkshayaOngoing Residential and Commercial Property in Chennai for Sale by Akshaya
Ongoing Residential and Commercial Property in Chennai for Sale by Akshaya
 
20161010090449648
2016101009044964820161010090449648
20161010090449648
 
Отчет о работе фонда помощи хосписам «Вера» 2007-2014
Отчет о работе фонда помощи хосписам «Вера» 2007-2014Отчет о работе фонда помощи хосписам «Вера» 2007-2014
Отчет о работе фонда помощи хосписам «Вера» 2007-2014
 

Similar to [ ] uottawa_copeck.doc

A rough set based hybrid method to text categorization
A rough set based hybrid method to text categorizationA rough set based hybrid method to text categorization
A rough set based hybrid method to text categorizationNinad Samel
 
IRJET- Automated Document Summarization and Classification using Deep Lear...
IRJET- 	  Automated Document Summarization and Classification using Deep Lear...IRJET- 	  Automated Document Summarization and Classification using Deep Lear...
IRJET- Automated Document Summarization and Classification using Deep Lear...IRJET Journal
 
Prediction of Answer Keywords using Char-RNN
Prediction of Answer Keywords using Char-RNNPrediction of Answer Keywords using Char-RNN
Prediction of Answer Keywords using Char-RNNIJECEIAES
 
Taxonomy extraction from automotive natural language requirements using unsup...
Taxonomy extraction from automotive natural language requirements using unsup...Taxonomy extraction from automotive natural language requirements using unsup...
Taxonomy extraction from automotive natural language requirements using unsup...ijnlc
 
Context Driven Technique for Document Classification
Context Driven Technique for Document ClassificationContext Driven Technique for Document Classification
Context Driven Technique for Document ClassificationIDES Editor
 
Text Segmentation for Online Subjective Examination using Machine Learning
Text Segmentation for Online Subjective Examination using Machine   LearningText Segmentation for Online Subjective Examination using Machine   Learning
Text Segmentation for Online Subjective Examination using Machine LearningIRJET Journal
 
kantorNSF-NIJ-ISI-03-06-04.ppt
kantorNSF-NIJ-ISI-03-06-04.pptkantorNSF-NIJ-ISI-03-06-04.ppt
kantorNSF-NIJ-ISI-03-06-04.pptbutest
 
IRJET- Semantic based Automatic Text Summarization based on Soft Computing
IRJET- Semantic based Automatic Text Summarization based on Soft ComputingIRJET- Semantic based Automatic Text Summarization based on Soft Computing
IRJET- Semantic based Automatic Text Summarization based on Soft ComputingIRJET Journal
 
A Pilot Study On Computer-Aided Coreference Annotation
A Pilot Study On Computer-Aided Coreference AnnotationA Pilot Study On Computer-Aided Coreference Annotation
A Pilot Study On Computer-Aided Coreference AnnotationDarian Pruitt
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language ProcessingNimrita Koul
 
Feature selection, optimization and clustering strategies of text documents
Feature selection, optimization and clustering strategies of text documentsFeature selection, optimization and clustering strategies of text documents
Feature selection, optimization and clustering strategies of text documentsIJECEIAES
 
Survey on Text Prediction Techniques
Survey on Text Prediction TechniquesSurvey on Text Prediction Techniques
Survey on Text Prediction Techniquesvivatechijri
 
An Investigation of Keywords Extraction from Textual Documents using Word2Ve...
 An Investigation of Keywords Extraction from Textual Documents using Word2Ve... An Investigation of Keywords Extraction from Textual Documents using Word2Ve...
An Investigation of Keywords Extraction from Textual Documents using Word2Ve...IJCSIS Research Publications
 
Topic detecton by clustering and text mining
Topic detecton by clustering and text miningTopic detecton by clustering and text mining
Topic detecton by clustering and text miningIRJET Journal
 
A Novel Method for Keyword Retrieval using Weighted Standard Deviation: “D4 A...
A Novel Method for Keyword Retrieval using Weighted Standard Deviation: “D4 A...A Novel Method for Keyword Retrieval using Weighted Standard Deviation: “D4 A...
A Novel Method for Keyword Retrieval using Weighted Standard Deviation: “D4 A...idescitation
 
SENTIMENT ANALYSIS FOR MOVIES REVIEWS DATASET USING DEEP LEARNING MODELS
SENTIMENT ANALYSIS FOR MOVIES REVIEWS DATASET USING DEEP LEARNING MODELSSENTIMENT ANALYSIS FOR MOVIES REVIEWS DATASET USING DEEP LEARNING MODELS
SENTIMENT ANALYSIS FOR MOVIES REVIEWS DATASET USING DEEP LEARNING MODELSIJDKP
 
Handwritten Text Recognition and Translation with Audio
Handwritten Text Recognition and Translation with AudioHandwritten Text Recognition and Translation with Audio
Handwritten Text Recognition and Translation with AudioIRJET Journal
 
ON THE RELEVANCE OF QUERY EXPANSION USING PARALLEL CORPORA AND WORD EMBEDDING...
ON THE RELEVANCE OF QUERY EXPANSION USING PARALLEL CORPORA AND WORD EMBEDDING...ON THE RELEVANCE OF QUERY EXPANSION USING PARALLEL CORPORA AND WORD EMBEDDING...
ON THE RELEVANCE OF QUERY EXPANSION USING PARALLEL CORPORA AND WORD EMBEDDING...ijnlc
 

Similar to [ ] uottawa_copeck.doc (20)

A rough set based hybrid method to text categorization
A rough set based hybrid method to text categorizationA rough set based hybrid method to text categorization
A rough set based hybrid method to text categorization
 
IRJET- Automated Document Summarization and Classification using Deep Lear...
IRJET- 	  Automated Document Summarization and Classification using Deep Lear...IRJET- 	  Automated Document Summarization and Classification using Deep Lear...
IRJET- Automated Document Summarization and Classification using Deep Lear...
 
Prediction of Answer Keywords using Char-RNN
Prediction of Answer Keywords using Char-RNNPrediction of Answer Keywords using Char-RNN
Prediction of Answer Keywords using Char-RNN
 
Taxonomy extraction from automotive natural language requirements using unsup...
Taxonomy extraction from automotive natural language requirements using unsup...Taxonomy extraction from automotive natural language requirements using unsup...
Taxonomy extraction from automotive natural language requirements using unsup...
 
Paper review
Paper reviewPaper review
Paper review
 
Thesis Proposal Presentation
Thesis Proposal PresentationThesis Proposal Presentation
Thesis Proposal Presentation
 
Context Driven Technique for Document Classification
Context Driven Technique for Document ClassificationContext Driven Technique for Document Classification
Context Driven Technique for Document Classification
 
Text Segmentation for Online Subjective Examination using Machine Learning
Text Segmentation for Online Subjective Examination using Machine   LearningText Segmentation for Online Subjective Examination using Machine   Learning
Text Segmentation for Online Subjective Examination using Machine Learning
 
kantorNSF-NIJ-ISI-03-06-04.ppt
kantorNSF-NIJ-ISI-03-06-04.pptkantorNSF-NIJ-ISI-03-06-04.ppt
kantorNSF-NIJ-ISI-03-06-04.ppt
 
IRJET- Semantic based Automatic Text Summarization based on Soft Computing
IRJET- Semantic based Automatic Text Summarization based on Soft ComputingIRJET- Semantic based Automatic Text Summarization based on Soft Computing
IRJET- Semantic based Automatic Text Summarization based on Soft Computing
 
A Pilot Study On Computer-Aided Coreference Annotation
A Pilot Study On Computer-Aided Coreference AnnotationA Pilot Study On Computer-Aided Coreference Annotation
A Pilot Study On Computer-Aided Coreference Annotation
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Feature selection, optimization and clustering strategies of text documents
Feature selection, optimization and clustering strategies of text documentsFeature selection, optimization and clustering strategies of text documents
Feature selection, optimization and clustering strategies of text documents
 
Survey on Text Prediction Techniques
Survey on Text Prediction TechniquesSurvey on Text Prediction Techniques
Survey on Text Prediction Techniques
 
An Investigation of Keywords Extraction from Textual Documents using Word2Ve...
 An Investigation of Keywords Extraction from Textual Documents using Word2Ve... An Investigation of Keywords Extraction from Textual Documents using Word2Ve...
An Investigation of Keywords Extraction from Textual Documents using Word2Ve...
 
Topic detecton by clustering and text mining
Topic detecton by clustering and text miningTopic detecton by clustering and text mining
Topic detecton by clustering and text mining
 
A Novel Method for Keyword Retrieval using Weighted Standard Deviation: “D4 A...
A Novel Method for Keyword Retrieval using Weighted Standard Deviation: “D4 A...A Novel Method for Keyword Retrieval using Weighted Standard Deviation: “D4 A...
A Novel Method for Keyword Retrieval using Weighted Standard Deviation: “D4 A...
 
SENTIMENT ANALYSIS FOR MOVIES REVIEWS DATASET USING DEEP LEARNING MODELS
SENTIMENT ANALYSIS FOR MOVIES REVIEWS DATASET USING DEEP LEARNING MODELSSENTIMENT ANALYSIS FOR MOVIES REVIEWS DATASET USING DEEP LEARNING MODELS
SENTIMENT ANALYSIS FOR MOVIES REVIEWS DATASET USING DEEP LEARNING MODELS
 
Handwritten Text Recognition and Translation with Audio
Handwritten Text Recognition and Translation with AudioHandwritten Text Recognition and Translation with Audio
Handwritten Text Recognition and Translation with Audio
 
ON THE RELEVANCE OF QUERY EXPANSION USING PARALLEL CORPORA AND WORD EMBEDDING...
ON THE RELEVANCE OF QUERY EXPANSION USING PARALLEL CORPORA AND WORD EMBEDDING...ON THE RELEVANCE OF QUERY EXPANSION USING PARALLEL CORPORA AND WORD EMBEDDING...
ON THE RELEVANCE OF QUERY EXPANSION USING PARALLEL CORPORA AND WORD EMBEDDING...
 

More from butest

EL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEEL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEbutest
 
1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALbutest
 
Timeline: The Life of Michael Jackson
Timeline: The Life of Michael JacksonTimeline: The Life of Michael Jackson
Timeline: The Life of Michael Jacksonbutest
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALbutest
 
Com 380, Summer II
Com 380, Summer IICom 380, Summer II
Com 380, Summer IIbutest
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet JazzThe MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazzbutest
 
MICHAEL JACKSON.doc
MICHAEL JACKSON.docMICHAEL JACKSON.doc
MICHAEL JACKSON.docbutest
 
Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1butest
 
Facebook
Facebook Facebook
Facebook butest
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...butest
 
Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...butest
 
NEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTNEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTbutest
 
C-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docC-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docbutest
 
MAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docMAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docbutest
 
Mac OS X Guide.doc
Mac OS X Guide.docMac OS X Guide.doc
Mac OS X Guide.docbutest
 
WEB DESIGN!
WEB DESIGN!WEB DESIGN!
WEB DESIGN!butest
 

More from butest (20)

EL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEEL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBE
 
1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
 
Timeline: The Life of Michael Jackson
Timeline: The Life of Michael JacksonTimeline: The Life of Michael Jackson
Timeline: The Life of Michael Jackson
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
 
Com 380, Summer II
Com 380, Summer IICom 380, Summer II
Com 380, Summer II
 
PPT
PPTPPT
PPT
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet JazzThe MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
 
MICHAEL JACKSON.doc
MICHAEL JACKSON.docMICHAEL JACKSON.doc
MICHAEL JACKSON.doc
 
Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1
 
Facebook
Facebook Facebook
Facebook
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...
 
Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...
 
NEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTNEWS ANNOUNCEMENT
NEWS ANNOUNCEMENT
 
C-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docC-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.doc
 
MAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docMAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.doc
 
Mac OS X Guide.doc
Mac OS X Guide.docMac OS X Guide.doc
Mac OS X Guide.doc
 
hier
hierhier
hier
 
WEB DESIGN!
WEB DESIGN!WEB DESIGN!
WEB DESIGN!
 

[ ] uottawa_copeck.doc

  • 1. Learning How Best to Summarize Terry COPECK, Stan SZPAKOWICZ, Nathalie JAPKOWICZ School of Information Technology & Engineering University of Ottawa 800 King Edward, P.O. Box 450 Stn. A Ottawa, Ontario, Canada K1N 6N5 {terry, szpak, nat}@site.uottawa.ca Abstract practical reasons and with few hypotheses in play. From the outset it was clear that the approach we We present a summarizer that uses envisioned could be implemented in large part by alternative modules to perform its key tasks. combining existing NLP research software which Such a configurable summarizer benefits from machine learning techniques that help were either in the public domain or freely discover which configuration performs best available from their authors. It suited our purposes for documents in general or for those not to reinvent any more wheels than really characterized by particular lexical features. necessary. It also seemed a good idea to use We determine the quality of an software which is known to the NLP research automatically generated summary by community and has already been subject to its computing a measure of its coverage of the scrutiny. We therefore conceived of our content phrases in a model summary known summarization system as a testbed in which we to be of good quality. Machine learning could evaluate how well alternative black boxes demonstrates significant improvement in this measure over the average value of developed by others cooperate as they perform one summaries produced from all configuration of the key tasks involved in our method of parameter settings. summarization. The rationale inherent in this architecture has itself in due course provided a We perform text summarization by extracting the research agenda. most representative sentences from a single We assumed that a summary composed of document. Such summaries tend to be informative, neighboring, thematically-related sentences should rather than indicative; and generic, rather than read better, and perhaps also convey more biased towards the interests or purposes of any information, than one in which each sentence is reader. This is how we situate our work in the independently chosen from the document as a gradually emerging classification of summary whole and thus likely bears no relation to its types which was recently recapitulated in Kan, neighbours. This assumption led to a process in Klavans and McKeown (2002). which the text to be summarized is first broken into segments, which are sequences of adjacent 1 System Design sentences talking about the same topic. In a Our work on summarization began several years separate operation keyphrases are extracted from ago in the context of a larger project with multiple the whole text and ranked. Each sentence in the objectives, in which summary generation would be text is then rated according to the number of one of the tools. We therefore came to this task for highly-ranked keyphrases which it contains, and
2 Processing

Because our summarizer was at first designed to be a functional component of a larger project, it was necessary to handle real-world texts which, although coded in plain ASCII, may have a variety of extraneous formatting: pagination, embedded figures and tables and, most crucially, columnar format. Although it took some effort to install the black boxes in wrappers that brought them to a common interface, most of our attention was spent on translating input text to the "NLP normal" form required by most of the black boxes: one sentence per line, one extra line between paragraphs. This form is also made necessary by the task itself; sentence extraction presumes a text in which sentences are identified correctly.
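A minimal sketch of that conversion, assuming input whose paragraphs are separated by blank lines; the regular-expression sentence splitter here is a crude stand-in for real sentence boundary detection, and the actual wrappers had to cope with far messier input.

    import re

    def to_nlp_normal(text):
        """Convert text to 'NLP normal' form: one sentence per line, one
        blank line between paragraphs. Sentences are assumed to end in
        . ! or ? followed by whitespace, which is only an approximation."""
        paragraphs = re.split(r"\n\s*\n", text.strip())
        blocks = []
        for para in paragraphs:
            flat = " ".join(para.split())            # undo hard line wraps
            sentences = re.split(r"(?<=[.!?])\s+", flat)
            blocks.append("\n".join(s for s in sentences if s))
        return "\n\n".join(blocks)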
Recognizing sentence and paragraph boundaries, and detecting extraneous material such as tables interpolated into the flow of text, are legitimate research issues in themselves. They are, however, not inherently part of summarization, so we will not discuss them further here.

In the course of processing input text prior to summarization we create a data model of the document which records its various lexical features on a per-sentence basis, as well as details of sentence and paragraph order. This record is the basis for the document characterization which plays a role in the summary evaluation discussed in Section 4.

The key tasks for which we used alternative external programs are document segmentation, keyphrase extraction, and phrase matching. The Columbia University Segmenter (Kan, Klavans and McKeown 1998), Hearst's TextTiling program (1997) and the C99 program (Choi 2000) were used to divide a document into segments. A fourth option implemented recently turns off segmentation and treats the document as a single segment. A moment's thought shows that this is the control case for our hypothesis about the benefit of locality in sentence selection; picking the highest-rated sentences from the entire text defeats locality. To extract keyphrases we used Waikato University's Kea (Witten, Paynter, Frank, Gutwin and Nevill-Manning 1999), the Extractor program from the Canadian NRC's Institute for Information Technology (Turney 2000), and NPSeek, a program developed at the University of Ottawa (Barker and Cornacchia 2000). The Kea program suite also provided modules to match phrases exactly and in stemmed form. Figure 1 shows the architecture of the summarization program constructed around these black boxes (stemmed matching is dimmed deliberately).

[Figure 1. Summarizer Program Architecture. Text passes through a segmenter (Segmenter, TextTiling, C99, or none) and a keyphrase extractor (Kea, Extractor, or NPSeek); phrase matching (exact or stemmed, the latter dimmed) feeds locality-based sentence selection, which produces the summary.]

3 Operational Parameters

To run, our program must be told which black box to use for each of the three key tasks where we have a choice. Three other aspects of the process or the task have been judged important enough to require parametrization: 1) the size of the summary, in sentences, in words, or as a proportion of the document; 2) the number of keyphrases to use in rating sentences and segments; 3) the threshold number of keyphrase matches needed for a sentence to qualify as a candidate for a summary. There are 24 unique specifications for the black boxes alone; each of the three additional parameters takes a count rather than an enumerated value, so it is quite feasible to produce hundreds if not thousands of summaries for a single document.

This is computationally expensive. Results from DUC 2001 led us to shrink this space by dispensing with stemmed matching (which for this reason is shown dimmed in Figure 1), fixing the keyphrase threshold at two, and limiting the range of the other two count parameters, summary size and number of keyphrases.
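The size of this space is easy to check by enumeration. In the sketch below the module names are those listed above, while the count ranges are invented for illustration only:

    from itertools import product

    # Four segmenters x three keyphrase extractors x two phrase matchers
    # give the 24 unique black-box specifications.
    SEGMENTERS = ["Segmenter", "TextTiling", "C99", "none"]
    EXTRACTORS = ["Kea", "Extractor", "NPSeek"]
    MATCHERS   = ["exact", "stem"]

    # The three count-valued parameters; these ranges are illustrative only.
    SUMMARY_SIZES    = [5, 10, 15]     # sentences per summary
    KEYPHRASE_COUNTS = [5, 10, 20]     # keyphrases used in rating
    THRESHOLDS       = [1, 2, 3]       # matches needed to qualify

    configs = list(product(SEGMENTERS, EXTRACTORS, MATCHERS,
                           SUMMARY_SIZES, KEYPHRASE_COUNTS, THRESHOLDS))
    print(len(configs))  # 24 * 27 = 648 summaries for a single document

Fixing the matcher and the threshold, as was done after DUC 2001, leaves only the segmenter, the extractor and the two remaining counts to vary.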
4 Evaluation

How good are any of these summaries? Answering that question has proven to be quite difficult (Goldstein, Kantrowitz, Mittal and Carbonell 1999), and it is currently the object of intense research interest, of which the two Document Understanding Conferences (DUC) sponsored by NIST are notable examples. Summary evaluation has become a research goal in itself.

A variety of statistical measures of summary content (kappa, relative utility, and the traditional IR metrics of precision and recall) are computed in systems like the MEAD Eval toolkit (Radev, Teufel, Lam and Saggion 2002) or SEE (Lin 2001), which automate or assist the assessment of summaries against the original texts on which they are based. We have, however, chosen to compare our summaries against the gold standard of summaries written by human authors. Such summaries are not easily come by. We had therefore initially settled on academic papers as a subject genre[1] and adopted the authors' abstracts which begin such papers as a practical approximation to the gold standard summary we seek, an approach proposed by Mittal, Kantrowitz, Goldstein and Carbonell (1999). Who better than the author knows the intended content of a paper?

[1] Initially we used a one-million-word corpus of 75 academic papers published between 1993 and 1996 in the online Journal of Artificial Intelligence Research.
We use a variant of the keyphrase extraction technique used in the summarization program to evaluate a summary against the reference summary. That gold standard is broken down into a set of unique content keyphrases by processing it against a stoplist of 980 closed-class and common words. The resulting keyphrases in abstract (KPiAs) differ from those used to produce the summary under evaluation in that they compose a comprehensive description of the abstract, rather than being limited to a small number of the most highly ranked ones.

A summary is ranked according to the degree to which it covers the set of KPiAs. Its score is computed using the following formula:

    score = Σ (1.0 × KPiA_unique + 0.5 × KPiA_duplicate)

The formula computes a total for the summary, assigning a weight of 1.0 to each KPiA which is added to the coverage set and 0.5 to each match that repeats a KPiA already in the set.
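A sketch of this coverage score follows, with a toy stoplist standing in for the 980-entry list; content phrases are taken, as described above, to be maximal runs of words not on the stoplist.

    # Toy stoplist; the real system used 980 closed-class and common words.
    STOPLIST = {"the", "a", "an", "of", "to", "and", "in", "is", "are",
                "we", "it", "this", "that", "by", "for", "with"}

    def content_phrases(text):
        """Break text into content phrases: maximal runs of words that are
        not on the stoplist."""
        phrases, run = set(), []
        for word in text.lower().split():
            word = word.strip(".,;:!?()\"'")
            if not word or word in STOPLIST:
                if run:
                    phrases.add(" ".join(run))
                run = []
            else:
                run.append(word)
        if run:
            phrases.add(" ".join(run))
        return phrases

    def kpia_score(summary_sentences, abstract):
        """Sum 1.0 for each KPiA added to the coverage set and 0.5 for each
        match that repeats a KPiA already in the set."""
        kpias = content_phrases(abstract)
        covered, score = set(), 0.0
        for sentence in summary_sentences:
            for phrase in content_phrases(sentence):
                if phrase in kpias:
                    score += 0.5 if phrase in covered else 1.0
                    covered.add(phrase)
        return score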
5 Learning

The combination of a wholly automated evaluation technique and a large number of instances to which to apply it is a textbook application for machine learning. Accordingly, we embedded the summarization program in a framework which automates its operation and the evaluation of its results, and then applies the C5.0 decision tree and rule classifier (Quinlan 2002) to infer rules about which parameter settings produce highly-rated or poorly-rated summaries. This architecture appears in Figure 2.

[Figure 2. Text Summarization Learner Architecture. A generation controller drives the summary generator over the text to be summarized; a summary assessor scores each summary against the abstract and document features, and a parameter assessor turns the discretized, normalized ratings into best parameter values, with a feedback loop planned for the future.]

To permit comparisons between documents, each summary KPiA rating is normalized by dividing it by the total count of unique KPiAs found in the same number of sentences ranked highest in the document on KPiA count. This establishes the highest possible rating for a summary of a given size. The set of ratings for summaries of all documents is then discretized into the five values verybad, bad, medium, good and verygood using MacQueen's (1967) k-means clustering technique.[2] C5.0 then induces rules based on these ratings and parameter values. Much of the learner's operation has been automated, most recently the identification of best parameters in the C5.0 output and the application of k-means clustering; the job of completely automating operation of the system is under way.

[2] C5.0 requires a discrete outcome variable. The use of five values was determined heuristically.

Although our first experiments involved a data vector containing only the settings of the six parameters that regulate the production of summaries, the availability of the document data model soon led us to add the features listed in Table 1 in order to characterize the document being summarized. These features are all counts: the number of characters, words, sentences and paragraphs in the document, and the number of other syntactic elements such as KPiAs, connectives and proper nouns. We also count content phrases (substrings between stoplist entries), bigrams (word pairs) and content bigrams, recording how many of these latter three primitive lexical items occur once, twice, three times, or more than three times in the document.

  FEATURE   DESCRIPTION
  chars     number of characters in the document
  words     number of words in the document
  sents     number of sentences in the document
  paras     number of paragraphs in the document
  kpiacnt   number of KPiA instances
  conncnt   number of (Marcu) connectives
  pncnt     number of proper nouns and acronyms
  contcnt   number of content phrases
  cphr      number of 1-instance content phrases
  cphr2     number of 2-instance content phrases
  cphr3     number of 3-instance content phrases
  cphr4     number of 4-or-more-instance content phrases
  cbig      number of 1-instance content bigrams
  cbig2     number of 2-instance content bigrams
  cbig3     number of 3-instance content bigrams
  cbig4     number of 4-or-more-instance content bigrams
  abig      number of 1-instance bigrams
  abig2     number of 2-instance bigrams
  abig3     number of 3-instance bigrams
  abig4     number of 4-or-more-instance bigrams

Table 1. Document Features

Adding features has the effect of asking C5.0 to discover rules identifying which parameter values produce good summaries for documents with certain features, rather than for documents in general. Active investigation over the last year has not yet identified additional features which might be used effectively to characterize documents.

Our experiments clearly show the benefit of machine learning. Choosing parameter values randomly produces an average rating of 2.79, the medium value to be expected from a population of normalized values. When the best parameter settings for each text are chosen, however, the average rating increases two full grades, to 4.78 or verygood.
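The normalization and discretization steps described above might be sketched as follows; scikit-learn's KMeans stands in for MacQueen's k-means, and the class boundaries it finds depend on the data.

    import numpy as np
    from sklearn.cluster import KMeans   # stand-in for MacQueen's k-means

    LABELS = ["verybad", "bad", "medium", "good", "verygood"]

    def normalize(score, best_possible):
        """Divide a raw KPiA score by the highest total achievable by any
        summary of the same size (the top sentences by KPiA count)."""
        return score / best_possible if best_possible else 0.0

    def discretize(ratings):
        """Map normalized ratings onto the five ordered classes with
        one-dimensional k-means clustering."""
        data = np.asarray(ratings, dtype=float).reshape(-1, 1)
        km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(data)
        # Order the clusters by centre so that 0 is worst and 4 is best.
        order = np.argsort(km.cluster_centers_.ravel())
        rank = {cluster: i for i, cluster in enumerate(order)}
        return [LABELS[rank[c]] for c in km.labels_]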
6 Document Understanding Conference

We participated in the Document Understanding Conferences in 2001 (Copeck, Japkowicz and Szpakowicz 2002) and 2002. In DUC, NIST first provides participants with a corpus of training documents together with abstracts written by skilled editors, and then a set of similar test documents which each participant is to summarize in a manner consistent with what can be inferred from the training set. Our participation involved a slight modification to our learning system. Rather than constructing rules that relate features and parameters to KPiA ratings, C5.0 used the training data to build a classification tree recording these same relationships. The test data, which is composed of document features only, was then classified against this structure, and the parameters that gave the best summary for the training document most similar to each test document were used to produce the summaries submitted to the conference. Figure 3 depicts the process.

[Figure 3. Methodology for DUC 2002. The TSL system induces rules from the training data, derives parameter settings for the test data, and the generator produces the submission.]

Ten-fold cross-validation on the training data shows a 16.3% classification error (std. dev. 0.01). The 5×5 confusion matrix further indicates that 80% of misclassifications were to the adjacent value in the scale. This suggests that the rules are reasonably trustworthy. The test run showed that they are productive, and that the ratings they seek to maximize are well-founded.

For DUC 2002 we added adaptive boosting (Freund and Schapire 1996) to the learning process. Adaptive boosting, or ADABOOST, improves the classification process by iteratively generating a number of classifiers from the data, each optimized to correctly classify the cases most obviously misclassified on the previous pass. The votes of a population of these classifiers generally provide better outcomes than do the decisions of any single classifier.

NIST provides participants with a full suite of submissions and analysis material after each event, as an invitation to get involved in judging results. We used the human summaries of the 2001 test documents to augment the training data for 2002 from 308 to 590 documents. Our own evaluation of our 2001 results varied from that computed by Paul Over (2001) for no obvious reason, so we held back from reporting results for DUC 2002 at the July workshop on automatic summarization held in conjunction with ACL-02. In 2001 we did well on single-document extractive summaries and worse than average on multi-document summaries. Because our technique is not particularly suited to summarizing more than one document at a time, and because we are not entirely comfortable with the notion of incorporating material from more than one document in a single summary, we chose to participate only in the single-document track in 2002.

In DUC 2002 our submission ranked somewhat below the middle of the pack of 13 contributors in the single-document track. This is lower than our performance the previous year, though ranking alone is not a clear indication of a relative decrease in performance.
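In outline, the DUC 2002 procedure amounts to fitting a boosted classifier on pairs of document features and best parameter settings from the training corpus, and applying it to the features of the test documents. The sketch below uses scikit-learn's decision tree and ADABOOST (recent scikit-learn parameter names) as stand-ins for boosted C5.0, and simplifies the tree-mediated similarity matching to direct prediction of the best setting.

    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.tree import DecisionTreeClassifier

    def pick_test_parameters(train_features, train_best_settings,
                             test_features):
        """Predict, for each test document, the parameter setting that
        produced the best summary for the most similar training documents.

        train_features:      per-document counts (the Table 1 features)
        train_best_settings: identifier of the best setting per document
        test_features:       the same counts for the test documents
        """
        classifier = AdaBoostClassifier(
            estimator=DecisionTreeClassifier(max_depth=5),
            n_estimators=25, random_state=0)
        classifier.fit(train_features, train_best_settings)
        return classifier.predict(test_features)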
A measure used in both the training phase and the test phase offers insights into the state of the overall process. We compute the discretized KPiA rating for all summaries of documents in the training set. We also infer from the classification tree a KPiA rating for each summary of every document in the test set. The distribution of the five rating values across any set of summaries establishes the limits of the range of possible outcomes. A comparison of KPiA distributions for DUC 2001 with their counterparts for DUC 2002 appears in Table 2. Rows show the percentage of summaries rated, or inferred to be, in the indicated class. Each column concludes with the average KPiA rating for all its summaries.

                     2001                  2002
  RATING      TRAIN  TEST  SUB      TRAIN  TEST  SUB
  verygood     11%   14%   79%       13%   15%   77%
  good         12%   21%   21%       18%    1%    2%
  medium       20%   19%    -        24%    3%    -
  bad          18%   21%    -        26%    4%   12%
  verybad      39%   25%    -        19%   80%    7%
  AVERAGE     2.35  2.79  4.79      2.80  1.68  4.31

Table 2. KPiA Summary Rating Distributions in DUC Corpora

In 2001 the test and training data were similar, and machine learning, reflected in Table 2 by the rise from the TEST average of 2.79 to the SUB average of 4.79, was able to find settings for each test document that produced at least a good summary.

Contrast this with the situation in 2002. The training data ratings are akin to those for the 2001 data, but the ratings of the test data are widely at variance with those of the three other document collections. Fully 80% of the summaries of the test data collection are inferred to be verybad. Despite machine learning outperforming its 2001 counterpart, there were simply no well-rated summaries that could be produced for approximately 22% of the 2002 test documents. This situation almost guaranteed the low evaluation produced by the NIST judges.

The most obvious explanation for these circumstances would be an inconsistency between the 2002 training and test data. Our system would exhibit this in the values it computes for the 19 features used to model each document for machine learning. However, despite the presence of outliers in each set of documents (texts much longer or shorter than the average), inspection shows that the distribution of each feature in one collection resembles its counterpart in the other to a far greater degree than do the two problematic rating distributions. In addition, where individual features of a document are related to one another, distributions in the two collections also differ in consistent ways. In sum, the test and training collections appear to resemble one another to the degree one would expect.

We have therefore ruled out the most obvious explanation for our results. Although we have not yet been able to find a reason why our 2002 test summaries had such low ratings, we continue to investigate.
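One way to automate the inspection of feature distributions is a per-feature two-sample Kolmogorov-Smirnov test; the sketch below is illustrative only and is not the procedure used for the comparison reported above.

    from scipy.stats import ks_2samp   # two-sample Kolmogorov-Smirnov test

    def compare_collections(train, test, names):
        """Flag features whose training and test distributions differ.

        train, test: arrays of shape (n_documents, n_features)
        names:       the feature names from Table 1
        """
        for j, name in enumerate(names):
            stat, p = ks_2samp(train[:, j], test[:, j])
            verdict = "DIFFERS" if p < 0.01 else "similar"
            print(f"{name:>8}  KS={stat:.3f}  p={p:.3g}  {verdict}")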
7 Conclusions and Future Work

We intend in future:

• to complete automating the operation of the entire text summarization learning system;
• to continue to look for lexical features which describe documents effectively. In this regard, commercial keyphrasers identify entities such as place, city, state and country; company, organization and group; day and year; percent and measure; person; and property. Some of these may merit recording;
• to investigate the SEE and MEAD Eval summarization evaluators;
• to merge the output sets of two or all three keyphrasers, providing new alternatives. To what degree do the keyphrasers agree in their selections? How would merging sets of keyphrases be accomplished? What effect would use of a merged set have on summary output?
• to consider new techniques for sentence selection, which may involve factors other than keyphrase matching. For instance, words marking anaphora may make a sentence less well suited to a summary;
• to continue looking to add new segmenters and keyphrasers to our system;
• to investigate further the role of segmentation and locality in text summarization.

Acknowledgements

This work has been supported by the Natural Sciences and Engineering Research Council of Canada and by our University.

References

Barker, K. and N. Cornacchia. 2000. Using Noun Phrase Heads to Extract Document Keyphrases. Proc 13th Conf of the CSCSI, AI 2000 (LNAI 1822), Montreal, 40-52.

Choi, F. 2000. Advances in domain independent linear text segmentation. Proc ANLP/NAACL-00, 26-33.

Copeck, T., N. Japkowicz and S. Szpakowicz. 2002. Text Summarization as Controlled Search. Proc 15th Conf of the CSCSI, AI 2002 (LNAI 2338), Calgary, 268-280.

Freund, Y. and R. Schapire. 1996. Experiments with a new boosting algorithm. Proc 13th ICML, Morgan Kaufmann, 325-332.

Goldstein, J., M. Kantrowitz, V. Mittal and J. Carbonell. 1999. Summarizing Text Documents: Sentence Selection and Evaluation Metrics. Proc ACM/SIGIR-99, 121-128.

Hearst, M. 1997. TextTiling: Segmenting text into multi-paragraph subtopic passages. Computational Linguistics 23 (1), 33-64.

Kan, M.-Y., J. Klavans and K. McKeown. 1998. Linear Segmentation and Segment Significance. Proc WVLC-6, 197-205.

Kan, M.-Y., J. Klavans and K. McKeown. 2002. Using the Annotated Bibliography as a Resource for Indicative Summarization. Proc LREC 2002, Las Palmas, Spain, 1746-1752.

Lin, C.-Y. 2001. SEE: Summary Evaluation Environment. www.isi.edu/~cyl/SEE/.

MacQueen, J. 1967. Some methods for classification and analysis of multivariate observations. Proc Fifth Berkeley Symposium on Mathematical Statistics and Probability, (1), 281-297.

Mittal, V., M. Kantrowitz, J. Goldstein and J. Carbonell. 1999. Selecting Text Spans for Document Summaries: Heuristics and Metrics. Proc AAAI-99, 467-473.

Over, P. 2001. Introduction to DUC 2001: an Intrinsic Evaluation of Generic News Text Summarization Systems. www-nlpir.nist.gov/projects/duc/duc2001/pauls_slides/duc2001.ppt.

Quinlan, J.R. 2002. C5.0: An Informal Tutorial. www.rulequest.com/see5-unix.html.

Radev, D., S. Teufel, W. Lam and H. Saggion. 2002. Automatic Summarization of Multiple Documents (MEAD) Project. www.clsp.jhu.edu/ws2001/groups/asmd/.

Turney, P. 2000. Learning algorithms for keyphrase extraction. Information Retrieval, 2 (4), 303-336.

Witten, I.H., G. Paynter, E. Frank, C. Gutwin and C. Nevill-Manning. 1999. KEA: Practical automatic keyphrase extraction. Proc DL-99, 254-256.