WikiAsp: A Dataset for Multi-domain Aspect-based Summarization
Hiroaki Hayashi1, Prashant Budania2, Peng Wang2, Chris Ackerson2, Raj Neervannan2, Graham Neubig1
1Language Technologies Institute, Carnegie Mellon University
2AlphaSense
Adapting Generation Models to Various User Needs
• Same source, different user needs
[Figure: user needs for two sources — "Attention Is All You Need" (Method, Multi-head Attention, Learning Rate, Experiments) and Barack Obama (Presidency, Early Life, Spouse, Degree)]
2
Aspect: Definition
Semantic property that specifies a subset of information about an object
• Same source, different user needs
• Product review (Amazon)
  • Object: "PC"
  • Aspect: "processing speed"
3
Aspect: Definition
Semantic property that specifies a subset of information about an object
• Same source, different user needs
• Wikipedia article (Wikipedia)
  • Object: Barack Obama
  • Aspect: Education
4
Aspect: Definition
Semantic property that specifies a subset of information about an object
• Same source, different user needs
• Scientific paper [Vaswani+17]
  • Object: "Attention Is All You Need"
  • Aspect: "Multi-Head Attention"
5
• "Aspects" [Snyder&Barzilay+07]


• Aspect Ranking


• Restaurant reviews
• "Keywords" [Gamon+05]


• Aspect-based Sentiment Analysis


• Customer Feedback
• Aspect-based Summarization


• Proposed by [Titov&McDonald08]


• Customer review summarization
• "Feature" [Hu&Liu04]


• Extracting customer reviews


• Focused on a product "feature"
Evolution of Aspects - Same use, Different Terms
6
1972 1983 1992 2004 2005 2007 2008
Evolution of Aspects - Task Developments
• Aspect-based Summarization
  • Focuses on topic-level content control, with limited application domains
  • Product reviews [Angelidis&Lapata+18], movie reviews [Wang&Ling16], news [Krishna&Srinivasan18]
• Abstractive Question Answering
  • Emphasizes obtaining the most relevant "answer" rather than comprehensive text
  • Wikipedia [Lewis+20]
[Timeline: 1972, 1983, 1992, 2004, 2005, 2007, 2008, 2016, 2020]
7
State of Aspect-based Summarization
• Generates targeted summaries from different perspectives
  • Perspective: subtopic, attribute
• Mostly studied in product review domains
  • e.g., summarize PC reviews in terms of price;
    summarize reviews of Cafe 33* in terms of taste
• Lacks domain diversity
*Taiwanese bistro in Pittsburgh
8
How can we overcome the dataset scarcity in diverse domains?
Wikipedia: Large and Domain-diverse Dataset for Summarization
• Human authors compose Wikipedia articles from references
• Covers various domains from an encyclopedic standpoint
• Previously formulated as a summarization task [Liu+18]
  • Input: cited (web) references
  • Output: lead section of a Wikipedia article
[Figure: references → lead section of a Wikipedia article]
9
Summarizing References According to Aspects
• Each Wikipedia section serves as an aspect-based summary of the references (see the data sketch below)
  • e.g. "Early life and career", "Presidency", "Legacy", ... of Barack Obama
[Figure: references → aspects → aspect-based summaries]
10
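To make this formulation concrete, here is a minimal sketch of how one instance could be represented; the class and field names are illustrative assumptions, not the dataset's actual schema.

```python
from dataclasses import dataclass

@dataclass
class WikiAspExample:
    """One article: cited references paired with per-section targets.
    Names here are illustrative, not the dataset's actual schema."""
    references: list[str]           # cited web documents (multi-document input)
    targets: list[tuple[str, str]]  # (aspect, summary) pairs, one per section

example = WikiAspExample(
    references=["...text of cited reference 1...", "...text of reference 2..."],
    targets=[
        ("early life and career", "Barack Obama was born in Honolulu ..."),
        ("presidency", "Obama was inaugurated as the 44th president ..."),
    ],
)
```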
Data Collection
1. Collect base set of documents
   • Collect (references, article) pairs from WikiSum [Liu+18]
2. Determine domains
   • 20 DBPedia ontology classes constructed bottom-up
3. Find salient aspects (frequency-based selection sketched below)
   • 10 section titles that mostly describe textual content
   • e.g. ✘ "results"  āœ” "background"

Aspect      Count  Take?
Background  4326   āœ”
Aftermath   3092   āœ”
Results     2733   ✘
⋮
History     1735   āœ”
⋮

[Figure: DBPedia ontology hierarchy]
11
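A rough sketch of that aspect-selection step: count section titles within a domain and keep the ten most frequent admissible ones. The blocklist here is a hypothetical stand-in for the paper's manual check that a title describes textual content.

```python
from collections import Counter

# Hypothetical stand-in for the manual "describes textual content" check
# (e.g., "results" is rejected, "background" is kept).
NON_TEXTUAL = {"results", "discography", "track listing", "statistics"}

def salient_aspects(articles, k=10):
    """Pick the k most frequent admissible section titles in one domain.
    `articles` is a list of articles, each a list of its section titles."""
    counts = Counter(t.lower() for sections in articles for t in sections)
    kept = [(t, c) for t, c in counts.most_common() if t not in NON_TEXTUAL]
    return kept[:k]
```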
WikiAsp Dataset
• 10 pre-defined aspects for each of the 20 domains
• Wider variety of domains than previous work
  • Album, Artist, Building, Company, Event, Infrastructure, ... (full list available in the paper)
• Multi-document inputs (loading sketch below)
12
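For readers who want to inspect the data, a loading sketch; this assumes the mirror on the Hugging Face hub under the id "wiki_asp" with one configuration per domain, which may differ from the release at bit.ly/wikiasp.

```python
# Assumed hub id and config name; verify against the official release.
from datasets import load_dataset

album = load_dataset("wiki_asp", "album")  # one configuration per domain
print(album["train"][0].keys())            # inspect the instance schema
```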
Baseline: Two-stage Model
1. Aspect Extraction
   • Classify aspect-relevant segments from inputs (and group them)
2. Summarization
   • Summarize aspect-grouped segments
(A sketch of the overall pipeline follows below.)
13
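A minimal sketch of how the two stages compose; `extractor` and `summarizer` are stand-ins for the models detailed on the next two slides.

```python
def aspect_based_summaries(sentences, extractor, summarizer, aspects):
    """Two-stage baseline: (1) label each source sentence with zero or more
    aspects and group by aspect, (2) summarize each group independently."""
    groups = {a: [] for a in aspects}
    for sent in sentences:
        for aspect in extractor(sent):      # stage 1: multi-label classification
            groups[aspect].append(sent)
    return {a: summarizer(s) for a, s in groups.items() if s}  # stage 2
```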
Baseline Model - Aspect Extraction
• Multi-label RoBERTa-based classifier
  • Label decisions controlled via a threshold
• Label each sentence in the cited references
• Group labeled sentences into aspect clusters
• Training data: sentences from aspect-based summaries
(Inference sketch below.)
14
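A sketch of thresholded multi-label inference with a RoBERTa classifier; the checkpoint, threshold value, and helper function are illustrative (in practice the classifier is first fine-tuned on the sentences described above).

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

NUM_ASPECTS, THRESHOLD = 10, 0.5  # 10 aspects per domain; threshold is illustrative

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base",
    num_labels=NUM_ASPECTS,
    problem_type="multi_label_classification",  # sigmoid outputs, BCE loss
)

def predict_aspects(sentence: str) -> list[int]:
    """Return the indices of aspects whose probability clears the threshold."""
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = torch.sigmoid(model(**inputs).logits.squeeze(0))
    return (probs > THRESHOLD).nonzero(as_tuple=True)[0].tolist()
```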
Baseline Model - Summarization
• Standard summarization setting
  • Input: aspect-clustered sentences
  • Output: aspect-based summary
• Models
  • TextRank [Barrios+16]: unsupervised extractive model (usage sketch below)
  • BERTAbs [Liu+19]: BERT-based abstractive model, trained on each domain separately
15
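The [Barrios+16] TextRank variant is available in the `summa` package; a minimal usage sketch on one aspect cluster, with illustrative sentences and ratio rather than the paper's exact setup:

```python
from summa.summarizer import summarize

# Stage-1 output for one aspect of one article (illustrative sentences).
aspect_cluster = [
    "The album received widespread critical acclaim on release.",
    "Several reviewers praised its production and pacing.",
    "It was later certified platinum in multiple countries.",
]
summary = summarize("\n".join(aspect_cluster), ratio=0.5)  # keep ~half the sentences
```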
Results: Baseline Performance
• Both models are far from the oracle
  • ROUGE-2 scores for all domains are below 10
• Challenging dataset with highly abstractive summaries
  • ROUGE-2 by BERTAbs: WikiAsp (5.74), Reddit (6.35), PubMed (12.16), XSum (16.33), CNNDM (19.39)
[Figure: ROUGE scores on the test set in the Album and Event domains]
16
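For reference, a minimal sketch of computing such ROUGE scores with the `rouge_score` package; the paper's exact ROUGE configuration may differ.

```python
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = scorer.score(
    target="the gold aspect-based summary ...",      # reference text
    prediction="the system-generated summary ...",
)
print(round(scores["rouge2"].fmeasure, 4))
```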
Results: Inter-domain Comparison
• Domains show distinct characteristics
  • Company, Software: extractive
  • HistoricPlace: highly abstractive
• Difficulty varies across domains
  • Mixed results between the two summarization models
[Figure: ROUGE-1 scores for multiple domains]
17
Aspect-level Evaluation
• BERTAbs (PreSumm) performed best on aspects with recurring patterns
  • e.g. government structure of a town; format of a football tournament (event)
• TextRank performed best on long summaries
  • Maintaining topical coherence across a long summary is challenging for PreSumm
18
Domain-specific Challenges
• Pronoun resolution for opinion-based inputs
  • Some source documents used for aspects like "Public reception" are subjective
  • Handling mixed-person texts is necessary for certain domains
    "A magical album that you can listen to and enjoy many times."
    "I would always suggest this album to anyone."
    [From Wiki references for Discovery (Daft Punk album)]
• Chronological explanation
  • Describing the history of an entity, the timeline of an event, etc. requires chronologically consistent content organization
    "On 13 March 1815, ..., the powers at the Congress of Vienna declared him an outlaw. ...
    As 17 June drew to a close, Wellington's army had arrived at its position at Waterloo."
    [From Wiki article for Battle of Waterloo]
19
Summary
• WikiAsp provides a multi-domain, aspect-based summarization dataset for encyclopedic texts
• Baseline results reveal characteristics of texts from different domains
• Analyses show domain-specific challenges that current aspect-based summarization models suffer from
• Data available at bit.ly/wikiasp
20
Thank you!
