SlideShare a Scribd company logo
1 of 1
Download to read offline
Source A Source B
Visualization
A seed article A6 with left polarity (L)
contains both original and paragraphs
with reused information. For example,
paragraph P3 contains original
information labeled as center-
oriented (C), although the article
belongs to a left-oriented publisher.
On the contrary, paragraph P4 was
reused from article A1 and has
changed its polarity from the original
central to the left.
What’s in the News? Towards Identification of Bias by Commission, Omission, and Source Selection (COSS)
Anastasia Zhukova1, Terry Ruas1, Felix Hamborg2, Karsten Donnay3, Bela Gipp1
Motivation
The literature on media bias has found that editorial choices in the news production process, such as bias by
commission and omission of information and source selection (COSS), strongly affect public perceptions [1].
Journalists often rely on the familiar news sources and fail to independently verify the information they report,
which tends to lead to a lower quality of reporting [11].
Much of the information in articles typically originates from previously published articles, newswire reports, or
press releases [6, 12]. Compared to its sources, which information is included or excluded in an article is typically
opaque to the news reader [8]. Especially when information is reused as paraphrases where different from the
original source wording is used, which eventually leads to biased reporting [10].
Previous approaches have considered the identification of bias by the source selection and commission separately
from bias by omission [3, 9]. Moreover, the existing approaches use simple methods for text reuse, e.g., TFIDF, and
assign a polarity label to an article by propagating the political leaning of an outlet.
1 University of Göttingen, 2 Heidelberg Academy of Science and Humanities, 3 University of Zurich
Research Questions
How can identifying bias by COSS be addressed as a three-fold objective?
How can a methodology benefit from such research areas as paraphrase identification, information flow analysis,
and identification of political leaning?
Pipeline Concept
The pipeline for the identification of bias by COSS has two purposes:
(1) analysis of a given seed article against a collection of event-related articles to identify which parts of it are
original and which are reused
(2) identification of patterns in information flow in a collection of event-related articles to explore a bigger picture
of information reuse
References
[1] Eytan Bakshy, Solomon Messing, and Lada A Adamic. 2015. Exposure to ideologically diverse news and opinion on Facebook. Science 348, 6239 (2015),
1130–1132. [2] Darrell Christian, Paula Froke, Sally Jacobsen, and David Minthorn. 2014. The Associated Press stylebook and briefing on media law. The
Associated Press. [3] Jonas Ehrhardt, Timo Spinde, Ali Vardasbi, and Felix Hamborg. 2021. Omission of Information: Identifying Political Slant via an Analysis of
Co-occurring Entities. In Information between Data and Knowledge. Schriften zur Informationswissenschaft, Vol. 74. Werner Hülsbusch, Glückstadt, 80–93. [4]
Susanne Fengler and Stephan Ruß-Mohl. 2008. Journalists and the informationattention markets: Towards an economic theory of journalism. Journalism 9, 6
(2008), 667–690. [5] Russell Frank. 2003. These crowded circumstances’ when pack journalists bash pack journalism. Journalism 4, 4 (2003), 441–458. [6]
Robert Gaizauskas, Jonathan Foster, Yorick Wilks, John Arundel, Paul Clough, and Scott Piao. 2001. The METER corpus: a corpus for analysing journalistic text
reuse. In Proceedings of the corpus linguistics 2001 conference, Vol. 1. Citeseer. [7] Lukas Gebhard and Felix Hamborg. 2020. The POLUSA dataset: 0.9 M
political news articles balanced by time and outlet popularity. In Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020. 467–468. [8] Felix
Hamborg, Karsten Donnay, and Bela Gipp. 2019. Automated identification of media bias in news articles: an interdisciplinary literature review. International
Journal on Digital Libraries 20, 4 (2019), 391–415. [9] Felix Hamborg, Philipp Meschenmoser, Moritz Schubotz, Philipp Scharpf, and Bela Gipp. 2021. NewsDeps:
Visualizing the Origin of Information in News Articles. Wahrheit und Fake im postfaktisch-digitalen Zeitalter: Distinktionen in den Geistes-und IT-
Wissenschaften (2021), 151–166. [10] Felix Hamborg, Anastasia Zhukova, and Bela Gipp. 2019. Automated Identification of Media Bias by Word Choice and
Labeling in News Articles. In 2019 ACM/IEEE Joint Conference on Digital Libraries (JCDL). 196–205. [11] Jonathan Matusitz and Gerald-Mark Breen. 2012. An
examination of pack journalism as a form of groupthink: A theoretical and qualitative analysis. Journal of Human Behavior in the Social Environment 22, 7
(2012), 896–915. [12] Zizi Papacharissi and Maria de Fatima Oliveira. 2008. News frames terrorism: A comparative analysis of frames employed in terrorism
coverage in US and UK newspapers. The international journal of press/politics 13, 1 (2008), 52–74
Conclusion
We propose a concept of identification of bias by COSS as a joint three-fold
objective. Compared to the previous systems identifying these types of bias via
simple direct text reuse, our system leverages a solid research basis in text
plagiarism detection and recent advances in paraphrase identification with
semantic similarity. Revealed cases of paraphrasing help determine the source of
articles and how reused information changed over time and news articles.
Identification of bias by COSS
Statistical and network analysis aims at identifying patterns of text reuse that may induce bias by COSS.
Analysis of the article’s origin includes determining how many paragraphs originate from which sources with which polarities.
If an article reuses a significant amount of information from neutral sources, such as news agencies, we could conclude that
this article is reliable and unlikely bias-prone. If an article consists of too many slanted paragraphs with no sources, it might
indicate a lack of trustworthiness in this article.
Analyzing the information for bias by the commission includes analysis of which information with which polarity tends to be
reused, how often and how reused paragraphs change polarity, how long information continues being reused after the
original publishing, etc. A
Analyzing patterns of bias by omission includes identifying which parts of the source articles were not picked up by which
articles were excluded from discussions.
Candidate
retrieval
Source retrieval
Text alignment
Graph of text
reuse &
paraphrases
Visualization
query
seed
document
Timestamp
DB of the news
articles
or
Original polarity
label based on the
outlet
Polarity
classification
Identification of bias by
COSS: statistical and network
analysis
POLUSA dataset [7]
Trained classification
model
Polarity labels are
assigned to each
paragraph
Text similarity on the paragraph
level based on
SentenceTransformers
Tempo-oriented
directed graph
Patterns of text reuse that may
induce bias by COSS
Enables researchers and news
readers to explore the extracted
information
t
C
L
R
L
R
C
P1
P2
P3
P4
A1
A2
A3
A5
A4 A6
P2
P1
Seed
document
Contact
Anastasia Zhukova zhukova@gipplab.org
@ana_m_zhukova www.gipplab.org/media-bias
Source B
commission
omission

More Related Content

Similar to What's in the News? Towards Identification of Bias by Commission, Omission, and Source Selection (COSS)

From Broadcast to Netcast - PhD Thesis - Bonchek - 1997
From Broadcast to Netcast - PhD Thesis - Bonchek - 1997From Broadcast to Netcast - PhD Thesis - Bonchek - 1997
From Broadcast to Netcast - PhD Thesis - Bonchek - 1997
Mark Bonchek
 
lable at ScienceDirectComputers in Human Behavior 83 (2018.docx
lable at ScienceDirectComputers in Human Behavior 83 (2018.docxlable at ScienceDirectComputers in Human Behavior 83 (2018.docx
lable at ScienceDirectComputers in Human Behavior 83 (2018.docx
croysierkathey
 
FINAL PRINT -Engagement in the Details - AN ANALYSIS OF READER INTERACTION AC...
FINAL PRINT -Engagement in the Details - AN ANALYSIS OF READER INTERACTION AC...FINAL PRINT -Engagement in the Details - AN ANALYSIS OF READER INTERACTION AC...
FINAL PRINT -Engagement in the Details - AN ANALYSIS OF READER INTERACTION AC...
Nathan J Stone
 
Gatekeeping and Citizen Journalism: A Qualitative Examination of Participator...
Gatekeeping and Citizen Journalism: A Qualitative Examination of Participator...Gatekeeping and Citizen Journalism: A Qualitative Examination of Participator...
Gatekeeping and Citizen Journalism: A Qualitative Examination of Participator...
Amani Channel
 
IAHTopic  Whose work goes into space science How do different .docx
IAHTopic  Whose work goes into space science How do different .docxIAHTopic  Whose work goes into space science How do different .docx
IAHTopic  Whose work goes into space science How do different .docx
florriezhamphrey3065
 

Similar to What's in the News? Towards Identification of Bias by Commission, Omission, and Source Selection (COSS) (20)

The decline in orally negotiated news - Reich 2018
The decline in orally negotiated news - Reich 2018The decline in orally negotiated news - Reich 2018
The decline in orally negotiated news - Reich 2018
 
Era of Sociology News Rumors News Detection using Machine Learning
Era of Sociology News Rumors News Detection using Machine LearningEra of Sociology News Rumors News Detection using Machine Learning
Era of Sociology News Rumors News Detection using Machine Learning
 
From Broadcast to Netcast - PhD Thesis - Bonchek - 1997
From Broadcast to Netcast - PhD Thesis - Bonchek - 1997From Broadcast to Netcast - PhD Thesis - Bonchek - 1997
From Broadcast to Netcast - PhD Thesis - Bonchek - 1997
 
Clustering analysis on news from health OSINT data regarding CORONAVIRUS-COVI...
Clustering analysis on news from health OSINT data regarding CORONAVIRUS-COVI...Clustering analysis on news from health OSINT data regarding CORONAVIRUS-COVI...
Clustering analysis on news from health OSINT data regarding CORONAVIRUS-COVI...
 
Future of journalism
Future of journalismFuture of journalism
Future of journalism
 
A Corpus-Based Critical Discourse Analysis Of Pre-2019 General Elections Repo...
A Corpus-Based Critical Discourse Analysis Of Pre-2019 General Elections Repo...A Corpus-Based Critical Discourse Analysis Of Pre-2019 General Elections Repo...
A Corpus-Based Critical Discourse Analysis Of Pre-2019 General Elections Repo...
 
Detecting subtle text manipulations
Detecting subtle text manipulationsDetecting subtle text manipulations
Detecting subtle text manipulations
 
Manosevitch walker09
Manosevitch walker09Manosevitch walker09
Manosevitch walker09
 
lable at ScienceDirectComputers in Human Behavior 83 (2018.docx
lable at ScienceDirectComputers in Human Behavior 83 (2018.docxlable at ScienceDirectComputers in Human Behavior 83 (2018.docx
lable at ScienceDirectComputers in Human Behavior 83 (2018.docx
 
Citizen jounalism web site complement
Citizen jounalism web site complementCitizen jounalism web site complement
Citizen jounalism web site complement
 
Enriching comic book and graphic novel metadata using Linked Open Data (LOD);...
Enriching comic book and graphic novel metadata using Linked Open Data (LOD);...Enriching comic book and graphic novel metadata using Linked Open Data (LOD);...
Enriching comic book and graphic novel metadata using Linked Open Data (LOD);...
 
A Comparative Analysis Of Print Vs. Online News Media Parveen
A Comparative Analysis Of Print Vs. Online News Media ParveenA Comparative Analysis Of Print Vs. Online News Media Parveen
A Comparative Analysis Of Print Vs. Online News Media Parveen
 
Research methodolgy and legal writing: Content Analysis
Research methodolgy and legal writing: Content AnalysisResearch methodolgy and legal writing: Content Analysis
Research methodolgy and legal writing: Content Analysis
 
Application Of Linguistic Cues In The Analysis Of Language Of Hate Groups
Application Of Linguistic Cues In The Analysis Of Language Of Hate GroupsApplication Of Linguistic Cues In The Analysis Of Language Of Hate Groups
Application Of Linguistic Cues In The Analysis Of Language Of Hate Groups
 
Redistributing journalism: Journalism as a data public and the politics of qu...
Redistributing journalism: Journalism as a data public and the politics of qu...Redistributing journalism: Journalism as a data public and the politics of qu...
Redistributing journalism: Journalism as a data public and the politics of qu...
 
FINAL PRINT -Engagement in the Details - AN ANALYSIS OF READER INTERACTION AC...
FINAL PRINT -Engagement in the Details - AN ANALYSIS OF READER INTERACTION AC...FINAL PRINT -Engagement in the Details - AN ANALYSIS OF READER INTERACTION AC...
FINAL PRINT -Engagement in the Details - AN ANALYSIS OF READER INTERACTION AC...
 
Gatekeeping and Citizen Journalism: A Qualitative Examination of Participator...
Gatekeeping and Citizen Journalism: A Qualitative Examination of Participator...Gatekeeping and Citizen Journalism: A Qualitative Examination of Participator...
Gatekeeping and Citizen Journalism: A Qualitative Examination of Participator...
 
IAHTopic  Whose work goes into space science How do different .docx
IAHTopic  Whose work goes into space science How do different .docxIAHTopic  Whose work goes into space science How do different .docx
IAHTopic  Whose work goes into space science How do different .docx
 
Artificial Intelligence For Investigative Reporting
Artificial Intelligence For Investigative ReportingArtificial Intelligence For Investigative Reporting
Artificial Intelligence For Investigative Reporting
 
A concise review on the role of author self-citations in information science,...
A concise review on the role of author self-citations in information science,...A concise review on the role of author self-citations in information science,...
A concise review on the role of author self-citations in information science,...
 

More from Anastasia Zhukova

M.Sc. Thesis: Automated Identification of Framing by Word Choice and Labeling...
M.Sc. Thesis: Automated Identification of Framing by Word Choice and Labeling...M.Sc. Thesis: Automated Identification of Framing by Word Choice and Labeling...
M.Sc. Thesis: Automated Identification of Framing by Word Choice and Labeling...
Anastasia Zhukova
 
Automated Identification of Framing by Word Choice and Labeling to Reveal Med...
Automated Identification of Framing by Word Choice and Labeling to Reveal Med...Automated Identification of Framing by Word Choice and Labeling to Reveal Med...
Automated Identification of Framing by Word Choice and Labeling to Reveal Med...
Anastasia Zhukova
 
Interpretable and Comparative Textual Dataset Exploration Using Near-Identity...
Interpretable and Comparative Textual Dataset Exploration Using Near-Identity...Interpretable and Comparative Textual Dataset Exploration Using Near-Identity...
Interpretable and Comparative Textual Dataset Exploration Using Near-Identity...
Anastasia Zhukova
 
Towards Evaluation of Cross-document Coreference Resolution Models Using Data...
Towards Evaluation of Cross-document Coreference Resolution Models Using Data...Towards Evaluation of Cross-document Coreference Resolution Models Using Data...
Towards Evaluation of Cross-document Coreference Resolution Models Using Data...
Anastasia Zhukova
 
Concept Identification of Directly and Indirectly Related Mentions Referring ...
Concept Identification of Directly and Indirectly Related Mentions Referring ...Concept Identification of Directly and Indirectly Related Mentions Referring ...
Concept Identification of Directly and Indirectly Related Mentions Referring ...
Anastasia Zhukova
 
XCoref: Cross-document Coreference Resolution in the Wild
XCoref: Cross-document Coreference Resolution in the WildXCoref: Cross-document Coreference Resolution in the Wild
XCoref: Cross-document Coreference Resolution in the Wild
Anastasia Zhukova
 

More from Anastasia Zhukova (11)

Seminar Paper: Putting News in a Perspective: Framing by Word Choice and Labe...
Seminar Paper: Putting News in a Perspective: Framing by Word Choice and Labe...Seminar Paper: Putting News in a Perspective: Framing by Word Choice and Labe...
Seminar Paper: Putting News in a Perspective: Framing by Word Choice and Labe...
 
M.Sc. Thesis: Automated Identification of Framing by Word Choice and Labeling...
M.Sc. Thesis: Automated Identification of Framing by Word Choice and Labeling...M.Sc. Thesis: Automated Identification of Framing by Word Choice and Labeling...
M.Sc. Thesis: Automated Identification of Framing by Word Choice and Labeling...
 
Talk: Automated Identification of Media Bias by Word Choice and Labeling in N...
Talk: Automated Identification of Media Bias by Word Choice and Labeling in N...Talk: Automated Identification of Media Bias by Word Choice and Labeling in N...
Talk: Automated Identification of Media Bias by Word Choice and Labeling in N...
 
Automated Identification of Framing by Word Choice and Labeling to Reveal Med...
Automated Identification of Framing by Word Choice and Labeling to Reveal Med...Automated Identification of Framing by Word Choice and Labeling to Reveal Med...
Automated Identification of Framing by Word Choice and Labeling to Reveal Med...
 
Putting News in a Perspective: Framing by Word Choice and Labeling
Putting News in a Perspective: Framing by Word Choice and LabelingPutting News in a Perspective: Framing by Word Choice and Labeling
Putting News in a Perspective: Framing by Word Choice and Labeling
 
Interpretable Topic Modeling Using Near-Identity Cross-Document Coreference R...
Interpretable Topic Modeling Using Near-Identity Cross-Document Coreference R...Interpretable Topic Modeling Using Near-Identity Cross-Document Coreference R...
Interpretable Topic Modeling Using Near-Identity Cross-Document Coreference R...
 
Interpretable and Comparative Textual Dataset Exploration Using Near-Identity...
Interpretable and Comparative Textual Dataset Exploration Using Near-Identity...Interpretable and Comparative Textual Dataset Exploration Using Near-Identity...
Interpretable and Comparative Textual Dataset Exploration Using Near-Identity...
 
Towards Evaluation of Cross-document Coreference Resolution Models Using Data...
Towards Evaluation of Cross-document Coreference Resolution Models Using Data...Towards Evaluation of Cross-document Coreference Resolution Models Using Data...
Towards Evaluation of Cross-document Coreference Resolution Models Using Data...
 
Concept Identification of Directly and Indirectly Related Mentions Referring ...
Concept Identification of Directly and Indirectly Related Mentions Referring ...Concept Identification of Directly and Indirectly Related Mentions Referring ...
Concept Identification of Directly and Indirectly Related Mentions Referring ...
 
XCoref: Cross-document Coreference Resolution in the Wild
XCoref: Cross-document Coreference Resolution in the WildXCoref: Cross-document Coreference Resolution in the Wild
XCoref: Cross-document Coreference Resolution in the Wild
 
ANEA: Automated (Named) Entity Annotation for German Domain-Specific Texts
ANEA: Automated (Named) Entity Annotation for German Domain-Specific TextsANEA: Automated (Named) Entity Annotation for German Domain-Specific Texts
ANEA: Automated (Named) Entity Annotation for German Domain-Specific Texts
 

Recently uploaded

HIV AND INFULENZA VIRUS PPT HIV PPT INFULENZA VIRUS PPT
HIV AND INFULENZA VIRUS PPT HIV PPT  INFULENZA VIRUS PPTHIV AND INFULENZA VIRUS PPT HIV PPT  INFULENZA VIRUS PPT
Electricity and Circuits for Grade 9 students
Electricity and Circuits for Grade 9 studentsElectricity and Circuits for Grade 9 students
Electricity and Circuits for Grade 9 students
levieagacer
 
Quantifying Artificial Intelligence and What Comes Next!
Quantifying Artificial Intelligence and What Comes Next!Quantifying Artificial Intelligence and What Comes Next!
Quantifying Artificial Intelligence and What Comes Next!
University of Hertfordshire
 
Isolation of AMF by wet sieving and decantation method pptx
Isolation of AMF by wet sieving and decantation method pptxIsolation of AMF by wet sieving and decantation method pptx
Isolation of AMF by wet sieving and decantation method pptx
GOWTHAMIM22
 

Recently uploaded (20)

Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
 
PHOTOSYNTHETIC BACTERIA (OXYGENIC AND ANOXYGENIC)
PHOTOSYNTHETIC BACTERIA  (OXYGENIC AND ANOXYGENIC)PHOTOSYNTHETIC BACTERIA  (OXYGENIC AND ANOXYGENIC)
PHOTOSYNTHETIC BACTERIA (OXYGENIC AND ANOXYGENIC)
 
Film Coated Tablet and Film Coating raw materials.pdf
Film Coated Tablet and Film Coating raw materials.pdfFilm Coated Tablet and Film Coating raw materials.pdf
Film Coated Tablet and Film Coating raw materials.pdf
 
A Scientific PowerPoint on Albert Einstein
A Scientific PowerPoint on Albert EinsteinA Scientific PowerPoint on Albert Einstein
A Scientific PowerPoint on Albert Einstein
 
Costs to heap leach gold ore tailings in Karamoja region of Uganda
Costs to heap leach gold ore tailings in Karamoja region of UgandaCosts to heap leach gold ore tailings in Karamoja region of Uganda
Costs to heap leach gold ore tailings in Karamoja region of Uganda
 
Introduction and significance of Symbiotic algae
Introduction and significance of  Symbiotic algaeIntroduction and significance of  Symbiotic algae
Introduction and significance of Symbiotic algae
 
Abortion uae unmarried price +27791653574 Contact Us Dubai Abu Dhabi Sharjah ...
Abortion uae unmarried price +27791653574 Contact Us Dubai Abu Dhabi Sharjah ...Abortion uae unmarried price +27791653574 Contact Us Dubai Abu Dhabi Sharjah ...
Abortion uae unmarried price +27791653574 Contact Us Dubai Abu Dhabi Sharjah ...
 
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
Harry Coumnas Thinks That Human Teleportation is Possible in Quantum Mechanic...
 
GBSN - Microbiology (Unit 5) Concept of isolation
GBSN - Microbiology (Unit 5) Concept of isolationGBSN - Microbiology (Unit 5) Concept of isolation
GBSN - Microbiology (Unit 5) Concept of isolation
 
HIV AND INFULENZA VIRUS PPT HIV PPT INFULENZA VIRUS PPT
HIV AND INFULENZA VIRUS PPT HIV PPT  INFULENZA VIRUS PPTHIV AND INFULENZA VIRUS PPT HIV PPT  INFULENZA VIRUS PPT
HIV AND INFULENZA VIRUS PPT HIV PPT INFULENZA VIRUS PPT
 
Electricity and Circuits for Grade 9 students
Electricity and Circuits for Grade 9 studentsElectricity and Circuits for Grade 9 students
Electricity and Circuits for Grade 9 students
 
Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...
Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...
Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...
 
Factor Causing low production and physiology of mamary Gland
Factor Causing low production and physiology of mamary GlandFactor Causing low production and physiology of mamary Gland
Factor Causing low production and physiology of mamary Gland
 
GBSN - Microbiology (Unit 7) Microbiology in Everyday Life
GBSN - Microbiology (Unit 7) Microbiology in Everyday LifeGBSN - Microbiology (Unit 7) Microbiology in Everyday Life
GBSN - Microbiology (Unit 7) Microbiology in Everyday Life
 
SaffronCrocusGenomicsThessalonikiOnlineMay2024TalkOnline.pptx
SaffronCrocusGenomicsThessalonikiOnlineMay2024TalkOnline.pptxSaffronCrocusGenomicsThessalonikiOnlineMay2024TalkOnline.pptx
SaffronCrocusGenomicsThessalonikiOnlineMay2024TalkOnline.pptx
 
Quantifying Artificial Intelligence and What Comes Next!
Quantifying Artificial Intelligence and What Comes Next!Quantifying Artificial Intelligence and What Comes Next!
Quantifying Artificial Intelligence and What Comes Next!
 
EU START PROJECT. START-Newsletter_Issue_4.pdf
EU START PROJECT. START-Newsletter_Issue_4.pdfEU START PROJECT. START-Newsletter_Issue_4.pdf
EU START PROJECT. START-Newsletter_Issue_4.pdf
 
WASP-69b’s Escaping Envelope Is Confined to a Tail Extending at Least 7 Rp
WASP-69b’s Escaping Envelope Is Confined to a Tail Extending at Least 7 RpWASP-69b’s Escaping Envelope Is Confined to a Tail Extending at Least 7 Rp
WASP-69b’s Escaping Envelope Is Confined to a Tail Extending at Least 7 Rp
 
Isolation of AMF by wet sieving and decantation method pptx
Isolation of AMF by wet sieving and decantation method pptxIsolation of AMF by wet sieving and decantation method pptx
Isolation of AMF by wet sieving and decantation method pptx
 
Mining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptxMining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptx
 

What's in the News? Towards Identification of Bias by Commission, Omission, and Source Selection (COSS)

  • 1. Source A Source B Visualization A seed article A6 with left polarity (L) contains both original and paragraphs with reused information. For example, paragraph P3 contains original information labeled as center- oriented (C), although the article belongs to a left-oriented publisher. On the contrary, paragraph P4 was reused from article A1 and has changed its polarity from the original central to the left. What’s in the News? Towards Identification of Bias by Commission, Omission, and Source Selection (COSS) Anastasia Zhukova1, Terry Ruas1, Felix Hamborg2, Karsten Donnay3, Bela Gipp1 Motivation The literature on media bias has found that editorial choices in the news production process, such as bias by commission and omission of information and source selection (COSS), strongly affect public perceptions [1]. Journalists often rely on the familiar news sources and fail to independently verify the information they report, which tends to lead to a lower quality of reporting [11]. Much of the information in articles typically originates from previously published articles, newswire reports, or press releases [6, 12]. Compared to its sources, which information is included or excluded in an article is typically opaque to the news reader [8]. Especially when information is reused as paraphrases where different from the original source wording is used, which eventually leads to biased reporting [10]. Previous approaches have considered the identification of bias by the source selection and commission separately from bias by omission [3, 9]. Moreover, the existing approaches use simple methods for text reuse, e.g., TFIDF, and assign a polarity label to an article by propagating the political leaning of an outlet. 1 University of Göttingen, 2 Heidelberg Academy of Science and Humanities, 3 University of Zurich Research Questions How can identifying bias by COSS be addressed as a three-fold objective? How can a methodology benefit from such research areas as paraphrase identification, information flow analysis, and identification of political leaning? Pipeline Concept The pipeline for the identification of bias by COSS has two purposes: (1) analysis of a given seed article against a collection of event-related articles to identify which parts of it are original and which are reused (2) identification of patterns in information flow in a collection of event-related articles to explore a bigger picture of information reuse References [1] Eytan Bakshy, Solomon Messing, and Lada A Adamic. 2015. Exposure to ideologically diverse news and opinion on Facebook. Science 348, 6239 (2015), 1130–1132. [2] Darrell Christian, Paula Froke, Sally Jacobsen, and David Minthorn. 2014. The Associated Press stylebook and briefing on media law. The Associated Press. [3] Jonas Ehrhardt, Timo Spinde, Ali Vardasbi, and Felix Hamborg. 2021. Omission of Information: Identifying Political Slant via an Analysis of Co-occurring Entities. In Information between Data and Knowledge. Schriften zur Informationswissenschaft, Vol. 74. Werner Hülsbusch, Glückstadt, 80–93. [4] Susanne Fengler and Stephan Ruß-Mohl. 2008. Journalists and the informationattention markets: Towards an economic theory of journalism. Journalism 9, 6 (2008), 667–690. [5] Russell Frank. 2003. These crowded circumstances’ when pack journalists bash pack journalism. Journalism 4, 4 (2003), 441–458. [6] Robert Gaizauskas, Jonathan Foster, Yorick Wilks, John Arundel, Paul Clough, and Scott Piao. 2001. The METER corpus: a corpus for analysing journalistic text reuse. In Proceedings of the corpus linguistics 2001 conference, Vol. 1. Citeseer. [7] Lukas Gebhard and Felix Hamborg. 2020. The POLUSA dataset: 0.9 M political news articles balanced by time and outlet popularity. In Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020. 467–468. [8] Felix Hamborg, Karsten Donnay, and Bela Gipp. 2019. Automated identification of media bias in news articles: an interdisciplinary literature review. International Journal on Digital Libraries 20, 4 (2019), 391–415. [9] Felix Hamborg, Philipp Meschenmoser, Moritz Schubotz, Philipp Scharpf, and Bela Gipp. 2021. NewsDeps: Visualizing the Origin of Information in News Articles. Wahrheit und Fake im postfaktisch-digitalen Zeitalter: Distinktionen in den Geistes-und IT- Wissenschaften (2021), 151–166. [10] Felix Hamborg, Anastasia Zhukova, and Bela Gipp. 2019. Automated Identification of Media Bias by Word Choice and Labeling in News Articles. In 2019 ACM/IEEE Joint Conference on Digital Libraries (JCDL). 196–205. [11] Jonathan Matusitz and Gerald-Mark Breen. 2012. An examination of pack journalism as a form of groupthink: A theoretical and qualitative analysis. Journal of Human Behavior in the Social Environment 22, 7 (2012), 896–915. [12] Zizi Papacharissi and Maria de Fatima Oliveira. 2008. News frames terrorism: A comparative analysis of frames employed in terrorism coverage in US and UK newspapers. The international journal of press/politics 13, 1 (2008), 52–74 Conclusion We propose a concept of identification of bias by COSS as a joint three-fold objective. Compared to the previous systems identifying these types of bias via simple direct text reuse, our system leverages a solid research basis in text plagiarism detection and recent advances in paraphrase identification with semantic similarity. Revealed cases of paraphrasing help determine the source of articles and how reused information changed over time and news articles. Identification of bias by COSS Statistical and network analysis aims at identifying patterns of text reuse that may induce bias by COSS. Analysis of the article’s origin includes determining how many paragraphs originate from which sources with which polarities. If an article reuses a significant amount of information from neutral sources, such as news agencies, we could conclude that this article is reliable and unlikely bias-prone. If an article consists of too many slanted paragraphs with no sources, it might indicate a lack of trustworthiness in this article. Analyzing the information for bias by the commission includes analysis of which information with which polarity tends to be reused, how often and how reused paragraphs change polarity, how long information continues being reused after the original publishing, etc. A Analyzing patterns of bias by omission includes identifying which parts of the source articles were not picked up by which articles were excluded from discussions. Candidate retrieval Source retrieval Text alignment Graph of text reuse & paraphrases Visualization query seed document Timestamp DB of the news articles or Original polarity label based on the outlet Polarity classification Identification of bias by COSS: statistical and network analysis POLUSA dataset [7] Trained classification model Polarity labels are assigned to each paragraph Text similarity on the paragraph level based on SentenceTransformers Tempo-oriented directed graph Patterns of text reuse that may induce bias by COSS Enables researchers and news readers to explore the extracted information t C L R L R C P1 P2 P3 P4 A1 A2 A3 A5 A4 A6 P2 P1 Seed document Contact Anastasia Zhukova zhukova@gipplab.org @ana_m_zhukova www.gipplab.org/media-bias Source B commission omission