Frontiers of
Computational Journalism
Columbia Journalism School
Week 9: Knowledge Representation
November 14, 2018
This class
• Structured Journalism
• Ontologies and Graphs
• Relations from Text
Structured Journalism
Unstructured data
Structured data
Everyblock.com circa 2009
Connected China. Reuters, 2013
Article Metadata
headline
photo
photo caption
byline
photo credit
publication date
dateline
article body
related articles
Schema.org news markup
Overall type of the object on this page, in HTML head
Headline, dateline, date as additions to div/span properties
Byline expressed as nested object (using itemscope) of type schema.org/Person
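As a rough illustration of the markup described on this slide, the sketch below renders article metadata as schema.org microdata. The `render_article` helper and the sample values are hypothetical, made up for this example; `itemscope`, `itemtype`, `itemprop` and the NewsArticle properties `headline`, `dateline`, `datePublished` and `author` are standard microdata / schema.org vocabulary. For brevity the overall type is declared on an enclosing div rather than in the HTML head.

```python
from html import escape


def render_article(headline, dateline, date_published, author_name):
    """Return an HTML fragment carrying schema.org NewsArticle microdata."""
    return (
        '<div itemscope itemtype="https://schema.org/NewsArticle">\n'
        f'  <h1 itemprop="headline">{escape(headline)}</h1>\n'
        f'  <span itemprop="dateline">{escape(dateline)}</span>\n'
        f'  <time itemprop="datePublished" datetime="{escape(date_published)}">'
        f'{escape(date_published)}</time>\n'
        # The byline is a nested object: a Person inside the article.
        '  <span itemprop="author" itemscope itemtype="https://schema.org/Person">\n'
        f'    <span itemprop="name">{escape(author_name)}</span>\n'
        '  </span>\n'
        '</div>'
    )


# Placeholder values, not taken from any real article.
html_fragment = render_article(
    headline="Example Headline",
    dateline="EXAMPLE CITY",
    date_published="2018-11-14",
    author_name="Jane Reporter",
)
print(html_fragment)
```

A search engine that understands microdata can read the headline, date and byline directly from these attributes instead of guessing them from page layout.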
Driving application: “rich snippets”
Schema.org covers not just news but music, restaurants, people, organizations,
reviews, offers...
Snippets, and better searchability generally, are the motivation for Google, Yahoo, and Bing
to push schema.org
Additional metadata from indexing team
In database, but doesn't necessarily make it to HTML.
Application: content navigation
Articles about “Syria”
on NYT topic page
More reliable than simple text
search (because the relevance
algorithm knows a story is
"about" Syria.)
Wall Street is high on Molson Coors Brewing (TAP), expecting it to report earnings that
are up 17.5% from a year ago when it reports its third quarter earnings on Wednesday,
November 7, 2012. The consensus estimate is $1.34 per share, up from earnings of
$1.14 per share a year ago.
The consensus estimate has dipped over the past month, from $1.35, but it’s still up from
the consensus estimate of $1.19 three months ago. For the fiscal year, analysts are
expecting earnings of $3.89 per share. Revenue is projected to eclipse the year-earlier
total of $954.4 million by 31%, finishing at $1.25 billion for the quarter. For the year,
revenue is projected to roll in at $4.04 billion.
The company’s net income has declined in the last two quarters. The company posted
profit falling by 52.8% in the second quarter. This is after it reported a profit decline in the
first quarter by 4.1%.
Automatic story generation (AP/Narrative Science)
Application: automatic stories
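One simple way to produce text like the earnings preview above is to fill a sentence template from structured data. The sketch below is only an illustration of that idea, not the method AP or Narrative Science actually use; the `earnings_preview` function and the field names are hypothetical, while the numbers come from the story above.

```python
def pct_change(new, old):
    """Percentage change from old to new."""
    return (new - old) / old * 100.0


def earnings_preview(d):
    """Fill a one-sentence template from a dict of structured earnings data."""
    change = pct_change(d["eps_estimate"], d["eps_year_ago"])
    direction = "up" if change >= 0 else "down"
    return (
        f"Analysts expect {d['company']} ({d['ticker']}) to report earnings "
        f"of ${d['eps_estimate']:.2f} per share, {direction} {abs(change):.1f}% "
        f"from ${d['eps_year_ago']:.2f} a year ago."
    )


data = {
    "company": "Molson Coors Brewing",
    "ticker": "TAP",
    "eps_estimate": 1.34,   # consensus estimate from the story
    "eps_year_ago": 1.14,   # year-earlier earnings from the story
}
story = earnings_preview(data)
print(story)
```

The 17.5% figure in the story is recovered from the two per-share numbers, which is the point: the sentence is a view onto the structured data.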
Ontologies and Graphs
What objects and relations are available?
Often represented as class hierarchy.
Arrows = “is_a” relation
(Part of) a real ontology, from Cyc
News as relations between entities
“Alice attended the wedding”
attended(alice, wedding)
“IBM was founded in 1911.”
founded(IBM, 1911)
“Hurricane Sandy hit New York”
hit(hurricane_sandy, New_York)
Encode facts as relation(subject,object)
also written (subject relation object)
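A minimal sketch of this encoding in Python, assuming facts are stored as plain tuples. The `Triple` type and the `match`, `functional` and `infix` helpers are names invented for this example.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "relation", "object"])

facts = [
    Triple("alice", "attended", "wedding"),
    Triple("hurricane_sandy", "hit", "New_York"),
]


def match(facts, subject=None, relation=None, obj=None):
    """Return the facts that match the given fields; None is a wildcard."""
    return [
        t for t in facts
        if (subject is None or t.subject == subject)
        and (relation is None or t.relation == relation)
        and (obj is None or t.object == obj)
    ]


def functional(t):
    """Render as relation(subject, object)."""
    return f"{t.relation}({t.subject}, {t.object})"


def infix(t):
    """Render as (subject relation object)."""
    return f"({t.subject} {t.relation} {t.object})"


for t in facts:
    print(functional(t), "  ", infix(t))
print(match(facts, relation="hit"))
```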
Things we could do with this
Question answering
“The granddaughter of which actor starred in E.T.?”
(?x acted-in “E.T.”)(?y is-a actor)(?x granddaughter-of ?y)
Inference
(bob brother-of alice)
(alice mother-of lucy) =>
(bob uncle-of lucy)
Answer questions using inference
“How many executives of publicly-traded Canadian companies died in car
crashes?”
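Both the query and the inference rule on this slide can be sketched with the same small pattern matcher over triples. This is a toy illustration under stated assumptions, not how a production triple store works: the `unify`, `query` and `infer_uncles` functions are invented here, and the Barrymore facts were added for the example (they are not in the slides).

```python
def is_var(term):
    """Variables are written with a leading question mark, as on the slide."""
    return term.startswith("?")


def unify(pattern, fact, bindings):
    """Extend bindings so that pattern equals fact, or return None."""
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_var(p):
            if new.setdefault(p, f) != f:
                return None  # variable already bound to something else
        elif p != f:
            return None
    return new


def query(patterns, facts, bindings=None):
    """Yield every set of variable bindings that satisfies all patterns."""
    bindings = {} if bindings is None else bindings
    if not patterns:
        yield bindings
        return
    for fact in facts:
        b = unify(patterns[0], fact, bindings)
        if b is not None:
            yield from query(patterns[1:], facts, b)


def infer_uncles(facts):
    """(x brother-of y), (y mother-of z) => (x uncle-of z)."""
    rule = [("?x", "brother-of", "?y"), ("?y", "mother-of", "?z")]
    return {(b["?x"], "uncle-of", b["?z"]) for b in query(rule, facts)}


facts = {
    ("bob", "brother-of", "alice"),
    ("alice", "mother-of", "lucy"),
    # Illustrative facts added for this example:
    ("Drew Barrymore", "acted-in", "E.T."),
    ("John Barrymore", "is-a", "actor"),
    ("Drew Barrymore", "granddaughter-of", "John Barrymore"),
}

question = [
    ("?x", "acted-in", "E.T."),
    ("?y", "is-a", "actor"),
    ("?x", "granddaughter-of", "?y"),
]
answers = {b["?y"] for b in query(question, facts)}
print(answers)
print(infer_uncles(facts))
```

Answering a question "using inference" then amounts to adding the inferred triples to the fact set before running the query.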
Every big news org has their own
big ontology 
topics, people, organizations, places...
Enter Linked Data
Triples of (subject relation object), each a URL or literal
<urn:x-states:New%20York>
<http://purl.org/dc/terms/alternative>
"NY"
<http://dbpedia.org/resource/Columbia_University>
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
<http://schema.org/CollegeOrUniversity>
Abbreviations possible with many formats...
<http://dbpedia.org/resource/Columbia_University> rdf:type
ns6:CollegeOrUniversity
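The abbreviation step can be sketched as a simple prefix substitution. This is a hand-rolled illustration rather than a real RDF library; the `abbreviate` helper is hypothetical, and the mapping of `ns6` to `http://schema.org/` is an assumption read off the example above. Literals such as "NY" are not handled.

```python
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "ns6": "http://schema.org/",  # assumption: what ns6 stands for on the slide
}


def abbreviate(uri, prefixes):
    """Shorten a URI to prefix:name if a prefix applies, else keep <uri>."""
    for prefix, base in prefixes.items():
        if uri.startswith(base):
            return f"{prefix}:{uri[len(base):]}"
    return f"<{uri}>"


triple = (
    "http://dbpedia.org/resource/Columbia_University",
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
    "http://schema.org/CollegeOrUniversity",
)
line = " ".join(abbreviate(term, PREFIXES) for term in triple)
print(line)
```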
NYT API can return linked data
{
"title": "Syria's Rebels Open Talks on Forging United Political Front",
"body": "BEIRUT, Lebanon — Syria’s fractious opposition groups began
negotiations in Doha, Qatar, on Sunday to forge a more unified front to reshape
the political landscape in a bloody conflict that claims more than 100 lives
virtually every day. Given the scant prospects that any attempt to restructure
the opposition will succeed — the",
"dbpedia_resource_url": [
"http://dbpedia.org/resource/Hillary_Rodham_Clinton",
"http://dbpedia.org/resource/Bashar_al-Assad"],
"facet_terms": "CLINTON, HILLARY RODHAM ASSAD, BASHAR AL- SYRIA DOHA
(QATAR) SYRIAN NATIONAL COUNCIL STATE DEPARTMENT WAR AND REVOLUTION DEFENSE AND
MILITARY FORCES"
}
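A sketch of how a client might use the linked-data field in such a response. No API call is made here: the JSON string below is a trimmed copy of the example above (title and DBpedia URLs only), and `entity_name` is a hypothetical helper.

```python
import json
from urllib.parse import unquote, urlparse

# Trimmed copy of the response shown above.
response_text = """
{
  "title": "Syria's Rebels Open Talks on Forging United Political Front",
  "dbpedia_resource_url": [
    "http://dbpedia.org/resource/Hillary_Rodham_Clinton",
    "http://dbpedia.org/resource/Bashar_al-Assad"
  ]
}
"""


def entity_name(dbpedia_url):
    """Turn a DBpedia resource URL into a readable entity name."""
    last_segment = urlparse(dbpedia_url).path.rsplit("/", 1)[-1]
    return unquote(last_segment).replace("_", " ")


article = json.loads(response_text)
names = [entity_name(url) for url in article["dbpedia_resource_url"]]
print(article["title"])
print(names)
```

Because the URLs are shared identifiers, the same entity can be joined against any other dataset that also uses DBpedia URLs.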
Graph Databases in Journalism
Graph schema for the Panama Papers
William Lyon, Neo4j blog
Property Graphs in the Panama Papers
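A property graph differs from plain triples in that both nodes and edges can carry key-value properties. The sketch below is a minimal in-memory illustration only; the labels, relationship types, names and values are all hypothetical and are not taken from the Panama Papers schema shown in the slide.

```python
# Nodes: id -> label plus arbitrary properties.
nodes = {
    1: {"label": "Officer", "props": {"name": "Example Person"}},
    2: {"label": "Entity", "props": {"name": "Example Holdings Ltd."}},
    3: {"label": "Address", "props": {"city": "Example City"}},
}

# Edges: typed, directed, and able to carry their own properties.
edges = [
    {"src": 1, "type": "OFFICER_OF", "dst": 2, "props": {"role": "director"}},
    {"src": 2, "type": "REGISTERED_ADDRESS", "dst": 3, "props": {}},
]


def neighbors(node_id, edge_type=None):
    """Follow outgoing edges from node_id, optionally filtered by type."""
    return [
        nodes[e["dst"]]
        for e in edges
        if e["src"] == node_id and (edge_type is None or e["type"] == edge_type)
    ]


for company in neighbors(1, "OFFICER_OF"):
    print(company["props"]["name"])
```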
Relations from Text
Objects and relations in text?
names, dates, places, verbs.
Named Entity Recognition
Extract subjects and objects from text.
Also, resolve pronouns if possible.
"Gov. Andrew M. Cuomo on Wednesday gave a sea wall the
nod. Because of the recent history of powerful storms hitting the
area, he said, elected officials have a responsibility to consider
new and innovative plans to prevent similar damage in the
future."
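As a crude stand-in for a real named entity recognizer, the sketch below looks up names in a small gazetteer (a list of known entities) and then resolves "he" or "she" to the most recent person mentioned earlier. Real NER systems use trained statistical models; the gazetteer contents and both helper functions are assumptions made for this example.

```python
import re

TEXT = (
    "Gov. Andrew M. Cuomo on Wednesday gave a sea wall the nod. "
    "Because of the recent history of powerful storms hitting the area, "
    "he said, elected officials have a responsibility to consider new and "
    "innovative plans to prevent similar damage in the future."
)

# Hypothetical gazetteer: known names and their entity types.
GAZETTEER = {"Andrew M. Cuomo": "PERSON", "Wednesday": "DATE"}


def find_entities(text, gazetteer):
    """Return (position, name, label) for each gazetteer hit, in text order."""
    found = []
    for name, label in gazetteer.items():
        for m in re.finditer(re.escape(name), text):
            found.append((m.start(), name, label))
    return sorted(found)


def resolve_pronouns(text, entities):
    """Map each 'he'/'she' to the nearest PERSON mentioned before it."""
    people = [(pos, name) for pos, name, label in entities if label == "PERSON"]
    resolved = []
    for m in re.finditer(r"\b(he|she)\b", text):
        earlier = [name for pos, name in people if pos < m.start()]
        if earlier:
            resolved.append((m.group(), earlier[-1]))
    return resolved


entities = find_entities(TEXT, GAZETTEER)
print([(name, label) for _, name, label in entities])
print(resolve_pronouns(TEXT, entities))
```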
Relations from sentence parsing
“The water that made rivers of Avenues C and D receded
on Tuesday, and the East Village was a mixture of disaster
and nonchalance. A group of young men in pajama pants
and shorts threw a football on East 12th Street, while
workers pumped the basement of CHP Hardware on
Avenue C and Eighth Street.”
subject verb object
Stanford Open IE
Ontology explosions
(water made rivers of Avenues C and D)
(East Village was a mixture of disaster and nonchalance)
(group of young men in pajama pants and shorts threw football)
(workers pumped the basement of CHP Hardware)
Do we have all of these in the ontology?
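The question can be made concrete by checking each extracted relation against the set of relations an ontology defines. In the sketch below the extracted triples are the ones above, split into subject, relation and object by hand; the `ontology_relations` set and the `split_by_coverage` helper are hypothetical.

```python
extracted = [
    ("water", "made", "rivers of Avenues C and D"),
    ("East Village", "was", "a mixture of disaster and nonchalance"),
    ("group of young men in pajama pants and shorts", "threw", "football"),
    ("workers", "pumped", "the basement of CHP Hardware"),
]

# Hypothetical: the only relations our ontology knows about.
ontology_relations = {"was", "hit", "attended", "founded"}


def split_by_coverage(triples, known_relations):
    """Separate triples whose relation is in the ontology from the rest."""
    covered = [t for t in triples if t[1] in known_relations]
    unknown = [t for t in triples if t[1] not in known_relations]
    return covered, unknown


covered, unknown = split_by_coverage(extracted, ontology_relations)
print("covered:", covered)
print("relations missing from ontology:", sorted({t[1] for t in unknown}))
```

Every new verb the extractor emits is a candidate new relation, which is why open extraction tends to outgrow any fixed ontology.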
“General Question Answering”
Precision/recall tradeoff. State of the art is IBM’s DeepQA
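The tradeoff can be illustrated with a system that answers only when its confidence clears a threshold. In this sketch precision is the fraction of attempted questions answered correctly, and recall is the fraction of all questions answered correctly; the candidate data is made up, and the function is not taken from DeepQA.

```python
def precision_and_recall(candidates, threshold):
    """candidates: one (confidence, is_correct) pair per question."""
    answered = [ok for conf, ok in candidates if conf >= threshold]
    correct = sum(answered)
    # By convention here, precision is 1.0 when nothing is attempted.
    precision = correct / len(answered) if answered else 1.0
    recall = correct / len(candidates)
    return precision, recall


# Hypothetical best answers for five questions.
candidates = [(0.9, True), (0.8, True), (0.6, False), (0.4, True), (0.2, False)]

for threshold in (0.7, 0.3, 0.0):
    p, r = precision_and_recall(candidates, threshold)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```

Lowering the threshold lets the system attempt more questions, which can raise recall but usually lowers precision.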
DeepQA use of structured data
“Watson can also use detected relations to query a triple store and
directly generate candidate answers. Due to the breadth of relations in
the Jeopardy domain and the variety of ways in which they are
expressed, however, Watson’s current ability to effectively use curated
databases to simply “look up” the answers is limited to fewer than 2
percent of the clues.”
- Ferrucci et al., “Building Watson”

Frontiers of Computational Journalism week 9 - Knowledge representation

  • 1. Frontiers of Computational Journalism Columbia Journalism School Week 9: Knowledge Representation November 14, 2018
  • 2. This class • Structured Journalism • Ontologies and Graphs • Relations from Text
  • 8. Article Metadata headline photo photo caption byline photo credit publication date dateline article body related articles
  • 9. Schema.org news markup Overall type of the object on this page, in HTML head Headline, dateline, date as additions to div/span properties Byline expressed as nested object (using itemscope) of type schema.org/Person
  • 10. Driving application: “rich snippets” Schema.org covers not just news but music, restaurants, people, organizations, reviews, offers... Snippets, and better search-ability generally, are motivation for Google, Yahoo, Bing to push schema.org
  • 11. Additional metadata from indexing team In database, but doesn't necessarily make it to HTML.
  • 12. Application: content navigation Articles about “Syria” on NYT topic page More reliable than simple text search (because the relevance algorithm knows a story is "about" Syria.)
  • 13. Wall Street is high on Molson Coors Brewing (TAP), expecting it to report earnings that are up 17.5% from a year ago when it reports its third quarter earnings on Wednesday, November 7, 2012. The consensus estimate is $1.34 per share, up from earnings of $1.14 per share a year ago. The consensus estimate has dipped over the past month, from $1.35, but it’s still up from the consensus estimate of $1.19 three months ago. For the fiscal year, analysts are expecting earnings of $3.89 per share. Revenue is projected to eclipse the year-earlier total of $954.4 million by 31%, finishing at $1.25 billion for the quarter. For the year, revenue is projected to roll in at $4.04 billion. The company’s net income has declined in the last two quarters. The company posted profit falling by 52.8% in the second quarter. This is after it reported a profit decline in the first quarter by 4.1%. Automatic story generation (AP/Narrative Science) Application: automatic stories
  • 16. What objects and relations are available? Often represented as class hierarchy. Arrows = “is_a” relation
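A class hierarchy with “is_a” arrows, as on slide 16, can be sketched as a mapping from each class to its parent. The classes below are a made-up toy ontology, not taken from the slide or from Cyc.

```python
# Hypothetical toy ontology: child class -> parent class (one "is_a" arrow each).
IS_A = {
    "politician": "person",
    "actor": "person",
    "person": "agent",
    "company": "organization",
    "organization": "agent",
    "agent": "thing",
}


def ancestors(cls):
    """Yield every class reachable from cls by following is_a arrows upward."""
    while cls in IS_A:
        cls = IS_A[cls]
        yield cls


def is_a(cls, candidate):
    """True if cls is candidate or a descendant of candidate."""
    return cls == candidate or candidate in ancestors(cls)


print(list(ancestors("actor")))   # ['person', 'agent', 'thing']
print(is_a("company", "person"))  # False
```

Because “is_a” is transitive, anything stated about `agent` applies to both `actor` and `company` without being repeated for each class.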
  • 17. (Part of) a real ontology, from Cyc
  • 18. News as relations between entities. “Alice attended the wedding”: attended(alice, wedding). “IBM was founded in 1917.”: founded(IBM, 1917). “Hurricane Sandy hit New York”: hit(hurricane_sandy, New_York). Encode facts as relation(subject, object), also written (subject relation object)
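The encoding on slide 18 maps directly onto a set of three-field records. In this minimal sketch the facts are the slide's own examples, while the `Triple` type and helper names are assumptions made for illustration.

```python
from collections import namedtuple

# One fact = one (subject, relation, object) record.
Triple = namedtuple("Triple", ["subject", "relation", "object"])

facts = {
    Triple("alice", "attended", "wedding"),
    Triple("IBM", "founded", 1917),
    Triple("hurricane_sandy", "hit", "New_York"),
}


def objects_of(subject, relation):
    """All objects o such that (subject relation o) is a known fact."""
    return {t.object for t in facts
            if t.subject == subject and t.relation == relation}


def as_relation(t):
    """Format a triple in relation(subject, object) notation."""
    return f"{t.relation}({t.subject}, {t.object})"


print(objects_of("hurricane_sandy", "hit"))             # {'New_York'}
print(as_relation(Triple("alice", "attended", "wedding")))  # attended(alice, wedding)
```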
  • 19. Things we could do with this. Question answering: “The granddaughter of which actor starred in E.T.?” (?x acted-in “E.T.”)(?y is-a actor)(?x granddaughter-of ?y). Inference: (bob brother-of alice) (alice mother-of lucy) => (bob uncle-of lucy). Answer questions using inference: “How many executives of publicly-traded Canadian companies died in car crashes?”
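Both ideas on slide 19 reduce to matching patterns with variables against stored triples. The sketch below is a naive matcher, assumed here for illustration rather than any production query engine. The function names are made up, and the Barrymore facts are supplied as sample data to make the slide's question answerable.

```python
def match(pattern, fact, bindings):
    """Unify one (s, r, o) pattern with one fact; return extended bindings or None."""
    bindings = dict(bindings)
    for p, f in zip(pattern, fact):
        if isinstance(p, str) and p.startswith("?"):
            # A variable must keep the same value everywhere it appears.
            if bindings.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return bindings


def query(patterns, facts, bindings=None):
    """Yield every variable binding that satisfies all patterns together."""
    bindings = bindings or {}
    if not patterns:
        yield bindings
        return
    for fact in facts:
        extended = match(patterns[0], fact, bindings)
        if extended is not None:
            yield from query(patterns[1:], facts, extended)


def infer_uncles(facts):
    """Apply the rule (x brother-of y)(y mother-of z) => (x uncle-of z)."""
    rule = [("?x", "brother-of", "?y"), ("?y", "mother-of", "?z")]
    return facts | {(b["?x"], "uncle-of", b["?z"]) for b in query(rule, facts)}


facts = {
    ("bob", "brother-of", "alice"),
    ("alice", "mother-of", "lucy"),
    ("drew_barrymore", "acted-in", "E.T."),
    ("john_barrymore", "is-a", "actor"),
    ("drew_barrymore", "granddaughter-of", "john_barrymore"),
}

question = [("?x", "acted-in", "E.T."), ("?y", "is-a", "actor"),
            ("?x", "granddaughter-of", "?y")]
print(list(query(question, facts)))
print(("bob", "uncle-of", "lucy") in infer_uncles(facts))
```

The answer to the question is whatever `?y` is bound to. Answering with inference means running rules like `infer_uncles` first, then querying the enlarged fact set.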
  • 20. Every big news org has their own big ontology: topics, people, organizations, places...
  • 21. Enter Linked Data. Triples of (subject relation object), each a URL or literal: <urn:x-states:New%20York> <http://purl.org/dc/terms/alternative> "NY" and <http://dbpedia.org/resource/Columbia_University> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/CollegeOrUniversity>. Abbreviations possible with many formats: <http://dbpedia.org/resource/Columbia_University> rdf:type ns6:CollegeOrUniversity
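The abbreviation step on slide 21 is a prefix substitution. A minimal sketch follows, assuming a hand-written prefix table. The namespace URLs are the ones on the slide, but the short names are arbitrary choices (the slide's own example uses `ns6` for schema.org).

```python
# Prefix names are arbitrary labels chosen for this example.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "schema": "http://schema.org/",
    "dbr": "http://dbpedia.org/resource/",
    "dcterms": "http://purl.org/dc/terms/",
}


def abbreviate(term):
    """Shorten a full URL to prefix:local form when a known namespace matches."""
    for prefix, base in PREFIXES.items():
        if term.startswith(base):
            return f"{prefix}:{term[len(base):]}"
    return term  # literals and unknown namespaces are left unchanged


triple = (
    "http://dbpedia.org/resource/Columbia_University",
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
    "http://schema.org/CollegeOrUniversity",
)
print(" ".join(abbreviate(t) for t in triple))
# dbr:Columbia_University rdf:type schema:CollegeOrUniversity
```

The abbreviated and full forms name the same resources. Using shared URLs as identifiers is what lets one organization's triples link to another's.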
  • 25. NYT API can return linked data: { "title": "Syria's Rebels Open Talks on Forging United Political Front", "body": "BEIRUT, Lebanon — Syria’s fractious opposition groups began negotiations in Doha, Qatar, on Sunday to forge a more unified front to reshape the political landscape in a bloody conflict that claims more than 100 lives virtually every day. Given the scant prospects that any attempt to restructure the opposition will succeed — the", "dbpedia_resource_url": [ "http://dbpedia.org/resource/Hillary_Rodham_Clinton", "http://dbpedia.org/resource/Bashar_al-Assad"], "facet_terms": "CLINTON, HILLARY RODHAM ASSAD, BASHAR AL- SYRIA DOHA (QATAR) SYRIAN NATIONAL COUNCIL STATE DEPARTMENT WAR AND REVOLUTION DEFENSE AND MILITARY FORCES" }
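A record shaped like the one on slide 25 can be read with the standard `json` module. The sketch below uses an abridged copy of the slide's example as a string and makes no call to the real API. The helper name is made up, and the field layout is assumed to match the slide.

```python
import json
from urllib.parse import unquote, urlparse

# Abridged copy of the record shown on the slide (body and facet_terms omitted).
sample = """{
  "title": "Syria's Rebels Open Talks on Forging United Political Front",
  "dbpedia_resource_url": [
    "http://dbpedia.org/resource/Hillary_Rodham_Clinton",
    "http://dbpedia.org/resource/Bashar_al-Assad"
  ]
}"""


def linked_entities(article):
    """Return readable names for the DBpedia resources attached to an article."""
    names = []
    for url in article.get("dbpedia_resource_url", []):
        local_name = urlparse(url).path.rsplit("/", 1)[-1]
        names.append(unquote(local_name).replace("_", " "))
    return names


article = json.loads(sample)
print(linked_entities(article))
# ['Hillary Rodham Clinton', 'Bashar al-Assad']
```

The DBpedia URLs are the useful part: unlike the free-text `facet_terms`, they are unambiguous identifiers that other linked data sets can join against.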
  • 26. Graph Databases in Journalism
  • 28. Graph schema for the Panama Papers (William Lyon, Neo4j blog)
  • 29. Property Graphs in the Panama Papers
  • 31. Objects and relations in text? names, dates, places, verbs.
  • 32. Named Entity Recognition: extract subjects and objects from text. Also, resolve pronouns if possible. "Gov. Andrew M. Cuomo on Wednesday gave a sea wall the nod. Because of the recent history of powerful storms hitting the area, he said, elected officials have a responsibility to consider new and innovative plans to prevent similar damage in the future."
  • 33. Relations from sentence parsing (subject, verb, object): “The water that made rivers of Avenues C and D receded on Tuesday, and the East Village was a mixture of disaster and nonchalance. A group of young men in pajama pants and shorts threw a football on East 12th Street, while workers pumped the basement of CHP Hardware on Avenue C and Eighth Street.”
  • 35. Ontology explosions: (water made rivers of Avenues C and D) (East Village was a mixture of disaster and nonchalance) (group of young men in pajama pants and shorts threw football) (workers pumped the basement of CHP Hardware). Do we have all of these in the ontology?
  • 36. “General Question Answering” Precision/recall tradeoff. State of the art is IBM’s DeepQA
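The precision/recall tradeoff on slide 36 can be made concrete with a confidence threshold: the system answers only when its confidence is high enough. The numbers below are made up for illustration and are not DeepQA results. Recall is defined here as correct answers over all questions asked.

```python
def precision_recall(candidates, threshold):
    """candidates: (confidence, is_correct) for the best answer to each question."""
    answered = [ok for conf, ok in candidates if conf >= threshold]
    correct = sum(answered)
    # By convention, precision is 1.0 when the system declines every question.
    precision = correct / len(answered) if answered else 1.0
    recall = correct / len(candidates)
    return precision, recall


# Hypothetical confidences and correctness for six questions.
candidates = [(0.95, True), (0.90, True), (0.70, False),
              (0.60, True), (0.40, False), (0.20, True)]

for threshold in (0.8, 0.5, 0.0):
    p, r = precision_recall(candidates, threshold)
    print(f"threshold={threshold:.1f} precision={p:.2f} recall={r:.2f}")
```

Raising the threshold makes the system answer fewer questions but get more of them right. Lowering it answers more questions at the cost of more wrong answers.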
  • 37. DeepQA use of structured data: “Watson can also use detected relations to query a triple store and directly generate candidate answers. Due to the breadth of relations in the Jeopardy domain and the variety of ways in which they are expressed, however, Watson’s current ability to effectively use curated databases to simply ‘look up’ the answers is limited to fewer than 2 percent of the clues.” - Ferrucci et al., “Building Watson”

Editor's Notes

  1. To open: Connected China http://china.fathom.info/ Wikidata https://www.wikidata.org/wiki/Q2 Building Watson https://www.youtube.com/watch?v=3G2H3DZ8rNc
  2. http://china.fathom.info/
  3. https://www.slideshare.net/Graph-TA/graphium-chrysalis-exploiting-graph-database
  4. https://lod-cloud.net/
  5. https://lod-cloud.net/
  6. https://www.slideshare.net/Graph-TA/graphium-chrysalis-exploiting-graph-database
  7. https://neo4j.com/blog/analyzing-panama-papers-neo4j/
  8. https://nlp.stanford.edu/software/openie.html
  9. https://www.youtube.com/watch?v=3G2H3DZ8rNc