THE CONCENTRIC NATURE OF
NEWS SEMANTIC SNAPSHOTS
JOSÉ LUIS REDONDO GARCIA
GIUSEPPE RIZZO
RAPHAËL TRONCY
@peputo / redondo@eurecom.fr
@giusepperizzo / giuseppe.rizzo@eurecom.fr
@rtroncy / raphael.troncy@eurecom.fr
Overview
October 8, 2015, 8th International Conference on Knowledge Capture
1. Introducing the Problem: Contextualizing News Items
o The News Semantic Snapshot (NSS)
2. Previous Work:
o Frequency-based Functions
o Multidimensional Relevancy Approach
3. A Concentric Model for Generating NSS
The Problem: Contextualizing News
Wolfgang Schäuble: Finance Minister
Christian Democratic Union: Ruling Party in Germany
The Problem: Contextualizing News
Sarah Harrison: WikiLeaks Editor
Sheremetyevo: Airport in Moscow
Contextualizing News: Applications
News Semantic Snapshot (NSS) [1]
[1] Redondo et al., Generating the Semantic Snapshot of Newscasts using Entity Expansion, ICWE 2015, Rotterdam.
Recreating the NSS
[Figure: recreating the News Semantic Snapshot in two steps, (1) and (2)]
News Semantic Snapshot: Gold Standard
Involving: experts in the news domain + users
Dimensions:
(1) Video Subtitles
(2) Image in the video
(3) Text in the video image
(4) Suggestions of an expert
(5) Related articles
Play with the data and help us to extend it at:
https://github.com/jluisred/NewsConceptExpansion/wiki/Golden-Standard-Creation
(1) Bringing in Missing Entities: News Entity Expansion
[Figure: entity expansion pipeline, steps 1.a) and 1.b)]
Parameters [1]:
Web sites to be crawled:
- Google
- L1: a set of 10 international English-speaking newspapers
- L2: a set of 3 international newspapers used in GS
Temporal Window:
- 1W
- 2W
Annotation filtering:
- Schema.org
Recreating the NSS
[Figure: News Semantic Snapshot pipeline, steps (1) and (2)]
(1) Recall (E. Expansion) = 0.91
Recall (NER on Subtitles) = 0.42
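The two recall figures compare the entities each method finds against the gold-standard NSS. A minimal sketch of that comparison, assuming entities are matched by exact label; the entity sets below are illustrative and are not the authors' data:

```python
def recall(found: set, gold: set) -> float:
    """Fraction of gold-standard entities that appear in `found`."""
    if not gold:
        return 0.0
    return len(found & gold) / len(gold)

# Illustrative sets only (assumption): a tiny gold standard and two candidate sets.
gold = {"Sarah Harrison", "WikiLeaks", "Sheremetyevo", "Laura Poitras"}
from_subtitles = {"Sarah Harrison", "Moscow"}
from_expansion = from_subtitles | {"WikiLeaks", "Sheremetyevo", "Laura Poitras"}

print(recall(from_subtitles, gold))  # 0.25
print(recall(from_expansion, gold))  # 1.0
```

Expansion can only raise recall here because it adds candidates on top of the subtitle entities; the price is a larger candidate set, which is what the selection step has to trim.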
The Selection Problem:
[Figure: entities from Entity Expansion ranked from 0 to N; the ideal function F_Ideal(e_i) yields the NSS, while a candidate function F_X(e_i) =? is compared against it using MNDCG]
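As a rough illustration of how a ranking function can be scored against the NSS, the sketch below assumes MNDCG is NDCG averaged over the videos and uses the common log2 position discount. The gold scores and rankings are made up for the example:

```python
import math

def dcg(gains):
    """Discounted cumulative gain with a log2 position discount."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg_at(ranked, gold, n):
    """NDCG at cut-off n; `gold` maps entity -> relevance score in the gold standard."""
    gains = [gold.get(e, 0.0) for e in ranked[:n]]
    ideal = sorted(gold.values(), reverse=True)[:n]
    best = dcg(ideal)
    return dcg(gains) / best if best > 0 else 0.0

def mndcg_at(runs, n):
    """Mean NDCG over (ranking, gold) pairs, one pair per video."""
    return sum(ndcg_at(r, g, n) for r, g in runs) / len(runs)

# Illustrative data (assumption), not taken from the gold standard.
gold = {"Sarah Harrison": 3.0, "Sheremetyevo": 1.0}
perfect = ["Sarah Harrison", "Sheremetyevo"]
swapped = ["Sheremetyevo", "Sarah Harrison"]

print(ndcg_at(perfect, gold, 10))  # 1.0
print(ndcg_at(swapped, gold, 10) < 1.0)  # True
```

The discount is why the measure rewards getting the first positions right more than the later ones.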
1º Entity Frequency
SNOW Workshop 2014 [2]
[2] Redondo et al., Describing and Contextualizing Events in TV News Show, SNOW Workshop, WWW 2014, Seoul, Korea.
Frequency Based: Results
[Figure: entities from the Expansion ranked from 0 to N by FREQ, compared with the NSS]
F(Laura Poitras) = 2
F(Glenn Greenwald) = 1
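The frequency function simply counts how often each entity is mentioned in the collected documents. A minimal sketch, assuming mentions have already been normalized to a single label per entity:

```python
from collections import Counter

def frequency_rank(mentions):
    """Rank entities by how many times they are mentioned, most frequent first."""
    return Counter(mentions).most_common()

# Mentions chosen to reproduce the two counts shown on the slide.
mentions = ["Laura Poitras", "Glenn Greenwald", "Laura Poitras"]
print(frequency_rank(mentions))
# [('Laura Poitras', 2), ('Glenn Greenwald', 1)]
```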
Multidimensional Approach
ICWE 2015 [1]
[Figure: frequency functions (Fr) and (FrGaussian)]
Multidimensional Approach
POPULARITY (F_POP):
- Based on Google Trends
- w = 2 months
- μ + 2*σ (2.5%)
EXPERT RULES (F_EXP), example:
- [ Location, = 0.48 ]
- [ Person, = 0.74 ]
- [ Organization, = 0.95 ]
- [ < 2 , = 0.0 ]
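The two dimensions can be sketched as follows. This is only an illustration under stated assumptions: the popularity series is a plain list of values over the 2-month window (no Google Trends call is made), the "< 2" rule is read as "frequency below 2", and the default weight for unlisted types is an invented placeholder.

```python
from statistics import mean, pstdev

def is_popularity_peak(window, value):
    """True when `value` exceeds mean + 2 * standard deviation of the window."""
    threshold = mean(window) + 2 * pstdev(window)
    return value > threshold

# Weights copied from the slide's example.
TYPE_WEIGHT = {"Location": 0.48, "Person": 0.74, "Organization": 0.95}

def expert_score(entity_type, frequency, default=1.0):
    """Expert-rule weight; `default` for unlisted types is an assumption."""
    if frequency < 2:  # assumed reading of the "[ < 2 , = 0.0 ]" rule
        return 0.0
    return TYPE_WEIGHT.get(entity_type, default)

window = [10, 12, 11, 9, 10, 11, 12, 10]  # illustrative popularity values
print(is_popularity_peak(window, 30))  # True
print(expert_score("Person", 3))       # 0.74
```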
Multidimensionality: Results
- News Entity Expansion + Dimensions → Generate the News Semantic Snapshot
- Best score: 0.667 in MNDCG at 10, better than BS1/2
  • Collection: CSE (Google + 2W + Schema.org)
  • Ranking:
    • Expert Rules
    • Popularity
Multidimensionality: Results
[Figure: FREQ + POP + EXP applied to the Expansion, compared with the NSS]
Follow up: Fine-Tuning
1. Exploit Google Relevance (+1.80%)
2. Promote Subtitle Entities (+2.50%)
3. Exploit Named Entity Extractor’s confidence (+0.20%)
4. Interpret popularity Dimension (+1.40%)
5. Performing Clustering before Filtering (-0.60%)
- NO SIGNIFICANT IMPROVEMENT -
No Improvement: Why?
[Figure: tuning FREQ, POP, EXP, and Function X; Original vs. Re-Shuffle, compared with the NSS]
How many Dimensions?
How to combine them?
Thinking Outside the Box:
1. Is there room for improvement?
2. Is MNDCG a good measure to evaluate NSS?
3. How to significantly improve the approach?
Room for Improvement?
[Figure: GAIN]
How to Evaluate NSS?
MNDCG:
• Too focused on success at first positions (decay function)
• NSS intends to be flexible; ranking is application-dependent
COMPACTNESS:
• Prioritizes coverage over ranking
• Compromise between Recall and NSS size
• Recall*: positives are weighted according to score in GT
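One way to read Recall* is as a recall in which each gold-standard entity counts in proportion to its score. The sketch below follows that reading; the exact weighting used by the authors is not given on the slide, so treat the formula as an assumption:

```python
def weighted_recall(found, gold_scores):
    """Share of the total gold-standard score covered by the entities in `found`."""
    total = sum(gold_scores.values())
    if total == 0:
        return 0.0
    hit = sum(score for entity, score in gold_scores.items() if entity in found)
    return hit / total

# Illustrative scores (assumption), not taken from the gold standard.
gold_scores = {"Sarah Harrison": 3.0, "Sheremetyevo": 1.0}
print(weighted_recall({"Sarah Harrison"}, gold_scores))  # 0.75
print(weighted_recall({"Sheremetyevo"}, gold_scores))    # 0.25
```

Unlike MNDCG, this value does not depend on the order of the entities inside the snapshot, only on which ones are present.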
Compactness:
Recall: 22/33 = 0.66
Sa = 27
Sb = 33
Sc = 54
A > B > C
[Figure: three snapshots A, B, and C compared against the NSS]
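The comparison above can be sketched by asking how many entities a ranked list needs before it reaches a target recall: the smaller that size, the more compact the snapshot. The function below is an illustrative reading of the measure, not the authors' definition, and it assumes the ranked list contains no duplicate entities:

```python
from typing import Optional

def size_at_recall(ranked, gold, target) -> Optional[int]:
    """Smallest prefix length of `ranked` whose recall reaches `target`, else None."""
    if not gold:
        return None
    hits = 0
    for size, entity in enumerate(ranked, start=1):
        if entity in gold:
            hits += 1
        if hits / len(gold) >= target:
            return size
    return None

# Sizes from the slide: at the same recall, the smaller snapshot ranks higher.
sizes = {"A": 27, "B": 33, "C": 54}
print(sorted(sizes, key=sizes.get))  # ['A', 'B', 'C']
```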
Re-thinking the Approach: Concentric Snapshot
Duality in News Entity Spectrum:
• REPRESENTATIVE entities:
  • Driving the plot of the story, sometimes evident for users.
• RELEVANT entities:
  • Related to the former via specific reasons
Exploit the entity semantic relations
Unexpected?
Hypothesis: Concentric Snapshot
CORE:
• Representative entities
• Spottable via Frequency dimensions
• High degree of cohesiveness
CRUST:
• Attached to the Core via particular relations
• Agnostic to relevancy nature: informativeness, interestingness, etc.
Core Generation
a) Representative entities: Frequency Dimension
b) Cohesiveness (DBpedia)
Crust Generation
The number of Web documents talking simultaneously about a particular entity e and the Core:
[Formula: S_Web(e, Core)]
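The idea of counting documents that mention a candidate together with the Core can be sketched as below. Everything search-related is hypothetical: `hit_count` stands in for whatever search backend returns a number of matching documents, the quoted-phrase query format and the `min_hits` cut-off are assumptions, and the counts in the demo are invented.

```python
def web_presence(entity, core, hit_count):
    """Number of documents mentioning `entity` together with every Core entity."""
    query = " ".join(f'"{name}"' for name in [entity, *core])
    return hit_count(query)

def build_crust(candidates, core, hit_count, min_hits=1):
    """Candidates outside the Core, ordered by co-occurrence with it."""
    scored = {e: web_presence(e, core, hit_count) for e in candidates if e not in core}
    kept = (e for e, s in scored.items() if s >= min_hits)
    return sorted(kept, key=scored.get, reverse=True)

# Fake backend with invented counts, for illustration only.
fake_hits = {
    '"Sheremetyevo" "Sarah Harrison" "WikiLeaks"': 40,
    '"Moscow" "Sarah Harrison" "WikiLeaks"': 75,
}
core = ["Sarah Harrison", "WikiLeaks"]
crust = build_crust(["Sheremetyevo", "Moscow", "Paris"], core,
                    lambda q: fake_hits.get(q, 0))
print(crust)  # ['Moscow', 'Sheremetyevo']
```

Because the score only asks whether documents talk about the entity and the Core together, it does not need to know why the entity is relevant.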
Experimental Settings
CORE: (2 configurations)
1. Entity Frequency
• Core1: Jaro-Winkler > 0.9
• Core2: Frequency based on Exact String matching
2. Cohesiveness:
• Everything is Connected Engine [3]
• S_kb(e1, e2) > 0.125
[3] Everything is Connected Engine: https://github.com/mmlab/eice
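A rough sketch of the Core2 configuration follows: count exact-string mentions, keep the frequent entities, then keep only those related to at least one other frequent entity above the 0.125 threshold. The `relatedness` callable is a hypothetical stand-in for the knowledge-base score S_kb, and `min_freq` and the "at least one neighbour" rule are assumptions. Core1 would replace the exact matching with a Jaro-Winkler comparison, which is not shown here.

```python
from collections import Counter

def build_core(mentions, relatedness, min_freq=2, min_related=0.125):
    """Frequent entities that are cohesive with at least one other frequent entity."""
    counts = Counter(mentions)  # exact string matching
    frequent = [e for e, c in counts.most_common() if c >= min_freq]
    return [
        e for e in frequent
        if any(relatedness(e, other) > min_related for other in frequent if other != e)
    ]

# Invented relatedness scores, for illustration only.
pairs = {frozenset({"WikiLeaks", "Sarah Harrison"}): 0.5}

def relatedness(a, b):
    return pairs.get(frozenset({a, b}), 0.0)

mentions = (["WikiLeaks"] * 3 + ["Sarah Harrison"] * 2
            + ["Moscow"] * 2 + ["Sheremetyevo"])
print(build_core(mentions, relatedness))  # ['WikiLeaks', 'Sarah Harrison']
```

In this toy input "Moscow" is frequent but not cohesive with the other frequent entities, so it is left for the Crust step to pick up.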
Experimental Settings
CRUST: (2 configurations)
1. Candidates for CRUST generation:
• Ex1: 1st ICWE 2015 run by R*(50): L2+Google, F3 1W, Gauss + POP
• Ex2: 2nd ICWE 2015 run by R*(50): L2+Google, F3 1W, Freq + POP
2. Function for attaching entities to CORE:
• S_Web(e_i, Core) over Google CSE, default configuration
Experimental Settings
Projecting CORE and CRUST: (2 configurations)
• Core+Crust:
• CrustOnly:
[Figure: CORE and CRUST projected onto the Expansion as Core+Crust and CrustOnly, compared with the NSS]
Experimental Settings
Baselines:
BAS01: best run in ICWE 2015 at R*(50)
BAS02: second best run in ICWE 2015 at R*(50)
[Figure: FREQ, POP, EXP]
Results: Compactness
Percentage decrease of 36.9% over BAS01
IdealGT: size of NSS according to Gold Standard
(2*2*2 + 2) Runs
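The run count can be checked by enumerating the settings listed on the previous slides: two Core configurations, two Crust candidate sets, two projections, plus the two baselines. The run labels below are made up for readability:

```python
from itertools import product

cores = ["Core1", "Core2"]
crusts = ["Ex1", "Ex2"]
projections = ["Core+Crust", "CrustOnly"]
baselines = ["BAS01", "BAS02"]

# Every combination of the three concentric-model settings, then the baselines.
runs = [f"{c}/{x}/{p}" for c, x, p in product(cores, crusts, projections)] + baselines
print(len(runs))  # 10
```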
Results: Recall* over N
Conclusion
• News applications can benefit from the News Semantic Snapshot (NSS)
• Proposed a concentric-based model for generating the NSS:
  • Formalizes duality in entities (Representative vs. Relevant)
  • Exploits the entity semantic relations between Core and Crust
  • Accommodates different relevancy dimensions in a single model via the notion of web presence (S_Web)
• Concentric model better reproduces the NSS:
  • Better Compactness: 36.9% decrease over BAS01
  • Similar recall, smaller size
• Concentric model is easier to implement:
  • Core can be reproduced via the Frequency Dimension
  • Crust brings up relevant entities without having to deal with fuzzy dimensions
Future
• Extend the number of videos considered in GT: from 5 to 23 (+18), check [4] for more information
• Spot not only relationships between Crust and the Core but also predicates that characterize them, e.g. "Editor in WikiLeaks"
[4] https://github.com/jluisred/NewsConceptExpansion/wiki/Golden-Standard-Creation
http://www.slideshare.net/joseluisredondo/concentric-semantic-snapshot
Visit poster at booth: 34

Concentric Semantic Snapshot

  • 1. THE CONCENTRIC NATURE OF NEWS SEMANTIC SNAPSHOTS JOSÉ LUIS REDONDO GARCIA GIUSEPPE RIZZO RAPHAËL TRONCY @peputo / redondo@eurecom.fr @giusepperizzo / giuseppe.rizzo@eurecom.fr @rtroncy / raphael.troncy@eurecom.fr
  • 2. Overview 2October 8, 2015 8th International Conference on Knowledge Capture 1. Introducing the Problem: Contextualizing News Items o The News Semantic Snapshot (NSS) 2. Previous Work: o Frequency-based Functions o Multidimensional Relevancy Approach 3. A Concentric Model for Generating NSS
  • 3. Overview 3October 8, 2015 8th International Conference on Knowledge Capture 1. Introducing the Problem: Contextualizing News Items o The News Semantic Snapshot (NSS) 2. Previous Work: o Frequency-based Functions o Multidimensional Relevancy Approach 3. A Concentric Model for Generating NSS
  • 4. The Problem: Contextualizing News. Example: Wolfgang Schäuble (Finance Minister); Christian Democratic Union (Ruling Party in Ger.)
  • 5. The Problem: Contextualizing News. Example: Sarah Harrison (WikiLeaks Editor); Sheremetyevo (Airport in Moscow)
  • 6. Contextualizing News: Applications
  • 7. News Semantic Snapshot (NSS) [1]. [1] Redondo et al., Generating the Semantic Snapshot of Newscasts using Entity Expansion, ICWE 2015, Rotterdam.
  • 8. Recreating the NSS: News Semantic Snapshot, steps (1) and (2)
  • 9. News Semantic Snapshot: Gold Standard. Involving: experts in the news domain + users. Dimensions: (1) Video Subtitles (2) Image in the video (3) Text in the video image (4) Suggestions of an expert (5) Related articles. Play with the data and help us to extend it at: https://github.com/jluisred/NewsConceptExpansion/wiki/Golden-Standard-Creation
  • 10. Recreating the NSS: News Semantic Snapshot, steps (1) and (2)
  • 11. (1) Bringing in Missing Entities: News Entity Expansion (diagram steps 1.a, 1.b). Parameters [1]: Web sites to be crawled: Google; L1: a set of 10 international English-speaking newspapers; L2: a set of 3 international newspapers used in GS. Temporal Window: 1W, 2W. Annotation filtering: Schema.org. [1] Redondo et al., Generating the Semantic Snapshot of Newscasts using Entity Expansion, ICWE 2015, Rotterdam.
  • 12. Recreating the NSS: News Semantic Snapshot, steps (1) and (2). Recall (E. Expansion) = 0.91; Recall (NER on Subtitles) = 0.42
  • 13. The Selection Problem: FX(ei) = ? [Diagram: (Entity Expansion) entities ranked from 0 to N; FIdeal(ei); (NSS); measure: MNDCG]
  • 14. Overview: 1. Introducing the Problem: Contextualizing News Items (The News Semantic Snapshot, NSS); 2. Previous Work: Frequency-based Function, Multidimensional Relevancy Approach; 3. A Concentric Model for Generating NSS
  • 15. 1st: Entity Frequency, SNOW Workshop 2014 [2]. [2] Redondo et al., Describing and Contextualizing Events in TV News Show, SNOW Workshop, WWW 2014, Seoul, Korea.
  • 16. Frequency Based: Results. F(Laura Poitras) = 2; F(Glenn Greenwald) = 1 [Diagram: (Expansion) entities ranked from 0 to N by FREQ, compared with the (NSS)]
  • 17. Multidimensional Approach, ICWE 2015 [1]: (Fr), (FrGaussian). 15th International Conference on Web Engineering (ICWE). [1] Redondo et al., Generating the Semantic Snapshot of Newscasts using Entity Expansion, ICWE 2015, Rotterdam.
  • 18. Multidimensional Approach. POPULARITY (FPOP): based on Google Trends; w = 2 months; μ + 2*σ (2.5%). EXPERT RULES (FEXP), example: [ Location, = 0.48 ]; [ Person, = 0.74 ]; [ Organization, = 0.95 ]; [ < 2 , = 0.0 ]
  • 19. Multidimensionality: Results. News Entity Expansion + Dimensions → Generate the News Semantic Snapshot. Best score: 0.667 in MNDCG at 10, better than BS1/2. Collection: CSE (Google + 2W + Schema.org). Ranking: Expert Rules, Popularity
  • 20. Multidimensionality: Results [Diagram: (Expansion) ranked by FREQ + POP + EXP = (NSS)]
  • 21. Follow up: Fine-Tuning. 1. Exploit Google Relevance (+1.80%); 2. Promote Subtitle Entities (+2.50%); 3. Exploit Named Entity Extractor's confidence (+0.20%); 4. Interpret popularity Dimension (+1.40%); 5. Performing Clustering before Filtering (-0.60%). NO SIGNIFICANT IMPROVEMENT
  • 22. No Improvement: Why? How many Dimensions? How to combine them? [Diagram: Tune Function X over FREQ, POP, EXP; Original vs. Re-Shuffle against the (NSS)]
  • 23. Overview: 1. Introducing the Problem: Contextualizing News Items (The News Semantic Snapshot, NSS); 2. Previous Work: Frequency-based Function, Multidimensional Relevancy Approach; 3. A Concentric Model for Generating NSS
  • 24. Thinking Outside the Box: 1. Is there room for improvement? 2. Is MNDCG a good measure to evaluate NSS? 3. How to significantly improve the approach?
  • 25. Room for Improvement? [Chart: GAIN]
  • 26. Room for Improvement?
  • 27. How to Evaluate NSS? MNDCG: too focused on success at first positions (decay function); NSS intends to be flexible, ranking is application-dependent. COMPACTNESS: prioritizes coverage over ranking; compromise between Recall and NSS size; Recall*: positives are weighted according to score in GT (NSS)
  • 28. Compactness: Recall: 22/33 = 0.66; Sa = 27, Sb = 33, Sc = 54; A > B > C [Diagram: runs A, B, C compared against the (NSS)]
  • 29. Re-thinking the Approach: Concentric Snapshot. Duality in News Entity Spectrum: REPRESENTATIVE entities: driving the plot of the story, sometimes evident for users. RELEVANT entities: related to the former via specific reasons; unexpected? Exploit the entity semantic relations
  • 30. Hypothesis: Concentric Snapshot. CORE: representative entities; spottable via Frequency dimensions; high degree of cohesiveness. CRUST: attached to the Core via particular relations; agnostic to relevancy nature: informativeness, interestingness, etc.
  • 31. Core Generation: a) Representative entities: Frequency Dimension (NSS); b) Cohesiveness (DBpedia)
  • 32. Crust Generation: the number of Web documents talking simultaneously about a particular entity e and the Core [formula not preserved in the transcript]
  • 33. Experimental Settings, CORE (2 configurations): 1. Entity Frequency: Core1: Jaro-Winkler > 0.9; Core2: frequency based on exact string matching. 2. Cohesiveness: Everything is Connected Engine [3]; Skb(e1, e2) > 0.125. [3] Everything is Connected Engine: https://github.com/mmlab/eice
  • 34. Experimental Settings, CRUST (2 configurations): 1. Candidates for CRUST generation: Ex1: 1st in ICWE 2015 by R*(50): L2+Google, F3 1W, Gauss + POP; Ex2: 2nd in ICWE 2015 by R*(50): L2+Google, F3 1W, Freq + POP. 2. Function for attaching entities to CORE: SWEB(ei, Core) over Google CSE, default configuration
  • 35. Experimental Settings, projecting CORE and CRUST (2 configurations): Core+Crust; CrustOnly [Diagram: (Expansion), CORE, CRUST, (NSS)]
  • 36. Experimental Settings, Baselines: BAS01: best run in ICWE 2015 at R*(50); BAS02: second best run in ICWE 2015 at R*(50) [Diagram: FREQ, POP, EXP]
  • 37. Results: Compactness. Percentage decrease of 36.9% over BAS01. IdealGT: size of NSS according to Gold Standard. (2*2*2 + 2) runs
  • 38. Results: Recall* over N
  • 39. Conclusion. News applications can benefit from the News Semantic Snapshot (NSS). Proposed a concentric model for generating the NSS: it formalizes the duality in entities (Representative vs. Relevant), exploits the entity semantic relations between Core and Crust, and accommodates different relevancy dimensions in a single model via the notion of web presence (SWeb). The concentric model better reproduces the NSS: better Compactness (36.9% over BAS01); similar recall, smaller size. The concentric model is easier to implement: the Core can be reproduced via the Frequency dimension; the Crust brings up relevant entities without having to deal with fuzzy dimensions
  • 40. Future. Extend the number of videos considered in GT: from 5 to 23 (+18), check [4] for more information. Spot not only relationships between Crust and the Core but also predicates that characterize them, e.g. "Editor in WikiLeaks". [4] https://github.com/jluisred/NewsConceptExpansion/wiki/Golden-Standard-Creation
  • 41. JOSÉ LUIS REDONDO GARCIA GIUSEPPE RIZZO RAPHAËL TRONCY @peputo / redondo@eurecom.fr @giusepperizzo / giuseppe.rizzo@eurecom.fr @rtroncy / raphael.troncy@eurecom.fr http://www.slideshare.net/joseluisredondo/concentric-semantic-snapshot Visit poster at booth: 34
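
The contrast drawn on slide 27 between MNDCG and Recall* can be illustrated with a short sketch. It is not the authors' evaluation code. It assumes that MNDCG denotes NDCG averaged over the gold-standard videos, that an entity's gain is its score in the GT, and that Recall* is the summed GT score of the selected entities divided by the total GT score. All function names are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def dcg_at_k(gains: List[float], k: int) -> float:
    """Discounted cumulative gain: the gain at 1-based rank r is divided by log2(r + 1)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(ranked: List[str], gt_scores: Dict[str, float], k: int = 10) -> float:
    """NDCG at k for one ranked entity list (assumption: gain = GT score, 0 if absent)."""
    gains = [gt_scores.get(entity, 0.0) for entity in ranked]
    ideal_dcg = dcg_at_k(sorted(gt_scores.values(), reverse=True), k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


def mean_ndcg_at_k(runs: List[Tuple[List[str], Dict[str, float]]], k: int = 10) -> float:
    """Average NDCG at k over (ranking, GT scores) pairs, one pair per video."""
    if not runs:
        return 0.0
    return sum(ndcg_at_k(ranked, gt, k) for ranked, gt in runs) / len(runs)


def weighted_recall(selected: List[str], gt_scores: Dict[str, float]) -> float:
    """Assumed form of Recall*: GT score mass covered by the selection, order ignored."""
    total = sum(gt_scores.values())
    chosen = set(selected)
    covered = sum(score for entity, score in gt_scores.items() if entity in chosen)
    return covered / total if total > 0 else 0.0
```

Under these assumptions, reversing a ranking lowers `ndcg_at_k` because of the logarithmic decay, while `weighted_recall` stays the same. That is the sense in which a coverage-oriented measure is less sensitive to the order of the first positions.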
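
The Core and Crust steps of slides 30 to 34 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: frequency uses exact string matching (the Core2 setting), the cohesiveness test keeps a frequent entity if it is related to at least one other frequent entity above the 0.125 threshold from slide 33, and `relatedness` and `web_presence` are hypothetical callables standing in for Skb and SWEB, whose formulas are not given in the transcript. The `min_freq` cut-off is also an assumption.

```python
from collections import Counter
from typing import Callable, Iterable, List, Set

COHESION_THRESHOLD = 0.125  # Skb(e1, e2) threshold quoted on slide 33


def build_core(
    mentions: Iterable[str],
    relatedness: Callable[[str, str], float],
    min_freq: int = 2,  # hypothetical cut-off, not taken from the slides
) -> Set[str]:
    """Pick frequent entities (exact string match) that cohere with another frequent entity."""
    freq = Counter(mentions)
    frequent = [entity for entity, count in freq.items() if count >= min_freq]
    core = set()
    for entity in frequent:
        others = (other for other in frequent if other != entity)
        if any(relatedness(entity, other) > COHESION_THRESHOLD for other in others):
            core.add(entity)
    return core


def rank_crust(
    candidates: Iterable[str],
    core: Set[str],
    web_presence: Callable[[str, Set[str]], float],
) -> List[str]:
    """Order non-core candidates by a caller-supplied score of co-occurrence with the Core."""
    scored = {entity: web_presence(entity, core) for entity in candidates if entity not in core}
    return sorted(scored, key=scored.get, reverse=True)
```

In practice `web_presence` would wrap a search backend that counts documents mentioning both the candidate and the Core entities. Passing it in as a parameter keeps the sketch independent of any particular search API.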

Editor's Notes

  1-6. Usage of the NSS? Why entities? Introduce the importance of this decision
  7-9. Meeting notes (6/16/15 11:16): Extending the Repository
  10. Unsupervised
  11. Meeting notes (6/16/15 11:16): Extending the Repository
  12-13. Usage of the NSS? Why entities? Introduce the importance of this decision
  14. Delete DBpedia here
  15. Unsupervised