Interlinking multimedia for the analysis of media
           coverage of political debates

        Max Kemman & Henri Beunders
               NOTaS meeting



                  www.polimedia.nl
Main goal
• Aimed at Humanities researchers
• Using CLARIN standard




25-6-2012         PoliMedia - NOTaS meeting   2
Main research question
 What choices do different media make in the coverage
 of people and topics when reporting on debates in the
 Dutch parliament, from the first televised evening news
 in 1956 until 1995?




Historical research use case
• How did the European Monetary Union (EMU)
  come to be in the 1990s?
• What events led to the creation of the EMU?
• How was this represented by the media at
  that time?




Current approach
[Figure: two combinations of sources, each with its outcome]
• Too much work
• Limited material and different systems
PoliMedia approach
• PoliMedia Portal
      – Browse: debate and date
      – Search: debate and person
• Staten Generaal Digitaal (KB), 1818-1995
• Newspapers (KB), 1950-1995
• Television (Sound and Vision), 1956-1995
• Radio (KB), 1950-1984

Why PoliMedia?
Better insight into the relations between media
items




Data sets
• Primary data set:
      • The Dutch parliamentary debates (Handelingen der
        Staten-Generaal, the Dutch Hansard)
      • Available at the KB in raw format
      • Made CLARIN compliant in the War in Parliament project
         – chronological structure of consecutive speakers in a debate
• Secondary data set:
      1. NISV Academia set (OAI protocol)
      2. KB - newspapers (SRU protocol)
      3. KB - radio bulletins (SRU protocol)


Current status of technical work
1. Extract structure of debates
2. Find named entities in debate texts: people,
   organizations, locations.
3. Find links between debates and media.




1. Debate dataset structure
[Figure: structural levels of a debate document]
• Debate metadata
• Topic
• Speaker
• Speech segment




Debate metadata schema
[Figure: graph of the debate metadata schema]
• Nodes: debate, speech, speech segment
• Properties: poli:hasPubDate, poli:hasDesc, poli:MediaType,
  poli:hasSpeech, poli:hasNextSpeech, poli:hasSpeechSegment,
  poli:hasNextSegment, sem:hasActor,
  poli:mentions (people, locations, organizations),
  poli:coveredIn (media)
• Example values: 2011-12-14, "Stemmingen over…",
  "Natuur en milieu", Dbpedia: transcript
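
The schema above can be read as a nested structure of debates, speeches and speech segments. The following is a minimal sketch in Python, not the project's actual data model: the class and field names are hypothetical, and where each property attaches (for example sem:hasActor on the segment) is an assumption based on the property names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SpeechSegment:
    text: str
    actor: Optional[str] = None  # sem:hasActor (assumed to attach to the segment)
    mentions: List[str] = field(default_factory=list)    # poli:mentions
    covered_in: List[str] = field(default_factory=list)  # poli:coveredIn


@dataclass
class Speech:
    description: str  # poli:hasDesc
    segments: List[SpeechSegment] = field(default_factory=list)  # poli:hasSpeechSegment


@dataclass
class Debate:
    description: str  # poli:hasDesc
    pub_date: str     # poli:hasPubDate
    media_type: str   # poli:MediaType
    speeches: List[Speech] = field(default_factory=list)  # poli:hasSpeech


def next_links(items):
    """Pair each item with its successor, mirroring the chronological
    poli:hasNextSpeech and poli:hasNextSegment links."""
    return list(zip(items, items[1:]))
```

Calling `next_links(debate.speeches)` would give the speech-to-speech pairs in speaking order, and the same helper applies to the segments of one speech.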




2. Named Entity Recognition
                    in debates
• Fietstas: web services for processing textual
  content
      – http://fietstas.science.uva.nl/
• Lists the named entities (NEs) that appear in
  specific documents or sets of documents
• Works well with Dutch (unlike other
  popular services such as DBpedia Spotlight)


Named Entity Recognition
[Figure: each debate file is paired with a file of its named entities]
• debate1.xml → debate1.xml + ner1.xml
• debate2.xml → debate2.xml + ner2.xml
• debate3.xml → debate3.xml + ner3.xml



Named Entity Recognition
• Persons
• Organizations
• Locations
• Miscellaneous
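
As an illustration of how recognized entities might be organized per debate into these four categories, here is a small Python sketch. It does not call Fietstas; it assumes the recognizer's output has already been reduced to (surface form, category) pairs, and the function name and category labels are hypothetical.

```python
from collections import defaultdict

# Category labels are assumptions that mirror the four classes on the slide.
CATEGORIES = ("person", "organization", "location", "misc")


def group_entities(tagged):
    """Group (surface form, category) pairs into sorted, de-duplicated lists.

    Unknown categories are filed under "misc".
    """
    grouped = defaultdict(set)
    for surface, category in tagged:
        if category not in CATEGORIES:
            category = "misc"
        grouped[category].add(surface)
    return {cat: sorted(grouped[cat]) for cat in CATEGORIES}


entities = group_entities([
    ("Van Mierlo", "person"),
    ("Princen", "person"),
    ("Van Mierlo", "person"),
])
print(entities["person"])
```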




3. Find links to newspapers and radio
                 bulletins
We use the dates, topics, named entities and
speakers of the debates to query the media
archives.

Media document harvesting:
• SRU protocol (Search/Retrieve via URL)
• http://www.loc.gov/standards/sru/
• JSRU is a Java implementation of the SRU
  protocol at the KB
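
An SRU request is an ordinary URL whose parameters carry the operation and a CQL query. The sketch below builds such a URL in Python. The parameters `version`, `operation`, `query`, `startRecord` and `maximumRecords` come from the SRU standard; the base URL is a placeholder, and `x-collection` is assumed here as a server-specific extension parameter for selecting a collection.

```python
from urllib.parse import urlencode


def sru_search_url(base_url, cql_query, collection=None,
                   start_record=1, max_records=10):
    """Build an SRU searchRetrieve URL for the given CQL query."""
    params = {
        "version": "1.2",
        "operation": "searchRetrieve",
        "query": cql_query,
        "startRecord": start_record,
        "maximumRecords": max_records,
    }
    if collection is not None:
        # Assumption: the server accepts an "x-" extension parameter
        # to choose which collection is searched.
        params["x-collection"] = collection
    return base_url + "?" + urlencode(params)


# Placeholder endpoint, not a real service address.
print(sru_search_url("https://example.org/sru", '"Van Mierlo"',
                     collection="DDD_krantnr"))
```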

Automatic Query Construction
• Persons, Locations and Organizations
  mentioned inside topics of the debate
• Speakers

[Figure: debate metadata with Topic 1 (Speaker 1 / Content, Speaker 2 / Content,
Speaker 3 / Content) and Topic 2 (Speaker 1 / Content) feeding the query]
• TopicList = PersonsInTopic, LocationsInTopic, Org.InTopic
• Speaker n = ActorFromSegment, TimeFrame
• Query = TopicList + Speaker n

Example query: give all the newspaper issues in the collection
DDD_krantnr where the date value is between 01-01-1940 and 31-12-1945
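
The combination of topic entities, speaker and timeframe into one query can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the project's implementation: the function names are hypothetical, and the `date within "start end"` clause assumes that the target archive exposes a `date` index that accepts the CQL `within` relation with dates in this format.

```python
def quote(term):
    """Wrap a term in double quotes for use as a CQL phrase."""
    return '"' + term.replace('"', '\\"') + '"'


def build_query(persons, locations, organizations, speaker,
                start, end, max_terms=3):
    """Combine topic entities, the speaker and a timeframe into one query.

    max_terms caps the number of topic entities so that the query
    does not become too specific to match anything.
    """
    topic_terms = (list(persons) + list(locations) + list(organizations))[:max_terms]
    clauses = [quote(term) for term in topic_terms]
    if speaker:
        clauses.append(quote(speaker))
    text_part = " AND ".join(clauses)
    # Assumed date syntax; adjust to the index names of the actual archive.
    date_part = f'date within "{start} {end}"'
    return f"({text_part}) AND {date_part}" if text_part else date_part


# Which of the two names is the speaker is chosen here for illustration only.
print(build_query(["Princen"], [], [], "Van Mierlo", "21-12-1994", "21-01-1995"))
```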
Newspaper metadata
[Figure: graph of an article instance and its metadata]
• poli:hasPubDate: 1951-11-08
• poli:hasTitle: SCHUTJASSEN
• poli:PublishedIn: De Heerenveensche koerier
• poli:MediaType: Dbpedia: Newspaper article
• poli:Mentions




Radio bulletin metadata
[Figure: graph of an article instance and its metadata]
• poli:hasPubDate: 1946/05/06
• poli:hasTitle: ANP Nieuwsbericht - 06-05-1946 - 10
• Dbpedia: Radio bulletin
The date of a debate and a media article
• We use the dates, topics, named entities and speakers of the
  debates to query the media archives.
• A news item always appears on the day of the debate or later.
• How much time should we allow between debate and media item?
• Current choice: 1 month.

Results 1-26 of 26 for “Princen” AND “Van Mierlo”
Timeframe: one-month period
• 26 articles in the period between 21/12 and 21/01
• 7 on the day of the debate, only 1 article 1 month later.
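
The one-month rule can be expressed as a simple date window. The Python sketch below is one possible reading of that rule, with hypothetical function names: it treats "1 month" as one calendar month and clamps the day when the following month is shorter.

```python
import calendar
from datetime import date


def add_one_month(d):
    """Return the same day in the next month, clamped to that month's length."""
    year = d.year + (1 if d.month == 12 else 0)
    month = 1 if d.month == 12 else d.month + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def in_window(debate_date, item_date):
    """True if the media item falls on the debate day or within one month after it."""
    return debate_date <= item_date <= add_one_month(debate_date)


debate = date(1994, 12, 21)
print(add_one_month(debate))                 # end of the window
print(in_window(debate, date(1994, 12, 20))) # item from before the debate
```

For a debate on 21 December 1994 the window ends on 21 January 1995, and an item dated the day before the debate is rejected.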
Debate → Newspaper example
[Figure: search results for the example debate]
Dates between 21.12.1994 (debate date) and 21.01.1995

• Queries:
      – Small number of topics (to avoid overspecialization)
      – Shorter timespan (fast media cycle)
Overview
[Figure: inputs combined into a query]
• PersonsInTopic
• LocationsInTopic
• Org.InTopic
• TimeFrame
• ActorFromSegment
→ Query




PoliMedia+
• Elections in September
• 300 influential political Twitter accounts




What can you do with this?
• PoliMedia offers better insight into the relation
  between politics and media
• What can speech and language technologists
  do with it?




Contact
                 www.polimedia.nl
               kemman@eshcc.eur.nl
Acknowledgements
• Rest of the team
      – Laura Hollink (VU), Geert-Jan Houben, Damir Juric (TU
        Delft), Johan Oomen, Jaap Blom (NISV), Martijn
        Kleppe (EUR)
      – KB
• War in Parliament
• CLARIN
      – Arjan van Hessen

Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
 

PoliMedia presentation NOTaS meeting

  • 1. Interlinking multimedia for the analysis of media coverage of political debates Max Kemman & Henri Beunders NOTaS meeting www.polimedia.nl
  • 2. Main goal • Aimed at Humanities researchers • Using CLARIN standard 25-6-2012 PoliMedia - NOTaS meeting 2
  • 3. Main research question: What choices do different media make in their coverage of people and topics when reporting on debates in the Dutch parliament, from the first televised evening news in 1956 until 1995?
  • 4. Historical research use case • How did the European Monetary Union (EMU) come to be in the 1990s? • What events led to the creation of the EMU? • How was this all represented by the media at that time?
  • 5. Current approach (two diagrams, images not preserved) • First combination of sources = too much work • Second combination of sources = limited material and different systems
  • 6. PoliMedia approach (diagram) • PoliMedia Portal: browse by debate and date; search by debate and person • Staten Generaal Digitaal (KB, 1818-1995) • Newspapers (KB, 1950-1995) • Television (Sound and Vision, 1956-1995) • Radio (KB, 1950-1984)
  • 7. Why PoliMedia? Better insight into the relations between media items
  • 8. Data sets • Primary data set: the Dutch parliamentary debates (Handelingen der Staten-Generaal, the Dutch Hansard) • Available at the KB in raw format • Made CLARIN compliant in the War in Parliament project: chronological structure of consecutive speakers in a debate • Secondary data sets: 1. NISV Academia set (OAI protocol) 2. KB newspapers (SRU protocol) 3. KB radio bulletins (SRU protocol)
  • 9. Current status of technical work 1. Extract structure of debates 2. Find named entities in debate texts: people, organizations, locations. 3. Find links between debates and media.
  • 10. 1. Debate dataset structure (diagram): Debate metadata, Topic, Speaker, Speech segment
  • 11. Debate metadata schema (diagram) • Nodes: debate, speech, speech segment • Properties: poli:hasSpeech, poli:hasSpeechSegment, poli:hasNextSpeech, poli:hasNextSegment, poli:hasPubDate, poli:hasDesc, sem:hasActor, poli:MediaType (Dbpedia: transcript), poli:coveredIn (media), poli:mentions (people, locations, organizations) • Example values: 2011-12-14, "Stemmingen over…", "Natuur en milieu"
  • 12. 2. Named Entity Recognition in debates • Fietstas: web services for processing textual content (http://fietstas.science.uva.nl/) • Produces lists of named entities (NEs) that appear in specific documents or sets of documents • Works well with the Dutch language (unlike other popular services such as DBpedia Spotlight)
  • 13. Named Entity Recognition (diagram): each debate file (debate1.xml, debate2.xml, debate3.xml) is paired with a named-entity file (ner1.xml, ner2.xml, ner3.xml)
  • 14. Named Entity Recognition • Persons • Organizations • Locations • Miscellaneous
  • 15. 3. Find links to newspapers and radio bulletins • We use the dates, topics, named entities and speakers of the debates to query the media archives. • Media document harvesting: SRU protocol (Search/Retrieve via URL), http://www.loc.gov/standards/sru/ • JSRU is a Java implementation of the SRU protocol at the KB
  • 16. Automatic Query Construction • Inputs: persons, locations and organizations mentioned inside topics of the debate; speakers • Query = TopicList (PersonsInTopic, LocationsInTopic, Org.InTopic) + ActorFromSegment + TimeFrame • Example query: give all the newspaper issues in the collection DDD_krantnr where the date value is between 01-01-1940 and 31-12-1945 • Diagram: Debate Metadata contains Topic 1 (Speaker 1 / Content, Speaker 2 / Content, Speaker 3 / Content, ..., Speaker n) and Topic 2 (Speaker 1 / Content)
  • 17. Newspaper metadata (diagram): article instance with poli:hasPubDate 1951-11-08; poli:hasTitle "SCHUTJASSEN"; poli:PublishedIn "De Heerenveensche koerier"; poli:MediaType Dbpedia: Newspaper article; poli:Mentions
  • 18. Radio bulletin metadata (diagram): article instance with poli:hasPubDate 1946/05/06; poli:hasTitle "ANP Nieuwsbericht - 06-05-1946 - 10"; Dbpedia: Radio bulletin
  • 19. The date of a debate and a media article • We use the dates, topics, named entities and speakers of the debates to query the media archives. • A news item always appears on the day of the debate or later. • How much time should we allow between debate and media item? • Current choice: 1 month. • Example: results 1-26 of 26 for “Princen” AND “Van Mierlo” with a one-month timeframe: 26 articles in the period between 21/12 and 21/01; 7 on the day of the debate, only 1 article 1 month later.
  • 20. Debate → Newspaper example • Dates between 21.12.1994 (debate date) and 21.01.1995 • Queries: a small number of topics (to avoid overspecialization); a shorter timespan (fast media cycle)
  • 21. Overview (diagram): PersonsInTopic, LocationsInTopic, Org.InTopic, ActorFromSegment and TimeFrame feed into Query
  • 22. PoliMedia+ • Elections in September • 300 influential political Twitter accounts
  • 23. What can you do with this? • PoliMedia allows better insight into the relation between politics and media • What can speech and language technologists do with it?
  • 24. Contact: www.polimedia.nl, kemman@eshcc.eur.nl • Acknowledgements • Rest of the team: Laura Hollink (VU), Geert-Jan Houben, Damir Juric (TU Delft), Johan Oomen, Jaap Blom (NISV), Martijn Kleppe (EUR) • KB • War in Parliament • CLARIN: Arjan van Hessen
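
The query construction and one-month time frame described in slides 15, 16 and 19 can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: the endpoint URL, the `date` index name, the function names and the date format inside the query are all hypothetical. Only the SRU parameter names (`operation`, `version`, `query`, `maximumRecords`) come from the SRU standard.

```python
import calendar
from datetime import date
from urllib.parse import urlencode

# Hypothetical endpoint: the slides do not give the real SRU base URL.
SRU_BASE_URL = "https://example.org/sru"


def add_one_month(day: date) -> date:
    """Return the same day in the following month, clamped to that month's length."""
    year = day.year + (1 if day.month == 12 else 0)
    month = 1 if day.month == 12 else day.month + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def build_cql(entities: list[str], debate_date: date) -> str:
    """Combine quoted entity names and a one-month window into a CQL string.

    The "date" index name and the dd-mm-yyyy format are assumptions; a real
    archive may expect different names or formats. Entity names containing
    double quotes are not escaped in this sketch.
    """
    terms = [f'"{name}"' for name in entities]
    start = debate_date.strftime("%d-%m-%Y")
    end = add_one_month(debate_date).strftime("%d-%m-%Y")
    # News items appear on the day of the debate or later, never before it.
    terms.append(f'date >= "{start}"')
    terms.append(f'date <= "{end}"')
    return " AND ".join(terms)


def build_sru_url(entities: list[str], debate_date: date, max_records: int = 50) -> str:
    """Build an SRU searchRetrieve request URL for the given entities and debate date."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        "query": build_cql(entities, debate_date),
        "maximumRecords": max_records,
    }
    return f"{SRU_BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # The example from slides 19 and 20: debate on 21 December 1994.
    print(build_cql(["Princen", "Van Mierlo"], date(1994, 12, 21)))
    print(build_sru_url(["Princen", "Van Mierlo"], date(1994, 12, 21)))
```

Keeping the entity list short, as slide 20 suggests, matters here because every extra term is joined with `AND` and therefore narrows the result set.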

Editor's Notes

  1. Limited: not everything is in it, but more importantly no mark-up or pages
  2. Searching and browsing multimedia databases in a single interface; offering better insight into the relations between media items; allowing researchers to create their own interface on top of the infrastructure.