
Bringing parliamentary debates to the Semantic Web

Presentation of the paper 'Bringing parliamentary debates to the Semantic Web' by Damir Juric, Laura Hollink and Geert-Jan Houben at the workshop on Detection, Representation, and Exploitation of Events in the Semantic Web (DeRiVE2012) in conjunction with the 11th International Semantic Web Conference 2012 in Boston, USA.

See also the homepage of the PoliMedia project: http://polimedia.nl/

Bringing parliamentary debates to the Semantic Web

  1. 1. Bringing parliamentary debates to the Semantic Web
     Damir Juric (1, 3), Laura Hollink (2), Geert-Jan Houben (1)
     (1) Delft University of Technology, (2) VU University Amsterdam, (3) FER University of Zagreb
     DERIVE 2012, Boston, 12.11.2012
  2. 2. Motivation
     Cross-media comparison:
     • What choices do different media make in the coverage of people and topics while reporting on political events?
     • Does the representation of topics and people change over time, and how do the various media types differ?
  3. 3. Motivation
     [Diagram labels: Political events, Media]
     Cross-media comparison:
     • What choices do different media make in the coverage of people and topics while reporting on political events?
     • Does the representation of topics and people change over time, and how do the various media types differ?
  4. 4. Background: the PoliMedia project
     • Funded by CLARIN-NL
     • May 2012 - May 2013
     • 3 phases:
       I. modeling phase: creating a semantic model (this presentation)
       II. data production phase: creating links between political events and media
       III. application phase: searching and navigating linked datasets
     • www.polimedia.nl
  5. 5. Research questions
     • How to represent political events on the Semantic Web?
     • How to represent links between media and political events on the Semantic Web?
  6. 6. Research questions
     • How to represent political events on the Semantic Web?
     • How to represent links between media and political events on the Semantic Web?
  7. 7. Political events data set
     • Events: Dutch parliamentary debates (Handelingen der Staten-Generaal, or Dutch Hansard)
     • Some provenance:
       1. Transcripts are made of the complete debates of the Dutch parliament.
       2. Published online by the government on http://www.statengeneraaldigitaal.nl/ (1818-1995) and http://officielebekendmakingen.nl/ (from 1995)
       3. The PoliticalMashup project has translated the government PDF and TXT files into XML, including URIs as identifiers; see http://politicalmashup.nl/
       4. We build on that.
  8. 8. Media data sets
     • Newspaper articles and radio bulletins
       • at the National Library of the Netherlands
       • many, mostly regional, newspapers, 1950-1995
       • text + images of newspaper layout
     • Newscasts
       • at the Netherlands Institute for Sound and Vision
       • evening news and current affairs programs
       • metadata in Dublin Core and CDMI format
       • enriched with thesaurus terms from the Gemeenschappelijke Thesaurus Audiovisuele Archieven (GTAA)
  9. 9. Semantic model: what do we need to represent? 1/2
     Important information for every parliamentary debate is:
     • When the debate was held
     • What is being said in the debate (topics)
     • Who is giving the speeches in the debate and in which role (persons)
     • Additional information about actors involved in the event (names of the politicians, their party, age, etc.)
     • Structure: subparts of the debate have their own identifiers (a subpart is a part of the debate where only one speaker can be identified as actor)
     • Chronological order (the order in which the subparts occurred within the parliamentary debate)
     • Named entities apart from politicians (persons, locations, etc.)
     [Diagram: a Debate with Metadata, divided into Topic 1 (Speaker 1 / Content, Speaker 2 / Content, Speaker 3 / Content) and Topic 2 (Speaker 1 / Content)]
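
The requirements on slide 9 can be pictured as a small nested data structure. The following is only a minimal sketch under stated assumptions, not the PoliMedia model itself: the class names, field names, identifiers, and the date are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Speech:
    """A subpart of a debate in which a single speaker is the actor."""
    identifier: str          # hypothetical local identifier
    speaker: str             # name of the politician
    role: str                # role in which the speaker takes the floor
    order: int               # position in the chronological sequence
    text: str = ""
    named_entities: list[str] = field(default_factory=list)


@dataclass
class Topic:
    """A topic section of a debate, holding its speeches."""
    title: str
    speeches: list[Speech] = field(default_factory=list)


@dataclass
class Debate:
    identifier: str
    held_on: date
    topics: list[Topic] = field(default_factory=list)

    def speeches_in_order(self) -> list[Speech]:
        """Return all speeches of the debate in chronological order."""
        all_speeches = [s for t in self.topics for s in t.speeches]
        return sorted(all_speeches, key=lambda s: s.order)


# Made-up example data mirroring the Topic / Speaker layout of the diagram.
debate = Debate(
    identifier="debate-1",
    held_on=date(1983, 1, 1),
    topics=[
        Topic("Topic 1", [Speech("s2", "Speaker 2", "member", 2),
                          Speech("s1", "Speaker 1", "chair", 1)]),
        Topic("Topic 2", [Speech("s3", "Speaker 1", "chair", 3)]),
    ],
)
print([s.identifier for s in debate.speeches_in_order()])
```

Storing an explicit `order` on each speech is one way to keep the chronological sequence recoverable even when speeches are grouped by topic.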
  10. 10. Semantic model: what do we need to represent? 2/2
     • Various information about media items linked to the debate
     • Links between subparts of the debate and news articles, radio bulletins and television newscasts
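
A link between a debate subpart and a media item can be written down as a single RDF statement. The sketch below serializes such a link as an N-Triples line. The namespace is the PoliMedia vocabulary namespace given in the slides, but the property name `coveredIn` and both example URIs are hypothetical; the slides do not say which property the project uses.

```python
# Namespace taken from the slides; the property name below is hypothetical.
POLIVOC = "http://purl.org/linkedpolitics/nl/polivoc#"
COVERED_IN = POLIVOC + "coveredIn"  # assumption, not confirmed by the source


def link_triple(debate_part_uri: str, media_item_uri: str) -> str:
    """Serialize one debate-part-to-media link as an N-Triples line."""
    return f"<{debate_part_uri}> <{COVERED_IN}> <{media_item_uri}> ."


# Placeholder URIs, for illustration only.
part = "http://example.org/debate/1/part/2"
article = "http://example.org/newspaper/article/42"
print(link_triple(part, article))
```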
  11. 11. URIs
     • PoliMedia vocabulary: http://purl.org/linkedpolitics/nl/polivoc#Speech
     • Politicians, parties: http://purl.org/linkedpolitics/nl/poli#Beel
     • Debates and parts of debates: http://purl.org/linkedpolitics/nl/nl.proc.sgd.d.198219830000846.2.11.12
     • Media articles, bulletins and newscasts: http://resolver.kb.nl/resolve?urn=ddd:010069811:mpeg21:pdf
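
The four URI patterns on slide 11 can be told apart by their prefixes. The helper below is a rough sketch that generalizes from the single example given per category, which is an assumption: real identifiers may follow other patterns, and the media prefix only covers the National Library resolver shown on the slide. The function name and category labels are made up for illustration.

```python
# Prefixes inferred from the one example per category on the slide (assumption).
PREFIXES = [
    ("http://purl.org/linkedpolitics/nl/polivoc#", "vocabulary term"),
    ("http://purl.org/linkedpolitics/nl/poli#", "politician or party"),
    ("http://purl.org/linkedpolitics/nl/nl.proc.", "debate or debate part"),
    ("http://resolver.kb.nl/resolve?urn=", "media item"),
]


def classify(uri: str) -> str:
    """Return a coarse category for a URI based on its prefix."""
    for prefix, kind in PREFIXES:
        if uri.startswith(prefix):
            return kind
    return "unknown"


for example in [
    "http://purl.org/linkedpolitics/nl/polivoc#Speech",
    "http://purl.org/linkedpolitics/nl/poli#Beel",
    "http://purl.org/linkedpolitics/nl/nl.proc.sgd.d.198219830000846.2.11.12",
    "http://resolver.kb.nl/resolve?urn=ddd:010069811:mpeg21:pdf",
]:
    print(classify(example), "<-", example)
```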
  12. 12. Semantic model
  13. 13. Semantic model
  14. 14. Semantic model
  15. 15. Semantic model
  16. 16. Semantic model
  17. 17. Semantic model W.R. van Hage, V. Malaisé, R. Segers, L. Hollink and A.Th. Schreiber. Design and use of the Simple Event Model (SEM)
  18. 18. Semantic model W.R. van Hage, V. Malaisé, R. Segers, L. Hollink and A.Th. Schreiber. Design and use of the Simple Event Model (SEM)
  19. 19. Current work: finding links
     • Queries: speaker name + named entities + topics (created using topic modeling methods) extracted from the political events dataset
     • Used for retrieval of media articles
     [Diagram labels: TopicList, NamedEntitiesVector, TopicWordSetVector, Speech, PartOfDebate, Speaker, ActorFromSpeech, TimeFrame]
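
The query construction on slide 19 can be sketched roughly as follows. This is not the project's retrieval method: the symmetric time window, the substring matching, the `min_hits` threshold, and all names and example values are assumptions made for illustration. The slides only state that speaker name, named entities, and topic words are combined and used to retrieve media articles.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class MediaQuery:
    terms: list[str]   # speaker name, named entities and topic words
    start: date        # start of the time frame
    end: date          # end of the time frame


def build_query(speaker, named_entities, topic_words, debate_date, window_days=7):
    """Combine speaker name, named entities and topic words into one query."""
    terms = []
    for term in [speaker, *named_entities, *topic_words]:
        lowered = term.lower()
        if lowered not in terms:      # keep order, drop duplicates
            terms.append(lowered)
    window = timedelta(days=window_days)  # assumed symmetric time frame
    return MediaQuery(terms, debate_date - window, debate_date + window)


def matches(query, article_text, article_date, min_hits=2):
    """Naive filter: article falls in the time frame and mentions enough terms."""
    if not (query.start <= article_date <= query.end):
        return False
    text = article_text.lower()
    hits = sum(1 for term in query.terms if term in text)
    return hits >= min_hits


# Made-up example values.
query = build_query("Beel", ["Indonesia"], ["budget", "Indonesia"], date(1950, 3, 1))
print(query.terms)
print(matches(query, "Beel spoke about the budget", date(1950, 3, 2)))
```

A real system would more likely hand the terms to a search engine and rank results rather than apply a fixed threshold; the filter here only shows how the three term sources and the time frame fit together.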
  20. 20. Finally
     • A SPARQL endpoint with the PoliMedia vocabulary + RDF of the Dutch Hansard data will be available soon.
     • Feel free to use it!
     • Links to media + a search/browse app are expected early next year.
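
Once such an endpoint exists, a query could be sent to it over the standard SPARQL Protocol, which accepts the query text in a `query` parameter of a GET request. The sketch below only builds the request URL. The endpoint address is a placeholder, since the slides do not give one, and it is an assumption that speeches are typed as instances of the `polivoc:Speech` class named on slide 11.

```python
from urllib.parse import urlencode

# Placeholder address; the real endpoint is not given in the slides.
ENDPOINT = "http://example.org/sparql"

# Assumes speeches are typed with the polivoc:Speech class from slide 11.
QUERY = """
PREFIX polivoc: <http://purl.org/linkedpolitics/nl/polivoc#>
SELECT ?speech WHERE {
  ?speech a polivoc:Speech .
} LIMIT 10
"""


def request_url(endpoint: str, query: str) -> str:
    """Build a SPARQL Protocol GET URL for the given query."""
    return endpoint + "?" + urlencode({"query": query.strip()})


print(request_url(ENDPOINT, QUERY))
```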
  21. 21. Thank you for your attention!
     Henri Beunders (EUR), Damir Juric (TU Delft), Jaap Blom (NISV), Max Kemman (EUR), Laura Hollink (VU), Martijn Kleppe (EUR), Geert-Jan Houben (TU Delft), Johan Oomen (NISV)
