Searching Political Data by Strategy


Presentation on the Exposé demonstrator, to enable search of Dutch parliamentary proceedings (the Political Mashup data collection).


  1. 1. Searching Political Data by Strategy. Roberto Cornacchia, Jaap Kamps, Wouter Alink, Arjen P. de Vries
  2. 2. Search by Strategy An iterative 2-stage search process  Express domain knowledge as high-level search strategies  Generate search engine from the strategy  A dynamic REST API  UI controls for unspecified parameters Separate search strategy definition (the how) from actual searching and browsing of data collections (the what)
  3. 3. dashboard/demo04: /p/topic/Mokken
  4. 4. Search by Strategy captures:
       • Arbitrary retrieval unit types (not just documents), e.g., expert finding, entity search
       • “Semantic” search: the building blocks operate on scored triples
       • Semi-structured search: data objects may be structured in hierarchies
       • Exploratory search: use facets as preferences
  5. 5. Exposé Searching the parliamentary proceedings of the Dutch parliament  Complete transcripts of everything said in parliament  Organized by parliamentary session  Detailing who sais what in what role and context
  6. 6. Exposé Original data is PDF, transformed into XML by award-winning project Political Mashup 
  7. 7. In Politics… The essence is not only what is said, but also by whom, to whom, and why. Concrete example:
       • Wilders says “knettergek” in parliament (in 2007): is this remarkable?
  8. 8. “Knettergek” case: the word “knettergek” has been used many times in parliament… but never to address a member of the government
  9. 9. Varying result types: Utterances, Person / Party / …
  10. 10. Flexibility Concrete case:  Maarten: “I cannot find Prof. Mokken, who I know has been spoken about in parliament multiple times!”
  11. 11. Flexibility Default indexing uses stemming and normalization But… searching for people’s names (and, as we mention it, many other domain specific terminology) can be negatively affected by stemming  “Mokken” transformed into “mok”, leading us to geographic locations “Mook” and “De Mok”, but not to the famous professor!
  12. 12. /p/topic/mokken/p/emphasis_stemming/0
  13. 13. Joins to the rescue! Which house speakers from the Rotterdam harbour say what about “Amsterdam”?
  14. 14. Semantic Search (diagram labels: biographies, describes, person, utterance)
  15. 15. Advantages Define and execute custom build search strategies  Specialized to the task, or even to the search at hand Search multiple data sources at once Explore and refine results interactively “Search provenance”  Complete transparency on how search results were obtained
  16. 16. Position Statement
       • Search professionals already think in terms of search strategies
       • Let them design their own strategies, and thereby tailor their search engines
       • So they learn to trust what we claim to be the effective information retrieval techniques!
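
The Mokken case from the Flexibility slides can be made concrete with a small sketch. Nothing below is the Exposé implementation. The `/p/<name>/<value>` path grammar is only inferred from the two URLs shown on the slides, reading `emphasis_stemming/0` as "stemming off" is an assumption, `toy_stem` is a deliberately crude stand-in for a real Dutch stemmer, and the three utterances in `DOCS` are invented for illustration.

```python
import re

# Invented example utterances (not taken from the proceedings).
DOCS = {
    "u1": "Professor Mokken sprak over methodologie",
    "u2": "De brug bij Mook is gesloten",
    "u3": "Het marineterrein De Mok op Texel",
}


def parse_params(path):
    """Read '/p/<name>/<value>' segments into a dict (assumed path grammar)."""
    parts = [segment for segment in path.split("/") if segment]
    if len(parts) % 3 != 0:
        raise ValueError("expected repeated /p/<name>/<value> segments")
    params = {}
    for marker, name, value in zip(parts[0::3], parts[1::3], parts[2::3]):
        if marker != "p":
            raise ValueError("unexpected segment: " + marker)
        params[name] = value
    return params


def toy_stem(word):
    """Crude stand-in for a Dutch stemmer; only good enough for this example."""
    w = word.lower()
    # Strip the plural/infinitive suffix "-en": "mokken" -> "mokk".
    if w.endswith("en") and len(w) > 4:
        w = w[:-2]
    # Undouble a trailing double consonant: "mokk" -> "mok".
    if len(w) >= 2 and w[-1] == w[-2] and w[-1] not in "aeiou":
        w = w[:-1]
    # Undouble a long vowel before a final consonant: "mook" -> "mok".
    w = re.sub(r"(aa|ee|oo|uu)([^aeiou])$",
               lambda m: m.group(1)[0] + m.group(2), w)
    return w


def search(topic, stemming):
    """Return ids of utterances containing the topic term, with or without stemming."""
    normalize = toy_stem if stemming else str.lower
    query = normalize(topic)
    return sorted(
        doc_id
        for doc_id, text in DOCS.items()
        if query in {normalize(token) for token in text.split()}
    )


if __name__ == "__main__":
    params = parse_params("/p/topic/mokken/p/emphasis_stemming/0")
    use_stemming = params.get("emphasis_stemming", "1") != "0"
    print(params)
    print("stemming on :", search(params["topic"], stemming=True))
    print("as requested:", search(params["topic"], stemming=use_stemming))
```

In this toy setting, stemming collapses "Mokken", "Mook" and "Mok" onto the same term, so the professor can no longer be told apart from the place names; switching stemming off through the parameter leaves only the utterance that mentions the name literally.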