Searching Political Data by Strategy. Roberto Cornacchia, Jaap Kamps, Wouter Alink, Arjen P. de Vries. firstname.lastname@example.org
Search by Strategy: An iterative 2-stage search process. Express domain knowledge as high-level search strategies. Generate a search engine from the strategy: a dynamic REST API; UI controls for unspecified parameters. Separate search strategy definition (the how) from actual searching and browsing of data collections (the what).
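As a rough illustration of how a strategy with unspecified parameters could map onto a generated endpoint, here is a minimal sketch. Everything in it is an assumption made for illustration: the `Strategy` class, the parameter names, and the `/search` URL layout are hypothetical and not taken from the actual system.

```python
from dataclasses import dataclass, field
from urllib.parse import urlencode


@dataclass
class Strategy:
    """Hypothetical strategy: a name plus parameters; None marks 'unspecified'."""
    name: str
    params: dict = field(default_factory=dict)

    def unspecified(self):
        # Parameters the strategy designer left open; a UI would show a control for each.
        return sorted(k for k, v in self.params.items() if v is None)

    def request_url(self, base, **user_values):
        # Refuse to build a request while open parameters are still missing.
        missing = set(self.unspecified()) - set(user_values)
        if missing:
            raise ValueError(f"missing parameters: {sorted(missing)}")
        merged = {**self.params, **user_values}
        return f"{base}/{self.name}?{urlencode(sorted(merged.items()))}"


STRATEGY = Strategy("utterances", {"stemming": "off", "query": None, "party": None})
OPEN_PARAMS = STRATEGY.unspecified()
URL = STRATEGY.request_url("/search", query="knettergek", party="X")
```

In this sketch the strategy author fixes `stemming`, while `query` and `party` remain open and are filled in at search time, which mirrors the split between defining a strategy and searching with it.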
Search by Strategy captures: Arbitrary retrieval unit types (not just documents), e.g., expert finding, entity search. “Semantic” search: the building blocks operate on scored triples. Semi-structured search: data objects may be structured in hierarchies. Exploratory search: use facets as preferences.
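To make "building blocks that operate on scored triples" concrete, the following is a minimal sketch under stated assumptions. The `ScoredTriple` type, the three block functions, and the toy data are all hypothetical; the real system's block set and scoring functions are not described in these slides.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ScoredTriple:
    """Hypothetical scored triple: (subject, predicate, object) plus a score."""
    subject: str
    predicate: str
    obj: str
    score: float = 1.0


def with_predicate(triples, predicate):
    # Block 1: keep only triples with the given predicate.
    return [t for t in triples if t.predicate == predicate]


def score_by_term(triples, term):
    # Block 2: multiply each score by the term's frequency in the object; drop non-matches.
    out = []
    for t in triples:
        hits = t.obj.lower().split().count(term.lower())
        if hits:
            out.append(replace(t, score=t.score * hits))
    return out


def top_k(triples, k):
    # Block 3: return the k highest-scoring triples.
    return sorted(triples, key=lambda t: t.score, reverse=True)[:k]


# Toy data, invented for illustration.
TRIPLES = [
    ScoredTriple("utt1", "text", "the budget debate on the budget"),
    ScoredTriple("utt2", "text", "a question about the budget"),
    ScoredTriple("utt1", "speaker", "person1"),
]

RANKED = top_k(score_by_term(with_predicate(TRIPLES, "text"), "budget"), k=2)
```

Because every block takes and returns a list of scored triples, blocks can be chained in any order, which is the property that lets a strategy be assembled from reusable parts.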
Exposé: Searching the parliamentary proceedings of the Dutch parliament. Complete transcripts of everything said in parliament, organized by parliamentary session, detailing who says what in what role and context.
Exposé: Original data is PDF, transformed into XML by the award-winning project Political Mashup, http://politicalmashup.nl/
In Politics… The essence is not only what is said, but also by whom, to whom, and why. Concrete example: Wilders says “knettergek” (roughly, “completely crazy”) in parliament (in 2007). Is this remarkable?
“Knettergek” case: The word “knettergek” has been used many times in parliament… but never to address a member of the government.
Varying result types: Utterances, Person / Party / …
Flexibility: A concrete case. Maarten: “I cannot find Prof. Mokken, who I know has been spoken about in parliament multiple times!”
Flexibility: Default indexing uses stemming and normalization. But searching for people’s names (and, for that matter, much other domain-specific terminology) can be negatively affected by stemming: “Mokken” is transformed into “mok”, leading us to the geographic locations “Mook” and “De Mok”, but not to the famous professor!
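The effect can be reproduced with a small sketch that builds the same inverted index twice, once with an aggressive normalizer and once with lowercase-only matching. Note the assumptions: `toy_stem` is a deliberately crude stand-in (drop a trailing "en", collapse doubled letters) and not the stemmer used by the real indexer, and the three documents are invented for illustration.

```python
import re


def toy_stem(token):
    # Toy normalizer (assumption): lowercase, drop trailing 'en', collapse doubled letters.
    t = token.lower()
    if t.endswith("en") and len(t) > 4:
        t = t[:-2]
    return re.sub(r"(.)\1", r"\1", t)


def exact(token):
    # Alternative normalizer: lowercase only, no stemming.
    return token.lower()


def build_index(docs, normalize):
    # Inverted index: normalized token -> set of document ids.
    index = {}
    for doc_id, text in docs.items():
        for token in re.findall(r"\w+", text):
            index.setdefault(normalize(token), set()).add(doc_id)
    return index


def search(index, query, normalize):
    # Normalize the query the same way the index was built.
    return sorted(index.get(normalize(query), set()))


# Toy documents, invented for illustration.
DOCS = {
    "d1": "Prof. Mokken was mentioned in the debate",
    "d2": "A question about the town of Mook",
    "d3": "A remark about De Mok",
}

STEMMED = build_index(DOCS, toy_stem)
EXACT = build_index(DOCS, exact)

print(search(STEMMED, "Mokken", toy_stem))  # ['d1', 'd2', 'd3']
print(search(EXACT, "Mokken", exact))       # ['d1']
```

With the stemmed index the professor is buried among place names; with the exact index only the intended document matches. Letting the strategy choose the index per query is the kind of flexibility this slide argues for.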
Joins to the rescue! Which house speakers from the Rotterdam harbour say what about “Amsterdam”?
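A question like this combines two sources: biographies (to select the people) and utterances (to select what they said). The sketch below shows one way such a join could look. It assumes simple phrase matching and a shared person identifier, and all names and data in it are hypothetical.

```python
# Toy data, invented for illustration: person id -> biography text.
BIOGRAPHIES = {
    "person_a": "Former manager in the Rotterdam harbour",
    "person_b": "Teacher from Utrecht",
}

# Toy utterances as (utterance id, speaker id, text).
UTTERANCES = [
    ("u1", "person_a", "Amsterdam should invest in rail"),
    ("u2", "person_b", "Amsterdam has too many tourists"),
    ("u3", "person_a", "The budget is too tight"),
]


def contains(text, phrase):
    # Case-insensitive phrase match; a real engine would score rather than filter.
    return phrase.lower() in text.lower()


def speakers_join(biographies, utterances, bio_phrase, utterance_phrase):
    # Step 1: select people whose biography matches.
    selected = {p for p, bio in biographies.items() if contains(bio, bio_phrase)}
    # Step 2: join on speaker id and keep utterances that match the topic.
    return [
        (speaker, utt_id, text)
        for utt_id, speaker, text in utterances
        if speaker in selected and contains(text, utterance_phrase)
    ]


RESULT = speakers_join(BIOGRAPHIES, UTTERANCES, "Rotterdam harbour", "Amsterdam")
```

Here only `person_a` passes the biography filter, and of that person's two utterances only `u1` mentions Amsterdam, so the join returns a single row.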
Semantic Search (diagram labels: biographies, describes, person, utterance)
Advantages: Define and execute custom-built search strategies, specialized to the task, or even to the search at hand. Search multiple data sources at once. Explore and refine results interactively. “Search provenance”: complete transparency on how search results were obtained.
Position Statement: Search professionals already think in terms of search strategies. Let them design their own strategies, and thereby tailor their search engines, so they learn to trust what we claim to be the effective information retrieval techniques!