The Daedalus is a search engine that visualizes semantic relationships between words and concepts. It goes beyond typical search engines that treat documents as "bags of words" by assigning importance to words based on their context and connections to other words. This allows it to better understand relationships between synonyms, concepts from different domains, and how terms are used across languages. The tool also lets users manually re-weight key terms and concepts to better control searches and uncover new connections not found through typical search algorithms. The creators believe this approach could help researchers more deeply explore archives and texts at universities.
Digital Humanities Conference 2013
The Daedalus: A search engine for
visualizing semantic relationships
Brent Kievit-Kylar, Sean Connolly, & Colin Allen
Indiana University, Bloomington
Background
Search engines make predictions. Given the set of words a user
enters, a search engine makes a prediction about what
information that user is seeking. The predictions are generally
given in the form of a list, with the highest ranked prediction
coming first. But what determines this ranking?
Internet search engines turn each webpage into a generic
“bag” of words. The “words in the bag” carry no semantic or
grammatical meaning. Each is given a score based on how
many times it is repeated in the document and on its connection
to all the other words.
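A minimal sketch of the bag-of-words idea described above, in Python. It covers only the repetition count, not the connections between words, and it is not the Daedalus implementation. The function names `bag_of_words`, `score`, and `rank` are hypothetical and chosen for illustration.

```python
from collections import Counter
import re


def bag_of_words(text):
    """Lower-case the text and count each word; word order is discarded."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


def score(query, document):
    """Sum how often each distinct query word occurs in the document."""
    bag = bag_of_words(document)
    return sum(bag[word] for word in bag_of_words(query))


def rank(query, documents):
    """Return document indices ordered from highest to lowest score."""
    return sorted(
        range(len(documents)),
        key=lambda i: score(query, documents[i]),
        reverse=True,
    )
```

Because the bag ignores order and grammar, the queries "patronus animal" and "animal patronus" receive the same score against any document.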
Semantic Linking
Search engines can’t always tell when different words mean the
same thing. In the “chemist” example below, the search engine
reveals it doesn’t know that “qc” and “quality”-“control” mean the
same thing. It may never natively figure this out. But to a human
with expertise in a specific domain, the matter could be trivial.
Semantic linking could help with corpora compiled over many
years or across different languages. Aristotle separated “plot”
from “story”: “story” is the elements and events of a narrative
and “plot” is their ordering. Russian Formalists used the words
“fabula” and “syuzhet” to express a similar conceptualization of
narrative. Researchers might want to find both. The tool lets
a researcher build domain-specific knowledge into a generic
search tool.
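One way to picture semantic linking is a user-maintained table of equivalent terms that expands a query before it is searched. The sketch below assumes that reading; the class `SemanticLinks` and its methods are hypothetical and do not describe how Daedalus stores links internally.

```python
class SemanticLinks:
    """User-supplied groups of terms that should be treated as equivalent."""

    def __init__(self):
        # Maps each term to the set of terms linked to it (itself included).
        self._group = {}

    def link(self, a, b):
        """Declare that terms a and b mean the same thing."""
        a, b = a.lower(), b.lower()
        merged = self._group.get(a, {a}) | self._group.get(b, {b})
        # Point every member at the merged set so links are transitive.
        for term in merged:
            self._group[term] = merged

    def expand(self, term):
        """Return all terms linked to the given term, sorted alphabetically."""
        term = term.lower()
        return sorted(self._group.get(term, {term}))
```

A domain expert might call `link("qc", "quality control")` or `link("plot", "syuzhet")`, after which a search for either term can be expanded to cover both.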
Future Work
We believe Daedalus, the “data list”, may help most in the
querying and cataloging of archives held at universities. We
believe the tool can help researchers dive deeper into texts
with technology, see as yet unseen connections across and
within texts, and give users greater control over digital
research tools.
“Words are known by the company they keep.” (Firth 1957)
Semantic Override
Do you know the search-weighting protocols for your data
search tools? The way a tool is built affects its efficacy and
limits its potential uses. Daedalus allows the re-weighting of
terms so users may “take over” the search and override the
strength of the weightings of the word-symbol relationships.
Re-weighting the relationships of key words also simultaneously
refreshes the search with the new weights and generates a new
search results page.
Figure: A visual representation of the weighted “bag of words”
for the non-grammatical query “potter’s patronus animal” (drawn
from a real web query by our Daedalus tool, 9/29/12).
Reweight key terms
Figure: As part of the InPho project, the Daedalus represents
each article of the Stanford Encyclopedia of Philosophy as a
meta-object, showing the introduction in one domain and the
rest of the article as another. Re-weighting and linking
generates new search results.
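The re-weighting described under Semantic Override can be sketched as a query whose term weights the user may overwrite, with the search re-run on the new weights. This is an illustration under that assumption only: the functions `weighted_score`, `search`, and `reweight` are hypothetical names, and the scoring rule (weight times term count) is a stand-in, not the weighting Daedalus actually uses.

```python
from collections import Counter


def weighted_score(weights, document):
    """Sum weight * count over the query terms found in the document."""
    counts = Counter(document.lower().split())
    return sum(w * counts[term] for term, w in weights.items())


def search(weights, documents):
    """Return the documents ordered from highest to lowest weighted score."""
    return sorted(
        documents,
        key=lambda d: weighted_score(weights, d),
        reverse=True,
    )


def reweight(weights, overrides):
    """Return a copy of the weights with the user's overrides applied."""
    updated = dict(weights)
    updated.update(overrides)
    return updated
```

Calling `search(reweight(weights, {"potter": 0.1}), documents)` re-runs the search with the overridden weight, which is the "refresh" step: the ranking changes as soon as the user changes a weight.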