Good afternoon. My name is Nathan Yergler, and I'm Chief Technology Officer at Creative Commons. This afternoon I'm going to talk about DiscoverEd, a semantically enhanced search engine for education that we've been working on. It's built on commodity hardware and open source tools, and the software can be used for other domains. I'm going to talk about some approaches we tried and rejected, and give you some information on tools you can use to build your own semantic search without investing in your own server farm.
Commodity Semantic Search: A Case Study of DiscoverEd
Nathan R. Yergler, Creative Commons. Semantic Technology Conference, 24 June 2010
Initial effort: Google CSE <ul><li>Google Custom Search Engine allows you to “create a search engine for a website or a collection of interesting websites.” </li><ul><li>Define resource patterns for inclusion </li>
<li>Optionally include annotations – facets and labels </li></ul><li>Python scripts to consume resource lists </li></ul>
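To make the "Python scripts to consume resource lists" step concrete, here is a minimal sketch of how a curated resource list could be turned into a Google CSE annotations file. It assumes the annotations XML layout of `Annotation` elements with an `about` URL pattern and nested `Label` elements. The function name `build_annotations`, the example URL patterns, and every label name (including `_cse_hypothetical`) are illustrative assumptions, not taken from the DiscoverEd prototype.

```python
import xml.etree.ElementTree as ET


def build_annotations(resources, cse_label):
    """Build a CSE-style annotations document.

    resources: iterable of (url_pattern, [facet_label, ...]) pairs.
    cse_label: label that ties each pattern to the search engine
               (hypothetical value in this sketch).
    """
    root = ET.Element("Annotations")
    for pattern, labels in resources:
        # One Annotation per URL pattern to include in the engine.
        annotation = ET.SubElement(root, "Annotation", about=pattern)
        ET.SubElement(annotation, "Label", name=cse_label)
        # Optional facet labels supplied by the curator.
        for label in labels:
            ET.SubElement(annotation, "Label", name=label)
    return ET.tostring(root, encoding="unicode")


# Hypothetical curator-supplied resource list.
resources = [
    ("ocw.example.edu/courses/*", ["subject_physics"]),
    ("oer.example.org/*", ["license_cc_by", "level_secondary"]),
]

xml_text = build_annotations(resources, "_cse_hypothetical")
print(xml_text)
```

Keeping the resource list as plain data means each curator can supply their own list and the same script regenerates the annotations file.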
Prototype Results <ul><li>Curator model allows for very directed crawl </li><ul><li>Low cost, not very resource intensive </li></ul><li>Scale </li><ul><li>Flexibly filter on predicate values </li></ul><li>Limitations </li><ul><li>Provenance for curator metadata </li>
<li>Predicate filters had to be “hand-crafted” </li></ul></ul>
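One way to picture what "hand-crafted" predicate filters mean in practice: each (predicate, value) pair that users can filter on needs a manually maintained mapping to a refinement label, which is then appended to the query with the CSE `more:` operator. The sketch below is an assumption about the general shape of such a mapping, not the prototype's actual code. The table `FILTER_LABELS`, the function `build_query`, and all predicate and label names are hypothetical.

```python
# Hand-maintained table: (predicate, value) -> refinement label.
# Every entry here is hypothetical and must be added by hand,
# which is the limitation the slide points at.
FILTER_LABELS = {
    ("dc:subject", "physics"): "subject_physics",
    ("cc:license", "http://creativecommons.org/licenses/by/3.0/"): "license_cc_by",
}


def build_query(terms, filters):
    """Append a `more:<label>` refinement for each requested filter."""
    parts = [terms]
    for predicate, value in filters:
        label = FILTER_LABELS.get((predicate, value))
        if label is None:
            # No one has written a filter for this pair yet.
            raise KeyError(f"no hand-crafted filter for {predicate}={value}")
        parts.append(f"more:{label}")
    return " ".join(parts)


query = build_query("thermodynamics", [("dc:subject", "physics")])
print(query)  # thermodynamics more:subject_physics
```

Any predicate value missing from the table cannot be filtered on, so the table grows with every new facet a curator introduces.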