Tolog optimization


A short talk on the tolog optimizations that Ontopia performs

Published in: Technology

  1. The tolog optimizations in Ontopia
     TMRA, 2010-09-30
     Lars Marius Garshol
  2. Query reordering
     Changes the order in which clauses are executed
     Can make an enormous difference to execution speed
     Uses simple cost estimation
       there are two estimators in Ontopia
     o:composed_by($OPERA : o:Work, $COMPOSER : o:Composer),
     o:based_on($OPERA : o:Result, $WORK : o:Source),
     o:written_by($WORK : o:Work, o:Shakespeare : o:Writer)
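
The effect of reordering on the query above can be illustrated with a minimal Python sketch. It assumes a toy cost model (a clause is cheaper the fewer unbound variables it has) and a greedy selection loop; neither is taken from Ontopia's two estimators, and the role types are left out for brevity. The names `unbound_count` and `reorder` are hypothetical.

```python
from typing import List, Set, Tuple

# A clause is (predicate name, arguments). Arguments starting with "$"
# are variables; everything else is treated as a constant.
Clause = Tuple[str, List[str]]


def unbound_count(clause: Clause, bound: Set[str]) -> int:
    """Toy cost (an assumption): number of variables not yet bound."""
    _, args = clause
    return sum(1 for a in args if a.startswith("$") and a not in bound)


def reorder(clauses: List[Clause]) -> List[Clause]:
    """Greedily run the clause with the fewest unbound variables next."""
    remaining = list(clauses)
    bound: Set[str] = set()
    ordered: List[Clause] = []
    while remaining:
        best = min(remaining, key=lambda c: unbound_count(c, bound))
        remaining.remove(best)
        ordered.append(best)
        # Every variable in the chosen clause is bound from here on.
        bound.update(a for a in best[1] if a.startswith("$"))
    return ordered


query = [
    ("o:composed_by", ["$OPERA", "$COMPOSER"]),
    ("o:based_on", ["$OPERA", "$WORK"]),
    ("o:written_by", ["$WORK", "o:Shakespeare"]),
]
ordered_names = [name for name, _ in reorder(query)]
```

Under this toy model the clause mentioning the constant o:Shakespeare runs first, so the later clauses start from a small set of bindings instead of from every opera.
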
  3. Hierarchy walker
     Recursive rules do a lot of unnecessary work
       when evaluated with the normal algorithm
       essentially, matches that were already discarded come back again
     Recursive rules usually consist of some wrapping and a recursion step
       the optimizer finds the recursion step,
       then runs a transitive closure of it by "pumping" matches through the recursive step
     Greatly improves the speed of hierarchical queries
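
The "pumping" idea can be sketched as a frontier-based transitive closure. This is an illustration only, assuming the recursion step is available as a plain parent-to-children mapping; the function name and the sample hierarchy are hypothetical and do not reflect Ontopia's internal data structures.

```python
from typing import Dict, Set


def transitive_closure(step: Dict[str, Set[str]], start: str) -> Set[str]:
    """Push the current frontier through the step until nothing new appears."""
    seen: Set[str] = set()
    frontier = {start}
    while frontier:
        next_frontier: Set[str] = set()
        for node in frontier:
            for target in step.get(node, ()):
                # Each match is expanded once, so discarded or already
                # visited matches never re-enter the computation.
                if target not in seen:
                    seen.add(target)
                    next_frontier.add(target)
        frontier = next_frontier
    return seen


# Hypothetical hierarchy: parent -> direct children.
children = {
    "art": {"music", "painting"},
    "music": {"opera"},
    "opera": set(),
}
descendants = transitive_closure(children, "art")
```

Because every node enters the frontier at most once, the work is proportional to the size of the hierarchy actually reached.
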
  4. Rule inlining
     Given a query like
       composed-by($C, $O) :- composed-by($C : composer, $O : work).
       select $OPERA from composed-by(puccini, $OPERA)?
     the optimizer inlines the rule
       this gets rid of the overhead involved in calling the rule
     Could be extended to handle bigger rules, but this is tricky
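
As a rough illustration of inlining, the sketch below substitutes the call's arguments into the body of a single-clause rule. The representation (arguments as `(value, role type)` pairs, rules as `(parameters, body)` tuples) and the name `inline_rules` are assumptions made for this example. It only handles rules with one body clause whose variables are all parameters, which sidesteps the variable-renaming problems that make bigger rules tricky.

```python
from typing import Dict, List, Optional, Tuple

# An argument is (value, role type); the role type is None when absent.
Arg = Tuple[str, Optional[str]]
Clause = Tuple[str, List[Arg]]
# A rule is (formal parameters, body clauses), keyed by rule name.
Rule = Tuple[List[str], List[Clause]]


def inline_rules(query: List[Clause], rules: Dict[str, Rule]) -> List[Clause]:
    """Replace calls to single-clause rules with the rule body (one pass)."""
    result: List[Clause] = []
    for name, args in query:
        rule = rules.get(name)
        if rule is None or len(rule[1]) != 1 or len(rule[0]) != len(args):
            result.append((name, args))  # not inlinable: keep the call
            continue
        params, body = rule
        # Map each formal parameter to the value passed in the call.
        mapping = {p: value for p, (value, _) in zip(params, args)}
        body_name, body_args = body[0]
        result.append(
            (body_name, [(mapping.get(v, v), role) for v, role in body_args])
        )
    return result


rules = {
    "composed-by": (
        ["$C", "$O"],
        [("composed-by", [("$C", "composer"), ("$O", "work")])],
    )
}
query = [("composed-by", [("puccini", None), ("$OPERA", None)])]
inlined = inline_rules(query, rules)
```

After inlining, the query refers directly to the association predicate with its role types, so no rule call remains.
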
  5. Duplicate remover
     This optimizer inserts a "fake" predicate which removes duplicate temporary results
     It analyzes queries to see which ones would benefit from the removal
       At the moment it only does this with recursive rules
       The effect there can be enormous
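
What such a "fake" predicate does can be sketched as a pass-through filter over intermediate result rows. This is an illustration under the assumption that a row is a mapping from variable names to values; the name `remove_duplicates` is hypothetical.

```python
from typing import Dict, Iterable, Iterator, Set, Tuple

Row = Dict[str, str]


def remove_duplicates(rows: Iterable[Row]) -> Iterator[Row]:
    """Pass each distinct row on once; silently drop repeats."""
    seen: Set[Tuple[Tuple[str, str], ...]] = set()
    for row in rows:
        key = tuple(sorted(row.items()))  # order-independent row identity
        if key not in seen:
            seen.add(key)
            yield row


# Hypothetical intermediate results in which one binding appears twice.
intermediate = [
    {"$A": "opera", "$B": "music"},
    {"$A": "opera", "$B": "art"},
    {"$A": "opera", "$B": "music"},
]
distinct = list(remove_duplicates(intermediate))
```

In a recursive rule every duplicate row would otherwise be expanded again at each recursion level, which is why removing them early can pay off so strongly there.
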
  6. String prefix searches
     Consider the query
       ph:time-taken($PHOTO, $DATE),
       $DATE < %time%
       order by $DATE desc limit 1?
     Evaluated naively, the engine has to
       find all photos,
       throw away the ones that are too late in time,
       sort the remainder,
       then keep the first
     The optimizer rewrites this into an index lookup using an ordered index on occurrence values
       it then pushes all topics with the correct value, in increasing order, through the query predicate until it finds enough matches to satisfy the "limit"
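
The difference between the naive plan and the index lookup can be sketched with a sorted list and Python's `bisect` module. The parallel lists standing in for the ordered index, the `matches` callback standing in for the query predicate, and the name `latest_before` are all assumptions. The sketch walks the index downward from the bound, which is the direction the example query's "order by $DATE desc" calls for.

```python
import bisect
from typing import Callable, List, Tuple


def latest_before(
    values: List[str],
    topics: List[str],
    bound: str,
    limit: int,
    matches: Callable[[str], bool],
) -> List[Tuple[str, str]]:
    """Return up to `limit` (topic, value) pairs with value < bound, latest first.

    `values` must be sorted ascending and `topics[i]` owns `values[i]`.
    """
    if limit <= 0:
        return []
    # Index of the first entry with value >= bound; all earlier ones qualify.
    pos = bisect.bisect_left(values, bound)
    results: List[Tuple[str, str]] = []
    for i in range(pos - 1, -1, -1):
        if matches(topics[i]):  # stand-in for the query predicate
            results.append((topics[i], values[i]))
            if len(results) == limit:
                break  # stop as soon as the limit is satisfied
    return results


# Hypothetical index on occurrence values (ISO dates sort as strings).
values = ["2009-05-01", "2010-01-15", "2010-09-30", "2011-02-02"]
topics = ["photo1", "photo2", "photo3", "photo4"]
newest = latest_before(values, topics, "2010-10-01", 1, lambda t: True)
```

Only the entries actually needed are touched: one binary search plus as many index steps as it takes to fill the limit, with no full scan and no sort.
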
  7. Faster role access
     Given a query that contains
       ...
       role-player($R1, fixed-point),
       ...
       type($R1, roletype),
       ...
     the optimizer rewrites this into
       ... role-player($R1, fixed-point <, roletype>)
     so that the engine can use a faster method in the API, saving lots of temporary results
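
The rewrite can be sketched as a pass over the clause list that folds a type clause into the matching role-player clause. The clause representation and the name `merge_role_type` are assumptions, and the three-argument role-player form mirrors the notation on the slide, not necessarily syntax that tolog accepts from users.

```python
from typing import Dict, List, Tuple

Clause = Tuple[str, List[str]]


def merge_role_type(clauses: List[Clause]) -> List[Clause]:
    """Fold type($R, T) into role-player($R, X) as role-player($R, X, T)."""
    # Role variables that occur in a two-argument role-player clause.
    role_vars = {
        args[0] for name, args in clauses
        if name == "role-player" and len(args) == 2
    }
    # The first type constraint found for each of those variables.
    role_types: Dict[str, str] = {}
    for name, args in clauses:
        if name == "type" and len(args) == 2 and args[0] in role_vars:
            role_types.setdefault(args[0], args[1])

    merged: List[Clause] = []
    for name, args in clauses:
        if name == "role-player" and len(args) == 2 and args[0] in role_types:
            merged.append((name, args + [role_types[args[0]]]))
        elif (name == "type" and len(args) == 2
              and role_types.get(args[0]) == args[1]):
            continue  # absorbed into the role-player clause
        else:
            merged.append((name, args))
    return merged


query = [
    ("role-player", ["$R1", "fixed-point"]),
    ("association-role", ["$A", "$R1"]),
    ("type", ["$R1", "roletype"]),
]
rewritten = merge_role_type(query)
```

With the role type attached, the engine can ask for only the roles of that type played by the fixed topic, instead of producing all of its roles and filtering them afterwards.
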