On Applying Or-Parallelism and Tabling to Logic Programs


Lecture about Or-Parallelism and Tabling applied to Logic Programming


  1. On Applying Or-Parallelism and Tabling to Logic Programs. Possamai Lino, Department of Computer Science, University of Venice. Logic Programming Lecture, February 2, 2007
  2. Introduction <ul><li>After the introduction of the Prolog programming language, many researchers turned their attention to the possibility of converting sequential Prolog interpreters into parallel ones in order to speed up applications. </li></ul><ul><li>This is an interesting area because Prolog allows implicit parallelism, so no code annotations are necessary. </li></ul><ul><li>Another key point of the article is that SLD derivations are not suitable for non-terminating programs (infinite failure), but with another derivation strategy we can ensure the termination of a logic program (under certain circumstances). </li></ul><ul><li>Earlier Prolog engines did not remember the computed answer substitutions (c.a.s.) already calculated for a predicate when the same predicate was called again, which wastes computing resources. Tabling is the answer to this problem. </li></ul>
  3. Parallelism within Prolog <ul><li>while (Query ≠ ∅) </li></ul><ul><ul><li>begin </li></ul></ul><ul><ul><li>select literal B from Query; </li></ul></ul><ul><ul><li>repeat </li></ul></ul><ul><ul><li>select clause (H :- Body) from Program; </li></ul></ul><ul><ul><li>until (unify(H,B) or (no clauses left)); </li></ul></ul><ul><ul><li>if (no clauses left) then </li></ul></ul><ul><ul><li>FAIL </li></ul></ul><ul><ul><li>else </li></ul></ul><ul><ul><li> begin </li></ul></ul><ul><ul><li> θ = mostgeneralunifier(H,B); </li></ul></ul><ul><ul><li> Query = ((Query − {B}) ∪ {Body})θ </li></ul></ul><ul><ul><li> end </li></ul></ul><ul><li>end. </li></ul>
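The resolution loop above can be sketched in Python. This is only an illustration under stated assumptions, not how Yap or any WAM-based engine works: terms are tuples, variables are strings starting with an uppercase letter, and the helper names (`unify`, `rename`, `solve`, `resolve`) are invented for this sketch. The `for` loop over clauses is where or-parallelism could be exploited, and the choice of `query[0]` is the selection rule where and-parallelism arises. The program and query are the ones from slide 6.

```python
import itertools

_fresh = itertools.count()  # source of suffixes for renaming clause variables

def is_var(t):
    # Assumption of this sketch: variables are strings starting with an uppercase letter.
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    # Follow bindings in substitution s until an unbound variable or a non-variable.
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(a, b, s):
    # Return a substitution extending s that unifies a and b, or None (no occurs check).
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def rename(t, n):
    # Give clause variables fresh names so different clause uses do not clash.
    if is_var(t):
        return f"{t}_{n}"
    if isinstance(t, tuple):
        return tuple(rename(x, n) for x in t)
    return t

def solve(query, program, s=None):
    # Yield one substitution per successful derivation, in Prolog's depth-first order.
    s = {} if s is None else s
    if not query:
        yield s
        return
    goal, rest = query[0], query[1:]       # selection rule: leftmost literal
    for head, body in program:             # each unifying clause is an or-alternative
        n = next(_fresh)
        s2 = unify(goal, rename(head, n), s)
        if s2 is not None:
            yield from solve([rename(b, n) for b in body] + rest, program, s2)

def resolve(t, s):
    # Apply substitution s to term t.
    t = walk(t, s)
    return tuple(resolve(x, s) for x in t) if isinstance(t, tuple) else t

program = [
    (("p", "L"), [("s", "L", "M"), ("t", "M", "L")]),
    (("p", "K"), [("r", "K")]),
    (("q", 1), []), (("q", 2), []),
    (("s", 2, 3), []), (("s", 4, 5), []),
    (("t", 3, 3), []), (("t", 3, 2), []),
    (("r", 1), []), (("r", 3), []),
]
query = [("t", "X", 3), ("p", "Y"), ("q", "Y")]
answers = [(resolve("X", s), resolve("Y", s)) for s in solve(query, program)]
```

Here `answers` holds the (X, Y) pairs for the two successful branches, (3, 2) and (3, 1). A sequential engine finds them one after the other by backtracking; an or-parallel engine would hand the alternatives of the `for` loop to different workers.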
  4. Parallelism within Prolog <ul><li>The non-determinism in the resolution procedure gives rise to three types of parallelism: </li></ul><ul><ul><li>And-Parallelism </li></ul></ul><ul><ul><ul><li>Arises from the non-determinism of the selection rule when more than one subgoal is present in the query or in the body of a clause. </li></ul></ul></ul><ul><ul><li>Or-Parallelism </li></ul></ul><ul><ul><ul><li>Arises from the non-determinism of clause selection when more than one clause is applicable to the same subgoal. </li></ul></ul></ul><ul><ul><li>Unification Parallelism </li></ul></ul><ul><ul><ul><li>Arises while unifying the arguments of a subgoal with those of a clause head for the predicate with the same name and arity. </li></ul></ul></ul><ul><li>Many implementations have been proposed for each type, but only or-parallelism combines good performance with a simple implementation, so it is the choice of the article's authors. </li></ul><ul><li>The main advantages of or-parallelism over and-parallelism are: </li></ul><ul><ul><li>It does not limit the power of Prolog. </li></ul></ul><ul><ul><li>It is easy to extend a sequential Prolog engine to support it. </li></ul></ul><ul><ul><li>It can be exploited without any extra annotation. </li></ul></ul>
  5. Or-Parallelism (YapOr)
  6. An Or-Parallelism Example
     Program:
       p(L) :- s(L,M), t(M,L).
       p(K) :- r(K).
       q(1). q(2).
       s(2,3). s(4,5).
       t(3,3). t(3,2).
       r(1). r(3).
     Query: ?- t(X,3), p(Y), q(Y).
     Search tree (in the figure, the nodes are labelled with the workers W1 to W4 that explore them):
       t(X,3), p(Y), q(Y)
         {X/3} p(Y), q(Y)
           {Y/L} s(L,M), t(M,L), q(L)
             {L/2, M/3} t(3,2), q(2)
               q(2): Success
             {L/4, M/5} t(5,4), q(4): Fail
           {Y/K} r(K), q(K)
             {K/1} q(1): Success
             {K/3} q(3): Fail
  7. Or-Parallelism drawbacks <ul><li>Although at first sight or-parallelism may seem easy to manage, it raises some problems related to: </li></ul><ul><ul><li>Multiple binding representation </li></ul></ul><ul><ul><ul><li>The concurrent execution of alternative branches of the search tree can result in several conflicting bindings for shared variables. </li></ul></ul></ul><ul><ul><ul><li>Consider, for example, the variable Y in the previous example, which is bound differently in different branches. Such a variable is called a conditional variable. </li></ul></ul></ul><ul><ul><ul><li>There is no such problem for the variable X, which is bound before the branches diverge (an unconditional variable). </li></ul></ul></ul><ul><ul><li>Work scheduling </li></ul></ul><ul><ul><ul><li>A scheduler is needed to minimize the overhead of assigning work to workers. This is a complex problem because of the dynamic nature of work in or-parallel systems. </li></ul></ul></ul><ul><li>Before addressing these problems, it is necessary to present the underlying layer found in each Prolog engine: the Warren Abstract Machine. </li></ul>
  8. WAM, Warren Abstract Machine <ul><li>The WAM is an abstract machine, organized around a set of stack-based memory areas, that is used to evaluate Prolog programs. It underlies most Prolog engines. </li></ul><ul><li>Its memory areas are: </li></ul><ul><ul><li>PDL: area used by the unification process. </li></ul></ul><ul><ul><li>Trail: array of addresses pointing to variables, stored in the heap or the local stack, that must be reset upon backtracking. </li></ul></ul><ul><ul><li>Stack: contains environments and choice points. </li></ul></ul><ul><ul><li>Heap: stores variables that cannot be stored in the stack. </li></ul></ul><ul><ul><li>Code area: contains WAM instructions and compiled Prolog code. </li></ul></ul>
  9. How is Or-Parallelism implemented? <ul><li>Different models have been proposed for exploiting or-parallelism, but two of them have proven the most successful. </li></ul><ul><li>These models mainly differ in the mechanism used to represent environments, and therefore in how they solve the multiple binding problem presented above. </li></ul><ul><ul><li>Environment copying. </li></ul></ul><ul><ul><ul><li>It is the most efficient way to cope with the multiple binding problem. This technique can be regarded as the lazy one. It is used in other Prolog systems, for example SICStus and Muse. </li></ul></ul></ul><ul><ul><li>Binding arrays. </li></ul></ul><ul><ul><ul><li>Consists of recording conditional bindings in a global data structure and attaching to each binding a unique identifier of the branch to which the binding belongs. </li></ul></ul></ul>
  10. Environment copying <ul><li>In this model, each worker maintains its own copy of the environment, in which it can write without causing binding conflicts. </li></ul><ul><li>When a variable is bound, the binding is stored in the private environment of the worker doing the binding. </li></ul><ul><li>When an idle worker picks up work from another worker, it copies all the stacks from the sharing worker. As a consequence, each worker can carry out execution exactly like a sequential system, requiring very little synchronization. </li></ul><ul><li>Copying stacks is made efficient by the incremental copying technique. </li></ul><ul><li>The idea of incremental copying is based on the fact that the idle worker may already have traversed a part of the search tree that is common to the sharing worker, and thus it does not need to copy this part of the stacks. </li></ul> [Figure: search tree with nodes N1 to N5 and labels P and Q; legend: node without alternatives, node with alternatives (Alt 1, Alt 2, Alt 3).]
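The incremental copying idea described above can be sketched as follows. This is a simplified model, not the YapOr implementation: a worker's stacks are reduced to a single list of frames, each frame a dictionary with a hypothetical `id` and `bindings` field, and the function names are invented here. The sketch also ignores a step that real systems need, namely installing bindings that the sharing worker made to variables in the common part after the two branches diverged.

```python
import copy

def common_prefix_len(stack_a, stack_b):
    # Number of leading frames that both workers have already traversed.
    n = 0
    for fa, fb in zip(stack_a, stack_b):
        if fa["id"] != fb["id"]:
            break
        n += 1
    return n

def full_copy(sharer_stack):
    # Naive environment copying: the idle worker copies every frame.
    return copy.deepcopy(sharer_stack), len(sharer_stack)

def incremental_copy(sharer_stack, idle_stack):
    # The idle worker backtracks to the youngest common node, keeps its own
    # frames up to there, and copies only the frames that differ.
    keep = common_prefix_len(sharer_stack, idle_stack)
    new_stack = idle_stack[:keep] + copy.deepcopy(sharer_stack[keep:])
    return new_stack, len(sharer_stack) - keep

def frame(node_id, **bindings):
    return {"id": node_id, "bindings": dict(bindings)}

# Sharing worker P has explored N1-N2-N3-N4; idle worker Q has explored N1-N2-N5.
p_stack = [frame("N1", X=3), frame("N2"), frame("N3", L=2), frame("N4", M=3)]
q_stack = [frame("N1", X=3), frame("N2"), frame("N5", K=1)]

q_full, copied_full = full_copy(p_stack)
q_incr, copied_incr = incremental_copy(p_stack, q_stack)

# Q now binds a variable in its private copy; P's environment is unaffected.
q_incr[3]["bindings"]["M"] = 99
```

With these stacks, full copying transfers four frames while incremental copying transfers only the two frames below the common node N2. Because the copied frames are private to Q, the final assignment does not change `p_stack`.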
  11. Tabling (YapTab)
  12. Tabling (YapTab) <ul><li>SLD derivations are not suitable for non-terminating programs (infinite loops), but with another derivation strategy we can ensure the termination of a logic program (under certain circumstances). </li></ul><ul><li>Repeated calls to the same predicate within an SLD derivation result in redundant evaluations. </li></ul><ul><li>Tabling, or memoing, is a technique that consists of remembering subcomputations and reusing their results to answer later requests. </li></ul><ul><li>Although various procedural semantics have been proposed for tabling, one has been gaining in popularity: SLG resolution. </li></ul>
  13. Tabling (YapTab) <ul><li>Tabling is implemented using the trie data structure, which allows lookup and, if needed, insertion to be performed in a single pass through a term. </li></ul><ul><li>The table space can be accessed in different ways: </li></ul><ul><ul><li>To look up whether a subgoal is in the table and, if not, insert it; </li></ul></ul><ul><ul><li>To verify whether a newly found answer is already in the table and, if not, insert it; </li></ul></ul><ul><ul><li>To pick up answers for consumer nodes; </li></ul></ul><ul><ul><li>To mark subgoals as completed. </li></ul></ul><ul><li>Correct design of the algorithms that access and manipulate the table data is critical to obtaining an efficient tabling implementation. </li></ul><ul><li>Scheduling is another issue to cope with: at several points the engine has to choose between continuing execution, backtracking, returning answers to consumer nodes, or performing completion. </li></ul>
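The single-pass lookup-or-insert on a trie can be sketched as below. This is a minimal illustration under assumptions, not YapTab's table space: terms are Python tuples of the form `(functor, arg1, ...)`, each term is flattened into a preorder token sequence, and `TrieNode`, `tokens` and `check_insert` are names invented for this sketch.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # token -> TrieNode
        self.is_end = False   # True if a complete term ends at this node

def tokens(term):
    # Flatten a term in preorder: a (functor, arity) token, then the arguments.
    if isinstance(term, tuple):
        yield (term[0], len(term) - 1)
        for arg in term[1:]:
            yield from tokens(arg)
    else:
        yield term

def check_insert(root, term):
    # Walk the term once, creating missing nodes on the way.
    # Returns True if the term was new, False if it was already in the trie.
    node = root
    for tok in tokens(term):
        node = node.children.setdefault(tok, TrieNode())
    is_new = not node.is_end
    node.is_end = True
    return is_new

root = TrieNode()
first = check_insert(root, ("path", "a", "b"))    # new answer
again = check_insert(root, ("path", "a", "b"))    # duplicate, detected in the same pass
other = check_insert(root, ("path", "a", "c"))    # shares the prefix path/2, a
```

Terms with a common prefix share nodes: after the three calls the root has a single child for `path/2`, and the node reached through `a` has two children, `b` and `c`.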
  14. SLG resolution <ul><li>To support tabling, SLD resolution must be extended. </li></ul><ul><li>An SLG system is a forest of SLG trees along with an associated table. </li></ul><ul><li>Root nodes of SLG trees are subgoals of tabled predicates. </li></ul><ul><li>An SLG evaluation Θ of a goal G, given a tabled program P, is a sequence of systems S_0, S_1, ..., S_n such that: </li></ul><ul><ul><li>S_0 is the forest consisting of a single SLG tree rooted by G, together with the table {(G, ∅, incomplete)}; </li></ul></ul><ul><ul><li>For each k, S_(k+1) is obtained from S_k by applying one of the following operations: </li></ul></ul><ul><ul><ul><li>New Tabled Subgoal Call, </li></ul></ul></ul><ul><ul><ul><li>Program Clause Resolution, </li></ul></ul></ul><ul><ul><ul><li>Answer Resolution, </li></ul></ul></ul><ul><ul><ul><li>New Answer, </li></ul></ul></ul><ul><ul><ul><li>Completion. </li></ul></ul></ul><ul><li>The Warren Abstract Machine introduced above must be extended to cope with these new operations. </li></ul>
  15. Tabling, an example
      Program:
        :- table path/2.
        path(X,Z) :- path(X,Y), path(Y,Z).
        path(X,Z) :- arc(X,Z).
        arc(a,b). arc(b,c).
      Query: ?- path(a,Z).
      [Figure: SLG forest for the query, with trees rooted at the subgoals path(a,Z), path(b,Z) and path(c,Z); legend: generator node, consumer node, interior node.]
      Table (subgoal: answers):
        path(a,Z): {Z/b}, {Z/c}
        path(b,Z): {Z/c}
        path(c,Z): no answers
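The effect of tabling on this program can be imitated with a small fixpoint computation. This is deliberately not SLG resolution and not how YapTab schedules generators and consumers; it only shows the table being filled until no new answers appear, which is why the left-recursive `path/2` terminates. The function name `solve_path` and the representation (a dictionary from the first argument of `path` to a set of answers) are assumptions of this sketch.

```python
def solve_path(start, arcs):
    # table[x] holds the answers found so far for the subgoal path(x, Z).
    table = {start: set()}
    changed = True
    while changed:                       # stop when the table is complete
        changed = False
        for x in list(table):
            # Clause: path(X,Z) :- arc(X,Z).
            new = {z for (u, z) in arcs if u == x}
            # Clause: path(X,Z) :- path(X,Y), path(Y,Z).
            for y in list(table[x]):     # consume answers already in the table
                if y not in table:       # new tabled subgoal call path(y, Z)
                    table[y] = set()
                    changed = True
                new |= table[y]
            if not new <= table[x]:      # only genuinely new answers are added
                table[x] |= new
                changed = True
    return table

table = solve_path("a", {("a", "b"), ("b", "c")})
cyclic = solve_path("a", {("a", "b"), ("b", "a")})
```

For the arcs of the slide, `table` ends up with the answers b and c for path(a,Z), c for path(b,Z), and none for path(c,Z). The second call adds a cycle between a and b and still terminates, because an answer that is already in the table is never added again.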
  16. Or-Parallelism and Tabling (OPTYap) <ul><li>Or-Parallelism within Tabling (OPT) is the model introduced by the authors that allows concurrent execution of all available alternatives. </li></ul><ul><li>This model consists of a set of independent tabling engines that may share different common branches of the search tree during execution. Each worker can therefore be considered a sequential tabling engine that fully implements the tabling operations mentioned above. </li></ul><ul><li>The or-parallel component is triggered to allow synchronized access to the shared parts of the execution tree, so that a worker can get new work when it runs out of alternatives to exploit. </li></ul><ul><li>In the OPT model a worker can keep its nodes private until it reaches a sharing point. This is a key issue in reducing parallel overhead. The authors remark that it is common in or-parallel work to say that work is initially private and is made public after sharing. </li></ul><ul><li>Worker suspension is a characteristic common to or-parallelism and tabling. In the former, because of side effects, a worker must wait until another worker finishes its job. In the latter, suspension arises when a node calls a tabled predicate but the table does not yet contain an answer for it. </li></ul><ul><li>Workers are allowed to read and update the predicate table concurrently. To do so, workers need to be able to lock the table. Finer locking allows more parallelism, while coarser locking has less overhead. </li></ul>
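The trade-off between coarse and fine locking of the table can be sketched with Python threads. This is a hypothetical design for illustration only, not the locking scheme of OPTYap: the class `SharedTable` and its methods are invented here, one coarse lock guards the creation of subgoal entries, and each entry has its own finer lock so that workers inserting answers for different subgoals do not block each other.

```python
import threading

class SharedTable:
    def __init__(self):
        self._table_lock = threading.Lock()  # coarse lock: guards subgoal creation
        self._entries = {}                   # subgoal -> (entry lock, set of answers)

    def _entry(self, subgoal):
        # Look up the subgoal and insert it if missing, under the coarse lock.
        with self._table_lock:
            if subgoal not in self._entries:
                self._entries[subgoal] = (threading.Lock(), set())
            return self._entries[subgoal]

    def add_answer(self, subgoal, answer):
        # Returns True only for the worker that actually inserted the answer.
        lock, answers = self._entry(subgoal)
        with lock:                           # fine lock: only this subgoal's answers
            if answer in answers:
                return False
            answers.add(answer)
            return True

    def answers(self, subgoal):
        lock, answers = self._entry(subgoal)
        with lock:
            return set(answers)              # snapshot copy for a consumer

def worker(table, inserted_counts):
    # Every worker tries to insert the same 100 answers for the subgoal a(X).
    inserted_counts.append(sum(table.add_answer("a(X)", n) for n in range(100)))

table = SharedTable()
inserted_counts = []
threads = [threading.Thread(target=worker, args=(table, inserted_counts)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whatever the interleaving, each answer is inserted exactly once, so the per-worker counts in `inserted_counts` add up to 100. Replacing the per-entry locks with the single table lock would be simpler and cheaper per operation, but would serialize workers that touch unrelated subgoals.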
  17. An OPT Example
      Program:
        :- table a/1.
        a(X) :- a(X).
        a(X) :- b(X).
        b(1).
        b(X) :- …
        b(X) :- …
      Query: ?- a(X).
      <ul><li>In this example, W1 starts evaluating the goal a(X) as a sequential tabling engine would. </li></ul><ul><li>It uses the first clause, but there is no answer for the predicate yet, so it backtracks. At this point it uses the second clause and computes the first answer, which is inserted into the table. W1 continues by evaluating the second alternative. </li></ul><ul><li>Now suppose that W2 asks for work and W1 decides to share all its nodes, so the last b(X) alternative is assigned to W2. </li></ul>
      [Figure: search tree for ?- a(X) with the a(X) and b(X) nodes shared between W1 and W2, and the table entry for the subgoal a(X) holding the answer {X/1}; legend: shared nodes, answer, subgoal.]
  18. Performance, non-tabled programs <ul><li>We can now analyze the execution times of the engines considered above (YapOr, YapTab, OPTYap) with respect to Yap, the sequential Prolog engine, and with respect to the XSB system, the well-known tabling engine. </li></ul><ul><li>First, consider the performance and speedup of non-tabled programs. In this way, the authors measured the overhead introduced by each model when running programs that do not take advantage of its particular extension. </li></ul><ul><li>In the XSB case, execution times are worse than Yap's due to the overhead of managing tabling and or-parallelism (10%, 5%, 17%). </li></ul><ul><li>YapOr and OPTYap achieve good speedups. </li></ul>
  19. Performance, tabled programs <ul><li>For a complete performance analysis, the authors also considered tabled programs, in order to assess the YapTab engine. </li></ul><ul><li>Linear speedups are obtained in five of the seven programs considered (which represent typical programs), but in two cases the sequential version is faster than the parallel one. This behaviour is not caused by workers becoming idle and starting to search for work, as usually happens with parallel execution of non-tabled programs. Instead, there is a lot of contention to access that work. </li></ul>
  20. Index / References <ul><li>Introduction </li></ul><ul><li>Parallelism within Prolog </li></ul><ul><ul><li>Or-Parallelism </li></ul></ul><ul><ul><ul><li>Example of Or-Parallelism </li></ul></ul></ul><ul><ul><ul><li>Drawbacks </li></ul></ul></ul><ul><ul><li>WAM </li></ul></ul><ul><ul><li>Models of parallelism </li></ul></ul><ul><ul><li>Environment copying </li></ul></ul><ul><li>Tabling </li></ul><ul><ul><li>Example of Tabling </li></ul></ul><ul><ul><li>SLG resolution </li></ul></ul><ul><li>Or-Parallelism within Tabling </li></ul><ul><ul><li>Example of the OPT model </li></ul></ul><ul><li>Experimental Results </li></ul><ul><ul><li>Non-tabled programs </li></ul></ul><ul><ul><li>Tabled programs </li></ul></ul><ul><li>Reference: Ricardo Rocha, Fernando Silva and Vitor Santos Costa. On Applying Or-Parallelism and Tabling to Logic Programs. Theory and Practice of Logic Programming, Volume 5, Issue 1-2, January 2005, pp. 161-205. </li></ul>