Distributed Multi-Threading
       in GNU Prolog
      Nuno Morgadinho, Salvador Abreu
          {nm, spa}@di.uevora.pt




             Departamento de Informática


                     Évora 2007
Introduction



          •   Distributed - one or more computers
              communicating over a network

          •   Multi-Threading - more than one thread of
              execution

          •   Thread - a flow of execution



Slide 2                                                   ICLP ’07 CICLOPS
Introduction



          •   Obtain results faster for problems where
              performance is critical

          •   The problem can be divided into smaller tasks
              which can be carried out simultaneously

          •   Parallelism



What We Present




          •   PM2 - a distributed multi-threading
              programming environment

          •   GNU Prolog - efficient native Prolog compiler

          •   Combination of both




GNU Prolog and PM2: Why?




          •   GNU Prolog produces stand-alone executables

          •   The size of the executables is relatively small

          •   Does not use machine-independent saved
              states




Implementation



          •   Established a model for connecting PM2 and
              GNU Prolog

          •   The approach requires no modifications to
              either GNU Prolog or PM2

          •   Compatibility with GNU Prolog libraries is
              retained



Approach




          •   Tabard - new program that manages distributed
              instances of GNU Prolog engines

          •   pm2prolog - new library that allows the
              development of distributed multithreaded
              Prolog applications




Functionality


          •   Execute computations remotely

          •   Manage the engines responsible for the
              computations

          •   Manage the communication between the
              machines

          •   Based on distributed memory and explicit
              message-passing


Example
          :- initialization(init).
          :- include('pm2prolog').

          % Master clause: runs on the VP with rank 0.
          init :-
            pm2_is_master, !,
            pm2_max_rank(MaxRank),
            start_prolog_workers(MaxRank),
            test_prolog_workers(MaxRank),
            read_test(MaxRank),
            stop_prolog_workers.

          % Worker clause: runs on every VP with rank other than 0.
          init :-
            worker_code.

Before Running




           •   Configuration that specifies the list of machines

           •   Each machine is mapped to one or more
               processing nodes or virtual processors (VPs)




Execution Model




           •   Binary is copied to all machines

           •   In VP0 (master) a gprolog engine is created

           •   In the other VPs (workers) a C pthread is
               created and waits for messages




Execution Model



           •   In the master, now in the Prolog thread, a
               predicate is called to send a message to every
               worker

           •   The workers receive the message and start a
               gprolog engine; the thread then waits for
               further messages



Execution Model




           •   In the master, work is distributed among
               the workers by message passing

           •   The workers receive tasks which they execute
               locally. As soon as they finish, they send their
               results back to the master




Execution Model


           •   The master gathers the results by reading as
               many messages as it previously sent

           •   The master distributes more work or orders
               the workers to finish their execution

           •   The workers terminate

           •   The master restarts the workers or
               terminates itself
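
The cycle described on the Execution Model slides can be sketched in a single process. The following is an illustration only, written in Python so that it runs without PM2: Python threads and `queue.Queue` objects stand in for VPs and PM2 messages, and every name in it (`worker`, `run_master`, the `("task", ...)` and `("stop",)` message shapes, the sum-of-squares task) is a hypothetical choice, not part of the pm2prolog API.

```python
import queue
import threading


def worker(rank, inbox, results):
    """Stand-in for a worker VP: wait for messages, run tasks, reply."""
    while True:
        message = inbox.get()          # block until the master sends something
        if message[0] == "stop":       # master ordered this worker to finish
            break
        _, task_id, n = message
        # Hypothetical task: sum of squares below n, executed locally.
        results.put((rank, task_id, sum(i * i for i in range(n))))


def run_master(num_workers, tasks):
    """Stand-in for VP0: start workers, distribute tasks, gather results."""
    results = queue.Queue()
    inboxes = [queue.Queue() for _ in range(num_workers)]
    threads = [
        threading.Thread(target=worker, args=(rank + 1, inboxes[rank], results))
        for rank in range(num_workers)
    ]
    for t in threads:
        t.start()

    # Distribute work round-robin (an assumption; the slides do not say how).
    for task_id, n in enumerate(tasks):
        inboxes[task_id % num_workers].put(("task", task_id, n))

    # Read exactly as many result messages as task messages were sent.
    gathered = {}
    for _ in tasks:
        _rank, task_id, value = results.get()
        gathered[task_id] = value

    # Order the workers to terminate, then wait for them.
    for inbox in inboxes:
        inbox.put(("stop",))
    for t in threads:
        t.join()
    return [gathered[i] for i in range(len(tasks))]


if __name__ == "__main__":
    print(run_master(3, [10, 100, 1000]))
```

Counting replies against the number of messages sent lets the master know when a round is complete without any shared state, which matches the distributed-memory, explicit message-passing model of the system.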

Inside a Virtual Processor
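           [Figure: two threads inside a VP share a message queue
           protected by a lock (LOCK / UNLOCK).]

           •   C Thread Listener - RECEIVEs from the socket and
               WRITEs to the message queue:
               write_message_queue(), Start_Prolog(),
               Stop_Prolog(), ...

           •   Prolog Thread - READs from and WRITEs to the
               message queue and SENDs to the socket:
               thread_send_message/2, thread_get_message/2,
               pm2_self/1, ...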
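
The arrangement inside a VP (a listener that receives from a socket and writes to a lock-protected queue, and a second thread that reads from it) can be sketched as follows. This is a Python illustration under stated assumptions, not the Tabard implementation: `MessageQueue`, `listener`, and `demo` are hypothetical names, `socket.socketpair()` stands in for the network link, and one message per line is an assumed framing.

```python
import socket
import threading
from collections import deque


class MessageQueue:
    """Queue shared by the listener and the consumer thread, guarded by a lock."""

    def __init__(self):
        self._items = deque()
        self._ready = threading.Condition()  # owns the lock (LOCK / UNLOCK)

    def write(self, message):
        with self._ready:                    # LOCK ... UNLOCK
            self._items.append(message)
            self._ready.notify()

    def read(self):
        with self._ready:
            while not self._items:           # sleep until a message arrives
                self._ready.wait()
            return self._items.popleft()


def listener(conn, mq):
    """Stand-in for the C listener: RECEIVE from the socket, WRITE to the queue."""
    with conn, conn.makefile("r", encoding="utf-8") as lines:
        for line in lines:                   # one message per line (assumed framing)
            mq.write(line.rstrip("\n"))
    mq.write(None)                           # sentinel: the peer closed the link


def demo():
    remote, local = socket.socketpair()      # stands in for the network link
    mq = MessageQueue()
    t = threading.Thread(target=listener, args=(local, mq))
    t.start()

    with remote:
        remote.sendall(b"task(1)\ntask(2)\n")  # a remote VP SENDs two messages

    received = []
    while (message := mq.read()) is not None:  # the consumer thread READs
        received.append(message)
    t.join()
    return received
```

Calling `demo()` returns the two messages in the order they were sent; the condition variable keeps the reader asleep while the queue is empty instead of polling.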


           •   Several VPs per physical machine are possible

           [Figure: VPs (VP 0, VP 1, VP 2, ...) placed on physical
           nodes (Node 1, Node 2, Node 3, ...); each VP runs its
           own Prolog process.]




ISO Support




           •   ISO/IEC Draft Technical Report 13211-5:2007,
               Prolog Multi-Threading Support

           •   Extensions to account for remote
               threads




PM2-Prolog Remote Threads




           •   thread_create/2
           •   vid(Rank, ThreadID)

                •   Rank - VP identifier

                •   ThreadID - identifier of the thread inside
                    that VP
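
A two-part identifier of this kind lets a sender decide whether a message stays inside its own VP or has to cross the network. The sketch below only illustrates that idea in Python; `Vid` and `route` are hypothetical names, and the slides do not state that PM2-Prolog routes messages this way.

```python
from typing import NamedTuple


class Vid(NamedTuple):
    """Python analogue of the vid(Rank, ThreadID) term."""
    rank: int        # VP identifier
    thread_id: int   # identifier of the thread within that VP


def route(own_rank, target):
    """Decide whether a message to `target` stays local or crosses the network."""
    return "local" if target.rank == own_rank else "remote"
```

For example, from VP 0, `route(0, Vid(0, 3))` is `"local"` and `route(0, Vid(2, 1))` is `"remote"`.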




Experimental Evaluation


           [Figure: chart; vertical axis 0 to 100, horizontal
           axis 2004 to 2007.]




             Results obtained with 7 machines, each an Intel(R) Pentium(R) 4 at 2.80 GHz with
           Hyper-Threading enabled and 512 MB of RAM, running a Linux 2.4.19 kernel and
           connected by a 100 Mbit/s TCP/IP Ethernet network
Conclusions



           •   We presented a distributed multi-threading
               GNU Prolog system on top of PM2

           •   First results show that it can obtain substantial
               speedups, even for real-world

           •   Showed that the approach is technically feasible
               and can be of use to other implementers



Further Work


           •   Improve our proposal

           •   Extend the API with introspection and
               monitoring predicates

           •   Experiment with more programs and bigger
               configurations

           •   Build our own applications using this
               technology


