SOEN 423: Project Report
In fulfillment of SOEN 423, Fall 2009 – Ver. 1.2


Project ID: 3
12/11/2009

Team Members




               Date         Rev.   Description       Author(s)    Contributor(s)

               10/12/2009   1.0    First Draft       Ali Ahmed    The Team

               11/12/2009   1.2    Document Review   Ali Ahmed    The Team




                          Concordia University Montreal

                                      Fall 2009




Table of Contents
1.     Introduction
2.     Problem Statement
3.     Design Description
4.     Implementation Details
     4.1.     Corba
     4.2.     Client
     4.3.     Front End
     4.4.     Replica Manager
     4.5.     Branch Servers / Replicas
     4.6.     Byzantine Scenarios
     4.7.     Reliable FIFO communication via UDP
     4.8.     Synchronization
     4.9.     Asynchronous call back in Corba
5.     Test cases and overview
6.     Team Organization and Contribution
7.     Conclusion




   1. Introduction
         This report is in fulfillment of the requirements for the SOEN 423 Distributed System
   Programming project for Fall 2009. It describes the specified problem, the design and
   implementation of our solution, the resulting output from the system, and its verification by
   test cases.




   2. Problem Statement
       We were required to implement a Distributed Banking System (DBS), extending the core idea
       of the individual assignments. Our project (group of 3) was to have the following features:

      • A failure-free front end (FE) which receives requests from the clients as CORBA invocations,
        atomically broadcasts the requests to the server replicas, and sends a single correct result back
        to the client by properly combining the results received from the replicas. The FE should also
        keep track of the replica which produced an incorrect result (if any) and inform the replica
        manager (RM) to replace the failed replica. The FE should also be multithreaded so that it can
        handle multiple concurrent client requests, using one thread per request.

      • A replica manager (RM) which creates and initializes the actively replicated server subsystem.
        The RM also manages the server replica group information (which the FE uses for request
        forwarding) and replaces a failed replica with another one when requested by the FE.

      • A reliable FIFO communication subsystem over the unreliable UDP layer for the communication
        between replicas.




   3. Design Description
     Based on the requirements we saw a possible bottleneck: the methods in our assignment code
had return types, which would cause blocking and hence low performance. Additionally, Branch
Servers had to be destroyed and re-instantiated, hence we reference them through a Branch Server
Proxy Object. Transfer operations were delegated to the FEs, which split them into separate deposit
and withdraw requests, since a transfer may need to contact other FEs. Each FE and its associated
group of replicas deals with a subset of the accounts. In an account transfer, the withdraw request is
first routed to the FE that owns the source account; a message is then sent to the appropriate FE
(possibly the same one) to perform the deposit operation.




                    Fig. High-level design overview of the implementation and deployment





   4. Implementation Details

        4.1.     Corba
         All elements of the CORBA interface were defined in our IDL file, which is listed below. Things
       to note are the callback object and the void method declarations.
         Our IDL file:

       //---------------

       module dbs
       {
           module corba
           {
               interface CallBack
               {
                   void responseMessage(in string message);
               };

               interface FailureFreeFE
               {
                   string sayHello();
                   void deposit(in long long accountNum, in float amount);
                   void withdraw(in long long accountNum, in float amount);
                   void balance(in long long accountNum);
                   void transfer(in long long src_accountNum,
                                 in long long dest_accountNum, in float amount);
                   void requestResponse(in CallBack cb);
                   oneway void shutdown();
               };
           };
       };
       //---------------




        4.2.     Client
               The Client simply registers itself with the ORB daemon; it maps references of Front
       Ends to the Branch ID of the requested accounts (the first two digits of the account number).
       It also registers the callback object with the FE when making a request. There is no blocking
       at any stage, and the FE response is asynchronous, so multiple requests can be sent and the
       responses arrive later.
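
        A minimal sketch of this client-side flow is shown below. The naming-service entry
       ("FE10" for branch 10) is a hypothetical convention, and the stubs (FailureFreeFEHelper,
       CallBackHelper) are assumed to be generated from our IDL by the idlj compiler; CallBackImpl
       is the servant sketched in section 4.9.

       //---------------
       // Minimal client sketch (hypothetical lookup names; error handling omitted).
       import org.omg.CORBA.ORB;
       import org.omg.CosNaming.NamingContextExt;
       import org.omg.CosNaming.NamingContextExtHelper;
       import org.omg.PortableServer.POA;
       import org.omg.PortableServer.POAHelper;
       import dbs.corba.CallBack;
       import dbs.corba.CallBackHelper;
       import dbs.corba.FailureFreeFE;
       import dbs.corba.FailureFreeFEHelper;

       public class Client {
           public static void main(String[] args) throws Exception {
               ORB orb = ORB.init(args, null);

               // Activate the callback servant so the FE can reach us later.
               POA rootPoa = POAHelper.narrow(orb.resolve_initial_references("RootPOA"));
               rootPoa.the_POAManager().activate();
               CallBack cb = CallBackHelper.narrow(
                       rootPoa.servant_to_reference(new CallBackImpl()));

               // The first two digits of the account number select the branch,
               // and hence the FE to contact (the name "FE10" is hypothetical).
               NamingContextExt nc = NamingContextExtHelper.narrow(
                       orb.resolve_initial_references("NameService"));
               FailureFreeFE fe = FailureFreeFEHelper.narrow(nc.resolve_str("FE10"));

               fe.requestResponse(cb);        // register the callback with the FE
               fe.deposit(1012345L, 100.0f);  // void: returns immediately, no blocking
               fe.balance(1012345L);          // reply arrives via cb.responseMessage()

               orb.run();                     // keep serving asynchronous callbacks
           }
       }
       //---------------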





        4.3.    Front End




                                  Fig. High-level view of frontend components.

         The frontend manages the communication between the clients and the branch server replicas.
       It provides clients with a failure-free interface to branch servers, allowing them to perform
       the basic banking operations (deposit, withdraw, balance, transfer).

       Clients are only required to know the location of the server running the ORB. They then obtain
       a reference to the frontend hosting the account they wish to manipulate. Requests are sent via
       CORBA invocations to the frontend, which then uses the FIFO UDP subsystem to broadcast them to
       each of its branch server replicas. Each replica performs the requested operation on the
       account and returns a result in the form of an account balance. The results are all compared,
       and the response that reflects the correct result is sent back to the client.

         The CORBA middleware provides us with transparent threading and concurrency control. Also,
       the UDP server runs in its own thread, and all its operations are hence asynchronous.
       Therefore spawning additional threads per request was no longer needed.
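
        The result-combination step can be sketched as a majority vote over the replica balances.
       The helper below is a hypothetical illustration of the idea rather than our exact code:

       //---------------
       // Hypothetical majority-vote helper: returns the balance that at least
       // two of the replicas agree on, or null if no consensus exists.
       import java.util.HashMap;
       import java.util.Map;

       public class ResultCombiner {
           public static Float majority(float[] replies) {
               Map<Float, Integer> counts = new HashMap<Float, Integer>();
               for (float r : replies) {
                   Integer c = counts.get(r);
                   int count = (c == null ? 0 : c.intValue()) + 1;
                   counts.put(r, count);
                   if (count > replies.length / 2) {
                       return r;      // majority found: this is the correct result
                   }
               }
               return null;           // no consensus; the group must be recovered
           }
       }
       //---------------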





        4.4.    Replica Manager
       Why replication is needed

       Process failure should not imply the failure of the whole system

       A distributed system needs to implement techniques guaranteeing the availability and the
       correctness of the result provided to the client. We use active replication, as per the
       specification, so that the system provides a collection of results representing a consensus
       that the client can rely on without being aware of the inner replication process.

       A replica is a server process that carries out the execution of a client request alongside
       its peers, each processing the request in its own memory space. This way we guarantee that a
       replicated process does not alter the state of another replica.

       But replication comes at a cost. Correct, up-to-date results must be maintained by proper
       message ordering and by avoiding Byzantine failures (malicious or non-malicious). In the best
       case all replicas hold the same correct value. In the worst case we expect the returned value
       to be decided by a majority vote.

       We may expect the replicas to fail for arbitrary reasons; hardware failure can be among the
       most dramatic. Replicas that all fail on the same host will result in total system failure
       unless a group can take their place. In this project we do not assume that "replica group
       replacement" exists or that replicas run on different hosts, but we keep in mind that both
       can exist in some distributed systems.

       To prevent failure we implemented replicas as entities which relieve some of the work from
       the front end and ensure that even if a replicated process fails, the front end can still
       manage to send a correct answer to the client.

       Inaccurate responses should not be sent back to the client

       Now we need to discuss why replication in general provides a good guarantee that no incorrect
       answer will be sent to the client. Suppose a single process is handling the request of a
       client, and suppose its answer is incorrect. The front end cannot know the answer is
       incorrect, and failure detection is absent.






       Now suppose we replicate the server process. In some cases we may expect the replicas to
       send different answers to the front end. We assume that a consensus can be reached by a
       majority vote; otherwise the server (replicas and front end) cannot proceed further, and a
       replica must be rebuilt or recovered to the last known good state.

       In the project we make the assumption that the replies sent by the replicas always reach a
       consensus and that this consensus is truthful (not altered by some Byzantine general).

       The place of the replica and the replica manager within the system

       Replica and branch server

       The BranchServerProxy is the instance that holds the reference to the replica
       (BranchServerImpl). A replica is a copy of a branch server holding a private copy of the set
       of accounts within the branch server. According to our assumptions, two replicas will return
       the correct result, but a faulty replica will "lie" when it returns its result to the front
       end. For the sake of simplicity we treat a replica as an instance of a branch server.

       Life of a replica

       A branch server is expected to return correct results unless otherwise instructed. To keep
       the implementation general, a branch server can be created, killed, and synched with a
       trusted replica. A replica is kept alive as long as it has not generated three errors; once
       it has, it is signaled to stop and is synched with a trusted replica's data.

       Killing a replica means the reference to the implementation, along with its private accounts,
       is released for garbage collection. This is not a good way to proceed in a real distributed
       system, but since the set of accounts is never more than ten at any time, this factor does
       not alter the performance of the system in any way.

       Once the reference to the implementation has been collected, the new instance is synched with
       a trusted branch server. We define synching as requesting a trusted replica's account data
       set.





       The replicas do not communicate directly: messages must be sent through a UDP server
       contained in the front end, which relays the response back to the failed replica. Accounts
       in the faulty replica are updated one by one until all accounts hold the correct values.

       The replica manager

       A replica manager is an entity in the system that manages the life of replicas. We expect
       the replica manager to signal a server holding replicated data to stop referencing incorrect
       key/value pairs (id, amount) defining bank accounts. Its job is simply to send a UDP message
       to the failed branch server proxy to eliminate the reference to the old replicated data.

       The replica manager does not play a bigger role in the system than initiating the replacement
       of replicated bank accounts; we decided to integrate its remaining functionality into the
       other subsystems.
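
        A sketch of that single responsibility is shown below; the UdpServer interface and the
       "REPLACE" message format are assumptions for illustration, not our exact wire format:

       //---------------
       // Hypothetical sketch: the replica manager only initiates replacement,
       // by telling the faulty branch server proxy (over FIFO UDP) to drop its
       // replica reference and resynchronize from a trusted one.
       public class ReplicaManager {
           /** Assumed interface onto the FIFO UDP subsystem of section 4.7. */
           public interface UdpServer {
               void send(String host, int port, String message);
           }

           private final UdpServer udp;

           public ReplicaManager(UdpServer udp) {
               this.udp = udp;
           }

           public void replaceReplica(String proxyHost, int proxyPort, int trustedReplicaId) {
               // The proxy reacts by releasing its BranchServerImpl reference
               // for garbage collection and synching a fresh instance with the
               // trusted replica's account data set.
               udp.send(proxyHost, proxyPort, "REPLACE:" + trustedReplicaId);
           }
       }
       //---------------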



        4.5.    Branch Servers / Replicas


           Branch Servers exist with their own repository, which is simulated by a synchronized
       hash table, and have methods to conduct operations on it. They do not have a transfer
       operation, as they cannot call methods on a replica server holding a different group of
       accounts. They exist as a reference held by a Branch Server Proxy Object, which handles UDP
       messaging via its UDP server and passes the respective messages on to them. The proxy object
       also handles the replica failure scenario by reinitializing the object.
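
        A minimal sketch of such a replica follows, assuming the accounts are pre-populated at
       startup; names and signatures are illustrative:

       //---------------
       // Hypothetical replica sketch: a branch server with a synchronized
       // account repository. Transfers are split into withdraw + deposit
       // by the front ends (section 3), so no transfer method exists here.
       import java.util.Hashtable;

       public class BranchServerImpl {
           private final Hashtable<Long, Float> accounts = new Hashtable<Long, Float>();

           public BranchServerImpl(Hashtable<Long, Float> initialAccounts) {
               accounts.putAll(initialAccounts);   // private copy of the branch accounts
           }

           public synchronized float deposit(long accountNum, float amount) {
               float balance = accounts.get(accountNum) + amount;
               accounts.put(accountNum, balance);
               return balance;
           }

           public synchronized float withdraw(long accountNum, float amount) {
               float balance = accounts.get(accountNum) - amount;
               accounts.put(accountNum, balance);
               return balance;
           }

           public float balance(long accountNum) {
               return accounts.get(accountNum);    // Hashtable itself is synchronized
           }
       }
       //---------------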



        4.6.    Byzantine Scenarios
          The project requirements state that our failure-free frontend must also detect a single
       non-malicious Byzantine failure and correct it using active replication. To achieve this we
       first had to make one of the replicas return incorrect results, which we accomplished via a
       command-line parameter flag that, when activated, gives the replica a high probability of
       producing incorrect results. The frontend also keeps a hash table recording, for each
       replica, the number of times it has consecutively sent false data, to determine when it is
       time to actively replicate it.


       When the frontend examines the replica responses for a given client request, it detects the
       single Byzantine failure by finding the two results that agree. The one that does not agree
       is obviously the one that produced an incorrect result and must be dealt with. The replica
       error hash table entry for this replica is checked, and if its error count reaches 3 the
       replica must be replaced and synced with clean data. The tasks of terminating the faulty
       replica, instantiating a new one, and repopulating it with known good data from another
       replica are then delegated to the Replica Manager. Finally, the frontend resets the error
       count of the new replica.
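
        A sketch of this detection logic, assuming a hypothetical error-count table in the
       frontend and the three replica replies for one request:

       //---------------
       // Hypothetical detection sketch: with three replicas and at most one
       // faulty one, the agreeing pair identifies the odd one out.
       import java.util.Hashtable;

       public class ByzantineDetector {
           private static final int ERROR_THRESHOLD = 3;   // consecutive errors before replacement
           private final Hashtable<Integer, Integer> errorCounts =
                   new Hashtable<Integer, Integer>();

           /** Returns the id of the replica to replace, or -1 if none. */
           public int check(int[] ids, float[] replies) {
               int faulty = -1;
               if (replies[0] == replies[1] && replies[2] != replies[0]) faulty = 2;
               else if (replies[0] == replies[2] && replies[1] != replies[0]) faulty = 1;
               else if (replies[1] == replies[2] && replies[0] != replies[1]) faulty = 0;

               for (int i = 0; i < 3; i++) {
                   if (i == faulty) {
                       Integer c = errorCounts.get(ids[i]);
                       int count = (c == null ? 0 : c.intValue()) + 1;
                       errorCounts.put(ids[i], count);
                       if (count >= ERROR_THRESHOLD) {
                           errorCounts.put(ids[i], 0);   // the fresh replica starts clean
                           return ids[i];                // hand off to the Replica Manager
                       }
                   } else {
                       errorCounts.put(ids[i], 0);       // a correct reply resets the consecutive count
                   }
               }
               return -1;
           }
       }
       //---------------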




       How we simulate Byzantine failure

         Under the conditions of the network where the software was tested, no Byzantine failure
       (malicious or not) was expected to happen, but in real conditions this has to be taken into
       account. Also, if the network is unreliable, we can expect Byzantine failure due to message
       loss. To handle Byzantine failure we had to simulate a replica sending inaccurate data.

       We were asked to implement a system where only one Byzantine failure will occur at any time.
       Under this condition we assume only one replica will generate incorrect results. To implement
       this we added an extra parameter to the replica indicating whether it will generate incorrect
       results or not.

       This way we know only one Byzantine replica will fail, and we know we can determine which one
       it is. This approach has the limitation of testing the system under unlikely circumstances,
       but it is sufficient for the scope of the assignment.
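
        A sketch of this fault injection is shown below; the "-faulty" flag name and the
       corruption probability are illustrative choices, not the exact values from our code:

       //---------------
       // Hypothetical fault-injection sketch: a replica started with the
       // "-faulty" flag corrupts its reply with high probability.
       import java.util.Random;

       public class FaultyReplyDecorator {
           private final boolean faulty;
           private final Random random = new Random();

           public FaultyReplyDecorator(String[] args) {
               boolean f = false;
               for (String a : args) {
                   if (a.equals("-faulty")) f = true;   // command-line parameter flag
               }
               this.faulty = f;
           }

           /** Returns the true balance, or a corrupted one with probability 0.8. */
           public float maybeCorrupt(float trueBalance) {
               if (faulty && random.nextFloat() < 0.8f) {
                   return trueBalance + 1.0f + random.nextInt(100);   // the replica "lies"
               }
               return trueBalance;
           }
       }
       //---------------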








     Fig. Sequence diagram of the fail-case scenario and the respective action taken by the replica
                                    manager to handle the response.








        4.7.    Reliable FIFO communication via UDP


        One of the requirements of the project was to have reliable communication in FIFO order
       implemented over the unreliable UDP protocol. As per our design, every entity requiring this
       network access holds an object reference to a UDP server; in our project those entities are
       the Branch Proxy and the Front End. The UDP server object runs in a thread and also maintains
       another concurrent retransmit thread object which runs in parallel.

         To maintain FIFO ordering, messages are held in a queue and are numbered. Sequence number
       pairs are kept per destination, and likewise at the UDP receiver object. Once a packet is
       sent from the queue, we wait for a corresponding acknowledgment for that message.

         Once it is received, the next message is acted upon. Message numbering allows us to discard
       messages arriving out of order as well as duplicate messages. The retransmit thread, which
       runs a check method at a specified interval, monitors whether the current message has had its
       acknowledgment returned; if not, the same message is broadcast again. If the total number of
       retries exceeds a threshold, a system exception is thrown.
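
        The sender side of this scheme can be sketched as stop-and-wait with retransmission; the
       class below is a simplified, hypothetical single-destination version (the real subsystem
       keeps sequence numbers per destination and acknowledges on the receiver side):

       //---------------
       // Hypothetical sender-side sketch of stop-and-wait FIFO over UDP:
       // one in-flight message at a time, retransmitted until acknowledged.
       import java.net.DatagramPacket;
       import java.net.DatagramSocket;
       import java.net.InetAddress;

       public class FifoUdpSender {
           private static final int MAX_RETRIES = 5;
           private static final int ACK_TIMEOUT_MS = 500;

           private final DatagramSocket socket = new DatagramSocket();
           private int nextSeq = 0;   // kept per destination in the real design

           public FifoUdpSender() throws Exception {}

           public void send(InetAddress host, int port, String payload) throws Exception {
               byte[] data = (nextSeq + ":" + payload).getBytes();
               DatagramPacket packet = new DatagramPacket(data, data.length, host, port);

               socket.setSoTimeout(ACK_TIMEOUT_MS);
               for (int retry = 0; retry <= MAX_RETRIES; retry++) {
                   socket.send(packet);
                   try {
                       byte[] buf = new byte[64];
                       DatagramPacket ack = new DatagramPacket(buf, buf.length);
                       socket.receive(ack);   // blocks until an ack arrives or we time out
                       if (new String(ack.getData(), 0, ack.getLength())
                               .equals("ACK:" + nextSeq)) {
                           nextSeq++;         // acked: the next message may now be sent
                           return;
                       }
                   } catch (java.net.SocketTimeoutException e) {
                       // no ack in time: fall through and retransmit
                   }
               }
               throw new RuntimeException("message " + nextSeq + " not acknowledged");
           }
       }
       //---------------

       Stop-and-wait keeps at most one unacknowledged message in flight per destination, which is
       what gives the FIFO guarantee.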








       Fig. Sequence diagram of a sample operation sequence for the FIFO UDP subsystem.




        4.8.    Synchronization


               This is an extension of the process implemented in the assignments. The Front End
       uses synchronized blocks to handle concurrency and to obtain fine-grained control over the
       locking mechanism for maximum performance. Additionally, concurrent data structures are used
       where appropriate, e.g. concurrent queues in the UDP server and hash tables for the account
       information.
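
        A sketch of this pattern, with a hypothetical fragment of front-end state: the concurrent
       queue needs no extra locking, while the shared counter is protected by a short synchronized
       block rather than by synchronizing whole methods:

       //---------------
       // Hypothetical sketch of the locking pattern: concurrent structures
       // where possible, short synchronized blocks where state is shared.
       import java.util.concurrent.ConcurrentLinkedQueue;

       public class FrontEndState {
           private final Object replyLock = new Object();   // guards only the reply counter
           private final ConcurrentLinkedQueue<String> outgoing =
                   new ConcurrentLinkedQueue<String>();     // thread-safe without locking
           private int repliesReceived = 0;

           public void enqueue(String message) {
               outgoing.add(message);                       // safe from any thread
           }

           public boolean recordReply() {
               synchronized (replyLock) {                   // fine-grained critical section
                   repliesReceived++;
                   return repliesReceived == 3;             // all three replicas answered
               }
           }
       }
       //---------------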








        4.9.    Asynchronous call back in Corba




                       Fig. A simple setup with return types specified in CORBA methods




         To avoid the above scenario we use void methods in our interface definitions, but for each
       operation we register a callback object, which the FE can use to respond asynchronously
       without blocking the client, while the FE keeps processing additional requests from the
       client.
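
        A minimal sketch of the client-side callback servant (CallBackPOA is the skeleton
       generated from our IDL; the printed message format is illustrative):

       //---------------
       // Client-side callback servant: the FE invokes responseMessage()
       // asynchronously once a replicated operation has completed.
       import dbs.corba.CallBackPOA;

       public class CallBackImpl extends CallBackPOA {
           public void responseMessage(String message) {
               // e.g. "BALANCE 1012345 = 400.0" -- format is illustrative
               System.out.println("FE response: " + message);
           }
       }
       //---------------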





   5. Test cases and overview




           Fig. Terminal screens of the project initialized normally; this instance shows three
                            front ends and the associated nine replica servers.








                       Fig. Source tree overview of the project






   Fig. Output consoles for the clients and replica servers running Driver Test1 (single-thread test)








                       Fig. Double-threaded client (Branch 30)








                       Fig. Double-threaded client (Branch 20)








                        Fig. FIFO UDP testing








   6. Team Organization and Contribution

       As per the project document, roles were already specified for the three team members. That is
   primarily how the team was organized; there was some overlap between areas in the later stages,
   and the documentation was done together in a team setting.




      • Ali Ahmed - Primary responsibility for the design and implementation of the FIFO UDP
          message-passing system; asynchronous callback in CORBA; design of the front end and
          replica servers.




      • *********** - Handling the Replica Manager and delegating responsibilities to the
             relevant classes for the appropriate actions.




   7. Conclusion

      Our implementation correctly realizes the specification and the results are as expected:
   changing variable values produces output variations in line with manual calculations. Hence we
   feel the solution satisfies all criteria required within the scope of the assignment.




