Testing of Large Scale Distributed Software
Gerson Sunyé
Lina - Université de Nantes
http://sunye.free.fr

1
Agenda
• Large-Scale Distributed Systems.
• Characteristics of Distributed Software
• Testing of Distributed Systems
• Web...
Large-Scale Distributed Systems
Large-Scale Distributed (LSD) Systems
• A distributed system is a piece of software that ensures that a collection of
inde...
LSD Systems Examples

• Peer-to-Peer Systems: Gnutella, Napster, Skype, Joost, Distributed Hash
Tables, etc.
• Cloud Compu...
Cloud Computing
• Main concept: the computing is
"in the cloud".
• Cloud service providers:
Amazon, Microsoft, Google.
• I...
MapReduce
• Framework introduced by
Google.
• Supports distributed computing
on large data sets on clusters of
computers.
...
Distributed Hash Table (DHT)
• Second generation P2P overlay network (Structured).
• Self-organizing, load balanced, fault...
DHT Principles
• Each node is responsible for a range of keys.
• Core operation: find the node responsible for a given key...
DHT
• Node X Id association:
• 1 node knows all other nodes (centralized).
• All nodes know all other nodes (replication)....
DHT Example: Chord
Why do we need DHTs ?
• Abstraction:
• Persistence, volatility, message routing, etc.
• Applications:
• File systems.
• Da...
Characteristics of Distributed Software

13
How is distributed software different?

• Non-determinism;
• Third-party infrastructure;
• Partial failures;
• Time-outs;
...
Non-determinism

• Difficult to reproduce a test execution.
• The thread execution order may be affected by external progr...
Third-party infrastructure

• Distributed software often rely on third-party middleware: RPC, Message
Exchanging, Brokers,...
Partial Failures

• A failure in a particular node prevents part of the system from achieving its
behavior.

© G. Sunyé,
2...
Time-Outs

• Time-outs are used to avoid deadlocks.
• When a response is not received within a certain time, the request i...
Dynamic nature of the structure.

• Often, the architecture changes.
• Specific requests may be redirected dynamically to ...
Distributed Software Testing

20
How is Distributed Testing Different?

• Test harness deployment.
• Results retrieval.
• Execution synchronization.
• Node...
How is Distributed Testing Different?

• Level: System Testing.
• Growing need of non-functional testing: security, load, ...
Test harness deployment

• Test harness must be deployed to every node that composes the SUT.
• Input data.
• Test cases.
...
Results retrieval

• All results must be retrieved.
• A timeline must be constructed.

© G. Sunyé,
2010.

Université de
Na...
Synchronization Service
• The testing architecture must ensure synchronization among nodes.

25
Node Failure Simulation

• Removing network connections.
• Shutting down a node.
• Blocking all threads of a node.
• Testi...
Number of Nodes Variation
• Tests must be executed on different configurations.

27
Distributed Test Architecture

28
A Distributed System

29
Distributed Test Architecture

30
A Distributed Test Case

Step

Action

Nodes

1

insert(K,V)

1

2

V’ := retrieve(K)

all

3

? V = V’

all
31
Test Case Execution

? V = V’
insert(K,
V)
V’ :=
retrieve(K)

S
tep
1
2
3

Action
insert(K,V
)
V’ :=
retrieve(K)
? V = V’
...
Web Software

34
Web Software
• Composed of two or more components:
• Loosely coupled.
• Communicate by messages.
• Almost no shared memory...
Web Applications

• A software deployed on the web.
• HTML User Interface.
• International (available worldwide).
• HTTP a...
How is Web Application Testing Different?

• Traditional graphs (control flow, call) do not apply.
• State behavior is har...
Common difficulties

• Stateless protocol: requests are independent from each other.
• Different hardware devices, program...
New Problems
• The user interface is created dynamically.
• The flow of control may be changed:
• back/forward, refresh
• ...
Web Services

41
Web Services

• A software deployed on the web that accepts remote procedure call (RPC):
SOAP, XML-RPC, RESTful.
• No huma...
Service Publication

43
How is Web Services Testing Different?

• Communications are often between services published by different
organizations.
...
Peer-to-Peer Systems

45
P2P Applications

• Huge number of nodes.
• No central point of failure.
• Node volatility is a common behavior.
• Nodes a...
How is P2P Testing Different?

• Volatility must be simulated and controlled.
• Centralized services are a bottleneck:
• S...
Tools

48
Testing Tools

• SeleniumHQ
• HttpUnit
• Apache JMeter
• PeerUnit

© G. Sunyé,
2010.

Université de
Nantes

49
SeleniumHQ

• Selenium IDE: Firefox plugin
• Selenium Remote Control: test runner
• Selenium Grid: RC extension for parall...
HttpUnit

• Server testing.
• Emulates browser behavior.

© G. Sunyé,
2010.

Université de
Nantes

51
Apache JMeter

• Performance testing.
• Not restricted to web software.
• DBMS, LDAP, JMS, SMTP, etc.

© G. Sunyé,
2010.

...
A Framework for Testing Distributed
Systems

58
PeerUnit

• A framework for testing P2P systems;
• Written in Java 1.5;
• Open source, GPL license.

© G. Sunyé,
2010.

Un...
Properties

• Uses Java annotations.
• Synchronization (for > 1,000 testers).
• Volatility simulation.
• Shared variables....
Basic Principles

Test

PeerUnit

Test Case

Java Class

Test Step

Java Method

61
A Simple Test Case
public class SimpleTest {
@TestStep(range = "1", timeout = 100,
order = 1)
public void insert() {...}
@...
Test Execution
Step 1 - Start the Test
Controller:
java
fr.inria.peerunit.CoordinatorRunner
&Step 2 - Run 1 or more
Tester...
Test Results

[java] -----------------------------[java] Test Case Verdict:
[java] Step: 1. Pass: 1. Fails: 0. Errors: 0. ...
PeerUnit Architecture
Architecture 1 : Centralized
Architecture 2: Tree
Architecture 3: Optimized Tree
Synchronization Performance
Systems already tested

• OpenChord
• FreePastry
• Kademlia
• PostgreSQL
Testing a DHT

71
The SUT Interface

72
Test Case Summary

Step

Action

Testers

1

Create the (mock) DHT

1

2

Connect to DHT

*

3

Insert Data

2

4

Retriev...
Test Case Implementation

(RemoteDHTTest.j
ava)

74
Configuration

tester.peers=7
tester.server=127.0.0.1
tester.port=8181
# Log level order SEVERE,WARNING,INFO,CONFIG,FINE,F...
Test Case Execution

• Create a build file.
• Deploy the test bed.
• Execute.
• Demo.

© G. Sunyé,
2010.

Université de
Na...
Conclusion and
References
References

77
Conclusion

• Large Scale Distributed Systems provide a new way to build software.
• Several new technologies introducing ...
Bibliography

• « Testing Object-Oriented Systems: Models, Patterns, and Tools».
• Robert V. Binder.
• Addison-Wesley Obje...
« If debugging is the process of removing bugs,
then programming must be the process of
putting them in»
Edsger W.
Dijkstr...
Upcoming SlideShare
Loading in …5
×

Testing Large-Scale Distributed Software

1,557 views

Published on

Published in: Technology
0 Comments
2 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
1,557
On SlideShare
0
From Embeds
0
Number of Embeds
7
Actions
Shares
0
Downloads
44
Comments
0
Likes
2
Embeds 0
No embeds

No notes for slide
  • Details on next slides.
  • defined by a scheduler
  • Changement de versions?
    self-organizing: A specific input data set can cause a different path to be executed because the previous path no longer exists
  • Software must perform correctly when request is answered and when it isn’t, albeit perhaps differently
  • timeline: can not use a ntp, can not modify datetime on machine.
  • Once synchronization has been mastered everything else falls into place!
  • web software == low quality !
    must test more !!!
  • Shared memory: only by «session object», stored in the client and restricted.
  • They accept requests through HTTP and return responses
    HTTP is stateless – each request/response pair is independent
  • UI dynamic: the ui may vary according to users/browsers/devices/versions/etc.
    SQL Injection:
    SELECT username FROM adminuser WHERE username=‘turing’ OR ‘1’ = ‘1’ AND password =‘enigma’ OR ‘1’ = ‘1’
    Dynamically: EJB, WebServices.
  • Trust is a major issue!
    Another one: other implementation or other version.
  • IDE: Record actions.
    RC: runs tests in multiple browsers (http proxy).
  • Testing Large-Scale Distributed Software

    1. 1. Testing of Large Scale Distributed Software Gerson Sunyé Lina - Université de Nantes http://sunye.free.fr 1
    2. 2. Agenda • Large-Scale Distributed Systems. • Characteristics of Distributed Software • Testing of Distributed Systems • Web Applications and Web Services. • Peer-to-Peer Systems • A Framework for Testing Distributed Systems • Conclusion
    3. 3. Large-Scale Distributed Systems
    4. 4. Large-Scale Distributed (LSD) Systems • A distributed system is a piece of software that ensures that a collection of independent computers appears to its users as a single coherent system. • Large scale often relates to systems with thousands or millions of nodes.
    5. 5. LSD Systems Examples • Peer-to-Peer Systems: Gnutella, Napster, Skype, Joost, Distributed Hash Tables, etc. • Cloud Computing: GoogleDocs, ThinkFree Online, etc. • Grid Computing: Berkeley Open Infrastructure for Network Computing (BOINC), etc. • MapReduce: Hadoop, Google, Oracle, etc. © G. Sunyé, 2010. Université de Nantes 5
    6. 6. Cloud Computing • Main concept: the computing is "in the cloud". • Cloud service providers: Amazon, Microsoft, Google. • Infrastructure as a service. • Software as a service. • Examples: Eucalyptus, OpenNebula, etc. 6
    7. 7. MapReduce • Framework introduced by Google. • Supports distributed computing on large data sets on clusters of computers. • Exemples: Apache Hadoop, Greenplum’s, Aster Data’s, etc. • Users: Google Index, Facebook,... 7
    8. 8. Distributed Hash Table (DHT) • Second generation P2P overlay network (Structured). • Self-organizing, load balanced, fault tolerant. • Scalable: guarantees the number of hops to answer a query. • Simple interface: • store(key, value) • retrieve(key) : value
    9. 9. DHT Principles • Each node is responsible for a range of keys. • Core operation: find the node responsible for a given key. ¿Quién es responsable por la clave «AF80»? ¡Yo!
    10. 10. DHT • Node X Id association: • 1 node knows all other nodes (centralized). • All nodes know all other nodes (replication). • All nodes have a partial view of the whole system.
    11. 11. DHT Example: Chord
    12. 12. Why do we need DHTs ? • Abstraction: • Persistence, volatility, message routing, etc. • Applications: • File systems. • Data sharing (Gnutella index).
    13. 13. Characteristics of Distributed Software 13
    14. 14. How is distributed software different? • Non-determinism; • Third-party infrastructure; • Partial failures; • Time-outs; • Dynamic nature of the structure. © G. Sunyé, 2010. Université de Nantes 14
    15. 15. Non-determinism • Difficult to reproduce a test execution. • The thread execution order may be affected by external programs. • Some defects do not appear on all executions. © G. Sunyé, 2010. Université de Nantes 15
    16. 16. Third-party infrastructure • Distributed software often rely on third-party middleware: RPC, Message Exchanging, Brokers, SOA, etc. • Infrastructure may be self-organizing. • Infrastructure (services) may belong to other entities. © G. Sunyé, 2010. Université de Nantes 16
    17. 17. Partial Failures • A failure in a particular node prevents part of the system from achieving its behavior. © G. Sunyé, 2010. Université de Nantes 17
    18. 18. Time-Outs • Time-outs are used to avoid deadlocks. • When a response is not received within a certain time, the request is aborted. • This is not an error. © G. Sunyé, 2010. Université de Nantes 18
    19. 19. Dynamic nature of the structure. • Often, the architecture changes. • Specific requests may be redirected dynamically to different nodes, depending on the load on various machines • The number of participating nodes may vary. © G. Sunyé, 2010. Université de Nantes 19
    20. 20. Distributed Software Testing 20
    21. 21. How is Distributed Testing Different? • Test harness deployment. • Results retrieval. • Execution synchronization. • Node failure simulation. • Number of nodes variation. © G. Sunyé, 2010. Université de Nantes 21
    22. 22. How is Distributed Testing Different? • Level: System Testing. • Growing need of non-functional testing: security, load, stress. © G. Sunyé, 2010. Université de Nantes 22
    23. 23. Test harness deployment • Test harness must be deployed to every node that composes the SUT. • Input data. • Test cases. • Software under test. • Libraries. © G. Sunyé, 2010. Université de Nantes 23
    24. 24. Results retrieval • All results must be retrieved. • A timeline must be constructed. © G. Sunyé, 2010. Université de Nantes 24
    25. 25. Synchronization Service • The testing architecture must ensure synchronization among nodes. 25
    26. 26. Node Failure Simulation • Removing network connections. • Shutting down a node. • Blocking all threads of a node. • Testing must ensure that software behaves correctly when time-outs are reached. © G. Sunyé, 2010. Université de Nantes 26
    27. 27. Number of Nodes Variation • Tests must be executed on different configurations. 27
    28. 28. Distributed Test Architecture 28
    29. 29. A Distributed System 29
    30. 30. Distributed Test Architecture 30
    31. 31. A Distributed Test Case Step Action Nodes 1 insert(K,V) 1 2 V’ := retrieve(K) all 3 ? V = V’ all 31
    32. 32. Test Case Execution ? V = V’ insert(K, V) V’ := retrieve(K) S tep 1 2 3 Action insert(K,V ) V’ := retrieve(K) ? V = V’ Nod es 1 V’ := retrieve(K) ? V = V’ ? V = V’ V’ := retrieve(K) ? V = V’ V’ := retrieve(K) all all Verdict = [PASS | FAIL] 32
    33. 33. Web Software 34
    34. 34. Web Software • Composed of two or more components: • Loosely coupled. • Communicate by messages. • Almost no shared memory. • Concurrent and distributed. • Technological jungle. © G. Sunyé, 2010. Université de Nantes 35
    35. 35. Web Applications • A software deployed on the web. • HTML User Interface. • International (available worldwide). • HTTP as underlying protocol. © G. Sunyé, 2010. Université de Nantes 36
    36. 36. How is Web Application Testing Different? • Traditional graphs (control flow, call) do not apply. • State behavior is hard to model and to describe (HTTP is stateless). • All inputs go through the HTML UI – low controllability. • Stress testing is crucial. © G. Sunyé, 2010. Université de Nantes 37
    37. 37. Common difficulties • Stateless protocol: requests are independent from each other. • Different hardware devices, programming languages, versions, etc. © G. Sunyé, 2010. Université de Nantes 38
    38. 38. New Problems • The user interface is created dynamically. • The flow of control may be changed: • back/forward, refresh • HTML may be rewritten! • Input validation. • Dynamic integration: modules can be loaded dynamically. © G. Sunyé, 2010. Université de Nantes 39
    39. 39. Web Services 41
    40. 40. Web Services • A software deployed on the web that accepts remote procedure call (RPC): SOAP, XML-RPC, RESTful. • No human interface. • Must be published to be discovered. © G. Sunyé, 2010. Université de Nantes 42
    41. 41. Service Publication 43
    42. 42. How is Web Services Testing Different? • Communications are often between services published by different organizations. • The owner of a service can not control the external services it uses. • Reconfiguration: a service can be replace by another one. © G. Sunyé, 2010. Université de Nantes 44
    43. 43. Peer-to-Peer Systems 45
    44. 44. P2P Applications • Huge number of nodes. • No central point of failure. • Node volatility is a common behavior. • Nodes are autonomous. © G. Sunyé, 2010. Université de Nantes 46
    45. 45. How is P2P Testing Different? • Volatility must be simulated and controlled. • Centralized services are a bottleneck: • Synchronization. • Results retrieval. © G. Sunyé, 2010. Université de Nantes 47
    46. 46. Tools 48
    47. 47. Testing Tools • SeleniumHQ • HttpUnit • Apache JMeter • PeerUnit © G. Sunyé, 2010. Université de Nantes 49
    48. 48. SeleniumHQ • Selenium IDE: Firefox plugin • Selenium Remote Control: test runner • Selenium Grid: RC extension for parallel testing. © G. Sunyé, 2010. Université de Nantes 50
    49. 49. HttpUnit • Server testing. • Emulates browser behavior. © G. Sunyé, 2010. Université de Nantes 51
    50. 50. Apache JMeter • Performance testing. • Not restricted to web software. • DBMS, LDAP, JMS, SMTP, etc. © G. Sunyé, 2010. Université de Nantes 52
    51. 51. A Framework for Testing Distributed Systems 58
    52. 52. PeerUnit • A framework for testing P2P systems; • Written in Java 1.5; • Open source, GPL license. © G. Sunyé, 2010. Université de Nantes 59
    53. 53. Properties • Uses Java annotations. • Synchronization (for > 1,000 testers). • Volatility simulation. • Shared variables. • Distributed Architecture. © G. Sunyé, 2010. Université de Nantes 60
    54. 54. Basic Principles Test PeerUnit Test Case Java Class Test Step Java Method 61
    55. 55. A Simple Test Case public class SimpleTest { @TestStep(range = "1", timeout = 100, order = 1) public void insert() {...} @TestStep(range = "*", timeout = 100, order = 2) public void retrieve() {...} @TestStep(range = "*", timeout = 100, order = 3) 62
    56. 56. Test Execution Step 1 - Start the Test Controller: java fr.inria.peerunit.CoordinatorRunner &Step 2 - Run 1 or more Testers: java fr.inria.peerunit.TestRunner SimpleTest & 63
    57. 57. Test Results [java] -----------------------------[java] Test Case Verdict: [java] Step: 1. Pass: 1. Fails: 0. Errors: 0. Inconclusive: 0. elapsed: 23 msec. Average: 12 msec. Method: insert Time [java] Step: 2. Pass: 7. Fails: 0. Errors: 0. Inconclusive: 0. elapsed: 57 msec. Average: 11 msec. Method: retrieve Time [java] Step: 3. Pass: 7. Fails: 0. Errors: 0. Inconclusive: 0. elapsed: 5 msec. Average: 1 msec. Method: compare Time [java] ------------------------------ 64
    58. 58. PeerUnit Architecture
    59. 59. Architecture 1 : Centralized
    60. 60. Architecture 2: Tree
    61. 61. Architecture 3: Optimized Tree
    62. 62. Synchronization Performance
    63. 63. Systems already tested • OpenChord • FreePastry • Kademlia • PostgreSQL
    64. 64. Testing a DHT 71
    65. 65. The SUT Interface 72
    66. 66. Test Case Summary Step Action Testers 1 Create the (mock) DHT 1 2 Connect to DHT * 3 Insert Data 2 4 Retrieve and Compare * 5 Insert More Data 1 6 Retrieve and Compare * 73
    67. 67. Test Case Implementation (RemoteDHTTest.j ava) 74
    68. 68. Configuration tester.peers=7 tester.server=127.0.0.1 tester.port=8181 # Log level order SEVERE,WARNING,INFO,CONFIG,FINE,FINER,FINEST tester.log.level=INFO tester.logfolder=. # Tree Parameters test.treeOrder=1 # Coordination Type (0) centralized, (1) tree test.coordination=1 75
    69. 69. Test Case Execution • Create a build file. • Deploy the test bed. • Execute. • Demo. © G. Sunyé, 2010. Université de Nantes 76
    70. 70. Conclusion and References References 77
    71. 71. Conclusion • Large Scale Distributed Systems provide a new way to build software. • Several new technologies introducing fascinating new problems. • Very active research area. © G. Sunyé, 2010. Université de Nantes 78
    72. 72. Bibliography • « Testing Object-Oriented Systems: Models, Patterns, and Tools». • Robert V. Binder. • Addison-Wesley Object Technology Series. • «Introduction to Software Testing». • Paul Ammann and Jeff Offutt. • Cambridge University Press. © G. Sunyé, 2010. Université de Nantes 80
    73. 73. « If debugging is the process of removing bugs, then programming must be the process of putting them in» Edsger W. Dijkstra « There’s always one more bug» Lubarsky’s Law of Cybernetic Entomology 81

    ×