Clustering In The Wild

Published in: Technology

  1. 1. Clustering in the wild <ul><li>Ugo Landini </li></ul><ul><ul><li>CTO, Sourcesense </li></ul></ul><ul><li>Sergio Bossa </li></ul><ul><ul><li>Software Architect, Sourcesense </li></ul></ul>
  2. 2. Agenda <ul><li>Why Clustering? </li></ul><ul><li>Clustering J(2)EE </li></ul><ul><li>Terracotta in a nutshell. </li></ul><ul><li>Jira clustering issues. </li></ul><ul><ul><li>Files and indexes. </li></ul></ul><ul><ul><li>Stateful applications and home grown caches. </li></ul></ul><ul><ul><li>Thread and services. </li></ul></ul><ul><ul><li>HTTP Session. </li></ul></ul><ul><li>Summary. </li></ul>
  3. 3. Why clustering? <ul><li>Horizontal scalability: </li></ul><ul><ul><li>Scale out. </li></ul></ul><ul><ul><li>More computers, to improve throughput when a single one is not enough or costs too much. </li></ul></ul><ul><li>High availability: </li></ul><ul><ul><li>More computers to improve uptime. </li></ul></ul><ul><ul><li>If you unplug a network cable, the system should remain up and running. </li></ul></ul><ul><ul><li>24/7, or close to it. </li></ul></ul><ul><ul><li>Usually more important than scalability. </li></ul></ul>
  4. 4. Clustering J(2)EE <ul><li>In an ideal world </li></ul><ul><ul><li><distributable /> tag in your web.xml </li></ul></ul><ul><ul><li>Serializable objects in your HTTP session. </li></ul></ul><ul><li>True if and only if the application is J(2)EE compliant </li></ul><ul><ul><li>Basically, no arbitrary use of resources and state </li></ul></ul><ul><ul><ul><li>Files. </li></ul></ul></ul><ul><ul><ul><li>Threads. </li></ul></ul></ul><ul><ul><ul><li>Sockets. </li></ul></ul></ul><ul><ul><ul><li>... ? </li></ul></ul></ul>
  5. 5. Clustering J(2)EE <ul><li>What do I do with my files? </li></ul><ul><ul><li>java.io.tmpdir </li></ul></ul><ul><ul><li>JNDI lookup </li></ul></ul><ul><li>What do I do with the state of my application (caches, conversational state, etc.)? </li></ul><ul><ul><li>Stateful Enterprise Java Beans </li></ul></ul><ul><ul><li>Well-established caching frameworks </li></ul></ul><ul><ul><ul><li>EHCache, OSCache, JBossCache </li></ul></ul></ul><ul><ul><ul><li>JSR 107 </li></ul></ul></ul>
  6. 6. Clustering J(2)EE <ul><li>What do I do with my threads/services? </li></ul><ul><ul><li>JMS (MDBs and topics, mostly) </li></ul></ul><ul><ul><li>CommonJ (BEA and IBM effort) </li></ul></ul><ul><li>What do I do with my HTTP Session? </li></ul><ul><ul><li>Serializable objects. </li></ul></ul><ul><ul><li>Use a good Load Balancer. </li></ul></ul>
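The "Serializable objects" advice for the HTTP session can be illustrated with a minimal sketch. It assumes a container that replicates session attributes with standard Java serialization (containers differ in how and when they do this); the class and field names are hypothetical and do not come from the talk.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical session attribute; names are illustrative only.
class UserPreferences implements Serializable {
    private static final long serialVersionUID = 1L;

    final String userName;
    final int pageSize;
    // transient state is not replicated; it is null after deserialization
    transient Object localOnlyHandle = new Object();

    UserPreferences(String userName, int pageSize) {
        this.userName = userName;
        this.pageSize = pageSize;
    }
}

class SessionReplicationSketch {
    // Stand-in for replication: serialize on one "node", deserialize on another.
    static Object roundTrip(Object attribute) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(attribute);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

An attribute that holds a non-serializable, non-transient field would fail in `roundTrip` with a `NotSerializableException` wrapped in the unchecked exception, which is the kind of problem that only surfaces once a session is actually replicated.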
  7. 7. Wake up! <ul><li>Almost all successful J(2)EE applications around won't pass the Sun AVK (Application Verification Kit). </li></ul><ul><li>Most people go straight for the simple solution </li></ul><ul><ul><li>and that one could be a cluster antipattern </li></ul></ul><ul><ul><li>home-grown caches, Lucene indexes, Quartz jobs, singletons... add your favourite quickie here. </li></ul></ul>
  8. 8. Enter Terracotta <ul><li>Transparent (Translucid? ...) Clustering. </li></ul><ul><ul><li>Very few changes to existing code. </li></ul></ul><ul><ul><li>Low development effort. </li></ul></ul><ul><li>Open Source, free for any use. </li></ul><ul><li>Emerging (and cool!) technology. </li></ul><ul><li>Did I mention that we are a Terracotta partner? :) </li></ul>
  9. 9. The quest for antipatterns <ul><li>Jira is NOT easily clusterable, so it is a nice testbed. </li></ul><ul><li>Jira is a bug tracking, issue tracking, and project management application developed to make this process easier. </li></ul><ul><li>Jira is the leading issue tracker in the open source world (though it is not strictly open source). </li></ul><ul><li>People are asking for a clustered Jira! </li></ul><ul><ul><li>http://jira.atlassian.com/browse/JRA-7330 </li></ul></ul><ul><li>Did I mention that we are an Atlassian partner? </li></ul>
  10. 10. Terracotta magic
  11. 11. Terracotta magic
  12. 12. Terracotta magic
  13. 13. Terracotta magic
  14. 14. Terracotta magic
  15. 15. Terracotta magic
  16. 16. Terracotta magic
  17. 17. Terracotta magic <ul><li>Terracotta moves around only the bytes changed in shared objects </li></ul><ul><ul><li>No serialization. </li></ul></ul><ul><ul><li>superstatic objects! </li></ul></ul><ul><ul><li>same semantics, only new() behaves differently </li></ul></ul><ul><li>Transactions are demarcated with guarded blocks </li></ul><ul><ul><li>essentially moves multithreaded application semantics to the cluster level. </li></ul></ul><ul><li>For performance reasons, for certain objects it moves behaviour and not data (logically managed vs physically managed objects) </li></ul><ul><ul><li>you can do the same thing if you need to. (distributed methods) </li></ul></ul>
  18. 18. Terracotta in a nutshell <ul><li>Features, part one: </li></ul><ul><ul><li>Transparent JVM-level clustering. </li></ul></ul><ul><ul><ul><li>Transparently works inside your JVM as an infrastructure service. </li></ul></ul></ul><ul><ul><ul><li>Plugs into your code thanks to bytecode injection. </li></ul></ul></ul><ul><ul><ul><li>No API, no code changes! </li></ul></ul></ul><ul><ul><li>Hub-and-Spoke architecture. </li></ul></ul><ul><ul><ul><li>Central server based architecture. </li></ul></ul></ul><ul><ul><ul><li>All nodes talk only to the central server. </li></ul></ul></ul><ul><ul><ul><li>Linear scalability. </li></ul></ul></ul><ul><ul><ul><li>No split-brain problem. </li></ul></ul></ul>
  19. 19. Terracotta in a nutshell <ul><li>Features, part two: </li></ul><ul><ul><li>Active/Passive mode. </li></ul></ul><ul><ul><ul><li>One central active server, n passive servers. </li></ul></ul></ul><ul><ul><li>Network Attached Memory. </li></ul></ul><ul><ul><ul><li>Shares your object graph with the central server. </li></ul></ul></ul><ul><ul><ul><li>Virtual Heap (on disk, with Berkeley DB) </li></ul></ul></ul><ul><ul><ul><li>Maintains your object graph in the memory heap. </li></ul></ul></ul><ul><ul><li>Preserved Java semantics. </li></ul></ul><ul><ul><ul><li>Object equality (equals, hashCode) </li></ul></ul></ul><ul><ul><ul><li>Concurrency. (synchronized, java.util.concurrent) </li></ul></ul></ul><ul><ul><ul><li>Thread communication. (wait, notify) </li></ul></ul></ul>
  20. 20. Terracotta in a nutshell <ul><li>Main concepts: </li></ul><ul><ul><li>Roots. </li></ul></ul><ul><ul><ul><li>Define where your shared object graph starts. </li></ul></ul></ul><ul><ul><li>Locks. </li></ul></ul><ul><ul><ul><li>Ensure data consistency. </li></ul></ul></ul><ul><ul><ul><li>Enable Terracotta inter-node communication. </li></ul></ul></ul><ul><ul><ul><li>All code changing parts of the shared object graph must be guarded by locks. </li></ul></ul></ul><ul><ul><li>Distributed methods. </li></ul></ul><ul><ul><ul><li>Enable plain old Java methods to be simultaneously called in all cluster nodes. </li></ul></ul></ul>
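The roots and locks concepts can be pictured with plain Java. The sketch below is only ordinary single-JVM code: the assumption is that a Terracotta configuration (not shown here) would name the static field as a root and treat the synchronized blocks as cluster-wide lock boundaries. The class name, field name, and the thread-based helper are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class SharedRootSketch {
    // Candidate root: the shared object graph would start at this field
    // (assumption: it is declared as a root in the Terracotta configuration).
    private static final List<String> ISSUE_KEYS = new ArrayList<>();

    // Every change to the shared graph happens inside a guarded block.
    static void register(String key) {
        synchronized (ISSUE_KEYS) {
            ISSUE_KEYS.add(key);
        }
    }

    static int count() {
        synchronized (ISSUE_KEYS) {
            return ISSUE_KEYS.size();
        }
    }

    // Local stand-in for several cluster nodes: plain threads in one JVM.
    static int registerFromThreads(int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    register("T" + id + "-" + i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return count();
    }
}
```

Code that is already correct for multiple threads is the starting point; code that mutates the list outside a guarded block would be a problem in both the threaded and the clustered case.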
  21. 21. Out in the wild <ul><li>How did we actually cluster the beast? </li></ul>
  22. 22. Clustering Lucene indexes : Problems <ul><li>Lucene indexes are typically stored in files. </li></ul><ul><ul><li>Remember? A clustering antipattern. </li></ul></ul><ul><li>Used to improve data access speed. </li></ul><ul><li>How to cluster them? </li></ul><ul><ul><li>Network based solution : SAN or NFS. </li></ul></ul><ul><ul><ul><li>Not a viable solution due to locks </li></ul></ul></ul><ul><ul><li>Messaging based solution : JMS </li></ul></ul><ul><ul><ul><li>Complicated! </li></ul></ul></ul><ul><ul><ul><li>Indexes should improve performance, rather than make it worse! </li></ul></ul></ul>
  23. 23. Clustering Lucene indexes : Solution <ul><li>Let's store indexes in memory! </li></ul><ul><li>Lucene: </li></ul><ul><ul><li>Provides support for memory-based indexes. </li></ul></ul><ul><ul><li>Just use org.apache.lucene.store.RAMDirectory. </li></ul></ul><ul><li>Terracotta: </li></ul><ul><ul><li>Just a matter of configuration. </li></ul></ul><ul><ul><li>And you can share your Lucene indexes. </li></ul></ul>
  24. 24. Clustering Jira caches : Problems <ul><li>Guess what ... Jira uses home-grown caches! </li></ul><ul><ul><li>Remember? A clustering antipattern. </li></ul></ul><ul><ul><li>From bad to worse: </li></ul></ul><ul><ul><ul><li>No unified API! </li></ul></ul></ul><ul><ul><ul><ul><li>Just a lot of HashMaps and HashSets. </li></ul></ul></ul></ul><ul><ul><ul><li>Very poor locking policies. </li></ul></ul></ul><ul><ul><ul><ul><li>Makes configuration-only Terracotta clustering impossible! </li></ul></ul></ul></ul><ul><ul><li>Infeasible to use an existing caching framework. </li></ul></ul>
  25. 25. Clustering Jira caches : Solution <ul><li>Write a new, ad-hoc, unified caching API. </li></ul><ul><li>Goals: </li></ul><ul><ul><li>Simplicity. </li></ul></ul><ul><ul><ul><li>As simple as using a HashMap. </li></ul></ul></ul><ul><ul><li>Thread safety. </li></ul></ul><ul><ul><ul><li>Cache consistency. </li></ul></ul></ul><ul><ul><ul><li>Terracotta ready. </li></ul></ul></ul><ul><ul><li>Efficiency. </li></ul></ul><ul><ul><ul><li>No bottlenecks. </li></ul></ul></ul><ul><ul><ul><li>No liveness failures. </li></ul></ul></ul>
  26. 26. Caching API : Striving for simplicity. <ul><li>No strange methods. No cluster related configuration. </li></ul><ul><ul><li>Just the usual GET/PUT methods, and the like. </li></ul></ul><ul><ul><li>Terracotta makes the clustering work! </li></ul></ul><ul><li>When choosing how to cluster the cache: </li></ul><ul><ul><li>Distribute behaviour, rather than data. </li></ul></ul><ul><ul><ul><li>Jira puts heavyweight objects in cache. </li></ul></ul></ul><ul><ul><li>Distribute cache invalidation, rather than cache updates. </li></ul></ul><ul><ul><ul><li>Lower hit ratio but ... </li></ul></ul></ul><ul><ul><ul><ul><li>Lower network traffic! </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Higher simplicity! </li></ul></ul></ul></ul>
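The trade-off of distributing invalidation instead of updates can be sketched in plain Java. This is a single-JVM simulation, not the Scarlet caching API: each object stands for one node's local cache, and the loop over `peers` stands in for whatever mechanism delivers the invalidation to the other nodes (for example a Terracotta distributed method). All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class InvalidatingCacheNode {
    private final Map<String, String> local = new HashMap<>();
    // Peers are wired once at setup time in this sketch, before any put/get.
    private final List<InvalidatingCacheNode> peers = new ArrayList<>();

    void join(InvalidatingCacheNode other) {
        peers.add(other);
        other.peers.add(this);
    }

    synchronized String get(String key) {
        return local.get(key);
    }

    void put(String key, String value) {
        synchronized (this) {
            local.put(key, value);
        }
        // Only the key travels to the other nodes, never the heavyweight value.
        // The local lock is released first, so two nodes never wait on each other.
        for (InvalidatingCacheNode peer : peers) {
            peer.invalidate(key);
        }
    }

    synchronized void invalidate(String key) {
        local.remove(key);
    }
}
```

After a `put` on one node, the other nodes miss on that key and have to reload the value from the backing store themselves, which is the "lower hit ratio" the slide accepts in exchange for less network traffic.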
  27. 27. Caching API : Striving for thread safety. <ul><li>Carefully use Java locks (ok, this was obvious ...). </li></ul><ul><li>Due to how Jira works: </li></ul><ul><ul><li>The caching API must be able to group more than one cache under the same lock. </li></ul></ul><ul><ul><li>The caching API must be able to execute a code block atomically under the same lock. </li></ul></ul><ul><ul><li>Not so obvious ... </li></ul></ul><ul><ul><ul><li>Use what we call “owner-based locking.” </li></ul></ul></ul>
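The slides do not show how owner-based locking is implemented, so the following is only one possible reading of the two requirements: caches created with the same owner object share that owner's monitor, and a block of code can be run atomically under it. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class OwnedCache<K, V> {
    private final Object owner; // caches with the same owner share one lock
    private final Map<K, V> entries = new HashMap<>();

    OwnedCache(Object owner) {
        this.owner = owner;
    }

    V get(K key) {
        synchronized (owner) {
            return entries.get(key);
        }
    }

    void put(K key, V value) {
        synchronized (owner) {
            entries.put(key, value);
        }
    }

    void remove(K key) {
        synchronized (owner) {
            entries.remove(key);
        }
    }

    // Runs the block while holding the owner's lock. Java monitors are
    // reentrant, so the block may call get/put/remove on any cache that
    // shares this owner without deadlocking.
    <T> T atomically(Supplier<T> block) {
        synchronized (owner) {
            return block.get();
        }
    }
}
```

With two caches built on the same owner, a multi-step update such as "remove from one cache, add to the other" can be passed to `atomically`, and no other thread using that owner sees the intermediate state.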
  28. 28. Caching API : Striving for efficiency. <ul><li>Choose the right balance between too fine-grained and too coarse-grained locks. </li></ul><ul><ul><li>Do not use complex lock constructs. </li></ul></ul><ul><ul><ul><li>Use plain synchronized blocks. </li></ul></ul></ul><ul><ul><li>Use lock striping techniques. </li></ul></ul>
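Lock striping with plain synchronized blocks can be sketched as follows. This is a generic illustration of the technique, not code from the talk: the key's hash selects one of a fixed number of segments, and each segment is guarded by its own monitor, so operations on different segments do not contend. The class name and stripe count are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class StripedCache<K, V> {
    // One HashMap per stripe; each map doubles as the lock for its stripe.
    private final List<Map<K, V>> segments = new ArrayList<>();

    StripedCache(int stripes) {
        for (int i = 0; i < stripes; i++) {
            segments.add(new HashMap<>());
        }
    }

    private Map<K, V> segmentFor(Object key) {
        // floorMod keeps the index non-negative for negative hash codes.
        return segments.get(Math.floorMod(key.hashCode(), segments.size()));
    }

    V get(K key) {
        Map<K, V> segment = segmentFor(key);
        synchronized (segment) {
            return segment.get(key);
        }
    }

    void put(K key, V value) {
        Map<K, V> segment = segmentFor(key);
        synchronized (segment) {
            segment.put(key, value);
        }
    }

    // Locks one stripe at a time, so the total is only a snapshot
    // if other threads are writing concurrently.
    int size() {
        int total = 0;
        for (Map<K, V> segment : segments) {
            synchronized (segment) {
                total += segment.size();
            }
        }
        return total;
    }
}
```

A single stripe degenerates to one coarse lock, while a very large stripe count approaches per-key locking; the number of stripes is the knob for the balance the slide mentions.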
  29. 29. Threads and services <ul><li>Jira periodically triggers threads: </li></ul><ul><ul><li>Remember? A clustering antipattern. </li></ul></ul><ul><li>Threaded Jira services: </li></ul><ul><ul><li>Mail sending. </li></ul></ul><ul><ul><li>Backup export. </li></ul></ul><ul><ul><li>Index optimization. </li></ul></ul>
  30. 30. Clustering threads and services : Problems <ul><li>Threads cannot be clustered. </li></ul><ul><li>We have to cluster the launched services. </li></ul><ul><ul><li>Some services must be shared among cluster nodes. </li></ul></ul><ul><ul><li>Other services must be distributed. </li></ul></ul><ul><ul><li>How to distinguish them? </li></ul></ul>
  31. 31. Clustering threads and services : Solution <ul><li>Shared services. </li></ul><ul><ul><li>Clustered through Terracotta XML configuration. </li></ul></ul><ul><ul><li>A shared service is executed only on a single node. </li></ul></ul><ul><ul><li>The default. </li></ul></ul><ul><li>Distributed services. </li></ul><ul><ul><li>Distributed through Terracotta XML configuration. </li></ul></ul><ul><ul><li>A distributed service is executed on every node. </li></ul></ul><ul><ul><li>Just implement com.atlassian.jira.service.JiraDistributedService </li></ul></ul>
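The shared versus distributed distinction can be sketched with a marker interface. The real `com.atlassian.jira.service.JiraDistributedService` is not shown in the slides, so `DistributedMarker` below is a hypothetical stand-in, and the dispatcher is a single-JVM simulation of a decision that, per the slides, is actually driven by Terracotta XML configuration. Which Jira services are shared and which are distributed is also an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

interface ClusterService {
    String name();
}

// Hypothetical stand-in for JiraDistributedService: a pure marker.
interface DistributedMarker {
}

// Shared by default: should run once for the whole cluster.
class MailService implements ClusterService {
    public String name() { return "mail"; }
}

// Opts in to running on every node by implementing the marker.
class LocalCacheFlushService implements ClusterService, DistributedMarker {
    public String name() { return "local-cache-flush"; }
}

class ServiceDispatchSketch {
    // Returns "node:service" entries describing where each service would run.
    static List<String> dispatch(List<String> nodes,
                                 List<? extends ClusterService> services) {
        List<String> runs = new ArrayList<>();
        for (ClusterService service : services) {
            if (service instanceof DistributedMarker) {
                for (String node : nodes) {
                    runs.add(node + ":" + service.name());
                }
            } else {
                // Simplification: the first node stands for "exactly one node".
                runs.add(nodes.get(0) + ":" + service.name());
            }
        }
        return runs;
    }
}
```

Keeping "shared" as the default means existing services need no change, and only services that must run on every node have to declare it.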
  32. 32. HTTP Session <ul><li>Two choices: </li></ul><ul><ul><li>Cluster it through Terracotta. </li></ul></ul><ul><ul><ul><li>Very hard. </li></ul></ul></ul><ul><ul><ul><ul><li>Again, Jira puts a lot of heavyweight objects into session. </li></ul></ul></ul></ul><ul><ul><li>Leave it unclustered. </li></ul></ul><ul><ul><ul><li>Use a load balancer with sticky sessions enabled. </li></ul></ul></ul><ul><ul><ul><ul><li>Jira is not a mission critical application. </li></ul></ul></ul></ul><ul><ul><ul><ul><li>More simplicity, less complexity. </li></ul></ul></ul></ul><ul><li>Guess what we chose ... </li></ul><ul><ul><li>Please give me that shiny new load balancer ... </li></ul></ul>
  33. 33. Dealing with external code <ul><li>Applications are often pluggable. </li></ul><ul><li>Jira has a rich plugin architecture. </li></ul><ul><li>External plugins must fit into the cluster and work in it </li></ul><ul><ul><li>It is necessary to provide simple APIs or configuration options for making cluster-ready plugins. </li></ul></ul><ul><ul><ul><li>Practical example : com.atlassian.jira.service.JiraDistributedService </li></ul></ul></ul>
  34. 34. Toward an end <ul><li>Conclusions </li></ul>
  35. 35. Summary <ul><li>Terracotta is a transparent clustering solution but ... </li></ul><ul><ul><li>You have to make a lot of decisions and trade-offs. </li></ul></ul><ul><li>If you have to access files in a clustered environment: </li></ul><ul><ul><li>Slow access: network filesystem, database system. </li></ul></ul><ul><ul><li>Fast access: use Terracotta network attached memory. </li></ul></ul><ul><li>If you have to cluster your application state: </li></ul><ul><ul><li>Carefully make it thread safe. </li></ul></ul><ul><ul><li>Choose between distributing data or behaviour. </li></ul></ul>
  36. 36. Summary <ul><li>If you have application services: </li></ul><ul><ul><li>Choose services to share. </li></ul></ul><ul><ul><ul><li>A shared service runs once per cluster. </li></ul></ul></ul><ul><ul><li>Choose services to distribute. </li></ul></ul><ul><ul><ul><li>A distributed service runs once per node. </li></ul></ul></ul><ul><li>If you have to cluster the HTTP session state: </li></ul><ul><ul><li>Consider not clustering it! </li></ul></ul><ul><li>If you have to deal with application plugins: </li></ul><ul><ul><li>Provide API hooks or configuration options. </li></ul></ul>
  37. 37. Terracotta + Jira = Scarlet <ul><li>Scarlet. </li></ul><ul><ul><li>Clusters Jira through Terracotta. </li></ul></ul><ul><ul><li>Published as a Jira extension. </li></ul></ul><ul><ul><ul><li>http://confluence.atlassian.com/x/woQuBg </li></ul></ul></ul><ul><ul><li>Open Source. </li></ul></ul><ul><ul><ul><li>We want you! </li></ul></ul></ul><ul><ul><li>Actively developed: </li></ul></ul><ul><ul><ul><li>November 06, 2007 : 1.0 Beta 1. </li></ul></ul></ul><ul><ul><ul><li>Very soon : 1.0 Beta 2. </li></ul></ul></ul>
  38. 38. The end <ul><li>Q&A </li></ul>
