Infinispan from POC to Production
By Mark Addy, Senior Middleware Consultant at C2B2
Presented at JBUG London and JUDCon Boston in June 2012

1. Infinispan from POC to Production

2. Who am I?
   Mark Addy, Senior Consultant
   Fast, Reliable, Secure, Manageable

3. Agenda
   Part 1
   • An existing production system unable to scale
   Part 2
   • A green-field project unable to meet SLAs

4. About the Customer
   • Global on-line travel & accommodation provider
     – 50 million searches per day
   • Our relationship
     – Troubleshooting
     – Workshops

5. Part 1 – Existing Application
   Connectivity Engine
   • Supplements site content with data from third parties (Content Providers)
     – Tomcat
     – Spring
     – EhCache
     – MySQL
     – Apache load-balancer / mod_jk

6. Part 1 – Logical View

7. Part 1 – Content Provider Challenges
   • Unreliable third-party systems
   • Distant network communications
   • Critical for generating local site content
   • Response time
     – Choice & low response time == more profit
8. Part 1 – Existing Cache
   • NOT Hibernate 2LC
   • Spring Interceptors wrap calls to content providers

   <bean id="searchService" class="org.springframework.aop.framework.ProxyFactoryBean">
     <property name="proxyInterfaces" value="ISearchServiceTargetBean"/>
     <property name="target" ref="searchServiceTargetBean"/>
     <property name="interceptorNames">
       <list>
         <value>cacheInterceptor</value>
       </list>
     </property>
   </bean>

   <bean id="searchServiceTargetBean" class="SearchServiceTargetBean">
     ...
   </bean>
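
   The deck shows the wiring but not the interceptor itself. A minimal sketch of
   what such a cacheInterceptor could look like, assuming an AOP Alliance
   MethodInterceptor backed by EhCache (the class name and the key derivation
   below are hypothetical, not the customer's actual code):

       import java.util.Arrays;

       import net.sf.ehcache.Cache;
       import net.sf.ehcache.Element;

       import org.aopalliance.intercept.MethodInterceptor;
       import org.aopalliance.intercept.MethodInvocation;

       // Caches the result of each intercepted service call, keyed on
       // method name plus arguments; only misses reach the content provider.
       public class CacheInterceptor implements MethodInterceptor {

           private final Cache cache;

           public CacheInterceptor(Cache cache) {
               this.cache = cache;
           }

           public Object invoke(MethodInvocation invocation) throws Throwable {
               String key = invocation.getMethod().getName()
                       + Arrays.deepToString(invocation.getArguments());

               Element cached = cache.get(key);
               if (cached != null) {
                   return cached.getObjectValue();   // hit: skip the remote call
               }

               Object result = invocation.proceed(); // miss: call the content provider
               cache.put(new Element(key, result));
               return result;
           }
       }

   Because the ProxyFactoryBean routes every call through this interceptor,
   callers get caching transparently.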
9. Part 1 – Extreme Redundancy
   800,000 elements
   10 nodes = 10 copies of the data

10. Part 1 – The Price
    • 10G JVM heap
      – 10-12 second pauses for major GC
      – Over 8G of heap is cache
    • Eviction before expiry
      – More trips to content providers
    • EhCache expiry / eviction piggybacks client cache access

11. Part 1 – How to Scale?

12. Part 1 – Objectives
    • Reduce JVM heap size
      – 10 second pauses are too long
    • Increase cache capacity
    • Remove eviction
      – Cache entries should expire naturally
    • Improve response times
      – Latency decreases if eviction, GC pauses and GC frequency are reduced

13. Part 1 – Discussions
    • Pre-sales workshop
      – Express Terracotta EhCache
      – Oracle Coherence
      – Infinispan

14. Part 1 – Why Infinispan?
    • Open source advocates
    • Cutting edge technology

15. Part 1 – Benchmarking
    • Must be reproducible
    • Must accurately reflect the production load and data
      – 50 million searches / day == 600 / sec
    • Must be able to imitate the content providers

16. Part 1 – Solution
    • Replica load-test environment
    • Port mirror production traffic
      – Capture incoming requests
      – Capture content provider responses
    • Custom JMeter script
    • Mock application Spring beans

17. Part 1 – Benchmarking Architecture
18. Part 1 – Benchmarking Validation
    • Understand your cached data
      – jmap

        jmap -dump:file=mydump.hprof <pid>

      – Eclipse Memory Analyzer – OQL

        SELECT toString(x.key),
               x.key.@retainedHeapSize,
               x.value.@retainedHeapSize
        FROM net.sf.ehcache.Element x

19. Part 1 – Benchmarking Validation
    Extract cached object properties – you can learn a lot quickly
      – creationTime
      – timeToLive
      – lastAccessTime
      – timeToIdle
      – lastUpdateTime
      – hitCount
      – etc.
20. Part 1 – Enable JMX for Infinispan
    Enable CacheManager statistics:

    <global>
      <globalJmxStatistics enabled="true"
                           jmxDomain="org.infinispan"
                           cacheManagerName="MyCacheManager"/>
      ...
    </global>

    Enable cache statistics:

    <default>
      <jmxStatistics enabled="true"/>
      ...
    </default>
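
    With statistics enabled, the counters can also be read programmatically
    through the standard JMX API. A minimal sketch; the component=Statistics
    key and the hits / misses attribute names are assumptions based on
    Infinispan's JMX layout, which varies between versions:

        import java.lang.management.ManagementFactory;
        import java.util.Set;

        import javax.management.MBeanServer;
        import javax.management.ObjectName;

        // Dumps hit/miss counters for every cache statistics MBean
        // registered under the org.infinispan JMX domain configured above.
        public class CacheStatsDump {

            public static void main(String[] args) throws Exception {
                MBeanServer server = ManagementFactory.getPlatformMBeanServer();

                // Wildcard query avoids hard-coding the full ObjectName layout
                Set<ObjectName> names = server.queryNames(
                        new ObjectName("org.infinispan:component=Statistics,*"), null);

                for (ObjectName name : names) {
                    System.out.println(name
                            + " hits=" + server.getAttribute(name, "hits")
                            + " misses=" + server.getAttribute(name, "misses"));
                }
            }
        }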
21. Part 1 – Enable Remote JMX

    -Dcom.sun.management.jmxremote.port=nnnn
    -Dcom.sun.management.jmxremote.authenticate=false
    -Dcom.sun.management.jmxremote.ssl=false

22. Part 1 – Record Performance
    • RHQ http://rhq-project.org
      – JVM memory, GC profile, CPU usage
      – Infinispan plugin

23. Part 1 – Infinispan
24. Part 1 – Distributed Mode
    hash(key) determines owners
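
    As an illustration of the idea only (not Infinispan's actual consistent
    hash implementation), owner selection can be sketched like this; the node
    list and the modulo placement are simplifications:

        import java.util.ArrayList;
        import java.util.List;

        // Each node occupies a position on a hash "wheel"; a key's owners are
        // the numOwners consecutive nodes clockwise from hash(key).
        public class HashWheel {

            private final List<String> nodes; // assumed ordered by wheel position

            public HashWheel(List<String> nodes) {
                this.nodes = nodes;
            }

            public List<String> locateOwners(Object key, int numOwners) {
                // Mask the sign bit so the index is always non-negative
                int start = (key.hashCode() & Integer.MAX_VALUE) % nodes.size();
                List<String> owners = new ArrayList<String>();
                for (int i = 0; i < numOwners; i++) {
                    owners.add(nodes.get((start + i) % nodes.size()));
                }
                return owners;
            }
        }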
25. Part 1 – Distribution Features
    • Configurable redundancy
      – numOwners
    • Dynamic scaling
      – Automatic rebalancing for distribution and recovery of redundancy
    • Replication (distribution) overhead does not increase as more nodes are added
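
    A sketch of the equivalent programmatic set-up, assuming the fluent
    configuration API that arrived in Infinispan 5.1 (the XML equivalent is
    <hash numOwners="2"/> inside a distributed clustering element):

        import org.infinispan.configuration.cache.CacheMode;
        import org.infinispan.configuration.cache.Configuration;
        import org.infinispan.configuration.cache.ConfigurationBuilder;
        import org.infinispan.configuration.global.GlobalConfigurationBuilder;
        import org.infinispan.manager.DefaultCacheManager;

        // Distributed mode with numOwners=2: every entry is held on exactly
        // two nodes, so one node can leave without losing data.
        public class DistCacheConfig {
            public static void main(String[] args) {
                Configuration cfg = new ConfigurationBuilder()
                        .clustering()
                            .cacheMode(CacheMode.DIST_SYNC)
                            .hash().numOwners(2)
                        .build();

                DefaultCacheManager manager = new DefaultCacheManager(
                        GlobalConfigurationBuilder.defaultClusteredBuilder().build(), cfg);
                manager.getCache().put("key", "value");
                manager.stop();
            }
        }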
26. Part 1 – Hotrod
    • Client – Server architecture
      – Java client
      – Connection pooling
      – Dynamic scaling
      – Smart routing
    • Separate application and cache memory requirements
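
    From the application side, a Hotrod client looks roughly like this; a
    minimal sketch assuming the Java client API of the Infinispan 4.x / 5.x
    era, with a hypothetical server list and cache name:

        import org.infinispan.client.hotrod.RemoteCache;
        import org.infinispan.client.hotrod.RemoteCacheManager;

        // The client pools connections to the server cluster and routes each
        // request to the node that owns the key ("smart routing").
        public class HotRodClientExample {
            public static void main(String[] args) {
                // Initial server list; the client learns the full cluster
                // topology from whichever server it reaches first.
                RemoteCacheManager manager =
                        new RemoteCacheManager("server1:11222;server2:11222");

                RemoteCache<String, String> cache = manager.getCache("searchCache");
                cache.put("key", "value");
                System.out.println(cache.get("key"));

                manager.stop();
            }
        }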
27. Part 1 – Application – Cache Separation
    Application
    • CPU intensive
    • High infant mortality
    Cache
    • Low CPU requirement
    • Mortality linked to expiry / eviction

28. Part 1 – Hotrod Architecture

29. Part 1 – Remember this is cutting edge
    • Latest final release was 4.2.1
    • Let's get cracking...
      – Distributed mode
      – Hotrod client
      – What issues did we encounter...

30. Part 1 – Topology Aware Consistent Hash
    • Ensure back-ups are held preferentially on a separate machine, rack and site
    • https://community.jboss.org/thread/168236
    • We need to upgrade to the latest 5.0.0.CR
31. Part 1 – Virtual Nodes
    Sub-divides hash wheel positions:

    <hash numVirtualNodes="2"/>
32. Part 1 – Virtual Nodes
    • Improves data distribution
    • But didn't work at the time for Hotrod
    • https://issues.jboss.org/browse/ISPN-1217

33. Part 1 – Hotrod Concurrent Start-up
    • Dynamic scaling
      – A replicated ___hotRodTopologyCache holds the current cluster topology
      – New starters must lock and update this cache to add themselves to the current view
      – Deadlock!
      – https://issues.jboss.org/browse/ISPN-1182
    • Stagger start-up

34. Part 1 – Hotrod Client Failure Detection
    Unable to recover from cluster splits

35. Part 1 – Hotrod Client Failure Detection
    • New servers are only added to ___hotRodTopologyCache on start-up
    • Restart required to re-establish the client topology view

36. Part 1 – Hotrod Server Cluster Meltdown

37. Part 1 – Hotrod Server Cluster Meltdown
    • Clients can't start without an available server
    • Static configuration is only read once
    • To restart client-server communications, either
      – Restart the last "known" server
      – Restart the client

38. Part 1 – Change of Tack
    • Hotrod abandoned, for now
      – Data distribution
      – Concurrent start-up
      – Failure detection
      – Unacceptable for this customer
    • Enter the classic embedded approach
    • How did we get this to work...

39. Part 1 – Dynamic Scaling
    • Unpredictable under heavy load, writers blocked
      – Unacceptable waits for this system

        <hash numOwners="2" rehashEnabled="false"/>

      – Accept some data loss during a leave / join
    • Chunked rehashing / state transfer (5.1)
      – https://issues.jboss.org/browse/ISPN-284
    • Non-blocking state transfer
      – https://issues.jboss.org/browse/ISPN-1424
    • Manual rehashing
      – https://issues.jboss.org/browse/ISPN-1394

40. Part 1 – Cache Entry Size
    • Average cache entry ~6K
      – 1 million entries = 6GB
      – Hotrod stores serialized entries by default
    • JBoss Marshalling
      – Default Infinispan mechanism
      – Get a reference from the ComponentRegistry
    • JBoss Serialization
      – Quick, easy to implement
41. Part 1 – Compression Considerations
    • Trade
      – Capacity in the JVM vs serialization overhead
    • Suitability
      – Assess on a cache-by-cache basis
      – Very high access is probably too expensive
    • Average 6K reduced to 1K
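
    The deck does not show the compression code. A minimal sketch of the idea
    using only the JDK's object serialization and GZIP streams (the helper
    class is hypothetical; slide 40 names JBoss Marshalling and JBoss
    Serialization as the mechanisms actually considered):

        import java.io.ByteArrayInputStream;
        import java.io.ByteArrayOutputStream;
        import java.io.IOException;
        import java.io.ObjectInputStream;
        import java.io.ObjectOutputStream;
        import java.io.Serializable;
        import java.util.zip.GZIPInputStream;
        import java.util.zip.GZIPOutputStream;

        // Trades CPU (serialize + deflate on write, inflate + deserialize on
        // read) for heap capacity: the ~6K average entry shrank to ~1K.
        public final class CompressedEntry {

            public static byte[] compress(Serializable value) throws IOException {
                ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                ObjectOutputStream out =
                        new ObjectOutputStream(new GZIPOutputStream(bytes));
                out.writeObject(value);
                out.close(); // flushes and finishes the GZIP stream
                return bytes.toByteArray();
            }

            public static Object decompress(byte[] data)
                    throws IOException, ClassNotFoundException {
                ObjectInputStream in = new ObjectInputStream(
                        new GZIPInputStream(new ByteArrayInputStream(data)));
                try {
                    return in.readObject();
                } finally {
                    in.close();
                }
            }
        }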
42. Part 1 – Advanced Cache Tuning

    cache.getAdvancedCache().withFlags(Flag... flags)

    • Flag.SKIP_REMOTE_LOOKUP
      – Prevents remote gets being run for an update

        put(K key, V value)
        DistributionInterceptor.remoteGetBeforeWrite()
        DistributionInterceptor.handleWriteCommand()
        DistributionInterceptor.visitPutKeyValueCommand()

      – We don't need the previous cache entry value to be returned
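
    Applied per invocation, the flag looks like this; a minimal sketch, with
    the cache assumed to be an embedded Cache<String, String>:

        import org.infinispan.Cache;
        import org.infinispan.context.Flag;

        public class FlaggedPut {

            // Skip the remote lookup a distributed put() normally performs to
            // fetch the previous value, since we ignore the returned value.
            static void update(Cache<String, String> cache, String key, String value) {
                cache.getAdvancedCache()
                     .withFlags(Flag.SKIP_REMOTE_LOOKUP)
                     .put(key, value);
            }
        }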
43. Part 1 – JGroups
    • UDP out-performed TCP (for us)
    • Discovery
      – For a cold, full-cluster start-up, avoid split brain / merge scenarios

        <PING timeout="3000" num_initial_members="10"/>

    • Heartbeat
      – Ensure failure detection is configured appropriately

        <FD_ALL interval="3000" timeout="10000"/>
44. Part 1 – Extending Embedded

45. Part 1 – Current Production System
    • Over 20 nodes
      – 8 request facing, the remainder storage only
    • Over 15 million entries
      – 7.5 million unique
      – 20GB of cached data
      – Nothing is evicted before natural expiration
    • 5GB JVM heap, 3-4 second GC pauses
    • 30% reduction in response times

46. Part 1 – Summary
    • Don't compromise on the benchmarking
      – Understand your cached data profile
      – Functional testing is NOT sufficient
      – Monitoring and analysis are essential
    • Tune virtual nodes for the best distribution
    • Mitigate the memory usage of an embedded cache
      – Consider compressing embedded cache entries
      – Non request-facing, storage-only nodes
    • Distributed Infinispan out-performs EhCache
    • Don't rule Hotrod out
      – Not acceptable for this customer
      – Many improvements and bug fixes
47. Part 2 – Green Field SLAs
    New Pricing Engine
      – Tomcat
      – Spring & Grails
      – Infinispan
      – Oracle RAC
      – Apache load-balancer / mod_jk
    Historical Pricing Engine
      – EhCache
      – MySQL
      – 2 second full Paris query
48. Part 2 – Logical View
    • New Pricing Engine
      – Side-by-side rollout
      – A controller determines where to send requests and aggregates the results
      – NOT Hibernate 2LC
      – Spring Interceptors containing the logic to check / update the cache wrap the DB calls that extract and generate cache entries
49. Part 2 – Proposed Caching
    • Everything distributed
      – It worked before, so we just turn it on, right?
50. Part 2 – The Pain
    • Distributed mode
      – Network saturation on a 1Gb switch (125MB/second) under load
      – Contention in org.jgroups
    • Performance SLAs
      – Caching data in Local mode required a 14G heap & brought 20 second GC pauses
    • Aggressive rollout strategy
      – Struggling at low user load

51. Part 2 – Cache Analysis
    • Eclipse Memory Analyzer
      – Identify the cache profile
      – A small subset of elements accounts for almost all the space
      – Paris "Rates" sizes: 20K – 1.6MB
      – Paris search (500 rates records) == 50MB total
      – 1Gb switch max throughput = 125MB/second

52. Part 2 – Revised Caching
    • Local caching for the numerous "small" elements
    • Distributed for the "large", expensive elements

53. Part 2 – Distributed Issue
    • Here's why normal distributed mode doesn't work
      – One Paris request requires 500 rates records (50MB)
      – 10 node distributed cluster = 1 in 5 chance the data is local (with numOwners=2, each entry lives on 2 of the 10 nodes)
      – 80% remote gets == 40MB of network traffic per request
54. Part 2 – Options
    • Rewrite the application caching logic
      – Significantly reduce the element size
    • Run local caching with an oversized heap
      – Daily restarts to eliminate full GC pauses
      – Large memory investment and careful management
    • Sacrifice caching and hit the DB
      – Hits response times and hammers the database
    • Distributed Execution?
      – Send a task to the data and extract just what you need
55. Part 2 – Change in Psychology...
    If the mountain will not come to Muhammad, then Muhammad must go to the mountain
56. Part 2 – Distributed Execution
    • DefaultExecutorService
      – http://docs.jboss.org/infinispan/5.1/apidocs/org/infinispan/distexec/DefaultExecutorService.html
    • Create the distributed execution service to run on the specified cache node

        public DefaultExecutorService(Cache masterCacheNode)

    • Run a task on the primary owner of the key input

        public Future<T> submit(Callable<T> task, K... input)

      – Resolves the primary owner of the key, then either
        • Runs locally, or
        • Issues a remote command and runs on the owning node
57. Part 2 – Pricing Controller
    • Callable task
      – Contains code to
        • Grab a reference to the local Spring context
        • Load the required beans
        • Spring interceptor checks the cache at the owning node (local get) [existing code]
        • If not found, go to the database, retrieve and update the cache
        • Extract pricing based on the request criteria
        • Return the results
58. Part 2 – Pricing Controller
    • Create a new DefaultExecutorService
      – Create the callable tasks required to satisfy the request
      – Issue the callable tasks concurrently

        while (moreKeys) {
            Callable<T> callable = new MyCallable<T>(...);
            Future<T> future = distributedExecutorService.submit(callable, key);
            ...
        }

      – Collate the results and assemble the response

        while (moreFutures) {
            T result = future.get();
        }
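
    Fleshed out, the pattern looks roughly like this; a sketch assuming the
    Infinispan 5.1 distexec API, with hypothetical task, key and value types
    (the deck's real task bootstraps Spring and reuses the existing interceptor):

        import java.io.Serializable;
        import java.util.ArrayList;
        import java.util.List;
        import java.util.Set;
        import java.util.concurrent.Future;

        import org.infinispan.Cache;
        import org.infinispan.distexec.DefaultExecutorService;
        import org.infinispan.distexec.DistributedCallable;
        import org.infinispan.distexec.DistributedExecutorService;

        // Runs on the node owning the rates record and returns only the
        // extracted price, not the 20K-1.6MB cache entry itself.
        class ExtractPriceTask
                implements DistributedCallable<String, Object, Double>, Serializable {

            private Cache<String, Object> cache;
            private Set<String> keys;

            public void setEnvironment(Cache<String, Object> cache, Set<String> inputKeys) {
                this.cache = cache; // local reference on the owning node
                this.keys = inputKeys;
            }

            public Double call() {
                // Local get on the owner; a real task would fall back to the
                // database on a miss and apply the request criteria.
                Object rates = cache.get(keys.iterator().next());
                return extractPrice(rates);
            }

            private Double extractPrice(Object rates) {
                return 0.0; // placeholder for the pricing logic
            }
        }

        public class PricingController {

            public List<Double> price(Cache<String, Object> ratesCache,
                                      List<String> ratesKeys) throws Exception {
                DistributedExecutorService distExec =
                        new DefaultExecutorService(ratesCache);

                // Fan out: one task per key, routed to the key's primary owner
                List<Future<Double>> futures = new ArrayList<Future<Double>>();
                for (String key : ratesKeys) {
                    futures.add(distExec.submit(new ExtractPriceTask(), key));
                }

                // Collate: block for each result and assemble the response
                List<Double> prices = new ArrayList<Double>();
                for (Future<Double> future : futures) {
                    prices.add(future.get());
                }
                return prices;
            }
        }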
59. Part 2 – Distributed Execution
    • Only the relevant information from the cache entry is returned

60. Part 2 – Results
    • Latency – Paris search
      – Historic engine: 2 seconds
      – Dist-Exec: 200ms
    • JVM
      – 5GB heap
      – 3-4 second GC pauses

61. Part 2 – Limitations
    • Failover
      – A task is sent to the primary owner only
      – https://community.jboss.org/wiki/Infinispan60-DistributedExecutionEnhancements
      – Handle failures yourself
    • Hotrod not supported
      – This would be fantastic!
      – https://issues.jboss.org/browse/ISPN-1094
    • Both in 6.0?

62. Part 2 – Summary
    • Analysis and re-design of the cached data
    • Accessing large data sets requires an alternative access pattern
    • Dramatically reduced latency
      – Parallel execution
      – A fraction of the data transferred across the wire
    • Execution failures must be handled by application code, at the moment...
63. Thanks for Listening! Any Questions?
