GeeCon – 'www.NoSQL.com' by Mark Addy

The talk, split into two sections and drawing from a real implementation, covers potential NoSQL architectures and the features they offer in the quest to reach the Holy Grail, AKA linear scalability. First we examine an existing traditional replicated cache running at full capacity, and how replacing it with a distributed solution allowed the application to scale dramatically, with a performance improvement to boot. Secondly we look at one of the pitfalls of distributed caching and how, using an essential tool from the data grid armoury, namely grid execution, we can provide massive scale-out and low latencies. Both parts describe a real-world implementation delivered for a global online travel agency using Infinispan. Level: all levels; the talk is aimed at developers, architects and middleware specialists interested in caching solutions. Focus: use-cases including technical detail.

Speaker notes
  • Two parts to this presentation: Part 1 – replacement of replicated EhCache with distributed Infinispan; Part 2 – using Infinispan's distributed execution framework to enable a new project to meet SLAs.
  • Accommodation and associated add-ons (excursions, day trips, food etc.). Content supplied from their own 'catalogue' plus data from third parties to supplement offerings. USP is response time – they are quicker than their competitors! Existing relationship: JBoss workshop, troubleshooting JBoss Messaging, troubleshooting Drools, diagnosis and fixes for memory leaks, diagnosis and fixes for 100% CPU race conditions.
  • Connectivity Engine: responsible for retrieving third-party offerings for presentation on the local site.
  • Above the line is local to the customer; below the line is over the WWW to third-party content providers. Third parties may then make further trips over the WWW to the source hotel chains for data. NOTE that the third parties also attempt to cache the remote hotel-chain data locally.
  • Third-party systems may have periods of downtime, and there is network latency associated with trips to them. Local site content relies on third-party content to provide end-user choice. Response time is critical – studies show that users will navigate away. More choice + low response times == more profit.
  • Cache for third-party data: saves on remote network trips, faster response times. NOT Hibernate – hand-cranked. A local database is consulted to determine which third-party providers have content for a search. Spring interceptors separate the caching code from the business logic; Spring configuration enables/disables caching, so the ops team have control via an external spring-config.xml.
  • Existing cache: a mixture of local and replicated. 10 application nodes == 10x copies of data. Stateless requests. Unable to increase max entries – huge full GC pauses. Would like to make all caches replicated for redundancy and to reduce latency.
  • Ideally the heap should be no more than 4G. Elements are evicted prior to natural expiry, resulting in more trips to third parties. Eviction and expiry occur in the client thread, slowing down data retrieval. The cache contains many stale elements.
  • Unable to cache all data from the existing third parties effectively, with more providers coming online and more hotel offerings.
  • This is what we said we would do!
  • Terracotta – simple XML configuration to extend the existing EhCache, bytecode enhancement; Terracotta server array – static hash, restart to scale. Coherence – the Rolls-Royce grid – very expensive, full features including map-reduce. Infinispan – free, cutting edge, open source – since the start of 2010.
  • First stage is to replicate the live system. Must be able to repeatedly simulate live traffic for comparison. YOU CAN'T BENCHMARK AGAINST 3rd PARTIES.
  • Replicate the LIVE infrastructure – this took some persuasion; we tried small virtualised instances – no go. Port mirroring is the most effective way to capture REAL traffic; simulating use-cases manually is not effective enough. JMeter replays the live traffic – captured traffic is converted to files and loaded by a custom JMeter app. Mock Spring beans – we can't match captured requests to captured responses.
  • Multiple JMeter instances – protect against overloading JMeter and it becoming a bottleneck. Index.php picks a response at random; the mocked application Spring beans process it as if it were a matching (valid) response.
  • Get a good understanding of what's in the cache: EhCache Monitor (not great for large caches), EhCache JMX, heap dump & OQL.
  • Deep visibility into the properties of your cached data: size, time-to-live, time-to-idle, data properties etc.
  • Cache manager statistics: current cluster view, JGroups transport statistics, bytes transferred/received etc. Cache statistics: hits/misses, current count.
  • Enable attachment of a remote JMX client: JConsole, JVisualVM.
  • RHQ is a free monitoring tool; Infinispan is distributed with a plugin. The plugin collects metrics and is customisable. RHQ also collects platform statistics and allows historical comparisons. The graph shows full GC time per minute (PS MarkSweep).
  • We started out on the latest FINAL release, 4.2.1, and went live on 5.0.0.CR8.
  • Explain distributed mode: dynamic scaling with numOwners=2, i.e. redundancy = 2 copies. The hash of the KEY determines which cache nodes hold the data.
  • Summarises the last slide: numOwners = level of redundancy; dynamic scaling = automatic rebalancing of data for distribution and recovery of redundancy; the overhead of replication is constant no matter how many nodes are added.
  • Other client-server architectures: REST (HTTP) – no smart routing, dynamic scaling would involve a HW/SW load balancer, and you assemble HTTP requests yourself; memcached – no smart routing, no dynamic scaling; WebSocket – experimental JavaScript API implementation. Hotrod: 100% Java client, connection pooling based on commons-pool (NOTE this may be removed). Dynamic scaling – the cluster topology is returned to clients as part of the request. Smart routing – clients are "aware" of the hashing algorithm used to distribute data between server nodes so can route requests appropriately. Cache and application have distinct and separate memory/CPU requirements.
  • Graph from Sun's JVM 1.5 performance tuning documentation. Discuss why it's better to separate the cache and application JVMs.
  • Visually, this is how the Hotrod architecture looks.
  • Option of defining machine, rack and site IDs as part of the Infinispan configuration. Doing this improves redundancy: machine failure OR rack failure OR site failure should not result in data loss. Didn't work in 4.2.1 – we upgraded.
  • Data owners are determined by hash-wheel locations. The standard hash algorithm doesn't give great data distribution; virtual nodes subdivide the "hash-wheel" positions == better distribution. NOW the wheel spans MIN_INT -> MAX_INT and virtual nodes are turned ON by default (40?).
  • RHQ graph shows the improvement in distribution with more virtual nodes; we found 20 gave the best results.
  • Dynamic scaling and smart routing: ___hotRodTopologyCache is an internal, fully replicated cache.
  • Talk through SPLIT BRAIN: new servers are only added to ___hotRodTopologyCache on start-up, so failed nodes must be restarted.
  • Requires careful management – it wasn't appropriate for this customer. We have other customers that are happy to manage these "features".
  • Hotrod wasn't appropriate for this customer, so we started looking at what we could achieve with embedded mode.
  • More trouble with dynamic scaling – we disabled rehashing/rebalancing. A new hash is still generated on leaves/joins but there is no data movement, so you get a small "loss" of data – it becomes unreachable when the new positions on the hash wheel are determined. Non-blocking state transfer (NBST) – cf. GemFire/Coherence. Manual/on-demand rehashing is ESSENTIAL – e.g. adding 10 new nodes to a 10-node cluster == 10 rehashes!
  • Embedded compromises the JVM heap size. JBoss Marshalling is Infinispan's standard method for serializing onto the wire (used in Hotrod); JBoss Serialization is very simple, high-performance serialization. We coded both ways with a switch to choose: JBM – better compression, slightly worse performance; JBS – slightly worse compression, slightly better performance.
  • Compression is not suitable for all use-cases – it requires additional CPU. Test before assuming it is the solution; it is best for low-frequency access to large items.
  • org.infinispan.Cache extends java.util.Map (via java.util.concurrent.ConcurrentMap), so an embedded cache is used like a plain map – see the sketch below.
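    A minimal embedded-usage sketch (the configuration file name and cache name are illustrative, not from the talk), assuming the Infinispan 5.x embedded API:

        import org.infinispan.Cache;
        import org.infinispan.manager.DefaultCacheManager;

        public class EmbeddedCacheExample {
            public static void main(String[] args) throws Exception {
                // Boot an embedded cache manager from an XML configuration (file name illustrative)
                DefaultCacheManager manager = new DefaultCacheManager("infinispan.xml");
                Cache<String, String> cache = manager.getCache("contentCache");

                // Standard java.util.Map-style operations
                cache.put("searchKey", "searchResult");
                String value = cache.get("searchKey");
                System.out.println(value);

                manager.stop();
            }
        }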
  • Failure detection based on simple heartbeat protocol. Every member periodically multicasts a heartbeat. Every member also maintains a table of all members (minus itself). When data or a heartbeat from P are received, we reset the timestamp for P to the current time. Periodically, we check for expired members, and suspect those. Example: <FD_ALL interval="3000" timeout="10000"/> In the example above, we send a heartbeat every 3 seconds and suspect members if we haven't received a heartbeat (or traffic) for more than 10 seconds. Note that since we check the timestamps every 'interval' milliseconds, we will suspect a member after roughly 4 * 3s == 12 seconds. If we set the timeout to 8500, then we would suspect a member after 3 * 3 secs == 9 seconds.
  • Not everything needs to be request-facing: increase capacity further by using non-request-facing nodes; this helps keep the JVM size as small as possible.
  • Brand new application – Infinispan-based from the start!
  • Controller == router & aggregator: it determines which engine will generate results and aggregates the results. Same caching architecture as before – not Hibernate, manual Spring/AspectJ interceptor based. Caches expensive hotel pricing entries built from the local DB – NO THIRD-PARTY DATA.
  • The belief was that the same architecture used for the Connectivity Engine could be used again. Then they called us...
  • Distributed mode blew the network bandwidth – all the code sat in org.jgroups when we took thread dumps. To meet SLAs (i.e. as fast as possible) we ran Infinispan locally with a 14G heap, but with unacceptable pauses.
  • First we analysed the cached data: RATES records account for very few entries but most of the size.
  • Next we re-architected the caches: numerous very small elements == LOCAL; few, very large elements == DISTRIBUTED. Still problems though...
  • Explain why distributed mode doesn't work for large data items.
  • Rewrite the application caching logic – late in the day, very expensive in development cost, and you don't want to lose any data richness. Oversized heap with CMS – a dangerous, last-resort GC approach – expensive in CPU, requires very careful monitoring and management, and very expensive in terms of hardware (some financial platforms do this!). Go back to the DB – response times suffer and database performance suffers.
  • Without distributed execution / map-reduce a data grid is just a big cache... Meaning: if one's will does not prevail, one must submit to an alternative. Origin: the full phrase 'If the mountain will not come to Muhammad, then Muhammad must go to the mountain' arises from the story of Muhammad, as retold by Francis Bacon in Essays, 1625: "Mahomet cald the Hill to come to him. And when the Hill stood still, he was neuer a whit abashed, but said; If the Hill will not come to Mahomet, Mahomet wil go to the hil." Present uses of the phrase usually use the word 'mountain' rather than 'hill', and this version appeared soon after Bacon's Essays in a work by John Owen, 1643: "If the mountaine will not come to Mahomet, Mahomet will goe to the mountaine." The early citations use various spellings of the name of the founder of the Islamic religion – Muhammad, Mahomet, Mohammed, Muhammed etc.
  • DefaultExecutorService implements java.util.concurrent.ExecutorService.
  • How much code do I need to write? Not much! Make use of the existing AspectJ cache interceptor. The task runs on the data-owning node – if the data isn't there then it retrieves and caches it as part of the task. Return just what you need.
  • Concurrent execution: collate results using future.get(); handle failures yourself.
  • Describe Distributed Execution
Transcript

    1. NoSQL.com
    2. Who am I?
       Mark Addy, Senior Consultant
       Fast, Reliable, Secure, Manageable
    3. Agenda
       Part 1
       • An existing production system unable to scale
       Part 2
       • A green-field project unable to meet SLAs
    4. About the Customer
       • Global on-line travel & accommodation provider
         – 50 million searches per day
       • Our relationship
         – Troubleshooting
         – Workshops
    5. Part 1 – Existing Application
       Connectivity Engine
       • Supplements site content with data from third parties (Content Providers)
         – Tomcat
         – Spring
         – EhCache
         – MySQL
         – Apache load-balancer / mod_jk
    6. Logical View
    7. Content Provider Challenges
       • Unreliable third party systems
       • Distant network communications
       • Critical for generating local site content
       • Response time
       • Choice & low response time == more profit
    8. Existing Cache
       • This is NOT Hibernate 2LC
       • Spring Interceptors wrap calls to content providers
       <bean id="searchService" class="org.springframework.aop.framework.ProxyFactoryBean">
         <property name="proxyInterfaces" value="ISearchServiceTargetBean"/>
         <property name="target" ref="searchServiceTargetBean"/>
         <property name="interceptorNames">
           <list>
             <value>cacheInterceptor</value>
           </list>
         </property>
       </bean>
       <bean id="searchServiceTargetBean" class="SearchServiceTargetBean">...</bean>
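       A minimal sketch of the kind of caching interceptor the "cacheInterceptor" bean above could point at (the class name, key scheme and ConcurrentMap-backed store are illustrative, not the customer's code):

           import java.util.Arrays;
           import java.util.concurrent.ConcurrentMap;
           import org.aopalliance.intercept.MethodInterceptor;
           import org.aopalliance.intercept.MethodInvocation;

           public class CacheInterceptor implements MethodInterceptor {

               // Injected via Spring; an Infinispan Cache fits directly since it extends ConcurrentMap
               private ConcurrentMap<String, Object> cache;

               public void setCache(ConcurrentMap<String, Object> cache) {
                   this.cache = cache;
               }

               public Object invoke(MethodInvocation invocation) throws Throwable {
                   // Build a cache key from the intercepted method and its arguments
                   String key = invocation.getMethod().getName() + ":" + Arrays.toString(invocation.getArguments());
                   Object cached = cache.get(key);
                   if (cached != null) {
                       return cached;                        // cache hit: skip the content-provider call
                   }
                   Object result = invocation.proceed();     // cache miss: call the real target bean
                   if (result != null) {
                       cache.put(key, result);
                   }
                   return result;
               }
           }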
    9. Extreme Redundancy
       800,000 elements
       10 nodes = 10 copies of data
    10. The Price
       • 10G JVM Heap
         – 10-12 second pauses for major GC
         – Over 8G of heap is cache
       • Eviction before Expiry
         – More trips to content providers
       • EhCache expiry & eviction
         – Piggybacks client cache access, locking
    11. How to Scale?
    12. Objectives
       • Reduce JVM Heap Size
         – 10 second pauses are too long
       • Increase cache capacity
         – Fewer requests to providers
       • Remove Eviction
         – Cache entries should expire naturally
       • Improve Response Times
         – Latency decreases if eviction, GC pauses and frequency are reduced
    13. Discussions
       • Pre-sales workshop
       • Infinispan Selected
         – Open source advocates
         – Cutting edge technology
    14. Benchmarking
       • Must be reproducible
         – We want to compare and tune
       • Must reflect accurately the production load and data
         – 50 million searches / day == 600 / sec
       • Must be able to imitate the content providers
         – We can't load test against our partners!
    15. Solution
       • Replica load-test environment
       • Port mirror production traffic
         – Capture incoming requests
         – Capture content provider responses
       • Custom JMeter script
       • Mock application Spring Beans
    16. Benchmarking Architecture
    17. Benchmarking Validation
       • Understand your cached data
         – jmap
             jmap -dump:file=mydump.hprof <pid>
         – Eclipse Memory Analyzer
         – OQL
             SELECT toString(x.key), x.key.@retainedHeapSize, x.value.@retainedHeapSize
             FROM net.sf.ehcache.Element x
    18. Benchmarking Validation
       Extract cached object properties – you can learn a lot quickly
         – creationTime
         – lastAccessTime
         – lastUpdateTime
         – hitCount
         – timeToLive
         – timeToIdle
         – etc
    19. Enable JMX for Infinispan
       Enable CacheManager statistics:
         <global>
           <globalJmxStatistics enabled="true"
                                jmxDomain="org.infinispan"
                                cacheManagerName="MyCacheManager"/>
           ...
         </global>
       Enable Cache statistics:
         <default>
           <jmxStatistics enabled="true"/>
           ...
         </default>
    20. Enable Remote JMX
       -Dcom.sun.management.jmxremote.port=nnnn
       -Dcom.sun.management.jmxremote.authenticate=false
       -Dcom.sun.management.jmxremote.ssl=false
    21. Record Performance
       • RHQ
         – http://rhq-project.org
         – JVM memory, GC profile, CPU usage
         – Infinispan plugin
    22. Infinispan
    23. Distributed Mode
       hash(key) determines owners
    24. Distribution Features
       • Configurable Redundancy
         – numOwners
       • Dynamic Scaling
         – Automatic rebalancing for distribution and recovery of redundancy
       • Replication Overhead
         – Does not increase as more nodes are added in distributed mode
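       A minimal distributed-mode configuration sketch in the Infinispan 5.x XML style used elsewhere in the deck (the cluster name, cache name and expiry value are illustrative):

           <infinispan>
             <global>
               <transport clusterName="content-cluster"/>
             </global>
             <namedCache name="contentCache">
               <clustering mode="distribution">
                 <hash numOwners="2"/>
                 <sync/>
               </clustering>
               <expiration lifespan="3600000"/>
             </namedCache>
           </infinispan>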
    25. Hotrod
       • Client-server architecture
         – Java client
         – Connection pooling
         – Dynamic scaling
         – Smart routing
       • Separate application and cache memory requirements
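       A minimal Hotrod client sketch (server addresses and cache name are illustrative), assuming the Infinispan 5.x Java client:

           import org.infinispan.client.hotrod.RemoteCache;
           import org.infinispan.client.hotrod.RemoteCacheManager;

           public class HotrodClientExample {
               public static void main(String[] args) {
                   // Initial server list; the client learns the full cluster topology from server responses
                   RemoteCacheManager manager =
                           new RemoteCacheManager("cache1.example.com:11222;cache2.example.com:11222");
                   RemoteCache<String, String> cache = manager.getCache("contentCache");

                   cache.put("searchKey", "searchResult");   // routed to an owning node via the consistent hash
                   String value = cache.get("searchKey");
                   System.out.println(value);

                   manager.stop();
               }
           }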
    26. Application – Cache Separation
       Application
       • CPU intensive
       • High infant mortality
       Cache
       • Low CPU requirement
       • Mortality linked to expiry / eviction
    27. Hotrod Architecture
    28. Remember this is cutting edge
       • Latest final release was 4.2.1
       • Let's get cracking...
         – Distributed mode
         – Hotrod client
         – What issues did we encounter...
    29. Topology Aware Consistent Hash
       • Ensure back-ups are held preferentially on separate machine, rack and site
       • We need to upgrade to the latest 5.0.0.CR
    30. Virtual Nodes (Segments)
       Sub-divides hash wheel positions
       <hash numVirtualNodes="2"/>
    31. Virtual Nodes (Segments)
       • Improves data distribution
       • But didn't work at the time for Hotrod
       • https://issues.jboss.org/browse/ISPN-1217
    32. Hotrod Concurrent Start-up
       • Dynamic scaling
         – Replicated ___hotRodTopologyCache holds current cluster topology
         – New starters must lock and update this cache to add themselves to the current view
         – Deadlock!
         – https://issues.jboss.org/browse/ISPN-1182
       • Stagger start-up!
    33. Hotrod Client Failure Detection
       Unable to recover from cluster splits
    34. Hotrod Client Failure Detection
       • New servers only added to ___hotRodTopologyCache on start-up
       • Restart required to re-establish client topology view
    35. • Hotrod 5.0.0 abandoned
         – Data distribution
         – Concurrent start-up
         – Failure detection
         – Unacceptable for this customer
    36. Change of tack
       • Enter the classic embedded approach
       • How did we get this to work...
    37. Dynamic Scaling
       • Unpredictable under heavy load
         – Writers blocked
         – Unacceptable waits for this system
           <hash numOwners="2" rehashEnabled="false"/>
         – Accept some data loss during a leave / join
       • Future Enhancements
         – Chunked rehashing / state transfer (5.1)
           • https://issues.jboss.org/browse/ISPN-284
         – Non-blocking state transfer
           • https://issues.jboss.org/browse/ISPN-1424
         – Manual rehashing
           • https://issues.jboss.org/browse/ISPN-1394
    38. Cache Entry Size
       • Average cache entry ~6K
         – 1 million entries = 6GB
         – Hotrod stores serialized entries by default
       • JBoss Marshalling
         – Default Infinispan mechanism
         – Get reference from ComponentRegistry
       • JBoss Serialization
         – Quick, easy to implement
    39. Compression Considerations
       • Trade
         – Capacity in JVM vs Serialization Overhead
       • Suitability
         – Assess on a cache by cache basis
         – Very high access is probably too expensive
       • Compression
         – Average 6K reduced to 1K
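       An illustrative compression sketch: the project switched between JBoss Marshalling and JBoss Serialization, but plain JDK serialization plus GZIP is used here just to show the capacity-versus-CPU trade-off:

           import java.io.ByteArrayInputStream;
           import java.io.ByteArrayOutputStream;
           import java.io.IOException;
           import java.io.ObjectInputStream;
           import java.io.ObjectOutputStream;
           import java.io.Serializable;
           import java.util.zip.GZIPInputStream;
           import java.util.zip.GZIPOutputStream;

           public final class CompressedValue {

               // Serialize and GZIP a value before it is put into the cache
               public static byte[] compress(Serializable value) throws IOException {
                   ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                   ObjectOutputStream out = new ObjectOutputStream(new GZIPOutputStream(bytes));
                   try {
                       out.writeObject(value);
                   } finally {
                       out.close();   // closing also finishes the GZIP stream
                   }
                   return bytes.toByteArray();
               }

               // Un-GZIP and deserialize a value read back from the cache
               public static Object decompress(byte[] data) throws IOException, ClassNotFoundException {
                   ObjectInputStream in = new ObjectInputStream(
                           new GZIPInputStream(new ByteArrayInputStream(data)));
                   try {
                       return in.readObject();
                   } finally {
                       in.close();
                   }
               }
           }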
    40. Advanced Cache Tuning
       cache.getAdvancedCache().withFlags(Flag... flags)
       • Flag.SKIP_REMOTE_LOOKUP
         – Prevents remote gets being run for an update put(K key, V value)
             DistributionInterceptor.remoteGetBeforeWrite()
             DistributionInterceptor.handleWriteCommand()
             DistributionInterceptor.visitPutKeyValueCommand()
         – We don't need to return the previous cache entry value
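       Usage sketch for the flag above (the cache, key and value variables are assumed to already exist):

           import org.infinispan.context.Flag;

           // Skip fetching the previous value from a remote owner; we don't use put()'s return value
           cache.getAdvancedCache()
                .withFlags(Flag.SKIP_REMOTE_LOOKUP)
                .put(key, value);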
    41. JGroups
       • UDP out-performed TCP (for us)
       • Discovery
         – For a cold, full-cluster start-up avoid split brain / merge scenarios
           <PING timeout="3000" num_initial_members="10"/>
       • Heartbeat
         – Ensure failure detection is configured appropriately
           <FD_ALL interval="3000" timeout="10000"/>
    42. Extending Embedded
    43. Current Production System
       • 40 nodes
       • Over 30 million entries
         – 15 million unique
         – 40GB compressed cached data (== 240GB uncompressed)
         – Nothing is evicted before natural expiration
       • 5GB JVM Heap, 3-4 second GC pauses
       • 30% reduction in response times
    44. Current Issues
       • Application coupled to Cache
         – Application deployments disrupt clustering
         – Management of 40 nodes
         – Scaling compromised
         – Cached data lost with full restart
    45. The Future
       • Hotrod 5.2
         – Data distribution (Segments)
         – Concurrent start-up fixed
         – Failure detection fixed
         – Non-Blocking State Transfer
         – Rolling Upgrades
    46. Summary
       • Don't compromise on the benchmarking
         – Understand your cached data profile
         – Functional testing is NOT sufficient
         – Monitoring and Analysis is essential
       • Tune Segments (Virtual Nodes)
         – Distribution should be even
       • Mitigate memory usage of embedded cache
         – Consider compressing embedded cache entries
         – Non-request-facing storage nodes
       • Don't rule Hotrod out
         – Many improvements and bug fixes
    47. Part 2 – Green Field SLAs
       Historical Pricing Engine
       • Tomcat
       • Spring
       • EhCache
       • MySQL
       • Apache load-balancer / mod_jk
       • 2 second full Paris query
       New Pricing Engine
       • Tomcat
       • Spring & Grails
       • Infinispan
       • Oracle RAC
       • Apache load-balancer / mod_jk
    48. Logical View
       • New Pricing Engine
         – Side-by-side rollout
         – Controller determines where to send requests and aggregates results
         – NOT Hibernate 2LC
         – Spring Interceptors containing logic to check / update the cache wrap calls to the DB that extract and generate cache entries
    49. Proposed Caching
       • Everything Distributed
         – It worked before so we just turn it on, right?
    50. The Pain
       • Distributed Mode
         – Network saturation on 1Gb switch (125MB/second) under load
         – Contention in org.jgroups
       • Performance SLAs
         – Caching data in Local mode required a 14G heap & 20 second GC pauses
       • Aggressive rollout strategy
         – Struggling at low user load
    51. Cache Analysis
       • Eclipse Memory Analyzer
         – Identify cache profile
         – Small subset of elements account for almost all the space
         – Paris "Rates" sizes 20K – 1.6MB
         – Paris search (500 rates records) == 50MB total
         – 1Gb switch max throughput = 125MB/second
    52. Revised Caching
       • Local caching for numerous "small" elements
       • Distributed for "large" expensive elements
    53. Distributed Issue
       • Why standard Distributed doesn't work
         – One Paris request requires 500 rates records (50MB)
         – 10-node distributed cluster = 1 in 5 chance the data is local
         – 80% remote Gets == 40MB network traffic
    54. Options
       • Rewrite the application caching logic
         – Significantly reduce the element size
       • Run Local caching with oversized heap
         – Daily restart, eliminate full GC pauses, CMS
         – Large memory investment and careful management
       • Sacrifice caching and hit the DB
         – Hits response times and hammers the database
       • Distributed Execution?
         – Send a task to the data and extract just what you need
    55. Change in Psychology
       If the mountain will not come to Muhammad, then Muhammad must go to the mountain
    56. Distributed Execution
       • DefaultExecutorService
         – http://docs.jboss.org/infinispan/5.2/apidocs/org/infinispan/distexec/DefaultExecutorService.html
       • Create the Distributed Execution Service to run on the cache node specified
           public DefaultExecutorService(Cache masterCacheNode)
       • Run a task on the primary owner of the Key input
           public Future<T> submit(Callable<T> task, K... input)
         – Resolve the primary owner of the Key, then either
           • Run locally
           • Issue a remote command and run on the owning node
    57. Pricing Controller
       Callable task
       – Contains code to
         • Grab a reference to the local Spring Context
         • Load required beans
         • Spring interceptor checks the cache at the owning node (local get)
         • If not found then go to the database, retrieve and update the cache
         • Extract pricing based on the request criteria
         • Return results
       (Existing code)
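       A hedged sketch of such a task. DistributedCallable is used here so Infinispan hands the task the cache local to the owning node; the talk's implementation obtains its dependencies from the local Spring context instead. The Map-based value type, request criteria and the two stub methods are illustrative stand-ins:

           import java.io.Serializable;
           import java.util.Map;
           import java.util.Set;
           import org.infinispan.Cache;
           import org.infinispan.distexec.DistributedCallable;

           public class PricingTask
                   implements DistributedCallable<String, Map<String, Object>, Map<String, Object>>, Serializable {

               private final String requestCriteria;                        // illustrative request criteria
               private transient Cache<String, Map<String, Object>> cache;  // provided by Infinispan on the owning node
               private transient Set<String> keys;

               public PricingTask(String requestCriteria) {
                   this.requestCriteria = requestCriteria;
               }

               public void setEnvironment(Cache<String, Map<String, Object>> cache, Set<String> inputKeys) {
                   this.cache = cache;       // the cache local to the node that owns the key
                   this.keys = inputKeys;    // the key(s) this task was routed by
               }

               public Map<String, Object> call() throws Exception {
                   String key = keys.iterator().next();
                   Map<String, Object> rates = cache.get(key);     // local get on the owning node
                   if (rates == null) {
                       rates = loadFromDatabaseAndCache(key);      // the talk does this via the existing Spring interceptor
                   }
                   return extractPrices(rates, requestCriteria);   // return only what the caller needs
               }

               private Map<String, Object> loadFromDatabaseAndCache(String key) {
                   throw new UnsupportedOperationException("illustrative stub");   // hypothetical DB lookup + cache put
               }

               private Map<String, Object> extractPrices(Map<String, Object> rates, String criteria) {
                   throw new UnsupportedOperationException("illustrative stub");   // hypothetical price extraction
               }
           }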
    58. Pricing Controller
       Create DefaultExecutorService
       – Create callable tasks required to satisfy request
       – Issue callable tasks concurrently
           while (moreKeys) {
               Callable<T> callable = new MyCallable<T>(...);
               Future<T> future = distributedExecutorService.submit(callable, key);
               ...
           }
       – Collate results and assemble response
           while (moreFutures) {
               T result = future.get();
           }
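       The notes stress handling failures yourself when collating futures, so a collation sketch with a bounded wait and a per-task fallback might look like this (the fallback method is a hypothetical stand-in for a direct database query):

           import java.util.ArrayList;
           import java.util.List;
           import java.util.Map;
           import java.util.concurrent.Future;
           import java.util.concurrent.TimeUnit;

           public class ResultCollector {

               public List<Map<String, Object>> collect(List<Future<Map<String, Object>>> futures) {
                   List<Map<String, Object>> results = new ArrayList<Map<String, Object>>();
                   for (Future<Map<String, Object>> future : futures) {
                       try {
                           // Bound the wait so one slow node cannot blow the response-time SLA
                           results.add(future.get(2, TimeUnit.SECONDS));
                       } catch (InterruptedException e) {
                           Thread.currentThread().interrupt();       // preserve interrupt status and stop collating
                           break;
                       } catch (Exception e) {                       // ExecutionException, TimeoutException, ...
                           results.add(priceDirectlyFromDatabase()); // fall back for this slice of the request only
                       }
                   }
                   return results;
               }

               private Map<String, Object> priceDirectlyFromDatabase() {
                   throw new UnsupportedOperationException("illustrative stub");   // hypothetical fallback path
               }
           }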
    59. Distributed Execution
       Only extract relevant information from the cache entry
    60. Results
       • Latency – Paris search
         – Historic Engine: 2 seconds
         – Local: 150ms
         – Dist-Exec: 200ms
       • JVM
         – 5GB Heap
         – 3-4 second pauses
    61. Limitations
       • Hotrod not supported
         – This would be fantastic!
         – https://issues.jboss.org/browse/ISPN-1094
    62. Summary
       • Analyse & re-design of cached data
         – Select the appropriate caching strategies
       • Accessing large data sets requires an alternative access pattern
         – Network bandwidth
       • Dramatically reduced latency
         – Parallel execution
         – Fraction of data transferred across the wire
    63. Thanks for Listening!
       Any Questions?
