Database backed Coherence cache

Here is slide deck

Published in: Technology


Transcript

  • 1. Database backed Coherence cache: Tips, Tricks and Patterns. Alexey Ragozin, alexey.ragozin@gmail.com, May 2012
  • 2. Power of read-write-backing-map
      • Fetching data as needed
      • Separation of concerns
      • Gracefully handling concurrency
      • Write-behind: removing the DB from the critical path
      • Database operation bundling
  • 3. … and challenges
      • DB operations are an order of magnitude slower
      • Less deterministic response time
      • Coherence thread pool issues
      • How to verify persistence with write-behind?
      • Data is written to the DB in arbitrary order
      • read-write-backing-map and expiry
  • 4. TIPS
  • 5. BinaryEntryStore, did you know? BinaryEntryStore is an alternative to the CacheLoader / CacheStore interfaces. It works with BinaryEntry instead of objects (a sketch follows).
      • You can access the binary key and value: skip deserialization if binary is enough.
      • You can access the previous version of the value: distinguish inserts vs. updates, and find which fields changed.
      • You cannot set the entry TTL in a cache loader.
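    A minimal BinaryEntryStore sketch (BinaryDao and its select/insert/update/delete methods are hypothetical placeholders for real persistence code):

    import java.util.Set;
    import com.tangosol.net.cache.BinaryEntryStore;
    import com.tangosol.util.Binary;
    import com.tangosol.util.BinaryEntry;

    public class BinaryDbStore implements BinaryEntryStore {
        private final BinaryDao dao = new BinaryDao(); // hypothetical DAO over Binary keys/values

        public void load(BinaryEntry entry) {
            Binary value = dao.select(entry.getBinaryKey()); // binary in, binary out: no deserialization
            if (value != null) {
                entry.updateBinaryValue(value);
            }
        }

        public void store(BinaryEntry entry) {
            if (entry.getOriginalBinaryValue() == null) {
                dao.insert(entry.getBinaryKey(), entry.getBinaryValue()); // no previous value: insert
            } else {
                dao.update(entry.getBinaryKey(), entry.getBinaryValue()); // previous value exists: update
            }
        }

        public void erase(BinaryEntry entry) {
            dao.delete(entry.getBinaryKey());
        }

        public void loadAll(Set entries)  { for (Object e : entries) { load((BinaryEntry) e); } }
        public void storeAll(Set entries) { for (Object e : entries) { store((BinaryEntry) e); } }
        public void eraseAll(Set entries) { for (Object e : entries) { erase((BinaryEntry) e); } }
    }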
  • 6. When is storeAll(…) called?
      • cache.getAll(…): loadAll(…) will be called with partition granularity (since Coherence 3.7)
      • cache.putAll(…): a write-behind scheme will use storeAll(…); a write-through scheme will use store(…) (this can be really slow)
  • 7. When is storeAll(…) called? cache.invokeAll(…) / aggregate(…):
      • calling get() on an entry will invoke load(…) (if the entry is not cached yet)
      • calling set() on an entry will invoke store(…) (in the write-through case)
      • you can check entry.isPresent() to avoid a needless read-through (see the sketch below)
      • Coherence will never use bulk cache store operations for aggregators and entry processors
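    A small sketch of the entry.isPresent() guard (hypothetical processor; it only touches values already in the cache, so no read-through is triggered):

    import com.tangosol.util.InvocableMap;
    import com.tangosol.util.processor.AbstractProcessor;

    public class TouchIfPresent extends AbstractProcessor {
        public Object process(InvocableMap.Entry entry) {
            if (!entry.isPresent()) {
                return null; // not cached: skip instead of forcing a load(...)
            }
            Object value = entry.getValue(); // safe: the entry is already in the cache
            // ... transform the value as needed ...
            entry.setValue(value);
            return value;
        }
    }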
  • 8. Warming-up aggregator

    public static void preloadValuesViaReadThrough(Set<BinaryEntry> entries) {
        CacheMap backingMap = null;
        Set<Object> keys = new HashSet<Object>();
        for (BinaryEntry entry : entries) {
            if (backingMap == null) {
                backingMap = (CacheMap) entry.getBackingMapContext().getBackingMap();
            }
            if (!entry.isPresent()) {
                // collect binary keys of entries not cached yet
                keys.add(entry.getBinaryKey());
            }
        }
        // bulk read-through: forces a single loadAll(...) for the whole batch
        backingMap.getAll(keys);
    }

    The code above forces all entries of the working set to be preloaded using a bulk loadAll(…). Call it before processing the entries.
  • 9. Why is load(…) called on write?
      • Case: an entry processor is called on a set of entries which are not in the cache, and assigns values to them.
      • Question: why is read-through triggered?
      • Answer: BinaryEntry.setValue(Object) returns the old value. Use BinaryEntry.setValue(Object, boolean) instead (see the sketch below).
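    A minimal sketch of the fix (hypothetical "blind write" processor; the two-argument setValue returns void, so Coherence does not need the previous value and no read-through happens):

    import com.tangosol.util.InvocableMap;
    import com.tangosol.util.processor.AbstractProcessor;

    public class BlindPut extends AbstractProcessor {
        private final Object value;

        public BlindPut(Object value) { this.value = value; }

        public Object process(InvocableMap.Entry entry) {
            entry.setValue(value, false); // false = regular (non-synthetic) update
            return null;
        }
    }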
  • 10. Bulk put with write-through. You can use the same trick for updates:
      1. Pack your values into an entry processor.
      2. In the entry processor, obtain a backing map reference.
      3. Call putAll(…) on the backing map.
    Be careful!
      • You should only put keys belonging to the partition the entry processor was called for.
      • The backing map accepts serialized objects only.
    (See the sketch below.)
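    A sketch of the trick under the stated caveats (BulkPut and its batch payload are hypothetical; note that keys and values are converted to internal Binary form before reaching the backing map, and every key must belong to the entry's own partition):

    import java.util.HashMap;
    import java.util.Map;
    import com.tangosol.net.BackingMapManagerContext;
    import com.tangosol.util.BinaryEntry;
    import com.tangosol.util.InvocableMap;
    import com.tangosol.util.processor.AbstractProcessor;

    public class BulkPut extends AbstractProcessor {
        private final Map<Object, Object> batch; // keys MUST belong to this entry's partition

        public BulkPut(Map<Object, Object> batch) { this.batch = batch; }

        public Object process(InvocableMap.Entry e) {
            BinaryEntry entry = (BinaryEntry) e;
            BackingMapManagerContext ctx = entry.getContext();
            Map binBatch = new HashMap();
            for (Map.Entry<Object, Object> kv : batch.entrySet()) {
                // the backing map accepts serialized (Binary) keys and values only
                binBatch.put(ctx.getKeyToInternalConverter().convert(kv.getKey()),
                             ctx.getValueToInternalConverter().convert(kv.getValue()));
            }
            // one putAll(...) on the backing map: write-through can persist the whole batch
            entry.getBackingMapContext().getBackingMap().putAll(binBatch);
            return null;
        }
    }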
  • 11. Using operation bundling. [Diagram: worker threads 1-3 issue Req 1-3 through the RWBM to the cache store; without bundling each request triggers its own store call, with bundling Req 1 and Req 2 wait briefly so all three are persisted by a single storeAll().]
  • 12. Using operation bundling. storeAll(…) with N keys can only be called if:
      • you have at least N concurrent operations
      • you have at least N threads in the worker pool

    <cachestore-scheme>
        <operation-bundling>
            <bundle-config>
                <operation-name>store</operation-name>
                <delay-millis>5</delay-millis>
                <thread-threshold>4</thread-threshold>
            </bundle-config>
        </operation-bundling>
    </cachestore-scheme>
  • 13. Checking the STORE decoration
      • Configure the cache as write-behind
      • Put data
      • Wait until the STORE decoration becomes TRUE (actually it will switch from FALSE to null)

    public class StoreFlagExtractor extends AbstractExtractor implements PortableObject {
        // ...
        private Object extractInternal(Binary binValue, BinaryEntry entry) {
            if (ExternalizableHelper.isDecorated(binValue)) {
                Binary store = ExternalizableHelper.getDecoration(binValue, ExternalizableHelper.DECO_STORE);
                if (store != null) {
                    // still decorated: write-behind has not persisted this entry yet
                    Object st = ExternalizableHelper.fromBinary(store, entry.getSerializer());
                    return st;
                }
            }
            // no decoration left: the value has been stored
            return Boolean.TRUE;
        }
    }
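    One possible way to use the extractor (assuming a write-behind cache; EqualsFilter and keySet(Filter) are standard Coherence query facilities):

    import com.tangosol.net.NamedCache;
    import com.tangosol.util.filter.EqualsFilter;

    public class StoreCheck {
        // Block until no entry is still decorated with DECO_STORE=false,
        // i.e. the write-behind queue has been flushed to the database.
        public static void awaitStored(NamedCache cache) throws InterruptedException {
            EqualsFilter notStored = new EqualsFilter(new StoreFlagExtractor(), Boolean.FALSE);
            while (!cache.keySet(notStored).isEmpty()) {
                Thread.sleep(100);
            }
        }
    }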
  • 14. BEHIND THE SCENES
  • 15. How it works? [Diagram: a distributed cache service sits on top of a read-write backing map, which consists of an internal map, an optional miss cache, and a cache store.]
  • 16. How it works? Roles of the components:
      • Distributed cache service: distribution and backup of data
      • read-write backing map: coordination
      • Internal map: storing cache data, expiry
      • Miss cache: caching cache loader misses
      • Cache store: interacting with persistent storage
  • 17. How it works? (get, step #1) The cache service receives a get(…) request.
  • 18. How it works? (get, step #2) The cache service invokes get(…) on the backing map. A partition transaction is opened.
  • 19. How it works? (get, step #3) The backing map checks the internal map, and the miss cache if present. The key is not found.
  • 20. How it works? (get, step #4) The backing map invokes load(…) on the cache loader.
  • 21. How it works? (get, step #5) The cache loader retrieves the value from the external source.
  • 22. How it works? (get, step #6) The value is loaded.
  • 23. How it works? (get, step #7) The backing map updates the internal map.
  • 24. How it works? (get, step #8) The internal map is observable, and the cache service receives an event about the new entry in the internal map.
  • 25. How it works? (get, step #9) The call to the backing map returns. The cache service is ready to commit the partition transaction.
  • 26. How it works? (get, step #10) The partition transaction is committed. The new value is sent to the backup node.
  • 27. How it works? (get, step #11) The response to the get(…) request is sent back once the backup has confirmed the update.
  • 28. How it works? (put, step #1) The cache service receives a put(…) request.
  • 29. How it works? (put, step #2) The cache service invokes put(…) on the backing map. A partition transaction is opened.
  • 30. How it works? (put, step #3) The value is immediately stored in the internal map and put on the write-behind queue.
  • 31. How it works? (put, step #4) The cache service receives the event; the backing map decorates the value with a DECO_STORE=false flag to mark it as yet-to-be-stored.
  • 32. How it works? (put, step #5) The call to the backing map returns.
  • 33. How it works? (put, step #6) The partition transaction is committed. The backup receives the value decorated with DECO_STORE=false.
  • 34. How it works? (put, step #7) The cache service sends the response back as soon as the backup has confirmed.
  • 35. How it works? (put, step #8) Eventually, the cache store is called to persist the value. This happens on a separate thread.
  • 36. How it works? (put, step #9) The cache store writes the value to the external storage.
  • 37. How it works? (put, step #10) Once the call to the cache store has returned successfully, the backing map removes the DECO_STORE decoration from the value in the internal map. The cache service receives a map event.
  • 38. How it works? (put, step #11) The map event is received by the cache service outside of a service thread, so it is put on the OOB queue and processed eventually. The update to the backup is sent once the event has been processed.
  • 39. THREADING
  • 40. Requests and jobs. Client side: one method call, putAll(…).
  • 41. Requests and jobs. Network: one PutRequest for each member involved.
  • 42. Requests and jobs. Storage node: one job per partition; each PutRequest fans out into per-partition jobs.
  • 43. Requests and jobs. Problem: a single API call may produce hundreds of jobs for worker threads across the cluster (limited by the partition count). Write-through and read-through jobs can be time consuming. While all worker threads are busy with time-consuming jobs, the cache is unresponsive.
  • 44. Requests and jobs. Workarounds:
      • Huge thread pools (see the configuration sketch below)
      • Request throttling: by member (one network request at a time), or by partition (one job at a time)
      • Priorities: applicable only to entry processors and aggregators
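    For the "huge thread pools" workaround, the service worker pool is sized in the cache configuration. A minimal sketch (scheme and service names are arbitrary examples; 32 is not a recommendation):

    <distributed-scheme>
        <scheme-name>db-backed-scheme</scheme-name>
        <service-name>DBBackedCacheService</service-name>
        <!-- worker threads executing read-through / write-through jobs -->
        <thread-count>32</thread-count>
    </distributed-scheme>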
  • 45. “UNBREAKABLE CACHE” PATTERN
  • 46. “Canary” keys: special keys (one per partition) ignored by all cache operations. A canary key is inserted once the “recovery” procedure has verified that the partition data is complete. If a partition is not yet loaded, or was lost in a disaster, its canary key will be missing.
  • 47. Recovery procedure (a sketch follows):
      • Store the object's hash code in the database.
      • Using the hash, you can query the database for all keys belonging to a partition.
      • Knowing all the keys, use read-through to pull the data into the cache.
      • The cache is writable during recovery! Coherence's internal concurrency control will ensure consistency.
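    A sketch of per-partition recovery (CanaryKey and KeyDao.keysForPartition are hypothetical names; it assumes the partition id derived from the stored hash code is persisted next to each key):

    import java.util.Collection;
    import com.tangosol.net.NamedCache;

    public class Recovery {
        public static void recoverPartition(NamedCache cache, KeyDao dao, int partition) {
            // query the database for all keys belonging to this partition
            Collection<Object> keys = dao.keysForPartition(partition);
            // read-through pulls the values into the cache; the cache stays writable,
            // Coherence's concurrency control reconciles concurrent updates
            cache.getAll(keys);
            // mark the partition as complete
            cache.put(new CanaryKey(partition), Boolean.TRUE);
        }
    }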
  • 48. “Unbreakable cache” = read/write-through + canary keys + recovery.
      • Key-based operations rely on read-through.
      • Filter-based operations check the “canary” keys (and trigger recovery if needed).
      • Preloading = recovery.
      • The cache is writable at all times.
  • 49. Checking “canary” keys. Option 1: check the “canary” keys, then perform the query. Option 2: perform the query, then check the “canary” keys.
  • 50. Checking “canary” keys. Option 1: check the “canary” keys, then perform the query. Option 2: perform the query, then check the “canary” keys. Both options leave a window between the check and the query in which partitions can be lost. The right way: check the “canaries” inside the query (see the sketch below)!
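    One possible shape of "canaries inside the query" (a sketch, not the deck's exact code; CanaryKey and RecoveryNeededException are hypothetical): run the filter query as an aggregation, and have the storage-side invocation verify the canary of each touched partition before trusting the local result. Invoked as cache.aggregate(filter, new CanaryCheckedCount()).

    import java.util.Set;
    import com.tangosol.util.Binary;
    import com.tangosol.util.BinaryEntry;
    import com.tangosol.util.InvocableMap;

    public class CanaryCheckedCount implements InvocableMap.EntryAggregator {
        public Object aggregate(Set entries) {
            int count = 0;
            for (Object o : entries) {
                BinaryEntry e = (BinaryEntry) o;
                // locate the canary for this entry's partition
                int partition = e.getContext().getKeyPartition(e.getBinaryKey());
                Binary canary = (Binary) e.getContext().getKeyToInternalConverter()
                        .convert(new CanaryKey(partition));
                // canary missing: partition data is incomplete, abort and recover
                if (!e.getBackingMapContext().getBackingMap().containsKey(canary)) {
                    throw new RecoveryNeededException();
                }
                count++;
            }
            return count;
        }
    }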
  • 51. “Unbreakable cache”. Motivation:
      • An incomplete data set would invalidate hundreds of hours of number crunching.
      • 100% complete data, or an exception.
      • A persistent DB is a requirement anyway.
    Summary:
      • Transparent recovery (+ preloading for free).
      • Always writable (i.e. feeds are not waiting for recovery).
      • Graceful degradation of service under “disastrous conditions”.
  • 52. Thank you
    http://blog.ragozin.info - my articles
    http://code.google.com/p/gridkit - my open source code
    Alexey Ragozin, alexey.ragozin@gmail.com