Data Grids with Oracle Coherence

An introduction to building data grids in Oracle Coherence


  • Welcome to the Coherence section of the Enterprise Engineering Program. The purpose of this section is to give you an understanding of what a data cache is, why one is useful for making an application both performant and scalable, and how Coherence, the bank's recommended data cache, works under the hood.
  • We'll start off by looking at the problems arising from bottlenecking at data sources, in particular those alluded to by the previous speaker. Bottlenecks generally arise when a system accesses a data source located on a single physical machine, be it a client machine (as in the grid invocations described in previous lectures) or a database. Neither of these use cases scales as the data and processing requirements increase. Having understood the bottleneck problem, we'll see how clustering is a suitable solution, as it spreads the data source across multiple machines and so makes it scalable. The third section looks at Coherence in detail, focussing on the various functions offered to developers. Finally we'll reflect on how Coherence can be used as more than just a caching technology, becoming an application container that makes the construction of distributed, fault tolerant, inherently scalable applications easier.
  • Here we see a common grid use case with a client invoking a task on four DataSynapse grid engines, all of which source their data from the database (most likely simultaneously). There is an obvious problem with this architecture. What is it?
  • The database becomes a bottleneck as the number of engines in the grid increases. The bottleneck arises because the database is generally located on a single physical machine, so physical constraints are placed on its scalability when it is used in conjunction with a scalable middle tier such as a DataSynapse compute grid. The key constraints are the bandwidth and CPU of the data server.
  • Although Coherence may have a simple interface, behind it lies a powerful technology. Unlike some simple clustered data repositories, which rely on copies of the dataset being held on each machine, Coherence spreads its data across the cluster, so each machine is responsible for its own portion of the data set. Thus, in the example seen here, the user requests the key “2” from the cache (note that a cache is analogous to a table in a database; it is a single HashMap instance). The query for key “2” is directed to the single machine on which the data resides – in this case the node in the top left corner. A subsequent request for key “334” is routed to the machine in the bottom left corner, as it is this machine which is responsible for that key.
  • So let's introduce Coherence – a story of accidental genius. Why accidental genius? Well…
  • … but we ’ll come back to this later…
  • Coherence is fundamentally a data repository (or data fabric – the term is coined because the data is held in a distributed manner across a set of machines, a cluster). It is, however, a special kind of data repository, designed with low-latency, highly available, distributed systems in mind. If you require fast access to prefabricated data (that is to say, data that has been pre-processed into the required form) in a distributed world, Coherence is likely to be your technology of choice. It has three important entities: the Coherence cluster itself, which is sandwiched between the client on the left and the persistent data source on the right. The client has its own, in-process, second-level cache. The persistent data source is usually only used for data writes; it does not contribute to data retrieval (as the cluster, in the centre of the diagram, will typically be pre-populated with data, but more on that later).
  • And of course there is the possibility that it may be more than just a clever data repository…. but we ’ll come to that later too…
  • Having briefly introduced the technology, let's take a look at what it is, what it is not, and how it relates to other data repository technologies.
  • Key concept: Coherence is just a map. All data is stored as key value pairs.
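Since the key concept is that Coherence is just a map (its NamedCache in fact honours the java.util.Map contract), the basic interaction can be sketched with a plain map. The class name, keys and values below are invented for illustration:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// A plain Map standing in for a Coherence cache, which exposes the same
// java.util.Map contract. Keys and values here are illustrative only.
public class CacheAsMap {
    public static final Map<String, String> trades = new ConcurrentHashMap<>();

    // put a key-value pair in, then read it straight back by key
    public static String writeThenRead(String key, String value) {
        trades.put(key, value);
        return trades.get(key);
    }
}
```

In real Coherence code the map would be obtained from the cluster (via CacheFactory) rather than constructed locally, but the programming model is exactly this.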
  • In a typical installation Coherence will be prepopulated with data so that the cluster becomes the primary data source, rather than just a caching layer sitting above it. The read-through pattern is not used in most implementations as it relies on the speed and scalability of the database tier.
  • Coherence is not a database. It is a much lighter-weight product designed for fast data retrieval operations. Databases provide a variety of additional functionality which Coherence does not support including ACID (Atomic, Consistent, Isolated and Durable), the joining of data in different caches (or tables) and all the features of the SQL language. Coherence does however support an object based query language which is not dissimilar to SQL. However Coherence is not suited to complex data operations or long transactions. It is designed for very fast data access via lookups based on simple attributes e.g. retrieving a trade by its trade ID, writing a new trade, retrieving trades in a date range etc.
  • Now let's compare Coherence with some other prominent products in the Oracle suite (which RBS favour). Firstly, let's look at the relationship between Oracle RAC (Real Application Clusters) and Coherence. RAC is a clustered database technology. Being clustered it, like Coherence, is fault tolerant and highly available – that is to say, the loss of a single machine will not significantly affect the running of the application. However, unlike Coherence, RAC is durable to almost any failure, as data is persisted to (potentially several different) disks. Coherence's lack of disk access makes it significantly faster, and thus the choice for many highly performant applications. Finally, RAC supports SQL and thus can handle complex data processing.
  • Coherence is fast, fault tolerant and scalable. Let's look at what makes it each of these things…
  • Coherence is faster than most other data repositories for five main reasons. Firstly, it stores all data solely in memory; there is no need to access disk during data retrieval. Secondly, objects are always held in their serialised form (and there is a custom implementation of serialisation which outperforms the standard mechanism). Holding data in a serialised form allows Coherence to skip the serialisation step on the server, meaning that data requests take only one serialisation hit, occurring when they are deserialised on the client after a response. Note that both keys and values are held in their serialised form (and in fact the hash code has to be cached as a result of this). Thirdly, writes to the database are usually performed asynchronously (although this is configurable). Asynchronous persistence of data is desirable as it means Coherence does not have to wait for disk access on a potentially bottlenecked resource. As we'll see later, it also does some clever stuff to batch writes to persistent stores to make them more efficient. The result of asynchronous database access is that writes to the Coherence cluster are fast and will stay fast as the cluster scales; the downside is that data could be lost should a critical failure occur. Fourthly, Coherence includes a second-level cache that sits in process on the client. This is analogous to a typical caching layer, holding on to some defined number of objects previously requested by the client. Coherence ensures that the data in these Near Caches is kept coherent (no pun intended :). Finally, queries which run across multiple data elements can be run in parallel, as the data exists on different machines. This significantly optimises such queries.
  • Coherence is both fault tolerant and highly available. That is to say, the loss of a single machine will not significantly impact the operation of the cluster. The reason for this resilience is that loss of a single node will result in a seamless failover to a backup copy held elsewhere in the cluster. All operations that were running on the node when it went down will also be re-executed elsewhere. It is worth emphasising that this is one of the most powerful features of the product: it can efficiently detect node loss and deal with it. It also deals with the addition of new nodes in the same seamless manner.
  • Coherence holds data on only one machine (two if you include the backup). Thus each new machine added to the cluster increases the total storage capacity by a factor of 1/n, where n is the number of nodes. CPU and bandwidth capacity will obviously be increased too as machines are added. This allows the cluster to scale linearly through the simple addition of commodity hardware. There is no need to buy bigger and bigger boxes.
  • So in summary: Scaling up an application inevitably implies scaling the ability to access data. Without a scalable data layer a bottleneck will inevitably occur.
  • Coherence is a data source that will scale processing power, bandwidth and storage through the addition of commodity hardware.
  • However the price paid is two-fold. It lacks the true durability of a database (even though it is fault-tolerant) and it lacks the ability to efficiently process complex data. But these sacrifices are made to enable very low latency, scalable data access.
  • The Well Known Hashing Algorithm is the algorithm used to determine on which machine each hash bucket will be stored. This algorithm is distributed to all members of the cluster, hence “well known”. This has the effect that the location of every key is known to all nodes.
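The essence of the "well known" scheme is that every member applies the same deterministic function to a key, so any node can locate any key without consulting a central directory. A minimal sketch of that idea, assuming a simple modulo scheme (not Coherence's actual algorithm):

```java
// Deterministic key-to-partition mapping: because every cluster member runs
// the same function, all nodes independently agree on where each key lives.
// The modulo scheme below is an invented stand-in for the real algorithm.
public class KeyPartitioning {
    public static int partitionFor(Object key, int partitionCount) {
        // floorMod keeps the result in range even for negative hash codes
        return Math.floorMod(key.hashCode(), partitionCount);
    }
}
```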
  • Now looking at writing data to the cluster, the format is similar to gets with the put travelling through a connection proxy which locates the data to be written and forwards on the write. The difference is that writes must also be written to the backup copy which will exist on a different machine. This adds two extra network hops to the transaction.
  • Coherence includes a clever mechanism for detecting and responding to node failure. In the example given here node X suffers a critical failure due to say a network outage or machine failure. The surrounding cluster members broadcast alerts stating that they have not heard from Node X for some period of time. If several nodes raise alerts about the same machine a group decision is made to orphan the lost node from the cluster. Once Node X has been removed from the cluster the backup of its data, seen here on the node to its left, is instantly promoted to being a Primary store. This is quickly followed by the redistribution of data around the cluster to fully backup all data and to ensure there is an even distribution across the cluster. The redistribution step is throttled to ensure it does not swamp cluster communication. However this step completes more quickly on larger clusters where less data must be redistributed to each node.
  • All client processes can configure a near cache that sits “in process”. This cache provides an in-process repository of recently requested values. Coherence takes responsibility for keeping the data in each near cache coherent. Thus, in the example shown here, Client A requests key1 from the cluster. This is returned, and the key-value pair is stored in the client's in-process near cache. Next, Client B writes a new value to key1 from a different process. Coherence messages all other clients that have the value near-cached, notifying them that the value for key1 has changed. Note that this is dynamic invalidation: the new value is not passed in the message. Should Client A make a subsequent request for key1, it will fall through to the server to retrieve the latest value. Thus near caching is a great way to store data which may be needed again by a client process.
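The invalidation behaviour described above can be sketched in a few lines. NearCache here is an invented stand-in, and a plain map stands in for the cluster:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a near cache: reads are served from an in-process copy, and an
// invalidation message (key only, no new value) evicts the stale entry so
// the next get falls through to the server.
public class NearCache {
    private final Map<String, String> local = new ConcurrentHashMap<>();
    private final Map<String, String> server; // stands in for the cluster

    public NearCache(Map<String, String> server) { this.server = server; }

    public String get(String key) {
        // use the local copy if present, otherwise fetch and remember it
        return local.computeIfAbsent(key, server::get);
    }

    // called when the cluster notifies us that a key has changed
    public void invalidate(String key) { local.remove(key); }
}
```

Note that until the invalidation arrives, the near cache happily serves the old value; it is the cluster's invalidation message that keeps the copies coherent.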
  • Locking keys directly is supported in Coherence, but it is expensive. In the example here a client locks a key, performs an action and then unlocks it again. This takes a scary 12 network hops to complete. Fortunately, there is a better way…
  • In this example the client invokes an Entry Processor against a specific key in the cache. A serialised version of the entry processor is passed from the client to the cluster. The cluster locks the key and executes the passed Entry Processor code. The Entry Processor performs the set of actions defined in the process() method. The cluster unlocks the key. Thus an arbitrary piece of code is run against a key on the server.
  • Here we see an example of an entry processor, the ValueChangingEntryProcessor which updates the value associated with a certain key. Note that in contrast to the locking example described on a previous slide, this execution involves only 4 rather than 12 network hops.
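The shape of that interaction can be sketched as follows. The interface and the lock below are invented stand-ins (the real API is Coherence's AbstractProcessor and InvocableMap), but the pattern is the same: code is shipped to where the data lives and runs under a key lock:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the EntryProcessor pattern: the processor runs "server side"
// against the current value while the key is locked, and its result is both
// written back into the cache and returned to the caller.
public class EntryProcessorSketch {
    public interface EntryProcessor { String process(String currentValue); }

    public static final Map<String, String> cache = new HashMap<>();

    public static String invoke(String key, EntryProcessor ep) {
        synchronized (cache) {               // stands in for the per-key lock
            String updated = ep.process(cache.get(key));
            cache.put(key, updated);
            return updated;
        }
    }
}
```

A call such as `invoke("Key3", old -> "NewVal")` mirrors the slide's `cache.invoke("Key3", new ValueChangingEntryProcessor("NewVal"))`.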
  • Invocables are the second of the four primary constructs and are analogous to a DataSynapse grid task in that they allow an arbitrary piece of code to be run on the server. Invocables are similar to Entry Processors except that they are not associated with any particular key. As such they can be defined to run against a single machine or across the whole cluster. In the example here an Invocable is used to invoke a garbage collection on all nodes in the cluster. Another good example of the use of Invocables is the bulk loading of data, with Invocables being used to parallelise the execution across the available machines.
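An Invocable is key-agnostic, so it is closer to "run this task on these nodes". A sketch of that shape, with invented names (the real construct lives in Coherence's InvocationService):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of an Invocable: a task with no key affinity, executed once per
// selected node (the slides' example requests a GC on every node).
public class InvocableSketch {
    public interface Invocable { String run(String nodeName); }

    public static List<String> executeOnAll(List<String> nodes, Invocable task) {
        List<String> results = new ArrayList<>();
        for (String node : nodes) {
            results.add(task.run(node)); // one execution per node
        }
        return results;
    }
}
```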
  • Backing Map Listeners are the third of the four primary constructs and are analogous to triggers in a database. In the example here the client writes a tuple to the cache and in response to this event a Backing Map Listener fires, executing some user defined code. The code is executed synchronously, that is to say that the key is locked for the duration of the execution.
  • The last of the four primary constructs is the CacheStore. CacheStores are usually used to persist data to a database and contain built-in retry logic should an exception be thrown during their execution. Looking at the example here: the client writes a tuple to the cache. This event causes a CacheStore to fire in an attempt to persist the tuple (note that this may be executed synchronously or asynchronously). In this case the user-defined code in the CacheStore throws an exception. The CacheStore catches the exception and adds the store event to a retry queue. A defined period of time later the cache store is called again. This time the execution succeeds and the tuple is written to the database. The retry queue is fault tolerant: so long as the cluster is up it will continue to retry store events until they succeed. Should multiple values be received for the same key during the write delay of an asynchronous CacheStore, the values will be coalesced, that is to say only the most recent tuple will be persisted. This coalescing also applies to the retry queue.
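The retry-and-coalesce behaviour can be sketched with a simple keyed queue. RetryQueue is an invented name for illustration, not part of the Coherence API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of an asynchronous CacheStore's retry queue: failed writes wait
// here, and a newer value for the same key replaces the older one, so only
// the most recent tuple ever reaches the database.
public class RetryQueue {
    private final Map<String, String> pending = new LinkedHashMap<>();

    // queue (or re-queue) a store event; same-key writes are coalesced
    public void enqueue(String key, String value) { pending.put(key, value); }

    // retry all pending stores; successful ones leave the queue
    public void retry(Map<String, String> database) {
        database.putAll(pending);
        pending.clear();
    }

    public int size() { return pending.size(); }
}
```

Keying the queue by cache key is what gives the coalescing for free: a second write to the same key simply overwrites the pending entry.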
  • Thus, to summarise the four primary constructs: both Entry Processors and Invocables are called from the client but run on the server. They both accept parameters during construction and can return values after their execution. Backing Map Listeners and CacheStores both run on the cluster in response to cache events. Backing Map Listeners, like Entry Processors, lock on the key for which they are executing. Synchronous CacheStores also lock, but their use in asynchronous mode tends to be more common. CacheStores are guaranteed, in that they will retry should execution fail, and this retry logic is fault tolerant (it will retry on a different machine should the one it is running on fail). They also coalesce changes.
  • Coherence currently supports Java and .NET as client platforms with C++ being added later this year. The communication between languages is done by converting objects into a language neutral binary format known as POF (Portable Object Format). In the example the C# client defines the POF serialisation routine which is executed by the IPofSerialiser (written in C#) to create a POF object which is stored in the cluster. When a Java client requests the same object it is inflated with the PofSerialiser (written in Java) to create a comparable Java object.
  • The previous slide covered the marshalling of data from one language to another. However, non-Java clients also need to execute code on the cluster and, as the cluster is written in Java, any executions run there must also be in Java. To solve this problem server side code, such as the Invocable shown here, is mapped from a C# implementation on the client to a Java implementation on the server. Thus calling MyInvocable in C# will result in the Java version of MyInvocable being run on the server, with the objects it uses being marshalled from one language to another via POF (as described in the previous slide).
  • Data affinity allows associations to be set up between data in different caches so that the associated data objects in the two different caches are collocated on the same machine. In the example here trade data and market data are linked via the ticker meaning that all trades for ticker ATT will be stored on the same machine as the ATT market data.
  • Thus when an entry processor executes, say to run a trade pricing routine, it can access the trade and its market data without having to make a wire call as the market data for that particular trade will be held on the same machine (whenever possible).
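One way to picture this: if the partition is computed from an affinity attribute (here the ticker) rather than from the whole key, a trade and the market data for its ticker necessarily land in the same partition. The key format below is invented for the sketch; Coherence achieves the same effect through key association:

```java
// Sketch of data affinity: trade keys embed their ticker ("ATT:trade-42"),
// market data is keyed by ticker alone ("ATT"), and partitioning uses only
// the ticker portion, so related entries are collocated on one machine.
public class AffinitySketch {
    public static String affinityKeyOf(String cacheKey) {
        int i = cacheKey.indexOf(':');
        return i < 0 ? cacheKey : cacheKey.substring(0, i);
    }

    public static int partitionFor(String cacheKey, int partitionCount) {
        return Math.floorMod(affinityKeyOf(cacheKey).hashCode(), partitionCount);
    }
}
```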
  • In a standard architecture (the upper example) data is retrieved from a data source and sent to the application tier for processing. However, in the Coherence Application-Centric approach (the lower example) the code is sent to the machine that holds the data for execution. This is one of the real penny-dropping concepts that can revolutionise a system's performance. But it is important to note that Coherence is not a direct substitute for a compute grid such as DataSynapse. Application-Centric Coherence involves leveraging the inherent distribution Coherence provides, as well as its inherent collocation of processing and data.
  • The classic Service-Centric approach to using Coherence is described in this slide. A set of DataSynapse grid nodes source their data from a Coherence data cluster. But as we have seen, Coherence allows you to run arbitrary routines in a distributed manner across the cluster, running these routines on the nodes on which the data lives.
  • This presents the possibility of folding the compute grid and data cluster tiers so that parallel execution occurs solely across the data cluster. This has a fundamental advantage that far less data needs to be transmitted across the wire.
  • Thus looking at the anatomy of a simple Application-Centric deployment we see: A feed server enters a trade into the Trade cache using an Entry Processor to execute some pre-processing. This in turn fires a CacheStore which reliably executes some domain processing for that trade on the same machine. The domain processing results in the trade being updated in the cache. One of the key benefits of this architecture is the inherent distribution of processing. As various feeds come in for various trades, the domain processing for each one is executed on the machine on which that trade data is held. This means that not only is the execution collocated with the data but the executions are implicitly load balanced across the Coherence cluster.
  • So, falling back to my opening comment about Post-It Notes and Business Evolution: Coherence has evolved from being a data repository to an application container which provides free distribution of processing across multiple machines, free fault tolerance, and free scalability to potentially thousands of machines. This is an enticing proposition for any new application.
  • So in conclusion, we have seen that Coherence provides a variety of additional functionality to facilitate the highly available execution of domain processing across a cluster of machines. Thus a simple data caching layer has become a framework for distributed, fault tolerant processing.

Transcript

  • 1. The Data Bottleneck – grid performance is coupled to how quickly it can send and receive data. The general solution is to scale the repository by replicating data across several machines. But data replication does not scale… data partitioning does. Low latency access is facilitated by simplifying the contract; Coherence leverages this to provide the fastest, most scalable cache on the market. Coherence has evolved functions for server side processing too, including tools for providing reliable, asynchronous, distributed work that can be collocated with data.
  • 2. [Diagram: a client invokes a task on four DataSynapse grid nodes, all sourcing data from a single data source]
  • 3. [Diagram: the same architecture with eight grid nodes – the data source becomes a bottleneck]
  • 4. [Diagram only]
  • 5. Data exists on a shared file system. Multiple machines add bandwidth and processing ability. [Diagram: several DB processes over shared data on disk]
  • 6. Greatly increases bandwidth available for reading data. [Diagram: six copies of the data]
  • 7. Data is copied to different machines. What is wrong with this architecture?
  • 8. Becomes out of date. [Diagram: a client writes Record 1 = bar while the data on disk still holds Record 1 = foo]
  • 9. [Diagram: the client must lock Record 1 before writing]
  • 10. All nodes must be locked before data can be written => The Distributed Locking Problem. [Diagram: a controller locking all six copies]
  • 11. Each machine is responsible for a subset of the records. Each record exists on only one machine.
  • 12. Data volume will naturally increase with the number of machines in the cluster, as the data only exists in one place. Writes are only ever sent to one machine (the one that holds the singleton piece of data being modified), so write latency scales with the cluster.
  • 13. Oracle RAC, Gigaspaces, Terracotta, Oracle Coherence.
  • 14. [Diagram: a six-node cluster]
  • 15. [Diagram only]
  • 16. The Post-It Note is a great example of Business Evolution – a product that starts its life in one role, but evolves into something else… otherwise known as Accidental Genius. Coherence is another good example!
  • 17. [Diagram: clients (one with a near cache) connected to a six-node cluster backed by a database]
  • 18. [Diagram only]
  • 19. All data is stored as key-value pairs: [key1, value1], [key2, value2] … [key6, value6].
  • 20. The DB is used for persistence only – Coherence is typically prepopulated (bulk load at start up; writes go to the database asynchronously). Caching is over two levels (server and client).
  • 21. Coherence does not support ACID, joins (natively) or SQL. Coherence works to a simpler contract: it is efficient only for simple data access. As such it can do this one job quickly and scalably.
  • 22. What is RAC? RAC is a clustered database which runs in parallel over several machines. It supports all the features of vanilla Oracle DB but has better scalability, fault tolerance and bandwidth. [Diagram: cache nodes (no disk access) alongside RAC DB nodes (clustered, therefore fault tolerant and scalable; supports ACID, SQL etc)]
  • 23. Fast. Fault Tolerant. Scalable.
  • 24. In-memory storage of data – no disk-induced latencies (unless you want them). Objects held in serialised form. Asynchronous write-behind to the database. Near Caches kept coherent via proactive update/expiry of data as it changes. Queries run in parallel where possible (aggregations etc).
  • 25. [Diagram: a fragile in-memory cache in front of a DB]
  • 26. Data is held on at least two machines. Thus, should one fail, the backup copy will still be available. The more machines, the faster failover will be!
  • 27. Scale the application by adding commodity hardware. Coherence automatically detects new cluster members. Near-linear scalability due to partitioned data. [Graph: processing, storage and bandwidth against number of nodes (n)]
  • 28. Not that resilient: single machine failure will be tolerated, but concurrent machine failure will cause data loss. Key point: resilience is sacrificed for speed.
  • 29. Coherence works to a simpler contract. It is efficient only for simple data access. As such it can do this one job quickly and scalably. Databases are constrained by the wealth of features they must implement – most notably (from a latency perspective) ACID. In the bank we are often happy to sacrifice ACID etc for speed and scalability.
  • 30–32. Summary so far…
  • 33. Client calls cache.get(“foo”); the Well Known Hashing Algorithm directs the request (over UDP) to the node holding foo.
  • 34. myCache.put(“Key”, “Value”); – the put travels via a Connection Proxy to the data storage process holding the primary copy, and on to the backup.
  • 35. Death detection is vote based – there is no central management. Nodes broadcast “I think Node X has died”; on a consensus vote Node X is removed, its backups are promoted to primaries, and the data is redistributed and repartitioned.
  • 36. [Diagram: Client A holds key1 in its in-process near cache; when Client B calls cache.put(key1, SomethingNew), the cluster sends a data invalidation message: the value for key1 is now invalid]
  • 37. Lock(Key3); process; Unlock(Key3) – 12 network hops (6 internal to the cluster, 6 external).
  • 38. Analogous to: stored procedures. cache.invoke(EP) – the key is locked, the EntryProcessor runs, and the key is unlocked. class Foo extends AbstractProcessor { public Object process(Entry entry); public Map processAll(Set setEntries); } – your code goes here.
  • 39. cache.invoke(“Key3”, new ValueChangingEntryProcessor(“NewVal”)); – 4 network hops (2 internal to the cluster, 2 external).
  • 40. Analogous to: a grid task. service.execute(new GCAgent(), null, null); – run any arbitrary piece of code on any or all of the nodes.
  • 41. Analogous to: triggers. Backing Map Listeners allow code to be run in response to a cache event such as an entry being added, updated or deleted; the key is locked while the code runs. The MapListener class implements public void entryInserted(MapEvent evt), entryUpdated(MapEvent evt) and entryDeleted(MapEvent evt).
  • 42. Analogous to: triggers (but with fault tolerance and built-in retry). cache.put(765, X) fires the CacheStore – public void store(Object key, Object val), public void erase(Object key) – and if an exception is thrown the event goes onto a retry queue. Should multiple changes be made to the same key they will be coalesced.
  • 43. Called by the client: Entry Processors (lock the key; take parameters, return values) and Invocables. Called by Coherence: Backing Map Listeners (respond to a cache event) and CacheStores (guaranteed; coalesce changes).
  • 44. All cluster side programming must be done in Java; however, clients can be Java, C# or C++. Serialisation is done to an intermediary binary format known as POF, which allows theoretical transformation from POF directly to any language. Currently only Java, C# and C++ are supported. [Diagram: a C# object is serialised via the IPofSerialiser; a Java object is inflated via the PofSerialiser]
  • 45. [Diagram: a C# MyInvocable is mapped via the pof-config file to a Java Invocable run on the server; data objects are marshalled as described in the previous slide]
  • 46. Set up associations between affinity attributes so that related entries are stored on the same machine. Affinity attributes define mappings between data in the Trades and Market Data caches.
  • 47. Trade and market data for the same ticker are collocated, so an Entry Processor can price a trade based on market data without leaving the machine.
  • 48. Send data to code versus send code to data. public class Foo { public void bar() { /* do some stuff */ } }
  • 49. [Diagram: a client calls a DataSynapse processing layer of DS nodes, which sources data from a Coherence cluster data layer]
  • 50. Processes are performed on the node where the data exists. [Diagram: a client invoking work directly on the cluster nodes]
  • 51. [Diagram: the separate compute tier (DS nodes) and data tier (DB) fold into a single Coherence tier holding data + compute]
  • 52. [Diagram: a feed server writes to the Trade cache via an EntryProcessor; a CacheStore fires the domain processing, which updates the trade]
  • 53. Free distribution of processing across multiple machines. Free fault tolerance. Free scalability to potentially thousands of machines. A very enticing proposition for new applications.
  • 54. Coherence is not suitable for large scale processor-intensive tasks (think Monte Carlo simulations etc). Why? Compute grids provide much more control of the execution environment (priorities, a UI etc); the grid is far more scalable in terms of compute power (dynamic provisioning of engines etc); and the grid is much cheaper (per core) than Coherence.
  • 55–60. [Recap diagrams: the distributed locking problem, partitioned data, the fragile in-memory cache, and data affinity with reliable processing]
  • 61. So in conclusion…
  • 62. Summary so far…