Ogf2008 Grid Data Caching - Presentation Transcript
Increasing computation throughput with Grid Data Caching
Jags Ramnarayan, Chief Architect, GemStone Systems
Background on GemStone Systems
Known for its Object Database technology since 1982
Now specializes in memory-oriented distributed data management
12 pending patents
Over 200 installed customers in the Global 2000
Grid focus driven by:
Very high performance with predictable throughput, latency and availability
Capital markets – risk analytics, pricing, etc.
Large e-commerce portals – real time fraud
Federal intelligence
Batch to real-time - long jobs to short tasks
Increasing focus on DATA management
Workloads where
task duration is getting shorter
latency of data access is important
consistency in data is crucial
high availability is not enough; it has to be continuously available
common data across thousands of parallel activities
Accessing data in Grid today
Direct access to enterprise database or
Federated data access layer
Exposed to the weakest link problem
only as fast as the slowest data source
only as available as the weakest link
can only scale as well as the weakest link
Distributed/parallel file systems
What if too many tasks go after the same data?
Disk access speed is still 1000X slower than memory
Data consistency challenges
Might be controversial here
Impact to Grid SLA
Introducing memory oriented data fabric
Pool memory (and disk) across cluster/Grid
Managed as a single unit
Replicate data for high concurrent load, HA
Distribute (partition) data for high data volume, scale
Gracefully expand capacity to meet scalability/Perf goals
[Figure labels: Distributed Data Space; Data warehouses; Relational databases; Distributed Applications]
How does it work?
When data is stored, it is transparently replicated and/or partitioned
Redundant storage can be in memory and/or on disk, which ensures continuous availability
Machine nodes can be added dynamically to expand storage capacity or to handle increased client load
Shared-nothing disk persistence: each cache instance can optionally persist to disk
Synchronous read-through and write-through, or asynchronous write-behind, to other data sources and sinks
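The read-through, write-through and write-behind behaviour described on this slide can be illustrated with a small sketch. This is not GemStone's API: BackingStore, CacheRegion and every method name below are hypothetical and exist only to show the control flow, with a plain dict standing in for the external data source.

```python
from collections import deque


class BackingStore:
    """Stand-in for an external data source such as a database (hypothetical)."""

    def __init__(self, rows=None):
        self.rows = dict(rows or {})
        self.reads = 0  # counts how often the store is actually hit

    def load(self, key):
        self.reads += 1
        return self.rows.get(key)

    def save(self, key, value):
        self.rows[key] = value


class CacheRegion:
    """In-memory region in front of a BackingStore (hypothetical name)."""

    def __init__(self, store, write_behind=False):
        self.store = store
        self.write_behind = write_behind
        self.data = {}
        self.pending = deque()  # queued writes, used only in write-behind mode

    def get(self, key):
        # Read-through: on a miss, load from the store and keep the value.
        if key not in self.data:
            value = self.store.load(key)
            if value is None:
                return None
            self.data[key] = value
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        if self.write_behind:
            self.pending.append((key, value))  # the store is updated later
        else:
            self.store.save(key, value)  # write-through: update the store now

    def flush(self):
        # Drain queued writes in arrival order; return how many were written.
        count = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.store.save(key, value)
            count += 1
        return count


store = BackingStore({"IBM": 101.5})

sync_region = CacheRegion(store)
first = sync_region.get("IBM")   # miss: goes to the store
second = sync_region.get("IBM")  # hit: served from memory
sync_region.put("MSFT", 27.0)    # written through immediately

async_region = CacheRegion(store, write_behind=True)
async_region.put("ORCL", 19.0)   # the store does not see this yet
seen_before_flush = "ORCL" in store.rows
flushed = async_region.flush()
```

In a real fabric the flush would run on a background thread or timer. It is called explicitly here so the ordering is easy to follow.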
Predictably scale with partitioning
By keeping data spread across many nodes in memory, we can exploit the CPU and network capacity on each node simultaneously to provide linear scalability
[Figure: distributed apps, each with a local cache and partitioning metadata, reach data partitions A1 through I1 in a single hop]
Parallel loading by many Grid nodes
only limited by CPU and network backbone
With partitioning metadata on each compute node, access to any single piece of data is a single hop
As changes are redundantly and synchronously managed, availability and consistency are preserved
Dynamically detect load changes and add or remove nodes for data
Automatic data re-partitioning will condition the load
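A minimal sketch of the ideas on this slide, under stated assumptions: keys hash into a fixed number of buckets, every client holds a bucket-to-owner map (the partitioning metadata), writes go synchronously to a primary and one backup, and adding a node reassigns buckets. The Fabric class, the round-robin assignment and the bucket count are illustrative choices, not GemStone's actual design.

```python
import zlib

NUM_BUCKETS = 12  # assumed fixed number of buckets that keys hash into


def bucket_of(key):
    # crc32 is stable across processes, unlike Python's built-in str hash.
    return zlib.crc32(key.encode("utf-8")) % NUM_BUCKETS


class Fabric:
    """Toy data fabric; needs at least two nodes so the backup differs."""

    def __init__(self, node_names):
        # node name -> {bucket -> {key: value}}
        self.nodes = {name: {} for name in node_names}
        # Partitioning metadata: bucket -> (primary node, backup node).
        self.metadata = {}
        self._repartition()

    def _repartition(self):
        names = sorted(self.nodes)
        for bucket in range(NUM_BUCKETS):
            primary = names[bucket % len(names)]
            backup = names[(bucket + 1) % len(names)]
            new_owners = (primary, backup)
            old_owners = self.metadata.get(bucket)
            if old_owners is not None and old_owners != new_owners:
                # Move the bucket's entries from the old owners to the new ones.
                entries = dict(self.nodes[old_owners[0]].get(bucket, {}))
                for name in old_owners:
                    self.nodes[name].pop(bucket, None)
                for name in new_owners:
                    self.nodes[name][bucket] = dict(entries)
            self.metadata[bucket] = new_owners

    def put(self, key, value):
        bucket = bucket_of(key)
        # Synchronous redundant write: primary and backup are both updated.
        for name in self.metadata[bucket]:
            self.nodes[name].setdefault(bucket, {})[key] = value

    def get(self, key):
        bucket = bucket_of(key)
        primary, _backup = self.metadata[bucket]
        # Single hop: the metadata names the owner, so no lookup service is asked.
        return self.nodes[primary].get(bucket, {}).get(key)

    def add_node(self, name):
        self.nodes[name] = {}
        self._repartition()

    def copies(self, key):
        bucket = bucket_of(key)
        return sum(1 for held in self.nodes.values() if key in held.get(bucket, {}))


fabric = Fabric(["node-a", "node-b", "node-c"])
keys = [f"trade-{i}" for i in range(50)]
for i, key in enumerate(keys):
    fabric.put(key, i)

fabric.add_node("node-d")  # capacity grows; buckets are reassigned
values_after = [fabric.get(key) for key in keys]
primaries = {owners[0] for owners in fabric.metadata.values()}
```

Round-robin reassignment moves more buckets than strictly necessary when a node joins. A production system would move as few as possible, but the simpler rule keeps the sketch short.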
Collocate data for near-infinite scale
[Figure: distributed apps, each with a local cache and partitioning metadata, reach data partitions A1 through I1 in a single hop]
Different Partitioning policies
Hash partitioning
Suitable for key based access
Uniform random hashing
Dramatically scale by keeping all related data together
Application managed - associations
Orders hash partitioned but associated line items are collocated
Application managed
Grouped on data object field(s)
Customize what is collocated
Example: ‘Manage all Sept trades in one data partition’
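The partitioning policies above differ only in which value is fed to the partition function. The sketch below assumes a fixed partition count and a crc32-based hash. The function names and record fields (order_id, item_id, month) are hypothetical and chosen to mirror the orders/line items and Sept trades examples on the slide.

```python
import zlib

NUM_PARTITIONS = 8  # assumed partition count


def partition_for(routing_key):
    # Stable hash of the routing key, reduced to a partition number.
    return zlib.crc32(str(routing_key).encode("utf-8")) % NUM_PARTITIONS


def order_partition(order):
    # Plain hash partitioning on the order's own key.
    return partition_for(order["order_id"])


def line_item_partition(item):
    # Application-managed association: route by the parent order id,
    # not by the item's own id, so items land next to their order.
    return partition_for(item["order_id"])


def trade_partition(trade):
    # Application-managed grouping on a data field: all trades of one
    # month share a partition.
    return partition_for(trade["month"])


order = {"order_id": 42}
items = [{"item_id": n, "order_id": 42} for n in range(5)]
item_partitions = {line_item_partition(item) for item in items}

sept_trades = [{"trade_id": n, "month": "Sept"} for n in range(100)]
sept_partitions = {trade_partition(trade) for trade in sept_trades}
```

Grouping on a low-cardinality field such as month trades balance for locality: one busy month means one busy partition.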
Move business logic to data
[Figure: functions f1, f2, … fn are submitted through a FIFO queue to data fabric resources, which execute them next to the data, for example the Sept trades]
Submit (f1) -> AggregateHighValueTrades(<input data>, "where trades.month='Sept'")
Principle: Move task to computational resource with most of the relevant data before considering other nodes where data transfer becomes necessary
Fabric function execution service
Data dependency hints
Routing key, collection of keys, “where clause(s)”
Serial or parallel execution
“Map Reduce”
Parallel queries
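As a rough illustration of the function execution service, the sketch below routes a function to the partition named by a routing key, or runs it on every partition in parallel and combines the partial results in a map-reduce style. The in-process PARTITIONS dict stands in for data held on separate nodes. The function names, the sample trades and the one-million threshold are all invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical layout: one partition per month, each held by a different node.
PARTITIONS = {
    "Aug": [
        {"symbol": "IBM", "value": 1_500_000},
        {"symbol": "MSFT", "value": 40_000},
    ],
    "Sept": [
        {"symbol": "IBM", "value": 2_500_000},
        {"symbol": "ORCL", "value": 1_200_000},
        {"symbol": "MSFT", "value": 75_000},
    ],
}


def aggregate_high_value_trades(trades, threshold=1_000_000):
    # Runs next to the data: only the aggregate leaves the node.
    return sum(t["value"] for t in trades if t["value"] >= threshold)


def execute_on(routing_key, function):
    """Data dependency hint: run only where the routing key's data lives."""
    return function(PARTITIONS[routing_key])


def execute_everywhere(function, combine):
    """Map the function over every partition in parallel, then reduce."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(function, PARTITIONS.values()))
    return combine(partials)


sept_total = execute_on("Sept", aggregate_high_value_trades)
grand_total = execute_everywhere(aggregate_high_value_trades, sum)
```

Threads in one process only mimic the shape of the call. In a real grid each partition's function would run on the node that owns the data.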
Query execution for Hash policy
Parallelize query to each relevant node
Each node executes query in parallel using local indexes on data subset
Query result is streamed to coordinating node
Individual results are unioned for final result set
This “scatter-gather” algorithm can waste CPU cycles
Partition the data on the common filter
For instance, most queries are filtered on a Trade symbol
Query predicate can be analyzed to prune partitions
1. select * from Trades where trade.month = August
2. Parallel query execution
3. Parallel streaming of results
4. Results returned
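The scatter-gather steps and the pruning optimization can be sketched as follows, assuming trades are partitioned on the month field. When the predicate filters on the partitioning field, only one partition is visited. Otherwise the query fans out to all of them and the partial results are unioned. The data, field names and query function are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

PARTITION_FIELD = "month"  # assumed partitioning field

# Hypothetical partitions, each one living on a different node.
PARTITIONS = {
    "July": [{"symbol": "IBM", "month": "July"}],
    "August": [
        {"symbol": "IBM", "month": "August"},
        {"symbol": "ORCL", "month": "August"},
    ],
    "Sept": [{"symbol": "IBM", "month": "Sept"}],
}


def scan(rows, field, value):
    # Local execution on one node's subset of the data.
    return [row for row in rows if row[field] == value]


def query(field, value):
    """Return (matching rows, number of partitions visited)."""
    if field == PARTITION_FIELD:
        # Prune: the predicate names the partition, so skip the others.
        targets = [value] if value in PARTITIONS else []
    else:
        # Scatter: every partition may hold matching rows.
        targets = list(PARTITIONS)
    with ThreadPoolExecutor() as pool:
        partials = list(
            pool.map(lambda name: scan(PARTITIONS[name], field, value), targets)
        )
    # Gather: union the per-node results on the coordinating node.
    rows = [row for part in partials for row in part]
    return rows, len(targets)


august_rows, august_visited = query("month", "August")
ibm_rows, ibm_visited = query("symbol", "IBM")
```

The second query shows the cost the slide warns about: filtering on symbol touches every partition because the data is grouped by month.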
Key lessons
Apps should think about capitalizing on memory across the Grid (it is abundant)
Keep IO cycles to a minimum through main memory caching of operational data sets
Scavenge Grid memory and avoid data source access
Achieve near-infinite scale for your Grid apps by horizontally partitioning your data and behavior