6. Traditional Processing - RDBMS
Scale-up database: use a traditional RDBMS; stored procedures; flash memory to reduce I/O; read-only replicas.
Limitations: doesn't scale on writes; extremely expensive (HW + SW).
7. Traditional Processing - CEP
Process the data as it comes; maintain a small fraction of the data in memory.
Pros: low latency; relatively low cost.
Cons: hard to scale (mostly limited to scale-up); not agile, since queries must be pre-generated; fairly complex.
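To make "process the data as it comes" and "maintain a small fraction of the data in memory" concrete, here is a minimal Python sketch of a time-based sliding window. It is not taken from any CEP product; the class name `SlidingWindowAverage` and its methods are hypothetical, and the single hard-coded aggregate stands in for a pre-generated query.

```python
from collections import deque


class SlidingWindowAverage:
    """Hypothetical CEP-style operator: keeps only recent events in memory."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, value) pairs, oldest first
        self.total = 0.0

    def on_event(self, timestamp, value):
        # Process the event as it arrives and update the running total.
        self.events.append((timestamp, value))
        self.total += value
        # Evict every event that has fallen out of the time window.
        cutoff = timestamp - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            _, old_value = self.events.popleft()
            self.total -= old_value
        # The pre-registered "query": average over the current window.
        return self.total / len(self.events)
```

Because the aggregate is fixed in code, asking a different question means deploying a new operator, which is one way to read the "not agile" point on the slide.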
11. NoSQL DB
Distributed database: HBase, Cassandra, MongoDB.
Pros: scales on write/read; elastic.
Cons: high latency on read (tunable); consistency tradeoffs are hard; non-transactional.
12. Hadoop Map/Reduce
Distributed batch processing.
Pros: designed to process massive amounts of data; mature; low cost.
Cons: not real-time; new programming model; HDFS must be carefully tuned to improve data locality.
13. So what's the bottom line?
A one-size-fits-all model doesn't cut it. The solution has to be a combination of several technologies and patterns.
18. GigaSpaces
GigaSpaces delivers software middleware that provides enterprises and ISVs with end-to-end application scalability and cloud enablement for mission-critical applications, serving hundreds of tier-1 organizations worldwide.
19. GigaSpaces XAP Components
Java, .NET, C++, Ruby, Groovy, Jython, Spring
JPA, JMS, JDBC, schema-free
Customize application management: rules and workflows
One clustering model for all components
Run the entire application in memory… transaction-safe
In-Memory Data Grid
Real-time automated deployment, monitoring, management
Virtualize all middleware components
42. Main Features Used
Data Partitioning: Transparent content-based data partitioning to distribute data evenly and intelligently across your data-grid cluster.
Querying: Sophisticated query engine with support for SQL and example-based queries.
Indexing: Predefined and ad-hoc property indexing for blazing-fast data access.
Write Behind: Asynchronous and reliable propagation of data to any external data source.
Locking Support: Locking and transaction isolation for robust and hassle-free data access.
Master-Worker Support: Intuitive and highly scalable master-worker implementation for distributing computation-intensive tasks.
Distributed Code Execution: Dynamic code shipment and map/reduce execution across the grid for optimized processing and data access.
Content Based Routing: Routing of events to relevant cluster members based on their content.
Workflow Support: Implement complex workflows using event propagation and sophisticated event filtering.
Admin API: Comprehensive and intuitive API for monitoring and controlling every aspect of your cluster and application.
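As an illustration of content-based data partitioning and routing, the sketch below hashes a routing value taken from the object's content to pick a partition. This is a plain-Python model, not the GigaSpaces API: `partition_for`, `PartitionedGrid`, and the choice of CRC32 as the hash are all assumptions made for the example.

```python
import zlib


def partition_for(routing_value, partition_count):
    """Map a routing value to a partition index in a stable way."""
    # CRC32 is used instead of hash() so the result is the same across runs.
    digest = zlib.crc32(str(routing_value).encode("utf-8"))
    return digest % partition_count


class PartitionedGrid:
    """Hypothetical in-memory grid split into a fixed number of partitions."""

    def __init__(self, partition_count):
        self.partitions = [dict() for _ in range(partition_count)]

    def write(self, key, obj, routing_value):
        # The routing value (part of the object's content) picks the partition.
        index = partition_for(routing_value, len(self.partitions))
        self.partitions[index][key] = obj
        return index

    def read(self, key, routing_value):
        # Reads with the same routing value go straight to the same partition.
        index = partition_for(routing_value, len(self.partitions))
        return self.partitions[index].get(key)
```

Objects that share a routing value, for example all trades of one counterparty, always land in the same partition, which is what lets logic run next to the data it needs.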
43. Elastic Calculation Engine - Colocated Logic
Step 1 - The client sends a calculation Task to each partition with the specific Trade IDs required.
Step 2 - The Task reads all the Trade objects and performs the NPV calculation for each Trade. The result is sent back to the client for final aggregation.
Step 3 - The calculation Task searches for all Trades. Any missing Trades are loaded lazily from the DB in one bulk query.
Step 4 - Intermediate results are retrieved from each partition and reduced.
The Data Grid and the calculation grid scale together.
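The colocated flow can be sketched in plain Python as a small map/reduce. This is a simplified model under stated assumptions, not GigaSpaces code: `Partition`, `run_task`, and `calculate` are hypothetical names, a dict stands in for the DB, a thread pool stands in for the grid, and a trade is reduced to a list of yearly cash flows.

```python
from concurrent.futures import ThreadPoolExecutor


def npv(cash_flows, rate):
    """Net present value, where cash_flows[t] is paid at the end of year t."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


class Partition:
    """Hypothetical grid partition holding trades in memory."""

    def __init__(self, trades, database):
        self.trades = dict(trades)  # trade_id -> cash flows
        self.database = database    # dict standing in for the backing DB

    def run_task(self, trade_ids, rate):
        # Step 3: find missing trades and load them in one bulk lookup.
        missing = [tid for tid in trade_ids if tid not in self.trades]
        if missing:
            self.trades.update({tid: self.database[tid] for tid in missing})
        # Step 2: compute NPV for each trade, next to the data.
        return {tid: npv(self.trades[tid], rate) for tid in trade_ids}


def calculate(partitions, ids_by_partition, rate):
    # Step 1: the client sends one task per partition with its trade IDs.
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(partition.run_task, ids, rate)
            for partition, ids in zip(partitions, ids_by_partition)
        ]
        partials = [future.result() for future in futures]
    # Step 4: reduce the intermediate results into one total.
    return sum(sum(partial.values()) for partial in partials)
```

Since each task runs inside the partition that owns the trades, adding partitions adds both storage and compute, which matches the slide's point that the two grids scale together.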
44. Elastic Calculation Engine - Remote Logic
Step 1 - The client sends calculation Requests to the space cluster.
Step 2 - Each calculation engine consumes a different Request, processes it, and writes the Result back into the space, using a local cache for reference data.
Step 3 - The calculation logic searches for all Trades. Any missing Trades are loaded lazily from the DB in one bulk query and written into the space to be reused later.
Step 4 - The client consumes all the calculation results and performs the final aggregation.
Calculation engines scale on demand, separately from the Data Grid.
The Data Grid and the calculation grid scale independently.
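The remote-logic flow is a master-worker pattern, which can be approximated with two queues. The sketch below is an assumption-laden stand-in, not GigaSpaces code: `worker` and `run` are hypothetical names, the queues and the shared `trade_cache` dict play the role of the space, a dict plays the role of the DB, and the local cache for reference data is not modeled.

```python
import queue
import threading


def worker(requests, results, trade_cache, database):
    """Hypothetical calculation engine running outside the data grid."""
    while True:
        request = requests.get()  # Step 2: consume a Request
        if request is None:       # sentinel: no more work for this worker
            break
        trade_id, rate = request
        # Step 3: load a missing trade from the DB and keep it for reuse.
        if trade_id not in trade_cache:
            trade_cache[trade_id] = database[trade_id]
        cash_flows = trade_cache[trade_id]
        value = sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))
        results.put((trade_id, value))  # Step 2: write the Result back


def run(requests_to_send, database, worker_count=2):
    requests, results = queue.Queue(), queue.Queue()
    trade_cache = {}
    workers = [
        threading.Thread(target=worker,
                         args=(requests, results, trade_cache, database))
        for _ in range(worker_count)
    ]
    for thread in workers:
        thread.start()
    for request in requests_to_send:  # Step 1: the client sends Requests
        requests.put(request)
    for _ in workers:                 # one sentinel per worker
        requests.put(None)
    for thread in workers:
        thread.join()
    total = 0.0                       # Step 4: the client aggregates Results
    while not results.empty():
        _, value = results.get()
        total += value
    return total
```

Here `worker_count` can change without touching the data structures that hold the trades, which mirrors the slide's claim that the calculation engines scale independently of the Data Grid.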
45. Demos
Simple IMDG operations: IMDG write, read, execute…
IMDG and NoSQL DB integration: Cassandra, MongoDB.
Calculation engine: small-scale demo; large-scale demo on the cloud.