NoSQL meetup July 2011

1. NoSQL meetup July 2011: Real-Time processing with In-Memory-Data-Grid and NoSQL. Shay Hassidim, Deputy CTO, GigaSpaces Inc., shay@gigaspaces.com

2. Agenda: Slides - 30 min; Live Demos - 45 min; Q&A - 15 min

3. Real-Time Processing Use Cases: Risk - calculation engines; Call center management; E-commerce - auction monitoring, inventory; Gaming - multi-user, on-line gaming; On-line marketing - improve conversion rate; Weather reporting; Traffic analysis; Supply-chain optimization; Manufacturing - quality management in shipment & delivery monitoring; Fraud detection

4. Note the Time dimension

5. Data resolution & processing models

6. Traditional Processing - RDBMS. Scale-up database: use a traditional RDBMS; stored procedures; flash memory to reduce I/O; read-only replicas. Limitations: doesn't scale on write; extremely expensive (HW + SW).

7. Traditional Processing - CEP. Process the data as it comes; maintain a small fraction of the data in-memory. Pros: low latency; relatively low cost. Cons: hard to scale (mostly limited to scale-up); not agile - queries must be pre-generated; fairly complex.
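
The CEP idea on this slide (process each event as it arrives, keep only a small recent slice in memory) can be illustrated with a sliding-window average. This is a minimal sketch, not any CEP product's API: the class name `SlidingWindowAverage`, the 10-second window, and the sample events are all assumptions made for illustration.

```python
from collections import deque


class SlidingWindowAverage:
    """Keeps only the events from the last `window_seconds` in memory (hypothetical helper)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, value) pairs, oldest first
        self.total = 0.0

    def on_event(self, timestamp, value):
        """Process one event as it arrives and return the current window average."""
        self.events.append((timestamp, value))
        self.total += value
        # Evict events that fell out of the window; older data is simply dropped.
        while self.events and self.events[0][0] <= timestamp - self.window_seconds:
            _, old_value = self.events.popleft()
            self.total -= old_value
        return self.total / len(self.events)


# Made-up sample events: (timestamp in seconds, value).
window = SlidingWindowAverage(window_seconds=10)
averages = [window.on_event(t, v) for t, v in [(0, 10.0), (5, 20.0), (12, 30.0)]]
```

The query (an average over a fixed window) is baked into the code before any data arrives, which is the "queries must be pre-generated" limitation listed above: asking a different question later is not possible once the old events have been evicted.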

9. Memory capacity is limited

10. SQL

11. NoSQL DB. Distributed database: HBase, Cassandra, MongoDB. Pros: scales on write/read; elastic. Cons: high latency on read (tunable); consistency tradeoffs are hard; non-transactional.

12. Hadoop Map/Reduce. Distributed batch processing. Pros: designed to process massive amounts of data; mature; low cost. Cons: not real-time; new programming model; HDFS must be carefully tuned to improve data locality.

13. So what's the bottom line? A one-size-fits-all model doesn't cut it. The solution has to be a combination of several technologies and patterns.

15. Java, .Net, C++

18. GigaSpaces. GigaSpaces delivers software middleware that provides enterprises and ISVs with end-to-end application scalability and cloud-enablement for mission-critical applications, serving hundreds of tier-1 organizations worldwide.

19. GigaSpaces XAP Components. APIs: Java, .Net, C++; Ruby, Groovy, Jython, Spring; JPA, JMS, JDBC; schema-free. Customize application management rules & workflows. One clustering model for all components. Run the entire application in-memory, transaction-safe. In-Memory Data Grid; real-time; automated deployment, monitoring, management; virtualize all middleware components.

21. IBM eXtreme Scale

22. Microsoft Velocity

23. Oracle Coherence

24. JBoss Infinispan

25. ScaleOut Software

26. Terracotta-EHCache

27. Tibco ActiveSpaces

28. VMware GemFire

29. GridGain

30. Hazelcast

32. Map-reduce

33. Event-driven

34. Execute code with data

35. Transactional

36. Secured

38. Write/Read scalability

39. Dynamic scaling

40. Raw Data and aggregated Data; Analytics Application; Generate Patterns

41. Use Case: Calculation Engine Design Patterns With XAP

42. Main Features Used
Data Partitioning: Transparent content-based data partitioning to evenly and intelligently distribute data across your data-grid cluster
Querying: Sophisticated query engine with support for SQL and example-based queries
Indexing: Predefined and ad-hoc property indexing for blazing fast data access
Write Behind: Asynchronous and reliable propagation of data to any external data source
Locking Support: Locking and transaction isolation for robust and hassle-free data access
Master-Worker Support: Intuitive and highly scalable master-worker implementation for distributing computation-intensive tasks
Distributed Code Execution: Dynamic code shipment and map/reduce execution across the grid for optimized processing and data access
Content Based Routing: Routing of events to relevant cluster members based on their content
Workflow Support: Implement complex workflows using event propagation and sophisticated event filtering
Admin API: Comprehensive and intuitive API for monitoring and controlling every aspect of your cluster and application
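
Content-based data partitioning, the first feature listed, can be sketched in a few lines: a field of the object itself decides which partition stores it. This is an illustration only and does not use the XAP API. The names `partition_for` and `PartitionedGrid`, the `book` routing field, and the CRC32 hash are assumptions; the slides do not say which hash function the product uses.

```python
import zlib


def partition_for(routing_value, partition_count):
    """Map a routing value to a partition index using a stable hash."""
    # zlib.crc32 is stable across processes, unlike Python's built-in hash() for str.
    return zlib.crc32(str(routing_value).encode("utf-8")) % partition_count


class PartitionedGrid:
    """In-process stand-in for a partitioned data grid (hypothetical)."""

    def __init__(self, partition_count, routing_field):
        self.routing_field = routing_field
        self.partitions = [[] for _ in range(partition_count)]

    def _partition_of(self, routing_value):
        return self.partitions[partition_for(routing_value, len(self.partitions))]

    def write(self, obj):
        # The object's own content (its routing field) selects the partition.
        self._partition_of(obj[self.routing_field]).append(obj)

    def read_by_routing_value(self, routing_value):
        # Only one partition has to be searched for a given routing value.
        return [obj for obj in self._partition_of(routing_value)
                if obj[self.routing_field] == routing_value]


# Made-up sample data: trades routed by the book they belong to.
grid = PartitionedGrid(partition_count=4, routing_field="book")
for trade_id, book in enumerate(["rates", "fx", "rates", "credit", "fx"]):
    grid.write({"id": trade_id, "book": book})
rates_trades = grid.read_by_routing_value("rates")
```

Because all objects with the same routing value land in the same partition, a read or a task that targets one routing value touches a single partition, which is also the basis for content-based routing of events.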

43. Elastic Calculation Engine - Colocated Logic
Step 1 - The client sends a calculation Task to each partition with the specific Trade IDs required.
Step 2 - The Task reads all the Trade objects and performs the NPV calculation for each Trade. The result is sent back to the client for final aggregation.
Step 3 - The Calculation Task searches for all Trades. Any missing Trades are loaded lazily from the DB in one bulk query.
Step 4 - Intermediate results are retrieved from each partition and reduced.
The Data-Grid and the calculations Grid scale together.
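
The colocated pattern above can be sketched as a small in-process simulation. This is not XAP code: the `Partition` class, the `calculate` function, the modulo routing of Trade IDs, the dict standing in for the DB, and the sample cash flows are all assumptions made for illustration, and the partitions run sequentially here rather than in parallel on separate machines.

```python
def npv(cash_flows, rate):
    """Net present value, where cash_flows[t] is the amount paid at period t."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


class Partition:
    """One data-grid partition holding Trade cash flows in memory (hypothetical)."""

    def __init__(self, database):
        self.trades = {}          # trade_id -> list of cash flows
        self.database = database  # stand-in for the backing DB
        self.bulk_queries = 0     # counts simulated bulk DB queries

    def execute(self, trade_ids, rate):
        """Calculation task that runs next to the data."""
        # Step 3: load every missing Trade from the DB in one bulk query.
        missing = [tid for tid in trade_ids if tid not in self.trades]
        if missing:
            self.bulk_queries += 1
            self.trades.update({tid: self.database[tid] for tid in missing})
        # Step 2: compute the NPV of each Trade and return a partial result.
        return sum(npv(self.trades[tid], rate) for tid in trade_ids)


def calculate(partitions, trade_ids, rate):
    # Step 1: send each partition a task with only the Trade IDs it owns.
    ids_per_partition = [[] for _ in partitions]
    for tid in trade_ids:
        ids_per_partition[tid % len(partitions)].append(tid)
    partials = [p.execute(ids, rate) for p, ids in zip(partitions, ids_per_partition)]
    # Step 4: reduce the intermediate results into the final value.
    return sum(partials)


# Made-up sample trades; trade 2 is already in the grid, the rest are loaded lazily.
database = {1: [-100.0, 60.0, 60.0], 2: [-50.0, 30.0, 30.0], 3: [-10.0, 20.0], 4: [5.0]}
partitions = [Partition(database), Partition(database)]
partitions[0].trades[2] = database[2]
portfolio_npv = calculate(partitions, [1, 2, 3, 4], rate=0.05)
```

Only Trade IDs and partial sums cross the (simulated) network; the Trade objects never leave the partition that owns them, which is why the data and the calculation capacity scale together in this variant.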

44. Elastic Calculation Engine - Remote Logic
Step 1 - The client sends calculation Requests to the space cluster.
Step 2 - Each Calculation engine consumes a different Request, processes it and writes the Result back into the space, using a local cache for reference data.
Step 3 - The Calculation logic searches for all Trades. Any missing Trades are loaded lazily from the DB in one bulk query and written into the space to be reused later.
Step 4 - The client consumes all the calculation results and performs final aggregation.
The calculation engines scale on demand, separately from the Data-Grid. The Data Grid and the calculations Grid scale independently.
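
The remote-logic variant is a master-worker pattern, sketched below with threads and queues standing in for calculation engines and the space. None of this is XAP API: `worker`, `run`, the sentinel-based shutdown, the lock-protected dict used as the space, and the placeholder "calculation" (summing made-up trade values) are assumptions for illustration.

```python
import queue
import threading


def worker(requests, results, space, database, lock):
    """Calculation engine: consume Requests and write Results back."""
    while True:
        request = requests.get()
        if request is None:  # sentinel: no more work for this engine
            break
        request_id, trade_ids = request
        with lock:
            # Step 3: bulk-load missing Trades and keep them in the space for reuse.
            missing = [tid for tid in trade_ids if tid not in space]
            space.update({tid: database[tid] for tid in missing})
            values = [space[tid] for tid in trade_ids]
        # Step 2: process the Request (placeholder calculation) and write the Result.
        results.put((request_id, sum(values)))


def run(database, batches, worker_count):
    requests, results = queue.Queue(), queue.Queue()
    space, lock = {}, threading.Lock()
    workers = [
        threading.Thread(target=worker, args=(requests, results, space, database, lock))
        for _ in range(worker_count)
    ]
    for w in workers:
        w.start()
    # Step 1: the client writes one calculation Request per batch of Trade IDs.
    for request_id, trade_ids in enumerate(batches):
        requests.put((request_id, trade_ids))
    # Step 4: the client consumes one Result per Request and aggregates them.
    total = sum(results.get()[1] for _ in batches)
    for _ in workers:
        requests.put(None)
    for w in workers:
        w.join()
    return total, space


# Made-up sample data: trade_id -> value of the trade.
database = {1: 10.0, 2: 20.0, 3: 30.0, 4: 40.0}
total, space = run(database, batches=[[1, 2], [3], [4, 1]], worker_count=2)
```

Here `worker_count` can be changed without touching how the data is stored, which mirrors the slide's point that the calculation engines and the data grid scale independently; the cost is that Trade data has to travel to the engines instead of the code travelling to the data.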

45. Demos. Simple IMDG operations: IMDG write, read, execute…; IMDG and NoSQL DB integration: Cassandra, MongoDB; Calculation Engine: small-scale demo, large-scale demo - on the Cloud
