
StrangeLoop 2012 presentation slides for "Making Hadoop Real Time with Scala & GridGain"

Published in: Technology

  • Comments:
    Example:
    Counts the number of non-space characters in the given argument by spreading the processing across multiple remote nodes.
    This is 100% of the source code. It can be written in Java or Groovy just as easily.
    Compile and run it. It works transparently on one computer or on thousands.
    Fully distributed and parallelized processing.
    No deployment or provisioning required: 100% zero deployment.
    Develop and start in the cloud in minutes.
    It automatically load balances and fails over, if necessary.
    It automatically scales up and down based on dynamically available resources.
  • Comments:
    In-memory computing is a high-growth industry:
    - SAP saw a 206% increase in profit in Q1 2012, in large part due to SAP HANA, its in-memory processing product
    - IBM grew in-memory computing revenue 80% in 2011
    - Software AG sees 3x revenue in 2012 for in-memory computing products
    - VMware is looking at a 30% increase in in-memory computing products

    Gartner references:
    - Top 10 Strategic Technology Trends: In-Memory Computing. Massimo Pezzini, Carl Claunch, Joseph Unsworth (G00230658)
    - Insight: Invest in In-Memory Computing for Breakthrough Competitive Advantage. Massimo Pezzini (G00226070)
    - What CIOs Need to Know About In-Memory Database Management Systems. Roxane Edjlali, Donald Feinberg (G00215735)
    - Need for Speed Powers In-Memory Business Intelligence. James Richardson (G00212168)
    - Predicts 2012: Cloud and In-Memory Drive Innovation in Application Platforms. Massimo Pezzini, Yefim Natis, Eric Knipp (G00226073)
  • Strangeloop2012
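
The example described in the slide notes (counting non-space characters by spreading the work across nodes) is not reproduced in this transcript. As a rough single-JVM sketch of the same split-and-aggregate idea, assuming a local thread pool stands in for the remote nodes, the program below splits the argument into words, counts each word's length as a separate job, and sums the results. It uses only `java.util.concurrent`, not the GridGain API, and the class and method names (`NonSpaceCount`, `count`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only: a local thread pool stands in for grid nodes.
public class NonSpaceCount {

    // Split: one job per whitespace-separated word.
    // Reduce: sum of the per-word lengths.
    public static int count(String text, int workers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Callable<Integer>> jobs = new ArrayList<>();
            for (String word : text.trim().split("\\s+")) {
                jobs.add(word::length);              // map step: length of one word
            }
            int total = 0;
            for (Future<Integer> result : pool.invokeAll(jobs)) {
                total += result.get();               // reduce step: running sum
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        String input = args.length > 0 ? args[0] : "Hello World";
        System.out.println(count(input, 2));
    }
}
```

On a real grid the jobs would be shipped to other machines and the framework would handle load balancing and failover; here the thread pool only shows the shape of the split and the reduction.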

    1. Making Hadoop Real Time with Scala & GridGain
       Nikita Ivanov, Founder & CEO, GridGain Systems
       September 2012
       1065 East Hillsdale Blvd, Suite 230, Foster City, CA 94404
       #gridgain
    2. Table Of Contents:
       > 30%
         > Why Real Time Hadoop?
         > GridGain Overview
         > In-Memory Computing
         > Compute & Data
       > 70%
         > Live Coding
         > Real Time & Streaming Word Count
    3. Real Time Hadoop: Why?
    4. GridGain Overview:
       > Scalable In-Memory Data Platform
         In-Memory Compute Grid + In-Memory Data Grid
         Real Time & Streaming MapReduce, CEP
       > Full ACID
         Fully distributed ACID transactions
       > Simplicity and Productivity
         Dramatically reduces cost of application development
         Demonstrably faster time-to-market
       > Three Editions:
         > “Compute Grid” Edition: Targeted at HPC market
         > “Data Grid” Edition: Targeted at Transactional Data Caching market
         > “Big Data” Edition: Targeted at Real Time Big Data market
       > Language support: Server: Java, Scala, Groovy; Clients: .NET, PHP, REST, C++
       > Mobile platforms support: iOS/ObjectiveC, Android
       Example: Full source code in Scala of world’s shortest real time MapReduce app built with GridGain. Works on one or thousands of computers with no code changes required.
    5. In-Memory Computing: Why Now?
       “In-memory will have an industry impact comparable to web and cloud. RAM is a new disk, and disk is a new tape.”
       Technology & Cost:
       > 64-bit CPU can address 16 exabytes
         Entire active data set on the planet is addressable by just 1 CPU.
       > Disk up to 10^5 times slower than DRAM
         SSD drives are up to 10^3 times slower
       > Super effective in-memory parallelization
         Enabled by modern multicore CPUs
       > DRAM prices drop 30% every 18 months
         1TB RAM & 48 cores cluster ~ $40K (< $20K in 3 years)
       Performance & Scalability Matters:
       > Citi: 100ms == $1M loss
         Forex trading
       > Google: 500ms == 20% traffic drop
         Dropping 20% of revenue
       > SAP sees +206% in profit in Q1 2012
         For in-memory SAP HANA products
       > Software AG sees 3x revenue in 2012
         For in-memory Terracotta products
    6. GridGain: In-Memory Compute Grid
       > Direct API for split and aggregation
       > Pluggable failover, topology and collision resolution
       > Distributed task session
       > Distributed continuations & recursive split
       > Support for Streaming MapReduce
       > Support for Complex Event Processing (CEP)
       > Node-local cache
       > AOP-based, OOP/FP-based execution modes
       > Direct closure distribution in Java, Scala and Groovy
       > Cron-based scheduling
       > Direct redundant mapping support
       > Zero deployment with P2P class loading
       > Partial asynchronous reduction
       > Direct support for weighted and adaptive mapping
       > State checkpoints for long running tasks
       > Early and late load balancing
       > Affinity routing with data grid
    7. GridGain: In-Memory Data Grid
       > Zero deployment for data
       > Local, fully replicated and partitioned cache types
       > Pluggable expiration policies (LRU, LIRS, random, time-based)
       > Read-through and write-through logic with pluggable cache store
       > Synchronous and asynchronous cache operations
       > MVCC-based concurrency
       > Pluggable data overflow storage via new swap space SPI
       > PESSIMISTIC, OPTIMISTIC transactions
       > Standard isolation levels, JTA/JCA integration
       > Master/Master data replication/invalidation
       > Write-behind cache store support
       > Concurrent and transactional data preloading
       > Delayed preloading support
       > Affinity routing with compute grid
       > Partitioned cache with active replicas
       > Structured and unstructured data
       > Datacenter replication
       > JDBC driver for in-memory object data store
       > Off-heap memory support
       > Pluggable indexing via Indexing SPI
       > Tiered storage with on-heap, off-heap, swap space, SQL, and Hadoop
       > Distributed in-memory query capability
       > SQL, H2, Lucene, predicate-based affinity co-located queries
    8. Live Coding: GridGain + Scala
       > 100% Live Coding:
         > Nothing pre-built
         > Every line & character
         > Everything from the start
    9. Thank You! #gridgain
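
Slide 7 lists pluggable expiration policies such as LRU together with read-through and write-through logic over a pluggable cache store. As a minimal single-JVM sketch of how those three pieces can fit together, assuming a plain `Map` stands in for the backing store, the class below evicts the least recently used entry once a capacity is exceeded, loads misses from the store, and writes every update to the store first. This is not the GridGain API; `ReadThroughLruCache`, `isCached` and the `store` parameter are hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: not a GridGain class.
public class ReadThroughLruCache<K, V> {
    private final Map<K, V> store;            // stand-in for a persistent cache store
    private final LinkedHashMap<K, V> cache;  // in-memory tier with LRU eviction

    public ReadThroughLruCache(Map<K, V> store, final int capacity) {
        this.store = store;
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.cache = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;     // LRU expiration policy
            }
        };
    }

    public V get(K key) {
        V value = cache.get(key);
        if (value == null) {
            value = store.get(key);           // read-through on a cache miss
            if (value != null) {
                cache.put(key, value);
            }
        }
        return value;
    }

    public void put(K key, V value) {
        store.put(key, value);                // write-through: store first
        cache.put(key, value);
    }

    // Does not count as an access, so it leaves the LRU order unchanged.
    public boolean isCached(K key) {
        return cache.containsKey(key);
    }

    public static void main(String[] args) {
        Map<String, Integer> store = new HashMap<>();
        store.put("a", 1);
        ReadThroughLruCache<String, Integer> cache = new ReadThroughLruCache<>(store, 2);
        System.out.println(cache.get("a"));   // loaded from the store on first access
        cache.put("b", 2);
        cache.put("c", 3);                    // capacity is 2, so "a" is evicted
        System.out.println(cache.isCached("a"));
    }
}
```

A write-behind variant, also listed on the slide, would queue the store update and apply it asynchronously instead of calling `store.put` inline.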