In-memory data and compute on top of Hadoop
In-memory data and compute on top of Hadoop

Speakers: Anthony Baker and Jags Ramnarayan
Hadoop gives us dramatic volume scalability at a low price. But core Hadoop is designed for sequential access (write once, read many times), which makes it impractical to use Hadoop from a real-time/online application. Add a distributed in-memory tier in front and you can get the best of both worlds: very high speed and concurrency, plus the ability to scale to very large volumes. We present the seamless integration of in-memory data grids with Hadoop to achieve interesting new design patterns: ingesting raw or processed data into Hadoop; random reads and writes on operational data in memory, or on massive historical data in Hadoop with O(1) lookup times; zero-ETL MapReduce processing; deep-scale SQL processing on data in Hadoop; and the ability to easily push analytic models from Hadoop back into memory. We introduce and present these ideas and code samples using Pivotal's in-memory real-time products and the Hadoop platform.

Presentation Transcript

  • In-memory data and compute on top of Hadoop Jags Ramnarayan – Chief Architect, Fast Data, Pivotal Anthony Baker – Architect, Fast Data, Pivotal © 2013 SpringOne 2GX. All rights reserved. Do not distribute without permission.
  • Agenda
    – In-memory data grid: concepts, strengths, weaknesses
    – HDFS: strengths, weaknesses
    – What is our proposal?
    – How do you use this? SQL syntax and demo
    – HDFS integration architecture and demo
    – MapReduce integration and demo; in-memory, parallel stored procedures
    – Comparison to HBase
  • “It is raining databases in the cloud”
    – Next-gen transactional DBs are memory-based, distributed, elastic, HA, and cloud-ready: in-memory data grids (IMDGs), NoSQL, caching. Examples: Pivotal GemFire, Oracle Coherence, Redis, Cassandra, …
    – Next-gen OLAP DBs are centered around Hadoop. The driver: they say it is “volume, velocity, variety”. Or is it just cost/TB? (The 451 Group)
  • IMDG basic concepts
    – Distributed, memory-oriented store: KV/objects or SQL; queryable, indexable, and transactional
    – Multiple storage models: replication or partitioning in memory, with synchronous copies in the cluster; overflow to disk and/or an RDBMS
    – Parallelize Java app logic
    – Multiple failure-detection schemes
    – Dynamic membership (elastic)
    – Vendors differentiate on SQL support, WAN replication, events, etc.
    (Diagram: a replicated region gives low latency to thousands of concurrent clients via synchronous replication of slow-changing data; a partitioned region with redundant copies handles large or highly transactional data.)
  • Key IMDG pattern: distributed caching
    – Designed to work with existing RDBs
    – Read-through: fetch from the DB on a cache miss
    – Write-through: reflect the write in the cache only if the DB write succeeds
    – Write-behind: reliable, in-order queue with batched writes to the DB
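The read-through and write-behind patterns above can be sketched in a few lines of plain Java. This is a minimal, single-process illustration (no IMDG APIs); the class name and the map standing in for the RDB are inventions for the example.

```java
import java.util.*;
import java.util.concurrent.*;

// Sketch of read-through / write-behind caching. The "database" here is just
// an in-memory map standing in for an RDBMS.
public class WriteBehindCache {
    private final Map<String, String> db = new ConcurrentHashMap<>();    // stand-in for the RDB
    private final Map<String, String> cache = new ConcurrentHashMap<>(); // the in-memory tier
    private final Queue<Map.Entry<String, String>> writeQueue = new ConcurrentLinkedQueue<>();

    // Read-through: on a cache miss, fetch from the backing store and populate the cache.
    public String get(String key) {
        return cache.computeIfAbsent(key, db::get);
    }

    // Write-behind: update memory immediately, enqueue the DB write for later batching.
    public void put(String key, String value) {
        cache.put(key, value);
        writeQueue.add(Map.entry(key, value));
    }

    // Batch flush, as a write-behind worker thread would do asynchronously.
    public int flush() {
        int n = 0;
        Map.Entry<String, String> e;
        while ((e = writeQueue.poll()) != null) {
            db.put(e.getKey(), e.getValue());
            n++;
        }
        return n;
    }

    public String dbValue(String key) { return db.get(key); }
    public void seedDb(String k, String v) { db.put(k, v); }
}
```

In a real grid the flush would run on a background thread and the queue would be replicated for reliability, which is exactly where the "queue may have to be persistent" caveat on the next slides comes from.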
  • Traditional RDB integration can be challenging
    – Synchronous “write-through”: the DB becomes a single point of bottleneck and failure; not an option for write-heavy workloads; requires a complex two-phase commit protocol; parallel recovery is difficult
    – Asynchronous “write-behind” (updates flow into a queue and a DB writer applies them in batches): still cannot sustain high write rates; the queue may have to be persistent; parallel recovery is difficult
  • Some IMDGs and NoSQL stores offer “shared-nothing persistence”
    – Each member appends its memory-table updates to append-only operation logs (through OS buffers), with a log compactor running alongside
    – Fully parallel; zero disk seeks
    – But a cluster restart requires scanning the logs, and very large volumes pose challenges
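The append-only operation log above, and the restart cost it implies, can be sketched as follows. This is an illustrative in-process model (a list stands in for the on-disk log); the class and record names are made up.

```java
import java.util.*;

// Sketch of shared-nothing, append-only operation-log persistence: every
// put/delete is appended to a log; a restart rebuilds the in-memory table
// by scanning the log from the beginning.
public class OpLogStore {
    record Op(String type, String key, String value) {}

    private final List<Op> log = new ArrayList<>();       // stand-in for the on-disk oplog
    private final Map<String, String> memTable = new HashMap<>();

    public void put(String key, String value) {
        log.add(new Op("PUT", key, value));               // sequential append, no seeks
        memTable.put(key, value);
    }

    public void delete(String key) {
        log.add(new Op("DEL", key, null));
        memTable.remove(key);
    }

    public String get(String key) { return memTable.get(key); }

    // Recovery: replay the whole log in order. Cost is O(log length), which is
    // why restarts get slow at very large volumes, as the slide notes.
    public Map<String, String> recover() {
        Map<String, String> rebuilt = new HashMap<>();
        for (Op op : log) {
            if (op.type().equals("PUT")) rebuilt.put(op.key(), op.value());
            else rebuilt.remove(op.key());
        }
        return rebuilt;
    }
}
```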
  • Hadoop core (HDFS) for scalable, parallel storage
    – Maturing, and will be ubiquitous
    – Handles very large data sets on commodity hardware
    – Handles failures well
    – Simple coherency model
  • Hadoop design center: batch and sequential access
    – 64 MB immutable blocks
    – For random reads, you have to sequentially walk through records each time
    – Write-once, read-many design
    – The NameNode can be a contention point
    – Slow failure detection
  • Hadoop strengths
    – Massive volumes (TB to PB)
    – HA, compression
    – An ever-growing, maturing ecosystem for parallel compute and analytics
    – Storage systems like Isilon now offer an HDFS interface
    – Optimized for virtual machines
  • SQL + IMDG (objects) + HDFS: a main-memory-based, distributed, low-latency data store for big data. Data comes in many shapes, so support multiple data models. Operational data is the focus and lives (mostly) in memory; all data, including history, lives in HDFS.
  • SQL + IMDG (objects) + HDFS: replication or partitioning, with a choice of storage model per table: in-memory only, in-memory with local disk, or in-memory with HDFS persistence.
  • SQL + IMDG (objects) + HDFS: a SQL engine designed for online/OLTP workloads with transactions, plus IMDG caching features (read-through, write-behind, etc.).
  • SQL + IMDG (objects) + HDFS: analytics can run directly on the HDFS data without going through the in-memory tier, as sequential scans or incremental processing. With parallel ingestion, you get near-real-time visibility of data for deep analytics. Tight HDFS integration covers both the streaming and the read/write cases.
  • SQL + IMDG (objects) + HDFS: a closed loop between real-time processing and analytics; a MapReduce ‘reduce’ can emit results directly into the in-memory tier.
  • GemFire XD – a Pivotal HD service
    – Working set in memory, geo-replicated; history and time series in HDFS
    – SQL interface (SQLFire) and objects/JSON interface (GemFire)
    – SQL engine: cost-based optimizer, in-memory indexing, distributed transactions, RDB integration, …
    – Plus clustering, in-memory storage, HA, replication, WAN, events, distributed queues, …
    – Pivotal HD: integrated install and config; Command Center for monitoring; optimizations to Hadoop
  • The real-time latency spectrum
    – Machine latency (milliseconds): GemFire XD; online/OLTP/operational DBs
    – Human interactions (seconds): interactive reports
    – Seconds to minutes: interactive analytics (PivotalHD HAWQ)
    – Minutes to hours: batch processing; analytics and data warehousing
  • Real time on top of Hadoop – who else? Many more (slide shows vendor logos); most are focused on interactive queries for analytics.
  • Design patterns: streaming ingest – consume unbounded event streams
    – Write fast into memory; stream all writes to HDFS for batch analytics (e.g., maintain the latest price for each security in memory, with the time series in HDFS)
    – Continuously ingest click streams, audit trails, or interaction data
    – Trap interactions or OLTP transactions, do in-line stream processing (actionable insights), and write the results or raw state into HDFS
  • Design patterns: high-performance operational database
    – Keep operational data in memory; history in HDFS remains randomly accessible (e.g., the last month of trades in memory, but all history accessible at some cost)
    – Take analytic output from Hadoop/SQL analytics and make it visible to online apps
  • Agenda: How do you use this? SQL syntax and demo
  • In-Memory Partitioning & Replication
  • Explore features using a simple star schema
    FLIGHTS:
      FLIGHT_ID CHAR(6) NOT NULL, SEGMENT_NUMBER INTEGER NOT NULL, ORIG_AIRPORT CHAR(3), DEPART_TIME TIME, …
      PRIMARY KEY (FLIGHT_ID, SEGMENT_NUMBER)
    FLIGHTAVAILABILITY (1–M from FLIGHTS):
      FLIGHT_ID CHAR(6) NOT NULL, SEGMENT_NUMBER INTEGER NOT NULL, FLIGHT_DATE DATE NOT NULL, ECONOMY_SEATS_TAKEN INTEGER, …
      PRIMARY KEY (FLIGHT_ID, SEGMENT_NUMBER, FLIGHT_DATE)
      FOREIGN KEY (FLIGHT_ID, SEGMENT_NUMBER) REFERENCES FLIGHTS (FLIGHT_ID, SEGMENT_NUMBER)
    FLIGHTHISTORY (1–1):
      FLIGHT_ID CHAR(6), SEGMENT_NUMBER INTEGER, ORIG_AIRPORT CHAR(3), DEPART_TIME TIME, DEST_AIRPORT CHAR(3), …
    Several code/dimension tables: AIRLINES (airline information, very static), COUNTRIES (countries served by flights), CITIES, MAPS (photos of regions served).
    Assume thousands of FLIGHTS rows and millions of FLIGHTAVAILABILITY records.
  • Creating tables
    CREATE TABLE AIRLINES (
      AIRLINE CHAR(2) NOT NULL PRIMARY KEY,
      AIRLINE_FULL VARCHAR(24),
      BASIC_RATE DOUBLE PRECISION,
      DISTANCE_DISCOUNT DOUBLE PRECISION, … );
    (Diagram: the table is visible on each GemFire XD member.)
  • Replicated tables
    CREATE TABLE AIRLINES (
      AIRLINE CHAR(2) NOT NULL PRIMARY KEY,
      AIRLINE_FULL VARCHAR(24),
      BASIC_RATE DOUBLE PRECISION,
      DISTANCE_DISCOUNT DOUBLE PRECISION, … )
    REPLICATE;
    Design pattern: replicate the reference tables in a star schema (they seldom change and are often referenced in queries). Each GemFire XD member holds a full copy.
  • Partitioned tables
    CREATE TABLE FLIGHTS (
      FLIGHT_ID CHAR(6) NOT NULL,
      SEGMENT_NUMBER INTEGER NOT NULL,
      ORIG_AIRPORT CHAR(3),
      DEST_AIRPORT CHAR(3),
      DEPART_TIME TIME,
      FLIGHT_MILES INTEGER NOT NULL)
    PARTITION BY COLUMN (FLIGHT_ID);
    Design pattern: partition the fact tables in a star schema for load balancing (they are large and write-heavy). Each GemFire XD member holds one partition alongside the replicated tables.
  • Partitioned but highly available
    CREATE TABLE FLIGHTS (
      FLIGHT_ID CHAR(6) NOT NULL,
      SEGMENT_NUMBER INTEGER NOT NULL,
      ORIG_AIRPORT CHAR(3),
      DEST_AIRPORT CHAR(3),
      DEPART_TIME TIME,
      FLIGHT_MILES INTEGER NOT NULL)
    PARTITION BY COLUMN (FLIGHT_ID)
    REDUNDANCY 1;
    Design pattern: increase the number of redundant copies for HA and to load-balance queries across replicas. Each member now also hosts redundant partitions for other members' data.
  • Colocation for related data
    CREATE TABLE FLIGHTAVAILABILITY (
      FLIGHT_ID CHAR(6) NOT NULL,
      SEGMENT_NUMBER INTEGER NOT NULL,
      … )
    PARTITION BY COLUMN (FLIGHT_ID)
    COLOCATE WITH (FLIGHTS);
    Design pattern: colocate related tables for maximum join performance; the colocated partitions live next to the FLIGHTS partitions on each member.
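Why colocation makes joins local can be sketched with a toy router: if both tables hash the same partitioning column into the same bucket space, related rows always land on the same member. The bucket count, hash, and class names below are illustrative, not GemFire XD's actual scheme.

```java
// Sketch of PARTITION BY COLUMN plus COLOCATE WITH: FLIGHTS and
// FLIGHTAVAILABILITY both route on FLIGHT_ID, so rows that join on
// FLIGHT_ID are guaranteed to live on the same member.
public class PartitionRouter {
    static final int BUCKETS = 113;   // illustrative bucket count

    // Deterministic bucket for a partitioning-column value.
    static int bucketFor(String partitionKey) {
        return Math.floorMod(partitionKey.hashCode(), BUCKETS);
    }

    // Bucket -> member, for a cluster of `members` servers.
    static int memberFor(int bucket, int members) {
        return bucket % members;
    }
}
```

Because `bucketFor("AA1116")` is the same for a FLIGHTS row and all of its FLIGHTAVAILABILITY rows, a join on FLIGHT_ID never crosses the network.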
  • Native disk-resident tables (operation logging)
    CREATE TABLE FLIGHTS (
      FLIGHT_ID CHAR(6) NOT NULL,
      SEGMENT_NUMBER INTEGER NOT NULL,
      … )
    PARTITION BY COLUMN (FLIGHT_ID)
    PERSISTENT;
    The data dictionary is always persisted on each server. Online backup:
    sqlf backup /export/fileServerDirectory/sqlfireBackupLocation
  • Demo environment: a SQL client connects via jdbc:sqlfire://localhost:1527 to a virtual machine running a GemFire XD locator, three GemFire XD servers, and Pulse (monitoring).
  • Demo: replicated and partitioned tables
  • Agenda: HDFS integration architecture and demo
  • Effortless HDFS integration. Options:
    – Fast streaming writes
    – Random read/write
    – With or without time series
  • Streaming all writes to HDFS
    CREATE HDFSSTORE streamingstore
      NAMENODE hdfs://PHD1:8020
      DIR /stream-tables
      BATCHSIZE 10
      BATCHTIMEINTERVAL 2000
      QUEUEPERSISTENT true;
    CREATE TABLE FLIGHTS (
      FLIGHT_ID CHAR(6) NOT NULL,
      SEGMENT_NUMBER INTEGER NOT NULL,
      … )
    PARTITION BY COLUMN (FLIGHT_ID)
    PERSISTENT HDFSSTORE streamingstore WRITEONLY;
  • Read and write to HDFS
    CREATE HDFSSTORE RWStore
      NAMENODE hdfs://PHD1:8020
      DIR /indexed-tables
      BATCHSIZE 10
      BATCHTIMEINTERVAL 2000
      QUEUEPERSISTENT true;
    CREATE TABLE FLIGHTS (
      FLIGHT_ID CHAR(6) NOT NULL,
      SEGMENT_NUMBER INTEGER NOT NULL,
      … )
    PARTITION BY COLUMN (FLIGHT_ID)
    PERSISTENT HDFSSTORE RWStore;
  • Write path – streaming to HDFS
    (Diagram: a SQL client writes to the primary bucket of table FLIGHTS; the update is appended to the local append-only store and replicated to the backup bucket on another member; a DFS client then batches the queued writes through the NameNode to the DataNodes, under a directory such as /GFXD/APP/FLIGHTS/BucketN.) The in-memory partitioned data is colocated with the HDFS DataNode.
  • Directory structure in HDFS
    – Time-stamped records allow incremental MapReduce jobs
    – Under /GFXD, a write-only table such as APP.FLIGHT_HISTORY and a read/write table such as APP.FLIGHTS each get one subdirectory per bucket (/0, /1, …)
    – Each bucket directory holds time-stamped files (e.g. 0-1-XXX.hop, 0-1-XXX.shop) containing the data, and for read/write tables also a bloom filter and an index per file
  • Read/write with compaction – now with sorting!
    The write path is the same as for streaming (client write to the primary bucket, local append-only store, replication to the backup, batched DFS writes under /GFXD/APP/FLIGHTS/BucketN), but the files are sorted and compacted: a log-structured merge tree, like HBase and Cassandra.
  • Read path for HDFS tables
    (Diagram: a SQL client read that misses the in-memory bucket goes through the DFS client and the NameNode to a DataNode, consulting the per-file bloom filters and indexes.) A short-circuit read path is used for local blocks, and a block cache avoids I/O for bloom-filter and index lookups.
  • Tiered compaction
    – Asynchronous writes allow lock-free sequential I/O, but more files mean slower reads
    – Compactions balance read and write throughput
    – Minor compactions merge small files into bigger files
    – Major compactions merge all files into one single file
    (Diagram: files arranged in time order across levels 0, 1, and 2, each file holding data, a bloom filter, and an index.)
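The core of a compaction step is just a merge of sorted runs in which the newest value for each key wins. Here is a minimal sketch using in-memory sorted maps as stand-ins for the sorted files; the class name and run representation are inventions for the example.

```java
import java.util.*;

// Sketch of one compaction step in a log-structured merge tree: several
// small sorted runs (newest first) are merged into one sorted run, keeping
// only the most recent value for each key. Fewer files means faster reads.
public class Compaction {
    // runs.get(0) is the newest file; earlier entries shadow later ones.
    static SortedMap<String, String> compact(List<SortedMap<String, String>> runs) {
        SortedMap<String, String> merged = new TreeMap<>();
        // Apply oldest -> newest so newer values overwrite older ones.
        for (int i = runs.size() - 1; i >= 0; i--) {
            merged.putAll(runs.get(i));
        }
        return merged;
    }
}
```

A real implementation streams the runs with a heap of iterators instead of materializing maps, and also drops deletion tombstones during a major compaction, but the shadowing rule is the same.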
  • “Closed loop” with analytics
    MapReduce, Pivotal HAWQ, and Hive read the GemFire XD data in HDFS through an InputFormat, and can write results back into the in-memory tier through an OutputFormat.
  • Demo environment with PivotalHD: the SQL client (jdbc:sqlfire://localhost:1527) connects to a virtual machine running a GemFire XD locator, three GemFire XD servers, a PivotalHD NameNode and DataNode, and Pulse (monitoring).
  • Demo: HDFS tables
  • Operational vs. historical data
    – Operational data is retained in memory for fast access
    – User-supplied criteria identify the operational data, enforced on incoming updates or periodically:
      CREATE TABLE flights_history (…)
        PARTITION BY PRIMARY KEY
        EVICTION BY CRITERIA (LAST_MODIFIED_DURATION > 300000)
        EVICTION FREQUENCY 60 SECONDS
        HDFSSTORE (bar);
    – Query hints or connection properties control the use of historical data:
      SELECT * FROM flights_history --PROPERTIES queryHDFS = true
      WHERE orig_airport = 'PDX' AND miles > 1000
      ORDER BY dest_airport
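The effect of an idle-time eviction criterion like `LAST_MODIFIED_DURATION > 300000` can be sketched as a periodic sweep over rows. This is a plain-Java illustration, not GemFire XD's implementation; the class and record names are invented.

```java
import java.util.*;

// Sketch of EVICTION BY CRITERIA on last-modified time: a periodic sweep
// keeps rows modified within the threshold in the operational (in-memory)
// set; the rest are evicted (in GemFire XD they would remain in HDFS and
// still be queryable with queryHDFS = true).
public class CriteriaEviction {
    record Row(String key, long lastModifiedMillis) {}

    // Returns the rows that stay operational after the sweep.
    static List<Row> sweep(List<Row> rows, long nowMillis, long maxIdleMillis) {
        List<Row> operational = new ArrayList<>();
        for (Row r : rows) {
            if (nowMillis - r.lastModifiedMillis() <= maxIdleMillis) {
                operational.add(r);   // still "hot": keep in memory
            }
            // else: evicted to the historical tier
        }
        return operational;
    }
}
```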
  • Agenda: MapReduce integration and demo
  • Hadoop Map/Reduce
    – Map/Reduce is a framework for processing massive data sets in parallel; Hadoop manages job execution
    – The Mapper acts on local file splits to transform individual data elements (the InputFormat supplies local data)
    – Hadoop sorts and shuffles the keys; the Reducer receives all values for a key and generates an aggregate result (the OutputFormat writes the result)
    – The Driver provides the job configuration; InputFormat and OutputFormat define the data source and sink
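The map, shuffle, and reduce phases above can be shown in a few lines of plain Java, using the deck's busy-airport example but with no Hadoop dependency. The class and method names are made up for the sketch.

```java
import java.util.*;

// Plain-Java sketch of map -> shuffle -> reduce: map emits (airport, 1) for
// each end of a flight leg, the shuffle groups the pairs by key, and reduce
// sums each group into a per-airport count.
public class MiniMapReduce {
    // Each flight leg is {origAirport, destAirport}.
    static Map<String, Integer> busyAirports(List<String[]> flights) {
        // Map + shuffle: group the emitted (airport, 1) pairs by airport.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String[] leg : flights) {
            for (String airport : leg) {
                grouped.computeIfAbsent(airport, k -> new ArrayList<>()).add(1);
            }
        }
        // Reduce: aggregate all values for a key into one result.
        Map<String, Integer> counts = new TreeMap<>();
        grouped.forEach((airport, ones) ->
            counts.put(airport, ones.stream().mapToInt(Integer::intValue).sum()));
        return counts;
    }
}
```

In real Hadoop the grouped lists never materialize on one machine; the shuffle streams each key's values to the reducer that owns it, which is what makes the pattern scale.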
  • Map/Reduce with GemFire XD
    Users can execute Hadoop Map/Reduce jobs against GemFire XD data using:
    – EventInputFormat, to read data from HDFS (file splits feeding the Mapper) without impacting online availability or performance
    – SqlfOutputFormat, to write reducer output into a SQL table (issuing PUT INTO foo (…) VALUES (?, ?, …) over jdbc:sqlfire://localhost:1527) for immediate use by online applications
  • Demo: Map/Reduce
  • Using the InputFormat - Mapper

    // count each airport present in a FLIGHT_HISTORY row
    public class SampleMapper extends MapReduceBase
        implements Mapper<Object, Row, Text, IntWritable> {

      public void map(Object key, Row row,
          OutputCollector<Text, IntWritable> output,
          Reporter reporter) throws IOException {
        try {
          IntWritable one = new IntWritable(1);
          ResultSet rs = row.getRowAsResultSet();
          String origAirport = rs.getString("ORIG_AIRPORT");
          String destAirport = rs.getString("DEST_AIRPORT");
          output.collect(new Text(origAirport), one);
          output.collect(new Text(destAirport), one);
        } catch (SQLException e) {
          …
        }
      }
    }

    JobConf conf = new JobConf(getConf());
    conf.setJobName("Busy Airport Count");
    conf.set(EventInputFormat.HOME_DIR, hdfsHomeDir);
    conf.set(EventInputFormat.INPUT_TABLE, tableName);
    conf.setInputFormat(EventInputFormat.class);
    conf.setMapperClass(SampleMapper.class);
    ...
  • Use Spring Hadoop for job configuration

    <beans:beans …>
      <job id="busyAirportsJob"
        libs="…"
        input-format="com.vmware.sqlfire.internal.engine.hadoop.mapreduce.EventInputFormat"
        output-path="${flights.intermediate.path}"
        mapper="demo.sqlf.mr2.BusyAirports.SampleMapper"
        combiner="org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer"
        reducer="org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer" />
      <job id="topBusyAirportJob"
        libs="${LIB_DIR}/sqlfire-mapreduce-1.0-SNAPSHOT.jar"
        input-path="${flights.intermediate.path}"
        output-path="${flights.output.path}"
        mapper="demo.sqlf.mr2.TopBusyAirport.TopBusyAirportMapper"
        reducer="demo.sqlf.mr2.TopBusyAirport.TopBusyAirportReducer"
        number-reducers="1" />
      …
    </beans:beans>
  • Using the OutputFormat - Reducer

    // find the max, aka the busiest airport
    public class TopBusyAirportReducer extends MapReduceBase
        implements Reducer<Text, StringIntPair, Key, BusyAirportModel> {

      public void reduce(Text token, Iterator<StringIntPair> values,
          OutputCollector<Key, BusyAirportModel> output,
          Reporter reporter) throws IOException {
        String topAirport = null;
        int max = 0;
        while (values.hasNext()) {
          StringIntPair v = values.next();
          if (v.getSecond() > max) {
            max = v.getSecond();
            topAirport = v.getFirst();
          }
        }
        BusyAirportModel busy =
            new BusyAirportModel(topAirport, max);
        output.collect(null, busy);
      }
    }

    JobConf conf = new JobConf(getConf());
    conf.setJobName("Top Busy Airport");
    conf.set(SqlfOutputFormat.OUTPUT_URL,
        "jdbc:sqlfire://localhost:1527");
    conf.set(SqlfOutputFormat.OUTPUT_SCHEMA, "APP");
    conf.set(SqlfOutputFormat.OUTPUT_TABLE, "BUSY_AIRPORT");
    conf.setReducerClass(TopBusyAirportReducer.class);
    conf.setOutputKeyClass(Key.class);
    conf.setOutputValueClass(BusyAirportModel.class);
    conf.setOutputFormat(SqlfOutputFormat.class);
    ...
  • Where do the results go?
    Reduced values are automatically inserted into the output table (PUT INTO BUSY_AIRPORT (flights, airport) VALUES (?, ?)) by matching column names:

    public class BusyAirportModel {
      private String airport;
      private int flights;

      public BusyAirportModel(String airport, int flights) {
        this.airport = airport;
        this.flights = flights;
      }

      public void setFlights(int idx, PreparedStatement ps)
          throws SQLException {
        ps.setInt(idx, flights);
      }

      public void setAirport(int idx, PreparedStatement ps)
          throws SQLException {
        ps.setString(idx, airport);
      }
    }
  • Scaling application logic with parallel “data-aware procedures”
  • Why not Map/Reduce? (Diagram contrasts traditional Map/Reduce with parallel “data-aware” procedures; image source: UC Berkeley Spark project.)
  • Procedures – managed in Spring containers as beans
    Java stored procedures may be created according to the SQL standard:
    CREATE PROCEDURE getOverBookedFlights ()
      LANGUAGE JAVA
      PARAMETER STYLE JAVA
      READS SQL DATA
      DYNAMIC RESULT SETS 1
      EXTERNAL NAME 'examples.OverBookedStatus.getOverBookedStatus';
    SQLFire also supports the JDBC type Types.JAVA_OBJECT; a parameter of type JAVA_OBJECT accepts an arbitrary Serializable Java object.
  • Data-aware procedures
    Parallelize the procedure and prune execution to the nodes holding the required data. The procedure call is extended with the following syntax:
    CALL [PROCEDURE] procedure_name ( [ expression [, expression ]* ] )
      [ WITH RESULT PROCESSOR processor_name ]
      [ { ON TABLE table_name [ WHERE whereClause ] }
      | { ON { ALL | SERVER GROUPS (server_group_name [, server_group_name ]*) } } ]
    Example (hint the data the procedure depends on):
    CALL getOverBookedFlights()
      ON TABLE FLIGHTAVAILABILITY WHERE FLIGHT_ID = 'AA1116';
    If the table is partitioned by columns in the WHERE clause, procedure execution is pruned to the nodes holding that data (the node owning 'AA1116' in this case).
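The pruning decision itself is simple routing logic: with a predicate on the partitioning column, the call goes to the one member that owns the matching partition; without one, it fans out to all members. A toy sketch, with invented names and the same illustrative hash routing as before:

```java
import java.util.*;

// Sketch of data-aware procedure routing: CALL ... ON TABLE t WHERE pk = ?
// is pruned to the single member that owns the matching partition, instead
// of fanning out to the whole cluster.
public class ProcedureRouter {
    static final int MEMBERS = 3;   // illustrative cluster size

    static int ownerOf(String partitionKey) {
        return Math.floorMod(partitionKey.hashCode(), MEMBERS);
    }

    // Returns the set of members the procedure must run on.
    static Set<Integer> targets(String whereKey) {
        if (whereKey != null) {
            return Set.of(ownerOf(whereKey));         // pruned: one member
        }
        Set<Integer> all = new TreeSet<>();
        for (int m = 0; m < MEMBERS; m++) all.add(m); // no predicate: run everywhere
        return all;
    }
}
```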
  • Parallelize the procedure, then aggregate (reduce)
    Register a Java result processor (optional in some cases):
    CALL [PROCEDURE] procedure_name ( [ expression [, expression ]* ] )
      [ WITH RESULT PROCESSOR processor_name ]
      [ { ON TABLE table_name [ WHERE whereClause ] }
      | { ON { ALL | SERVER GROUPS (server_group_name [, server_group_name ]*) } } ]
    (Diagram: the client call fans out to fabric servers 1–3 and the result processor aggregates their results.)
  • High-density storage in memory – off the Java heap
  • Off-heap storage to minimize JVM copying and GC (MemScale)
    – An off-heap memory manager for Java: the JVM memory manager was not designed for this volume, and we believe TB-memory machines are now commodity class
    – Key principles: avoid defragmentation and compaction of data blocks through reusable buffer pools; avoid all the copying in Java heaps (young-gen “from” to “to” to old gen, user-to-kernel copy, network copy, then repeat on the replica node)
    – Hadoop exacerbates the copying problem: multiple JVMs are involved (TaskTracker JVM, DataNode JVM, file system/network), not to mention the copies and intermediate disk storage required by MR shuffling
  • Integration with Spring XD (future)
    – Spring XD is a distributed, extensible framework for ingestion, real-time analytics, and batch processing
    – GemFire XD as a source and sink
    – Spring XD's runtime (DIRT) is pluggable; GemFire XD could be an optional runtime
  • Comparison to HBase (speaker note: don't make this a product pitch)
  • Some HBase 0.9x challenges
    – HBase itself is not inherently HA; HDFS is. Failed region servers can cause pauses
    – WAL writes have to go synchronously to HDFS (and its replicas), and HDFS inherently detects failures slowly (it assumes overload)
    – Higher probability of hotspots: regions are sorted by key, not distributed by a random hash
    – WAN replication needs a lot of work
    – No backup and recovery
  • Some HBase 0.9x challenges (continued)
    – No real querying, just key-based range scans; and an on-disk LSM is suboptimal compared to a B+tree for querying
    – You cannot execute transactions or integrate with RDBs
    – Some like the column-family data model; really? Pros: self-describing, a nested model is possible. Cons: difficult to use; query-engine optimization is difficult; the mapping is your problem; bloat
  • Learn more. Stay connected.
    Jags: jramnarayan at gopivotal.com
    Anthony: abaker at gopivotal.com
    http://communities.vmware.com/community/vmtn/appplatform/vfabric_sqlfire
    Twitter: twitter.com/springsource
    YouTube: youtube.com/user/SpringSourceDev
    Google+: plus.google.com/+springframework
  • Extras
  • Consistency model
  • Consistency Model without Transactions •  Replication within cluster is always eager and synchronous •  Row updates are always atomic; No need to use transactions •  FIFO consistency: writes performed by a single thread are seen by all other processes in the order in which they were issued
  • Consistency model without transactions
    – Consistency in partitioned tables: a partitioned-table row is owned by one member at a point in time; all updates are serialized to replicas through the owner; “total ordering” at the row level, atomic and isolated
    – Membership changes and consistency would need another hour
    – Pessimistic concurrency support using SELECT … FOR UPDATE
    – Support for referential integrity
  • Distributed transactions
    – Full support for distributed transactions, with READ_COMMITTED and REPEATABLE_READ
    – Highly scalable: no centralized coordinator or lock manager
    – We make some important assumptions: most OLTP transactions are small in duration and size, and write-write conflicts are very rare in practice
  • Distributed transactions: how does it work?
    – Each data node has a sub-coordinator to track transaction state
    – Local “write” locks are eagerly acquired on each replica; an object is owned by a single primary at a point in time
    – Fail fast if a lock cannot be obtained
    – Atomic, and works with the cluster's failure-detection system
    – Isolated until commit for READ_COMMITTED; only local isolation is supported during commit
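The eager, fail-fast write-lock rule can be sketched with `ReentrantLock.tryLock()`: a transaction tries each row lock without waiting and aborts on the first conflict, releasing anything it already holds. This is an illustrative single-process model, not GemFire XD's implementation; the class name is invented.

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of fail-fast write-lock acquisition: eagerly try each row's lock;
// on the first conflict, release everything and abort instead of blocking.
// This matches the assumption above that W-W conflicts are rare.
public class FailFastTx {
    private final ConcurrentMap<String, ReentrantLock> rowLocks = new ConcurrentHashMap<>();

    // Returns true if all row locks were acquired; on conflict, releases any
    // locks already taken and reports failure (the tx would then abort).
    public boolean tryWriteLocks(List<String> rowKeys) {
        List<ReentrantLock> held = new ArrayList<>();
        for (String key : rowKeys) {
            ReentrantLock lock = rowLocks.computeIfAbsent(key, k -> new ReentrantLock());
            if (!lock.tryLock()) {                 // fail fast: no waiting
                held.forEach(ReentrantLock::unlock);
                return false;
            }
            held.add(lock);
        }
        return true;
    }

    public void release(List<String> rowKeys) {
        for (String key : rowKeys) rowLocks.get(key).unlock();
    }
}
```

Because no transaction ever waits for a lock, deadlock is impossible by construction; the price is that a conflicting transaction must retry, which is cheap precisely when conflicts are rare.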
  • GFXD performance benchmark (in-memory)
  • How does it perform? Does it scale?
    – Scaled from 2 to 10 servers (one per host)
    – Scaled from 200 to 1,200 simulated clients (10 hosts)
    – Single partitioned table: int primary key, 40 fields (20 ints, 20 strings)
  • CPU utilization per server remained low, at about 30%, indicating many more clients could be handled
  • Is latency low at scale?
    – Latency decreases with server capacity
    – 50–70% of operations take under 1 millisecond
    – About 90% take less than 2 milliseconds