Spark SQL versus Apache Drill: Different Tools with Different Rules

Published in: Technology

  1. © 2014 MapR Technologies
  2. Contact Information
     Ted Dunning
     Chief Applications Architect at MapR Technologies
     Committer & PMC for Apache’s Drill, Zookeeper & others
     VP of Incubator at Apache Foundation
     Email: tdunning@apache.org, tdunning@maprtech.com
     Twitter: @ted_dunning
  3. What is Drill?
  4. A Query engine that has…
     • Columnar/Vectorized
     • Optimistic/pipelined
     • Runtime compilation
     • Late binding
     • Extensible
  5. Table Can Be an Entire Directory Tree
     // On a file
     select errorLevel, count(*)
     from dfs.logs.`/AppServerLogs/2014/Janpart0001.parquet`
     group by errorLevel;
     // On the entire data collection: all years, all months
     select errorLevel, count(*)
     from dfs.logs.`/AppServerLogs`
     group by errorLevel;
  6. Basic Process
     [Diagram: Zookeeper; three nodes, each with a Drillbit and a distributed cache over DFS/HBase; an incoming Query]
     1. Query comes to any Drillbit (JDBC, ODBC, CLI, protobuf)
     2. Drillbit generates execution plan based on query optimization & locality
     3. Fragments are farmed to individual nodes
     4. Result is returned to driving node
  7. Stages of Query Planning
     [Diagram labels: SQL Query, Parser, Logical Planner, Physical Planner, Foreman, plan fragments sent to drill bits; annotations: heuristic and cost based, cost based]
  8. Query Execution
     [Diagram labels: SQL Parser, HiveQL Parser, Pig Parser, LogicalPlan, Optimizer, PhysicalPlan, Scheduler, Foreman, Operators, Distributed Cache, StorageInterface (HDFS, HBase, Mongo, Cassandra), RPC Endpoint, JDBC Endpoint, ODBC Endpoint]
  9. Batches of Values
     • Value vectors
       – List of values, with same schema
       – With the 4-value semantics for each value
     • Shipped around in batches
       – max 256k bytes in a batch
       – max 64K rows in a batch
     • RPC designed for multiple replies to a request
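The two batch limits on this slide can be illustrated with a minimal Python sketch. It is not Drill's implementation: the function name `to_batches`, the fixed 8-byte value width, and the sample rows are assumptions made only for illustration. It groups rows into columnar batches so that neither the byte limit nor the row limit is exceeded.

```python
from typing import Dict, Iterable, Iterator, List

MAX_BATCH_BYTES = 256 * 1024  # "max 256k bytes in a batch" from the slide
MAX_BATCH_ROWS = 64 * 1024    # "max 64K rows in a batch" from the slide


def to_batches(rows: Iterable[Dict[str, int]], columns: List[str],
               bytes_per_value: int = 8) -> Iterator[Dict[str, List[int]]]:
    """Group rows into columnar batches that respect both limits.

    Assumes every value is a fixed-width integer of bytes_per_value bytes
    (a simplification; variable-width values would need size tracking).
    """
    row_bytes = bytes_per_value * len(columns)
    rows_per_batch = max(1, min(MAX_BATCH_ROWS, MAX_BATCH_BYTES // row_bytes))
    batch: Dict[str, List[int]] = {c: [] for c in columns}
    count = 0
    for row in rows:
        for c in columns:
            batch[c].append(row[c])  # one list per column, not per row
        count += 1
        if count == rows_per_batch:
            yield batch
            batch = {c: [] for c in columns}
            count = 0
    if count:
        yield batch  # final, partially filled batch


# Hypothetical data: 40,000 rows with two integer columns.
rows = ({"a": i, "b": 2 * i} for i in range(40_000))
batches = list(to_batches(rows, ["a", "b"]))
batch_sizes = [len(b["a"]) for b in batches]
```

With two 8-byte columns the byte limit is the binding one (16 bytes per row), so each full batch holds 16,384 rows rather than the 64K row maximum.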
  10. Fixed Value Vectors
  11. Vectorization
      • Drill operates on more than one record at a time
        – Word-sized manipulations
        – SIMD instructions
      • GCC, LLVM and JVM all do various optimizations automatically
        – Manually code algorithms
      • Logical Vectorization
        – Bitmaps allow lightning fast null-checks
        – Avoid branching to speed CPU pipeline
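The "logical vectorization" bullets can be made concrete with a small sketch. This is an illustration in plain Python, not Drill's value-vector code: the helper names (`build_vector`, `null_count`, `both_valid`) and the sample values are hypothetical. Nullability lives in a bitmap next to the data, so null checks become word-sized bit operations and the arithmetic needs no per-value branch.

```python
from typing import List, Optional, Tuple


def build_vector(values: List[Optional[int]]) -> Tuple[List[int], int]:
    """Return (data, validity); bit i of validity is 1 when values[i] is not null."""
    data: List[int] = []
    validity = 0
    for i, v in enumerate(values):
        if v is not None:
            validity |= 1 << i
            data.append(v)
        else:
            data.append(0)  # placeholder slot keeps the vector fixed-width
    return data, validity


def null_count(validity: int, length: int) -> int:
    """Count nulls with one population count instead of a per-value check."""
    return length - bin(validity).count("1")


def both_valid(validity_a: int, validity_b: int) -> int:
    """Rows where both inputs are non-null: a single AND over the bitmaps."""
    return validity_a & validity_b


a_data, a_valid = build_vector([3, None, 7, 1])
b_data, b_valid = build_vector([10, 20, None, 5])
ok = both_valid(a_valid, b_valid)
# Branch-free add: multiply by the validity bit rather than testing it.
sums = [(a_data[i] + b_data[i]) * ((ok >> i) & 1) for i in range(4)]
```

Here `ok` marks rows 0 and 3 as valid, so `sums` carries real results only in those slots; a consumer would read `ok` alongside `sums` to tell a null from a genuine zero.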
  12. Runtime Compilation is Faster
      • JIT is smart, but more gains with runtime compilation
      • Janino: Java-based Java compiler
      From http://bit.ly/16Xk32x
  13. Drill compiler
      [Diagram labels: CodeModel generates code; Janino compiles runtime byte-code; precompiled byte-code templates; merge byte-code of the two classes; loaded class]
  14. Optimistic
      [Chart: Speed vs. check-pointing; categories: cmd pipeline, small db, med db, large db, dw, compilation, hadoop; annotations: No need to checkpoint, Checkpoint frequently, Apache Drill]
  15. Optimistic Execution
      • Recovery code trivial
        – Running instances discard the failed query’s intermediate state
      • Pipelining possible
        – Send results as soon as batch is large enough
        – Requires barrier-less decomposition of query
  16. Pipelining
      • Record batches are pipelined between nodes
        – ~256kB usually
      • Unit of work for Drill
        – Operators work on a batch
      • Operator reconfiguration happens at batch boundaries
      [Diagram: three DrillBits exchanging record batches]
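Batch-at-a-time pipelining can be sketched with Python generators. This is an analogy under simplifying assumptions (a single process, hypothetical `scan` and `keep_even` operators, tiny batches), not Drill's operator framework. Each operator handles one batch and hands it downstream before the upstream operator produces the next one, which the `log` list makes visible.

```python
from typing import Iterator, List

log: List[str] = []  # records the order in which operators touch batches


def scan(n_batches: int, batch_size: int) -> Iterator[List[int]]:
    """Produce record batches one at a time (stand-in for a scan operator)."""
    for b in range(n_batches):
        log.append(f"scan {b}")
        yield list(range(b * batch_size, (b + 1) * batch_size))


def keep_even(batches: Iterator[List[int]]) -> Iterator[List[int]]:
    """Filter operator: works on one batch and passes it on immediately."""
    for b, batch in enumerate(batches):
        log.append(f"filter {b}")
        yield [v for v in batch if v % 2 == 0]


total = 0
for batch in keep_even(scan(n_batches=3, batch_size=4)):
    total += sum(batch)  # the consumer sees batch 0 before batch 1 is scanned
```

The interleaved log (scan 0, filter 0, scan 1, ...) is the point: no operator waits for its input to be complete, so there is no barrier between stages.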
  17. Pipelining
      • Random access: sort without copy or restructuring
      • Avoids serialization/deserialization
      • Off-heap (no GC woes when lots of memory)
      • Read/write to disk
        – when data larger than memory
      [Diagram: Drill Bit memory; overflow uses disk]
  18. Cost-based Optimization
      • Using Optiq, an extensible framework
        – Pluggable rules, and cost model
      • Rules for distributed plan generation
        – Insert Exchange operator into physical plan
        – Optiq enhanced to explore parallel query plans
      • Pluggable cost model
        – CPU, IO, memory, network cost (data locality)
        – Storage engine features (HDFS vs HIVE vs HBase)
      [Diagram: Query Optimizer with pluggable rules and pluggable cost model]
  19. What is SparkSQL?
  20. What is Spark SQL
      • Essentially syntactic sugar over a limited subset of Spark
      • Inherits all the virtues (and vices) of Spark
        – Lambdas can serve as UDFs (has subtle issues for performance)
      • Inputs have to be loaded
        – Perhaps lazily, not obvious when load actually happens
      • Not designed as a streaming engine, requires more memory
      • Some JSON support, but not so much for large or variable objects
      • Embedded in a real language!
  21. In More Detail
      • A Spark program consists of a computation graph that consumes and produces so-called resilient distributed datasets (RDDs)
      • SparkSQL allows these computations to be defined using SQL (but needs schema definitions on the RDDs)
      • Conventional Spark programs and SparkSQL programs interoperate nearly seamlessly
  22. Many Similarities
      [Diagram labels: SQL Parser, Java, Scala, Python, LogicalPlan, Optimizer, PhysicalPlan, group, filter, filter]
  23. Important Differences
      • Spark execution assumes RDDs are a complete representation, not a stream of row batches
      • Input sources don’t inject optimization rules, nor expose detailed cost models
      • Most RDDs don’t have a zero-copy capability
      • Spark inherits the JVM memory model, with very limited use of off-heap memory
  24. scala> sqlContext.sql("select * from json.`foo.json`").show
      +---+------+----+
      |  a|     b|   c|
      +---+------+----+
      |  3|[3, 2]| xyz|
      |  7|  null| wxy|
      |  7|    []|null|
      +---+------+----+
  25. scala> sqlContext.sql(
        "select a, explode(b) b_v from json.`bug.json`"
      ).show
      +---+---------+
      |  a|      b_v|
      +---+---------+
      |  3|        3|
      |  3|        2|
      +---+---------+
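The two spark-shell outputs show `explode` returning rows only for `a = 3`. A plain-Python emulation of that behaviour is sketched below. It assumes `bug.json` holds the same three records as `foo.json`, which the slides do not state, and the functions are illustrative stand-ins rather than Spark code. Spark's `explode` emits nothing for a null or empty array, while its `explode_outer` variant keeps such rows with a null element.

```python
from typing import Any, Dict, List

# Assumed input: the three records printed on the foo.json slide.
rows: List[Dict[str, Any]] = [
    {"a": 3, "b": [3, 2], "c": "xyz"},
    {"a": 7, "b": None, "c": "wxy"},
    {"a": 7, "b": [], "c": None},
]


def explode(rows: List[Dict[str, Any]], key: str, out_key: str) -> List[Dict[str, Any]]:
    """One output row per array element; null or empty arrays yield nothing."""
    out = []
    for row in rows:
        for v in row[key] or []:
            out.append({"a": row["a"], out_key: v})
    return out


def explode_outer(rows: List[Dict[str, Any]], key: str, out_key: str) -> List[Dict[str, Any]]:
    """Like explode, but a null or empty array still yields one row with a null."""
    out = []
    for row in rows:
        for v in row[key] or [None]:
            out.append({"a": row["a"], out_key: v})
    return out


exploded = explode(rows, "b", "b_v")
exploded_outer = explode_outer(rows, "b", "b_v")
```

Under that assumption, both `a = 7` records vanish from `exploded`, matching the slide's output, and reappear with a null `b_v` in `exploded_outer`.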
  26. First Synthesis
      • Drill has a more nuanced optimizer, better code generation
        – This often leads to ~2x speed advantage
      • Drill has ValueVector and row batches
        – This leads to much less memory pressure
      • Drill has a much stricter memory life-cycle
        – Query is done and gone, no need for big GCs even with big memory
      • Drill is all about SQL execution
  27. But …
      • Spark can optimize across entire program
        – This often leads to ~2x speed advantage
      • Spark has much more flexible memory structures
        – This can lead to much less memory pressure
      • Spark has a much more flexible RDD life-cycle
        – RDDs can be cached, persisted or simply recomputed as necessary
      • Spark is not all about SQL execution
  28. The Really Big Differences
      • Drill focuses heavily on secure, multi-tenant access to data
        – Strong impersonation semantics
        – Cascading rights via views
        – Queries co-exist in a cluster and reserve only their momentary resource requirements
      • Spark focuses heavily on fully integrated execution models
        – Any Spark function works with (almost) any RDD
        – Memory residency of RDDs is the highest goal
  29. Drill security
      ➢ End to end security from BI tools to Hadoop
      ➢ Standards-based PAM authentication
      ➢ 2-level user impersonation
      ➢ Fine-grained row and column level access control with Drill views (no centralized security repository required)
  30. Granular security permissions through Drill views
      Raw File (/raw/cards.csv); Owner: Admins; Permission: Admins
        Name  City      State  Credit Card #
        Dave  San Jose  CA     1374-7914-3865-4817
        John  Boulder   CO     1374-9735-1794-9711
      Data Scientist View (/views/maskedcards.view.drill); Owner: Admins; Permission: Data Scientists; not a physical data copy
        Name  City      State  Credit Card #
        Dave  San Jose  CA     1374-1111-1111-1111
        John  Boulder   CO     1374-1111-1111-1111
      Business Analyst View; Owner: Admins; Permission: Business Analysts
        Name  City      State
        Dave  San Jose  CA
        John  Boulder   CO
  31. Ownership Chaining
      • Combine Self Service Exploration with Data Governance
      Raw File (/raw/cards.csv); John (Owner)
        Name  City      State  Credit Card #
        Dave  San Jose  CA     1374-7914-3865-4817
        John  Boulder   CO     1374-9735-1794-9711
      Data Scientist (/views/V_Scientist); Jane (Read), John (Owner)
        Name  City      State  Credit Card #
        Dave  San Jose  CA     1374-1111-1111-1111
        John  Boulder   CO     1374-1111-1111-1111
      Analyst (/views/V_Analyst); Jack (Read), Jane (Owner)
        Name  City      State
        Dave  San Jose  CA
        John  Boulder   CO
      Access path: Jack queries the view V_Analyst; ownership chaining runs V_Analyst, V_Scientist, RAWFILE
        Does Jack have access to V_Analyst? -> YES
        Who is the owner of V_Analyst? -> Jane
        Drill accesses V_Analyst as Jane (Impersonation hop 1)
        Does Jane have access to V_Scientist? -> YES
        Who is the owner of V_Scientist? -> John
        Drill accesses V_Scientist as John (Impersonation hop 2)
        Does John have permissions on raw file? -> YES
        Who is the owner of raw file? -> John
        Drill accesses source file as John (no impersonation here)
      *Ownership chain length (# hops) is configurable
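The access path on this slide can be traced with a short sketch. It models only the question-and-answer sequence shown above; the `CATALOG` structure, the `resolve` function, and the `max_hops` parameter are hypothetical names for illustration and are not Drill's API or configuration keys.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical catalog mirroring the slide: each object has an owner,
# a set of users with read permission, and the object it is defined on.
CATALOG: Dict[str, dict] = {
    "V_Analyst": {"owner": "Jane", "readers": {"Jack"}, "source": "V_Scientist"},
    "V_Scientist": {"owner": "John", "readers": {"Jane"}, "source": "RAWFILE"},
    "RAWFILE": {"owner": "John", "readers": set(), "source": None},
}


def resolve(user: str, name: str, max_hops: int = 2) -> Tuple[List[Tuple[str, str]], int]:
    """Walk the ownership chain; return (object, checked user) pairs and the hop count."""
    checks: List[Tuple[str, str]] = []
    hops = 0
    current: Optional[str] = name
    while current is not None:
        obj = CATALOG[current]
        if user != obj["owner"] and user not in obj["readers"]:
            raise PermissionError(f"{user} cannot read {current}")
        checks.append((current, user))
        # The underlying object is accessed as the owner of this one.
        if obj["source"] is not None and obj["owner"] != user:
            hops += 1
            if hops > max_hops:
                raise PermissionError("ownership chain too long")
            user = obj["owner"]
        current = obj["source"]
    return checks, hops


checks, hops = resolve("Jack", "V_Analyst")
```

Jack never needs rights on `V_Scientist` or the raw file: each check is made against the owner of the previous object, which is what lets analysts explore views while the raw data stays restricted. Lowering `max_hops` to 1 in this sketch rejects the same query, mirroring the configurable chain length noted on the slide.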
  32. But was that the right question?
  33. Unification is Feasible
      • It is relatively easy to build a DrillContext in Spark
        – compare to SqlContext
      • Define Datasets as Drill data sources and sinks
        – Drill runs at the same time as Spark
      • Orchestrate transport of Spark data to/from Drill
      • Cost of transport is remarkably small
  34. What does the Spark and Drill integration look like
      Features at a glance:
      • Use Drill as an input to Spark
      • Query Spark RDDs via Drill and create data pipelines
      [Diagram labels: Disk (DFS), Memory, RDD, Files, Files]
  35. Is unification valuable?
  36. Example of Unification
      [Diagram labels: Callers, Universe, Towers, cdr data]
  37. Simple Session Protocol
      • Calls started at random intervals
      • During calls, reconnection is done periodically
      • Many log events are buffered and sent to current tower during active state
      [State diagram labels: start, idle, connect, active; SETUP, HELLO, CONNECT, FAIL, TIME OUT, END]
  38. The Resulting Data
      • Signal strength reports
        – Tower, timestamp, rank, caller, caller location*, signal strength
      • Tower log events: HELLO, FAIL, CONNECT, END
      • Call end
      • Note that data for one tower is often received by another due to caller buffering of diagnostic data
      *Location isn’t quite location … poetic license applied for
  39. What can we do with it?
  40. Baby Steps
      • What does signal propagation look like?
        select x, y, signal from cdr_stream where tower = 3
      • Plot results to get a map of signal strength around a tower
  41. Baby Steps
      • What does tower coverage look like?
        select x, y from cdr_stream
        where tower = 3 and event_type = 'CONNECT'
      • Plot results to get a map of coverage area for a tower
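The two "baby steps" queries can be tried on a toy table. The sketch below uses Python's built-in `sqlite3` module as a stand-in for Drill or Spark SQL; the table layout is inferred from the column names in the slide queries, the four sample rows are invented, and an `order by x` clause is added only to make the result order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema inferred from the columns used in the slide queries.
conn.execute(
    "create table cdr_stream "
    "(tower integer, x real, y real, signal real, event_type text)"
)
# Invented sample rows, for illustration only.
conn.executemany(
    "insert into cdr_stream values (?, ?, ?, ?, ?)",
    [
        (3, 0.0, 0.0, -60.0, "CONNECT"),
        (3, 1.0, 2.0, -75.0, "HELLO"),
        (3, 4.0, 1.0, -90.0, "CONNECT"),
        (5, 9.0, 9.0, -70.0, "CONNECT"),
    ],
)

# Signal propagation: every report seen for tower 3.
signal_map = conn.execute(
    "select x, y, signal from cdr_stream where tower = 3 order by x"
).fetchall()

# Coverage: only the locations where a caller connected to tower 3.
coverage = conn.execute(
    "select x, y from cdr_stream "
    "where tower = 3 and event_type = 'CONNECT' order by x"
).fetchall()
```

Plotting `signal_map` as coloured points and `coverage` as plain points would give the two maps the slides describe.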
  42. What about anomaly detection?
  43. Detecting Tower Loss
      It’s important to know if traffic is stopped or delayed because of a problem…
      But events from towers come at irregular intervals
      How long after the last event should you begin to worry?
  44. Event Stream (timing)
      • Events of various types arrive at irregular intervals
        – we can assume Poisson distribution
      • The key question is whether frequency has changed relative to expected values
        – This shows up as a change in interval
      • Want alert as soon as possible
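Under the Poisson assumption, the question "how long should you wait before worrying" has a closed-form answer: gaps between events are exponentially distributed, so the probability of a gap longer than t at rate λ is exp(-λt), and the gap at percentile q is -ln(1 - q) / λ. The sketch below works this out; the function names and the rate of 2 events per minute are assumptions chosen for illustration.

```python
import math


def gap_threshold(rate_per_min: float, percentile: float) -> float:
    """Gap length (minutes) exceeded with probability 1 - percentile
    when events follow a Poisson process with the given rate."""
    return -math.log(1.0 - percentile) / rate_per_min


def gap_anomaly(gap_min: float, rate_per_min: float) -> float:
    """-log P(gap >= observed) for an exponential gap; larger means more surprising."""
    return rate_per_min * gap_min


# Hypothetical tower that normally emits 2 events per minute.
t_999 = gap_threshold(2.0, 0.999)
t_9999 = gap_threshold(2.0, 0.9999)
```

For this assumed rate, silence of about 3.5 minutes already sits at the 99.9th percentile, and about 4.6 minutes at the 99.99th, so the alarm threshold grows only logarithmically as the false-alarm rate is tightened.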
  45. Converting Event Times to Anomaly
      [Plot with 99.9%-ile and 99.99%-ile threshold lines]
  46. But in the real world, event rates often change
  47. Time Intervals Are Key to Modeling Sporadic Events
      [Plot: dt (min) versus t (days)]
  48. Time Intervals Are Key to Modeling Sporadic Events
      [Plot: dt (min) versus t (days)]
  49. After Rate Correction
      [Plot: dt/rate versus t (days), with 99.9%-ile and 99.99%-ile threshold lines]
  50. Detecting Anomalies in Sporadic Events
      [Diagram labels: incoming events (t_i); Δn; rate predictor; rate history; λ; δ = λ(t_i − t_{i−n}); t-digest; 99.97%-ile threshold t; alarm when δ > t]
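A minimal sketch of the rate-corrected detector follows. It simplifies the diagram in two labelled ways: the rate predictor is a hypothetical step function `rate_at` rather than a model fitted to rate history, and the 99.97th-percentile threshold comes from the closed-form exponential quantile for n = 1 rather than from a t-digest over observed values. The event times are invented.

```python
import math
from typing import Callable, List, Tuple


def scaled_gaps(times: List[float], rate_at: Callable[[float], float],
                n: int = 1) -> List[Tuple[float, float]]:
    """delta_i = lambda(t_i) * (t_i - t_{i-n}) for every i >= n."""
    return [
        (times[i], rate_at(times[i]) * (times[i] - times[i - n]))
        for i in range(n, len(times))
    ]


def alarms(times: List[float], rate_at: Callable[[float], float],
           quantile: float = 0.9997) -> List[float]:
    """Event times whose rate-corrected gap exceeds the chosen percentile (n = 1)."""
    threshold = -math.log(1.0 - quantile)  # exponential quantile at unit rate
    return [t for t, delta in scaled_gaps(times, rate_at) if delta > threshold]


def rate_at(t: float) -> float:
    """Hypothetical predictor: 2 events/min before t = 10, then 0.5 events/min."""
    return 2.0 if t < 10.0 else 0.5


# Invented timeline: the raw gap grows from 0.5 to 2 minutes when the rate
# drops, then a 20-minute silence precedes the last event.
times = [8.0, 8.5, 9.0, 9.5, 11.5, 13.5, 15.5, 35.5]
deltas = [d for _, d in scaled_gaps(times, rate_at)]
alarm_times = alarms(times, rate_at)
```

The raw gaps quadruple when the rate drops, yet the corrected values stay at 1.0, so only the 20-minute silence crosses the threshold. This sketch scores a gap when the next event arrives; alerting "as soon as possible" would instead compare the time since the last event against the same threshold.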
  51. Propagation Anomalies
      • What happens when something shadows part of the coverage field?
        – Can happen in urban areas with a construction crane
      • Can solve heuristically
        – Subtract from reference image composed by long term averages
        – Doesn’t deal well with weak signal regions and low S/N
      • Can solve probabilistically
        – Compute anomaly for each measurement, use mean of log(p)
  52. [Figure]
  53. [Figure]
  54. Variable Signal/Noise Makes Heuristic Tricky
      Far from the transmitter, received signal is dominated by noise. This makes subtraction of average value a bad algorithm.
  55. Other Issues
      • Finding anomalies in coverage area is similarly tricky
      • Coverage area is roughly where tower signal strength is higher than neighbors
      • Except for fuzziness due to hand-off delays
      • Except for bias due to large-scale caller motions
        – Rush hour
        – Event mobs
  56. Simple Answer for Propagation Anomalies
      • Cluster signal strength reports
      • Cluster locations using k-means, large k
      • Model report rate anomaly using discrete event models
      • Model signal strength anomaly using percentile model
      • Trade larger k against higher report rates, faster detection
      • Overall anomaly is sum of individual log(p) anomalies
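The last two bullets can be sketched as follows: a per-cluster percentile model turns each observed signal strength into a probability, and the overall anomaly is the sum of the individual -log(p) values. Everything concrete here is an assumption for illustration: the cluster ids, the reference readings, the add-one smoothing, and the choice of a lower-tail probability (which only reacts to signal drops such as shadowing).

```python
import math
from bisect import bisect_right
from typing import Dict, List


def tail_p(reference: List[float], observed: float) -> float:
    """Smoothed fraction of reference readings at or below the observed one."""
    ref = sorted(reference)
    return (bisect_right(ref, observed) + 1) / (len(ref) + 2)


def overall_anomaly(reference: Dict[int, List[float]],
                    observed: Dict[int, float]) -> float:
    """Sum of per-cluster -log(p); larger means more anomalous overall."""
    return sum(-math.log(tail_p(reference[c], observed[c])) for c in observed)


# Hypothetical reference signal strengths (dBm) for two location clusters.
reference = {
    1: [-72.0, -71.0, -71.0, -70.0, -70.0, -69.0, -69.0, -68.0],
    2: [-86.0, -85.0, -85.0, -84.0, -84.0, -83.0, -83.0, -82.0],
}
normal = {1: -70.0, 2: -84.0}
shadowed = {1: -90.0, 2: -84.0}  # cluster 1 is suddenly much weaker

score_normal = overall_anomaly(reference, normal)
score_shadowed = overall_anomaly(reference, shadowed)
```

Because each cluster is scored against its own history, a weak but typical reading far from the tower contributes little, which is the property the subtraction heuristic lacks in low signal-to-noise regions.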
  57. Coverage Areas
  58. Just One Tower
  59. Cluster Reports for That Tower
  60. Cluster Reports for That Tower
      [Figure: clusters labeled 1 to 9]
  61. General Dataflow
      [Diagram stages: Group by tower, filter data (SQL); k-means cluster (ML LIB); Split data (SQL); Location model (Java); Mark cluster (ML LIB); Rate detection per cluster]
  62. Summary
      • Drill and Spark provide healthy competition in Apache
      • Over time, they have converged in many respects
        – But important distinctions remain
      • Projects can work together to share key technology
        – Apache Arrow … started as off-shoot of Drill, now has >12 major projects as participants, including Spark
      • Systems can work together even more deeply
        – DrillContext makes integration first class
  63. e-book available courtesy of MapR
      http://bit.ly/1jQ9QuL
      A New Look at Anomaly Detection
      by Ted Dunning and Ellen Friedman
      © June 2014 (published by O’Reilly)
  64. 6 Free ebooks
      Read online: mapr.com/6ebooks-read
      Download pdfs: mapr.com/6ebooks-pdf
      Streaming Architecture Ted Dunning & Ellen Friedman and MapR Streams
  65. Thank you for coming today!
  66. …helping you put data technology to work
      ● Find answers
      ● Ask technical questions
      ● Join on-demand training course discussions
      ● Follow release announcements
      ● Share and vote on product ideas
      ● Find Meetup and event listings
      Connect with fellow Apache Hadoop and Spark professionals
      community.mapr.com
