© 2016 MapR Technologies
How Spark is Enabling
the New Wave of Converged Cloud Applications
Ankur Desai & Carol McDonald
December, 2016
Today’s Presenters
Carol McDonald
Solutions Architect
Ankur Desai
Sr Mgr, Platform & Products
Agenda
• Market Trends
• What’s Needed for Converged Streaming Applications
• Use Cases
• Demo of MapR Streams with Spark Streaming
Flexible processing where
change is the norm
Distributed processing across clusters, data
centers, public & private cloud environments
Supports global apps that
can scale arbitrarily
A Single Platform: On-Prem, In the Cloud, or InterCloud
MapR on Microsoft Azure Marketplace
MapR and Microsoft enable enterprise grade big data applications in the Azure cloud
Product Alignment
• Simplified Deployment: Azure Marketplace’s automated deployment capabilities make big data easy
• Unlimited Scale: Azure’s infrastructure can scale up to match any requirement and scale down for value
• Seamless Interoperability: MapR integrates with other Azure services to enable customers to analyze any type of data to unlock the biggest insights
Digital transformation for better customer experience
Deliver self-service insights across the business
Leading optical retail chain
OBJECTIVES
• Modernize analytics & improve speed of marketing campaigns
• Reduce cost of existing systems
CHALLENGES
• Existing technologies prohibiting effective & timely reporting and analysis
• Very long time to extract value from the data leading to lots of Excel
SOLUTION
• MapR platform on the Azure cloud to modernize their infrastructure and sunset legacy systems.
• Faster exploration of data with Apache Drill mitigating need for schema development.
• Support for use cases such as customer 360, supply chain & image analysis
The Need For Streaming
Decreasing Job Latencies
Hours Mins Secs Milli Secs
Data persistence
on-disk
Data persistence
in-memory
Big Data is Continuously Generated One Event at a Time
{ "time": "6:01.103", "event": "RETWEET", "location": { "lat": 40.712784, "lon": -74.005941 } }
{ "time": "5:04.120", "severity": "CRITICAL", "msg": "Service down" }
{ "card_num": 1234, "merchant": "MERCH1", "amount": 50 }
Why Stream Processing?
Readings collected for later analysis:
6:01 P.M.: 72°
6:02 P.M.: 75°
6:03 P.M.: 77°
6:04 P.M.: 85°
6:05 P.M.: 90°
6:06 P.M.: 85°
6:07 P.M.: 77°
6:08 P.M.: 75°
Analyze: "It was hot at 6:05 yesterday!" (90°)
Batch processing may be too late for some events
Why Stream Processing?
Stream, topic Temperature: 6:05 P.M.: 90°
Turn on the air conditioning!
It’s becoming important to process events as they arrive
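The contrast between the two "Why Stream Processing?" slides can be sketched in a few lines of plain Java. This is only an illustration, not code from the talk: the class name, the `onReading` handler, and the 90° threshold are assumptions (the slide only shows the air conditioning being switched on at the 90° reading).

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: react to each reading as it arrives instead of after a batch run. */
final class TemperatureMonitor {
    // Hypothetical threshold; the slide only shows an action at 90 degrees.
    static final int THRESHOLD_F = 90;

    /** Returns the action for one reading, or null when nothing needs doing. */
    static String onReading(String time, int tempF) {
        if (tempF >= THRESHOLD_F) {
            return time + ": " + tempF + " degrees, turn on the air conditioning";
        }
        return null;
    }

    /** Feeds readings through onReading one at a time, collecting the actions. */
    static List<String> process(String[] times, int[] temps) {
        List<String> actions = new ArrayList<>();
        for (int i = 0; i < times.length; i++) {
            String action = onReading(times[i], temps[i]);
            if (action != null) {
                actions.add(action);
            }
        }
        return actions;
    }

    public static void main(String[] args) {
        String[] times = {"6:01", "6:02", "6:03", "6:04", "6:05", "6:06", "6:07", "6:08"};
        int[] temps = {72, 75, 77, 85, 90, 85, 77, 75};
        for (String action : process(times, temps)) {
            System.out.println(action);
        }
    }
}
```

Because the handler runs per event, the action is available at the moment the 6:05 reading arrives rather than after a later batch job has scanned the whole day.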
Anatomy of Converged Streaming Applications
The Trinity of Real-time
Topic 1
Real Time
Producers
Topic 2
Global Messaging System
Persistence
(Databases and Files)
Real Time
Operational
Analytics
Stream Processing
Creating the Streaming Pipeline
[Diagram labels: Data Sources, Stream Data (Topic), Process Data, Store Data, Serve Data]
MapR Converged Data Platform
[Architecture diagram labels:]
• Open Source Engines & Tools; Commercial Engines & Applications; Custom Apps; Cloud and Managed Services; Search and Others
• Data Processing
• APIs: HDFS API, POSIX, NFS, HBase API, JSON API, Kafka API
• Web-Scale Storage (MapR-FS), Database (MapR-DB), Event Streaming (MapR Streams)
• Enterprise-Grade Platform Services: Real Time, Unified Security, Multi-tenancy, Disaster Recovery, Global Namespace, High Availability
• Unified Management and Monitoring
MapR Streams:
Global Pub-sub Event Streaming System for Big Data
• Producers publish billions of messages/sec to a topic in a stream.
• Guaranteed, immediate delivery to all consumers.
• Tie together geo-dispersed clusters. Worldwide.
• Standard real-time API (Kafka). Integrates with Spark Streaming, Storm, Apex, and Flink.
• Direct data access (OJAI API) from analytics frameworks.
[Diagram labels: Producers, Stream, Topic, Consumers, Batch analytics, Topic Replication, Remote sites and consumers]
Scalable Event Streaming with MapR Streams
Topics are partitioned for throughput and scalability
Partition 1: Topic - Pressure
Partition 1: Topic - Temperature
Partition 1: Topic - Warning
Partition 2: Topic - Pressure
Partition 2: Topic - Temperature
Partition 2: Topic - Warning
Partition 3: Topic - Pressure
Partition 3: Topic - Temperature
Partition 3: Topic - Warning
Consumers
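The slide does not say how a message ends up in a particular partition. A common approach with Kafka-style APIs is to hash the message key, so that messages with the same key always land in the same partition. The sketch below illustrates that idea in plain Java; `String.hashCode` stands in for whatever hash the real client uses, and the class, method names, and pump keys are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: route keyed messages to one of a fixed number of topic partitions. */
final class PartitionSketch {
    /** Maps a message key to a partition index in [0, numPartitions). */
    static int partitionFor(String key, int numPartitions) {
        // String.hashCode is a stand-in for the client's real hash function.
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    /** Distributes keys over partitions; each inner list is one partition. */
    static List<List<String>> assign(String[] keys, int numPartitions) {
        List<List<String>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (String key : keys) {
            partitions.get(partitionFor(key, numPartitions)).add(key);
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Hypothetical message keys, e.g. pump identifiers.
        String[] pumps = {"COHUTTA", "pump-2", "pump-3", "COHUTTA"};
        List<List<String>> partitions = assign(pumps, 3);
        for (int i = 0; i < partitions.size(); i++) {
            System.out.println("Partition " + (i + 1) + ": " + partitions.get(i));
        }
    }
}
```

Keeping one key in one partition preserves per-key ordering while still letting several consumers read different partitions in parallel.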
MapR-DB is Designed to Scale
Fast Reads and Writes by Key
Data is automatically partitioned by Key Range
[Diagram: three key ranges, each holding a table with columns Key, Col B, Col C]
Use Cases
Customer 360 & Behavior Prediction
[Diagram labels: Website Click-Stream, Internal Data Sources, External Data Sources, Support Tickets, DBMS, Email, CRM, Topics, Real Time, Offline, EDH/EDL]
Real Time/Offline ClickStream Analysis
• Prediction Modelling
• Attribution Modelling
• Cohort Analysis
• Customer Lifetime Value Analysis
• Attrition Modelling
• Response Modelling
• Churn Modelling
Eliminate latency due to data movement between clusters
Eliminate redundant storage with MapR Streams and lower the TCO
HA, DR, NFS, Snapshots, Data Protection
360 Degree Customer View
Customer Behavior Prediction
Better Conversion Rate and Lower Attrition $$$
Prescriptive Analytics: IoT & Auto Manufacturing
[Diagram labels: GPS, Telematic Data, Telephone, Truck Fleet]
Data generated by cars is stored locally
Data Modelling/Secondary ETL: Data is converted from proprietary to Parquet format
• Identify emission patterns
• Route optimization
• Customer service requests
• How does throttling affect other factors such as fuel consumption, emissions, etc.
• Image and video analysis
• Time series analysis for threshold breach
[Diagram: four Topics]
Demo
What if BP had detected problems before the oil hit the water?
1M samples/sec
High performance at
scale is necessary!
Use Case: Time Series Data
[Diagram labels: Sensor time-stamped data, Stream, Topic, Spark Streaming, read, Spark processing, Data for real-time monitoring]
Use Case: Time Series Data
Sensor time-stamped data (Stream, Topic):
COHUTTA,3/10/14,1:01,10.27,1.73,881,1.56,85,1.94
COHUTTA,3/10/14,1:03,10.47,1.732,882,1.7,92,0.66
COHUTTA,3/10/14,1:02,9.67,1.731,882,0.52,87,1.79
Data: PumpId, Date, Time, pressure and flow measurements
Schema
• All events stored, CF data could be set to expire data
• Filtered alerts put in CF alerts
• Daily summaries put in CF stats
Row key              | CF data       | CF alerts | CF stats
                     | hz    …  psi  | psi  …    | hz_avg  …  psi_min
COHUTTA_3/10/14_1:01 | 10.37 …  84   | 0         |
COHUTTA_3/10/14      |               |           | 10      …  0
Row key contains the oil pump name, date, and a time stamp
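The composite row keys in the schema table can be assembled with simple string concatenation. The following is a minimal sketch, not the talk's code: the class and method names are hypothetical, and the underscore separator is taken from the example keys shown in the table.

```java
/** Sketch: build the composite row keys shown in the schema table. */
final class RowKeys {
    /** Key for one raw event: pump name, date, and time stamp. */
    static String dataRowKey(String pump, String date, String time) {
        return pump + "_" + date + "_" + time;
    }

    /** Key for a daily summary row: pump name and date only. */
    static String statsRowKey(String pump, String date) {
        return pump + "_" + date;
    }

    public static void main(String[] args) {
        System.out.println(dataRowKey("COHUTTA", "3/10/14", "1:01"));
        System.out.println(statsRowKey("COHUTTA", "3/10/14"));
    }
}
```

Because the event key starts with the same pump-and-date prefix as the summary key, all rows for one pump on one day share a prefix, which suits a store that partitions data by key range.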
What Do We Need to Do?
[Diagram labels: Data Sources, Collect Data (Stream, Topic), Process Data, Store Data, Serve Data]
Use Case Example Code
[Diagram labels: Sensor time-stamped data, Stream, Topic, Spark Streaming, read, Spark processing, Data for real-time monitoring]
KafkaProducer
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

String topic = "/streams/pump:warning";
// 1. Configure KafkaProducer properties (key and value serializers)
Properties properties = new Properties();
properties.put("key.serializer",
    "org.apache.kafka.common.serialization.StringSerializer");
properties.put("value.serializer",
    "org.apache.kafka.common.serialization.StringSerializer");
// 2. Create the KafkaProducer with those properties
KafkaProducer<String, String> kafkaProducer =
    new KafkaProducer<String, String>(properties);
String txt = "msg text";
// 3. Create a producer record from the topic and the message
ProducerRecord<String, String> record =
    new ProducerRecord<String, String>(topic, txt);
// 4. Use the producer to send the record
kafkaProducer.send(record);
Create a DStream
DStream: a sequence of RDDs
representing a stream of data
val ssc = new StreamingContext(sparkConf, Seconds(5))
// create an input Stream for set of topics
val dStream = KafkaUtils.createDirectStream[String,
String](ssc, kafkaParams, topicsSet)
[Diagram: dStream as a sequence of batches (time 0 to 1, time 1 to 2, time 2 to 3), each stored in memory as an RDD]
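The idea of a DStream as a sequence of per-interval batches can be illustrated without Spark. The sketch below is a simplified model in plain Java, not how Spark Streaming is implemented: it buckets events by their own timestamp, whereas Spark forms batches as data is received. The class name and the sample timestamps are hypothetical; the 5-second interval mirrors `Seconds(5)` above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Sketch: split a stream of timestamped events into fixed-length batches. */
final class MicroBatcher {
    /** Groups event timestamps (in ms, non-negative) by batch index. */
    static Map<Long, List<Long>> batch(long[] timestampsMs, long intervalMs) {
        Map<Long, List<Long>> batches = new TreeMap<>();
        for (long ts : timestampsMs) {
            // Batch 0 covers [0, interval), batch 1 the next interval, and so on.
            long index = ts / intervalMs;
            batches.computeIfAbsent(index, k -> new ArrayList<>()).add(ts);
        }
        return batches;
    }

    public static void main(String[] args) {
        long[] events = {0L, 4999L, 5000L, 12000L}; // hypothetical arrival times
        Map<Long, List<Long>> batches = batch(events, 5000L);
        for (Map.Entry<Long, List<Long>> e : batches.entrySet()) {
            System.out.println("batch " + e.getKey() + ": " + e.getValue());
        }
    }
}
```

Each map entry plays the role of one RDD in the DStream: a bounded collection that batch-style operations can process.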
Message Data to Sensor Object
case class Sensor(resid: String, date: String, time: String,
hz: Double, disp: Double, flo: Double, sedPPM: Double,
psi: Double, chlPPM: Double)
// Parse CSV Strings into Sensor objects
def parseSensor(str: String): Sensor = {
val p = str.split(",")
Sensor(p(0), p(1), p(2), p(3).toDouble, p(4).toDouble, p(5).toDouble,
p(6).toDouble, p(7).toDouble, p(8).toDouble)
}
Process DStream
// Parse message values into Sensor objects
val sensorDStream = dStream.map(_._2).map(parseSensor)
[Diagram: for each batch (time 0 to 1, time 1 to 2, time 2 to 3), map turns the dStream RDD into a sensorDStream RDD; new RDDs are created for every batch]
DataFrame and SQL Operations
// For each RDD in the DStream
sensorDStream.foreachRDD { rdd =>
  val sqlContext = SQLContext.getOrCreate(rdd.sparkContext)
  import sqlContext.implicits._ // needed for rdd.toDF()
  // Convert the RDD to a DataFrame and register it as a temp table
  rdd.toDF().registerTempTable("sensor")
  // Get the max, min, and avg of the pump values
  val res = sqlContext.sql("""SELECT resid, date,
    max(hz) as maxhz, min(hz) as minhz, avg(hz) as avghz,
    max(disp) as maxdisp, min(disp) as mindisp, avg(disp) as avgdisp,
    max(flo) as maxflo, min(flo) as minflo, avg(flo) as avgflo,
    max(psi) as maxpsi, min(psi) as minpsi, avg(psi) as avgpsi
    FROM sensor GROUP BY resid, date""")
  res.show()
}
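To make clear what the GROUP BY query computes, here is a plain-Java sketch of the same aggregation for the hz column only, run over the three sample rows shown earlier. It is an illustration using only the standard library, not a replacement for the Spark SQL code; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Sketch: max/min/avg of hz per (resid, date), mirroring the SQL GROUP BY. */
final class HzStats {
    /** Groups CSV sensor lines by "resid,date" and summarizes hz (column 3). */
    static Map<String, DoubleSummaryStatistics> summarize(List<String> lines) {
        Map<String, DoubleSummaryStatistics> stats = new TreeMap<>();
        for (String line : lines) {
            String[] p = line.split(",");
            String key = p[0] + "," + p[1]; // resid and date form the group key
            stats.computeIfAbsent(key, k -> new DoubleSummaryStatistics())
                 .accept(Double.parseDouble(p[3]));
        }
        return stats;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "COHUTTA,3/10/14,1:01,10.27,1.73,881,1.56,85,1.94",
            "COHUTTA,3/10/14,1:03,10.47,1.732,882,1.7,92,0.66",
            "COHUTTA,3/10/14,1:02,9.67,1.731,882,0.52,87,1.79");
        for (Map.Entry<String, DoubleSummaryStatistics> e : summarize(lines).entrySet()) {
            DoubleSummaryStatistics s = e.getValue();
            System.out.println(e.getKey() + " maxhz=" + s.getMax()
                + " minhz=" + s.getMin() + " avghz=" + s.getAverage());
        }
    }
}
```

The same pattern extends to disp, flo, and psi by keeping one summary object per column.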
Streaming Application Output
Save to HBase
rdd.map(Sensor.convertToPut).saveAsHadoopDataset(jobConfig)
Output operation: persist data to external storage
[Diagram: for each batch (time 0 to 1, time 1 to 2, time 2 to 3), map turns the linesRDD DStream into the sensorRDD DStream, and save writes Put objects to HBase]
Start Receiving Data
sensorDStream.foreachRDD { rdd =>
. . .
}
// Start the computation
ssc.start()
// Wait for the computation to terminate
ssc.awaitTermination()
Stream Processing
Building a Complete Data Architecture
MapR File System
(MapR-FS)
MapR Converged Data Platform
MapR Database
(MapR-DB)
MapR Streams
Sources/Apps Bulk Processing
Azure and MapR Resources – 3 steps to get started
• Azure Overview
https://www.mapr.com/partners/partner/microsoft-azure-microsofts-cloud-computing-platform-moving-faster-achieving-more
• 7 Steps to Deploy the MapR Sandbox on Azure
https://www.mapr.com/blog/7-steps-deploy-mapr-sandbox-microsoft-azure
• Azure Test Drive
http://mapr.testdrivelabs.com/ (subject to change)
Q&A
Engage with us!
1. Read the explanation and download the code
– https://www.mapr.com/blog/fast-scalable-streaming-applications-mapr-streams-spark-streaming-and-mapr-db
– https://www.mapr.com/blog/spark-streaming-hbase
2. Get Started: MapR Converged Data Platform
https://www.mapr.com/get-started-with-mapr
3. Get Answers: MapR Converge Community
https://community.mapr.com/community/answers
4. Get Trained: MapR On-Demand Training
https://learn.mapr.com

More Related Content

What's hot

NoSQL Application Development with JSON and MapR-DB
NoSQL Application Development with JSON and MapR-DBNoSQL Application Development with JSON and MapR-DB
NoSQL Application Development with JSON and MapR-DB
MapR Technologies
 
MapR on Azure: Getting Value from Big Data in the Cloud -
MapR on Azure: Getting Value from Big Data in the Cloud -MapR on Azure: Getting Value from Big Data in the Cloud -
MapR on Azure: Getting Value from Big Data in the Cloud -
MapR Technologies
 
When Streaming Becomes Strategic
When Streaming Becomes StrategicWhen Streaming Becomes Strategic
When Streaming Becomes Strategic
MapR Technologies
 
Deep Learning vs. Cheap Learning
Deep Learning vs. Cheap LearningDeep Learning vs. Cheap Learning
Deep Learning vs. Cheap Learning
MapR Technologies
 
MapR 5.2: Getting More Value from the MapR Converged Community Edition
MapR 5.2: Getting More Value from the MapR Converged Community EditionMapR 5.2: Getting More Value from the MapR Converged Community Edition
MapR 5.2: Getting More Value from the MapR Converged Community Edition
MapR Technologies
 
IoT Use Cases with MapR
IoT Use Cases with MapRIoT Use Cases with MapR
IoT Use Cases with MapR
MapR Technologies
 
MapR Streams and MapR Converged Data Platform
MapR Streams and MapR Converged Data PlatformMapR Streams and MapR Converged Data Platform
MapR Streams and MapR Converged Data Platform
MapR Technologies
 
CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016
Mathieu Dumoulin
 
Best Practices for Protecting Sensitive Data Across the Big Data Platform
Best Practices for Protecting Sensitive Data Across the Big Data PlatformBest Practices for Protecting Sensitive Data Across the Big Data Platform
Best Practices for Protecting Sensitive Data Across the Big Data Platform
MapR Technologies
 
Deep Learning at Scale
Deep Learning at ScaleDeep Learning at Scale
Deep Learning at Scale
Mateusz Dymczyk
 
Spark & Hadoop at Production at Scale
Spark & Hadoop at Production at ScaleSpark & Hadoop at Production at Scale
Spark & Hadoop at Production at Scale
MapR Technologies
 
Streaming Patterns Revolutionary Architectures with the Kafka API
Streaming Patterns Revolutionary Architectures with the Kafka APIStreaming Patterns Revolutionary Architectures with the Kafka API
Streaming Patterns Revolutionary Architectures with the Kafka API
Carol McDonald
 
Advanced Threat Detection on Streaming Data
Advanced Threat Detection on Streaming DataAdvanced Threat Detection on Streaming Data
Advanced Threat Detection on Streaming Data
Carol McDonald
 
Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...
Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...
Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...
DataWorks Summit/Hadoop Summit
 
Keys for Success from Streams to Queries
Keys for Success from Streams to QueriesKeys for Success from Streams to Queries
Keys for Success from Streams to Queries
DataWorks Summit/Hadoop Summit
 
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in ProductionTugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
Codemotion
 
Apache Spark Overview
Apache Spark OverviewApache Spark Overview
Apache Spark Overview
Carol McDonald
 
Build a Time Series Application with Apache Spark and Apache HBase
Build a Time Series Application with Apache Spark and Apache  HBaseBuild a Time Series Application with Apache Spark and Apache  HBase
Build a Time Series Application with Apache Spark and Apache HBase
Carol McDonald
 
Dchug m7-30 apr2013
Dchug m7-30 apr2013Dchug m7-30 apr2013
Dchug m7-30 apr2013
jdfiori
 
Data Warehouse Evolution Roadshow
Data Warehouse Evolution RoadshowData Warehouse Evolution Roadshow
Data Warehouse Evolution Roadshow
MapR Technologies
 

What's hot (20)

NoSQL Application Development with JSON and MapR-DB
NoSQL Application Development with JSON and MapR-DBNoSQL Application Development with JSON and MapR-DB
NoSQL Application Development with JSON and MapR-DB
 
MapR on Azure: Getting Value from Big Data in the Cloud -
MapR on Azure: Getting Value from Big Data in the Cloud -MapR on Azure: Getting Value from Big Data in the Cloud -
MapR on Azure: Getting Value from Big Data in the Cloud -
 
When Streaming Becomes Strategic
When Streaming Becomes StrategicWhen Streaming Becomes Strategic
When Streaming Becomes Strategic
 
Deep Learning vs. Cheap Learning
Deep Learning vs. Cheap LearningDeep Learning vs. Cheap Learning
Deep Learning vs. Cheap Learning
 
MapR 5.2: Getting More Value from the MapR Converged Community Edition
MapR 5.2: Getting More Value from the MapR Converged Community EditionMapR 5.2: Getting More Value from the MapR Converged Community Edition
MapR 5.2: Getting More Value from the MapR Converged Community Edition
 
IoT Use Cases with MapR
IoT Use Cases with MapRIoT Use Cases with MapR
IoT Use Cases with MapR
 
MapR Streams and MapR Converged Data Platform
MapR Streams and MapR Converged Data PlatformMapR Streams and MapR Converged Data Platform
MapR Streams and MapR Converged Data Platform
 
CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016
 
Best Practices for Protecting Sensitive Data Across the Big Data Platform
Best Practices for Protecting Sensitive Data Across the Big Data PlatformBest Practices for Protecting Sensitive Data Across the Big Data Platform
Best Practices for Protecting Sensitive Data Across the Big Data Platform
 
Deep Learning at Scale
Deep Learning at ScaleDeep Learning at Scale
Deep Learning at Scale
 
Spark & Hadoop at Production at Scale
Spark & Hadoop at Production at ScaleSpark & Hadoop at Production at Scale
Spark & Hadoop at Production at Scale
 
Streaming Patterns Revolutionary Architectures with the Kafka API
Streaming Patterns Revolutionary Architectures with the Kafka APIStreaming Patterns Revolutionary Architectures with the Kafka API
Streaming Patterns Revolutionary Architectures with the Kafka API
 
Advanced Threat Detection on Streaming Data
Advanced Threat Detection on Streaming DataAdvanced Threat Detection on Streaming Data
Advanced Threat Detection on Streaming Data
 
Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...
Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...
Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...
 
Keys for Success from Streams to Queries
Keys for Success from Streams to QueriesKeys for Success from Streams to Queries
Keys for Success from Streams to Queries
 
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in ProductionTugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
Tugdual Grall - Real World Use Cases: Hadoop and NoSQL in Production
 
Apache Spark Overview
Apache Spark OverviewApache Spark Overview
Apache Spark Overview
 
Build a Time Series Application with Apache Spark and Apache HBase
Build a Time Series Application with Apache Spark and Apache  HBaseBuild a Time Series Application with Apache Spark and Apache  HBase
Build a Time Series Application with Apache Spark and Apache HBase
 
Dchug m7-30 apr2013
Dchug m7-30 apr2013Dchug m7-30 apr2013
Dchug m7-30 apr2013
 
Data Warehouse Evolution Roadshow
Data Warehouse Evolution RoadshowData Warehouse Evolution Roadshow
Data Warehouse Evolution Roadshow
 

Viewers also liked

Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...
Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...
Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...
MapR Technologies
 
How Spark is Enabling the New Wave of Converged Applications
How Spark is Enabling  the New Wave of Converged ApplicationsHow Spark is Enabling  the New Wave of Converged Applications
How Spark is Enabling the New Wave of Converged Applications
MapR Technologies
 
HUG Italy meet-up with Fabian Wilckens, MapR EMEA Solutions Architect
HUG Italy meet-up with Fabian Wilckens, MapR EMEA Solutions ArchitectHUG Italy meet-up with Fabian Wilckens, MapR EMEA Solutions Architect
HUG Italy meet-up with Fabian Wilckens, MapR EMEA Solutions Architect
SpagoWorld
 
Big data analysing genomics and the bdg project
Big data   analysing genomics and the bdg projectBig data   analysing genomics and the bdg project
Big data analysing genomics and the bdg project
sree navya
 
Design Patterns for working with Fast Data
Design Patterns for working with Fast DataDesign Patterns for working with Fast Data
Design Patterns for working with Fast Data
MapR Technologies
 
The Keys to Digital Transformation
The Keys to Digital TransformationThe Keys to Digital Transformation
The Keys to Digital Transformation
MapR Technologies
 
Recommendation Techn
Recommendation TechnRecommendation Techn
Recommendation TechnTed Dunning
 
Big Data Paris
Big Data ParisBig Data Paris
Big Data Paris
MapR Technologies
 
Handling the Extremes: Scaling and Streaming in Finance
Handling the Extremes: Scaling and Streaming in FinanceHandling the Extremes: Scaling and Streaming in Finance
Handling the Extremes: Scaling and Streaming in Finance
MapR Technologies
 
Practical Machine Learning: Innovations in Recommendation Workshop
Practical Machine Learning:  Innovations in Recommendation WorkshopPractical Machine Learning:  Innovations in Recommendation Workshop
Practical Machine Learning: Innovations in Recommendation Workshop
MapR Technologies
 
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-timeReal-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time
Ted Dunning
 
Spark Application for Time Series Analysis
Spark Application for Time Series AnalysisSpark Application for Time Series Analysis
Spark Application for Time Series Analysis
MapR Technologies
 
Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)
Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)
Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)
Spark Summit
 
Baptist Health: Solving Healthcare Problems with Big Data
Baptist Health: Solving Healthcare Problems with Big DataBaptist Health: Solving Healthcare Problems with Big Data
Baptist Health: Solving Healthcare Problems with Big Data
MapR Technologies
 
Real Time and Big Data – It’s About Time
Real Time and Big Data – It’s About TimeReal Time and Big Data – It’s About Time
Real Time and Big Data – It’s About Time
MapR Technologies
 
Hadoop によるゲノム解読
Hadoop によるゲノム解読Hadoop によるゲノム解読
Hadoop によるゲノム解読
MapR Technologies Japan
 

Viewers also liked (16)

Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...
Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...
Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...
 
How Spark is Enabling the New Wave of Converged Applications
How Spark is Enabling  the New Wave of Converged ApplicationsHow Spark is Enabling  the New Wave of Converged Applications
How Spark is Enabling the New Wave of Converged Applications
 
HUG Italy meet-up with Fabian Wilckens, MapR EMEA Solutions Architect
HUG Italy meet-up with Fabian Wilckens, MapR EMEA Solutions ArchitectHUG Italy meet-up with Fabian Wilckens, MapR EMEA Solutions Architect
HUG Italy meet-up with Fabian Wilckens, MapR EMEA Solutions Architect
 
Big data analysing genomics and the bdg project
Big data   analysing genomics and the bdg projectBig data   analysing genomics and the bdg project
Big data analysing genomics and the bdg project
 
Design Patterns for working with Fast Data
Design Patterns for working with Fast DataDesign Patterns for working with Fast Data
Design Patterns for working with Fast Data
 
The Keys to Digital Transformation
The Keys to Digital TransformationThe Keys to Digital Transformation
The Keys to Digital Transformation
 
Recommendation Techn
Recommendation TechnRecommendation Techn
Recommendation Techn
 
Big Data Paris
Big Data ParisBig Data Paris
Big Data Paris
 
Handling the Extremes: Scaling and Streaming in Finance
Handling the Extremes: Scaling and Streaming in FinanceHandling the Extremes: Scaling and Streaming in Finance
Handling the Extremes: Scaling and Streaming in Finance
 
Practical Machine Learning: Innovations in Recommendation Workshop
Practical Machine Learning:  Innovations in Recommendation WorkshopPractical Machine Learning:  Innovations in Recommendation Workshop
Practical Machine Learning: Innovations in Recommendation Workshop
 
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-timeReal-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time
 
Spark Application for Time Series Analysis
Spark Application for Time Series AnalysisSpark Application for Time Series Analysis
Spark Application for Time Series Analysis
 
Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)
Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)
Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)
 
Baptist Health: Solving Healthcare Problems with Big Data
Baptist Health: Solving Healthcare Problems with Big DataBaptist Health: Solving Healthcare Problems with Big Data
Baptist Health: Solving Healthcare Problems with Big Data
 
Real Time and Big Data – It’s About Time
Real Time and Big Data – It’s About TimeReal Time and Big Data – It’s About Time
Real Time and Big Data – It’s About Time
 
Hadoop によるゲノム解読
Hadoop によるゲノム解読Hadoop によるゲノム解読
Hadoop によるゲノム解読
 

Similar to How Spark is Enabling the New Wave of Converged Cloud Applications

Map r seattle streams meetup oct 2016
Map r seattle streams meetup   oct 2016Map r seattle streams meetup   oct 2016
Map r seattle streams meetup oct 2016
Nitin Kumar
 
Fast Cars, Big Data - How Streaming Can Help Formula 1
Fast Cars, Big Data - How Streaming Can Help Formula 1Fast Cars, Big Data - How Streaming Can Help Formula 1
Fast Cars, Big Data - How Streaming Can Help Formula 1
Tugdual Grall
 
Where is Data Going? - RMDC Keynote
Where is Data Going? - RMDC KeynoteWhere is Data Going? - RMDC Keynote
Where is Data Going? - RMDC Keynote
Ted Dunning
 
Spark Streaming Data Pipelines
Spark Streaming Data PipelinesSpark Streaming Data Pipelines
Spark Streaming Data Pipelines
MapR Technologies
 
Fast, Scalable, Streaming Applications with Spark Streaming, the Kafka API an...
Fast, Scalable, Streaming Applications with Spark Streaming, the Kafka API an...Fast, Scalable, Streaming Applications with Spark Streaming, the Kafka API an...
Fast, Scalable, Streaming Applications with Spark Streaming, the Kafka API an...
Carol McDonald
 
Fast Cars, Big Data How Streaming can help Formula 1
Fast Cars, Big Data How Streaming can help Formula 1Fast Cars, Big Data How Streaming can help Formula 1
Fast Cars, Big Data How Streaming can help Formula 1
Carol McDonald
 
Streaming in the Extreme
Streaming in the ExtremeStreaming in the Extreme
Streaming in the Extreme
Julius Remigio, CBIP
 
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...
How Spark is Enabling the New Wave of Converged Cloud Applications

  • 1. © 2016 MapR Technologies. How Spark is Enabling the New Wave of Converged Cloud Applications. Ankur Desai & Carol McDonald, December 2016.
  • 2. Today’s Presenters: Carol McDonald, Solutions Architect; Ankur Desai, Sr Mgr, Platform & Products.
  • 3. Agenda: Market Trends; What’s Needed for Converged Streaming Applications; Use Cases; Demo of MapR Streams with Spark Streaming.
  • 4. A Single Platform: On-Prem, In the Cloud, or InterCloud. Flexible processing where change is the norm. Distributed processing across clusters, data centers, public & private cloud environments. Supports global apps that can scale arbitrarily.
  • 5. MapR on Microsoft Azure Marketplace. MapR and Microsoft enable enterprise-grade big data applications in the Azure cloud. Simplified Deployment: Azure Marketplace’s automated deployment capabilities make big data easy. Unlimited Scale: Azure’s infrastructure can scale up to match any requirement and scale down for value. Seamless Interoperability: MapR integrates with other Azure services to enable customers to analyze any type of data to unlock the biggest insights. Product Alignment.
  • 6. Leading optical retail chain: digital transformation for better customer experience; deliver self-service insights across the business. OBJECTIVES: Modernize analytics & improve speed of marketing campaigns; reduce cost of existing systems. CHALLENGES: Existing technologies prohibiting effective & timely reporting and analysis; very long time to extract value from the data, leading to lots of Excel. SOLUTION: MapR platform on the Azure cloud to modernize their infrastructure and sunset legacy systems; faster exploration of data with Apache Drill, mitigating the need for schema development; support for use cases such as customer 360, supply chain & image analysis.
  • 7. The Need For Streaming
  • 8. Decreasing Job Latencies: Hours, Mins, Secs, Milli Secs. Data persistence on-disk; data persistence in-memory.
  • 9. Big Data is Continuously Generated One Event at a Time. Example events:
        { "time": "6:01.103", "event": "RETWEET", "location": { "lat": 40.712784, "lon": -74.005941 } }
        { "time": "5:04.120", "severity": "CRITICAL", "msg": "Service down" }
        { "card_num": 1234, "merchant": "MERCH1", "amount": 50 }
  • 10. Why Stream Processing? Readings: 6:01 P.M.: 72°, 6:02 P.M.: 75°, 6:03 P.M.: 77°, 6:04 P.M.: 85°, 6:05 P.M.: 90°, 6:06 P.M.: 85°, 6:07 P.M.: 77°, 6:08 P.M.: 75°. Analyzing them afterwards finds the 90° peak: "It was hot at 6:05 yesterday!" Batch processing may be too late for some events.
  • 11. Why Stream Processing? The reading 6:05 P.M.: 90° arrives on the stream topic "Temperature" and triggers "Turn on the air conditioning!" It’s becoming important to process events as they arrive.
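The difference between slides 10 and 11 can be sketched in plain Java: a handler that is called once per reading and acts the moment a threshold is reached, rather than scanning the readings afterwards. This is only an illustration, not MapR Streams or Spark code; the class `AirConditioner`, the method `onReading`, and the 90° threshold are assumptions made for the example, and the readings are the ones shown on slide 10.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical per-event handler: reacts to each reading as it arrives.
final class AirConditioner {
    private final int thresholdDegrees;              // assumed trigger temperature
    private final List<String> actions = new ArrayList<>();

    AirConditioner(int thresholdDegrees) {
        this.thresholdDegrees = thresholdDegrees;
    }

    // Called once per event; records an action as soon as the threshold is reached.
    void onReading(String time, int degrees) {
        if (degrees >= thresholdDegrees) {
            actions.add(time + ": turn on the air conditioning");
        }
    }

    List<String> actions() {
        return actions;
    }

    // Feeds the slide 10 readings through the handler one at a time.
    static List<String> replaySlideReadings() {
        String[] times = {"6:01 P.M.", "6:02 P.M.", "6:03 P.M.", "6:04 P.M.",
                          "6:05 P.M.", "6:06 P.M.", "6:07 P.M.", "6:08 P.M."};
        int[] degrees = {72, 75, 77, 85, 90, 85, 77, 75};
        AirConditioner ac = new AirConditioner(90);
        for (int i = 0; i < times.length; i++) {
            ac.onReading(times[i], degrees[i]);
        }
        return ac.actions();
    }
}
```

With these readings the handler records a single action, at 6:05 P.M., which is when the event occurs rather than a day later.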
  • 12. Anatomy of Converged Streaming Applications
  • 13. The Trinity of Real-time. Diagram labels: Real Time Producers; Global Messaging System (Topic 1, Topic 2); Stream Processing; Persistence (Databases and Files); Real Time Operational Analytics.
  • 14. Creating the Streaming Pipeline. Diagram labels: Data Sources; Stream Data (Topic); Process Data; Store Data; Serve Data.
  • 15. MapR Converged Data Platform. Diagram labels: Open Source Engines & Tools; Commercial Engines & Applications; Custom Apps; Cloud and Managed Services; Search and Others; Data Processing; Unified Management and Monitoring. APIs: HDFS API; POSIX, NFS; HBase API; JSON API; Kafka API. Data services: Web-Scale Storage (MapR-FS); Database (MapR-DB); Event Streaming (MapR Streams). Enterprise-Grade Platform Services: High Availability; Real Time; Unified Security; Multi-tenancy; Disaster Recovery; Global Namespace.
  • 16. MapR Streams: Global Pub-sub Event Streaming System for Big Data. Producers publish billions of messages/sec to a topic in a stream. Guaranteed, immediate delivery to all consumers. Tie together geo-dispersed clusters, worldwide. Standard real-time API (Kafka). Integrates with Spark Streaming, Storm, Apex, and Flink. Direct data access (OJAI API) from analytics frameworks. Diagram labels: Producers; Stream; Topic; Consumers; Topic Replication; Remote sites and consumers; Batch analytics.
  • 17. Scalable Event Streaming with MapR Streams. Topics are partitioned for throughput and scalability. Diagram: the topics Pressure, Temperature, and Warning are each split into partitions 1 to 3, with consumers reading from the partitions.
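Slide 17 says topics are partitioned for throughput and scalability. One common way a producer-side client spreads messages over partitions is to hash the message key, so that all messages with the same key land in the same partition. The sketch below shows only that idea; `KeyPartitioner` is a hypothetical class, `String.hashCode` is a stand-in hash, and none of this is the actual MapR Streams or Kafka partitioner, which use their own hashing.

```java
// Illustrative only: maps a message key to one of N partitions by hashing.
final class KeyPartitioner {
    private final int numPartitions;

    KeyPartitioner(int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        this.numPartitions = numPartitions;
    }

    // floorMod keeps the result in [0, numPartitions) even for negative hash codes.
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }
}
```

Because the mapping depends only on the key, per-key ordering is preserved within a partition while different keys can be consumed in parallel.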
  • 18. MapR-DB is Designed to Scale. Fast reads and writes by key. Data is automatically partitioned by key range. Diagram: three key-range partitions, each holding rows with the columns Key, Col B, and Col C.
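Partitioning by key range means that a row key is routed to whichever partition owns the range containing it. A minimal sketch of that lookup, using a sorted map of range start keys, follows. The class `KeyRangeRouter` and the split points are hypothetical, and the split points are assumed to be passed in ascending order; MapR-DB chooses and moves its own range boundaries automatically.

```java
import java.util.TreeMap;

// Illustrative only: routes a row key to the partition whose key range contains it.
final class KeyRangeRouter {
    // Maps the first key of each range to a partition id (hypothetical layout).
    private final TreeMap<String, Integer> rangeStarts = new TreeMap<>();

    // splitPoints are assumed to be given in ascending order.
    KeyRangeRouter(String... splitPoints) {
        rangeStarts.put("", 0);   // the first range starts at the empty key
        int id = 1;
        for (String split : splitPoints) {
            rangeStarts.put(split, id++);
        }
    }

    // floorEntry finds the greatest range start that is <= rowKey.
    int partitionFor(String rowKey) {
        return rangeStarts.floorEntry(rowKey).getValue();
    }
}
```

Because rows are kept sorted by key, a lookup by key touches one partition, and rows with a common key prefix tend to sit together.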
  • 19. Use Cases
  • 20. Customer 360 & Behavior Prediction. Sources: Website Click-Stream; Internal Data Sources; External Data Sources; Support Tickets; DBMS; Email; CRM, feeding topics. Real Time/Offline ClickStream Analysis: Prediction Modelling; Attribution Modelling; Cohort Analysis; Customer Lifetime Value Analysis; Attrition Modelling; Response Modelling; Churn Modelling. Eliminate latency due to data movement between clusters. Eliminate redundant storage with MapR Streams and lower the TCO. Outcomes: 360 Degree Customer View; Customer Behavior Prediction; Better Conversion Rate and Lower Attrition. Platform labels: Offline; Real Time; HA, DR, NFS, Snapshots, Data Protection; EDH/EDL.
  • 21. Prescriptive Analytics: IoT & Auto Manufacturing. Sources: GPS; Telematic Data; Telephone; Truck Fleet, feeding topics. Data generated from cars is stored locally. Data Modelling/Secondary ETL: data is converted from proprietary to Parquet format. Analyses: identify emission patterns; route optimization; customer service requests; how throttling affects other factors such as fuel consumption and emissions; image and video analysis; time series analysis for threshold breach.
  • 22. Demo
  • 23. What if BP had detected problems before the oil hit the water? 1M samples/sec. High performance at scale is necessary!
  • 24. Use Case: Time Series Data. Diagram labels: Sensor time-stamped data; Stream Topic; Spark Streaming (read); Spark processing; Data for real-time monitoring.
  • 25. Use Case: Time Series Data. Sensor time-stamped data is published to a stream topic. Data: PumpId, Date, Time, pressure and flow measurements. Sample messages:
        COHUTTA,3/10/14,1:01,10.27,1.73,881,1.56,85,1.94
        COHUTTA,3/10/14,1:03,10.47,1.732,882,1.7,92,0.66
        COHUTTA,3/10/14,1:02,9.67,1.731,882,0.52,87,1.79
  • 26. Schema. All events are stored; CF data could be set to expire data. Filtered alerts are put in CF alerts. Daily summaries are put in CF stats. Column families: CF data (hz, …, psi); CF alerts (psi, …); CF stats (hz_avg, …, psi_min). Example rows as shown on the slide: COHUTTA_3/10/14_1:01 with values 10.37, 84, 0; COHUTTA_3/10/14 with values 10, 0. The row key contains the oil pump name, date, and a time stamp.
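The row key layout on slide 26 (pump name, date, and time stamp joined with underscores, with the daily summary row using only pump name and date) can be sketched as two small helpers. The class and method names are hypothetical; only the key format comes from the slide.

```java
// Illustrative helpers for the row key format shown on the schema slide.
final class PumpRowKeys {
    private PumpRowKeys() {}

    // Key for a single event row, e.g. COHUTTA_3/10/14_1:01
    static String eventKey(String pump, String date, String time) {
        return pump + "_" + date + "_" + time;
    }

    // Key for the daily summary row, e.g. COHUTTA_3/10/14
    static String dailyKey(String pump, String date) {
        return pump + "_" + date;
    }

    // An event belongs to a day if its key starts with that day's key plus "_".
    static boolean belongsToDay(String eventKey, String dailyKey) {
        return eventKey.startsWith(dailyKey + "_");
    }
}
```

Since the daily key is a prefix of every event key for that pump and day, the events for one day sit next to each other in a store that sorts rows by key.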
  • 29. What Do We Need to Do? Diagram labels: Data Sources; Collect Data (Stream Topic); Process Data; Store Data; Serve Data.
  • 30. Use Case Example Code. Diagram labels: Sensor time-stamped data; Stream Topic; Spark Streaming (read); Spark processing; Data for real-time monitoring.
  • 31. KafkaProducer
        import java.util.Properties;
        import org.apache.kafka.clients.producer.KafkaProducer;
        import org.apache.kafka.clients.producer.ProducerRecord;

        String topic = "/streams/pump:warning";
        public static KafkaProducer<String, String> producer;
        // 1. Configure the KafkaProducer properties (the Apache Kafka client requires both serializers)
        Properties properties = new Properties();
        properties.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        properties.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // 2. Create the KafkaProducer with the properties
        producer = new KafkaProducer<String, String>(properties);
        String txt = "msg text";
        // 3. Create a producer record with the topic and message
        ProducerRecord<String, String> record = new ProducerRecord<String, String>(topic, txt);
        // 4. Use the Kafka producer to send the record
        producer.send(record);
  • 32. Use Case Example Code
        Diagram labels: Sensor time-stamped data, Stream Topic, Spark Streaming (read), Spark processing, Data for real-time monitoring
  • 33. Create a DStream
        DStream: a sequence of RDDs representing a stream of data
        val ssc = new StreamingContext(sparkConf, Seconds(5))
        // Create an input stream for a set of topics
        val dStream = KafkaUtils.createDirectStream[String, String](ssc, kafkaParams, topicsSet)
        Diagram: the dStream as a series of batches (batch time 0 to 1, 1 to 2, 2 to 3), each stored in memory as an RDD
  • 34. Message Data to Sensor Object
        case class Sensor(resid: String, date: String, time: String, hz: Double, disp: Double,
                          flo: Double, sedPPM: Double, psi: Double, chlPPM: Double)

        // Parse CSV strings into Sensor objects
        def parseSensor(str: String): Sensor = {
          val p = str.split(",")
          Sensor(p(0), p(1), p(2), p(3).toDouble, p(4).toDouble, p(5).toDouble,
                 p(6).toDouble, p(7).toDouble, p(8).toDouble)
        }
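For readers following along in Java, the same parsing step might look like the sketch below. The field names and order follow the Scala `Sensor` case class on this slide. The Java class itself and the nine-field length check are additions for illustration and are not part of the presenters' code.

```java
// Plain-Java analogue of the Scala Sensor case class; field names follow the slide.
final class Sensor {
    final String resid, date, time;
    final double hz, disp, flo, sedPPM, psi, chlPPM;

    Sensor(String resid, String date, String time, double hz, double disp,
           double flo, double sedPPM, double psi, double chlPPM) {
        this.resid = resid;
        this.date = date;
        this.time = time;
        this.hz = hz;
        this.disp = disp;
        this.flo = flo;
        this.sedPPM = sedPPM;
        this.psi = psi;
        this.chlPPM = chlPPM;
    }

    // Parse one CSV line. The length check is an added safeguard (not on the slide):
    // it rejects lines that do not have exactly nine comma-separated fields.
    static Sensor parse(String line) {
        String[] p = line.split(",");
        if (p.length != 9) {
            throw new IllegalArgumentException("expected 9 fields, got " + p.length + ": " + line);
        }
        return new Sensor(p[0], p[1], p[2],
                Double.parseDouble(p[3]), Double.parseDouble(p[4]), Double.parseDouble(p[5]),
                Double.parseDouble(p[6]), Double.parseDouble(p[7]), Double.parseDouble(p[8]));
    }

    public static void main(String[] args) {
        Sensor s = parse("COHUTTA,3/10/14,1:01,10.27,1.73,881,1.56,85,1.94");
        System.out.println(s.resid + " hz=" + s.hz + " psi=" + s.psi);
    }
}
```

A non-numeric measurement still fails, because `Double.parseDouble` throws `NumberFormatException`. In a streaming job, such lines would typically be filtered out or logged rather than allowed to fail the batch.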
  • 35. Process DStream
        // Parse message values into Sensor objects
        val sensorDStream = dStream.map(_._2).map(parseSensor)
        Diagram: map is applied to each dStream RDD (batch time 0 to 1, 1 to 2, 2 to 3) to produce the sensorDStream RDDs; new RDDs are created for every batch
  • 36. DataFrame and SQL Operations
        // For each RDD in the DStream
        sensorDStream.foreachRDD { rdd =>
          val sqlContext = SQLContext.getOrCreate(rdd.sparkContext)
          import sqlContext.implicits._ // needed for rdd.toDF()
          // Convert the RDD to a DataFrame and register it as a temporary table
          rdd.toDF().registerTempTable("sensor")
          // Get the max, min, and avg of the pump values
          val res = sqlContext.sql(
            """SELECT resid, date,
                 max(hz) as maxhz, min(hz) as minhz, avg(hz) as avghz,
                 max(disp) as maxdisp, min(disp) as mindisp, avg(disp) as avgdisp,
                 max(flo) as maxflo, min(flo) as minflo, avg(flo) as avgflo,
                 max(psi) as maxpsi, min(psi) as minpsi, avg(psi) as avgpsi
               FROM sensor GROUP BY resid, date""")
          res.show()
        }
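To make the GROUP BY on this slide concrete without a Spark cluster, here is a minimal plain-Java sketch of the same aggregation for a single column (`psi`) over one batch. It uses `java.util.stream` rather than Spark SQL. The `PumpStats` and `Reading` names and the `resid|date` grouping key are hypothetical choices made for the example.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class PumpStats {
    // Hypothetical minimal reading: the two grouping columns plus one measurement.
    static final class Reading {
        final String resid;
        final String date;
        final double psi;

        Reading(String resid, String date, double psi) {
            this.resid = resid;
            this.date = date;
            this.psi = psi;
        }
    }

    // Equivalent in spirit to:
    //   SELECT resid, date, max(psi), min(psi), avg(psi) FROM sensor GROUP BY resid, date
    // The map key is "resid|date"; the value holds count, min, max, and average of psi.
    static Map<String, DoubleSummaryStatistics> psiByPumpAndDay(List<Reading> batch) {
        return batch.stream().collect(Collectors.groupingBy(
                r -> r.resid + "|" + r.date,
                TreeMap::new,
                Collectors.summarizingDouble(r -> r.psi)));
    }

    public static void main(String[] args) {
        // psi values taken from the three sample records on slide 25
        List<Reading> batch = Arrays.asList(
                new Reading("COHUTTA", "3/10/14", 85),
                new Reading("COHUTTA", "3/10/14", 92),
                new Reading("COHUTTA", "3/10/14", 87));
        psiByPumpAndDay(batch).forEach((key, s) ->
                System.out.println(key + " max=" + s.getMax() + " min=" + s.getMin() + " avg=" + s.getAverage()));
    }
}
```

As in the streaming job, the statistics cover only the records in the batch passed in; the sketch does not carry state across batches.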
  • 37. Streaming Application Output
  • 38. Save to HBase
        rdd.map(Sensor.convertToPut).saveAsHadoopDataset(jobConfig)
        Output operation: persist data to external storage
        Diagram: for each batch (batch time 0 to 1, 1 to 2, 2 to 3), the linesRDD DStream is mapped to the sensorRDD DStream, mapped again to Put objects, and saved; the Put objects are written to HBase
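The slides do not show the body of `Sensor.convertToPut`. As a rough idea of what such a conversion has to produce, the sketch below builds a row key and a set of `family:qualifier` cells in plain Java, using a `Map` as a stand-in for an HBase `Put`. Everything here is hypothetical except the column family names (`data`, `alerts`) and the row-key layout, which come from the schema slide. In particular, the low-pressure threshold of 5.0 is an assumed value, not one stated in the presentation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for building an HBase Put: a row key plus "family:qualifier" -> value cells.
final class SensorCells {
    // Assumed alert threshold for illustration only; the slides do not give one.
    static final double LOW_PSI_THRESHOLD = 5.0;

    private SensorCells() {}

    // Row-key layout from the schema slide: pump name, date, and time.
    static String rowKey(String pump, String date, String time) {
        return pump + "_" + date + "_" + time;
    }

    // Every reading goes to CF "data"; low-pressure readings are also copied to CF "alerts".
    static Map<String, String> toCells(double hz, double psi) {
        Map<String, String> cells = new LinkedHashMap<>();
        cells.put("data:hz", Double.toString(hz));
        cells.put("data:psi", Double.toString(psi));
        if (psi < LOW_PSI_THRESHOLD) {
            cells.put("alerts:psi", Double.toString(psi));
        }
        return cells;
    }

    public static void main(String[] args) {
        System.out.println(rowKey("COHUTTA", "3/10/14", "1:01") + " -> " + toCells(10.37, 84.0));
    }
}
```

In a real job, each map entry would become one column added to a `Put` for that row key before the RDD is saved.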
  • 39. Start Receiving Data
        sensorDStream.foreachRDD { rdd =>
          . . .
        }
        // Start the computation
        ssc.start()
        // Wait for the computation to terminate
        ssc.awaitTermination()
  • 40. Building a Complete Data Architecture
        Diagram labels: Sources/Apps, Bulk Processing, Stream Processing, MapR Converged Data Platform, MapR File System (MapR-FS), MapR Database (MapR-DB), MapR Streams
  • 42. Azure and MapR Resources – 3 steps to get started
        • Azure Overview: https://www.mapr.com/partners/partner/microsoft-azure-microsofts-cloud-computing-platform-moving-faster-achieving-more
        • 7 Steps to Deploy the MapR Sandbox on Azure: https://www.mapr.com/blog/7-steps-deploy-mapr-sandbox-microsoft-azure
        • Azure Test Drive: http://mapr.testdrivelabs.com/ (subject to change)
  • 43. Q&A: Engage with us!
        1. Read an explanation of the code and download it:
           – https://www.mapr.com/blog/fast-scalable-streaming-applications-mapr-streams-spark-streaming-and-mapr-db
           – https://www.mapr.com/blog/spark-streaming-hbase
        2. Get Started: MapR Converged Data Platform https://www.mapr.com/get-started-with-mapr
        3. Get Answers: MapR Converge Community https://community.mapr.com/community/answers
        4. Get Trained: MapR On-Demand Training https://learn.mapr.com