SlideShare a Scribd company logo
1 of 11
Download to read offline
APACHE SPARK
PREPARING FOR THE NEXT WAVE OF REACTIVE BIG DATA
74% Developers
8% Data Scientists
7% C-level execs
TOP 3 LANGUAGES
USED WITH SPARK
88% Scala
44% Java
22% Python
31%
are evaluating
Spark now
are running Spark
in production
13%
82%
of users chose
Spark to replace
MapReduce
78%
of users
need faster
processing
of larger
data sets
62%
of users load data into
Spark with Hadoop DFS
54%
of users
run Spark
standalone
67%
of users need
Spark for event
stream processing
20%
are planning to use
Spark in 2015
TOP 3 INDUSTRIES
RESPONDENTS
Telecoms, Banks, Retail
APACHE SPARK SURVEY 2015 - QUICK SNAPSHOT
3
JOB TYPE/ROLE
7.5%Data Scientist
6.5%C-Level Executive
3.5%Software Architect
3.5%Dev Ops
1% Business Analyst
74%Developer
6.5%Other
INDUSTRY FOCUS
33%Other
5%Consulting
4%Healthcare / Insurance
9%Advertising
10% Software / Technology
11%Retail
12%Banking / Finance
16% Telecommunications / Networks
Including Biotechnology/Chemistry,
Machinery, Education, Government
and Utilities and other sectors
4
INFRASTRUCTURE TECHNOLOGIES IN USE
53% Amazon EC2
34% Docker
22% Cloudera CDH
16% Ansible
14% Mesos
13% OpenStack
12% Apache.org Builds of Hadoop
10% HortonWorks HDP
10% Heroku
8% Google Compute Engine
7% Core OS
7% MapR Hadoop Distribution
6% Microsoft Azure
5% Marathon
4% Kubernetes
2% Aurora
11% Other XaaS
5
Evaluating
Spark now
Currently using
in production
Evaluated,
not planning to use
Evaluated,
will use in 2016 or later
Um, what’s Spark?
Planning to
use in 2015
31%
28%
20%
13%
6%
2%
CURRENT RELATIONSHIP WITH SPARK
6
Fast Batch
Processing of
Large Data Sets
78%
Support for
Event Stream
Processing
60%
Fast Data
Queries in
Real Time
56%
Improved
Programmer
Productivity
55%
BUSINESS GOALS IN MIND
7
SPARK FEATURES/MODULES IN DEMAND
25%
59%
65%
82%
51%
Core API as a
Replacement for
MapReduce
Streaming Library
(Spark Streaming)
Machine
Learning Library
(MLlib) Integrated SQL
(SparkSQL)
Graph
Algorithms Library
(GraphX)
8
DATA PROCESSING WITH SPARK
39%
41%
46%
46%
59%
61%
Read or Write Data to One or More Databases
Static Reports
SQL Queries and Business Intelligence
Write Data to Hadoop Distributed File System (HDFS)
Ad-hoc Queries and Reporting
ETL Data from External Sources
67% Event Stream Processing
71%
65%
40%
Use Spark as Part of a Larger Data Pipeline
Extract Information from Data Sooner Rather than Later
Automate Decision Making at Runtime
9
2nd
Java 44%
1st
Scala 88%
3rd
Python 22%
WHICH LANGUAGES ARE IMPORTANT TO YOUR SPARK INSTALLATION?
Honorable mentions: R, Clojure, Groovy, Ruby & Go
10
HOW DO YOU LOAD DATA INTO SPARK?
62% Hadoop Distributed
File System (HDFS)
18% Other Services
(e.g. over socket connection)
41% Apache Kafka
46% Databases
29% Amazon S3
12% Other*
*Including:
Apache Cassandra, Amazon
Kinesis and Apache HBase
11
Typesafe (Twitter: @Typesafe) is dedicated to helping developers build Reactive applications on the JVM. Backed by Greylock
Partners, Shasta Ventures, Bain Capital Ventures and Juniper Networks, Typesafe is headquartered in San Francisco with
offices in Switzerland and Sweden. To start building Reactive applications today, download Typesafe Activator.
© 2015 Typesafe
Hello, Apache Spark!
Typesafe Activator template for devs
DOWNLOAD
Get the FULL report
(PDF)
DOWNLOAD

More Related Content

What's hot

Overview of Apache Fink: the 4 G of Big Data Analytics Frameworks
Overview of Apache Fink: the 4 G of Big Data Analytics FrameworksOverview of Apache Fink: the 4 G of Big Data Analytics Frameworks
Overview of Apache Fink: the 4 G of Big Data Analytics FrameworksSlim Baltagi
 
The Next Generation of Data Processing and Open Source
The Next Generation of Data Processing and Open SourceThe Next Generation of Data Processing and Open Source
The Next Generation of Data Processing and Open SourceDataWorks Summit/Hadoop Summit
 
Apache Kafka Streams + Machine Learning / Deep Learning
Apache Kafka Streams + Machine Learning / Deep LearningApache Kafka Streams + Machine Learning / Deep Learning
Apache Kafka Streams + Machine Learning / Deep LearningKai Wähner
 
H2O Advancements - Arno Candel
H2O Advancements - Arno CandelH2O Advancements - Arno Candel
H2O Advancements - Arno CandelSri Ambati
 
[Strata] Sparkta
[Strata] Sparkta[Strata] Sparkta
[Strata] SparktaStratio
 
Apache Flink: Real-World Use Cases for Streaming Analytics
Apache Flink: Real-World Use Cases for Streaming AnalyticsApache Flink: Real-World Use Cases for Streaming Analytics
Apache Flink: Real-World Use Cases for Streaming AnalyticsSlim Baltagi
 
Building Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache KafkaBuilding Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache KafkaSlim Baltagi
 
Spark Summit East 2015 Keynote -- Databricks CEO Ion Stoica
Spark Summit East 2015 Keynote -- Databricks CEO Ion StoicaSpark Summit East 2015 Keynote -- Databricks CEO Ion Stoica
Spark Summit East 2015 Keynote -- Databricks CEO Ion StoicaDatabricks
 
Spark Streaming the Industrial IoT
Spark Streaming the Industrial IoTSpark Streaming the Industrial IoT
Spark Streaming the Industrial IoTJim Haughwout
 
Shifting Data Science into High Gear
Shifting Data Science into High GearShifting Data Science into High Gear
Shifting Data Science into High GearSpark Summit
 
Migrating from Closed to Open Source - Fonda Ingram & Ken Sanford
Migrating from Closed to Open Source - Fonda Ingram & Ken SanfordMigrating from Closed to Open Source - Fonda Ingram & Ken Sanford
Migrating from Closed to Open Source - Fonda Ingram & Ken SanfordSri Ambati
 
What’s new in Apache Spark 2.3
What’s new in Apache Spark 2.3What’s new in Apache Spark 2.3
What’s new in Apache Spark 2.3DataWorks Summit
 
Testistanbul 2016 - Keynote: "Enterprise Challenges of Test Data" by Rex Black
Testistanbul 2016 - Keynote: "Enterprise Challenges of Test Data" by Rex BlackTestistanbul 2016 - Keynote: "Enterprise Challenges of Test Data" by Rex Black
Testistanbul 2016 - Keynote: "Enterprise Challenges of Test Data" by Rex BlackTurkish Testing Board
 
Insights Without Tradeoffs: Using Structured Streaming
Insights Without Tradeoffs: Using Structured StreamingInsights Without Tradeoffs: Using Structured Streaming
Insights Without Tradeoffs: Using Structured StreamingDatabricks
 
Smack Stack and Beyond—Building Fast Data Pipelines with Jorg Schad
Smack Stack and Beyond—Building Fast Data Pipelines with Jorg SchadSmack Stack and Beyond—Building Fast Data Pipelines with Jorg Schad
Smack Stack and Beyond—Building Fast Data Pipelines with Jorg SchadSpark Summit
 
Apache Flink community Update for March 2016 - Slim Baltagi
Apache Flink community Update for March 2016 - Slim BaltagiApache Flink community Update for March 2016 - Slim Baltagi
Apache Flink community Update for March 2016 - Slim BaltagiSlim Baltagi
 

What's hot (20)

Overview of Apache Fink: the 4 G of Big Data Analytics Frameworks
Overview of Apache Fink: the 4 G of Big Data Analytics FrameworksOverview of Apache Fink: the 4 G of Big Data Analytics Frameworks
Overview of Apache Fink: the 4 G of Big Data Analytics Frameworks
 
The Next Generation of Data Processing and Open Source
The Next Generation of Data Processing and Open SourceThe Next Generation of Data Processing and Open Source
The Next Generation of Data Processing and Open Source
 
Log I am your father
Log I am your fatherLog I am your father
Log I am your father
 
Apache Kafka Streams + Machine Learning / Deep Learning
Apache Kafka Streams + Machine Learning / Deep LearningApache Kafka Streams + Machine Learning / Deep Learning
Apache Kafka Streams + Machine Learning / Deep Learning
 
H2O Advancements - Arno Candel
H2O Advancements - Arno CandelH2O Advancements - Arno Candel
H2O Advancements - Arno Candel
 
[Strata] Sparkta
[Strata] Sparkta[Strata] Sparkta
[Strata] Sparkta
 
Apache Flink: Real-World Use Cases for Streaming Analytics
Apache Flink: Real-World Use Cases for Streaming AnalyticsApache Flink: Real-World Use Cases for Streaming Analytics
Apache Flink: Real-World Use Cases for Streaming Analytics
 
Building Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache KafkaBuilding Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache Kafka
 
Spark Summit East 2015 Keynote -- Databricks CEO Ion Stoica
Spark Summit East 2015 Keynote -- Databricks CEO Ion StoicaSpark Summit East 2015 Keynote -- Databricks CEO Ion Stoica
Spark Summit East 2015 Keynote -- Databricks CEO Ion Stoica
 
Spark Streaming the Industrial IoT
Spark Streaming the Industrial IoTSpark Streaming the Industrial IoT
Spark Streaming the Industrial IoT
 
Real Time Machine Learning Visualization with Spark
Real Time Machine Learning Visualization with SparkReal Time Machine Learning Visualization with Spark
Real Time Machine Learning Visualization with Spark
 
Shifting Data Science into High Gear
Shifting Data Science into High GearShifting Data Science into High Gear
Shifting Data Science into High Gear
 
Migrating from Closed to Open Source - Fonda Ingram & Ken Sanford
Migrating from Closed to Open Source - Fonda Ingram & Ken SanfordMigrating from Closed to Open Source - Fonda Ingram & Ken Sanford
Migrating from Closed to Open Source - Fonda Ingram & Ken Sanford
 
Apache Eagle: Secure Hadoop in Real Time
Apache Eagle: Secure Hadoop in Real TimeApache Eagle: Secure Hadoop in Real Time
Apache Eagle: Secure Hadoop in Real Time
 
What’s new in Apache Spark 2.3
What’s new in Apache Spark 2.3What’s new in Apache Spark 2.3
What’s new in Apache Spark 2.3
 
Testistanbul 2016 - Keynote: "Enterprise Challenges of Test Data" by Rex Black
Testistanbul 2016 - Keynote: "Enterprise Challenges of Test Data" by Rex BlackTestistanbul 2016 - Keynote: "Enterprise Challenges of Test Data" by Rex Black
Testistanbul 2016 - Keynote: "Enterprise Challenges of Test Data" by Rex Black
 
Securing Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise ContextSecuring Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise Context
 
Insights Without Tradeoffs: Using Structured Streaming
Insights Without Tradeoffs: Using Structured StreamingInsights Without Tradeoffs: Using Structured Streaming
Insights Without Tradeoffs: Using Structured Streaming
 
Smack Stack and Beyond—Building Fast Data Pipelines with Jorg Schad
Smack Stack and Beyond—Building Fast Data Pipelines with Jorg SchadSmack Stack and Beyond—Building Fast Data Pipelines with Jorg Schad
Smack Stack and Beyond—Building Fast Data Pipelines with Jorg Schad
 
Apache Flink community Update for March 2016 - Slim Baltagi
Apache Flink community Update for March 2016 - Slim BaltagiApache Flink community Update for March 2016 - Slim Baltagi
Apache Flink community Update for March 2016 - Slim Baltagi
 

Viewers also liked

Apache® Spark™ 1.6 presented by Databricks co-founder Patrick Wendell
Apache® Spark™ 1.6 presented by Databricks co-founder Patrick WendellApache® Spark™ 1.6 presented by Databricks co-founder Patrick Wendell
Apache® Spark™ 1.6 presented by Databricks co-founder Patrick WendellDatabricks
 
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...Spark Summit
 
Parallelizing Existing R Packages with SparkR
Parallelizing Existing R Packages with SparkRParallelizing Existing R Packages with SparkR
Parallelizing Existing R Packages with SparkRDatabricks
 
Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...Databricks
 
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...Databricks
 
Hadoop MapReduce Fundamentals
Hadoop MapReduce FundamentalsHadoop MapReduce Fundamentals
Hadoop MapReduce FundamentalsLynn Langit
 

Viewers also liked (7)

Apache® Spark™ 1.6 presented by Databricks co-founder Patrick Wendell
Apache® Spark™ 1.6 presented by Databricks co-founder Patrick WendellApache® Spark™ 1.6 presented by Databricks co-founder Patrick Wendell
Apache® Spark™ 1.6 presented by Databricks co-founder Patrick Wendell
 
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal-(Josh Ro...
 
Parallelizing Existing R Packages with SparkR
Parallelizing Existing R Packages with SparkRParallelizing Existing R Packages with SparkR
Parallelizing Existing R Packages with SparkR
 
Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...
 
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
 
Map reduce vs spark
Map reduce vs sparkMap reduce vs spark
Map reduce vs spark
 
Hadoop MapReduce Fundamentals
Hadoop MapReduce FundamentalsHadoop MapReduce Fundamentals
Hadoop MapReduce Fundamentals
 

Similar to [Sneak Preview] Apache Spark: Preparing for the next wave of Reactive Big Data

Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark StreamingTiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark StreamingPaco Nathan
 
BDTC2015 databricks-辛湜-state of spark
BDTC2015 databricks-辛湜-state of sparkBDTC2015 databricks-辛湜-state of spark
BDTC2015 databricks-辛湜-state of sparkJerry Wen
 
Why Apache Spark is the Heir to MapReduce in the Hadoop Ecosystem
Why Apache Spark is the Heir to MapReduce in the Hadoop EcosystemWhy Apache Spark is the Heir to MapReduce in the Hadoop Ecosystem
Why Apache Spark is the Heir to MapReduce in the Hadoop EcosystemCloudera, Inc.
 
Why spark by Stratio - v.1.0
Why spark by Stratio - v.1.0Why spark by Stratio - v.1.0
Why spark by Stratio - v.1.0Stratio
 
Deploying Data Science Engines to Production
Deploying Data Science Engines to ProductionDeploying Data Science Engines to Production
Deploying Data Science Engines to ProductionMostafa Majidpour
 
Strata 2015 Data Preview: Spark, Data Visualization, YARN, and More
Strata 2015 Data Preview: Spark, Data Visualization, YARN, and MoreStrata 2015 Data Preview: Spark, Data Visualization, YARN, and More
Strata 2015 Data Preview: Spark, Data Visualization, YARN, and MorePaco Nathan
 
Scaling up with Cisco Big Data: Data + Science = Data Science
Scaling up with Cisco Big Data: Data + Science = Data ScienceScaling up with Cisco Big Data: Data + Science = Data Science
Scaling up with Cisco Big Data: Data + Science = Data ScienceeRic Choo
 
Apache spark - History and market overview
Apache spark - History and market overviewApache spark - History and market overview
Apache spark - History and market overviewMartin Zapletal
 
Evolution of spark framework for simplifying data analysis.
Evolution of spark framework for simplifying data analysis.Evolution of spark framework for simplifying data analysis.
Evolution of spark framework for simplifying data analysis.Anirudh Gangwar
 
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...Edureka!
 
TWDI Accelerate Seattle, Oct 16, 2017: Distributed and In-Database Analytics ...
TWDI Accelerate Seattle, Oct 16, 2017: Distributed and In-Database Analytics ...TWDI Accelerate Seattle, Oct 16, 2017: Distributed and In-Database Analytics ...
TWDI Accelerate Seattle, Oct 16, 2017: Distributed and In-Database Analytics ...Debraj GuhaThakurta
 
TDWI Accelerate, Seattle, Oct 16, 2017: Distributed and In-Database Analytics...
TDWI Accelerate, Seattle, Oct 16, 2017: Distributed and In-Database Analytics...TDWI Accelerate, Seattle, Oct 16, 2017: Distributed and In-Database Analytics...
TDWI Accelerate, Seattle, Oct 16, 2017: Distributed and In-Database Analytics...Debraj GuhaThakurta
 
Strata singapore survey
Strata singapore surveyStrata singapore survey
Strata singapore surveyCheng Feng
 
Streaming Data and Stream Processing with Apache Kafka
Streaming Data and Stream Processing with Apache KafkaStreaming Data and Stream Processing with Apache Kafka
Streaming Data and Stream Processing with Apache Kafkaconfluent
 
Keynote – From MapReduce to Spark: An Ecosystem Evolves by Doug Cutting, Chie...
Keynote – From MapReduce to Spark: An Ecosystem Evolves by Doug Cutting, Chie...Keynote – From MapReduce to Spark: An Ecosystem Evolves by Doug Cutting, Chie...
Keynote – From MapReduce to Spark: An Ecosystem Evolves by Doug Cutting, Chie...Cloudera, Inc.
 

Similar to [Sneak Preview] Apache Spark: Preparing for the next wave of Reactive Big Data (20)

Dev Ops Training
Dev Ops TrainingDev Ops Training
Dev Ops Training
 
Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark StreamingTiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
 
Started with-apache-spark
Started with-apache-sparkStarted with-apache-spark
Started with-apache-spark
 
BDTC2015 databricks-辛湜-state of spark
BDTC2015 databricks-辛湜-state of sparkBDTC2015 databricks-辛湜-state of spark
BDTC2015 databricks-辛湜-state of spark
 
Why Apache Spark is the Heir to MapReduce in the Hadoop Ecosystem
Why Apache Spark is the Heir to MapReduce in the Hadoop EcosystemWhy Apache Spark is the Heir to MapReduce in the Hadoop Ecosystem
Why Apache Spark is the Heir to MapReduce in the Hadoop Ecosystem
 
2016 spark survey
2016 spark survey2016 spark survey
2016 spark survey
 
Why spark by Stratio - v.1.0
Why spark by Stratio - v.1.0Why spark by Stratio - v.1.0
Why spark by Stratio - v.1.0
 
Big data with java
Big data with javaBig data with java
Big data with java
 
Deploying Data Science Engines to Production
Deploying Data Science Engines to ProductionDeploying Data Science Engines to Production
Deploying Data Science Engines to Production
 
Strata 2015 Data Preview: Spark, Data Visualization, YARN, and More
Strata 2015 Data Preview: Spark, Data Visualization, YARN, and MoreStrata 2015 Data Preview: Spark, Data Visualization, YARN, and More
Strata 2015 Data Preview: Spark, Data Visualization, YARN, and More
 
Scaling up with Cisco Big Data: Data + Science = Data Science
Scaling up with Cisco Big Data: Data + Science = Data ScienceScaling up with Cisco Big Data: Data + Science = Data Science
Scaling up with Cisco Big Data: Data + Science = Data Science
 
Apache spark - History and market overview
Apache spark - History and market overviewApache spark - History and market overview
Apache spark - History and market overview
 
Evolution of spark framework for simplifying data analysis.
Evolution of spark framework for simplifying data analysis.Evolution of spark framework for simplifying data analysis.
Evolution of spark framework for simplifying data analysis.
 
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
Apache Spark Tutorial | Spark Tutorial for Beginners | Apache Spark Training ...
 
TWDI Accelerate Seattle, Oct 16, 2017: Distributed and In-Database Analytics ...
TWDI Accelerate Seattle, Oct 16, 2017: Distributed and In-Database Analytics ...TWDI Accelerate Seattle, Oct 16, 2017: Distributed and In-Database Analytics ...
TWDI Accelerate Seattle, Oct 16, 2017: Distributed and In-Database Analytics ...
 
TDWI Accelerate, Seattle, Oct 16, 2017: Distributed and In-Database Analytics...
TDWI Accelerate, Seattle, Oct 16, 2017: Distributed and In-Database Analytics...TDWI Accelerate, Seattle, Oct 16, 2017: Distributed and In-Database Analytics...
TDWI Accelerate, Seattle, Oct 16, 2017: Distributed and In-Database Analytics...
 
Strata singapore survey
Strata singapore surveyStrata singapore survey
Strata singapore survey
 
Streaming Data and Stream Processing with Apache Kafka
Streaming Data and Stream Processing with Apache KafkaStreaming Data and Stream Processing with Apache Kafka
Streaming Data and Stream Processing with Apache Kafka
 
INFO491FinalPaper
INFO491FinalPaperINFO491FinalPaper
INFO491FinalPaper
 
Keynote – From MapReduce to Spark: An Ecosystem Evolves by Doug Cutting, Chie...
Keynote – From MapReduce to Spark: An Ecosystem Evolves by Doug Cutting, Chie...Keynote – From MapReduce to Spark: An Ecosystem Evolves by Doug Cutting, Chie...
Keynote – From MapReduce to Spark: An Ecosystem Evolves by Doug Cutting, Chie...
 

More from Legacy Typesafe (now Lightbend)

The How and Why of Fast Data Analytics with Apache Spark
The How and Why of Fast Data Analytics with Apache SparkThe How and Why of Fast Data Analytics with Apache Spark
The How and Why of Fast Data Analytics with Apache SparkLegacy Typesafe (now Lightbend)
 
Typesafe Reactive Platform: Monitoring 1.0, Commercial features and more
Typesafe Reactive Platform: Monitoring 1.0, Commercial features and moreTypesafe Reactive Platform: Monitoring 1.0, Commercial features and more
Typesafe Reactive Platform: Monitoring 1.0, Commercial features and moreLegacy Typesafe (now Lightbend)
 
Akka 2.4 plus new commercial features in Typesafe Reactive Platform
Akka 2.4 plus new commercial features in Typesafe Reactive PlatformAkka 2.4 plus new commercial features in Typesafe Reactive Platform
Akka 2.4 plus new commercial features in Typesafe Reactive PlatformLegacy Typesafe (now Lightbend)
 
Reactive Revealed Part 3 of 3: Resiliency, Failures vs Errors, Isolation, Del...
Reactive Revealed Part 3 of 3: Resiliency, Failures vs Errors, Isolation, Del...Reactive Revealed Part 3 of 3: Resiliency, Failures vs Errors, Isolation, Del...
Reactive Revealed Part 3 of 3: Resiliency, Failures vs Errors, Isolation, Del...Legacy Typesafe (now Lightbend)
 
Akka 2.4 plus commercial features in Typesafe Reactive Platform
Akka 2.4 plus commercial features in Typesafe Reactive PlatformAkka 2.4 plus commercial features in Typesafe Reactive Platform
Akka 2.4 plus commercial features in Typesafe Reactive PlatformLegacy Typesafe (now Lightbend)
 
Reactive Revealed Part 2: Scalability, Elasticity and Location Transparency i...
Reactive Revealed Part 2: Scalability, Elasticity and Location Transparency i...Reactive Revealed Part 2: Scalability, Elasticity and Location Transparency i...
Reactive Revealed Part 2: Scalability, Elasticity and Location Transparency i...Legacy Typesafe (now Lightbend)
 
Microservices 101: Exploiting Reality's Constraints with Technology
Microservices 101: Exploiting Reality's Constraints with TechnologyMicroservices 101: Exploiting Reality's Constraints with Technology
Microservices 101: Exploiting Reality's Constraints with TechnologyLegacy Typesafe (now Lightbend)
 
Four Things to Know About Reliable Spark Streaming with Typesafe and Databricks
Four Things to Know About Reliable Spark Streaming with Typesafe and DatabricksFour Things to Know About Reliable Spark Streaming with Typesafe and Databricks
Four Things to Know About Reliable Spark Streaming with Typesafe and DatabricksLegacy Typesafe (now Lightbend)
 
A Deeper Look Into Reactive Streams with Akka Streams 1.0 and Slick 3.0
A Deeper Look Into Reactive Streams with Akka Streams 1.0 and Slick 3.0A Deeper Look Into Reactive Streams with Akka Streams 1.0 and Slick 3.0
A Deeper Look Into Reactive Streams with Akka Streams 1.0 and Slick 3.0Legacy Typesafe (now Lightbend)
 
Modernizing Your Aging Architecture: What Enterprise Architects Need To Know ...
Modernizing Your Aging Architecture: What Enterprise Architects Need To Know ...Modernizing Your Aging Architecture: What Enterprise Architects Need To Know ...
Modernizing Your Aging Architecture: What Enterprise Architects Need To Know ...Legacy Typesafe (now Lightbend)
 
Reactive Streams 1.0.0 and Why You Should Care (webinar)
Reactive Streams 1.0.0 and Why You Should Care (webinar)Reactive Streams 1.0.0 and Why You Should Care (webinar)
Reactive Streams 1.0.0 and Why You Should Care (webinar)Legacy Typesafe (now Lightbend)
 

More from Legacy Typesafe (now Lightbend) (16)

The How and Why of Fast Data Analytics with Apache Spark
The How and Why of Fast Data Analytics with Apache SparkThe How and Why of Fast Data Analytics with Apache Spark
The How and Why of Fast Data Analytics with Apache Spark
 
Reactive Design Patterns
Reactive Design PatternsReactive Design Patterns
Reactive Design Patterns
 
Revitalizing Aging Architectures with Microservices
Revitalizing Aging Architectures with MicroservicesRevitalizing Aging Architectures with Microservices
Revitalizing Aging Architectures with Microservices
 
Typesafe Reactive Platform: Monitoring 1.0, Commercial features and more
Typesafe Reactive Platform: Monitoring 1.0, Commercial features and moreTypesafe Reactive Platform: Monitoring 1.0, Commercial features and more
Typesafe Reactive Platform: Monitoring 1.0, Commercial features and more
 
Akka 2.4 plus new commercial features in Typesafe Reactive Platform
Akka 2.4 plus new commercial features in Typesafe Reactive PlatformAkka 2.4 plus new commercial features in Typesafe Reactive Platform
Akka 2.4 plus new commercial features in Typesafe Reactive Platform
 
How to deploy Apache Spark 
to Mesos/DCOS
How to deploy Apache Spark 
to Mesos/DCOSHow to deploy Apache Spark 
to Mesos/DCOS
How to deploy Apache Spark 
to Mesos/DCOS
 
Reactive Revealed Part 3 of 3: Resiliency, Failures vs Errors, Isolation, Del...
Reactive Revealed Part 3 of 3: Resiliency, Failures vs Errors, Isolation, Del...Reactive Revealed Part 3 of 3: Resiliency, Failures vs Errors, Isolation, Del...
Reactive Revealed Part 3 of 3: Resiliency, Failures vs Errors, Isolation, Del...
 
Akka 2.4 plus commercial features in Typesafe Reactive Platform
Akka 2.4 plus commercial features in Typesafe Reactive PlatformAkka 2.4 plus commercial features in Typesafe Reactive Platform
Akka 2.4 plus commercial features in Typesafe Reactive Platform
 
Reactive Revealed Part 2: Scalability, Elasticity and Location Transparency i...
Reactive Revealed Part 2: Scalability, Elasticity and Location Transparency i...Reactive Revealed Part 2: Scalability, Elasticity and Location Transparency i...
Reactive Revealed Part 2: Scalability, Elasticity and Location Transparency i...
 
Microservices 101: Exploiting Reality's Constraints with Technology
Microservices 101: Exploiting Reality's Constraints with TechnologyMicroservices 101: Exploiting Reality's Constraints with Technology
Microservices 101: Exploiting Reality's Constraints with Technology
 
Four Things to Know About Reliable Spark Streaming with Typesafe and Databricks
Four Things to Know About Reliable Spark Streaming with Typesafe and DatabricksFour Things to Know About Reliable Spark Streaming with Typesafe and Databricks
Four Things to Know About Reliable Spark Streaming with Typesafe and Databricks
 
A Deeper Look Into Reactive Streams with Akka Streams 1.0 and Slick 3.0
A Deeper Look Into Reactive Streams with Akka Streams 1.0 and Slick 3.0A Deeper Look Into Reactive Streams with Akka Streams 1.0 and Slick 3.0
A Deeper Look Into Reactive Streams with Akka Streams 1.0 and Slick 3.0
 
Modernizing Your Aging Architecture: What Enterprise Architects Need To Know ...
Modernizing Your Aging Architecture: What Enterprise Architects Need To Know ...Modernizing Your Aging Architecture: What Enterprise Architects Need To Know ...
Modernizing Your Aging Architecture: What Enterprise Architects Need To Know ...
 
Reactive Streams 1.0.0 and Why You Should Care (webinar)
Reactive Streams 1.0.0 and Why You Should Care (webinar)Reactive Streams 1.0.0 and Why You Should Care (webinar)
Reactive Streams 1.0.0 and Why You Should Care (webinar)
 
Going Reactive in Java with Typesafe Reactive Platform
Going Reactive in Java with Typesafe Reactive PlatformGoing Reactive in Java with Typesafe Reactive Platform
Going Reactive in Java with Typesafe Reactive Platform
 
Why Play Framework is fast
Why Play Framework is fastWhy Play Framework is fast
Why Play Framework is fast
 

Recently uploaded

Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfDrew Moseley
 
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfExploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfkalichargn70th171
 
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingOpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingShane Coughlan
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfMarharyta Nedzelska
 
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingOpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingShane Coughlan
 
Ronisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited CatalogueRonisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited Catalogueitservices996
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf31events.com
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsSafe Software
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecturerahul_net
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZABSYZ Inc
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLionel Briand
 
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...Bert Jan Schrijver
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalLionel Briand
 
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesAmazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesKrzysztofKkol1
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringHironori Washizaki
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Developmentvyaparkranti
 
VictoriaMetrics Anomaly Detection Updates: Q1 2024
VictoriaMetrics Anomaly Detection Updates: Q1 2024VictoriaMetrics Anomaly Detection Updates: Q1 2024
VictoriaMetrics Anomaly Detection Updates: Q1 2024VictoriaMetrics
 
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...OnePlan Solutions
 

Recently uploaded (20)

Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdf
 
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfExploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
 
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingOpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingOpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
 
Ronisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited CatalogueRonisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited Catalogue
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data Streams
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecture
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
 
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
 
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesAmazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their Engineering
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Development
 
VictoriaMetrics Anomaly Detection Updates: Q1 2024
VictoriaMetrics Anomaly Detection Updates: Q1 2024VictoriaMetrics Anomaly Detection Updates: Q1 2024
VictoriaMetrics Anomaly Detection Updates: Q1 2024
 
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
 

[Sneak Preview] Apache Spark: Preparing for the next wave of Reactive Big Data

  • 1. APACHE SPARK PREPARING FOR THE NEXT WAVE OF REACTIVE BIG DATA
  • 2. 74% Developers 8% Data Scientists 7% C-level execs TOP 3 LANGUAGES USED WITH SPARK 88% Scala 44% Java 22% Python 31% are evaluating Spark now are running Spark in production 13% 82% of users chose Spark to replace MapReduce 78% of users need faster processing of larger data sets 62% of users load data into Spark with Hadoop DFS 54% of users run Spark standalone 67% of users need Spark for event stream processing 20% are planning to use Spark in 2015 TOP 3 INDUSTRIES RESPONDENTS Telecoms, Banks, Retail APACHE SPARK SURVEY 2015 - QUICK SNAPSHOT
  • 3. 3 JOB TYPE/ROLE 7.5%Data Scientist 6.5%C-Level Executive 3.5%Software Architect 3.5%Dev Ops 1% Business Analyst 74%Developer 6.5%Other INDUSTRY FOCUS 33%Other 5%Consulting 4%Healthcare / Insurance 9%Advertising 10% Software / Technology 11%Retail 12%Banking / Finance 16% Telecommunications / Networks Including Biotechnology/Chemistry, Machinery, Education, Government and Utilities and other sectors
  • 4. 4 INFRASTRUCTURE TECHNOLOGIES IN USE 53% Amazon EC2 34% Docker 22% Cloudera CDH 16% Ansible 14% Mesos 13% OpenStack 12% Apache.org Builds of Hadoop 10% HortonWorks HDP 10% Heroku 8% Google Compute Engine 7% Core OS 7% MapR Hadoop Distribution 6% Microsoft Azure 5% Marathon 4% Kubernetes 2% Aurora 11% Other XaaS
  • 5. 5 Evaluating Spark now Currently using in production Evaluated, not planning to use Evaluated, will use in 2016 or later Um, what’s Spark? Planning to use in 2015 31% 28% 20% 13% 6% 2% CURRENT RELATIONSHIP WITH SPARK
  • 6. 6 Fast Batch Processing of Large Data Sets 78% Support for Event Stream Processing 60% Fast Data Queries in Real Time 56% Improved Programmer Productivity 55% BUSINESS GOALS IN MIND
  • 7. 7 SPARK FEATURES/MODULES IN DEMAND 25% 59% 65% 82% 51% Core API as a Replacement for MapReduce Streaming Library (Spark Streaming) Machine Learning Library (MLlib) Integrated SQL (SparkSQL) Graph Algorithms Library (GraphX)
  • 8. 8 DATA PROCESSING WITH SPARK 39% 41% 46% 46% 59% 61% Read or Write Data to One or More Databases Static Reports SQL Queries and Business Intelligence Write Data to Hadoop Distributed File System (HDFS) Ad-hoc Queries and Reporting ETL Data from External Sources 67% Event Stream Processing 71% 65% 40% Use Spark as Part of a Larger Data Pipeline Extract Information from Data Sooner Rather than Later Automate Decision Making at Runtime
  • 9. 9 2nd Java 44% 1st Scala 88% 3rd Python 22% WHICH LANGUAGES ARE IMPORTANT TO YOUR SPARK INSTALLATION? Honorable mentions: R, Clojure, Groovy, Ruby & Go
  • 10. 10 HOW DO YOU LOAD DATA INTO SPARK? 62% Hadoop Distributed File System (HDFS) 18% Other Services (e.g. over socket connection) 41% Apache Kafka 46% Databases 29% Amazon S3 12% Other* *Including: Apache Cassandra, Amazon Kinesis and Apache HBase
  • 11. 11 Typesafe (Twitter: @Typesafe) is dedicated to helping developers build Reactive applications on the JVM. Backed by Greylock Partners, Shasta Ventures, Bain Capital Ventures and Juniper Networks, Typesafe is headquartered in San Francisco with offices in Switzerland and Sweden. To start building Reactive applications today, download Typesafe Activator. © 2015 Typesafe Hello, Apache Spark! Typesafe Activator template for devs DOWNLOAD Get the FULL report (PDF) DOWNLOAD