[Sneak Preview] Apache Spark: Preparing for the next wave of Reactive Big Data

Back in the summer of 2014, we launched the results of a survey on Java 8. It gave us much of the information we were looking for, but it also contained a small golden nugget of data that we didn’t expect: out of more than 3000 developers surveyed, a shocking 17% reported using Apache Spark in production.

So we ran another survey, with 2100+ respondents, drilling down into what developers, data scientists, executives and organizations are looking forward to with Apache Spark. You can download the full version of the report for the whole story, but here is a sneak peek at the findings.

The full version is at: http://typesafe.com/blog/apache-spark-preparing-for-the-next-wave-of-reactive-big-data

  1. APACHE SPARK: PREPARING FOR THE NEXT WAVE OF REACTIVE BIG DATA
  2. APACHE SPARK SURVEY 2015 - QUICK SNAPSHOT. Respondents: 74% Developers; 8% Data Scientists; 7% C-level execs. Top 3 languages used with Spark: 88% Scala; 44% Java; 22% Python. 31% are evaluating Spark now; 13% are running Spark in production; 82% of users chose Spark to replace MapReduce; 78% of users need faster processing of larger data sets; 62% of users load data into Spark with Hadoop DFS; 54% of users run Spark standalone; 67% of users need Spark for event stream processing; 20% are planning to use Spark in 2015. Top 3 industries: Telecoms, Banks, Retail.
  3. JOB TYPE/ROLE: 7.5% Data Scientist; 6.5% C-Level Executive; 3.5% Software Architect; 3.5% Dev Ops; 1% Business Analyst; 74% Developer; 6.5% Other. INDUSTRY FOCUS: 33% Other (including Biotechnology/Chemistry, Machinery, Education, Government and Utilities and other sectors); 5% Consulting; 4% Healthcare / Insurance; 9% Advertising; 10% Software / Technology; 11% Retail; 12% Banking / Finance; 16% Telecommunications / Networks.
  4. INFRASTRUCTURE TECHNOLOGIES IN USE: 53% Amazon EC2; 34% Docker; 22% Cloudera CDH; 16% Ansible; 14% Mesos; 13% OpenStack; 12% Apache.org Builds of Hadoop; 10% Hortonworks HDP; 10% Heroku; 8% Google Compute Engine; 7% CoreOS; 7% MapR Hadoop Distribution; 6% Microsoft Azure; 5% Marathon; 4% Kubernetes; 2% Aurora; 11% Other XaaS.
  5. CURRENT RELATIONSHIP WITH SPARK: 31% Evaluating Spark now; 13% Currently using in production; 20% Planning to use in 2015. The remaining 28%, 6% and 2% are spread across: Evaluated, not planning to use; Evaluated, will use in 2016 or later; Um, what’s Spark?
  6. BUSINESS GOALS IN MIND: 78% Fast Batch Processing of Large Data Sets; 60% Support for Event Stream Processing; 56% Fast Data Queries in Real Time; 55% Improved Programmer Productivity.
  7. SPARK FEATURES/MODULES IN DEMAND: 82% Core API as a Replacement for MapReduce. The remaining 25%, 59%, 65% and 51% are spread across: Streaming Library (Spark Streaming); Machine Learning Library (MLlib); Integrated SQL (SparkSQL); Graph Algorithms Library (GraphX).
  8. DATA PROCESSING WITH SPARK. Values: 39%, 41%, 46%, 46%, 59%, 61%. Categories: Read or Write Data to One or More Databases; Static Reports; SQL Queries and Business Intelligence; Write Data to Hadoop Distributed File System (HDFS); Ad-hoc Queries and Reporting; ETL Data from External Sources. 67% Event Stream Processing. Values: 71%, 65%, 40%. Categories: Use Spark as Part of a Larger Data Pipeline; Extract Information from Data Sooner Rather than Later; Automate Decision Making at Runtime.
  9. WHICH LANGUAGES ARE IMPORTANT TO YOUR SPARK INSTALLATION? 1st: Scala 88%; 2nd: Java 44%; 3rd: Python 22%. Honorable mentions: R, Clojure, Groovy, Ruby & Go.
  10. HOW DO YOU LOAD DATA INTO SPARK? 62% Hadoop Distributed File System (HDFS); 18% Other Services (e.g. over socket connection); 41% Apache Kafka; 46% Databases; 29% Amazon S3; 12% Other (including Apache Cassandra, Amazon Kinesis and Apache HBase).
  11. Typesafe (Twitter: @Typesafe) is dedicated to helping developers build Reactive applications on the JVM. Backed by Greylock Partners, Shasta Ventures, Bain Capital Ventures and Juniper Networks, Typesafe is headquartered in San Francisco with offices in Switzerland and Sweden. To start building Reactive applications today, download Typesafe Activator. © 2015 Typesafe. Available downloads: "Hello, Apache Spark!" Typesafe Activator template for devs; the FULL report (PDF).