[Sneak Preview] Apache Spark: Preparing for the next wave of Reactive Big Data

Back in the summer of 2014, we published the results of a survey on Java 8. It gave us much of the information we were looking for, but it also contained a small golden nugget of data that we didn't expect: out of more than 3000 developers surveyed, a shocking 17% reported using Apache Spark in production.

So we ran another survey, with 2100+ respondents, drilling down into what developers, data scientists, executives and organizations are looking forward to with Apache Spark. You can download the full version of the report for the whole story, but here is a sneak peek at the findings.

The full version is at: http://typesafe.com/blog/apache-spark-preparing-for-the-next-wave-of-reactive-big-data

Published in: Software

[Sneak Preview] Apache Spark: Preparing for the next wave of Reactive Big Data

  1. APACHE SPARK: PREPARING FOR THE NEXT WAVE OF REACTIVE BIG DATA
  2. APACHE SPARK SURVEY 2015 - QUICK SNAPSHOT
     • Respondents: 74% developers, 8% data scientists, 7% C-level execs
     • Top 3 languages used with Spark: 88% Scala, 44% Java, 22% Python
     • Top 3 industries: Telecoms, Banks, Retail
     • 13% are running Spark in production
     • 31% are evaluating Spark now
     • 20% are planning to use Spark in 2015
     • 82% of users chose Spark to replace MapReduce
     • 78% of users need faster processing of larger data sets
     • 67% of users need Spark for event stream processing
     • 62% of users load data into Spark with Hadoop DFS
     • 54% of users run Spark standalone
  3. JOB TYPE/ROLE
     • 74% Developer
     • 7.5% Data Scientist
     • 6.5% C-Level Executive
     • 3.5% Software Architect
     • 3.5% Dev Ops
     • 1% Business Analyst
     • 6.5% Other
     INDUSTRY FOCUS
     • 16% Telecommunications / Networks
     • 12% Banking / Finance
     • 11% Retail
     • 10% Software / Technology
     • 9% Advertising
     • 5% Consulting
     • 4% Healthcare / Insurance
     • 33% Other (including Biotechnology/Chemistry, Machinery, Education, Government and Utilities and other sectors)
  4. INFRASTRUCTURE TECHNOLOGIES IN USE
     • 53% Amazon EC2
     • 34% Docker
     • 22% Cloudera CDH
     • 16% Ansible
     • 14% Mesos
     • 13% OpenStack
     • 12% Apache.org Builds of Hadoop
     • 10% HortonWorks HDP
     • 10% Heroku
     • 8% Google Compute Engine
     • 7% Core OS
     • 7% MapR Hadoop Distribution
     • 6% Microsoft Azure
     • 5% Marathon
     • 4% Kubernetes
     • 2% Aurora
     • 11% Other XaaS
  5. CURRENT RELATIONSHIP WITH SPARK
     • Response options: Evaluating Spark now; Currently using in production; Evaluated, not planning to use; Evaluated, will use in 2016 or later; Um, what's Spark?; Planning to use in 2015
     • Reported shares (chart order not preserved in the transcript): 31%, 28%, 20%, 13%, 6%, 2%
  6. BUSINESS GOALS IN MIND
     • 78% Fast Batch Processing of Large Data Sets
     • 60% Support for Event Stream Processing
     • 56% Fast Data Queries in Real Time
     • 55% Improved Programmer Productivity
  7. SPARK FEATURES/MODULES IN DEMAND
     • Modules: Core API as a Replacement for MapReduce; Streaming Library (Spark Streaming); Machine Learning Library (MLlib); Integrated SQL (SparkSQL); Graph Algorithms Library (GraphX)
     • Reported shares (chart order not preserved in the transcript): 25%, 59%, 65%, 82%, 51%
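  8. DATA PROCESSING WITH SPARK
     • 67% Event Stream Processing
     • First chart, categories: Read or Write Data to One or More Databases; Static Reports; SQL Queries and Business Intelligence; Write Data to Hadoop Distributed File System (HDFS); Ad-hoc Queries and Reporting; ETL Data from External Sources
     • First chart, reported shares (chart order not preserved in the transcript): 39%, 41%, 46%, 46%, 59%, 61%
     • Second chart, categories: Use Spark as Part of a Larger Data Pipeline; Extract Information from Data Sooner Rather than Later; Automate Decision Making at Runtime
     • Second chart, reported shares (chart order not preserved in the transcript): 71%, 65%, 40%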
  9. WHICH LANGUAGES ARE IMPORTANT TO YOUR SPARK INSTALLATION?
     • 1st: Scala, 88%
     • 2nd: Java, 44%
     • 3rd: Python, 22%
     • Honorable mentions: R, Clojure, Groovy, Ruby & Go
  10. HOW DO YOU LOAD DATA INTO SPARK?
     • 62% Hadoop Distributed File System (HDFS)
     • 46% Databases
     • 41% Apache Kafka
     • 29% Amazon S3
     • 18% Other Services (e.g. over socket connection)
     • 12% Other (including Apache Cassandra, Amazon Kinesis and Apache HBase)
  11. Typesafe (Twitter: @Typesafe) is dedicated to helping developers build Reactive applications on the JVM. Backed by Greylock Partners, Shasta Ventures, Bain Capital Ventures and Juniper Networks, Typesafe is headquartered in San Francisco with offices in Switzerland and Sweden. To start building Reactive applications today, download Typesafe Activator. © 2015 Typesafe
     • Hello, Apache Spark! Typesafe Activator template for devs
     • Get the full report (PDF)