Spark Summit 2015 keynote: Making Big Data Simple with Spark

Spark Summit 2015 keynote in San Francisco. Presented by Databricks CEO Ion Stoica and VP of Engineering Ali Ghodsi.

Published in: Software

  1. Making Big Data Simple with Spark. Ion Stoica and Ali Ghodsi, June 15, 2015
  2. Alleviating Data Scientist Scarcity Challenge: More than 5,000 people trained over the past year. “Intro to Big Data with Apache Spark” • Anthony Joseph, UC Berkeley • Started June 1st. “Scalable Machine Learning” • Ameet Talwalkar, UCLA • To start July 5th
  3. Alleviating Data Scientist Scarcity Challenge: More than 5,000 people trained over the past year. “Intro to Big Data with Apache Spark” • Anthony Joseph, UC Berkeley • Started June 1st, over 64K registered students. “Scalable Machine Learning” • Ameet Talwalkar, UCLA • To start July 5th, over 26K registered students
  4. Spark Significantly Simplifies Big Data Processing: Fast • Expressive • General. Spark Core (Python, Java, Scala, R); Spark Streaming (real-time); Spark SQL (interactive); MLlib (machine learning); GraphX (graph); …
  5. But Big Data Processing Remains Complex... Still need to set up and manage your own Spark cluster. Still more complex to operate than existing single-node tools (R, Python).
  6. Databricks Truly Makes Big Data Simple: A hosted end-to-end platform from ingest to production. Cluster Manager, Jobs, Notebooks, Third-Party Apps, Dashboards
  7. Databricks: The Journey Thus Far. June 2014: Unveiling • Over 3,500 sign-ups. November 2014: Limited Availability. Today • Over 150 organizations using Databricks
  8. What can Databricks and Spark do for organizations? Better products: Update customers’ databases weekly instead of monthly. Faster time to market: Create new products in 3 weeks rather than 2 months. Democratize data access within enterprises: Increase number of data analysts by 4x and number of data projects by 6x.
  9. General Availability starting today! www.databricks.com
  10. Key Areas of Focus: (1) Ease of use: Increase user productivity. (2) Integration with existing (small and big) data tools: Make non-Spark experts instantly productive. (3) Security: Enable mission-critical applications.
  11. Ease of Use: Cluster manager with multiple Spark versions. From notebooks to dashboards and jobs with just a few clicks. Launch and monitor jobs, including streaming. Notebooks, Dashboards, Jobs
  12. Integration: Best-of-breed apps, Versioning, R Notebooks, … +
  13. Security: Run in your own Amazon account, Access Control Lists, Encryption at rest
  14. Demo
