
Big Data Day LA 2016 Keynote - Reynold Xin / Databricks


Big Data Day LA 2016 Keynote - Reynold Xin, Co-founder of Databricks

Published in: Technology

  1. Scaling Big Data, a Spark perspective. Reynold Xin (@rxin), 2016-07-09, Big Data LA
  2. Scaling Big Data: early adopters; data scientists, statisticians, physicists, R users, PyData, …; citizen data scientists; sophisticated engineering teams
  3. Spark Philosophy: (1) Unified engine: support end-to-end applications. (2) High-level APIs: easy to use, rich optimizations. (3) Integrate broadly: storage systems, libraries, etc. Diagram labels: SQL, Streaming, ML, Graph, …
  4. Apache Spark 2.0: Next major release, coming out in the next few weeks • Unstable preview release at • 2.0.0-rc2 available on dev@spark mailing list. Remains highly compatible with Apache Spark 1.X. 17k patches (2500 for 2.0) from 1200+ contributors
  5. New in 2.0: Structured API improvements (DataFrame, Dataset, SparkSession); Structured Streaming; MLlib model export; R bindings; SQL 2003; Performance improvements; Deep learning libraries (Baidu, Yahoo!, Berkeley, Databricks); GraphFrames; PyData integration; Reactive streams; C# bindings: Mobius; JS bindings: EclairJS. Broader Community
  6. Growing the Community: New initiatives from Databricks
  7. The largest challenge in applying big data is the skills gap. StackOverflow Developer Survey 2016
  8. Massive Open Online Courses: Free 5-course series on big data with Apache Spark: Introduction to Apache Spark™; Distributed Machine Learning with Apache Spark™; Big Data Analysis with Apache Spark™; Advanced Apache Spark™ for Data Science and Data Engineering; Advanced Machine Learning with Apache Spark™
  9. Databricks Community Edition: Free version of Databricks with: • Interactive tutorials • Apache Spark and popular data science libraries • Visualization & debug tools
  10. Demo. Link to demo:
  11. 2016 Apache Spark Survey
  12. Thank you.