Spark Summit 2017 - A feedback for TASM

After attending Spark Summit in San Francisco in early June, I wanted to share key takeaways from the conference with my local friends at the Triangle Apache Spark Meetup (TASM).

Published in: Software

  1. Spark Summit 2017 TASM Feedback
     Jean Georges Perrin / jgperrin@zaloni.com, 2017-06-22
     Zaloni Confidential and Proprietary - Provided under NDA
  2. Zaloni is hiring! Check out https://www.zaloni.com/about/careers/
     Forbes: Best Big Data Companies And CEOs To Work For In 2017
  3. Logistics
     • June 5-7, 2017
     • San Francisco's Moscone Center
     • Just under 3,000 attendees
     • 11 tracks: Data Science, Data Science 2, Developer, Enterprise, Machine Learning, Research, Spark Ecosystem, Use Cases, Sponsored Sessions, Streaming, Technical Deep Dives
     • About 30 exhibitors
     • About 50 sponsors
     • At least four French speakers
     • One Zaloni speaker
  5. Significant Growth in the Community
  6. Announcements
     • Spark 2.2 is coming:
       ▪ Cost-based optimizer (IBM contribution).
       ▪ Structured Streaming.
       ▪ Easier Python experience (pip support).
     • New Databricks open source contributions:
       ▪ Deep learning.
       ▪ Streaming performance.
  7. Deep & Machine Learning
  8. Deep Learning - Making it Easier
     • Initiative from Databricks
       ▪ https://databricks.com/blog/2017/06/06/databricks-vision-simplify-large-scale-deep-learning.html
       ▪ https://github.com/databricks/spark-deep-learning
     • Easier integration of TensorFlow and other frameworks
     • Partnership with Stanford U
  11. Yes, it is! Christopher Ré, Stanford U
  14. Streaming
  15. Clearly after Kafka Streams
  18. Some Sessions
  20. Comcast
     • Building a Mica-like tool internally.
     • Looking at open-sourcing it.
     • Video: https://www.youtube.com/watch?v=-hDIkTUPhZY&feature=youtu.be
     • Slides: https://www.slideshare.net/databricks/using-sparkml-to-power-a-dsaas-data-science-as-a-service-with-kiran-muglurmath-and-sridhar-alla
  21. Suning Too...
  24. Suning's Extensions to Spark ML
  25. Suning - ML as a Service
     • Giving ML capabilities to business users, mainly in fraud detection.
     • Slides: https://www.slideshare.net/databricks/machine-learning-as-a-service-apache-spark-mllib-enrichment-and-webbased-codeless-modeling-with-zhengyi-le
     • Video: https://www.youtube.com/watch?v=R4VEHoCvHy4&feature=youtu.be
  26. A Religious War About to Start?
  27. The Cloud is Too (Damn) Hard!
  28. More and More NLP and Spark
  29. More Keynotes
  30. Serverless is the Future of Cloud
  33. Serverless
     • Dynamic allocation of resources.
     • More flexibility for the customers.
     • Lower TCO.
     • Non-blocking jobs.
     • Faster.
     • Matching Amazon offers?
  34. Up to 12x Faster
  35. Intermission with Ben, Ion, and Matei
     • Ben Lorica (O’Reilly Media)
     • Ion Stoica (UC Berkeley AMP/RISELab & Databricks)
     • Matei Zaharia (Databricks)
  36. Various Takeaways
  37. DRY & DRO
  38. Smarter Notebooks
  39. Microsoft Fully Embracing the Apache Stack
  43. Finally Some Common Sense!
  44. 2.2 rocks!
     • Simply faster.
       ▪ Autoboxing kills performance!
       ▪ Scala sucks (yeah!)
       ▪ Better Catalyst, including cost-based optimizer (donated by IBM).
  45. GPU Analytics is a Trend
  46. Analytics on GPU? FPGA?
     • IBM mentioned it.
     • 4 sessions on the subject.
       ▪ 3 sessions on GPU
       ▪ 2 sessions on FPGA
     • Vendors: MapD, Intel, Nvidia.
  47. Vendors
  48. Classics
     • Databricks
     • Intel
     • IBM
     • Cloudera
     • Pepperdata
     • Cask
     • Mesosphere
     • Google Cloud
     • Amazon
     • MapR
     • NetApp
     • BlueTalon
     • Dataiku
     • Talend
     • MemSQL
     • Redis
     • Microsoft
     • Confluent
     • VMware
     • ...
     • Not Hortonworks
  49. Others
     • GridGain - in-memory DB
     • SnappyData - in-memory DB
     • Target - looking to hire people
     • Yelp! - looking to hire people
  50. Freebie
  52. And the best session of all time...
  54. The Key to ML is Prepping the Right Data
     • Video: https://www.youtube.com/watch?v=ka8xhQAoj-E&feature=youtu.be (go like it!)
     • Slides:
       ▪ On Databricks' channel: https://www.slideshare.net/databricks/the-key-to-machine-learning-is-prepping-the-right-data-with-jean-georges-perrin (go like it!)
       ▪ On my channel: https://www.slideshare.net/jgperrin/the-key-to-machine-learning-is-prepping-the-right-data (go like it!)
  55. Thank you
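
Slide 44 notes that autoboxing kills performance. As a rough illustration of what that means on the JVM, here is a minimal Java sketch. It is not code from the talk or from Spark itself, and the class and method names (AutoboxingSketch, sumPrimitive, sumBoxed) are hypothetical. Both methods compute the same sum; the boxed version has to unbox and re-box its accumulator on every iteration, which typically allocates a new Long object each time unless the JIT manages to optimize it away.

```java
public class AutoboxingSketch {

    // Sums 0..n-1 with a primitive accumulator: no wrapper objects involved.
    static long sumPrimitive(int n) {
        long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    // Same sum with a boxed accumulator: every += unboxes the Long,
    // adds, and boxes the result again.
    static Long sumBoxed(int n) {
        Long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 10_000_000; // arbitrary size chosen for illustration

        long t0 = System.nanoTime();
        long primitive = sumPrimitive(n);
        long t1 = System.nanoTime();
        long boxed = sumBoxed(n);
        long t2 = System.nanoTime();

        // Timings vary by machine and JVM; this is not a rigorous benchmark.
        System.out.printf("primitive: %d in %d ms%n", primitive, (t1 - t0) / 1_000_000);
        System.out.printf("boxed:     %d in %d ms%n", boxed, (t2 - t1) / 1_000_000);
    }
}
```

Running the main method prints both sums with their elapsed times. A single timed pass like this is only indicative; a harness such as JMH would be the better tool for real measurements.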
