Databricks: Exploring all the ways to analyze data with Spark – Couchbase Connect 2016

Spark is rapidly becoming a go-to solution for enterprises that want to apply more advanced analytics to their data. But did you know that it’s easier than ever to use Couchbase data with Spark? Databricks, founded by the creators of Spark, will present how they see Spark evolving to address new use cases, and how simple it can be to start using Spark with Couchbase data right away.

Published in: Software

  1. Distributed Analytics with Apache Spark and Couchbase. Jason Pohl (Databricks), Michael Nitschinger (Couchbase)
  2. Who is Databricks? WHY US • Creators of Apache Spark. Contribute 75% of the code, 10x more than others • Trained 20K Spark users • Largest number of customers deploying Spark (300+) OUR PRODUCT • Just-in-Time Data Platform • Empower your organization to swiftly build and deploy advanced analytics
  3. Open source data processing engine built around speed, ease of use, and sophisticated analytics • Largest open source data project, with 1000+ contributors
  4. UNIFIED ENGINE ACROSS DIVERSE WORKLOADS & ENVIRONMENTS • Scale out, fault tolerant • Python, Java, Scala, and R APIs • Standard libraries • APACHE SPARK ENGINE
  5. ANALOGY: EVOLUTION OF CONSUMER ELECTRONICS • First Cellular Phones • Specialized Devices • Unified Device
  6. HISTORY REPEATS: FASTER, EASIER TO USE, UNIFIED • First Distributed Processing Engine • Specialized Data Processing Engines • Unified Data Processing Engine
  7. Google Trends: Hadoop vs. Spark
  8. MAJOR FEATURES IN SPARK 2.0 • Performance: Tungsten Phase 2 speedups of 5-20x • Structured Streaming Engine • SQL 2003 & Machine Learning
  9. Couchbase + Apache Spark: Storage, Processing • Recommendations • Next gen data warehousing • Predictive analytics • Fraud detection • Catalog • Customer 360 + IOT • Personalization • Mobile applications
  10. Couchbase + Apache Spark: Operations, Analysis • Recommendations • Next gen data warehousing • Predictive analytics • Fraud detection • Catalog • Customer 360 + IOT • Personalization • Mobile applications
  11. COUCHBASE SPARK CONNECTOR 2.0 • Spark 2.0 Support: Structured Streaming • Efficiency: improved DCP handling, memory allocation creates less garbage • Easier Management: tolerates Couchbase cluster topology changes (e.g. add nodes & rebalance) … except rollbacks
  12. Demo
  13. HADOOP / DATA LAKES • DATABRICKS JUST-IN-TIME DATA PLATFORM
  14. Build a PoC on Databricks today. Professional services and training also available. Contact sales@databricks.com or sign up for a trial at https://databricks.com/try-databricks
