Powerful Big Data Analytics as a Service, with Apache Spark and Azure Databricks
1. Powerful Big Data Analytics as a Service, with Apache Spark and Azure Databricks
Sorin Pește
Technology Solutions Professional
Microsoft
source: xkcd.com
4. APACHE SPARK
A unified, open source, distributed engine for large-scale data processing
Spark Core Engine
Spark SQL: interactive queries
Spark Structured Streaming and Spark Streaming: stream processing
Spark MLlib: machine learning
GraphX: graph computation
Schedulers: YARN, Mesos, Standalone
6. SPARK IN THE ENTERPRISE
still comes with challenges…
8. DATABRICKS - COMPANY OVERVIEW
9. DATABRICKS: THE UNIFIED ANALYTICS PLATFORM
10. DATABRICKS SPARK IS FAST
Benchmarks have shown that Databricks often outperforms alternatives
SOURCE: Benchmarking Big Data SQL Platforms in the Cloud
15. BIG DATA IN AZURE
Axis: CONTROL to EASE OF USE (reduced administration)
BIG DATA ANALYTICS
Service models: Virtual Machines, Managed Clusters, Big Data as-a-service
Azure Marketplace (HDP | CDH | MapR): any Hadoop technology, any distribution
Azure HDInsight: workload optimized, managed clusters
Azure Databricks: frictionless & optimized Spark clusters
Azure Data Lake Analytics: data engineering in a Job-as-a-service model
BIG DATA STORAGE
Azure Data Lake Store
Azure Storage