Adrian Popescu, Shivnath Babu
Using Spark to Tune Spark
#AI7SAIS
Meet the speakers
Adrian Popescu
• Data engineer at Unravel
• PhD from EPFL, Switzerland
• 8+ years of experience in performance monitoring & modeling of data management systems
• Focusing on tuning and optimization of Big Data apps
Shivnath Babu
• Cofounder and CTO at Unravel, Adjunct Professor at Duke University
• Focusing on ease-of-use and manageability of data-intensive systems
• Recipient of US National Science Foundation CAREER Award, three IBM Faculty Awards, HP Labs Innovation Research Award
#Exp8SAIS
Many apps are being built in Spark
But, let us face it: Running Spark apps in production is hard
DATA SCIENTIST: "My app often fails with Out of Memory…"
DATA ENGINEER: "My app is too slow…"
DATA PIPELINE OWNER: "My app is missing SLA…"
OPERATIONS TEAMS: "This rogue app is wasting resources and reducing cluster throughput"
Many factors affect app performance
To add to Spark's complexity
• Many types of Spark apps
– SQL
– Streaming
– AI/ML
– Graph
– Scala/Python/R
• Many app submission methods in Spark
– CLI
– Thrift Server
– Notebooks like Zeppelin, Jupyter, Hue
– ETL tools like Informatica, Pentaho, and Talend
– Schedulers like Airflow, Autosys, Control-M, Oozie, Tidal, TWS
• Many infrastructure choices for Spark
– On-premises multi-tenant clusters
– Transient cloud clusters
– Auto-scaling clusters
– Containerized deployments
Simple SQL and Programming Interface
Can we convert this problem into a data problem?
First: Bring all monitoring data to a single platform
One complete correlated view.
Data sources: History Server API, Resource Manager API, Container Metrics, Logs, Metadata, Data Statistics, SQL Query Plans, Configuration
Then: Apply intelligent algorithms to analyze the data automatically
One complete correlated view.
Data sources: History Server API, Resource Manager API, Container Metrics, Logs, Metadata, Data Statistics, SQL Query Plans, Configuration
Built-in intelligence.
Why not use Spark itself?
Data sources: History Server API, Resource Manager API, Container Metrics, Logs, Metadata, Data Statistics, SQL Query Plans, Configuration
What application & cluster management tasks can we automate with intelligent algorithms in Spark?
Let us take three (hard) tasks
• Failures in Spark
• SLA management for real-time data pipelines
• Application autotuning
Manual Root Cause Analysis of Spark Failures
Typical Failure in Spark
• Many levels of correlated stack traces
• Identifying the root cause is hard and time-consuming
Automated Root Cause Analysis of Spark Failures
• Reduce troubleshooting time from days to seconds
• Improve productivity of data scientists and analysts
Automatic Root Cause Analysis
[Diagram: Container Logs → Error Template Extraction → Feature vectors → Learning Algorithm for Predictive Model → Predictive Model, with Root causes as the labels]
We have created a Failure Taxonomy
Root Node
– Category of failure: Configuration Errors, Data Errors, Resource Errors, Deployment Errors
– Root cause labels: Input Path Not Available, Number Format Exception, SparkSQL JsonProcessing Exception, …
Two Ways to get Root-Cause Labels
• Manual diagnosis by a domain expert
• Automatic injection of the root cause
Unravel's Large-scale Lab Framework for Automatic Root Cause Analysis
Spark and multi-tenant Workloads:
- Variety of workloads: Batch, ML, SQL, Streaming, etc.
Failures:
- Large set of root causes learned from customers & partners. Constantly updated
- Continuously inject these root causes to train & test models for root-cause prediction
Environment:
- Lab created on demand on cloud or on-premises
- Workloads are run and failures are injected
Injecting Failures
[Diagram: Application Execution, Application Monitor, FAILED, Injected Failure Label, Labeled Failures, Input Feature Extraction]
Injected failure examples:
• Invalid input
• Invalid memory configuration
• OOME: Java heap space
• OOME: GC overhead limit
• Container killed by YARN
• Runtime incompatibility
• No space left on device
• Transformations inside other transformations
• Runtime error
• Arithmetic error
• Invalid configuration settings
Extracting Input Features from Logs
java.lang.OutOfMemoryError: Java heap space
    at scala.reflect.ManifestFactory$$anon$9.newArray(Manifest.scala:114)
    at scala.reflect.ManifestFactory$$anon$9.newArray(Manifest.scala:112)
    at …
• Extract stack traces and error messages
• Tokenize by class names and words
Tokens example:
java.lang.OutOfMemoryError Java heap space at scala.reflect.ManifestFactory$$anon$9.newArray(Manifest.scala:114)
• Create a vocabulary of words from all words collected
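The tokenization and vocabulary steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Unravel's implementation: the function names `tokenize` and `build_vocabulary` are hypothetical, and the whitespace-based splitting rule was chosen only because it reproduces the token example on the slide.

```python
def tokenize(log_text):
    """Split a stack trace into class-name and word tokens.

    Assumption: tokens are whitespace-separated, and trailing punctuation
    such as the colon after the exception class is dropped.
    """
    tokens = []
    for raw in log_text.split():
        token = raw.strip(":,;")
        if token:
            tokens.append(token)
    return tokens


def build_vocabulary(token_lists):
    """Assign each distinct token an integer id in order of first appearance."""
    vocabulary = {}
    for tokens in token_lists:
        for token in tokens:
            if token not in vocabulary:
                vocabulary[token] = len(vocabulary)
    return vocabulary


sample_log = """java.lang.OutOfMemoryError: Java heap space
    at scala.reflect.ManifestFactory$$anon$9.newArray(Manifest.scala:114)"""

sample_tokens = tokenize(sample_log)
sample_vocabulary = build_vocabulary([sample_tokens])
print(sample_tokens)
print(sample_vocabulary)
```

Keeping a whole stack frame as one token preserves the class and line number as a single feature; a real system might split frames further, which the slides do not specify.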
Input Feature extraction
• Bag of Words with TF-IDF
– Computes a vocabulary of words
– Uses TF-IDF to reflect importance of words in a document
• Doc2Vec
– Maps words, paragraphs, or documents to multi-dimensional vectors
– Evaluates the placement of words with respect to neighboring words
– Uses a 3-layer neural network
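To make the TF-IDF option concrete, here is a small pure-Python sketch that turns tokenized logs into feature vectors. It uses the textbook formula (term frequency times log of inverse document frequency); library implementations often smooth this formula, and the slides do not say which variant was used. The function name `tf_idf` and the sample documents are illustrative assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return (vocabulary, vectors) for a list of tokenized documents.

    Textbook variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(documents)
    document_frequency = Counter()
    for tokens in documents:
        document_frequency.update(set(tokens))
    vocabulary = sorted(document_frequency)

    vectors = []
    for tokens in documents:
        counts = Counter(tokens)
        length = len(tokens) or 1  # avoid division by zero for empty logs
        vectors.append([
            (counts[term] / length) * math.log(n_docs / document_frequency[term])
            for term in vocabulary
        ])
    return vocabulary, vectors


# Hypothetical tokenized error messages, one list per failed application.
logs = [
    ["java", "heap", "space"],
    ["gc", "overhead", "limit"],
    ["java", "gc"],
]
vocabulary, vectors = tf_idf(logs)
for row in vectors:
    print([round(weight, 3) for weight in row])
```

A token that appears in every log gets weight zero, which is the property that lets rare, failure-specific tokens dominate the feature vector.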
System Architecture
[Diagram, training path: Container Logs → Error Template Extraction → Feature vectors → Learning Algorithm for Predictive Model → Predictive Model, with Root causes as the labels]
[Diagram, prediction path: New failure → Container Logs → Error Template Extraction → Feature vector → Predictive Model → Root cause of the failure]
Predictive Models
• Shallow Learning
– Logistic Regression
– Random forests
• Deep Learning
– Neural networks
Very easy to implement these in Spark
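As an illustration of the simplest model listed above, the following is a from-scratch binary logistic regression in plain Python, so it runs without a cluster. It is a stand-in sketch, not the talk's code: in Spark one would normally rely on the built-in ML library instead. The two-dimensional feature vectors and the mapping of labels to root causes are made-up assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(features, labels, learning_rate=0.5, epochs=500):
    """Fit weights and bias with batch gradient descent on the log-loss."""
    n_samples = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for j, value in enumerate(x):
                grad_w[j] += error * value
            grad_b += error
        weights = [w - learning_rate * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if the predicted probability is at least 0.5, else 0."""
    score = sum(w * v for w, v in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0


# Hypothetical feature vectors; label 1 = "OOME: Java heap space",
# label 0 = "Invalid input" (two of the injected failure types).
train_x = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
train_y = [1, 1, 0, 0]
weights, bias = train_logistic_regression(train_x, train_y)
print(predict(weights, bias, [0.8, 0.2]))
```

A real root-cause classifier is multi-class (one class per taxonomy label); the binary case is shown only because it keeps the gradient step readable.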
Predicting the Root Cause of Failures
• Training and testing with injected failures
• Test to train data set ratio 75% to 25%
• Models: logistic regression, random forests
[Bar chart: Accuracy Score [%], y-axis from 80 to 100, for TF-IDF and Doc2Vec features; series: Logistic Regression and Random Forests]
Work with deep learning is in progress
See our talk at Strata, NY 2017 for more details
Let us take three (hard) tasks
• Failures in Spark
• SLA management for real-time data pipelines
• Application autotuning
Many problems can cause unreliable performance in streaming pipelines
[Diagram: sources (IoT Sensors, Database, Other Data), STREAM STORE, REAL-TIME PROCESSOR, and outputs (Dashboard, Result Store, Other Output), with Kafka, HBase, and Spark as example systems]
Poor partitioning + Inefficient Configuration + Resource contention = Untimely results
Forecasting & Anomaly Detection are important for streaming pipelines
Predicting when SLAs are in danger of being missed
[Chart annotations: "Latency SLA is 3 minutes"; "Current time is here"; "Latency SLA can be missed by this time"]
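One simple way to estimate when an SLA is in danger is to fit a trend to recent latency measurements and extrapolate to the SLA line. The sketch below uses an ordinary least-squares line in plain Python. The slides do not name the forecasting method, so the linear model, the function names, and the sample latencies are all assumptions; only the 3-minute SLA comes from the slide.

```python
def fit_line(times, values):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    covariance = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    variance = sum((t - mean_t) ** 2 for t in times)
    slope = covariance / variance
    return slope, mean_v - slope * mean_t


def predicted_sla_breach_time(times, latencies, sla):
    """Time at which the fitted trend reaches the SLA, or None if not rising."""
    slope, intercept = fit_line(times, latencies)
    if slope <= 0:
        return None  # latency is flat or improving, no breach predicted
    return (sla - intercept) / slope


# Hypothetical measurements: time in minutes, latency in seconds.
minutes = [0, 1, 2, 3]
latency_seconds = [60, 80, 100, 120]
SLA_SECONDS = 180  # the 3-minute latency SLA from the slide

print(predicted_sla_breach_time(minutes, latency_seconds, SLA_SECONDS))
```

If the returned time is earlier than the latest measurement, the trend line has already crossed the SLA. Real latency series are rarely linear, so this only illustrates the idea of acting before the breach rather than after it.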
Detecting anomalies is tricky
Is this an unexpected lag worth alerting/acting on?
What is an anomaly?
1. An unexpected change that needs your attention
2. Smart alerts/actions:
   1. False positives should be minimal
   2. False negatives should be minimal
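The trade-off between false positives and false negatives can be seen in even the simplest detector. The sketch below flags a point when it deviates from the mean of the preceding window by more than a chosen number of standard deviations. This rolling z-score rule is a common baseline and an assumption here, not the method from the talk; the lag values are invented.

```python
import statistics


def detect_anomalies(series, window=5, threshold=3.0):
    """Return indices whose value is far from the preceding window's mean.

    A lower threshold raises more alerts (more false positives); a higher
    one stays quieter (more false negatives).
    """
    anomalies = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mean = statistics.mean(history)
        std = statistics.pstdev(history)
        if std == 0:
            # Perfectly flat history: treat any change at all as anomalous.
            if series[i] != mean:
                anomalies.append(i)
            continue
        if abs(series[i] - mean) / std > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical consumer lag per interval, with one spike.
lag = [10, 11, 10, 12, 11, 10, 50, 11]
print(detect_anomalies(lag))
```

Note that the spike inflates the window's standard deviation for the following points, so nearby anomalies can be masked; this is one reason such detection is tricky in practice.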
Time Series Analysis in Spark
See our talk at Strata, San Jose 2018 for more insights
Let us take three (hard) tasks
• Failures in Spark
• SLA management for real-time data pipelines
• Application autotuning
Today, tuning is often by trial-and-error
spark.driver.cores                      2
spark.executor.cores                    10
…
spark.sql.shuffle.partitions            300
spark.sql.autoBroadcastJoinThreshold    20MB
…
SKEW('orders', 'o_custId')              true
spark.catalog.cacheTable("orders")      true
…
PERFORMANCE
Imagine a new world
INPUTS
1. App = Spark Query
2. Goal = Speedup
"I need to make this app faster"
[Chart: APP DURATION over TIME]
• In the blink of an eye, user gets recommendations to make the app 30% faster
• As user finishes checking email, she has a verified run that is 60% faster
• User comes back from lunch. A verified run that is 90% faster
90% faster!
Response Surface Methodology
Reinforcement Learning
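The core idea of response surface methodology can be shown in one dimension: measure the app at a few settings of one parameter, fit a smooth surface (here a parabola) to the measurements, and try the setting where the fitted surface is lowest. The sketch below is a deliberately tiny illustration, not the tuner described in the talk. The runtimes are invented, the function names are hypothetical, and `spark.sql.shuffle.partitions` is used only because it appears on the earlier configuration slide.

```python
def fit_parabola(p1, p2, p3):
    """Coefficients (a, b, c) of y = a*x^2 + b*x + c through three points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    denom = (x1 - x2) * (x1 - x3) * (x2 - x3)
    a = (x3 * (y2 - y1) + x2 * (y1 - y3) + x1 * (y3 - y2)) / denom
    b = (x3 ** 2 * (y1 - y2) + x2 ** 2 * (y3 - y1) + x1 ** 2 * (y2 - y3)) / denom
    c = (x2 * x3 * (x2 - x3) * y1
         + x3 * x1 * (x3 - x1) * y2
         + x1 * x2 * (x1 - x2) * y3) / denom
    return a, b, c


def suggest_setting(samples):
    """Return (setting, predicted_runtime) at the parabola's minimum.

    Returns None when the fitted curve has no minimum (a <= 0), in which
    case more measurements are needed.
    """
    a, b, c = fit_parabola(*samples)
    if a <= 0:
        return None
    best = -b / (2 * a)
    return best, a * best ** 2 + b * best + c


# Hypothetical (spark.sql.shuffle.partitions, runtime in seconds) measurements.
measurements = [(100, 90.0), (200, 60.0), (400, 60.0)]
print(suggest_setting(measurements))
```

The suggested setting is only a prediction; it still has to be verified with an actual run, which matches the "verified run" steps on the preceding slide. Real tuning spans many interacting parameters, so the surface is multi-dimensional.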
See our talk on Unravel Sessions at Spark+AI Summit 2018 on this problem
Frameworks like BigDL, Deeplearning4j, and Keras are simplifying the use of deep reinforcement learning in Spark
In Summary
• Spark Performance Management can be addressed using Spark itself
ALGORITHMS
TASKS
YOU MUST
Sign up for a free trial. We value your feedback!
https://unraveldata.com/free-trial
Join the Unravel team!
adrian & shivnath [@unraveldata.com]

Using Apache Spark to Tune Spark with Shivnath Babu and Adrian Popescu