SlideShare a Scribd company logo
1 of 13
for analytics
Nico Liberato Candio
WHY DASK
DASK IN ACTION
RESULTS TABLE OF
CONTENTS
01
02
03
04 REFERENCES
03
WHY DASK
01
Parallel computing
library that scales
the existing Python
ecosystem : NumPy,
Pandas , Scikit
PARALLELIZATION FLEXIBLE RESPONSIVE
Task scheduling
interface for more
custom workloads
and integration.
Single machine or
distributed
workloads across
GCP clusters
Runs resiliently up
to 1000s cores.
Horizontal and
vertical scaling.
03
DASK IN ACTION
02
6
Example:
1. Dask in GKE cluster
2. Models deployment (Helm)
3. Faster training and
workload parallelization
4. Scale via Kubernetes or
Dask (threads/workers)
Demo: simple tree
summation computation
with Dask installed in our
cluster.
Example: parallelization in GKE.
03
CONCLUSIONS
03
WHY DASK IS
GOOD FOR US
OPEN SOURCE
Our setup is on GKE
with a single Dask
cluster deployed
with Helm. Open
Source solution
configurable for all
the ML projects.
SCALING
We scale the number
of workers (threads)
dividing the
workload in multiple
units. Horizontal or
vertical scaling via
dashboard.
SPEED
We speed up the
computation
achieving faster
times in training
models
Next?
Creation of Dask clusters
by computation context,
saving costs .
Storage and dataframes
parallelization refactoring
the model
implementation.
Questions?
nicoliberatoc@gmail.com
+353 089 44 207 64
THANKS!
CREDITS
This is where you give credit to the ones who are part of this
project.
Did you like the resources on this template? Get them for
free at our other websites.
◂ Presentation template by Slidesgo
◂ Icons by Flaticon
◂ Infographics by Freepik
◂ Images created by Freepik and rawpixel -Freepik
◂ Author introduction slide photo created by Freepik
◂ Text & Image slide photo created by Freepik.com
Dask for Analytics

More Related Content

What's hot

AWS Webcast - An Introduction to High Performance Computing on AWS
AWS Webcast - An Introduction to High Performance Computing on AWSAWS Webcast - An Introduction to High Performance Computing on AWS
AWS Webcast - An Introduction to High Performance Computing on AWSAmazon Web Services
 
Intro to High Performance Computing in the AWS Cloud
Intro to High Performance Computing in the AWS CloudIntro to High Performance Computing in the AWS Cloud
Intro to High Performance Computing in the AWS CloudAmazon Web Services
 
HybridAzureCloud
HybridAzureCloudHybridAzureCloud
HybridAzureCloudChris Condo
 
#lspe Q1 2013 dynamically scaling netflix in the cloud
#lspe Q1 2013   dynamically scaling netflix in the cloud#lspe Q1 2013   dynamically scaling netflix in the cloud
#lspe Q1 2013 dynamically scaling netflix in the cloudCoburn Watson
 
OpenStack Heat slides
OpenStack Heat slidesOpenStack Heat slides
OpenStack Heat slidesdbelova
 
Running Presto and Spark on the Netflix Big Data Platform
Running Presto and Spark on the Netflix Big Data PlatformRunning Presto and Spark on the Netflix Big Data Platform
Running Presto and Spark on the Netflix Big Data PlatformEva Tse
 
(CMP202) Engineering Simulation and Analysis in the Cloud
(CMP202) Engineering Simulation and Analysis in the Cloud(CMP202) Engineering Simulation and Analysis in the Cloud
(CMP202) Engineering Simulation and Analysis in the CloudAmazon Web Services
 
MEW22 22nd Machine Evaluation Workshop Microsoft
MEW22 22nd Machine Evaluation Workshop MicrosoftMEW22 22nd Machine Evaluation Workshop Microsoft
MEW22 22nd Machine Evaluation Workshop MicrosoftLee Stott
 
Microsoft Azure in HPC scenarios
Microsoft Azure in HPC scenariosMicrosoft Azure in HPC scenarios
Microsoft Azure in HPC scenariosmictc
 
Rancher and Kubernetes Best Practices
Rancher and  Kubernetes Best PracticesRancher and  Kubernetes Best Practices
Rancher and Kubernetes Best PracticesAvinash Patil
 
Dr. Ike Nassi, Founder, TidalScale at MLconf NYC - 4/15/16
Dr. Ike Nassi, Founder, TidalScale at MLconf NYC - 4/15/16Dr. Ike Nassi, Founder, TidalScale at MLconf NYC - 4/15/16
Dr. Ike Nassi, Founder, TidalScale at MLconf NYC - 4/15/16MLconf
 
Ford's AWS Service Update - March 2020 (Richmond AWS User Group)
Ford's AWS Service Update - March 2020 (Richmond AWS User Group)Ford's AWS Service Update - March 2020 (Richmond AWS User Group)
Ford's AWS Service Update - March 2020 (Richmond AWS User Group)Ford Prior
 
High Performance Computing (HPC) in cloud
High Performance Computing (HPC) in cloudHigh Performance Computing (HPC) in cloud
High Performance Computing (HPC) in cloudAccubits Technologies
 
Orchestration across multiple cloud platforms using Heat
Orchestration across multiple cloud platforms using HeatOrchestration across multiple cloud platforms using Heat
Orchestration across multiple cloud platforms using HeatCoreStack
 
OpenStack Orchestration (Heat)
OpenStack Orchestration (Heat)OpenStack Orchestration (Heat)
OpenStack Orchestration (Heat)Jimi Chen
 
Ejecución del Elastic Stack en Kubernetes
Ejecución del Elastic Stack en KubernetesEjecución del Elastic Stack en Kubernetes
Ejecución del Elastic Stack en KubernetesElasticsearch
 
Taking High Performance Computing to the Cloud: Windows HPC and
Taking High Performance Computing to the Cloud: Windows HPC and Taking High Performance Computing to the Cloud: Windows HPC and
Taking High Performance Computing to the Cloud: Windows HPC and Saptak Sen
 
Kubernetes: Reducing Infrastructure Cost & Complexity
Kubernetes: Reducing Infrastructure Cost & ComplexityKubernetes: Reducing Infrastructure Cost & Complexity
Kubernetes: Reducing Infrastructure Cost & ComplexityDevOps.com
 
AWS Customer Presentation - JovianDATA
AWS Customer Presentation - JovianDATAAWS Customer Presentation - JovianDATA
AWS Customer Presentation - JovianDATAAmazon Web Services
 
(BDT202) HPC Now Means 'High Personal Computing' | AWS re:Invent 2014
(BDT202) HPC Now Means 'High Personal Computing' | AWS re:Invent 2014(BDT202) HPC Now Means 'High Personal Computing' | AWS re:Invent 2014
(BDT202) HPC Now Means 'High Personal Computing' | AWS re:Invent 2014Amazon Web Services
 

What's hot (20)

AWS Webcast - An Introduction to High Performance Computing on AWS
AWS Webcast - An Introduction to High Performance Computing on AWSAWS Webcast - An Introduction to High Performance Computing on AWS
AWS Webcast - An Introduction to High Performance Computing on AWS
 
Intro to High Performance Computing in the AWS Cloud
Intro to High Performance Computing in the AWS CloudIntro to High Performance Computing in the AWS Cloud
Intro to High Performance Computing in the AWS Cloud
 
HybridAzureCloud
HybridAzureCloudHybridAzureCloud
HybridAzureCloud
 
#lspe Q1 2013 dynamically scaling netflix in the cloud
#lspe Q1 2013   dynamically scaling netflix in the cloud#lspe Q1 2013   dynamically scaling netflix in the cloud
#lspe Q1 2013 dynamically scaling netflix in the cloud
 
OpenStack Heat slides
OpenStack Heat slidesOpenStack Heat slides
OpenStack Heat slides
 
Running Presto and Spark on the Netflix Big Data Platform
Running Presto and Spark on the Netflix Big Data PlatformRunning Presto and Spark on the Netflix Big Data Platform
Running Presto and Spark on the Netflix Big Data Platform
 
(CMP202) Engineering Simulation and Analysis in the Cloud
(CMP202) Engineering Simulation and Analysis in the Cloud(CMP202) Engineering Simulation and Analysis in the Cloud
(CMP202) Engineering Simulation and Analysis in the Cloud
 
MEW22 22nd Machine Evaluation Workshop Microsoft
MEW22 22nd Machine Evaluation Workshop MicrosoftMEW22 22nd Machine Evaluation Workshop Microsoft
MEW22 22nd Machine Evaluation Workshop Microsoft
 
Microsoft Azure in HPC scenarios
Microsoft Azure in HPC scenariosMicrosoft Azure in HPC scenarios
Microsoft Azure in HPC scenarios
 
Rancher and Kubernetes Best Practices
Rancher and  Kubernetes Best PracticesRancher and  Kubernetes Best Practices
Rancher and Kubernetes Best Practices
 
Dr. Ike Nassi, Founder, TidalScale at MLconf NYC - 4/15/16
Dr. Ike Nassi, Founder, TidalScale at MLconf NYC - 4/15/16Dr. Ike Nassi, Founder, TidalScale at MLconf NYC - 4/15/16
Dr. Ike Nassi, Founder, TidalScale at MLconf NYC - 4/15/16
 
Ford's AWS Service Update - March 2020 (Richmond AWS User Group)
Ford's AWS Service Update - March 2020 (Richmond AWS User Group)Ford's AWS Service Update - March 2020 (Richmond AWS User Group)
Ford's AWS Service Update - March 2020 (Richmond AWS User Group)
 
High Performance Computing (HPC) in cloud
High Performance Computing (HPC) in cloudHigh Performance Computing (HPC) in cloud
High Performance Computing (HPC) in cloud
 
Orchestration across multiple cloud platforms using Heat
Orchestration across multiple cloud platforms using HeatOrchestration across multiple cloud platforms using Heat
Orchestration across multiple cloud platforms using Heat
 
OpenStack Orchestration (Heat)
OpenStack Orchestration (Heat)OpenStack Orchestration (Heat)
OpenStack Orchestration (Heat)
 
Ejecución del Elastic Stack en Kubernetes
Ejecución del Elastic Stack en KubernetesEjecución del Elastic Stack en Kubernetes
Ejecución del Elastic Stack en Kubernetes
 
Taking High Performance Computing to the Cloud: Windows HPC and
Taking High Performance Computing to the Cloud: Windows HPC and Taking High Performance Computing to the Cloud: Windows HPC and
Taking High Performance Computing to the Cloud: Windows HPC and
 
Kubernetes: Reducing Infrastructure Cost & Complexity
Kubernetes: Reducing Infrastructure Cost & ComplexityKubernetes: Reducing Infrastructure Cost & Complexity
Kubernetes: Reducing Infrastructure Cost & Complexity
 
AWS Customer Presentation - JovianDATA
AWS Customer Presentation - JovianDATAAWS Customer Presentation - JovianDATA
AWS Customer Presentation - JovianDATA
 
(BDT202) HPC Now Means 'High Personal Computing' | AWS re:Invent 2014
(BDT202) HPC Now Means 'High Personal Computing' | AWS re:Invent 2014(BDT202) HPC Now Means 'High Personal Computing' | AWS re:Invent 2014
(BDT202) HPC Now Means 'High Personal Computing' | AWS re:Invent 2014
 

Similar to Dask for Analytics

Sybase IQ ile Muhteşem Performans
Sybase IQ ile Muhteşem PerformansSybase IQ ile Muhteşem Performans
Sybase IQ ile Muhteşem PerformansSybase Türkiye
 
Big data vahidamiri-tabriz-13960226-datastack.ir
Big data vahidamiri-tabriz-13960226-datastack.irBig data vahidamiri-tabriz-13960226-datastack.ir
Big data vahidamiri-tabriz-13960226-datastack.irdatastack
 
Optimizing Big Data to run in the Public Cloud
Optimizing Big Data to run in the Public CloudOptimizing Big Data to run in the Public Cloud
Optimizing Big Data to run in the Public CloudQubole
 
Scaling opensimulator inventory using nosql
Scaling opensimulator inventory using nosqlScaling opensimulator inventory using nosql
Scaling opensimulator inventory using nosqlDavid Daeschler
 
In Memory Analytics with Apache Spark
In Memory Analytics with Apache SparkIn Memory Analytics with Apache Spark
In Memory Analytics with Apache SparkVenkata Naga Ravi
 
From Pandas to Koalas: Reducing Time-To-Insight for Virgin Hyperloop's Data
From Pandas to Koalas: Reducing Time-To-Insight for Virgin Hyperloop's DataFrom Pandas to Koalas: Reducing Time-To-Insight for Virgin Hyperloop's Data
From Pandas to Koalas: Reducing Time-To-Insight for Virgin Hyperloop's DataDatabricks
 
Run more applications without expanding your datacenter
Run more applications without expanding your datacenterRun more applications without expanding your datacenter
Run more applications without expanding your datacenterPrincipled Technologies
 
Azure Databricks is Easier Than You Think
Azure Databricks is Easier Than You ThinkAzure Databricks is Easier Than You Think
Azure Databricks is Easier Than You ThinkIke Ellis
 
TupleJump: Breakthrough OLAP performance on Cassandra and Spark
TupleJump: Breakthrough OLAP performance on Cassandra and SparkTupleJump: Breakthrough OLAP performance on Cassandra and Spark
TupleJump: Breakthrough OLAP performance on Cassandra and SparkDataStax Academy
 
FiloDB - Breakthrough OLAP Performance with Cassandra and Spark
FiloDB - Breakthrough OLAP Performance with Cassandra and SparkFiloDB - Breakthrough OLAP Performance with Cassandra and Spark
FiloDB - Breakthrough OLAP Performance with Cassandra and SparkEvan Chan
 
Move your private cloud to Dell EMC PowerEdge C6420 server nodes and boost Ap...
Move your private cloud to Dell EMC PowerEdge C6420 server nodes and boost Ap...Move your private cloud to Dell EMC PowerEdge C6420 server nodes and boost Ap...
Move your private cloud to Dell EMC PowerEdge C6420 server nodes and boost Ap...Principled Technologies
 
20180522 infra autoscaling_system
20180522 infra autoscaling_system20180522 infra autoscaling_system
20180522 infra autoscaling_systemKai Sasaki
 
Very large scale distributed deep learning on BigDL
Very large scale distributed deep learning on BigDLVery large scale distributed deep learning on BigDL
Very large scale distributed deep learning on BigDLDESMOND YUEN
 
SQL vs NoSQL deep dive
SQL vs NoSQL deep diveSQL vs NoSQL deep dive
SQL vs NoSQL deep diveAhmed Shaaban
 
(BDT303) Running Spark and Presto on the Netflix Big Data Platform
(BDT303) Running Spark and Presto on the Netflix Big Data Platform(BDT303) Running Spark and Presto on the Netflix Big Data Platform
(BDT303) Running Spark and Presto on the Netflix Big Data PlatformAmazon Web Services
 
Apache spark sneha challa- google pittsburgh-aug 25th
Apache spark  sneha challa- google pittsburgh-aug 25thApache spark  sneha challa- google pittsburgh-aug 25th
Apache spark sneha challa- google pittsburgh-aug 25thSneha Challa
 

Similar to Dask for Analytics (20)

IEEE CLOUD \'11
IEEE CLOUD \'11IEEE CLOUD \'11
IEEE CLOUD \'11
 
Sybase IQ ile Muhteşem Performans
Sybase IQ ile Muhteşem PerformansSybase IQ ile Muhteşem Performans
Sybase IQ ile Muhteşem Performans
 
Big data vahidamiri-tabriz-13960226-datastack.ir
Big data vahidamiri-tabriz-13960226-datastack.irBig data vahidamiri-tabriz-13960226-datastack.ir
Big data vahidamiri-tabriz-13960226-datastack.ir
 
Optimizing Big Data to run in the Public Cloud
Optimizing Big Data to run in the Public CloudOptimizing Big Data to run in the Public Cloud
Optimizing Big Data to run in the Public Cloud
 
Scaling opensimulator inventory using nosql
Scaling opensimulator inventory using nosqlScaling opensimulator inventory using nosql
Scaling opensimulator inventory using nosql
 
In Memory Analytics with Apache Spark
In Memory Analytics with Apache SparkIn Memory Analytics with Apache Spark
In Memory Analytics with Apache Spark
 
From Pandas to Koalas: Reducing Time-To-Insight for Virgin Hyperloop's Data
From Pandas to Koalas: Reducing Time-To-Insight for Virgin Hyperloop's DataFrom Pandas to Koalas: Reducing Time-To-Insight for Virgin Hyperloop's Data
From Pandas to Koalas: Reducing Time-To-Insight for Virgin Hyperloop's Data
 
AWS ECS workshop
AWS ECS workshopAWS ECS workshop
AWS ECS workshop
 
Run more applications without expanding your datacenter
Run more applications without expanding your datacenterRun more applications without expanding your datacenter
Run more applications without expanding your datacenter
 
Azure Databricks is Easier Than You Think
Azure Databricks is Easier Than You ThinkAzure Databricks is Easier Than You Think
Azure Databricks is Easier Than You Think
 
TupleJump: Breakthrough OLAP performance on Cassandra and Spark
TupleJump: Breakthrough OLAP performance on Cassandra and SparkTupleJump: Breakthrough OLAP performance on Cassandra and Spark
TupleJump: Breakthrough OLAP performance on Cassandra and Spark
 
FiloDB - Breakthrough OLAP Performance with Cassandra and Spark
FiloDB - Breakthrough OLAP Performance with Cassandra and SparkFiloDB - Breakthrough OLAP Performance with Cassandra and Spark
FiloDB - Breakthrough OLAP Performance with Cassandra and Spark
 
Scala a case4
Scala a case4Scala a case4
Scala a case4
 
Spark vstez
Spark vstezSpark vstez
Spark vstez
 
Move your private cloud to Dell EMC PowerEdge C6420 server nodes and boost Ap...
Move your private cloud to Dell EMC PowerEdge C6420 server nodes and boost Ap...Move your private cloud to Dell EMC PowerEdge C6420 server nodes and boost Ap...
Move your private cloud to Dell EMC PowerEdge C6420 server nodes and boost Ap...
 
20180522 infra autoscaling_system
20180522 infra autoscaling_system20180522 infra autoscaling_system
20180522 infra autoscaling_system
 
Very large scale distributed deep learning on BigDL
Very large scale distributed deep learning on BigDLVery large scale distributed deep learning on BigDL
Very large scale distributed deep learning on BigDL
 
SQL vs NoSQL deep dive
SQL vs NoSQL deep diveSQL vs NoSQL deep dive
SQL vs NoSQL deep dive
 
(BDT303) Running Spark and Presto on the Netflix Big Data Platform
(BDT303) Running Spark and Presto on the Netflix Big Data Platform(BDT303) Running Spark and Presto on the Netflix Big Data Platform
(BDT303) Running Spark and Presto on the Netflix Big Data Platform
 
Apache spark sneha challa- google pittsburgh-aug 25th
Apache spark  sneha challa- google pittsburgh-aug 25thApache spark  sneha challa- google pittsburgh-aug 25th
Apache spark sneha challa- google pittsburgh-aug 25th
 

Recently uploaded

A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 

Recently uploaded (20)

A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 

Dask for Analytics

  • 2. WHY DASK DASK IN ACTION RESULTS TABLE OF CONTENTS 01 02 03 04 REFERENCES
  • 4. Parallel computing library that scales the existing Python ecosystem : NumPy, Pandas , Scikit PARALLELIZATION FLEXIBLE RESPONSIVE Task scheduling interface for more custom workloads and integration. Single machine or distributed workloads across GCP clusters Runs resiliently up to 1000s cores. Horizontal and vertical scaling.
  • 6. 6 Example: 1. Dask in GKE cluster 2. Models deployment (Helm) 3. Faster training and workload parallelization 4. Scale via Kubernetes or Dask (threads/workers) Demo: simple tree summation computation with Dask installed in our cluster.
  • 9. WHY DASK IS GOOD FOR US OPEN SOURCE Our setup is on GKE with a single Dask cluster deployed with Helm. Open Source solution configurable for all the ML projects. SCALING We scale the number of workers (threads) dividing the workload in multiple units. Horizontal or vertical scaling via dashboard. SPEED We speed up the computation achieving faster times in training models
  • 10. Next? Creation of Dask clusters by computation context, saving costs . Storage and dataframes parallelization refactoring the model implementation.
  • 12. CREDITS This is where you give credit to the ones who are part of this project. Did you like the resources on this template? Get them for free at our other websites. ◂ Presentation template by Slidesgo ◂ Icons by Flaticon ◂ Infographics by Freepik ◂ Images created by Freepik and rawpixel -Freepik ◂ Author introduction slide photo created by Freepik ◂ Text & Image slide photo created by Freepik.com