Say What You Mean
Braxton McKee, CEO & Founder
Scaling up machine learning algorithms
directly from source code
Q: Why should I have to rewrite my
program as my dataset gets larger?
Example: Nearest Neighbors
def sq_distance(p1, p2):
    return sum((p1[i] - p2[i])**2 for i in range(len(p1)))

def index_of_nearest(p, points):
    return min((sq_distance(p, points[i]), i)
               for i in range(len(points)))[1]

def nearest_center(points, centers):
    return [index_of_nearest(p, centers) for p in points]
Unfortunately, this is not fast.
Q: Why should I have to rewrite my program as my dataset gets larger?
A: You shouldn’t have to!
Pyfora
Automatically scalable Python
for large-scale machine learning and data science
100% Open Source
http://github.com/ufora/ufora
http://docs.pyfora.com/
Goals of Pyfora
• Provide identical semantics to regular Python
• Easily use hundreds of CPUs / GPUs and TBs of
RAM
• Scale by analyzing source code, not by calling
libraries
No more complex frameworks or
Approaches to Scaling
APIs and Frameworks
• Library of functions for
specific patterns of
parallelism
• Programmer (re)writes
program to fit the pattern.
Programming Language
• Semantics of calculation entirely defined by source code
• Compiler and runtime are responsible for efficient execution.
Approaches to Scaling
APIs and Frameworks
• MPI
• Hadoop
• Spark
Programming Languages
• CUDA
• CILK
• SQL
• Python with Pyfora
API
Pros
• More control over performance
• Easy to integrate lots of different systems.
Cons
• More code
• Program meaning obscured by implementation details
• Hard to debug when something goes wrong
Language
Pros
• Simpler code
• Much more expressive
• Programs are easier to understand.
• Cleaner failure modes
• Much deeper optimizations are possible.
Cons
• Very hard to implement
With a strong implementation,
“language approach” should win
• Any pattern that can be implemented in an API can be
recognized in a language.
• Language-based systems have the entire source code, so they
have more to work with than API-based systems.
• Can measure behavior at runtime and use this to optimize.
Example: Nearest Neighbors
def sq_distance(p1, p2):
    return sum((p1[i] - p2[i])**2 for i in range(len(p1)))

def index_of_nearest(p, points):
    return min((sq_distance(p, points[i]), i)
               for i in range(len(points)))[1]

def nearest_center(points, centers):
    return [index_of_nearest(p, centers) for p in points]
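To make concrete what these three functions compute, here is a minimal runnable sketch on a tiny dataset. The sample points and centers are made up for illustration and are not from the talk.

```python
def sq_distance(p1, p2):
    # Squared Euclidean distance between two equal-length sequences.
    return sum((p1[i] - p2[i]) ** 2 for i in range(len(p1)))

def index_of_nearest(p, points):
    # min over (distance, index) pairs; ties go to the lower index.
    return min((sq_distance(p, points[i]), i)
               for i in range(len(points)))[1]

def nearest_center(points, centers):
    # For every point, the index of its closest center.
    return [index_of_nearest(p, centers) for p in points]

# Illustrative data (an assumption, not from the slides).
points = [(0.0, 0.0), (1.0, 0.5), (9.0, 9.5), (5.0, 4.0)]
centers = [(0.0, 0.0), (10.0, 10.0)]

assignments = nearest_center(points, centers)
print(assignments)
```

The work is one distance per (point, center) pair, each of which is independent of the others. That independence is what a compiler and runtime could exploit without any change to the source.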
How can we make this fast?
• JIT compile to make single-threaded code fast
• Parallelize to use multiple CPUs
• Distribute data to use multiple machines
Why is this tricky?
Optimal behavior depends on the sizes and shapes of data.
[Figure: Centers and Points]
If both sets are small, don’t bother to distribute.
Why is this tricky?
[Figure: Centers and Points]
If “points” is tall and thin, it’s natural to split it across many
machines and replicate “centers”.
Why is this tricky?
[Figure: Centers and Points]
If “points” and “centers” are really wide (say, they’re images), it
would be better to split them horizontally, compute distances between
all pairs in slices, and merge them.
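The wide-data strategy works because a squared distance is a sum over columns, so partial sums computed on separate column slices can simply be added. Below is a minimal sketch of that idea; the helper names (`partial_sq_distances`, `merge`, `sliced_nearest_center`) and the `slice_width` parameter are hypothetical, and the slices run sequentially here where a real system would place them on different machines.

```python
def partial_sq_distances(points, centers, lo, hi):
    # Contribution of columns lo..hi-1 to the squared distance
    # of every (point, center) pair.
    return [[sum((p[k] - c[k]) ** 2 for k in range(lo, hi)) for c in centers]
            for p in points]

def merge(a, b):
    # Element-wise sum of two partial distance tables.
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sliced_nearest_center(points, centers, slice_width):
    dim = len(centers[0])
    total = [[0] * len(centers) for _ in points]
    for lo in range(0, dim, slice_width):
        hi = min(lo + slice_width, dim)
        total = merge(total, partial_sq_distances(points, centers, lo, hi))
    # Index of the smallest total distance in each row.
    return [min((d, j) for j, d in enumerate(row))[1] for row in total]

# Illustrative data (an assumption, not from the slides).
points = [[0, 1, 2, 3, 4, 5], [9, 9, 9, 9, 9, 9]]
centers = [[1, 1, 1, 1, 1, 1], [8, 8, 8, 8, 8, 8]]

result = sliced_nearest_center(points, centers, slice_width=2)
unsliced = sliced_nearest_center(points, centers, slice_width=6)
```

Both calls give the same assignment. Note how different this code looks from the three-function original, even though it computes the same thing.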
Why is this tricky?
You will end up writing totally different code for
each of these different situations.
The source code contains the necessary
structure.
The key is to defer decisions to runtime, when the
system can actually see how big the datasets are.
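As a rough illustration of deferring the decision to runtime, the sketch below picks one of the three strategies described above from the observed data sizes. It is not Pyfora's scheduler: the function name, the `machine_capacity` unit (number of values one machine can comfortably hold), and the thresholds are all assumptions made for this example.

```python
def choose_strategy(n_points, n_centers, dim, machine_capacity):
    # Sizes are counted in values, not bytes (hypothetical unit).
    points_size = n_points * dim
    centers_size = n_centers * dim

    # Both sets are small: don't bother to distribute.
    if points_size + centers_size <= machine_capacity:
        return "local"

    # "centers" is small enough to copy everywhere: split "points" by rows.
    if centers_size <= machine_capacity // 2:
        return "split_points_replicate_centers"

    # Both are large (e.g. very wide rows): slice by columns and merge.
    return "split_columns_and_merge"

small = choose_strategy(1000, 10, 3, machine_capacity=10**6)
tall = choose_strategy(10**9, 10, 3, machine_capacity=10**6)
wide = choose_strategy(1000, 1000, 10**6, machine_capacity=10**6)
```

The point is that none of these inputs are known when the program is written. They only become available once the data exists.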
Getting it right is valuable
• Much less work for the programmer
• Code is actually readable
• Code becomes more reusable.
• Use the language the way it was intended:
For instance, in Python, the “row” objects can be anything that looks like
a list.
What are some other common
implementation problems we can
solve this way?
Problem: Wrong-sized chunking
• API-based frameworks require you to explicitly partition your
data into chunks.
• If you are running a complex task, the runtime may be really
long for a small subset of chunks. You’ll end up waiting a long
time for that last mapper.
• If your tasks allocate memory, you can run out of RAM and
crash.
Solution: Dynamically rebalance
[Figure: adaptive parallelism, with work split across Core #1, Core #2, Core #3, and Core #4]
• This requires you to be able to interrupt running tasks as
they’re executing.
• Adding support for this to an API makes it much more
complicated to use.
• This is much easier to do with compiler support.
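A small simulation can show why chunk size matters. The sketch below compares fixed contiguous chunks against handing out one item at a time to whichever worker is free. This approximates rebalancing with fine-grained scheduling and does not model interrupting a running task; the cost numbers and function names are invented for illustration.

```python
import heapq

def static_makespan(costs, n_workers):
    # Fixed contiguous chunks, one per worker; finish time is the slowest chunk.
    size = -(-len(costs) // n_workers)  # ceiling division
    return max(sum(costs[i:i + size]) for i in range(0, len(costs), size))

def dynamic_makespan(costs, n_workers):
    # Each item goes to whichever worker becomes free first.
    free_at = [0] * n_workers
    heapq.heapify(free_at)
    for c in costs:
        t = heapq.heappop(free_at)
        heapq.heappush(free_at, t + c)
    return max(free_at)

# Twelve cheap items followed by four expensive ones (made-up costs).
costs = [1] * 12 + [20, 20, 20, 20]

static_time = static_makespan(costs, n_workers=4)
dynamic_time = dynamic_makespan(costs, n_workers=4)
print(static_time, dynamic_time)
```

With static chunks, all four expensive items land in the last chunk and the other three workers sit idle waiting for it. That is the "last mapper" problem described above.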
Problem: Nested parallelism
Example:
• You have an iterative model
• There is lots of parallelism in each iteration
• But you also want to search over many hyperparameters
With API-based approaches, you have to manage this yourself,
either by constructing a graph of subtasks, or figuring out how to
flatten your workload into something that can be map-reduced.
Solution: infer parallelism from source
def fit_model(learning_rate, model, params):
    while not model.finished(params):
        params = model.update_params(learning_rate, params)
    return params

fits = [[fit_model(rate, model, params) for rate in learning_rates]
        for model in models]
[Slide annotation: sources of parallelism]
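For contrast, here is a sketch of the manual flattening that an API-based approach tends to require, using the standard library's `concurrent.futures`. `ToyModel` is a hypothetical stand-in for a real model, and the starting parameter `0.0` is arbitrary. Only the two outer loops are parallelized; any parallelism inside `update_params` would need yet another layer of hand-written code.

```python
from concurrent.futures import ThreadPoolExecutor

class ToyModel:
    # Hypothetical model: params move toward `target` until close enough.
    def __init__(self, target):
        self.target = target

    def finished(self, params):
        return abs(params - self.target) < 1e-6

    def update_params(self, learning_rate, params):
        return params + learning_rate * (self.target - params)

def fit_model(learning_rate, model, params):
    while not model.finished(params):
        params = model.update_params(learning_rate, params)
    return params

models = [ToyModel(1.0), ToyModel(-2.0)]
learning_rates = [0.1, 0.5, 0.9]

# Manual flattening: one task per (model, rate) pair...
jobs = [(m, r) for m in models for r in learning_rates]
with ThreadPoolExecutor(max_workers=4) as pool:
    flat = list(pool.map(lambda job: fit_model(job[1], job[0], 0.0), jobs))

# ...then regroup the flat results into one row per model.
n = len(learning_rates)
fits = [flat[i * n:(i + 1) * n] for i in range(len(models))]
```

The bookkeeping (building `jobs`, regrouping `flat`) is exactly the implementation detail that the nested list comprehension avoids.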
Problem: Common data is too big
Example:
• You have a bunch of datasets (say, for a bunch of products, the
customers who bought that product)
• You want to compute something on all pairs of sets (say, some
average on common customers for both)
• The whole set-of-sets is too big for memory
[[some_function(s1,s2) for s1 in sets] for s2 in sets]
Problem: Common data is too big
This creates problems because:
• If you just do map-reduce on the outer loop, you still need to get to the
data for all the other sets.
• If you try to actually produce all pairs of sets, you’ll end up with
something many many times larger than the original dataset.
[[some_function(s1,s2) for s1 in sets] for s2 in sets]
Solution: infer cache locality
• Think of each call to some_function (“f” for short) as a separate task.
• Break tasks into smaller tasks until each one’s active working
set is a reasonable size.
• Schedule tasks that use the same data on the same machine to
minimize data movement.
[[some_function(s1,s2) for s1 in sets] for s2 in sets]
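One simple way to picture this is to cut the grid of pairs into square tiles, so that each tile only needs a handful of sets in memory at once. The sketch below uses a fixed `block` size, which is an assumption for illustration; the approach described above instead splits tasks until the working set fits. The names `tiles` and `working_set` are hypothetical.

```python
def tiles(n_sets, block):
    # Yield (row_range, col_range) tiles covering the n_sets x n_sets grid of pairs.
    for r in range(0, n_sets, block):
        for c in range(0, n_sets, block):
            yield (range(r, min(r + block, n_sets)),
                   range(c, min(c + block, n_sets)))

def working_set(tile):
    # Indices of the sets a tile must have in memory to run all its pairs.
    rows, cols = tile
    return set(rows) | set(cols)

# Nine sets, tiles of 3 x 3 pairs.
all_tiles = list(tiles(9, 3))
largest = max(len(working_set(t)) for t in all_tiles)
covered = sum(len(r) * len(c) for r, c in all_tiles)
print(len(all_tiles), largest, covered)
```

Every one of the 81 pairs is covered, yet no tile touches more than 6 of the 9 sets. Scheduling a whole tile on one machine means those sets are loaded once and reused.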
Solution: infer cache locality
[Figure: grid of tasks f(si, sj), one per pair of sets, from f(s0,s0) through the f(s8, ...) row]
So how does Pyfora work?
• Operate on a subset of Python that restricts mutability.
• Use a JIT compiler that can “pop” code back into the interpreter.
• Move sets of stackframes from one machine to another.
• Rewrite selected stackframes to use futures if there is parallelism to exploit.
• Carefully track what data a thread is using.
• Dynamically schedule threads and data on machines to optimize for cache locality.
import pyfora
executor = pyfora.connect("http://...")
data = executor.importS3Dataset("myBucket", "myData.csv")

def calibrate(dataframe, params):
    # some complex model with loops and parallelism
    pass  # body not shown on the slide

with executor.remotely:
    dataframe = parse_csv(data)
    models = [calibrate(dataframe, p) for p in params]

print(models.toLocal().result())
What are we working on?
• More libraries!
• Better predictions on how long functions will take and what data
they consume. This helps to make better scheduling decisions.
• Compiler optimizations (immutable Python is a rich source of
these)
• Automatic compilation and scheduling of data and compute on
GPU
Thanks!
• Check out the repo: github.com/ufora/ufora
• Follow me on Twitter and Medium: @braxtonmckee
• Subscribe to “This Week in Data” (see top of ufora.com)
• Email me: braxton@ufora.com

Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 

Recently uploaded (20)

Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 

NYAI - Scaling Machine Learning Applications by Braxton McKee

  • 1. Say What You Mean Braxton McKee, CEO & Founder Scaling up machine learning algorithms directly from source code
  • 2. Q: Why should I have to rewrite my program as my dataset gets larger?
  • 3. Example: Nearest Neighbors
        def sq_distance(p1,p2):
            return sum((p1[i]-p2[i])**2 for i in range(len(p1)))

        def index_of_nearest(p, points):
            return min((sq_distance(p, points[i]),i)
                       for i in range(len(points)))[1]

        def nearest_center(points, centers):
            return [index_of_nearest(p, centers) for p in points]
  • 5. Q: Why should I have to rewrite my program as my dataset gets larger? A: You shouldn’t have to!
  • 6. Pyfora Automatically scalable Python for large-scale machine learning and data science 100% Open Source http://github.com/ufora/ufora http://docs.pyfora.com/
  • 7. Goals of Pyfora • Provide identical semantics to regular Python • Easily use hundreds of CPUs / GPUs and TBs of RAM • Scale by analyzing source code, not by calling libraries No more complex frameworks or
  • 8. Approaches to Scaling APIs and Frameworks • Library of functions for specific patterns of parallelism • Programmer (re)writes program to fit the pattern.
  • 9. Approaches to Scaling APIs and Frameworks • Library of functions for specific patterns of parallelism • Programmer (re)writes program to fit the pattern. Programming Language • Semantics of calculation entirely defined by source code • Compiler and Runtime are responsible for efficient execution.
  • 10. Approaches to Scaling APIs and Frameworks • MPI • Hadoop • Spark Programming Languages • CUDA • CILK • SQL • Python with Pyfora
  • 11. API vs. Language
        API pros: • More control over performance • Easy to integrate lots of different systems.
        Language pros: • Simpler code • Much more expressive • Programs are easier to understand. • Cleaner failure modes • Much deeper optimizations are possible.
        API cons: • More code • Program meaning obscured by implementation details • Hard to debug when something goes wrong
        Language cons: • Very hard to implement
  • 12. With a strong implementation, “language approach” should win • Any pattern that can be implemented in an API can be recognized in a language. • Language-based systems have the entire source code, so they have more to work with than API based systems. • Can measure behavior at runtime and use this to optimize.
  • 13. Example: Nearest Neighbors
        def sq_distance(p1,p2):
            return sum((p1[i]-p2[i])**2 for i in range(len(p1)))

        def index_of_nearest(p, points):
            return min((sq_distance(p, points[i]),i)
                       for i in xrange(len(points)))[1]

        def nearest_center(points, centers):
            return [index_of_nearest(p, centers) for p in points]
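A small usage example may help make the slide's functions concrete. The sketch below restates them for Python 3 (so `range` replaces the slide's `xrange`) and runs them on made-up data; the sample `points` and `centers` are illustrative assumptions, not from the talk.

```python
def sq_distance(p1, p2):
    # Squared Euclidean distance between two equal-length sequences.
    return sum((p1[i] - p2[i]) ** 2 for i in range(len(p1)))

def index_of_nearest(p, points):
    # min over (distance, index) pairs; [1] extracts the index.
    return min((sq_distance(p, points[i]), i) for i in range(len(points)))[1]

def nearest_center(points, centers):
    # For each point, the index of its closest center.
    return [index_of_nearest(p, centers) for p in points]

# Hypothetical sample data: two obvious clusters.
points = [(0, 0), (1, 1), (9, 9), (10, 10)]
centers = [(0, 0), (10, 10)]
print(nearest_center(points, centers))  # prints [0, 0, 1, 1]
```

Note that the "row" objects here are tuples, but any indexable sequence with a length would work unchanged.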
  • 14. How can we make this fast? • JIT compile to make single-threaded code fast • Parallelize to use multiple CPUs • Distribute data to use multiple machines
  • 15. Why is this tricky? Optimal behavior depends on the sizes and shapes of data. [Figure: "Centers" and "Points" datasets] If both sets are small, don’t bother to distribute.
  • 16. Why is this tricky? [Figure: "Centers" and "Points" datasets] If “points” is tall and thin, it’s natural to split it across many machines and replicate “centers”.
  • 17. Why is this tricky? [Figure: "Centers" and "Points" datasets] If “points” and “centers” are really wide (say, they’re images), it would be better to split them horizontally, compute distances between all pairs in slices, and merge them.
  • 18. Why is this tricky? You will end up writing totally different code for each of these different situations. The source code contains the necessary structure. The key is to defer decisions to runtime, when the system can actually see how big the datasets are.
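As a rough illustration of deferring the decision to runtime, the sketch below picks one of the three strategies from slides 15 to 17 by looking at the data's shape, and shows why the "wide" case can be computed slice by slice. The names `choose_strategy` and `sq_distance_by_slices` and both thresholds are hypothetical and not part of Pyfora; a real runtime would presumably measure behavior rather than rely on fixed cutoffs.

```python
def choose_strategy(n_points, n_centers, dim,
                    small_cells=1_000_000, wide_dim=100_000):
    """Pick a distribution strategy from data shape (thresholds are illustrative)."""
    if (n_points + n_centers) * dim <= small_cells:
        return "local"                      # both sets small: don't distribute
    if dim >= wide_dim:
        return "split_columns"              # very wide rows: slice, then merge
    return "split_rows_replicate_centers"   # tall and thin: shard points, copy centers

def sq_distance_by_slices(p1, p2, slice_width):
    """Squared distance as a sum of per-slice partial sums (the 'merge' step)."""
    total = 0
    for start in range(0, len(p1), slice_width):
        stop = start + slice_width
        # Each slice could be computed on a different machine.
        total += sum((a - b) ** 2 for a, b in zip(p1[start:stop], p2[start:stop]))
    return total
```

The second function works because a squared distance is a plain sum over coordinates, so partial sums over column slices can be added in any order.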
  • 19. Getting it right is valuable • Much less work for the programmer • Code is actually readable • Code becomes more reusable. • Use the language the way it was intended: For instance, in Python, the “row” objects can be anything that looks like a list.
  • 20. What are some other common implementation problems we can solve this way?
  • 21. Problem: Wrong-sized chunking • API-based frameworks require you to explicitly partition your data into chunks. • If you are running a complex task, the runtime may be really long for a small subset of chunks. You’ll end up waiting a long time for that last mapper. • If your tasks allocate memory, you can run out of RAM and crash.
  • 22. Solution: Dynamically rebalance [Figure: "Adaptive Parallelism", showing work being split across cores #1 to #4]
  • 23. Solution: Dynamically rebalance • This requires you to be able to interrupt running tasks as they’re executing. • Adding support for this to an API makes it much more complicated to use. • This is much easier to do with compiler support.
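The effect of rebalancing can be seen in a toy simulation. The sketch below is not Pyfora's scheduler; it is a tick-based model, under the assumption that each item has a positive integer cost and each core starts with an equal-sized contiguous chunk. With `rebalance=True`, an idle core takes the upper half of the largest remaining range, leaving the item currently in progress with its owner. The function name `makespan` is made up for this example.

```python
def makespan(costs, n_cores, rebalance):
    """Ticks needed to finish all items; costs must be positive integers."""
    n = len(costs)
    # Static start: contiguous, equal-sized chunks, one per core.
    bounds = [n * c // n_cores for c in range(n_cores + 1)]
    ranges = [[bounds[c], bounds[c + 1]] for c in range(n_cores)]
    left = list(costs)  # remaining work units per item
    ticks = 0
    while any(lo < hi for lo, hi in ranges):
        if rebalance:
            for r in ranges:
                if r[0] == r[1]:  # this core is idle
                    victim = max(ranges, key=lambda v: v[1] - v[0])
                    if victim[1] - victim[0] >= 2:
                        # Split the victim's range; it keeps the lower half,
                        # including the item it is currently executing.
                        mid = (victim[0] + victim[1] + 1) // 2
                        r[0], r[1] = mid, victim[1]
                        victim[1] = mid
        for r in ranges:
            if r[0] < r[1]:
                left[r[0]] -= 1       # one unit of work on the current item
                if left[r[0]] == 0:
                    r[0] += 1         # item finished, move to the next
        ticks += 1
    return ticks

# Skewed workload: the first chunk holds all the expensive items.
costs = [10] * 4 + [1] * 12
print(makespan(costs, 4, rebalance=False))
print(makespan(costs, 4, rebalance=True))
```

With static chunks, three cores sit idle while the first works through all the expensive items, which is the "waiting for that last mapper" situation from slide 21.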
  • 24. Problem: Nested parallelism Example: • You have an iterative model • There is lots of parallelism in each iteration • But you also want to search over many hyperparameters With API-based approaches, you have to manage this yourself, either by constructing a graph of subtasks, or figuring out how to flatten your workload into something that can be map-reduced.
  • 25. Solution: infer parallelism from source
        def fit_model(learning_rate, model, params):
            while not model.finished(params):
                params = model.update_params(learning_rate, params)
            return params

        fits = [[fit_model(rate, model, params) for rate in learning_rates]
                for model in models]
        [Figure annotation: sources of parallelism]
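For contrast, here is roughly what flattening the workload by hand looks like with an API, using the standard library's `concurrent.futures`. This is a sketch of the manual approach the slides argue against, not of Pyfora itself. The toy `fit_model` and the helper name `fit_all_flattened` are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def fit_model(learning_rate, model, params):
    # Toy stand-in for the slide's iterative fit: move params toward `model`.
    for _ in range(100):
        params = params - learning_rate * (params - model)
    return params

def fit_all_flattened(models, learning_rates, params, max_workers=4):
    # Flatten the two nested loops into a single task list...
    tasks = list(product(models, learning_rates))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        flat = list(pool.map(lambda t: fit_model(t[1], t[0], params), tasks))
    # ...then rebuild the nested shape the original expression produced.
    n = len(learning_rates)
    return [flat[i * n:(i + 1) * n] for i in range(len(models))]
```

Even after this rewrite, only the outer two loops are parallel; any parallelism inside each iteration of `fit_model` would need yet another layer of manual task management.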
  • 26. Problem: Common data is too big Example: • You have a bunch of datasets (say, for a bunch of products, the customers who bought that product) • You want to compute something on all pairs of sets (say, some average on common customers for both) • The whole set-of-sets is too big for memory [[some_function(s1,s2) for s1 in sets] for s2 in sets]
  • 27. Problem: Common data is too big This creates problems because: • If you just do map-reduce on the outer loop, you still need to get to the data for all the other sets. • If you try to actually produce all pairs of sets, you’ll end up with something many many times larger than the original dataset. [[some_function(s1,s2) for s1 in sets] for s2 in sets]
  • 28. Solution: infer cache locality • Think of each call to “f” as a separate task. • Break tasks into smaller tasks until each one’s active working set is a reasonable size. • Schedule tasks that use the same data on the same machine to minimize data movement. [[some_function(s1,s2) for s1 in sets] for s2 in sets]
  • 29. Solution: infer cache locality [Figure: grid of tasks f(si, sj), one cell per pair of sets; rows s0 to s8 and columns s0 to s5 are visible]
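One simple way to picture the grouping in that grid is fixed-size tiling: cut the grid of pairs into square blocks so that each block of tasks touches only a bounded number of sets. The sketch below assumes a fixed `block` size chosen up front, which is a simplification; per the slides, Pyfora splits tasks dynamically until the working set is reasonable. The function name `tiled_pairs` is hypothetical.

```python
def tiled_pairs(n_sets, block):
    """Yield tiles of (i, j) index pairs; each tile touches at most 2*block sets."""
    for i0 in range(0, n_sets, block):
        for j0 in range(0, n_sets, block):
            yield [(i, j)
                   for i in range(i0, min(i0 + block, n_sets))
                   for j in range(j0, min(j0 + block, n_sets))]

# Each tile could be scheduled on one machine, which then only needs
# the sets indexed by that tile's rows and columns.
for tile in tiled_pairs(5, 2):
    rows = sorted({i for i, _ in tile})
    cols = sorted({j for _, j in tile})
    print(rows, cols, len(tile))
```

Every pair appears in exactly one tile, so the full result is still computed without ever materializing all pairs of sets at once.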
  • 30. So how does Pyfora work? • Operate on a subset of Python that restricts mutability. • Built a JIT compiler that can “pop” code back into the interpreter • Can move sets of stackframes from one machine to another • Can rewrite selected stackframes to use futures if there is parallelism to exploit. • Carefully track what data a thread is using. • Dynamically schedule threads and data on machines to optimize for cache locality.
  • 31. import pyfora
        executor = pyfora.connect("http://...")
        data = executor.importS3Dataset("myBucket", "myData.csv")

        def calibrate(dataframe, params):
            # some complex model with loops and parallelism

        with executor.remotely:
            dataframe = parse_csv(data)
            models = [calibrate(dataframe, p) for p in params]

        print(models.toLocal().result())
  • 32. What are we working on? • More libraries! • Better predictions on how long functions will take and what data they consume. This helps to make better scheduling decisions. • Compiler optimizations (immutable Python is a rich source of these) • Automatic compilation and scheduling of data and compute on GPU
  • 33. Thanks! • Check out the repo: github.com/ufora/ufora • Follow me on Twitter and Medium: @braxtonmckee • Subscribe to “This Week in Data” (see top of ufora.com) • Email me: braxton@ufora.com