Map-Reduce for Machine Learning on Multi-Core. Gary Bradski, work with Stanford: Prof. Andrew Ng, Cheng-Tao Chu, Yi-An Lin, Yuan Yuan Yu; Sang Kyun Kim with Prof. Kunle Olukotun; Ben Sapp, Intern
Acknowledgments
*Some slides taken from an Intel presentation on Google's Map-Reduce; the Paintable Computing slide is from www.cs.virginia.edu/~evans/bio/slides/paintable.ppt
And Ad-Sense Adverts
OpenCV: Open Source Computer Vision Library  http://www.intel.com/research/mrl/research/opencv
Machine Learning Libraries (MLL), a subdirectory under OpenCV  http://www.intel.com/research/mrl/research/opencv
Modeless vs. model-based; unsupervised vs. supervised. Bayesian Networks Library: PNL. Statistical Learning Library: MLL.
Talk
Overview
Google's Map-Reduce Programming Model
Task: find all restaurants within 25 miles of RNB.
Input Data: a large collection of records stored across multiple disks. Example: business addresses.
Map Phase (10s of lines of code): process each record independently. Example: determine if the business is a restaurant AND the address is within 50 miles.
Reduce Phase (1 line of code): aggregate the filter output. Example: Top(10).
Flow: Input Data -> Filters (Map) -> Reduce -> Results.
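To make the two phases concrete, here is a minimal C++ sketch of the restaurant query above. This is not Google's MR-C++ API; the Business record type, its field names, and the 25-mile cutoff are made up purely for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record type; the real input would be Google's business-address records.
struct Business {
    std::string name;
    bool is_restaurant;
    double miles_from_office;  // distance to the query point (e.g., RNB)
};

// Map phase: runs independently on each record of a shard, keeping only matches.
static std::vector<Business> MapFilter(const std::vector<Business>& shard) {
    std::vector<Business> out;
    for (const Business& b : shard)
        if (b.is_restaurant && b.miles_from_office <= 25.0) out.push_back(b);
    return out;
}

// Reduce phase: aggregate the filter outputs from all shards, e.g. Top(n) by distance.
static std::vector<Business> ReduceTopN(std::vector<std::vector<Business>> partials, std::size_t n) {
    std::vector<Business> all;
    for (auto& p : partials) all.insert(all.end(), p.begin(), p.end());
    std::sort(all.begin(), all.end(),
              [](const Business& a, const Business& b) {
                  return a.miles_from_office < b.miles_from_office;
              });
    if (all.size() > n) all.resize(n);
    return all;
}
```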
More Detail: Google Search Index Generation
In reality, this uses 5-10 Map-Reduce stages. Efficiency impacts how often this is run, which impacts the quality of the search.
Web pages 1..N are fed through three filter/aggregator pipelines that produce the Lexicon Table (Aggregator A), the Web Graph (Aggregator B), and the Page Rank Table (Aggregator C).
More Detail: In Use: Google Search Index Lookup
Processing a Google search query: the Google Web Server fans the query out across many Index Servers and Document Servers (map) and aggregates their responses (reduce); Spell Check and Ad Servers are consulted alongside.
Motivation
Google Hardware Infrastructure
Two Map-Reduce Programming Models
Map-Reduce Infrastructure Overview
Building blocks:
- Protocol Buffers: data description language, like XML.
- Google File System: customized file system optimized for the access pattern and for disk failure.
- Work Queue: manages process creation on a cluster for different jobs.
Programming model and implementation (C/C++ library):
- Map-Reduce (MR-C++): transparent distribution on a cluster (data distribution and scheduling); provides fault tolerance (handles machine failure).
Language:
- Sawzall: domain-specific language.
Map-Reduce on a Cluster
Example: find all restaurants within 25 miles of SAP.
These programs are complicated but similar; Map-Reduce hides parallelism and distribution from the programmer (steps 1, 3, 5, & 6). The request is dispatched to a "Map" server rack and then a "Reduce" server rack, avoiding heavily loaded machines.
Examples of Google Applications
Google Search, Google News, Froogle Search: offline index generation (minutes to weeks, moderately parallel) and online index lookup (seconds per query, "embarrassingly" parallel).
Also 100s of small programs (scripts) for monitoring, analysis, prototyping, etc.: IO-bound.
Google on Many-Core Architecture
Map-Reduce on Multi-Core: use machine learning to study it.
Overview
Summation form for Machine Learning: many machine learning algorithms can be written so that training only requires sums of per-example statistics; the sums over data partitions become the map step and combining them becomes the reduce step (worked example on the next slide).
Machine Learning Summation Form: Map-Reduce Example. Simple example: locally weighted linear regression (LWLR) [1]. To parallelize this algorithm, rewrite it in "summation form" as follows. MAP: by writing LWLR as a summation over data points, we can easily parallelize the computation of A and b by dividing the data over the number of threads; for example, with two cores. REDUCE: A is obtained by adding A1 and A2 (at negligible additional cost); the computation for b is similarly parallelized. [1] Here, the w_i are "weights." If all weights = 1, this reduces to ordinary least squares.
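The equations on this slide did not survive extraction; the following is a reconstruction of the standard LWLR normal equations in summation form, consistent with the A, b, A1, and A2 referred to in the text:

```latex
% LWLR normal equations in summation form (reconstruction).
\theta = A^{-1} b, \qquad
A = \sum_{i=1}^{m} w_i\, x_i x_i^{\top}, \qquad
b = \sum_{i=1}^{m} w_i\, x_i y_i .

% Map: split the m data points across two cores and compute partial sums.
A_1 = \sum_{i=1}^{m/2} w_i\, x_i x_i^{\top}, \qquad
A_2 = \sum_{i=m/2+1}^{m} w_i\, x_i x_i^{\top}, \qquad
b_1,\ b_2 \text{ analogously.}

% Reduce: combine the partial results.
A = A_1 + A_2, \qquad b = b_1 + b_2, \qquad \theta = A^{-1} b .
```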
ML fit with Many-Core. To be more clear: we have a set of M data points, each of length N. Decomposition for many-core: the data are divided across threads T_1, T_2, ..., T_P, and each thread incurs an NxN communication cost when it returns its partial result. Since N << M (or we haven't formulated a machine-learning problem), this is a good fit for Map-Reduce.
Map Reduce
Map-Reduce Machine Learning Framework
List of Implemented and Analyzed Algorithms (Analyzed: does the algorithm have a summation form?)
1. Linear regression: Yes
2. Locally weighted linear regression: Yes
3. Logistic regression: Yes
4. Gaussian discriminant analysis: Yes
5. Naïve Bayes: Yes
6. SVM (without kernel): Yes
7. K-means clustering: Yes
8. EM for mixture distributions: Yes
9. Neural networks: Yes
10. PCA (Principal components analysis): Yes
11. ICA (Independent components analysis): Yes
12. Policy search (PEGASUS): Yes
13. Boosting: Unknown*
14. SVM (with kernel): Unknown
15. Gaussian process regression: Unknown
16. Fisher Linear Discriminant: Yes
17. Multiple Discriminant Analysis: Yes
18. Histogram Intersection: Yes
Complexity Analysis: m = number of data points; n = number of features; P = number of mappers (processors/cores); c = iterations or nodes. Assume matrix inversion is O(n^3/P'), where P' counts the iterations of the inverse that may be parallelized. For Locally Weighted Linear Regression (LWLR) training (' denotes transpose):
X'WX: m·n^2   (Reducer: log(P)·n^2)
X'WY: m·n    (Reducer: log(P)·n)
(X'WX)^-1: n^3/P'
(X'WX)^-1 X'WY: n^2
Serial: O(m·n^2 + n^3/P')
Parallel: O(m·n^2/P + n^3/P' + log(P)·n^2)   [map + invert + reduce]
Basically: linear speedup with increasing numbers of cores.
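Spelling out the "linear speedup" claim from the two cost expressions above (a worked consequence of the slide's formulas, under the assumption that the map term m·n²/P dominates the inversion and reduce terms):

```latex
% Ratio of serial to parallel training cost for LWLR:
\frac{T_{\mathrm{serial}}}{T_{\mathrm{parallel}}}
  = \frac{m n^{2} + n^{3}/P'}
         {\, m n^{2}/P + n^{3}/P' + \log(P)\, n^{2}\,}
  \;\approx\; P
% whenever m n^{2}/P dominates, i.e. roughly m \gg n and m/P \gg \log P.
```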
Some Results, 2 Cores. Dataset characteristics table omitted. Basically: linear speed-up with the number of cores.
Results on 1-16 Way System
Average Speedup over All Data (over 10 datasets)
Map-Reduce for Machine Learning: Summary. Speedups over 10 datasets; bold line is the average, dashed lines are the standard deviation, bars are max/min. RESULT: basically linear or slightly better speedup.
Overall Results
Dynamic Run-Time Optimization Opportunity
Discussion
Really want a Macro/micro Map-Reduce
Extension to Other Areas of AI
Multi-Core here => Shared Memory, but… Map-Reduce spans the bridge to non-shared memory
Paintable Computing
Conclusions
References and Related Work
References
Related Work
Related Work Cont'd
Some Current Work, just for fun… Statistical models of vision using fast multi-grid techniques from physics
Questions
BACK UP
Directory Structure of Code
Basic Flow For All Algorithms
main(): create the training set and the algorithm instance; register the map-reduce functions with the Engine class; run the algorithm; test the result.
Algorithm::run(matrix x, matrix y): create the Engine instance & initialize; sequential computation 1 (initialize variables, etc.); Engine->run("Map-Reduce1"); sequential computation 2; Engine->run("Map-Reduce2"); etc.
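A minimal C++ skeleton of this control flow. The class and method names follow the slide; the types, signatures, and bodies are hypothetical stand-ins, not the project's actual code.

```cpp
#include <string>

// Hypothetical stand-ins for the framework types named on the slide.
struct Matrix { /* dense matrix, contents omitted */ };

class Engine {
public:
    // Dispatches one registered map-reduce pass by name
    // (thread creation and joining are shown on the following slides).
    void run(const std::string& map_reduce_id) { (void)map_reduce_id; }
};

class Algorithm {
public:
    virtual ~Algorithm() = default;

    // main() creates the training set and the algorithm instance,
    // registers the map-reduce functions, then calls run().
    void run(const Matrix& x, const Matrix& y) {
        Engine engine;                 // create Engine instance & initialize
        register_map_reduce(engine);   // register map-reduce functions with the Engine
        init(x, y);                    // sequential computation 1 (initialize variables, etc.)
        engine.run("Map-Reduce1");     // first parallel pass
        update();                      // sequential computation 2
        engine.run("Map-Reduce2");     // second parallel pass, and so on
    }

protected:
    virtual void register_map_reduce(Engine& engine) = 0;
    virtual void init(const Matrix& x, const Matrix& y) = 0;
    virtual void update() = 0;
};
```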
Basic Flow For All Algorithms (cont'd)
Engine::run(algorithm): create the master instance; create n threads (each executes Master::run()); join the n threads; return the master's result.
Basic Flow For All Algorithms (cont'd)
Master::run(map_id, matrix x, matrix y): find the Mapper instance; call map->run().
Mapper::run(algorithm, master, matrix x, matrix y): call map(x, y); then master->mapper_is_done().
Master::mapper_is_done(Mapper *map): get the intermediate result and accumulate it in a vector; find the Reducer instance; call reduce->run().
Basic Flow For All Algorithms (cont'd)
Reducer::run(algorithm, master, matrix x, matrix y): call reduce(intermediate_data); then master->reducer_is_done().
Master::reducer_is_done(): records that the reduce step has finished.
Master::get_result(): return reduce->get_result().
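The following hypothetical C++ interfaces illustrate the call sequence described on the last three slides: the Engine spawns threads, each Mapper reports to the Master, and the Master hands the accumulated intermediates to the Reducer. The names follow the slides; the signatures and types are assumptions.

```cpp
#include <vector>

struct Matrix { /* dense matrix, contents omitted */ };
using Intermediate = std::vector<double>;  // stand-in for one mapper's partial result

class Master;  // forward declaration

class Mapper {
public:
    virtual ~Mapper() = default;
    // Compute a partial result on this thread's data chunk.
    virtual Intermediate map(const Matrix& x, const Matrix& y) = 0;
    // Called from Master::run(): do the map, then notify the master.
    void run(Master& master, const Matrix& x, const Matrix& y);
};

class Reducer {
public:
    virtual ~Reducer() = default;
    // Combine all mappers' intermediate results into the final value.
    virtual Intermediate reduce(const std::vector<Intermediate>& partials) = 0;
};

class Master {
public:
    // Collects each mapper's intermediate result in a vector.
    void mapper_is_done(const Intermediate& partial) { partials_.push_back(partial); }
    // Once all mappers have reported, run the reducer and return its result.
    Intermediate get_result(Reducer& reducer) { return reducer.reduce(partials_); }
private:
    std::vector<Intermediate> partials_;
};

inline void Mapper::run(Master& master, const Matrix& x, const Matrix& y) {
    master.mapper_is_done(map(x, y));
}
```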
Where does parallelism take place? Sequential code, then MAP and REDUCE (the parallel phases), then sequential code again.
Input Data
ALGORITHM  DESCRIPTION
EM Algorithm
EM Algorithm Map-Reduce
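This slide's bullets did not survive extraction. As a reconstruction of the usual summation form for EM on a mixture of Gaussians, consistent with the framework above (though not necessarily the exact formulation on the original slide):

```latex
% E-step (per data point, computed independently inside each mapper):
r_{ik} = \frac{\pi_k \,\mathcal{N}(x_i \mid \mu_k, \Sigma_k)}
              {\sum_{j} \pi_j \,\mathcal{N}(x_i \mid \mu_j, \Sigma_j)}

% Map: each core computes partial sums over its chunk of the data:
N_k^{(p)} = \sum_{i \in \mathrm{chunk}\,p} r_{ik}, \qquad
S_k^{(p)} = \sum_{i \in \mathrm{chunk}\,p} r_{ik}\, x_i, \qquad
Q_k^{(p)} = \sum_{i \in \mathrm{chunk}\,p} r_{ik}\, x_i x_i^{\top}

% Reduce: add the partial sums; the M-step update is then purely local:
N_k = \sum_p N_k^{(p)}, \qquad
\mu_k = \frac{1}{N_k} \sum_p S_k^{(p)}, \qquad
\Sigma_k = \frac{1}{N_k} \sum_p Q_k^{(p)} - \mu_k \mu_k^{\top}, \qquad
\pi_k = N_k / m
```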
EM Algorithm Performance Analysis
Logistic Algorithm
Logistic Algorithm code
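The logistic-regression code slides were lost in extraction. Below is a hedged C++ sketch of how batch gradient-ascent logistic regression fits the map-reduce summation form: the map step computes a partial gradient over its data chunk, and the reduce step sums the partials and applies one batch update. The function names and plain-vector types are illustrative, not the project's actual code.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

static double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Map: partial gradient  sum_{i in chunk} (y_i - sigmoid(theta' x_i)) * x_i
static Vec map_gradient(const std::vector<Vec>& x_chunk, const Vec& y_chunk, const Vec& theta) {
    Vec g(theta.size(), 0.0);
    for (std::size_t i = 0; i < x_chunk.size(); ++i) {
        double h = 1.0 / (1.0 + std::exp(-dot(theta, x_chunk[i])));
        for (std::size_t j = 0; j < theta.size(); ++j)
            g[j] += (y_chunk[i] - h) * x_chunk[i][j];
    }
    return g;
}

// Reduce: add the partial gradients, then take one gradient-ascent step of size alpha.
static void reduce_and_update(Vec& theta, const std::vector<Vec>& partials, double alpha) {
    for (const Vec& g : partials)
        for (std::size_t j = 0; j < theta.size(); ++j)
            theta[j] += alpha * g[j];
}
```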
GDA Algorithm
GDA Algorithm code
K-means Algorithm
K-means Algorithm code
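The K-means code slides were also lost. Below is a hedged C++ sketch of one K-means iteration in summation form: the map step assigns each point in a data chunk to its nearest centroid and accumulates per-cluster coordinate sums and counts, and the reduce step merges the partial sums and recomputes the centroids. Types and names are illustrative; the sketch assumes a non-empty centroid list.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

using Point = std::vector<double>;

struct PartialStats {
    std::vector<Point>       sums;    // per-cluster coordinate sums
    std::vector<std::size_t> counts;  // per-cluster point counts
};

static double sq_dist(const Point& a, const Point& b) {
    double d = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) d += (a[i] - b[i]) * (a[i] - b[i]);
    return d;
}

// Map over one data chunk: nearest-centroid assignment plus partial sums.
static PartialStats map_assign(const std::vector<Point>& chunk, const std::vector<Point>& centroids) {
    const std::size_t k = centroids.size(), dim = centroids[0].size();
    PartialStats s{std::vector<Point>(k, Point(dim, 0.0)), std::vector<std::size_t>(k, 0)};
    for (const Point& p : chunk) {
        std::size_t best = 0;
        double best_d = std::numeric_limits<double>::max();
        for (std::size_t c = 0; c < k; ++c) {
            double d = sq_dist(p, centroids[c]);
            if (d < best_d) { best_d = d; best = c; }
        }
        for (std::size_t j = 0; j < dim; ++j) s.sums[best][j] += p[j];
        ++s.counts[best];
    }
    return s;
}

// Reduce: merge the partial statistics and produce the new centroids.
static std::vector<Point> reduce_centroids(const std::vector<PartialStats>& parts,
                                           std::size_t k, std::size_t dim) {
    std::vector<Point> centroids(k, Point(dim, 0.0));
    std::vector<std::size_t> counts(k, 0);
    for (const PartialStats& s : parts)
        for (std::size_t c = 0; c < k; ++c) {
            counts[c] += s.counts[c];
            for (std::size_t j = 0; j < dim; ++j) centroids[c][j] += s.sums[c][j];
        }
    for (std::size_t c = 0; c < k; ++c)
        if (counts[c] > 0)
            for (std::size_t j = 0; j < dim; ++j) centroids[c][j] /= counts[c];
    return centroids;
}
```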
NB Algorithm
NB Algorithm source code
PCA Algorithm
PCA Algorithm code
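The PCA slides' bullets were lost as well; the following reconstructs the standard summation form for PCA that fits this framework (not necessarily the exact notation of the original slide):

```latex
% Map: each core computes partial sums over its chunk of the m data points:
S^{(p)} = \sum_{i \in \mathrm{chunk}\,p} x_i x_i^{\top}, \qquad
u^{(p)} = \sum_{i \in \mathrm{chunk}\,p} x_i

% Reduce: add the partials, then form the mean and covariance locally:
\mu = \frac{1}{m} \sum_p u^{(p)}, \qquad
\Sigma = \frac{1}{m} \sum_p S^{(p)} - \mu \mu^{\top}

% The principal components are the top eigenvectors of \Sigma,
% computed serially, since \Sigma is only n x n.
```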
ICA Algorithm
ICA Algorithm code
LWLR Algorithm
LWLR Algorithm code
Editor's Notes
1. Our existing library (PNL) and future (statistical) machine learning libraries divide the pattern recognition space into modeless and model-based machine learning.
2. Map operations (also called Filters) operate on each record independently; consequently, they are trivially parallelizable. Typical Map operations are ones that check whether the record matches a condition, or process a record to extract information (like outgoing links from the web page). Aggregators come from a predefined set; all currently supported aggregators are associative and commutative. With aggregators, there are 3 possibilities: 1) the aggregator is a simple operation (like Max), so a sequential implementation suffices; 2) the aggregator can be easily parallelized, for instance counting occurrences per key, in which case the keys can be partitioned using a hash function and the counts can be performed in parallel; 3) the aggregator is difficult to parallelize, but since aggregators belong to a small set, each of them can be efficiently implemented in a parallel/distributed fashion and hidden from the programmer. Typical Reduce operations include (1) sorting its inputs, (2) collecting an unbiased sample of its inputs, (3) aggregating its inputs, (4) top N values, (5) quantiles (distribution) of its inputs, (6) unique values, (7) building a graph and performing a graph algorithm, (8) running a clustering algorithm.
3. Not sure if they use the Map-Reduce infrastructure for this, but they potentially could.
4. (1) Use a cluster to complete the job quickly. Parallelism: distribute subtasks to different machines to exploit parallelism. Locality: locate data and schedule subtasks close to the data; avoid heavily loaded machines. (2) Map phase: for each business, check if it is a restaurant and within 25 miles of RNB (execute the 10s of lines of code of the Map filter for each record). (3) Gather the list of restaurants from the different machines. Reduce phase: sort the list by distance (a common subset of Reduce operations, like "Sort", would be provided by the library); generate a web page with the results. At any stage: recover from faults (machine/disk/network crashes).
5. "Google Applications" here refers to the applications Google runs on their clusters and not newer desktop applications like Google Earth and Google Desktop. Similar model: Maps.google.com, Local.google.com, etc. Red: can use Map-Reduce, but might not get all the benefits because the Reduce operations can be quite complicated (clustering, graph algorithms); might have to parallelize explicitly if need be. Green: extremely parallel (by design, so that the response can be generated interactively); Map-Reduce is a good match. Blue: IO-bound (might not have indices to speed up the process); Map-Reduce is a good match.
6. The lookup applications (that respond to queries at their various sites: search, news, Froogle, etc.) appear to be their most cycle-consuming apps. One of the main metrics that Google uses to evaluate a platform: "Queries per Second per Watt".
7. The goal of the paintable computing project is to jointly develop millimeter-scale computing elements and techniques for self-assembly of distributed processes. The vision is a machine consisting of thousands of sand-grain-sized processing nodes, each fitted with modest amounts of processing and memory, positioned randomly and communicating locally. Individually, the nodes are resource poor, vanishingly cheap, and necessarily treated as freely expendable. Yet, when mixed together, they cooperate to form a machine whose aggregate compute capacity grows with the addition of more nodes. This approach ultimately recasts computing as a particulate additive to ordinary materials such as building materials or paint. The motivations for this work are low node complexity, ease of deployment, and tight coupling to dense sensor arrays, which all work to enable extraordinary scaling performance and to reduce the overall cost per unit compute capacity, in some cases dramatically. The challenge is the explosive complexity at the system level: the cumulative effect of the unbounded node count, inherent asynchrony, variable topology, and poor unit reliability. In this regime, the limiting problem has consistently been the lack of a programming model capable of reliably directing the global behavior of such a system. The strategy is to admit that we as human programmers will never be able to structure code for unrestricted use on this architecture, and to employ the principles of self-assembly to enable the code to structure itself.
8. In functional languages, map and fold (reduce) are fundamental operations (typically used in place of for and while loops).