What we’ve been doing (1)
• Hacking on the Hadoop API.
• Writing different kinds of programs to understand it (not CV programs).
• AdaBoost
• SIFT, SURF
• Reading, reading
Segmentation
• Segment the image into overlapping ROIs.
• Get SIFT/SURF descriptors for the partial segments.
• Reduce the number of descriptors by grouping them.
• Label regions of interest (positive & negative).
• Count the frequency of occurrence of visual words.
• Train AdaBoost.
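The "count the frequency of occurrence of visual words" step above can be sketched as follows. This is only an illustration, assuming a precomputed codebook of visual words (e.g. from clustering SIFT/SURF descriptors); all function names and the toy data are hypothetical, not from the slides.

```python
# Sketch: assign each descriptor in an ROI to its nearest visual word
# and count word frequencies. Codebook and data are toy examples.

def nearest_word(descriptor, codebook):
    """Index of the codebook word closest to the descriptor (squared L2)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: dist2(descriptor, codebook[i]))

def word_histogram(descriptors, codebook):
    """Frequency of occurrence of each visual word in one ROI."""
    hist = [0] * len(codebook)
    for d in descriptors:
        hist[nearest_word(d, codebook)] += 1
    return hist

# Toy 2-D "descriptors" and a 3-word codebook:
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.2), (0.9, 1.1), (5.2, 4.8), (4.9, 5.1)]
print(word_histogram(descriptors, codebook))  # [1, 1, 2]
```

The resulting per-ROI histograms are what a boosting stage such as AdaBoost would then consume as feature vectors.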
Methodology
• For simplicity, assume the same image is stored on all slave nodes.
• Run the algorithm on the ROIs.
• Hopefully this will make the “Reduce” step easier.
Map-Reduce???
• It’s just a framework.
• You can also implement it yourself by reading the paper [1]. :)
• Hadoop is one implementation (Apache + Yahoo).
• Google’s implementation is not public.
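To see why it’s “just a framework”, here is the classic word-count example written as a map phase and a reduce phase in plain Python, with the shuffle step simulated by sorting and grouping. This is a minimal sketch, not Hadoop code.

```python
# Word count as map-reduce: mappers emit (word, 1) pairs, the shuffle
# groups pairs by key, and reducers sum each group.
from itertools import groupby

def map_phase(documents):
    # Mapper: emit a (word, 1) pair for every word in every document.
    return [(word, 1) for doc in documents for word in doc.split()]

def reduce_phase(pairs):
    # Shuffle: sort/group by key; reduce: sum the counts in each group.
    pairs = sorted(pairs)
    return {key: sum(v for _, v in group)
            for key, group in groupby(pairs, key=lambda kv: kv[0])}

docs = ["hadoop maps and reduces", "google maps"]
print(reduce_phase(map_phase(docs)))
# {'and': 1, 'google': 1, 'hadoop': 1, 'maps': 2, 'reduces': 1}
```

Hadoop provides the same two hooks (map and reduce) plus the distributed plumbing: splitting input, shuffling pairs between nodes, and handling failures.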
Map-Reduce for Machine
Learning on Multi-core
Introduction
• Algorithms fitting the Statistical Query Model may be written in a certain “summation form”.
• Divide the data set into as many pieces as the number of cores.
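The two bullets above can be sketched concretely: any statistic that is a sum over the data can be split into one partial sum per core and combined at the end. The example below computes a mean this way; the “mappers” run sequentially here for simplicity, and all names are illustrative.

```python
# "Summation form" sketch: split the data into per-core pieces, have each
# piece produce a partial sum (map), then combine the partials (reduce).

def split(data, n_pieces):
    """Divide the data set into as many pieces as the number of cores."""
    k, r = divmod(len(data), n_pieces)
    pieces, start = [], 0
    for i in range(n_pieces):
        end = start + k + (1 if i < r else 0)
        pieces.append(data[start:end])
        start = end
    return pieces

def partial_sum(piece):           # map: runs independently on each core
    return (sum(piece), len(piece))

def reduce_mean(partials):        # reduce: combine the partial sums
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count

data = list(range(1, 11))         # 1..10, mean = 5.5
partials = [partial_sum(p) for p in split(data, 4)]
print(reduce_mean(partials))      # 5.5
```

Because only the small per-piece partials cross core boundaries, the work parallelizes with essentially no communication until the final reduce.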
Algorithms (2)
• Principal Component Analysis
• Independent Component Analysis
• Expectation Maximization
• Support Vector Machine
Example (LWLR)
• Divide the computation among different mappers to compute the partial sums for A = Σᵢ wᵢ xᵢxᵢᵀ and b = Σᵢ wᵢ xᵢyᵢ.
• 2 reducers sum up the partial values for A and b and finally compute the solution θ = A⁻¹b.
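The LWLR example above can be sketched end to end. Each mapper accumulates its piece’s contribution to A = Σ wᵢ xᵢxᵢᵀ and b = Σ wᵢ xᵢyᵢ, and the reducer adds the partials and solves A·θ = b. This toy version uses pure Python with a 2-D feature vector x = (1, x₁), so A is 2×2 and can be solved by Cramer’s rule; the data and helper names are illustrative.

```python
# LWLR in summation form: mappers compute partial A and b over their
# data pieces; the reducer sums them and solves A * theta = b.

def mapper(piece):
    """Partial sums over one piece of (x1, y, w) triples."""
    A = [[0.0, 0.0], [0.0, 0.0]]
    b = [0.0, 0.0]
    for x1, y, w in piece:
        x = (1.0, x1)                       # intercept feature
        for r in range(2):
            for c in range(2):
                A[r][c] += w * x[r] * x[c]  # w_i * x_i x_i^T
            b[r] += w * x[r] * y            # w_i * x_i y_i
    return A, b

def reducer(partials):
    """Sum the partial A and b, then solve the 2x2 system A*theta = b."""
    A = [[0.0, 0.0], [0.0, 0.0]]
    b = [0.0, 0.0]
    for Am, bm in partials:
        for r in range(2):
            for c in range(2):
                A[r][c] += Am[r][c]
            b[r] += bm[r]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    theta0 = (b[0] * A[1][1] - b[1] * A[0][1]) / det
    theta1 = (A[0][0] * b[1] - A[1][0] * b[0]) / det
    return theta0, theta1

# Toy data on the line y = 2x + 1, unit weights, split across two "mappers".
piece1 = [(0.0, 1.0, 1.0), (1.0, 3.0, 1.0)]
piece2 = [(2.0, 5.0, 1.0), (3.0, 7.0, 1.0)]
theta = reducer([mapper(piece1), mapper(piece2)])
print(theta)  # (1.0, 2.0): intercept 1, slope 2
```

In real LWLR the weights wᵢ depend on the query point, so this map-reduce pass is repeated per query; the structure of the partial sums stays the same.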
Experiment Result
• Used datasets from the UCI Machine Learning repository.
• With only 2 cores: 1.9x faster.
• 54x speed-up on 64 cores.
• The speed-up is achieved just by “throwing cores” at the problem, with no change to the algorithms.