Project Progress
What we’ve been doing(1)
 • Hacking Hadoop API.
 • Writing different kinds of programs to
   understand it. (Not CV programs)
 • Adaboost
 • SIFT, SURF
 • Reading, Reading
Segmentation
• Segmentation with overlap.
• Get SIFT/SURF descriptors for the partial segments.
• Reduce the number of descriptors by grouping them.
• Regions of interest (ROI), positive and negative.
• Count the frequency of occurrence of visual words.
• AdaBoost.
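The "count the frequency of occurrence of visual words" step can be sketched as below. This is only a minimal illustration, assuming the grouped descriptors are already available as a codebook of centres; the names `nearest_word` and `word_histogram` and the toy 2-D data are made up for the example (real SIFT descriptors are 128-D).

```python
from collections import Counter


def nearest_word(descriptor, codebook):
    """Return the index of the codebook entry closest to the descriptor."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(descriptor, codebook[k]))


def word_histogram(descriptors, codebook):
    """Count how often each visual word occurs among the descriptors."""
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    return [counts.get(k, 0) for k in range(len(codebook))]


# Hypothetical toy data: three visual words, four descriptors from one ROI.
codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
descriptors = [[0.1, 0.2], [0.9, 1.1], [1.2, 0.8], [4.8, 5.1]]
histogram = word_histogram(descriptors, codebook)
```

A histogram like this, one per region of interest, is the kind of fixed-length feature vector a classifier such as AdaBoost could then be trained on.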
Methodology

• For simplicity, assume the same image is
  stored on all slave nodes.
• Use ROI to run the algorithm.
• Hopefully this will make the “Reduce”
  step easier.
Map-Reduce???
• It’s just a framework
• You can also implement it by reading the
  paper[1]. :)
• Hadoop is one implementation. (Apache +
  Yahoo)
• Google’s implementation is not made
  public.
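To show that it really is just a framework, here is a minimal in-memory sketch of the map, group-by-key, and reduce phases. It is not Hadoop's API; `map_reduce`, `count_mapper`, and `sum_reducer` are hypothetical names, and everything runs in a single process.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run mapper over every record, group values by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # "shuffle": collect values per key
    return {key: reducer(key, values) for key, values in groups.items()}


def count_mapper(line):
    """Emit (word, 1) for every word in a line of text."""
    for word in line.split():
        yield word.lower(), 1


def sum_reducer(key, values):
    """Add up all the counts emitted for one word."""
    return sum(values)


lines = ["Hadoop is one implementation", "MapReduce is a framework"]
counts = map_reduce(lines, count_mapper, sum_reducer)
```

A real implementation adds what this sketch leaves out: distributing the records across machines, moving grouped values over the network, and recovering from failed nodes.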
Map-Reduce for Machine
 Learning on Multi-core
Introduction

• Algorithms that fit the Statistical Query Model
  can be written in a certain “summation
  form”.
• Divide the data set into as many pieces as
  the number of cores.
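As a rough sketch of the "summation form" idea: when a quantity is a sum over training examples, each piece of the data can be summed separately and the partial results added. The example below uses the gradient of a 1-D least-squares model as an assumed stand-in; `split` and `partial_gradient` are hypothetical helpers, and the chunks are processed sequentially rather than on real cores.

```python
def split(data, n_pieces):
    """Split data into n_pieces contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), n_pieces)
    chunks, start = [], 0
    for i in range(n_pieces):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def partial_gradient(chunk, theta):
    """Sum of (theta * x - y) * x over one chunk of (x, y) pairs."""
    return sum((theta * x - y) * x for x, y in chunk)


# Hypothetical toy data following y = 2x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0), (5.0, 10.0)]
theta = 1.0
n_cores = 2

# "Map": one partial sum per piece. "Reduce": add the partial sums.
partials = [partial_gradient(chunk, theta) for chunk in split(data, n_cores)]
gradient = sum(partials)

# The same quantity computed over the whole data set in one pass.
full = partial_gradient(data, theta)
```

The result is the same as a single pass over the whole data set, which is why the work can be spread over cores without changing the algorithm.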
Algorithms(1)
• Locally Weighted Linear Regression
• Naive Bayes
• Gaussian Discriminative Analysis
• k-means
• Logistic Regression
• Neural Network
Algorithms(2)

• Principal Components Analysis
• Independent Components Analysis
• Expectation Maximization
• Support Vector Machine
Example (LWLR)

Divide the computation among different mappers, each computing partial sums for A and b.

Two reducers sum up the partial values for A and b, and the solution is finally computed.
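The slide's equations did not survive, so the sketch below assumes the usual LWLR normal equations: A = Σ wᵢ xᵢ xᵢᵀ, b = Σ wᵢ xᵢ yᵢ, then solve Aθ = b. The Gaussian weight with bandwidth `tau`, the `[1, x]` feature vector, and all function names are assumptions made for illustration, not taken from the slides.

```python
import math


def weight(x_i, query, tau):
    """Assumed Gaussian weight: examples near the query point count more."""
    return math.exp(-((x_i - query) ** 2) / (2.0 * tau ** 2))


def map_partial(chunk, query, tau):
    """Mapper: partial A (2x2) and b (length 2) over one chunk, features [1, x]."""
    A = [[0.0, 0.0], [0.0, 0.0]]
    b = [0.0, 0.0]
    for x_i, y_i in chunk:
        w = weight(x_i, query, tau)
        f = (1.0, x_i)
        for r in range(2):
            b[r] += w * f[r] * y_i
            for c in range(2):
                A[r][c] += w * f[r] * f[c]
    return A, b


def reduce_A(parts):
    """Reducer 1: element-wise sum of the partial A matrices."""
    return [[sum(p[r][c] for p in parts) for c in range(2)] for r in range(2)]


def reduce_b(parts):
    """Reducer 2: element-wise sum of the partial b vectors."""
    return [sum(p[r] for p in parts) for r in range(2)]


def solve_2x2(A, b):
    """Solve A * theta = b for a 2x2 system with Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    t0 = (b[0] * A[1][1] - A[0][1] * b[1]) / det
    t1 = (A[0][0] * b[1] - A[1][0] * b[0]) / det
    return t0, t1


# Hypothetical toy data on the line y = 2x + 1, split across two "mappers".
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
chunks = [data[:2], data[2:]]
query, tau = 1.5, 1.0

mapped = [map_partial(chunk, query, tau) for chunk in chunks]
A = reduce_A([m[0] for m in mapped])
b = reduce_b([m[1] for m in mapped])
theta = solve_2x2(A, b)
prediction = theta[0] + theta[1] * query
```

Because the toy data lie exactly on a line, the fit recovers that line whatever the weights are, which makes the result easy to check by hand.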
Experiment Result
• Used data sets from the UCI Machine Learning Repository.
• Used only 2 cores.
• 1.9 times faster.
• 54 times speedup on 64 cores.
• Speedup is achieved only by “throwing cores”
  at the problem.
