Hipi: Computer Vision at Large Scale

Transcript of "Hipi: Computer Vision at Large Scale"

  1. HIPI: Computer Vision at Large Scale
     • Chris Sweeny
     • Liu Liu
  2. Intro to MapReduce
     • SIMD at Scale
     • Mapper / Reducer
  3. MapReduce, Main Takeaway
     • Data-centric, data-centric, data-centric!
  4. Hadoop, a Java Implementation
     • An implementation of MapReduce that originated at Yahoo!
     • The cluster we worked on has 625.5 nodes, with a map task capacity of 2502 and a reduce task capacity of 834
  5. Computer Vision at Scale
     • The “computational vision”
     • The sheer size of the datasets:
       • PCA of Natural Images (1992): 15 images, 4096 patches
       • High-perf Face Detection (2007): 75,000 samples
       • IM2GPS (2008): 6,472,304 images
  6. HIPI Workflow
  7. HIPI Image Bundle Setup
     • Moral of the story: many small files kill performance in a distributed file system.
  8. Redo PCA in Natural Images at Scale
     • The first 15 principal components with 15 images (Hancock, 1992)
  9. Redo PCA in Natural Images at Scale
     • Comparison:
       • Hancock, 1992
       • HIPI, 100
       • HIPI, 1,000
       • HIPI, 10,000
       • HIPI, 100,000
  10. Optimize HIPI Performance
     • Culling: because decompression is costly
     • Decompress only when needed
     • A boolean cull(ImageHeader header) method for conditional decompression
  11. Culling, to inspect specific camera effects
     • Canon Powershot S500, at 2592x1944
  12. HIPI, Glance at Performance Figures
     • An empty job (only decompressing and looping over images): 5 runs, minimum time reported, in seconds (lower is better)
  13. HIPI, Glance at Performance Figures
     • Im2gray job (converting images to grayscale): 5 runs, minimum time reported, in seconds (lower is better)
  14. HIPI, Glance at Performance Figures
     • Covariance job (computing the covariance matrix of patches, 100 patches per image): 1 to 3 runs*, minimum time reported, in seconds (lower is better)
  15. HIPI, Glance at Performance Figures
     • Culling job (decompressing all images vs. decompressing only the images we care about): 1 to 3 runs, minimum time reported, in seconds (lower is better)
  16. Conclusion
     • Everything at large scale gets better.
     • HIPI provides an image-centric interface that performs on par with or better than the leading alternative
     • The cull method provides significant improvement and convenience
     • HIPI offers noticeable improvements!
  17. Future Work
     • Release HIPI as an open-source project.
     • Work on deep integration with Hadoop.
     • Make the HIPI workload more configurable.
     • Make the workload more balanced.
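
The culling idea from slides 10 and 11 can be sketched roughly as follows. This is a minimal, self-contained illustration and not HIPI's actual code: the `ImageHeader` and `CompressedImage` classes, the `decode` placeholder, and the `processBundle` loop are all hypothetical stand-ins, and the convention that `cull` returning `true` means "skip this image" is an assumption, since the slides only give the method signature. The point is that the header is inspected first, and the costly decompression runs only for images that pass the check (here, photos from a Canon Powershot S500 at 2592x1944).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an image header: metadata that can be read
// without decompressing the pixel data.
final class ImageHeader {
    final String cameraModel;
    final int width;
    final int height;

    ImageHeader(String cameraModel, int width, int height) {
        this.cameraModel = cameraModel;
        this.width = width;
        this.height = height;
    }
}

// Hypothetical pairing of a header with its still-compressed payload.
final class CompressedImage {
    final ImageHeader header;
    final byte[] payload;

    CompressedImage(ImageHeader header, byte[] payload) {
        this.header = header;
        this.payload = payload;
    }
}

final class CullingSketch {
    // Counts how often the (simulated) expensive decode step runs.
    static int decodeCount = 0;

    // Assumed convention: returning true means "cull", i.e. skip the image.
    static boolean cull(ImageHeader header) {
        boolean wanted = "Canon Powershot S500".equals(header.cameraModel)
                && header.width == 2592
                && header.height == 1944;
        return !wanted;
    }

    // Placeholder for costly decompression; it only returns the pixel count.
    static long decode(CompressedImage image) {
        decodeCount++;
        return (long) image.header.width * image.header.height;
    }

    // Walks a bundle of images and decodes only those that are not culled.
    static List<ImageHeader> processBundle(List<CompressedImage> bundle) {
        List<ImageHeader> processed = new ArrayList<>();
        for (CompressedImage image : bundle) {
            if (cull(image.header)) {
                continue; // header check failed: never touch the payload
            }
            decode(image);
            processed.add(image.header);
        }
        return processed;
    }
}
```

Because the decision is made on header fields alone, the cost of a culled image is limited to reading its metadata, which is consistent with the slides' motivation that decompression is the expensive step.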