Easily organizing "almost the same" images

Slides presented at the YAPC::Asia 2012 LT (day 2).

Published in: Technology

  1. Easily organizing "almost the same" images (To throw "almost the same" pictures into the trash bin casually). 2012-09-29, LT (day 2) @ YAPC::Asia 2012, @turugina
  2. What I talked about at yesterday's LT-thon: making a Perl script to download **** images :D
  3. Based on that...
  4. Do you have any photos like these? JPEG quality 85 (original) vs. 50 (different JPEG quality). http://www.gatag.net/07/02/2009/090000.html
  5. Like these? 450×299 (original) vs. 160×106 (different size). https://twitter.com/lestrrat
  6. Or like these? Without glasses vs. with glasses. らとみく | あらたとしひら http://p.tl/i/18558811
  7. Let's throw the unnecessary one away. JPEG quality 85 (original) vs. 50 (different JPEG quality). http://www.gatag.net/07/02/2009/090000.html
  8. Throw the unnecessary one away... 450×299 (original) vs. 160×106 (different size). https://twitter.com/lestrrat
  9. Oh, I want to keep both of them. :) Without glasses vs. with glasses. らとみく | あらたとしひら http://p.tl/i/18558811
  10. But how can we do that?
  11. One answer: do it by eye and by hand, for every target picture.
  12. "Are you kidding? I have tens of thousands of image files on hand!"
  13. "Very well, then let's make a 'similar image search system'."
  14. Previous research on "similar image search" algorithms
      ● Compare pixel by pixel (Image::Compare?)
      ● By (reduced) color histogram
      ● By extracted outlines of the image
      ● By representative projection vectors of fractal image compression
      ● By characteristic values of divided regions of the image (average RGB, hue, saturation, value, and so on)
  15. Previous research on "similar image search" algorithms
      ● Compare pixel by pixel (Image::Compare?)
      ● By (reduced) color histogram
      ● By extracted outlines of the image
      ● By representative projection vectors of fractal image compression
      ● By characteristic values of divided regions of the image (average RGB, hue, saturation, value, and so on)
  16. Algorithm (1/3): Divide the image into regions (3x3, for example)
  17. Algorithm (2/3): Calculate the characteristic value of each region
        0.7 0.3 1.0      0.7 0.4 0.9
        0.8 0.7 0.8      0.7 0.7 0.7
        0.1 0.2 0.2      0.2 0.2 0.2
  18. Algorithm (3/3): Calculate the RMSE (root mean square error) between the characteristic values of the two images
        0.7 0.3 1.0      0.7 0.4 0.9
        0.8 0.7 0.8      0.7 0.7 0.7
        0.1 0.2 0.2      0.2 0.2 0.2
      RMSE: 0.0 (same) < ~0.0x (resembled) <<<<<<< 1.0 (different)
  19. Implementation
      ● Image::Characteristics (not on CPAN)
        – lp:~turugina/+junk/p5-image-characteristics
        – Uses the Imager API from XS
      ● Samples (contain no pictures)
        – lp:~turugina/+junk/img_detect
          ● gather.pl (gathers files using File::Find and calculates characteristic values)
          ● matching.pl (makes pairs of suspicious files)
          ● rmse.pl (calculates the RMSE of pairs)
          ● web.pl (GUI using Mojolicious::Lite)
  20. DEMO
  21. Thank you.
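
The three algorithm steps on slides 16 to 18 can be sketched roughly as follows. This is a minimal illustration in Python, not the Image::Characteristics code from the talk. It assumes each image is already available as a 2D list of brightness values between 0.0 and 1.0 and uses the mean brightness of each region as its characteristic value; the slides only say "average RGB, hue, saturation, value, and so on". The function names, the two example grids (read off the example slide) and the 0.1 cutoff are all assumptions made for this sketch.

```python
import math


def region_means(pixels, rows=3, cols=3):
    """Steps 1 and 2: split a 2D grid of brightness values into
    rows x cols regions and return the mean value of each region."""
    height, width = len(pixels), len(pixels[0])
    if height < rows or width < cols:
        raise ValueError("image is smaller than the region grid")
    means = []
    for r in range(rows):
        y0, y1 = r * height // rows, (r + 1) * height // rows
        row = []
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        means.append(row)
    return means


def rmse(a, b):
    """Step 3: root mean square error between two grids of
    characteristic values that have the same shape."""
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("grids must have the same number of regions")
    squared = sum((x - y) ** 2 for x, y in zip(flat_a, flat_b))
    return math.sqrt(squared / len(flat_a))


def is_similar(a, b, threshold=0.1):
    """Hypothetical cutoff: the slides only say resembling images
    score around 0.0x, so 0.1 is an assumption, not a tuned value."""
    return rmse(a, b) < threshold


# Example characteristic values as read from the slide (3x3 regions).
GRID_A = [[0.7, 0.3, 1.0], [0.8, 0.7, 0.8], [0.1, 0.2, 0.2]]
GRID_B = [[0.7, 0.4, 0.9], [0.7, 0.7, 0.7], [0.2, 0.2, 0.2]]

score = rmse(GRID_A, GRID_B)
```

For the two example grids the score comes out at about 0.075, which sits in the "~0.0x (resembled)" band of slide 18. Because only the per-region values are compared, two files with different pixel dimensions or JPEG quality can still be matched once each has been reduced to the same 3x3 grid.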
