MobiCom on Android: Word Segmentation & Matching in Vision-Based Nutrition Information Extraction

Published in: Software

  1. Word Segmentation & Matching in Vision-Based Nutrition Information Extraction
     Vladimir Kulyukin
     www.youtube.com/vkedco www.vkedco.blogspot.com
  2. Outline
     ● Character & Word Segmentation
     ● Midline & Baseline Detection
     ● Image Filtering with Average & Gaussian Filters
     ● Intra- & Inter-Word Gaps
     ● Word Blob Representation & Matching
  3. Back to the Big Picture
  4. Character & Word Segmentation
     ● Character segmentation is the decomposition of an image of a sequence of characters into individual character symbols
     ● Word segmentation is the decomposition of an image of a sequence of words into word blobs
     ● Character segmentation is more generic than word segmentation because it can potentially recognize more words: it is applicable to unlimited or very large lexicons
     ● Word segmentation is less generic but simpler: it is applicable to limited lexicons
  5. Topline, Midline, Baseline, Beardline
  6. Horizontal Projection of Segmented Lines
     The red line is the horizontal projection of black pixels
  7. Midline & Baseline Detection
     The midline and baseline are detected by finding peaks in the horizontal projection (HP)
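A minimal sketch of the horizontal projection and one possible way to read the midline and baseline from it. The slides do not specify the peak detector, so the heuristic here is an assumption: rows whose projection reaches at least half of the maximum are treated as the middle zone, and the first and last such rows are taken as the midline and baseline. The function names, the `frac` parameter, and the toy image are hypothetical.

```python
def horizontal_projection(img):
    """Count black pixels (value 1) in every row of a binary image."""
    return [sum(row) for row in img]


def mid_and_base_lines(img, frac=0.5):
    """Return (midline_row, baseline_row) for one text line.

    Assumed heuristic, not taken from the slides: the dense band of rows
    whose projection is at least frac * max(HP) is the middle zone.
    """
    hp = horizontal_projection(img)
    cutoff = frac * max(hp)
    dense = [y for y, count in enumerate(hp) if count >= cutoff]
    return dense[0], dense[-1]


# Toy text line: sparse ascender rows, dense x-height rows, sparse descender row.
LINE = [
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1, 0],
    [1, 1, 1, 1, 0, 1],
    [0, 0, 0, 1, 0, 0],
]

print(horizontal_projection(LINE))  # [1, 1, 5, 4, 5, 1]
print(mid_and_base_lines(LINE))     # (2, 4)
```

On real label images the projection is noisier, so the fraction would likely need tuning.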
  8. Vertical Projection & Gaps
     1) Let VP(I) be the vertical projection of the middle zone
     2) VP(I) = 0 for both intra-word and inter-word gaps
     Question: How do we distinguish intra-word gaps from inter-word gaps?
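To make the question concrete, here is a small sketch that computes a vertical projection and lists its zero runs. It assumes a binary middle zone stored as a list of rows with ink = 1; the function names and the sample projection are hypothetical.

```python
def vertical_projection(zone):
    """Count black pixels (value 1) in every column of a binary image."""
    return [sum(col) for col in zip(*zone)]


def zero_runs(vp):
    """Return (start, length) for each maximal run of zeros in vp."""
    runs, start = [], None
    for x, value in enumerate(vp):
        if value == 0 and start is None:
            start = x                       # a gap opens here
        elif value != 0 and start is not None:
            runs.append((start, x - start))  # the gap closes
            start = None
    if start is not None:                    # gap runs to the right edge
        runs.append((start, len(vp) - start))
    return runs


# Hypothetical projection: a 1-column gap and a 3-column gap.
VP = [3, 2, 0, 4, 0, 0, 0, 5, 1]
print(zero_runs(VP))  # [(2, 1), (4, 3)]
```

Both runs have VP = 0, so the projection alone only tells us where the gaps are, not which kind they are; the width of each run is the extra signal.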
  9. Filtering
     ● Frequency domain analysis decomposes an image into its frequency content
     ● Low frequency means slow variation of image intensities
     ● High frequency means rapid variation of image intensities
     ● A filter is an operation that amplifies a certain band of frequencies and reduces other frequency bands
  10. Average Filter
      ● Replace each pixel by the average value of the pixels around it
      ● The most common masks are 3 x 3 and 5 x 5 (the weighted sum is divided by the number of mask entries, 9 or 25)

        1 1 1      1 1 1 1 1
        1 1 1      1 1 1 1 1
        1 1 1      1 1 1 1 1
                   1 1 1 1 1
                   1 1 1 1 1
  11. 5 x 5 Average Filter
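The average filter can be sketched in a few lines of plain Python. This is an illustrative version, not the deck's implementation: the function name is hypothetical, and the border handling (averaging only over neighbours that fall inside the image) is an assumption, since the slides do not say how borders are treated.

```python
def average_filter(img, k=3):
    """Replace each pixel by the mean of its k x k neighbourhood.

    Assumption: at the borders the mean is taken over the neighbours
    that actually exist, so the mask shrinks near the edges.
    """
    h, w, r = len(img), len(img[0]), k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = sum(window) / len(window)
    return out


# A single bright pixel is spread over its neighbours.
SPOT = [
    [0, 0, 0],
    [0, 9, 0],
    [0, 0, 0],
]
blurred = average_filter(SPOT, k=3)
print(blurred[1][1])  # 1.0  (9 spread over the full 3 x 3 mask)
print(blurred[0][0])  # 2.25 (9 spread over the 4 pixels available in the corner)
```

Passing `k=5` gives the 5 x 5 variant.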
  12. Gaussian Filter
      ● Gaussian filters weigh each pixel by its distance from the center pixel: the closer a pixel is to the center, the larger its weight
      ● A sample 3 x 3 Gaussian:
        .00761 .036075 .10959 .21345 .2666 .21345 .10959 .03608 .00761
  13. 5 x 5 Gaussian Filter
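For comparison with the average mask, a Gaussian mask can be generated from the standard 2-D Gaussian formula and normalized so that its weights sum to 1. This is a sketch under stated assumptions: the function name and the default `sigma` are hypothetical, and the resulting weights are not meant to reproduce the sample values quoted on the slide.

```python
import math


def gaussian_kernel(size=3, sigma=1.0):
    """Build a size x size Gaussian mask whose weights sum to 1."""
    r = size // 2
    raw = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            for x in range(-r, r + 1)]
           for y in range(-r, r + 1)]
    total = sum(map(sum, raw))
    return [[value / total for value in row] for row in raw]


kernel = gaussian_kernel(3, sigma=1.0)
for row in kernel:
    print(" ".join(f"{value:.4f}" for value in row))
```

The center weight is the largest, edge weights are smaller, and corner weights are the smallest, which is the distance weighting the slide describes. Calling `gaussian_kernel(5)` gives a 5 x 5 mask.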
  14. Back to the Gaps Question
      1) Let VP(I) be the vertical projection of the middle zone
      2) VP(I) = 0 for both intra-word and inter-word gaps
      Question: How do we distinguish intra-word gaps from inter-word gaps?
  15. Back to the Gaps Question
      1) Take a blurring filter (e.g., a 3 x 3 average filter) and blur the middle zone
      2) Invert black and white pixels
      3) Compute the VP of black pixels
      4) Threshold the VP to determine inter-word gaps
  16. Word Segmentation Algorithm
      ● Take a text chunk (assume that it is a line)
      ● Determine the midline and baseline
      ● Use a blur filter (average or Gaussian) to blur the middle zone (the zone between the midline and baseline)
      ● Determine inter-word gaps and use them to segment word blobs
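The steps above can be sketched end to end on a toy middle zone. Assumptions, none of which come from the slides: the zone is a binary list of rows with ink = 1 (so no inversion step is needed), the blur is a 3 x 3 average with shrinking borders, and a column counts as an inter-word gap when its blurred projection is at or below `thresh`. All names are hypothetical.

```python
def box_blur(img, k=3):
    """k x k average filter; borders average over existing neighbours."""
    h, w, r = len(img), len(img[0]), k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = sum(window) / len(window)
    return out


def segment_words(zone, k=3, thresh=0.0):
    """Return [start, end) column ranges of word blobs in the middle zone."""
    blurred = box_blur(zone, k)
    vp = [sum(col) for col in zip(*blurred)]   # vertical projection
    words, start = [], None
    for x, value in enumerate(vp):
        if value > thresh and start is None:
            start = x                          # a word blob begins
        elif value <= thresh and start is not None:
            words.append((start, x))           # an inter-word gap begins
            start = None
    if start is not None:
        words.append((start, len(vp)))
    return words


# Two "letters" one column apart, a four-column gap, then another word.
ZONE = [
    [1, 0, 1, 0, 0, 0, 0, 1, 1],
    [1, 0, 1, 0, 0, 0, 0, 1, 1],
]
print(segment_words(ZONE))  # [(0, 4), (6, 9)]
```

The blur fills the one-column intra-word gap, so the first two letters stay in one blob, while the wide gap survives as zeros. The reported ranges are one column wider on each side than the ink because the blur spreads it.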
  17. Word Blob Representation
      ● Word blobs can be represented as [R, I], where R is the height-to-width ratio and I is the blob's grayscale pixel data
      ● Each blob can be scaled down so that the height of each template image is X pixels
      ● Example: Blobs can be scaled down using bilinear interpolation, i.e., each output pixel in the scaled-down image is the weighted average of the neighboring 2 x 2 input pixels
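A minimal sketch of building an [R, I] pair with bilinear scaling, written in plain Python for illustration. The function names, the pixel-center coordinate mapping, and the rounding of the target width are assumptions; on Android a platform scaling routine would normally be used instead.

```python
def scale_bilinear(img, new_h, new_w):
    """Resize a grayscale image (list of rows) with bilinear interpolation."""
    h, w = len(img), len(img[0])
    out = []
    for j in range(new_h):
        # Map the output pixel centre back into source coordinates.
        sy = min(max((j + 0.5) * h / new_h - 0.5, 0.0), h - 1.0)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for i in range(new_w):
            sx = min(max((i + 0.5) * w / new_w - 0.5, 0.0), w - 1.0)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            # Weighted average of the 2 x 2 neighbours.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def make_blob(img, target_h):
    """Return [R, I]: height-to-width ratio and the image scaled to target_h rows."""
    h, w = len(img), len(img[0])
    new_w = max(1, round(target_h * w / h))    # keep the aspect ratio
    return [h / w, scale_bilinear(img, target_h, new_w)]


print(scale_bilinear([[0, 10], [20, 30]], 1, 1))  # [[15.0]]

ratio, pixels = make_blob([[7] * 8 for _ in range(4)], target_h=2)
print(ratio, len(pixels), len(pixels[0]))         # 0.5 2 4
```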
  18. Word Blob Template Matching
      ● Let a word blob obtained in an image be represented as Bw = [Rw, Iw]
      ● Let TLib be a template library that consists of pre-computed template vectors: {B1, B2, ..., Bn}
      ● Matching a word blob against the TLib templates is a two-stage process:
        – Compare ratios
        – If the ratios are comparable, compare images
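The two-stage match can be sketched as follows. The slides do not say how ratios or images are compared, so everything specific here is an assumption: ratios count as comparable when they differ by at most `ratio_tol` (relative), images are compared by mean absolute pixel difference, and templates of a different size are skipped. The library contents and all names are hypothetical.

```python
def match_blob(blob, tlib, ratio_tol=0.15, max_diff=40.0):
    """Return the name of the best-matching template, or None."""
    rw, iw = blob
    best_name, best_score = None, max_diff
    for name, (rt, it) in tlib.items():
        # Stage 1: cheap ratio check rules out most templates.
        if abs(rw - rt) > ratio_tol * rt:
            continue
        # Assumption: both images were scaled to the same size beforehand.
        if len(iw) != len(it) or len(iw[0]) != len(it[0]):
            continue
        # Stage 2: mean absolute difference between pixel values.
        count = len(iw) * len(iw[0])
        score = sum(abs(a - b)
                    for row_w, row_t in zip(iw, it)
                    for a, b in zip(row_w, row_t)) / count
        if score < best_score:
            best_name, best_score = name, score
    return best_name


# Tiny hypothetical library: name -> (ratio, grayscale pixels).
TLIB = {
    "fat": (1.0, [[0, 255], [255, 0]]),
    "sugars": (1.0, [[255, 0], [0, 255]]),
    "calories": (0.3, [[0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0]]),
}

print(match_blob([1.05, [[10, 250], [240, 5]]], TLIB))  # fat
print(match_blob([2.0, [[10, 250], [240, 5]]], TLIB))   # None
```

Checking ratios first keeps the more expensive pixel comparison off templates that cannot match.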
  19. Template Library
  20. Can OCR Engines Be Used?
      ● Yes, they can
      ● But be prepared to deal with recognition errors
      ● There are two ways of dealing with these errors:
        – spelling corrections
        – improving image quality
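As an illustration of the spelling-correction route, a small lexicon of expected label words can be searched for the entry closest to an OCR token. This is a sketch under assumptions: Levenshtein edit distance is one common choice rather than the method named in the slides, the lexicon is hypothetical, and the acceptance rule (distance at most half the token length) is arbitrary.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


# Hypothetical lexicon of words expected on a nutrition label.
LEXICON = ["nutrition", "facts", "serving", "size",
           "calories", "total", "fat", "vitamin"]


def correct(token, lexicon=LEXICON):
    """Return the closest lexicon word, or None if nothing is close enough."""
    word = token.lower()
    best = min(lexicon, key=lambda entry: edit_distance(word, entry))
    # Assumed acceptance rule: at most half of the characters may differ.
    return best if edit_distance(word, best) <= len(word) // 2 else None


print(correct("Caiones"))  # calories
print(correct("Sewmg"))    # None
```

The second call shows the limit of this approach: a heavily garbled token is too far from every lexicon word, which is where improving image quality has to take over.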
  21. Tesseract Experiments

      Image Text                            Tesseract Output
      Nutrition Facts                       Nutrition Facts
      Serving size ¾ cup (32g)              Sewmg SIZE 3X4 um 132g)
      Servings Per Container about 13       Semngs Ferfinnmmevahnm I3
      Calories 160 calories from fat 40     lialmtss 160 Caiones 1mm Fa1 4U
      Total Fat 4.5g 7%                     TMil Fill 5g 7%
      Monounsaturated Fat 1g                Finmunsaturaled Far 1n
      vitamin A 0% . Vitamin C 0%           lvrramin A 0% I Vrtannn l) 0%
      Amount Per Serving                    " Blutllt Cereal trawl
      Cereal with ½ cup Fat Free Milk       unnrusmrq Bani Fntmell
  22. References
      ● https://code.google.com/p/tesseract-ocr/
