Separating compound figures in journal articles to allow for subfigure classification

Journal images represent an important part of the knowledge stored in the medical literature. Figure classification has received much attention because information about image types can be used in a variety of contexts to focus image search and to filter out unwanted information or "noise", for example non-clinical images. A major problem in figure classification is that many figures in the biomedical literature are compound figures and often contain more than one figure type. Some journals separate compound figures into several parts, but many do not, so separation currently has to be done manually.

In this work, a technique for compound figure separation is proposed and implemented, based on the systematic detection and analysis of uniform space gaps. The method is evaluated on a dataset of journal figures from the open access literature that was created for the ImageCLEF 2012 benchmark and contains about 3000 compound figures.

Automatic tools can easily reach a relatively high accuracy in separating compound figures. To increase accuracy further, efforts are needed to improve the detection process and to avoid over-separation through more powerful analysis strategies. The tools described in this article have also been tested on a database of approximately 150,000 compound figures from the biomedical literature, making these images available as separate figures for further image analysis and allowing important information to be filtered from them.

  1. Separating compound figures in journal articles to allow for subfigure classification
     Institute of Information Systems
     Ajad Chhatkuli, Antonio Foncubierta-Rodríguez, Dimitrios Markonis, Henning Müller
  2. Motivation
     • Figures in biomedical journals contain a lot of information
     • CBIR has been proposed for accessing medical literature
     • Modality classification
        • Improves accessibility
        • Allows result filtering
        • But 50% of figures are compound or multipanel
  3. Aim
     • Develop a system that separates compound figures in the biomedical literature
        • Visual-information only
        • Textual information is discarded
        • Modality-independent
        • One method for many image types
        • Many methods for few image types
        • Tunable according to the dataset
     • Large-scale tested
        • Approximately 250 open access journals
  4. Compound figure examples
  5. Methods. Dataset
     • 2982 manually classified figures from the ImageCLEF 2012 dataset
     • Ground truth:
        • Image subclass: 2x1, 1x2,
        • Position of separators
  6. Methods. Overview
     • The problem is separated in two
        • Find subfigure separator candidates
           • Preprocessing if required
        • Analyze candidates
           • Remove false positives
           • Rule-based decisions
  7. Methods. Separator detection
     • Based on minimum pixel projection for white-space separated figures
     • Horizontal → vertical detection
     • Inverse order by rotation according to aspect ratio
     • Recursive
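The separator detection slide can be illustrated with a minimal Python sketch. It assumes one reading of "minimum pixel projection": a row or column counts as white space only if its darkest pixel is still near white. The function names (`find_gaps`, `separate`), the thresholds, and the cut-at-gap-centre choice are hypothetical and not taken from the slides; the aspect-ratio-dependent ordering is not modelled.

```python
import numpy as np

def find_gaps(gray, axis, white_thresh=0.95, min_width=3):
    """Return (start, end) ranges of interior white bands.

    gray is a 2-D float array in [0, 1]. axis=1 scans rows (horizontal
    separators), axis=0 scans columns (vertical separators).
    """
    # Minimum projection: a line is white only if its darkest pixel is white.
    is_white = gray.min(axis=axis) >= white_thresh
    gaps, start = [], None
    for i, white in enumerate(is_white):
        if white and start is None:
            start = i
        elif not white and start is not None:
            # Runs starting at index 0 are margins, not separators.
            if start > 0 and i - start >= min_width:
                gaps.append((start, i))
            start = None
    # A run still open here touches the far border, so it is a margin too.
    return gaps

def separate(gray, top=0, left=0, depth=0, max_depth=4):
    """Recursively split on gaps; returns (top, left, bottom, right) boxes."""
    if depth < max_depth:
        for axis in (1, 0):  # horizontal separators first, then vertical
            gaps = find_gaps(gray, axis)
            if gaps:
                n = gray.shape[0] if axis == 1 else gray.shape[1]
                # Assumption: cut in the middle of each gap.
                cuts = [0] + [(s + e) // 2 for s, e in gaps] + [n]
                boxes = []
                for a, b in zip(cuts[:-1], cuts[1:]):
                    if axis == 1:
                        boxes += separate(gray[a:b, :], top + a, left, depth + 1, max_depth)
                    else:
                        boxes += separate(gray[:, a:b], top, left + a, depth + 1, max_depth)
                return boxes
    h, w = gray.shape
    return [(top, left, top + h, left + w)]
```

Calling `separate(image)` on a grayscale array returns one bounding box per detected subfigure; an image without interior gaps comes back as a single box.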
  8. Methods. Separator detection
     • Rule-based processing
        • Progressive truncation to remove labels if no separators are found
        • Text removal based on connected components if no separators are found
        • Complement image for black-space separations
        • Standard deviation image for subtle separations
        • Binarization of non-graph figures:
           • Less than 40% of the image is white or almost white
  9. Methods. Separator analysis
     • Classification problem
        • True/false separator
     • Features used:
        • Closeness to border, division ratio, standard deviation, text removal analysis, histogram, gap comparison
     • Classifiers:
        • SVM
        • Rule-based classifier
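A weighted rule-based classifier of the kind named on this slide might look like the following Python sketch. It is not the authors' classifier: the `Candidate` fields, the three rules (loosely modelled on closeness to border, standard deviation, and gap comparison), the weights, and all thresholds are hypothetical values chosen only to show the mechanism.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    start: int      # first index of the gap along the scan axis
    end: int        # one past the last index of the gap
    length: int     # image size along the scan axis
    gap_std: float  # pixel standard deviation inside the gap

# Hypothetical rule weights, meant to be tuned per dataset.
WEIGHTS = {"far_from_border": 0.4, "uniform": 0.4, "wide": 0.2}

def score(c, widest_gap):
    """Sum the weights of the rules that the candidate passes."""
    centre = (c.start + c.end) / 2
    border_distance = min(centre, c.length - centre) / c.length
    rules = {
        "far_from_border": border_distance >= 0.1,     # closeness to border
        "uniform": c.gap_std <= 0.02,                  # standard deviation
        "wide": (c.end - c.start) >= 0.5 * widest_gap, # gap comparison
    }
    return sum(WEIGHTS[name] for name, passed in rules.items() if passed)

def true_separators(candidates, threshold=0.7):
    """Keep candidates whose weighted rule score reaches the threshold."""
    if not candidates:
        return []
    widest = max(c.end - c.start for c in candidates)
    return [c for c in candidates if score(c, widest) >= threshold]
```

With these weights a candidate must pass at least two rules, including one of the two heavier ones, to be accepted; changing `WEIGHTS` or `threshold` shifts the balance between missed separators and over-separation.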
  10. Results
  11. Successful examples
  12. Successful examples
  13. Unsuccessful examples
     • Not horizontal/vertical separation
     • No separation gap
  14-15. Conclusions, future work
     • Good results for a wide range of images
     • Using purely visual information
     • Separation problem: detection and analysis
     • Rule weights can be fine-tuned according to dataset
     • What would be the impact of a larger training set?
     • What would be the impact on existing modality classification accuracy?
  16. Thanks for your attention!
     More information at http://medgift.hevs.ch
     Ajad Chhatkuli, Dimitrios Markonis, Antonio Foncubierta-Rodríguez, Fabrice Meriaudeau and Henning Müller, Separating compound figures in journal articles to allow for subfigure classification, in: SPIE, Medical Imaging, Orlando, FL, USA, 2013