Università degli studi di Bari “Aldo Moro”
Dipartimento di Informatica

A Run Length Smoothing-Based Algorithm
for non-Manhattan Document Segmentation
S. Ferilli, F. Leuzzi, F. Rotella, F. Esposito
Via Orabona, 4 - 70126 Bari – Italy
{ferilli, esposito}@di.uniba.it
{fabio.leuzzi, fulvio.rotella}@uniba.it
L.A.C.A.M.
http://lacam.di.uniba.it
Introduction
● Automatic document processing a hot topic
  ― Layout analysis a fundamental step

    ● Identification of frames (relevant components in the document)

    ● Performance can determine quality and feasibility of the whole process

● Two different…

    ● Kinds of sources: Digitized (scanned) vs. Natively digital documents

    ● Categories of layouts: Manhattan vs. Non-Manhattan

    ● Types of algorithms: Top-down vs. Bottom-up




● Run Length Smoothing Algorithm
    ● Manhattan Layout

● Other works exploit or try to improve the RLSA by setting its parameters

● Many works on Manhattan layout

  ― Top-down strategies

● Fewer works on non-Manhattan layout

  ― Bottom-up strategies




●   The Manhattan assumption holds for many typeset documents, simplifies
    document processing…BUT cannot be assumed in general
RLSO
                   Application to scanned images
RLSO (Run Length Smoothing with OR)
1) horizontal smoothing with threshold th, row by row

2) vertical smoothing with threshold tv, column by column
3) logical OR of the images obtained in steps 1 and 2
                                         th = 5
                                         tv = 4
                                        (AND)
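
The three steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a binary image stored as a list of rows with 1 for black and 0 for white, fills a white run when its length is less than or equal to the threshold (the slides do not fix whether the comparison is strict), and leaves white runs that touch the image border untouched. The names `smooth_row` and `rlso` are hypothetical.

```python
def smooth_row(row, threshold):
    """Fill white runs (0) enclosed by black pixels (1) when they are short enough."""
    out = list(row)
    last_black = None  # index of the most recent black pixel seen so far
    for i, px in enumerate(row):
        if px == 1:
            if last_black is not None:
                gap = i - last_black - 1  # length of the white run in between
                if 0 < gap <= threshold:  # assumption: non-strict comparison
                    for k in range(last_black + 1, i):
                        out[k] = 1
            last_black = i
    return out


def rlso(image, th, tv):
    """Horizontal smoothing, vertical smoothing, then a pixel-wise logical OR."""
    # Step 1: smooth each row with the horizontal threshold th.
    horizontal = [smooth_row(r, th) for r in image]
    # Step 2: smooth each column with the vertical threshold tv
    # (transpose, smooth as rows, transpose back).
    columns = [list(c) for c in zip(*image)]
    vertical = [list(r) for r in zip(*[smooth_row(c, tv) for c in columns])]
    # Step 3: OR the two smoothed images.
    return [[h | v for h, v in zip(hr, vr)] for hr, vr in zip(horizontal, vertical)]


image = [
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]
for row in rlso(image, th=2, tv=1):
    print(row)
```

Because the two smoothed images are combined with OR, a pixel turns black if either direction bridges the gap it lies in.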
RLSO
Application to scanned images
[Figure: RLSO applied to a scanned image]
RLSO
              Application to born-digital documents
●   Set horizontal/vertical distance thresholds th/tv
●   build a frame for each basic block
●   H = {(dh, b’, b’’) | b’ and b’’ are horizontally adjacent basic blocks
                          and dh is the horizontal distance between them}
●   for all (dh, b’, b’’) ∈ H s.t. dh ≤ th, merge the frames to which b’ and b’’ belong

●   V = {(dv, b’, b’’) | b’ and b’’ are vertically adjacent basic blocks
                          and dv is the vertical distance between them}
●   for all (dv, b’, b’’) ∈ V s.t. dv ≤ tv, merge the frames to which b’ and b’’ belong


[Figure legend: reference block, adjacent blocks, non-adjacent blocks, horizontal distance, vertical distance]
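
The merging procedure for born-digital documents can be sketched with a union-find structure. This is an illustrative sketch under stated assumptions: basic blocks are identified by integer indices, and the sets H and V are passed in already computed as (distance, block, block) triples, since the slides do not spell out how adjacency is determined. The function name `merge_frames` and the sample distances are hypothetical.

```python
def merge_frames(num_blocks, H, V, th, tv):
    """Group basic blocks into frames by merging pairs closer than the thresholds."""
    # Initially every basic block is its own frame.
    parent = list(range(num_blocks))

    def find(b):
        # Follow parent links to the frame representative, compressing the path.
        while parent[b] != b:
            parent[b] = parent[parent[b]]
            b = parent[b]
        return b

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Merge frames of horizontally adjacent blocks with distance <= th.
    for d, a, b in H:
        if d <= th:
            union(a, b)
    # Merge frames of vertically adjacent blocks with distance <= tv.
    for d, a, b in V:
        if d <= tv:
            union(a, b)

    # Collect the blocks belonging to each resulting frame.
    frames = {}
    for b in range(num_blocks):
        frames.setdefault(find(b), []).append(b)
    return sorted(frames.values())


# Hypothetical example: four blocks, two horizontal pairs, one vertical pair.
H = [(3, 0, 1), (20, 1, 2)]
V = [(2, 2, 3)]
print(merge_frames(4, H, V, th=5, tv=4))
```

Union-find keeps the merge order irrelevant: the final frames are the connected components of the "close enough" adjacency relation.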
RLSO
Application to born-digital documents
RLSO
●   Run Length Smoothing algorithms based on thresholds
    ―   Hard to properly set manually (Not typical human activity)
    ―   Heuristic approaches (Ad hoc)
    ―   Undermines the idea of automatic processing
    ―   Fixed thresholds not suitable for documents with several different
        spacings




                   Automatic assessment of RLSO thresholds
RLSO
                   Automatic threshold assessment
●   Study of run-length behavior
    ―   Histogram very irregular
          ● Peaks = most frequent spacings
          ● Peak clusters = equally spaced components
          ― Hard to exploit by automatic techniques
    ―   Cumulative histograms more regular
          ― Bar b = number of runs longer than or equal to b:
            $H'(i) = \sum_{j \ge i} H(j)$
          ● Monotonically decreasing
            ― Flat zones = lengths for which no runs are present
          ● Scaled down to 10%
            ― Reduces variability

Figure 1. A fragment of a scientific paper.
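
The cumulative histogram H'(i) can be computed from the white run lengths as sketched below. This is a minimal illustration assuming the same pixel convention as before (1 = black, 0 = white) and counting only white runs enclosed between two black pixels; the scaling step mentioned on the slide is not reproduced because its exact definition is not given. The function names are hypothetical.

```python
from collections import Counter


def white_run_lengths(row):
    """Return the lengths of white runs (0) enclosed between black pixels (1)."""
    runs = []
    length = 0
    seen_black = False
    for px in row:
        if px == 1:
            if seen_black and length > 0:
                runs.append(length)
            seen_black = True
            length = 0
        else:
            length += 1
    # A trailing white run that reaches the border is deliberately not counted.
    return runs


def cumulative_histogram(runs):
    """Return a list c where c[i] is the number of runs of length >= i."""
    if not runs:
        return []
    hist = Counter(runs)  # H(j): number of runs of length exactly j
    max_len = max(runs)
    cumulative = [0] * (max_len + 2)
    # Accumulate from the longest length downwards: H'(i) = H(i) + H'(i + 1).
    for i in range(max_len, -1, -1):
        cumulative[i] = cumulative[i + 1] + hist.get(i, 0)
    return cumulative[: max_len + 1]


runs = white_run_lengths([1, 0, 1, 0, 0, 1, 0])
print(runs)
print(cumulative_histogram([1, 1, 2, 5]))
```

In the returned list, two equal neighbouring values mark a run length that never occurs, which is exactly what the flat zones capture.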
RLSO
                    Automatic threshold assessment
●   Select threshold on flat zones
    ― Derivative a good indicator
      ● Slope = 0
      ● Discrete approximation on bar b:
    ― Tolerance possible
      ● Slope = –30
    ― Skip starting and trailing flat zones
      ● Starting zone = missing small run lengths
      ● Trailing zone = merge whole content

●   Iteration of technique on previously smoothed image
    ― Finds progressively more spaced components

Figures 1-a and 1-b. Successive applications of RLSO with automatic threshold assessment on Figure 1.
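
The flat-zone selection can be sketched as follows. Several choices here are assumptions, since the slides do not give them: the slope on bar b is approximated by the forward difference H'(b+1) - H'(b), a bar counts as flat when that slope is no lower than minus the tolerance, the histogram is padded with zeros past the longest run so that the trailing flat zone is explicit, and the first bar of the first interior flat zone is returned as the threshold. The function name `select_threshold` is hypothetical.

```python
def select_threshold(cumulative, tolerance=0):
    """Pick a threshold on the first flat zone that is neither leading nor trailing.

    `cumulative[i]` is the number of runs of length >= i.
    Returns None when no interior flat zone exists.
    """
    # Assumption: pad with zeros so lengths beyond the longest run form the
    # trailing flat zone (a threshold there would merge the whole content).
    padded = list(cumulative) + [0, 0]
    # Assumed discrete slope on bar b: forward difference (always <= 0).
    slopes = [padded[b + 1] - padded[b] for b in range(len(padded) - 1)]
    flat = [s >= -tolerance for s in slopes]

    # Skip the starting flat zone (small run lengths that never occur).
    start = 0
    while start < len(flat) and flat[start]:
        start += 1
    # Skip the trailing flat zone.
    end = len(flat)
    while end > start and flat[end - 1]:
        end -= 1

    for b in range(start, end):
        if flat[b]:
            return b
    return None


print(select_threshold([6, 6, 4, 3, 3, 3, 1, 1, 1, 1]))
print(select_threshold([3, 3, 2, 1]))
```

To iterate the technique as the slide describes, the selected thresholds would be applied, the run lengths recomputed on the smoothed image, and the selection repeated; the slides list the stop criterion for this loop as future work.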
Sample Evaluation
Conclusions
●   RLSO (Run Length Smoothing with OR) identifies runs of white pixels in the
    document image and fills them with black pixels whenever they are shorter than a
    given threshold
     –   Both Manhattan and Non-Manhattan Layout
     –   Version for natively digital documents
●   Automatic thresholding effective on documents having
     –   single character size
     –   different spacings

●   Good baseline towards more complex documents
     –   different character sizes
     –   graphics
●   Current and future work
     –   Stop criterion for iteration
     –   Clustering based on positioning and spacing
