Mapping online data on offline documents Artificial handwriting recognition Stefan Kennedie
Overview: Introduction, Method, Experiment, Discussion, Conclusion
Introduction. Online data: tablet; sequence of coordinates and pressure. Offline data: digital scanner; bitmap with grey value or color.
Properties of online and offline data (offline / online). Width of ink: known / unknown. Location of pen when not on paper: unknown / known. Velocity and acceleration: unknown / known. Order of writing: unknown / known.
Advantages of combining: online techniques can be applied to offline data and vice versa (e.g. segmentation); recognition improves.
Combining – what’s the problem? Difference in resolution between tablet and digital scanner; alignment of paper on tablet and digital scanner; movement of paper (not investigated); pen angle (unsolved).
Pen angle: difference in contact point; asymmetric magnetic field.
Goal: find the correct match between offline and online data; create a mapping between online and offline data whose match is as high as possible, using rotation and resizing.
Method – Overview: search for a query (created from online data) in the offline document at low (12.5%) resolution; mark locations with a good match; investigate these locations in detail and find the best match using rotation and resizing.
Method (1) – Query creation: create a query from online data; unique identification code; Bresenham’s line algorithm; 12.5% of original resolution of offline data.
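The query-creation step can be sketched as follows. This is a minimal illustration, not the thesis code: it assumes the online data is a single pen-down stroke given as (x, y) samples, and the names `bresenham`, `render_query` and the `scale` parameter are hypothetical. Consecutive samples are scaled to 12.5% and connected with Bresenham's line algorithm.

```python
def bresenham(x0, y0, x1, y1):
    """Return the integer grid points on the line from (x0, y0) to (x1, y1)."""
    points = []
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:   # step in x
            err += dy
            x0 += sx
        if e2 <= dx:   # step in y
            err += dx
            y0 += sy
    return points


def render_query(samples, scale=0.125):
    """Scale pen samples (assumed: one pen-down stroke) and connect
    consecutive samples; returns the set of query pixels."""
    scaled = [(round(x * scale), round(y * scale)) for x, y in samples]
    pixels = set(scaled)
    for (xa, ya), (xb, yb) in zip(scaled, scaled[1:]):
        pixels.update(bresenham(xa, ya, xb, yb))
    return pixels
```

With several strokes, each pen-down segment would be rendered separately so that pen lifts are not drawn as ink.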
Method (1) – Preprocessing offline data: upper half of first quadrant; remove noise using the Otsu threshold algorithm; resize to 12.5% of original resolution.
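A rough sketch of this preprocessing, under stated assumptions: the scan is an 8-bit grey image stored as nested lists, ink is darker than paper, and the reduction to 12.5% is done by marking an 8×8 block as ink if any pixel in it is ink. The slides do not say how the resize was done, so `downsample` and its block rule are illustrative choices only.

```python
def otsu_threshold(pixels, levels=256):
    """Otsu's method: pick the threshold that maximizes between-class variance.
    `pixels` is a flat list of integer grey values in [0, levels)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of grey values of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(gray, threshold):
    """Mark pixels at or below the threshold as ink (1), the rest as paper (0)."""
    return [[1 if p <= threshold else 0 for p in row] for row in gray]


def downsample(binary, factor=8):
    """Assumed resize rule: a block becomes ink if any pixel in it is ink."""
    rows, cols = len(binary) // factor, len(binary[0]) // factor
    return [
        [
            int(any(binary[r * factor + i][c * factor + j]
                    for i in range(factor) for j in range(factor)))
            for c in range(cols)
        ]
        for r in range(rows)
    ]
```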
Method (1) – Searching for query: find the location with the optimal match between query and offline document. Compare the query with the offline document at all possible locations using a sliding window and Euclidean Distance Mapping for Matching. Output: match error. Good match: small error, visualized as light pixel. Bad match: big error, visualized as dark pixel.
Method (1) – Euclidean distance mapping example: offline document, query, XOR image.
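The sliding-window comparison can be illustrated with a simplified error measure. The sketch below assumes both images are binary nested lists and counts the pixels that differ (the XOR image from the example); it does not reproduce the distance mapping used in the thesis, and the function names are hypothetical.

```python
def match_error(document, query, top, left):
    """Number of pixels where the query and the document window differ (XOR)."""
    return sum(
        document[top + r][left + c] ^ query[r][c]
        for r in range(len(query))
        for c in range(len(query[0]))
    )


def error_map(document, query):
    """Match error for every position where the query fits in the document."""
    rows = len(document) - len(query) + 1
    cols = len(document[0]) - len(query[0]) + 1
    return [
        [match_error(document, query, r, c) for c in range(cols)]
        for r in range(rows)
    ]
```

A small error in the resulting map corresponds to a light pixel in the visualization, a large error to a dark one.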
Method (1) – Processing match errors: (1) remove locations with high error; (2) remove error values; (3) add surrounding locations.
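One possible reading of these three steps, as a sketch only: keep the locations whose error is at most a cut-off, discard the error values themselves, and add the eight neighbours of every kept location. The cut-off `max_error` and the 3×3 neighbourhood are assumptions; the slides do not give either.

```python
def candidate_locations(errors, max_error):
    """Return the set of (row, col) locations worth investigating in detail."""
    rows, cols = len(errors), len(errors[0])
    # Steps 1 and 2: drop high-error locations and keep only the coordinates.
    kept = {
        (r, c)
        for r in range(rows)
        for c in range(cols)
        if errors[r][c] <= max_error
    }
    # Step 3: add the surrounding locations (assumed: 8-neighbourhood).
    candidates = set()
    for r, c in kept:
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    candidates.add((rr, cc))
    return candidates
```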
Method (2) – Investigation in detail: find the location, rotation and resize factor with the best match.
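The detailed search can be sketched as a grid search over rotation angles and resize factors. The assumptions here are loud ones: the query is a list of pixel coordinates, the score is the fraction of transformed query pixels that land on ink, rotation and scaling are done about the query centroid, and the search over location is left out for brevity. None of these choices is confirmed by the slides.

```python
import math


def transform(points, angle_deg, scale):
    """Rotate and scale points about their centroid; return rounded pixels."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    out = set()
    for x, y in points:
        dx, dy = (x - cx) * scale, (y - cy) * scale
        out.add((round(cx + dx * cos_a - dy * sin_a),
                 round(cy + dx * sin_a + dy * cos_a)))
    return out


def score(pixels, ink):
    """Assumed score: fraction of query pixels that coincide with ink."""
    return len(pixels & ink) / len(pixels)


def best_fit(query, ink, angles, scales):
    """Try every (angle, scale) pair; return (best score, angle, scale)."""
    best = (-1.0, None, None)
    for angle in angles:
        for s in scales:
            current = score(transform(query, angle, s), ink)
            if current > best[0]:
                best = (current, angle, s)
    return best
```

In practice the angle and scale ranges would be small, since the paper is only slightly misaligned between tablet and scanner.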
Experiments: retrieval experiment (mapping on document level); optimal rotation and resizing experiment (mapping on pixel level). Both done for three writers.
Experiment 1: find the correct match between the online and offline data. Uses only part one of the method, without rotation and resizing. Compare all online data with all offline data (of one writer). Do the pairs with the smallest match error belong to each other? Visual exploration of the results.
Experiment 1 – Results: 207 data pairs in total. The algorithm found 203 correct matches (98.07%). Correct pixels for the ‘match’ group: 81.08%. Correct pixels for the ‘no match’ group: 38.15%. This difference of 42.93 percentage points is significant (p < .001).
Experiment 2: compute the maximum % of correct pixels over all rotations and resizing factors for all data pairs.
Experiment 2 – Results: correct pixels without rotating and resizing: 80.72%. Correct pixels with optimal rotation and resizing factor: 87.28%. The difference of 6.56 percentage points is significant (p < .001).
Discussion: the method is very slow; alternative approach: e.g. fast Fourier transform. The significance of the result depends on the data: large black regions in offline data; edge detection.
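To illustrate why a Fourier-based approach could be faster, the sketch below computes a circular cross-correlation in one dimension with a radix-2 FFT. It is not part of the presented method: it assumes input lengths that are powers of two, it scores overlap rather than XOR error, and a real implementation would work on 2D images with a tested FFT library.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two (unnormalized)."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def circular_correlation(signal, template):
    """c[k] = sum_j signal[(j + k) mod n] * template[j], computed via the FFT."""
    n = len(signal)
    padded = list(template) + [0] * (n - len(template))
    s_hat = fft(signal)
    t_hat = fft(padded)
    product = [s * t.conjugate() for s, t in zip(s_hat, t_hat)]
    return [v.real / n for v in fft(product, invert=True)]
```

All shifts are scored in O(n log n) operations instead of one pass over the template per shift, which is the source of the expected speed-up.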
Conclusion: the method is useful for finding online-offline data pairs (successful in 98.07%). The method is able to provide a better mapping between online and offline data (6.56% improvement). Better results can be achieved using the pen angle.
Questions?

Bachelor Thesis
