Let the Product Lead:
R&D Paradigms for
Recognition Models of
Handwritten Math
Expressions
Quinn Lathrop
quinn.lathrop@gmail.com
AI Products and Solutions
• Global organization
• Research, Development, Engineering, Product, Design
• We own the product/feature from ideation to delivery
• Achieve broad innovation goals by developing B2C products first
Contributors to this work:
Zac Hancock, Jiamin He, Michael Chifala, Sungjin Nam, Bill Vander Lugt, Mounika Kakarla, Teddy Ampian, Luke Tuguluke, JB
DeVries, Leslie Satterfield, Claudia Cassidy, Douglas Cobb, Brian LoPiccolo, Joey Ashcroft, Ashley Fallon, Holly Smith, Tim
Stewart, JD Corbin, Tejawini Nallagatla, Jakob Vendegna, Wes Galbraith, Randall Barnhart, Eric Kattwinkel, Johann Larusson,
Jason Fournier, Michal Okulski, Piotr Kabacinski, Kacper Lodzikowski
Aida Calculus
Check This Problem Flow:
• Student inputs their problem
by taking a picture of their
handwritten work
• Student receives step-by-step
feedback
• Personalized hints and
tutoring
Video
Handwriting Recognition of
Math Expressions
y = (2x - 3)(x^2 - 5)^3
Typical Data -> Model Flow
Waterfall:
• Collect dataset
• Iterate on model
At a certain point in time,
the dataset is fixed
Real data is not perfect
Product Driven R&D
• Iteratively build a synthetic generation capability
towards requirements
• Control over the distribution of math
expressions, characters, location of characters,
specific visual qualities of the math, image noise,
and image augmentations
• Control every pixel of image
• Can create millions of perfectly labelled images in
hours
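As a rough illustration of what "control over the distribution of math expressions" could look like in code, the sketch below samples expression strings from weighted templates. The template names, the templates themselves, and the parameter ranges are hypothetical; the slides do not describe how the actual generator is built.

```python
import random

# Hypothetical templates; the real generator's expression grammar is not
# described in the slides. Doubled braces are literal LaTeX braces.
TEMPLATES = {
    "product_power": "y=({a}x-{b})(x^2-{c})^{n}",
    "polynomial": "y={a}x^{n}+{b}x-{c}",
    "quotient": "y=\\frac{{{a}x+{b}}}{{x^{n}-{c}}}",
}


def sample_expression(rng, weights):
    """Draw one (template_name, expression) pair.

    `weights` maps template names to relative frequencies, which is how
    the caller controls the distribution of expression types.
    """
    names = list(weights)
    name = rng.choices(names, weights=[weights[k] for k in names], k=1)[0]
    params = {
        "a": rng.randint(2, 9),
        "b": rng.randint(1, 9),
        "c": rng.randint(1, 9),
        "n": rng.randint(2, 5),
    }
    return name, TEMPLATES[name].format(**params)


def generate(count, weights, seed=0):
    """Generate `count` labelled expressions reproducibly from a seed."""
    rng = random.Random(seed)
    return [sample_expression(rng, weights) for _ in range(count)]


if __name__ == "__main__":
    mix = {"product_power": 0.6, "polynomial": 0.3, "quotient": 0.1}
    for name, expr in generate(5, mix, seed=42):
        print(name, expr)
```

Because the label is the string that was sampled, every generated example is correctly labelled by construction, and changing the mix is a one-line edit to the weights.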
Math Expression: y=(2x-3)(x^2-5)^3
Background
Extra Marks
Bleed Through
Our synthetic data builds these features from the ground up:
• Math Expression
• Font and writing utensil
• Character-specific distortions/augmentations
• Backgrounds
• Other visual noise
• Simulated photo – angle, distance, quality, shadows
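The layered construction listed above can be sketched in a few lines. This is a toy version under strong assumptions: characters are stand-in rectangles rather than rendered handwriting, and the function name `render_sample` and all pixel ranges are hypothetical. The point it shows is that when the image is built layer by layer, the pixel-level label mask is produced at the same time.

```python
import random


def render_sample(height, width, glyph_boxes, seed=0):
    """Compose a grayscale image (0 = black, 255 = white) and a label mask.

    glyph_boxes: list of (label_id, top, left, bottom, right) rectangles
    standing in for rendered characters (a hypothetical simplification).
    Returns (image, mask) as nested lists of ints.
    """
    rng = random.Random(seed)

    # Layer 1: paper background with mild brightness variation.
    image = [[rng.randint(225, 255) for _ in range(width)] for _ in range(height)]
    mask = [[0] * width for _ in range(height)]  # 0 means background

    # Layer 2: character "ink". The mask records which character owns
    # each pixel, so the labels are exact by construction.
    for label_id, top, left, bottom, right in glyph_boxes:
        for r in range(top, bottom):
            for c in range(left, right):
                image[r][c] = rng.randint(0, 60)
                mask[r][c] = label_id

    # Layer 3: extra marks. Noise changes the image but never the mask.
    for _ in range((height * width) // 50):
        r, c = rng.randrange(height), rng.randrange(width)
        if mask[r][c] == 0:
            image[r][c] = rng.randint(80, 160)

    return image, mask


if __name__ == "__main__":
    img, msk = render_sample(20, 40, [(1, 5, 5, 15, 10), (2, 5, 20, 15, 25)])
    print(sum(v != 0 for row in msk for v in row), "labelled pixels")
```

A real pipeline would add further layers in the same way (bleed-through, shadows, perspective), each one either updating the mask consistently or leaving it untouched.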
Benefits: Product
Development Cycles
When other product features depend on a developing AI capability, it is
important to integrate and iterate early and often
• Alpha
• Beta
• MVP
Benefits: Exact Control
and Visibility of the
Population Distributions
Benefits: User Behavior
Drives Backlog
Example: computer screens
• Can generate data and quickly ask the question: Does our model architecture
support this expansion in scope?
Benefits: Modeling
Because we have 100% correct pixel-level tagging, a
range of Object Detection models are available.
Tagging bounding boxes and masks comes at no
additional data collection cost
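To make the "no additional cost" point concrete: once a pixel-level mask exists, bounding boxes follow mechanically. The sketch below assumes a mask stored as a nested list of integer label ids with 0 as background; that encoding is an assumption for illustration, not the format the team used.

```python
def boxes_from_mask(mask):
    """Derive one bounding box per label from a pixel-level mask.

    mask: nested list of ints, 0 = background, other values = label ids.
    Returns {label_id: (top, left, bottom, right)} where bottom and right
    are exclusive.
    """
    boxes = {}
    for r, row in enumerate(mask):
        for c, label in enumerate(row):
            if label == 0:
                continue
            if label not in boxes:
                boxes[label] = (r, c, r + 1, c + 1)
            else:
                top, left, bottom, right = boxes[label]
                boxes[label] = (
                    min(top, r),
                    min(left, c),
                    max(bottom, r + 1),
                    max(right, c + 1),
                )
    return boxes


if __name__ == "__main__":
    example_mask = [
        [0, 0, 0, 0, 2],
        [0, 1, 1, 0, 2],
        [0, 1, 1, 0, 0],
    ]
    print(boxes_from_mask(example_mask))
```

The same mask can therefore feed detection models that expect boxes and segmentation models that expect masks, without a second labelling pass.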
Open Sourcing Dataset on Kaggle
100,000 images with ground-truth LaTeX, bounding boxes, and masks
https://www.kaggle.com/aidapearson/ocr-data
Thank you!
Open Sourced Kaggle Dataset
https://www.kaggle.com/aidapearson/ocr-data
Quinn Lathrop
quinn.lathrop@gmail.com

Applied Machine Learning Conference: Synthetic OCR data
