SlideShare a Scribd company logo
1 of 18
Download to read offline
Detecting & Recognizing arbitrary
shaped texts from Product Images
Rajesh Shreedhar Bhat
Senior Data Scientist, Walmart Global Tech India
Agenda
▪ Text Extraction Overview
▪ Text Detection(TD)
▪ Text Recognition(TR) training data
preparation
▪ CRNN-CTC model for TR
▪ Attention – OCR
▪ Spatial Transformer Nets for improving
TR accuracy
▪ Model Accuracies on different dataset.
▪ Training & Deployment.
▪ Questions ?
Text Extraction Overview
Text Detection
Text Detection – Model architecture
▪ VGG16 – BN as the backbone
▪ Model has skip connection in decoder part
which is similar to U-Nets.
▪ Output :
▪ Region score
▪ Affinity score - grouping characters
Ref:Baek, Youngmin, et al. "Character Region
Awareness for Text detection." Proceedings of the
IEEE Conference on Computer Vision and Pattern
Recognition. 2019.
Ground Truth Label Generation
Ref:Baek, Youngmin, et al. "Character region awareness for text detection." Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition. 2019.
Sample Output
Region Score Affinity Score Region Score Affinity Score
Sample Output ..
Text Recognition
Text Recognition – Training Data Preparation
SynthText: image generation engine
for building a large annotated dataset.
15 million images generated with
different font styles, size, color &
varying backgrounds using product
descriptions + open source datasets
Vocabulary: 92 characters
Includes capital + small letters,
numbers and special symbols
Link to “Text Recognition with CRNN-CTC model” blog
published in WANDB : https://bit.ly/3hBaWQv
Attention - OCR
• Encoder – Decoder
framework
• CNN used as visual feature
encoder.
• LSTM with Attention
mechanism is used to
extract text in a generative
fashion.
• Cross-entropy as a loss
function
Product Images with curved text
13
Spatial Transformation Networks
14
• Spatial Transformer
Network is a learnable
module aimed at increasing
the spatial invariance of
Convolutional Neural
Networks in a
computationally and
parameter efficient manner.
Model Accuracy on Regular and Arbitrary shaped
text
15
Dataset CRNN-CTC CNN-LSTM-Attn STN-CRNN-CTC STN-CNN-LSTM- Attn
IIIT 5K 81.6 82.1 85 85.16
SVT 82.9 83.5 88.7 88.8
ICDAR03_860 89.2 89.8 91.03 91.7
ICDAR03_867 91.1 91.0 91.59 92.4
ICDAR13_857 92.6 92.7 93.08 94.00
ICDAR13_1015 93.1 93.1 93.25 94.53
ICDAR15_1811 69.4 69.8 72.3 76.5
ICDAR15_2077 64.2 64.8 67.5 71.89
SVT-P 70 70.6 69.4 76.89
CUTE 65.5 66.7 85.7 83.3
Dataset mainly with
arbitrary shaped text
Training and deployment
▪ 15 million images ~ 690 GB when loaded into memory!! Given that on
an average images are of the shape (128 * 32 * 3) and dtype is float32.
▪ Usage Generators to load only single batch in memory.
▪ Deployed on Machine Learning Platform internal to Walmart.
▪ Both text detection and recognition are deployed on single V100
GPU’s and prediction time is ~0.45 seconds for each image.
16
The Team behind the project
Rajesh Shreedhar Bhat
Senior Data Scientist
Pranay Dugar
Data Scientist
Anirban Chatterjee
Staff Data Scientist
Vijay Agneeswaran
Director- Data Science
rsbhat@asu.edu
https://www.linkedin.com/in/rajeshshreedhar
Questions ??
Code + PPT
https://github.com/rajesh-bhat/data-ai-summit-2020

More Related Content

Similar to Detecting & Recognizing arbitrary shaped texts from Product Images

Optimal configuration of network
Optimal configuration of networkOptimal configuration of network
Optimal configuration of networkjpstudcorner
 
A Review on Natural Scene Text Understanding for Computer Vision using Machin...
A Review on Natural Scene Text Understanding for Computer Vision using Machin...A Review on Natural Scene Text Understanding for Computer Vision using Machin...
A Review on Natural Scene Text Understanding for Computer Vision using Machin...IRJET Journal
 
resume deeksha anandani NXP Semiconductors
resume deeksha anandani NXP Semiconductorsresume deeksha anandani NXP Semiconductors
resume deeksha anandani NXP SemiconductorsDeeksha Anandani
 
Text Detection and Recognition in Natural Images
Text Detection and Recognition in Natural ImagesText Detection and Recognition in Natural Images
Text Detection and Recognition in Natural ImagesIRJET Journal
 
IRJET - Study on the Effects of Increase in the Depth of the Feature Extracto...
IRJET - Study on the Effects of Increase in the Depth of the Feature Extracto...IRJET - Study on the Effects of Increase in the Depth of the Feature Extracto...
IRJET - Study on the Effects of Increase in the Depth of the Feature Extracto...IRJET Journal
 
Deep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUSDeep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUSGanesan Narayanasamy
 
Camera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning IICamera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning IIYu Huang
 
Ocr using tensor flow
Ocr using tensor flowOcr using tensor flow
Ocr using tensor flowNaresh Kumar
 
Traffic sign recognition and detection using SVM and CNN
Traffic sign recognition and detection using SVM and CNNTraffic sign recognition and detection using SVM and CNN
Traffic sign recognition and detection using SVM and CNNIRJET Journal
 
Devanagari Digit and Character Recognition Using Convolutional Neural Network
Devanagari Digit and Character Recognition Using Convolutional Neural NetworkDevanagari Digit and Character Recognition Using Convolutional Neural Network
Devanagari Digit and Character Recognition Using Convolutional Neural NetworkIRJET Journal
 
Reconstructing the Path of the Object based on Time and Date OCR in Surveilla...
Reconstructing the Path of the Object based on Time and Date OCR in Surveilla...Reconstructing the Path of the Object based on Time and Date OCR in Surveilla...
Reconstructing the Path of the Object based on Time and Date OCR in Surveilla...ijtsrd
 
A REVIEW ON IMPROVING TRAFFIC-SIGN DETECTION USING YOLO ALGORITHM FOR OBJECT ...
A REVIEW ON IMPROVING TRAFFIC-SIGN DETECTION USING YOLO ALGORITHM FOR OBJECT ...A REVIEW ON IMPROVING TRAFFIC-SIGN DETECTION USING YOLO ALGORITHM FOR OBJECT ...
A REVIEW ON IMPROVING TRAFFIC-SIGN DETECTION USING YOLO ALGORITHM FOR OBJECT ...IRJET Journal
 
“Accelerating Newer ML Models Using the Qualcomm AI Stack,” a Presentation fr...
“Accelerating Newer ML Models Using the Qualcomm AI Stack,” a Presentation fr...“Accelerating Newer ML Models Using the Qualcomm AI Stack,” a Presentation fr...
“Accelerating Newer ML Models Using the Qualcomm AI Stack,” a Presentation fr...Edge AI and Vision Alliance
 
IRJET- Automatic Data Collection from Forms using Optical Character Recognition
IRJET- Automatic Data Collection from Forms using Optical Character RecognitionIRJET- Automatic Data Collection from Forms using Optical Character Recognition
IRJET- Automatic Data Collection from Forms using Optical Character RecognitionIRJET Journal
 
Optimization of Incremental Queries CloudMDE2015
Optimization of Incremental Queries CloudMDE2015Optimization of Incremental Queries CloudMDE2015
Optimization of Incremental Queries CloudMDE2015József Makai
 
Traffic Sign Recognition using CNNs
Traffic Sign Recognition using CNNsTraffic Sign Recognition using CNNs
Traffic Sign Recognition using CNNsIRJET Journal
 
Vehicle Number Plate Detection
Vehicle Number Plate DetectionVehicle Number Plate Detection
Vehicle Number Plate DetectionIRJET Journal
 
Traffic Sign Detection and Recognition for Automated Driverless Cars Based on...
Traffic Sign Detection and Recognition for Automated Driverless Cars Based on...Traffic Sign Detection and Recognition for Automated Driverless Cars Based on...
Traffic Sign Detection and Recognition for Automated Driverless Cars Based on...ijtsrd
 
Deep Learning Fundamentals and Case studies using IBM POWER Systems
Deep Learning Fundamentals and Case studies using IBM POWER SystemsDeep Learning Fundamentals and Case studies using IBM POWER Systems
Deep Learning Fundamentals and Case studies using IBM POWER SystemsGanesan Narayanasamy
 

Similar to Detecting & Recognizing arbitrary shaped texts from Product Images (20)

Optimal configuration of network
Optimal configuration of networkOptimal configuration of network
Optimal configuration of network
 
A Review on Natural Scene Text Understanding for Computer Vision using Machin...
A Review on Natural Scene Text Understanding for Computer Vision using Machin...A Review on Natural Scene Text Understanding for Computer Vision using Machin...
A Review on Natural Scene Text Understanding for Computer Vision using Machin...
 
resume deeksha anandani NXP Semiconductors
resume deeksha anandani NXP Semiconductorsresume deeksha anandani NXP Semiconductors
resume deeksha anandani NXP Semiconductors
 
Text Detection and Recognition in Natural Images
Text Detection and Recognition in Natural ImagesText Detection and Recognition in Natural Images
Text Detection and Recognition in Natural Images
 
IRJET - Study on the Effects of Increase in the Depth of the Feature Extracto...
IRJET - Study on the Effects of Increase in the Depth of the Feature Extracto...IRJET - Study on the Effects of Increase in the Depth of the Feature Extracto...
IRJET - Study on the Effects of Increase in the Depth of the Feature Extracto...
 
Deep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUSDeep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUS
 
Camera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning IICamera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning II
 
Ocr using tensor flow
Ocr using tensor flowOcr using tensor flow
Ocr using tensor flow
 
Traffic sign recognition and detection using SVM and CNN
Traffic sign recognition and detection using SVM and CNNTraffic sign recognition and detection using SVM and CNN
Traffic sign recognition and detection using SVM and CNN
 
Devanagari Digit and Character Recognition Using Convolutional Neural Network
Devanagari Digit and Character Recognition Using Convolutional Neural NetworkDevanagari Digit and Character Recognition Using Convolutional Neural Network
Devanagari Digit and Character Recognition Using Convolutional Neural Network
 
day3.pdf
day3.pdfday3.pdf
day3.pdf
 
Reconstructing the Path of the Object based on Time and Date OCR in Surveilla...
Reconstructing the Path of the Object based on Time and Date OCR in Surveilla...Reconstructing the Path of the Object based on Time and Date OCR in Surveilla...
Reconstructing the Path of the Object based on Time and Date OCR in Surveilla...
 
A REVIEW ON IMPROVING TRAFFIC-SIGN DETECTION USING YOLO ALGORITHM FOR OBJECT ...
A REVIEW ON IMPROVING TRAFFIC-SIGN DETECTION USING YOLO ALGORITHM FOR OBJECT ...A REVIEW ON IMPROVING TRAFFIC-SIGN DETECTION USING YOLO ALGORITHM FOR OBJECT ...
A REVIEW ON IMPROVING TRAFFIC-SIGN DETECTION USING YOLO ALGORITHM FOR OBJECT ...
 
“Accelerating Newer ML Models Using the Qualcomm AI Stack,” a Presentation fr...
“Accelerating Newer ML Models Using the Qualcomm AI Stack,” a Presentation fr...“Accelerating Newer ML Models Using the Qualcomm AI Stack,” a Presentation fr...
“Accelerating Newer ML Models Using the Qualcomm AI Stack,” a Presentation fr...
 
IRJET- Automatic Data Collection from Forms using Optical Character Recognition
IRJET- Automatic Data Collection from Forms using Optical Character RecognitionIRJET- Automatic Data Collection from Forms using Optical Character Recognition
IRJET- Automatic Data Collection from Forms using Optical Character Recognition
 
Optimization of Incremental Queries CloudMDE2015
Optimization of Incremental Queries CloudMDE2015Optimization of Incremental Queries CloudMDE2015
Optimization of Incremental Queries CloudMDE2015
 
Traffic Sign Recognition using CNNs
Traffic Sign Recognition using CNNsTraffic Sign Recognition using CNNs
Traffic Sign Recognition using CNNs
 
Vehicle Number Plate Detection
Vehicle Number Plate DetectionVehicle Number Plate Detection
Vehicle Number Plate Detection
 
Traffic Sign Detection and Recognition for Automated Driverless Cars Based on...
Traffic Sign Detection and Recognition for Automated Driverless Cars Based on...Traffic Sign Detection and Recognition for Automated Driverless Cars Based on...
Traffic Sign Detection and Recognition for Automated Driverless Cars Based on...
 
Deep Learning Fundamentals and Case studies using IBM POWER Systems
Deep Learning Fundamentals and Case studies using IBM POWER SystemsDeep Learning Fundamentals and Case studies using IBM POWER Systems
Deep Learning Fundamentals and Case studies using IBM POWER Systems
 

More from Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of HadoopDatabricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDatabricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceDatabricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringDatabricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixDatabricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationDatabricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchDatabricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesDatabricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesDatabricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsDatabricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkDatabricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkDatabricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesDatabricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkDatabricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeDatabricks
 

More from Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Recently uploaded

Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 

Recently uploaded (20)

Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 

Detecting & Recognizing arbitrary shaped texts from Product Images

  • 1. Detecting & Recognizing arbitrary shaped texts from Product Images Rajesh Shreedhar Bhat Senior Data Scientist, Walmart Global Tech India
  • 2. Agenda ▪ Text Extraction Overview ▪ Text Detection(TD) ▪ Text Recognition(TR) training data preparation ▪ CRNN-CTC model for TR ▪ Attention – OCR ▪ Spatial Transformer Nets for improving TR accuracy ▪ Model Accuracies on different dataset. ▪ Training & Deployment. ▪ Questions ?
  • 5. Text Detection – Model architecture ▪ VGG16 – BN as the backbone ▪ Model has skip connection in decoder part which is similar to U-Nets. ▪ Output : ▪ Region score ▪ Affinity score - grouping characters Ref:Baek, Youngmin, et al. "Character Region Awareness for Text detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019.
  • 6. Ground Truth Label Generation Ref:Baek, Youngmin, et al. "Character region awareness for text detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019.
  • 7. Sample Output Region Score Affinity Score Region Score Affinity Score
  • 10. Text Recognition – Training Data Preparation SynthText: image generation engine for building a large annotated dataset. 15 million images generated with different font styles, size, color & varying backgrounds using product descriptions + open source datasets Vocabulary: 92 characters Includes capital + small letters, numbers and special symbols
  • 11. Link to “Text Recognition with CRNN-CTC model” blog published in WANDB : https://bit.ly/3hBaWQv
  • 12. Attention - OCR • Encoder – Decoder framework • CNN used as visual feature encoder. • LSTM with Attention mechanism is used to extract text in a generative fashion. • Cross-entropy as a loss function
  • 13. Product Images with curved text 13
  • 14. Spatial Transformation Networks 14 • Spatial Transformer Network is a learnable module aimed at increasing the spatial invariance of Convolutional Neural Networks in a computationally and parameter efficient manner.
  • 15. Model Accuracy on Regular and Arbitrary shaped text 15 Dataset CRNN-CTC CNN-LSTM-Attn STN-CRNN-CTC STN-CNN-LSTM- Attn IIIT 5K 81.6 82.1 85 85.16 SVT 82.9 83.5 88.7 88.8 ICDAR03_860 89.2 89.8 91.03 91.7 ICDAR03_867 91.1 91.0 91.59 92.4 ICDAR13_857 92.6 92.7 93.08 94.00 ICDAR13_1015 93.1 93.1 93.25 94.53 ICDAR15_1811 69.4 69.8 72.3 76.5 ICDAR15_2077 64.2 64.8 67.5 71.89 SVT-P 70 70.6 69.4 76.89 CUTE 65.5 66.7 85.7 83.3 Dataset mainly with arbitrary shaped text
  • 16. Training and deployment ▪ 15 million images ~ 690 GB when loaded into memory!! Given that on an average images are of the shape (128 * 32 * 3) and dtype is float32. ▪ Usage Generators to load only single batch in memory. ▪ Deployed on Machine Learning Platform internal to Walmart. ▪ Both text detection and recognition are deployed on single V100 GPU’s and prediction time is ~0.45 seconds for each image. 16
  • 17. The Team behind the project Rajesh Shreedhar Bhat Senior Data Scientist Pranay Dugar Data Scientist Anirban Chatterjee Staff Data Scientist Vijay Agneeswaran Director- Data Science
  • 18. rsbhat@asu.edu https://www.linkedin.com/in/rajeshshreedhar Questions ?? Code + PPT https://github.com/rajesh-bhat/data-ai-summit-2020