Deep Learning OCR using Nimbix POWER8
INTRODUCTION
• OCR is the transformation of images of text into machine-encoded text.
• A simple API to an OCR library might provide a function that takes an image as input and outputs a string.
• In this project we apply a deep learning neural network to the problem of Optical Character Recognition.
• We use TensorFlow and a convolutional neural network.
MOTIVATION
• Optical character recognition is needed when information must be readable by both humans and machines and alternative inputs cannot be predefined.
• The basic OCR system was invented to convert data available on paper into computer-processable documents, so that the documents can be edited and reused.
• Traditional OCR techniques are typically multi-stage processes. For example, the image may first be divided into smaller regions that contain the individual characters; next, the individual characters are recognized; finally, the results are pieced back together. A difficulty with this approach is obtaining a good division of the original image.
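The multi-stage process described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the method used in this project: the image is assumed to be a binary grid (1 = ink, 0 = background), characters are assumed to be separated by fully blank columns, and `segment_columns`, `crop`, `multi_stage_ocr` and the toy recognizer are hypothetical names invented for the example.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]  # rows of 0/1 pixels; 1 means ink (assumed format)


def segment_columns(image: Image) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges separated by fully blank columns."""
    if not image:
        return []
    width = len(image[0])
    # A column contains ink if any row has a 1 in it.
    has_ink = [any(row[x] for row in image) for x in range(width)]
    regions, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a character region begins
        elif not ink and start is not None:
            regions.append((start, x))     # region ends at the first blank column
            start = None
    if start is not None:
        regions.append((start, width))
    return regions


def crop(image: Image, start: int, end: int) -> Image:
    """Cut out the columns start..end-1 from every row."""
    return [row[start:end] for row in image]


def multi_stage_ocr(image: Image, recognize_char: Callable[[Image], str]) -> str:
    # Stage 1: divide the image, stage 2: recognize each piece, stage 3: join.
    return "".join(recognize_char(crop(image, s, e))
                   for s, e in segment_columns(image))


# Two glyphs separated by one blank column.
demo = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
]
# Toy recognizer: labels a region by its width (stand-in for a real classifier).
print(multi_stage_ocr(demo, lambda glyph: "A" if len(glyph[0]) == 2 else "I"))
```

If two characters touch, the blank-column test merges them into a single region, which is the kind of segmentation difficulty the last bullet refers to.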
Sample Architecture for CNN
What Is a Convolutional Neural Network?
• Step 1 – Convolution Operation
• Step 1(b) – ReLU layer (Rectified Linear Unit)
• Step 2 – Pooling
• Step 3 – Flattening
• Step 4 – Full Connection
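The steps listed above can be traced on a tiny example. The following is a minimal sketch in plain Python, assuming a single-channel 5 x 5 input, one 2 x 2 filter, stride 1 with no padding, and 2 x 2 max pooling; the filter values, weights and function names are arbitrary choices for illustration and are not taken from the project's TensorFlow model.

```python
from typing import List

Matrix = List[List[float]]


def convolve2d(image: Matrix, kernel: Matrix) -> Matrix:
    """Step 1: slide the kernel over the image (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def relu(fmap: Matrix) -> Matrix:
    """Step 1(b): replace negative activations with zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap: Matrix, size: int = 2) -> Matrix:
    """Step 2: keep the maximum of each non-overlapping size x size window."""
    return [[max(fmap[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]


def flatten(fmap: Matrix) -> List[float]:
    """Step 3: unroll the 2D feature map into a vector."""
    return [v for row in fmap for v in row]


def dense(vector: List[float], weights: Matrix, bias: List[float]) -> List[float]:
    """Step 4: fully connected layer, one output per row of weights."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


image = [
    [1, 0, 2, 1, 0],
    [0, 1, 3, 0, 1],
    [2, 1, 0, 1, 2],
    [1, 0, 1, 3, 0],
    [0, 2, 1, 0, 1],
]
kernel = [[1, 0], [0, -1]]               # arbitrary example filter

fmap = relu(convolve2d(image, kernel))   # 4 x 4 feature map
pooled = max_pool(fmap)                  # 2 x 2 after pooling
features = flatten(pooled)               # vector of length 4
scores = dense(features, [[1, 1, 1, 1], [1, -1, 1, -1]], [0.0, 0.5])
print(features, scores)
```

In a real network the filter and the dense weights are learned during training, and a framework such as TensorFlow performs these operations on batches of multi-channel images.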
Fully Connected Layer of CNN model
Source: Created by Kirill Eremenko, Hadelin de Ponteves, SuperDataScience Team
Figure (pipeline diagram, labels only): STRING "POWER AI"; OCRGen.py; dataset generated using the Python Imaging Library (PIL); CSV file; fully convolutional network that transforms the input volume into a sequence of character predictions; fully connected layer; predicted output.
Deep OCR Architecture
• A fully convolutional network is presented that transforms the input volume into a sequence of character predictions. These character predictions can then be transformed into a string. The architecture of the network is shown in the figure below.
• Here N is the number of possible characters. In this example there are 63 possible characters: uppercase and lowercase letters, digits, and a blank character. The parenthesized values in the convolutional layers are the filter sizes and stride values, from top to bottom respectively. The values in the reshape layer are the reshaped dimensions.
• The input volume is a rectangular RGB image. The height and width of this volume are reduced across the convolutional layers using striding. The third dimension of this volume increases from 3 channels (RGB) to one channel for each possible character. Thus, the volume is transformed from an RGB image into a sequence of vectors. Applying argmax across the channel dimension gives a sequence of one-hot encoded vectors, which can be transformed into a string.
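The final decoding step described above can be illustrated as follows. This is a minimal sketch, assuming the network output for one image is a list of per-position score vectors of length N = 63. The class ordering, the use of a space as the blank character, and the stripping of trailing blanks are assumptions made for this example; the actual ordering used by the model is defined in the classes file cited as the source.

```python
import string
from typing import List, Sequence

# 26 uppercase + 26 lowercase + 10 digits + 1 blank = 63 classes.
# The ordering and the choice of " " as the blank symbol are assumptions.
BLANK = " "
CHARSET = string.ascii_uppercase + string.ascii_lowercase + string.digits + BLANK
assert len(CHARSET) == 63


def argmax(scores: Sequence[float]) -> int:
    """Index of the largest score in one per-position vector."""
    return max(range(len(scores)), key=scores.__getitem__)


def decode(prediction: Sequence[Sequence[float]]) -> str:
    """Map a (positions x N) score matrix to a string, dropping trailing blanks."""
    chars = [CHARSET[argmax(position)] for position in prediction]
    return "".join(chars).rstrip(BLANK)


def one_hot_scores(text: str, length: int) -> List[List[float]]:
    """Build a perfect prediction for `text`, padded with blanks to `length`."""
    padded = text.ljust(length, BLANK)
    return [[1.0 if c == target else 0.0 for c in CHARSET] for target in padded]


print(decode(one_hot_scores("Power8", 8)))   # round-trips the input text
```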
SOURCE
https://github.com/nicholastoddsmith/pythonml/blob/master/DeepOCR/TFModel/_classes.txt
Result
• To facilitate training this network, a dataset is generated using the Python Imaging Library (PIL). Random strings of alphanumeric characters are generated, and PIL is used to render an image for each random string. A CSV file is also generated, containing each file name and its associated random string. Some examples from the generated dataset are shown in the figure below.
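The dataset generation described above might look roughly like the sketch below. It is not the project's OCRGen.py: the file-name pattern, string lengths, image size, colors and function names are all assumptions chosen for illustration. Only the `render` function needs Pillow (the maintained fork of PIL), so it imports it lazily; the label and CSV parts use the standard library only.

```python
import csv
import io
import random
import string
from typing import List, Tuple

ALPHANUMERIC = string.ascii_letters + string.digits


def random_string(rng: random.Random, min_len: int = 1, max_len: int = 8) -> str:
    """Random alphanumeric string; the length range is an assumption."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(ALPHANUMERIC) for _ in range(length))


def make_labels(count: int, seed: int = 0) -> List[Tuple[str, str]]:
    """Return (file name, text) pairs; the file names are hypothetical."""
    rng = random.Random(seed)
    return [(f"{i:05d}.png", random_string(rng)) for i in range(count)]


def write_csv(rows: List[Tuple[str, str]], handle) -> None:
    """Write one 'file name,text' row per sample."""
    csv.writer(handle).writerows(rows)


def render(text: str, path: str, size: Tuple[int, int] = (128, 32)) -> None:
    """Draw `text` in black on a white RGB image and save it (requires Pillow)."""
    from PIL import Image, ImageDraw, ImageFont  # imported lazily on purpose

    image = Image.new("RGB", size, (255, 255, 255))
    ImageDraw.Draw(image).text((4, 8), text, fill=(0, 0, 0),
                               font=ImageFont.load_default())
    image.save(path)


rows = make_labels(3)
buffer = io.StringIO()
write_csv(rows, buffer)
print(buffer.getvalue())
# To also create the images: for name, text in rows: render(text, name)
```

Seeding the random generator makes the dataset reproducible, which helps when comparing training runs.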
Figure panels: Training Data, Generating Data, Test Data
• Training and cross-validation results are shown.
Training the Network
• To train the network, the CSV file is parsed and the images are loaded into memory. Each target value for the training data is a sequence of one-hot vectors. Thus the target is a 3D array whose three dimensions correspond to sample, character, and one-hot encoding respectively.
• Next, the neural network is constructed using the artificial neural network classifier (ANNC) class from TFANN. The architecture described above is expressed in a few lines of code using ANNC.
• Softmax cross-entropy is used as the loss function and is applied over the third dimension of the output.
• Fitting the network and performing predictions is simple using the ANNC class. The prediction is split into chunks using array_split from NumPy to prevent out-of-memory errors.
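The data layout, loss and chunked prediction described above can be illustrated without TFANN. The sketch below is plain Python and does not use the ANNC class or its API: `one_hot_targets`, `cross_entropy`, `split_evenly` and `predict_in_chunks` are hypothetical helpers, the toy 3-character alphabet is an assumption, and `split_evenly` is a stand-in that mirrors how numpy's `array_split` divides a sequence into nearly equal sections.

```python
import math


def one_hot_targets(texts, charset, length, blank=" "):
    """targets[sample][character][class]: a 3D nested list of 0.0 / 1.0."""
    index = {c: i for i, c in enumerate(charset)}
    return [[[1.0 if k == index[c] else 0.0 for k in range(len(charset))]
             for c in text.ljust(length, blank)]
            for text in texts]


def softmax(logits):
    peak = max(logits)                       # subtract the max for stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, targets):
    """Mean softmax cross-entropy; softmax runs over the third (class) dimension."""
    losses = []
    for sample_logits, sample_targets in zip(logits, targets):
        for pos_logits, pos_target in zip(sample_logits, sample_targets):
            probs = softmax(pos_logits)
            losses.append(-sum(t * math.log(max(p, 1e-12))
                               for t, p in zip(pos_target, probs) if t > 0))
    return sum(losses) / len(losses)


def split_evenly(items, sections):
    """Plain-Python stand-in for numpy.array_split(items, sections)."""
    base, extra = divmod(len(items), sections)
    chunks, start = [], 0
    for i in range(sections):
        stop = start + base + (1 if i < extra else 0)
        chunks.append(items[start:stop])
        start = stop
    return chunks


def predict_in_chunks(predict, samples, sections):
    """Run `predict` on one chunk at a time to bound peak memory use."""
    results = []
    for chunk in split_evenly(samples, sections):
        results.extend(predict(chunk))
    return results


charset = "ab "                              # toy 3-class alphabet incl. blank
targets = one_hot_targets(["ab", "b"], charset, length=2)
logits = [[[0.0, 0.0, 0.0] for _ in range(2)] for _ in range(2)]
print(cross_entropy(logits, targets))        # loss for uniform predictions
print(split_evenly(list(range(7)), 3))
```

Splitting the inputs only changes how many samples are held in memory at once; the concatenated predictions are the same as for a single call.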
System Details
• Distributed Deep Learning (DDL) environment on a POWER8 system with IBM PowerAI ML/DL frameworks.
• 40-thread POWER8, 256 RAM, 1 x K80 GPU
• PushToCompute for compiling POWER8 applications and deploying them directly to the Nimbix Cloud
