Welcome to UiPath Document Understanding! In this program, you will embark on a thrilling journey into the world of intelligent automation and discover how UiPath can revolutionize your document processing workflows. Document understanding is a critical aspect of many industries, and UiPath offers powerful tools and capabilities to extract valuable insights from unstructured data with remarkable accuracy and efficiency.
Through your attendance, you will gain a deep understanding of UiPath Document Understanding, learning how to leverage its features to automate data extraction, classification, and analysis from various document types. You'll explore techniques such as optical character recognition (OCR), machine learning, and AI-driven document processing, equipping you with the knowledge to tackle complex document challenges head-on. With real-life use cases, a practical demonstration, and best practices, you'll be empowered to unlock the full potential of UiPath Document Understanding and drive transformative outcomes for your organization. Get ready to embark on a transformative learning experience and unleash the power of UiPath in the world of document understanding!
Agenda:
1. Exploring UiPath Document Understanding
2. Deep Dive into Document Understanding Methods and Techniques
3. UiPath Document Understanding in Action
4. Best Practices for Implementing Document Understanding
5. Future Trends and Innovations
6. Q&A Session
Unlocking the Power of UiPath: A Journey into Document Understanding
1. 1
Unlocking the Power of UiPath:
A Journey into Document Understanding
UiPath St. Louis Chapter
2. 2
A little bit about me…
• Ian Wickline; Senior Trainer, Partner Enablement
• 20+ Year Information Technology Professional
• Adjunct Professor: Computer Science, Information Technology, and Business Management
• Extensive Software Development Background
• Mainframe Developer
• Modern Day Web Development
• Various Data Analytics Roles
• Most Recently: UiPath!
• Lead Automation Engineer
• Developed and implemented numerous UiPath-based production automations
• Solution Architecture for many more UiPath-based production automations
• Development of a 5-day training program
3. 3
Agenda
• Introduction to Document Understanding
• What is Document Understanding?
• Document Types
• Document Processing Methodologies
• Model-Based Versus Rule-Based Extraction
• Document Understanding: Part of an End-to-end Automation Process
• Document Understanding: Typical Workflow
• Classification and Extraction: Success Criteria
• Example: Classification and Extraction: Invoice Post-processing
• Document Understanding Architectures
4. 4
Introduction to Document Understanding
Get your documents processed intelligently.
Teach your robots to understand documents using rules, templates, or skills enhanced by Artificial Intelligence (AI) for data extraction and interpretation.
Drag and drop these capabilities directly into your Robotic Process Automation (RPA) workflows to combine AI and RPA.
5. 5
What Is Document Understanding?
[Diagram: Document Understanding at the overlap of Artificial Intelligence (AI), Document Processing, and Robotic Process Automation (RPA)]
Document Understanding is the ability to extract and interpret information and meaning from a wide range of documents. It emerges at the intersection of document processing, Artificial Intelligence (AI), and RPA.
6. 6
Document Types
Structured
▪ Required information found in the same place
▪ Fixed in format
▪ Examples: Forms, passports, licenses, and time sheets containing handwritten text, signatures, checkboxes
Semi-structured
▪ Repetitive information each time
▪ Found in fixed and variable document parts such as tables
▪ Examples: Invoices, receipts, purchase orders, medical bills, bank statements, utility bills
Unstructured
▪ No fixed format
▪ Examples: Contracts, agreements, emails, disease descriptions, drug prescriptions, news, voice scripts
7. 7
Document Processing Methodologies
Rule-based
▪ RegEx Based Extractor: Structured fields, mostly used for structured documents
▪ Form Extractor: Mostly structured documents, tables, checkboxes, handwriting, signatures
Model-based
▪ Forms AI: Most structured documents (forms)
▪ Machine Learning Extractor: Mostly semi-structured documents
Hybrid
▪ A combination of both rule-based and model-based extractors
▪ Mostly documents combining both structured and less structured formats
8. 8
Model-based Versus Rule-based Extraction
Rule-based extraction (template-based approach)
▪ Relies on rules or templates.
▪ Processes documents in a fixed format for structured data like forms or licenses.
▪ Has high accuracy for already known documents.
▪ Requires extra costs for the addition of templates or rules and ongoing maintenance.
▪ Is unable to work on unknown documents.
Model-based extraction (template-less approach)
▪ Relies on ML models.
▪ Processes less structured documents with varying layouts, such as invoices or receipts.
▪ Understands even the most obscure and complex documents.
▪ Requires pre-trained ML models with further retraining capabilities.
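To make the rule-based side of this comparison concrete, here is a minimal Python sketch of template-style extraction using regular expressions. It is not the UiPath RegEx Based Extractor; the field names, patterns, and sample texts are hypothetical and chosen only to show why rules work well on a known layout and fail on an unknown one.

```python
import re

# Hypothetical rule set: one regular expression per field, written for a
# single known layout. These patterns are illustrative assumptions.
RULES = {
    "invoice_number": re.compile(r"Invoice\s*(?:Number|No\.?)\s*[:#]?\s*([A-Z0-9-]+)"),
    "total": re.compile(r"Total\s*:?\s*\$?\s*([0-9]+(?:\.[0-9]{2})?)"),
}

def extract_with_rules(text):
    """Return the first match for each field, or None when the rule fails."""
    result = {}
    for field, pattern in RULES.items():
        match = pattern.search(text)
        result[field] = match.group(1) if match else None
    return result

# A layout the rules were written for.
known_layout = "Invoice No: INV-1001\nTotal: $250.00"
# The same information in a layout the rules have never seen.
unknown_layout = "Ref INV-1001, amount due 250.00 USD"

known = extract_with_rules(known_layout)
unknown = extract_with_rules(unknown_layout)
```

On the known layout both fields are found; on the unknown layout every rule misses, which is the gap a model-based extractor is meant to cover.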
9. 9
Document Understanding: Part of an End-to-end Automation Process
Document Understanding processes aren’t standalone. They are a part of bigger business processes to be automated.
Here is the architecture for an end-to-end business process involving Document Understanding.
“Upstream” automation, prior to Document Understanding
▪ Acts as dispatcher for the Document Understanding process
Document Understanding Process
▪ Runs one job for each target file
▪ Acts as dispatcher for the subsequent processes
“Downstream” automation, post Document Understanding
▪ Utilizes extracted data
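The three stages above can be sketched as a simple hand-off between queues. This is only an illustration in Python under stated assumptions: the in-memory queues, function names, and the placeholder job result are all hypothetical, and a real deployment would use an orchestrated queue rather than a `deque`.

```python
from collections import deque

def upstream_dispatch(file_paths, du_queue):
    """Upstream automation: create one work item per target file."""
    for path in file_paths:
        du_queue.append({"file": path})

def document_understanding_job(item):
    """Placeholder for one Document Understanding job on a single file.
    The returned type and fields are dummy values, not real extraction."""
    return {"file": item["file"], "document_type": "unknown", "fields": {}}

def run_du_process(du_queue, downstream_queue):
    """Run one job per queued file and dispatch each result downstream."""
    while du_queue:
        item = du_queue.popleft()
        downstream_queue.append(document_understanding_job(item))

def downstream_consume(downstream_queue):
    """Downstream automation: use the extracted data (here, list the files)."""
    return [result["file"] for result in downstream_queue]

du_queue, downstream_queue = deque(), deque()
upstream_dispatch(["a.pdf", "b.pdf"], du_queue)
run_du_process(du_queue, downstream_queue)
processed = downstream_consume(downstream_queue)
```

Keeping each stage behind its own queue means a failed file affects only its own job, not the whole batch.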
10. 10
Document Understanding Typical Workflow
1. Load taxonomy: defines document types and fields for processing.
2. Digitize: digitize documents using Optical Character Recognition (OCR) to make them machine-readable.
3. Classify: classify and split the files into document types.
4. Validate: validate classification results (human review).
5. Train: train classifiers based on the validated data.
6. Extract: extract information from the documents.
7. Validate: validate extraction results (human review).
8. Train: train extractors based on the validated data.
9. Export: export the extracted data for further usage.
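The order of the workflow stages can be sketched in Python as follows. Every function here is a hypothetical stand-in, not a UiPath activity, and the sketch assumes that human validation and training run only when an automatic check fails.

```python
# Hypothetical stand-ins for each stage; a real project would call OCR,
# classifier, and extractor components here.
def load_taxonomy():
    return {"invoice": ["InvoiceNumber", "Total"]}

def digitize(path):
    return f"text of {path}"  # placeholder for OCR output

def classify(text, taxonomy):
    return next(iter(taxonomy))  # placeholder: always the first document type

def extract(text, fields):
    return {name: None for name in fields}  # placeholder: nothing extracted

def process(path, classification_ok, extraction_ok):
    """Run the stages in order and record which ones were executed."""
    stages = ["load taxonomy"]
    taxonomy = load_taxonomy()
    stages.append("digitize")
    text = digitize(path)
    stages.append("classify")
    doc_type = classify(text, taxonomy)
    if not classification_ok(doc_type):
        # Assumption: review and retraining happen only on a failed check.
        stages += ["validate classification", "train classifiers"]
    stages.append("extract")
    data = extract(text, taxonomy[doc_type])
    if not extraction_ok(data):
        stages += ["validate extraction", "train extractors"]
    stages.append("export")
    return stages, {"document_type": doc_type, "fields": data}

# Classification passes its check; extraction does not.
stages, exported = process("a.pdf", lambda t: True, lambda d: False)
```

Passing the two checks in as functions keeps the business rules separate from the pipeline itself.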
11. 11
Classification and Extraction Success Criteria
▪ After both the classification and extraction steps, there are dedicated workflows wherein you can implement the business rule validation.
▪ Based on the outcome of the validation workflows, a decision is taken whether the document will be sent for human validation or not.
▪ If no logic is implemented, the documents will be sent, by default, for human validation.
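The routing decision described above can be expressed as a small Python function. This is a sketch, not the actual validation workflow: the function name, the rule signature (a callable that returns True when the document passes), and the sample rule are assumptions made for illustration.

```python
def needs_human_validation(document, business_rules=None):
    """Decide whether a document goes to human review.

    With no rules implemented, the document is sent for human validation
    by default. Otherwise it is sent only if at least one rule fails.
    """
    if not business_rules:
        return True
    return not all(rule(document) for rule in business_rules)

# Hypothetical business rule used only for this example.
def total_is_positive(document):
    return document.get("Total", 0) > 0

no_rules = needs_human_validation({"Total": 10})
rules_pass = needs_human_validation({"Total": 10}, [total_is_positive])
rules_fail = needs_human_validation({"Total": 0}, [total_is_positive])
```

Defaulting to review when no rules exist is the conservative choice: nothing skips a human until someone has written the logic that justifies it.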
12. 12
Invoice Post-processing
These rules should NOT be used as-is, except for demo purposes. For a real implementation, post-processing and validation should be tailored to the specifics of the business process.
For the invoice document type, a standard workflow is available that validates six business rules.
▪ Verify that all mandatory fields and columns are extracted.
▪ Verify that all table rows match the rule: Quantity * Unit Price = Line Amount.
▪ Verify that the sum of all Line Amounts = Net Amount.
▪ Verify that the sum of the Net Amount and all fields defined as SubTotalAdditions = Total.
▪ Verify the extraction confidence of all defined ‘ConfidenceFields’ against their individual confidence thresholds.
▪ Verify the extraction confidence of all the other fields against the ‘other-Confidence’ threshold.
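The six rules above can be sketched in Python as one validation function. This is not the standard workflow itself: the data shapes, configuration keys, and field names such as `NetAmount` and `Tax` are assumptions for illustration, and the sketch assumes the mandatory lists include every field and column used by the arithmetic checks.

```python
from decimal import Decimal

def validate_invoice(fields, rows, config):
    """Return a list of failed-rule messages; an empty list means all passed."""
    failures = []
    # Rule 1: all mandatory fields and table columns are extracted.
    for name in config["mandatory_fields"]:
        if name not in fields:
            failures.append(f"missing field {name}")
    for index, row in enumerate(rows):
        for column in config["mandatory_columns"]:
            if column not in row:
                failures.append(f"row {index}: missing column {column}")
    if failures:
        return failures  # the arithmetic checks need complete data
    # Rule 2: Quantity * Unit Price = Line Amount for every row.
    for index, row in enumerate(rows):
        if row["Quantity"] * row["UnitPrice"] != row["LineAmount"]:
            failures.append(f"row {index}: line amount mismatch")
    # Rule 3: sum of all Line Amounts = Net Amount.
    net = fields["NetAmount"]["value"]
    if sum(row["LineAmount"] for row in rows) != net:
        failures.append("line amounts do not sum to net amount")
    # Rule 4: Net Amount + SubTotalAdditions = Total.
    additions = sum(fields[name]["value"]
                    for name in config["sub_total_additions"] if name in fields)
    if net + additions != fields["Total"]["value"]:
        failures.append("net amount plus additions does not equal total")
    # Rules 5 and 6: per-field threshold if defined, else the shared one.
    for name, field in fields.items():
        threshold = config["confidence_fields"].get(name, config["other_confidence"])
        if field["confidence"] < threshold:
            failures.append(f"low confidence for {name}")
    return failures

# Hypothetical extraction result and thresholds.
fields = {
    "NetAmount": {"value": Decimal("30.00"), "confidence": 0.97},
    "Tax": {"value": Decimal("3.00"), "confidence": 0.91},
    "Total": {"value": Decimal("33.00"), "confidence": 0.99},
}
rows = [
    {"Quantity": Decimal("2"), "UnitPrice": Decimal("10.00"), "LineAmount": Decimal("20.00")},
    {"Quantity": Decimal("1"), "UnitPrice": Decimal("10.00"), "LineAmount": Decimal("10.00")},
]
config = {
    "mandatory_fields": ["NetAmount", "Total"],
    "mandatory_columns": ["Quantity", "UnitPrice", "LineAmount"],
    "sub_total_additions": ["Tax"],
    "confidence_fields": {"Total": 0.95},
    "other_confidence": 0.80,
}
failures = validate_invoice(fields, rows, config)
```

`Decimal` is used instead of `float` so that the equality checks on monetary amounts are exact. In practice a tolerance or currency-aware rounding may be needed, in line with the slide's warning that these rules should be tailored to the business process.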
13. 13
Document Understanding Architectures
Currently, Document Understanding is available both in the cloud and on-premises.
In Cloud
▪ Using the Cloud Tenant
On-premises
▪ Online: Using Public Services
▪ Air-gapped
Hybrid
14. 14
Review/Recap
• Introduction to Document Understanding
• What is Document Understanding?
• Document Types
• Document Processing Methodologies
• Model-Based Versus Rule-Based Extraction
• Document Understanding: Part of an End-to-end Automation Process
• Document Understanding: Typical Workflow
• Classification and Extraction: Success Criteria
• Example: Classification and Extraction: Invoice Post-processing
• Document Understanding Architectures