
Evolve 19 | Paul Legan | Going Beyond Metadata: Extracting Meaningful Information from Digital Assets Automatically in AEM

Learn how to reduce manual metadata tasks and find assets immediately in AEM Assets.

Published in: Technology

  1. GOING BEYOND METADATA: EXTRACTING MEANINGFUL INFORMATION FROM YOUR DIGITAL ASSETS. Paul Legan, August 7th, 2019. #evolve19
  2. DIGITAL ASSET MANAGEMENT: REALLY, IT MAKES THIS PROCESS EASIER. Discovery: find an existing asset or set of asset artifacts. Creation: alter an existing or create a new creative asset. Automation: generate variations for different audiences. Publication: publish this asset for an appropriate duration.
  3. DIGITAL ASSET MANAGEMENT: LET’S START WITH THE BENEFITS. • Supports workflows that allow for content modification • Reduces costs of asset creation and distribution • Automates tedious tasks like thumbnail generation • Increases marketing throughput for content variations and personalization • Increases creative autonomy
  4. IF IT’S SO GREAT, WHY ISN’T IT EASY? WE CAN ALL PROBABLY NAME A FEW REASONS.
  5. ISSUE #1: ORGANIZATION. NAMING CONVENTIONS AND FOLDER STRUCTURE. “Let’s all use in-progress folders.” → “We can delete this later.”
  6. ISSUE #2: INCONSISTENCY. TRAINING + USAGE GUIDELINES. No Validation; Poor Naming Conventions; Number Duplication; Unused Fields.
  7. ISSUE #3: MYOPIA. THINK BEYOND THE CURRENT USE CASE. Tag Redundancy; Folder Mismatches; No Scheduled Cleanup.
  8. MULTI-TOOL OF CHOICE: METADATA. WE CAN ALL PROBABLY NAME A FEW REASONS.
  9. THE GENRE PROBLEM. ID3, WINAMP, AND ITUNES – UNITE! (for all of you who totally legally purchased music 20 years ago)
  10. THE HUMBLE SCHEMA. YOUR ASSET DATA LAYER.
  11. INGESTION PROCESS: ASSET PROCESSING AT SCALE. Define a Schema (Superset of Properties); Define Ingestion Process (IPTC, XMP, Validation); Import Assets (Auto-Tag, Pre-Fill).
  12. INGESTION PROCESS: ASSET PROCESSING AT SCALE. Define a Schema (Superset of Properties); Define Ingestion Process (IPTC, XMP, Validation); Import Assets (Auto-Tag, Pre-Fill); Metadata Profiles (Sensible Defaults); Smart Organization (Sort, Filter, Variants); Smart Tags (Auto-Tag, Pre-Fill).
  13. INGESTION PROCESS: ASSET PROCESSING AT SCALE. Define a Schema (Superset of Properties); Define Ingestion Process (IPTC, XMP, Validation); Import Assets (Auto-Tag, Pre-Fill); Metadata Profiles (Sensible Defaults); Smart Organization (Sort, Filter, Variants); Smart Tags (Auto-Tag, Pre-Fill).
  14. METADATA PROFILES: SENSIBLE METADATA DEFAULTS. • Level #1 Automation • Helps alleviate tedious work • Applying global tags • Complementing IPTC/XMP data embedded in the binaries: Photoshoot Location, Photographer, Type of Asset, Digital Rights Management • Easy to apply at the folder or file type level
  15. SMART TAGS: ADOBE I/O SMART CONTENT SERVICE. Can be trained, and training can be run on a schedule. Auto-tags based on object recognition.
  16. SO… HOW CAN WE GO FURTHER? LET’S SAY YOU WANT MORE AUTOMATION.
  17. AMAZON TEXTRACT: AN INTRODUCTION. Uses Optical Character Recognition (OCR) to automatically detect printed text and numbers in a scan or rendering of a document. Enables you to detect key-value pairs in documents to retain the inherent context of the document without any manual intervention. Returns a confidence score for everything it identifies so you can make informed decisions about how you want to use the results.
  18. LOOKING INSIDE WITH OCR. JUDGE ASSETS BY MORE THAN THEIR COVER.
  19. LOOKING INSIDE WITH OCR. JUDGE ASSETS BY MORE THAN THEIR COVER. →
  20. STRUCTURED DATA. EMBEDDED DOCUMENT INFORMATION.
  21. STRUCTURED DATA. EMBEDDED DOCUMENT INFORMATION. driver-data.pdf
  22. HOW IT WORKS: TECHNICAL PROCESS. Image Uploaded via API (S3 or Base64 Bytes) → Service Analyzes Input (Sync or Async) → ML Response Sent (JSON Payload). Request: { "Document": { "Bytes": blob, "S3Object": { "Bucket": "string", "Name": "string", "Version": "string" } } }. Sync: DetectDocumentText(), AnalyzeDocument(). Async: StartDocumentTextDetection(), GetDocumentTextDetection(). Response [Blocks]: Geometry, Bounding Box, Confidence, Text, Block Type, ID.
  23. HOW IT FITS IN AEM: TECHNICAL PROCESS. Image Uploaded via API (S3 or Base64 Bytes) → Service Analyzes Input (Sync or Async) → ML Response Sent (JSON Payload) → AEM Workflow: XML Binary Writeback (If applicable) → Property Validation (Notification, Banner) → Properties Saved to JCR (JSON Payload).
  24. HOW IT FITS IN AEM: TECHNICAL PROCESS. Image Uploaded via API (S3 or Base64 Bytes) → Service Analyzes Input (Sync or Async) → ML Response Sent (JSON Payload) → AEM Workflow: XML Binary Writeback (If applicable) → Property Validation (Notification, Banner) → Properties Saved to JCR (JSON Payload) → 3rd-Party DB (Search) → Amazon Comprehend (NLP) → Amazon Translate (Translation).
  25. DEMO!
  26. HOW DO THESE TOOLS HELP? MORE THAN YOU THINK.
  27. BENEFITS & IMPACT: HIGHLIGHTS. -75%: Less Effort By Humans Per Ingested Asset. -60%: Reduction in Calls to IT to Deliver Assets. +80%: User Adoption YoY Across Departments. Reduces Margin of Error: Tedious Data Entry Increases the Risk of Human Error. Better Discovery: Reduces the Time to Find Assets and Lessens the Dependency on IT. Enterprise Scale: A Scalable System is a Usable System as Adoption Increases.
  28. FUTURE POSSIBILITIES: JUST THINKING OUT LOUD. Process Invoices & Sales Receipts; Normalize Financial Document Data; Automatically Redact PII from a Claim.
  29. MORE INFORMATION: GETTING STARTED & BEYOND. Links to Relevant Resources: - https://aws.amazon.com/textract/ - https://github.com/aws-samples/amazon-textract-code-samples/ - https://github.com/aws-samples/amazon-textract-serverless-large-scale-document-processing
  30. THANK YOU!
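
The "How it works" slide describes sending a document from S3 to the synchronous DetectDocumentText operation and getting back a list of blocks, each with a block type, text, and confidence score. A minimal Python sketch of that round trip might look like the following. It assumes the boto3 SDK and configured AWS credentials for the live call; the helper names, the 90.0 threshold, and the sample response are illustrative assumptions and do not come from the talk.

```python
from typing import Any


def detect_text_in_s3(bucket: str, name: str) -> dict[str, Any]:
    """Call the synchronous DetectDocumentText operation on an S3 object."""
    # Imported lazily so the parsing helper below runs without AWS access.
    import boto3

    client = boto3.client("textract")
    return client.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": name}}
    )


def confident_lines(response: dict[str, Any], min_confidence: float = 90.0) -> list[str]:
    """Return the text of LINE blocks whose confidence meets the threshold."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
        and block.get("Confidence", 0.0) >= min_confidence
    ]


# Hand-written sample in the shape of a Textract response (not real output).
SAMPLE_RESPONSE = {
    "Blocks": [
        {"BlockType": "PAGE", "Id": "p1"},
        {"BlockType": "LINE", "Id": "l1", "Text": "DRIVER RECORD", "Confidence": 99.4},
        {"BlockType": "LINE", "Id": "l2", "Text": "Name: Jane Doe", "Confidence": 97.8},
        {"BlockType": "LINE", "Id": "l3", "Text": "Lic#: X1Z3", "Confidence": 62.5},
        {"BlockType": "WORD", "Id": "w1", "Text": "DRIVER", "Confidence": 99.5},
    ]
}

if __name__ == "__main__":
    for line in confident_lines(SAMPLE_RESPONSE):
        print(line)
```

Filtering on the confidence score is one way to act on the point from the Textract introduction slide: low-confidence text can be held back for human review instead of being written straight into asset metadata.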

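
The "How it fits in AEM" slides put a property validation step between the ML response and the properties saved to the JCR. As a rough sketch of that step, the Python below maps extracted key-value pairs onto a metadata schema and separates confident values from ones that should trigger a notification. The property names (doc:driverName, doc:licenseNumber), the ExtractedField type, and the threshold are all hypothetical; a real AEM workflow step would typically be written in Java against the JCR API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedField:
    """One key-value pair detected in a document, with its confidence score."""
    key: str
    value: str
    confidence: float


# Hypothetical schema: normalized document labels -> asset metadata properties.
SCHEMA = {
    "name": "doc:driverName",
    "license number": "doc:licenseNumber",
}


def to_asset_properties(
    fields: list[ExtractedField],
    schema: dict[str, str],
    min_confidence: float = 90.0,
) -> tuple[dict[str, str], list[str]]:
    """Split extracted fields into properties to save and properties to review."""
    accepted: dict[str, str] = {}
    needs_review: list[str] = []
    for item in fields:
        # Normalize labels such as "Name:" so they match the schema keys.
        prop = schema.get(item.key.strip().lower().rstrip(":"))
        if prop is None:
            continue  # Not part of the schema, so it is ignored.
        if item.confidence >= min_confidence:
            accepted[prop] = item.value.strip()
        else:
            needs_review.append(prop)
    return accepted, needs_review


if __name__ == "__main__":
    sample = [
        ExtractedField("Name:", " Jane Doe ", 98.2),
        ExtractedField("License Number:", "X123", 71.0),
        ExtractedField("Notes", "n/a", 99.0),
    ]
    print(to_asset_properties(sample, SCHEMA))
```

Keeping the schema as the single source of allowed properties matches the deck's earlier advice to define a schema first, then let ingestion fill it.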