Evolve 19 | Paul Legan | Going Beyond Metadata: Extracting Meaningful Information from Digital Assets Automatically in AEM

Learn how to reduce manual metadata tasks and find assets immediately in AEM Assets.

Published in: Technology

  1. #evolve19 GOING BEYOND METADATA: EXTRACTING MEANINGFUL INFORMATION FROM YOUR DIGITAL ASSETS | PAUL LEGAN | August 7th, 2019
  2. DIGITAL ASSET MANAGEMENT | REALLY, IT MAKES THIS PROCESS EASIER. • Discovery: Find an existing asset or set of asset artifacts • Creation: Alter an existing or create a new creative asset • Automation: Generate variations for different audiences • Publication: Publish this asset for an appropriate duration
  3. DIGITAL ASSET MANAGEMENT | LET’S START WITH THE BENEFITS • Supports workflows that allow for content modification • Reduces costs of asset creation and distribution • Automates tedious tasks like thumbnail generation • Increases marketing throughput for content variations and personalization • Increases creative autonomy
  4. IF IT’S SO GREAT, WHY ISN’T IT EASY? | WE CAN ALL PROBABLY NAME A FEW REASONS.
  5. ISSUE #1: ORGANIZATION | NAMING CONVENTIONS AND FOLDER STRUCTURE • “Let’s all use in-progress folders.” → “We can delete this later.”
  6. ISSUE #2: INCONSISTENCY | TRAINING + USAGE GUIDELINES • No validation • Poor Naming Conventions • Number Duplication • Unused Fields
  7. ISSUE #3: MYOPIA | THINK BEYOND THE CURRENT USE CASE • Tag Redundancy • Folder Mismatches • No Scheduled Cleanup
  8. MULTI-TOOL OF CHOICE: METADATA | WE CAN ALL PROBABLY NAME A FEW REASONS.
  9. THE GENRE PROBLEM | ID3, WINAMP, AND ITUNES – UNITE! (for all of you who totally legally purchased music 20 years ago)
  10. THE HUMBLE SCHEMA | YOUR ASSET DATA LAYER
  11. INGESTION PROCESS | ASSET PROCESSING AT SCALE • Define a Schema (Superset of Properties) • Define Ingestion Process (IPTC, XMP, Validation) • Import Assets (Auto-Tag, Pre-Fill)
  12. INGESTION PROCESS | ASSET PROCESSING AT SCALE • Define a Schema (Superset of Properties) • Define Ingestion Process (IPTC, XMP, Validation) • Import Assets (Auto-Tag, Pre-Fill) • Metadata Profiles (Sensible Defaults) • Smart Organization (Sort, Filter, Variants) • Smart Tags (Auto-Tag, Pre-Fill)
  13. INGESTION PROCESS | ASSET PROCESSING AT SCALE • Define a Schema (Superset of Properties) • Define Ingestion Process (IPTC, XMP, Validation) • Import Assets (Auto-Tag, Pre-Fill) • Metadata Profiles (Sensible Defaults) • Smart Organization (Sort, Filter, Variants) • Smart Tags (Auto-Tag, Pre-Fill)
  14. METADATA PROFILES | SENSIBLE METADATA DEFAULTS • Level #1 Automation • Helps alleviate tedious work • Applying global tags • Complementing IPTC/XMP data embedded in the binaries • Photoshoot Location • Photographer • Type of Asset • Digital Rights Management • Easy to apply at the folder or file type level
  15. SMART TAGS | ADOBE I/O SMART CONTENT SERVICE • Auto-tag based on object recognition • Can be trained, and training can be run on a schedule
  16. SO… HOW CAN WE GO FURTHER? | LET’S SAY YOU WANT MORE AUTOMATION.
  17. AMAZON TEXTRACT | AN INTRODUCTION • Uses Optical Character Recognition (OCR) to automatically detect printed text and numbers in a scan or rendering of a document. • Enables you to detect key-value pairs in documents to retain the inherent context of the document without any manual intervention. • Returns a confidence score for everything it identifies so you can make informed decisions about how you want to use the results.
  18. LOOKING INSIDE WITH OCR | JUDGE ASSETS BY MORE THAN THEIR COVER
  19. LOOKING INSIDE WITH OCR | JUDGE ASSETS BY MORE THAN THEIR COVER →
  20. STRUCTURED DATA | EMBEDDED DOCUMENT INFORMATION
  21. STRUCTURED DATA | EMBEDDED DOCUMENT INFORMATION • driver-data.pdf
  22. HOW IT WORKS | TECHNICAL PROCESS • Image Uploaded via API (S3 or Base64 Bytes) → Service Analyzes Input (Sync or Async) → ML Response Sent (JSON Payload) • Request: { "Document": { "Bytes": blob, "S3Object": { "Bucket": "string", "Name": "string", "Version": "string" } } } • Sync: DetectDocumentText(), AnalyzeDocument() • Async: StartDocumentTextDetection(), GetDocumentTextDetection() • Response: Blocks, each with Geometry, Bounding Box, Confidence, Text, Block Type, ID
  23. HOW IT FITS IN AEM | TECHNICAL PROCESS • Image Uploaded via API (S3 or Base64 Bytes) → Service Analyzes Input (Sync or Async) → ML Response Sent (JSON Payload) • AEM Workflow: XML Binary Writeback (If applicable), Property Validation (Notification, Banner), Properties Saved to JCR (JSON Payload)
  24. HOW IT FITS IN AEM | TECHNICAL PROCESS • Image Uploaded via API (S3 or Base64 Bytes) → Service Analyzes Input (Sync or Async) → ML Response Sent (JSON Payload) • AEM Workflow: XML Binary Writeback (If applicable), Property Validation (Notification, Banner), Properties Saved to JCR (JSON Payload) • Also: 3rd-Party DB (Search), Amazon Comprehend (NLP), Amazon Translate (Translation)
  25. DEMO!
  26. HOW DO THESE TOOLS HELP? | MORE THAN YOU THINK.
  27. BENEFITS & IMPACT | HIGHLIGHTS • -75% / -60%: Less Effort By Humans Per Ingested Asset / Reduction in Calls to IT to Deliver Assets • +80%: User Adoption YoY Across Departments • Reduces Margin of Error: Tedious Data Entry Increases the Risk of Human Error • Better Discovery: Reduces the Time to Find Assets and Lessens the Dependency on IT • Enterprise Scale: A Scalable System is a Usable System as Adoption Increases
  28. FUTURE POSSIBILITIES | JUST THINKING OUT LOUD • Process Invoices & Sales Receipts • Normalize Financial Document Data • Automatically Redact PII from a Claim
  29. MORE INFORMATION | GETTING STARTED & BEYOND • Links to Relevant Resources: - https://aws.amazon.com/textract/ - https://github.com/aws-samples/amazon-textract-code-samples/ - https://github.com/aws-samples/amazon-textract-serverless-large-scale-document-processing
  30. THANK YOU!
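
The "How it works" slide describes a request that points at a document, a service call, and a JSON response made of blocks that each carry a confidence score. The sketch below, in Python, illustrates only the two ends of that flow: building the request body for a document in S3 and keeping the detected lines whose confidence clears a threshold. The function names, the bucket and file names, the 90.0 threshold, and the sample response are all assumptions made for illustration; the sample is hand-written in the shape of a Textract response and is not real service output.

```python
from typing import Any


def s3_document_request(bucket: str, name: str) -> dict[str, Any]:
    """Build a request body in the shape shown on the slide for a file in S3."""
    return {"Document": {"S3Object": {"Bucket": bucket, "Name": name}}}


def confident_lines(response: dict[str, Any], min_confidence: float = 90.0) -> list[str]:
    """Return the text of LINE blocks at or above the confidence threshold."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
        and block.get("Confidence", 0.0) >= min_confidence
    ]


# Hand-written sample in the shape of a response (assumption, not real output).
sample_response = {
    "Blocks": [
        {"Id": "p1", "BlockType": "PAGE"},
        {"Id": "l1", "BlockType": "LINE", "Text": "DRIVER DATA", "Confidence": 99.2},
        {"Id": "l2", "BlockType": "LINE", "Text": "Lap 12 of 57", "Confidence": 71.5},
        {"Id": "w1", "BlockType": "WORD", "Text": "DRIVER", "Confidence": 99.4},
    ]
}

# Hypothetical bucket and object names.
request = s3_document_request("example-bucket", "scan.png")
lines = confident_lines(sample_response)
print(request)
print(lines)
```

With boto3 installed and credentials configured, the request dictionary could presumably be passed to the Textract client as `client.detect_document_text(**request)`; that call is left out here so the sketch runs without network access.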

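The deck also mentions detecting key-value pairs and saving validated properties to the JCR inside an AEM workflow. As a rough sketch of the step in between, the Python below walks KEY_VALUE_SET blocks, joins their child words into key and value strings, drops low-confidence keys, and normalizes the keys into property names. The `ocr_` prefix, the 80.0 threshold, the helper names, and the sample blocks are hypothetical; how the resulting dictionary is written to the repository is outside this sketch and is not something the slides specify.

```python
import re
from typing import Any

Block = dict[str, Any]


def related_ids(block: Block, rel_type: str) -> list[str]:
    """Collect the block IDs referenced by relationships of one type."""
    return [i for rel in block.get("Relationships", [])
            if rel["Type"] == rel_type for i in rel["Ids"]]


def block_text(block: Block, by_id: dict[str, Block]) -> str:
    """Join the text of a block's child WORD blocks."""
    children = (by_id[i] for i in related_ids(block, "CHILD"))
    return " ".join(c["Text"] for c in children if c["BlockType"] == "WORD")


def key_value_pairs(blocks: list[Block], min_confidence: float = 80.0) -> dict[str, str]:
    """Pair each sufficiently confident KEY block with its VALUE text."""
    by_id = {b["Id"]: b for b in blocks}
    pairs: dict[str, str] = {}
    for b in blocks:
        if b["BlockType"] != "KEY_VALUE_SET" or "KEY" not in b.get("EntityTypes", []):
            continue
        if b.get("Confidence", 0.0) < min_confidence:
            continue  # simple validation: skip keys the service was unsure about
        values = (block_text(by_id[i], by_id) for i in related_ids(b, "VALUE"))
        pairs[block_text(b, by_id)] = " ".join(values)
    return pairs


def to_properties(pairs: dict[str, str], prefix: str = "ocr_") -> dict[str, str]:
    """Normalize keys into property names (hypothetical prefix), skipping empties."""
    props: dict[str, str] = {}
    for key, value in pairs.items():
        name = re.sub(r"[^a-z0-9]+", "_", key.lower()).strip("_")
        if name and value:
            props[prefix + name] = value
    return props


def word(block_id: str, text: str) -> Block:
    return {"Id": block_id, "BlockType": "WORD", "Text": text}


def kv(block_id: str, kind: str, conf: float, **rels: list[str]) -> Block:
    return {"Id": block_id, "BlockType": "KEY_VALUE_SET", "EntityTypes": [kind],
            "Confidence": conf,
            "Relationships": [{"Type": t, "Ids": ids} for t, ids in rels.items()]}


# Hand-written sample blocks (assumption, not real service output).
sample_blocks = [
    kv("k1", "KEY", 97.0, VALUE=["v1"], CHILD=["w1"]),
    kv("v1", "VALUE", 96.0, CHILD=["w2", "w3"]),
    kv("k2", "KEY", 55.0, VALUE=["v2"], CHILD=["w4"]),
    kv("v2", "VALUE", 54.0, CHILD=["w5"]),
    kv("k3", "KEY", 91.0, VALUE=["v3"], CHILD=["w6", "w7"]),
    kv("v3", "VALUE", 90.0, CHILD=["w8"]),
    word("w1", "Name:"), word("w2", "Jane"), word("w3", "Doe"),
    word("w4", "Team"), word("w5", "Unknown"),
    word("w6", "License"), word("w7", "No."), word("w8", "D1234567"),
]

pairs = key_value_pairs(sample_blocks)
properties = to_properties(pairs)
print(properties)
```

Filtering on the key's confidence before writing anything is one possible reading of the "Property Validation" step; a real workflow might instead keep low-confidence values and flag them for human review.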