© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Better Analytics Through Natural Language Processing
Nino Bice
Product Manager
AWS AI
AIM405
Ben Snively
Solutions Architect
AWS AI
Analytics with all data
Amazon Comprehend
Discover insights and relationships in text
Entities
Key Phrases
Language
Sentiment
Syntax
Grouping
Custom classification
NEW
Automated training
Text | Label
I am calling about my credit card | Loyalty
My points aren’t being applied correctly | Loyalty
I really need to shut the service down | Account
We are moving and we don’t want this anymore | Account
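A minimal sketch of how labeled examples like these could be packaged as a training file. It assumes the two-column CSV layout used for multi-class custom classification (label first, then text, no header row); the helper name `build_training_csv` is hypothetical, and the rows are copied from the slide.

```python
import csv
import io

# Labeled examples copied from the slide: (text, label).
EXAMPLES = [
    ("I am calling about my credit card", "Loyalty"),
    ("My points aren’t being applied correctly", "Loyalty"),
    ("I really need to shut the service down", "Account"),
    ("We are moving and we don’t want this anymore", "Account"),
]


def build_training_csv(examples):
    """Return CSV text with one 'label,text' row per example and no header.

    The label-first column order is an assumption about the expected
    training format; check the service documentation before relying on it.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for text, label in examples:
        writer.writerow([label, text])
    return buffer.getvalue()


training_csv = build_training_csv(EXAMPLES)
print(training_csv)
```

Using `csv.writer` rather than string concatenation means any text containing commas or quotes is escaped correctly.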
Custom entities
Person
Organization
Part
Account_Action
NEW
Automated annotation and training
Entity: Account_Issues
Cancel order
Large bill
Escalate to manager
No connection
Amazon Comprehend structured output
PERSON | ORG | LOCATION | Policy | Make
Dave | Amazon | Seattle | 44-331 | Zen
Steve | Facebook | San Francisco | 55-433 | Theta
Pre-trained entities: PERSON, ORG, LOCATION
Custom entities: Policy, Make
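A rough sketch of how entity detections could be pivoted into one structured row per document, as in the table above. The sample dictionary is hand-written to resemble the `Entities` list returned by entity detection (trimmed to `Type`, `Text`, and `Score`); its values are illustrative, not real service output, and the helper name `entities_to_row` and the `min_score` cutoff are assumptions.

```python
from collections import defaultdict

# Hand-written sample shaped like an entity-detection response.
# Values are illustrative only.
SAMPLE_RESPONSE = {
    "Entities": [
        {"Type": "PERSON", "Text": "Dave", "Score": 0.99},
        {"Type": "ORGANIZATION", "Text": "Amazon", "Score": 0.98},
        {"Type": "LOCATION", "Text": "Seattle", "Score": 0.97},
        {"Type": "OTHER", "Text": "thing", "Score": 0.30},
    ]
}


def entities_to_row(response, min_score=0.5):
    """Group entity texts by type, dropping low-confidence detections."""
    row = defaultdict(list)
    for entity in response["Entities"]:
        if entity["Score"] >= min_score:
            row[entity["Type"]].append(entity["Text"])
    return dict(row)


row = entities_to_row(SAMPLE_RESPONSE)
print(row)
```

Each key of `row` can then become a column in a warehouse table, with custom entity types added alongside the pre-trained ones.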
Removing ML/NLP complexity
Dmytro Dolgopolov
Senior Director Content Services and Analytics
FINRA
Matt Cardillo
Senior Director Application Platforms
FINRA
Up to 135 billion events per day
30+ PB of storage
Monitoring 99% of equities and 65% of options in the US
Reconstructing trillions of market nodes and edges
Investor protection
Market integrity
Business challenges: Matter Content
• High volumes of unstructured content
• ~1M documents each year from stockbrokers and investors to be reviewed
• Documents contain valuable information, but mining it is a challenge
• Overlooking important information
• Risk-based approach itself presents risks
• Numerous features of interest
• Finding information about the Who, What, Where, When and How is labor intensive
Business challenges: Matter Content
John Doe alleged he did not understand how the two fixed annuities sold to him by William Alex Smith worked and did not understand the impact. At that time, William Smith was employed by Company, Inc.
Investor: John Doe
Broker: William Alex Smith, ID 12345
Our solution
Commix Command Center for People and Organizations
Our solution: Project 3CPO
Architecture
<bucket>/input
Data Preparation
Entity Matching
Amazon Comprehend
<bucket>/matched
Dataflow
Feature Sets (collected from document)
Name Search (millions)
Proprietary Reference Data
Comparator
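The name search and comparator steps can be illustrated with a small sketch. This is not FINRA's implementation: the reference records, the helper names `normalize`, `compare`, and `best_match`, and the 0.8 threshold are all hypothetical, and Python's `difflib.SequenceMatcher` stands in for whatever comparator the real system uses.

```python
from difflib import SequenceMatcher

# Hypothetical reference data: ID -> registered name.
REFERENCE = {
    "12345": "William Alex Smith",
    "67890": "Wilma Smythe",
    "24680": "John Q. Public",
}


def normalize(name):
    """Lower-case, drop periods, and collapse runs of whitespace."""
    return " ".join(name.lower().replace(".", "").split())


def compare(extracted, candidate):
    """Similarity in [0, 1] between an extracted name and a reference name."""
    return SequenceMatcher(None, normalize(extracted), normalize(candidate)).ratio()


def best_match(extracted, reference, threshold=0.8):
    """Return (reference ID, score), or (None, score) if below threshold."""
    scored = [(compare(extracted, name), ref_id) for ref_id, name in reference.items()]
    score, ref_id = max(scored)
    return (ref_id, score) if score >= threshold else (None, score)


match_id, match_score = best_match("William Smith", REFERENCE)
print(match_id, round(match_score, 2))
```

A linear scan like this would not scale to millions of reference names; a production system would narrow the candidates with an index before running the comparator.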
Benefits
• Benefits to FINRA technology
  • Extract individuals and organizations
  • Match extracted entities to FINRA records
  • Flag individuals of interest
  • Enable higher level analytics
• Benefits to our customers
  • Bring information to the user instead of the user mining for information manually
  • Flag individuals of interest to find bad actors that could otherwise go undetected
  • Reclaim hours toward high value, strategic efforts over repetitive tactical document reviews
Net Result - Regulatory reviews made
Amazon Comprehend use cases
Content personalization: Customers are using Amazon Comprehend NLP output to find related documents based on entities, phrases, or topic similarities for trend analysis, and to drive content personalization and recommendations.
Semantic search: Customers are using Amazon Comprehend to index entities for boosting and ranking search results.
Intelligent data warehouse: Customers are using Amazon Comprehend to query unstructured data in relational databases, processing the data within the data lake (S3) and then inserting it back into the data warehouse.
Social analytics: Customers are using Amazon Comprehend to ingest, process, and analyze trends from entities and sentiment in social media posts across Twitter and Facebook.
Information management: Customers are using Amazon Comprehend for indexing and finding related content for enterprise information management and various internal business processes, including compliance and IT.
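As an illustration of the social analytics use case, the sketch below counts sentiment labels per entity across a set of posts. The records are hand-written stand-ins for posts that have already been through entity and sentiment detection; the entity names, the record layout, and the helper name `sentiment_by_entity` are assumptions made for this example.

```python
from collections import Counter, defaultdict

# Illustrative records: one per post, after entity and sentiment detection.
# Not real service output.
POSTS = [
    {"entities": ["Acme"], "sentiment": "POSITIVE"},
    {"entities": ["Acme", "Globex"], "sentiment": "NEGATIVE"},
    {"entities": ["Globex"], "sentiment": "NEGATIVE"},
    {"entities": ["Acme"], "sentiment": "POSITIVE"},
]


def sentiment_by_entity(posts):
    """Count how often each sentiment label co-occurs with each entity."""
    counts = defaultdict(Counter)
    for post in posts:
        # set() so an entity mentioned twice in one post is counted once.
        for entity in set(post["entities"]):
            counts[entity][post["sentiment"]] += 1
    return {entity: dict(counter) for entity, counter in counts.items()}


trend = sentiment_by_entity(POSTS)
print(trend)
```

Grouping the same counts by day or week would turn this into a simple trend line per entity.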
AWS text analytics workload
Amazon Kinesis
Amazon ES
Amazon Redshift
Amazon EMR
• Boosting
• Rich Filtering
• Grouping, Trends
• Joining, Correlating
• Clustering
• Graph, Search
• Near real-time
• Alerts
Amazon S3
Social Media, Support
Amazon Aurora
Customer feedback analytics
Customer feedback analytics architecture
Dataflow:
RSS News Feeds
Entities
RSS News Metadata
Named Entity Property Graphs
Dataflow:
RSS News Feeds
Entities
Key Phrases
Sentiment
Syntax
Thank you!
Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
Amazon Web Services
 


[REPEAT] Better Analytics Through Natural Language Processing (AIM405-R) - AWS re:Invent 2018

  • 1.
  • 2. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Better Analytics Through Natural Language Processing (AIM405). Nino Bice, Product Manager, AWS AI. Ben Snively, Solutions Architect, AWS AI.
  • 3.
  • 4. Analytics with all data
  • 5.
  • 6. Amazon Comprehend: discover insights and relationships in text. Entities, Key Phrases, Language, Sentiment, Syntax, Grouping.
  • 7.
  • 8. Custom classification (NEW)
  • 9. Automated training. Text and label pairs: "I am calling about my credit card" (Loyalty); "My points aren’t being applied correctly" (Loyalty); "I really need to shut the service down" (Account); "We are moving and we don’t want this anymore" (Account).
  • 10. Custom entities (NEW): Person, Organization, Part, Account_Action
  • 11. Automated annotation and training. Entity: Account_Issues. Cancel order; Large bill; Escalate to manager; No connection.
  • 12. Amazon Comprehend structured output. Columns: PERSON, ORG, LOCATION (pre-trained entities); Policy, Make (custom entities). Row 1: Dave, Amazon, Seattle, 44-331, Zen. Row 2: Steve, Facebook, San Francisco, 55-433, Theta.
  • 13.
  • 14. Removing ML/NLP complexity
  • 15. Dmytro Dolgopolov, Senior Director, Content Services and Analytics, FINRA. Matt Cardillo, Senior Director, Application Platforms, FINRA.
  • 16. Up to 135 billion events per day. 30+ PB of storage. Monitoring 99% of equities and 65% of options in the US. Reconstructing trillions of market nodes and edges. Investor protection. Market integrity.
  • 17. Business challenges: matter content • High volumes of unstructured content • ~1M documents each year from stock brokers and investors to be reviewed • Documents contain incredibly useful information, but mining it is a challenge • Overlooking important information • Risk-based approach itself presents risks • Numerous features of interest • Finding information about the who, what, where, when, and how is labor intensive
  • 18. Business challenges: matter content. Sample text: "John Doe alleged he did not understand how the two fixed annuities sold to him by William Alex Smith worked and did not understand the impact. At that time William Smith was employed by Company, Inc." Investor: John Doe. Broker: William Alex Smith, ID 12345.
  • 19. Our solution: Commix Command Center for People and Organizations
  • 20. Our solution: Project 3CPO
  • 21. Architecture (diagram labels): <bucket>/input, Data Preparation, Entity Matching, Amazon Comprehend, <bucket>/matched
  • 22. Dataflow (diagram labels): Feature Sets (collected from document), Name Search (millions), Proprietary Reference Data, Comparator
  • 23. Benefits. Benefits to FINRA technology: extract individuals and organizations; match extracted entities to FINRA records; flag individuals of interest; enable higher-level analytics. Benefits to our customers: bring information to the user instead of the user mining for information manually; flag individuals of interest to find bad actors that could otherwise go undetected; reclaim hours toward high-value, strategic efforts over repetitive tactical document reviews. Net Result - Regulatory reviews made
  • 24.
  • 25. Amazon Comprehend use cases. Content personalization: customers are using Amazon Comprehend NLP output to understand related documents based on entities, phrases, or even topic similarities for trend analysis, and to drive content personalization and recommendations. Semantic search: customers are using Amazon Comprehend to index entities for boosting and ranking search results. Intelligent data warehouse: customers are using Amazon Comprehend to query unstructured data in relational databases, processing data within the data lake (S3), and then inserting it back into the data warehouse. Social analytics: customers are using Amazon Comprehend to ingest, process, and analyze trends from entities and sentiment in social media posts across Twitter and Facebook. Information management: customers are using Amazon Comprehend for indexing and finding related content for enterprise information management and various internal business processes, including compliance and IT.
  • 26. AWS text analytics workload (diagram labels): Amazon Kinesis, Amazon ES, Amazon Redshift, Amazon EMR • Boosting • Rich Filtering • Grouping, Trends • Joining, Correlating • Clustering • Graph, Search • Near real-time • Alerts. Amazon S3; Social Media, Support; Amazon Aurora
  • 27. Customer feedback analytics
  • 28. Customer feedback analytics architecture
  • 29.
  • 30. Dataflow: RSS News Feeds. Entities, RSS News Metadata, Named Entity Property Graphs
  • 31.
  • 32. Dataflow: RSS News Feeds. Entities, Key Phrases, Sentiment, Syntax
  • 33. Thank you!
  • 34.
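
As a rough illustration of the pre-trained analyses named on slide 6 (language, entities, key phrases, sentiment), the sketch below calls the corresponding Amazon Comprehend operations through a boto3-style client. It is not taken from the talk: the function names `analyze_text` and `demo` and the sample sentence are assumptions made for this example, and the code assumes the detected language is one the entity, key-phrase, and sentiment operations support.

```python
from typing import Any, Dict


def analyze_text(client: Any, text: str) -> Dict[str, Any]:
    """Run four pre-trained Comprehend analyses on one piece of text.

    `client` is assumed to behave like boto3.client("comprehend").
    """
    # Language detection comes first: the other calls require a language code.
    languages = client.detect_dominant_language(Text=text)["Languages"]
    language_code = max(languages, key=lambda lang: lang["Score"])["LanguageCode"]

    entities = client.detect_entities(Text=text, LanguageCode=language_code)
    key_phrases = client.detect_key_phrases(Text=text, LanguageCode=language_code)
    sentiment = client.detect_sentiment(Text=text, LanguageCode=language_code)

    # Flatten the responses into a small, structured record.
    return {
        "language": language_code,
        "entities": [(e["Type"], e["Text"]) for e in entities["Entities"]],
        "key_phrases": [p["Text"] for p in key_phrases["KeyPhrases"]],
        "sentiment": sentiment["Sentiment"],
    }


def demo() -> None:
    """Call the real service; needs boto3, AWS credentials, and a region."""
    import boto3

    comprehend = boto3.client("comprehend")
    print(analyze_text(comprehend, "Dave works for Amazon in Seattle."))
```

Passing the client in as a parameter keeps the function easy to exercise with a stub; calling `demo()` sends the sample sentence to the live service and incurs normal API charges.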

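
The dataflow on slide 22 (feature sets, name search against reference data, comparator) could be sketched roughly as follows. This is a toy illustration under stated assumptions, not FINRA's implementation: the `Person` record, the token-overlap `name_search`, the `difflib`-based `compare` function, the 0.8 threshold, and every record other than the slide 18 example (William Alex Smith, ID 12345, Company, Inc.) are hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


@dataclass
class Person:
    """One row of (hypothetical) reference data."""
    record_id: str
    name: str
    employer: str


def name_search(query: str, records: List[Person]) -> List[Person]:
    """Coarse filter: keep records sharing at least one name token with the query."""
    tokens = set(query.lower().split())
    return [r for r in records if tokens & set(r.name.lower().split())]


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def compare(features: Dict[str, str], candidate: Person) -> float:
    """Stand-in comparator: average similarity over name and employer."""
    return (similarity(features["name"], candidate.name)
            + similarity(features["employer"], candidate.employer)) / 2


def match(features: Dict[str, str], records: List[Person],
          threshold: float = 0.8) -> List[Tuple[Person, float]]:
    """Narrow candidates by name, score them, and keep confident matches."""
    scored = [(c, compare(features, c)) for c in name_search(features["name"], records)]
    kept = [(c, score) for c, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Feature set collected from the sample text on slide 18.
feature_set = {"name": "William Smith", "employer": "Company, Inc."}

reference_data = [
    Person("12345", "William Alex Smith", "Company, Inc."),
    Person("67890", "William Jones", "Other Corp"),  # hypothetical record
    Person("24680", "Jane Roe", "Company, Inc."),    # hypothetical record
]

matches = match(feature_set, reference_data)
```

Splitting the work into a cheap name filter followed by a more expensive comparison mirrors the two-stage narrowing shown on the slide; a real comparator would also need to handle name variations and additional features such as email or address.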
Editor's Notes

  1. Amazon Comprehend is a natural language processing service that uses machine learning to find insights and relationships in text. Amazon Comprehend identifies the language of the text; extracts key phrases, places, people, brands, or events; understands how positive or negative the text is; and automatically organizes a collection of text files by topic.   Our customers are using Amazon Comprehend to identify key topics, entities, and sentiments in social media and news streams, and to enhance their ability to access and aggregate unstructured data from the vast document libraries that exist within their organizations. Hotels.com has thousands of customer views and comments that are submitted by people who stay at the properties. It’s historically been difficult to find what matters in all this data. By using Amazon Comprehend, Hotels.com is able to uncover the unique characteristics that people like or don’t like about each hotel. Consequently, the company is better able to make recommendations to their users.
  2. Our customers want the accuracy of Comprehend to now support their own organization's labels, terms, and phrases. Finance, Insurance, Manufacturing. What are the most important use cases for customization? Organizing documents based on their content. Analyzing documents for business-specific terms and phrases.
  3. Our customers want the accuracy of Comprehend to now support their own organization's labels, terms, and phrases. Finance, Insurance, Manufacturing. What are the most important use cases for customization? Organizing documents based on their content. Analyzing documents for business-specific terms and phrases. Introducing Comprehend Custom Classification and Entity Types.
  4. When we began our journey to the public cloud 5 years ago, the high-water mark was about half of what it is today. We are FINRA, a non-profit SRO with oversight from the SEC. We regulate nearly the entire equities market and the majority of the options market. (State the mission.) We have massive amounts of data, and we perform complex reconstructions to make sense of market data. In the past five years, we developed a core competency for managing structured data in the cloud. We no longer think of our infrastructure as “fixed”. We’ve developed a robust data management solution. We employed innovative partitioning strategies on our highly skewed data. We leveraged multiple query engines based on our varying usage scenarios. So, we’ve done a lot around our structured data. What’s next?
  5. We are working to develop the same level of competency around our unstructured data. FINRA manages a backlog of case work that we refer to as matters. Working on these matters involves lots of unstructured content, including form filings, documents (about 1M each year), email correspondence, and other data sources as reference. Unearthing key features like who, what, where, when, and how is time consuming, error prone, and painful for our regulators. Some of our more complex matters have thousands of files associated with them. So, how can our analysts know with confidence that they have not missed something?
  6. In this simple example… This is where Comprehend comes into play. We are building text analytics solutions using Comprehend, leveraging entity recognition and the text classifier, to help identify features important to our regulators out of the piles of content received from filings, information requests, tips, and so on. We want to go further than this simple example. We’ll be considering Comprehend’s custom capability around entity recognition to deal with terms specific to our business of regulation. So, this is what we’re working on…
  7. We called this capability Commix Command Center for People and Organizations. True to form, like any organization with governmental oversight, we’ve become really good at coming up with flashy names for our capabilities, which always seem to conveniently turn into a catchy acronym.
  8. Now Comprehend enables us to build these solutions. Here you see one of the applications used by our users; it streamlines the process of reviewing filings, which usually contain several multipage documents. A big part of such a review is to identify “bad actors”. Those are individuals that FINRA knows had some past infractions, and filings that they are associated with will require an additional level of scrutiny. This easy-to-read dashboard helps our investigators quickly identify “bad actors” without reading hundred-page documents. Investigators can now quickly navigate through the document and assess the level of risk associated with the filing.
  9. This is the high-level architecture of our solution. There are many operations we have to perform in order to extract the wealth of information found in the documents. There are many regulatory insights we can get from the documents. Once you do it, though, it creates new opportunities. It enables machine learning using SageMaker, link analysis using Neptune, improved enterprise search using Elasticsearch; the list goes on and on. Sorry, Nino, but this is why we want to use Comprehend. It’s like a salad that you have to eat before you can have dessert. So this purple broccoli in the middle is a very important part of our diet.
  10. Let’s look under the hood to see how we use entity extraction to match individuals with FINRA records. First, we leverage Comprehend to collect various features found in the documents. Those are the features describing the individuals: for instance, name, email, address, and employment. Using proximity, context, and other techniques, we combine those features into feature sets, one per individual identified in the document. Then we use full-text search to limit the number of candidates. Our reference data contains information about millions of individuals. We take into consideration all the name variations. In the next step, we use a custom comparator that limits the number of candidates to a few and provides a level of confidence. Now we can surface the candidates of interest to the investigator for review. In the past, they had to perform a manual search, which was very time-consuming, especially for people with common names.
  11. Benefits of Comprehend: superior entity and locality recognition. Extract multiple features to increase the probability of entity matches. More accurate document categorization through topic modeling. Key phrase extraction is very beneficial to FINRA’s use case to quickly discern signal from the noise. Match extracted entities to FINRA records.