[REPEAT] Better Analytics Through Natural Language Processing (AIM405-R) - AWS re:Invent 2018

© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Better Analytics Through Natural Language Processing
Nino Bice
Product Manager
AWS AI
AIM405
Ben Snively
Solutions Architect
AWS AI

Analytics with all data

Amazon Comprehend
Discover insights and relationships in text
Entities
Key Phrases
Language
Sentiment
Syntax
Grouping

Custom classification
NEW

Automated training
Text | Label
I am calling about my credit card | Loyalty
My points aren’t being applied correctly | Loyalty
I really need to shut the service down | Account
We are moving and we don’t want this anymore | Account
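The labeled examples above can be turned into a training file with a few lines of code. The sketch below is illustrative only: it assumes a headerless two-column CSV with the label first and the text second, which is the layout documented for Comprehend multi-class custom classifiers; check the current service documentation before relying on it. The helper name `to_training_csv` is hypothetical.

```python
import csv
import io

# Rows copied from the slide as (text, label) pairs.
EXAMPLES = [
    ("I am calling about my credit card", "Loyalty"),
    ("My points aren’t being applied correctly", "Loyalty"),
    ("I really need to shut the service down", "Account"),
    ("We are moving and we don’t want this anymore", "Account"),
]


def to_training_csv(examples):
    """Return CSV text with one 'label,text' row per example and no header.

    Assumption: label-first, headerless layout for multi-class training.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for text, label in examples:
        writer.writerow([label, text])
    return buffer.getvalue()


training_csv = to_training_csv(EXAMPLES)
print(training_csv)
```

The resulting text would be uploaded to Amazon S3 and referenced when the classifier is created; that step is omitted here because it needs an AWS account.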

Custom entities
Person
Organization
Part
Account_Action
NEW

Automated annotation and training
Entity:Account_Issues
Cancel order
Large bill
Escalate to manager
No connection

Amazon Comprehend structured output
PERSON | ORG | LOCATION | Policy | Make
Dave | Amazon | Seattle | 44-331 | Zen
Steve | Facebook | San Francisco | 55-433 | Theta
Pre-trained entities: PERSON, ORG, LOCATION. Custom entities: Policy, Make.
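One way to arrive at a row like those in the table is to pivot a document's detected entities into one column per entity type. The sketch below is an assumption-laden illustration, not the service's own output format: it takes entity dictionaries carrying the `Text` and `Type` keys that Comprehend entity detection returns, and the custom type names `POLICY` and `MAKE` are hypothetical stand-ins for the slide's Policy and Make columns. The slide abbreviates the built-in type as ORG; the code uses `ORGANIZATION`.

```python
PRETRAINED_TYPES = ("PERSON", "ORGANIZATION", "LOCATION")
CUSTOM_TYPES = ("POLICY", "MAKE")  # hypothetical custom entity type names


def entities_to_row(entities, columns=PRETRAINED_TYPES + CUSTOM_TYPES):
    """Collapse a list of {'Text', 'Type'} dicts into one row for a document.

    Keeps the first entity seen for each type; types not found map to None.
    """
    row = {column: None for column in columns}
    for entity in entities:
        kind = entity["Type"]
        if kind in row and row[kind] is None:
            row[kind] = entity["Text"]
    return row


# Sample entities mirroring the first row of the slide's table.
doc_entities = [
    {"Text": "Dave", "Type": "PERSON"},
    {"Text": "Amazon", "Type": "ORGANIZATION"},
    {"Text": "Seattle", "Type": "LOCATION"},
    {"Text": "44-331", "Type": "POLICY"},
    {"Text": "Zen", "Type": "MAKE"},
]
row = entities_to_row(doc_entities)
print(row)
```

Keeping only the first entity per type is a simplification; a document that mentions several people would need a list per column or one row per mention.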

Removing ML/NLP complexity

Dmytro Dolgopolov
Senior Director Content Services and Analytics
FINRA
Matt Cardillo
Senior Director Application Platforms
FINRA

30+ PB of storage
Up to 135 billion events per day
Monitoring 99% of equities and 65% of options in the US
Reconstructing trillions of market nodes and edges
Investor protection
Market integrity

Business challenges: Matter Content
• High volumes of unstructured content
• ~1M documents each year from stock brokers and investors to be reviewed
• Documents contain incredibly useful information but mining is a challenge
• Overlooking important information
• Risk-based approach itself presents risks
• Numerous features of interest
• Finding information about the Who, What, Where, When and How is labor intensive

Business challenges: Matter Content
John Doe alleged he did not understand how the two fixed annuities sold to him by William Alex Smith worked and did not understand the impact. At that time William Smith was employed by Company, Inc.
Investor: John Doe
Broker: William Alex Smith, ID 12345

Our solution
Commix Command Center for People and Organizations

Our solution: Project 3CPO

Architecture
<bucket>/input
Data Preparation
Entity Matching
Amazon Comprehend
<bucket>/matched

Dataflow
Feature Sets (collected from document)
Name Search (millions)
Proprietary Reference Data
Comparator
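The slides do not say how the name search or the comparator work. As a rough illustration of the idea only, the sketch below scores an extracted name against reference records with the standard library's `difflib.SequenceMatcher` and keeps the best candidate above a threshold. The records, the `0.8` threshold, and the function names are hypothetical, and a search over millions of names would need an index rather than a linear scan.

```python
from difflib import SequenceMatcher

# Hypothetical reference records; the real reference data is proprietary.
REFERENCE = [
    {"id": "12345", "name": "William Alex Smith"},
    {"id": "67890", "name": "Wilma Smythe"},
]


def normalize(name):
    """Lowercase the name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def similarity(a, b):
    """Return a 0..1 similarity ratio between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def best_match(extracted_name, reference, threshold=0.8):
    """Return (record, score) for the closest record, or (None, score)
    when even the closest record falls below the threshold."""
    scored = [(similarity(extracted_name, r["name"]), r) for r in reference]
    score, record = max(scored, key=lambda pair: pair[0])
    return (record, score) if score >= threshold else (None, score)


record, score = best_match("William Smith", REFERENCE)
print(record, round(score, 3))
```

With the sample data, the shortened name "William Smith" from the example matter resolves to the record for William Alex Smith, while an unrelated name such as "John Doe" returns no match.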

Benefits
• Benefits to FINRA technology
• Extract individuals and organizations
• Match extracted entities to FINRA records
• Flag individuals of interest
• Enable higher level analytics
• Benefits to our customers
• Bring information to the user instead of the user mining for information manually
• Flag individuals of interest to find bad actors that could otherwise go undetected
• Reclaim hours toward high value, strategic efforts over repetitive tactical document reviews
Net Result - Regulatory reviews made

Amazon Comprehend use cases
Content personalization: Customers are using Amazon Comprehend NLP output to find related documents based on entities, phrases, or topic similarities for trend analysis, content personalization, and recommendations.
Semantic search: Customers are using Amazon Comprehend to index entities for boosting and ranking search results.
Intelligent data warehouse: Customers are using Amazon Comprehend to query unstructured data in relational databases, processing the data within the data lake (S3) and then inserting it back into the data warehouse.
Social analytics: Customers are using Amazon Comprehend to ingest, process, and analyze trends from entities and sentiment in social media posts across Twitter and Facebook.
Information management: Customers are using Amazon Comprehend for indexing and finding related content for enterprise information management and various internal business processes, including compliance and IT.
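The semantic search use case can be pictured as an entity index that boosts documents whose extracted entities match a query term. The sketch below is a toy, in-memory illustration under that assumption; the function names, the additive `boost` value, and the sample documents are hypothetical, and a production system would delegate indexing and scoring to a search engine.

```python
from collections import defaultdict


def build_entity_index(docs):
    """Map lowercase entity text to the set of document ids mentioning it."""
    index = defaultdict(set)
    for doc_id, entities in docs.items():
        for entity in entities:
            index[entity["Text"].lower()].add(doc_id)
    return index


def rank(query_terms, base_scores, index, boost=2.0):
    """Add `boost` to a document's base score for every query term that
    matches one of its entities, then return ids sorted best first."""
    scores = dict(base_scores)
    for term in query_terms:
        for doc_id in index.get(term.lower(), ()):
            scores[doc_id] = scores.get(doc_id, 0.0) + boost
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical documents with entities already extracted.
docs = {
    "doc1": [{"Text": "Seattle", "Type": "LOCATION"}],
    "doc2": [{"Text": "Amazon", "Type": "ORGANIZATION"}],
}
index = build_entity_index(docs)
ranking = rank(["seattle"], {"doc1": 1.0, "doc2": 1.5}, index)
print(ranking)
```

Here doc2 starts with the higher base score, but the entity match on "Seattle" lifts doc1 to the top.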

AWS text analytics workload
Amazon Kinesis
Amazon ES
Amazon Redshift
Amazon EMR
• Boosting
• Rich Filtering
• Grouping, Trends
• Joining, Correlating
• Clustering
• Graph, Search
• Near real-time
• Alerts
Amazon S3
Social Media, Support
Amazon Aurora
