This document summarizes Sid Anand's microtalk on cloud native predictive data pipelines at a PayPal risk infrastructure all-hands meeting in January 2018. It discusses Agari's approach to detecting spear phishing emails in near real-time by intercepting emails and sending metadata to AWS cloud services for trust modeling and scoring, then returning signals to quarantine, label, or pass through emails. The architecture uses microservices, decoupled services, immutable services, polyglot persistence leveraging various AWS services, and focuses on building trust models for scoring emails in near real-time and batch processing.
8. What does Agari do?
• You may have been phished in your personal email inbox
• This is a bigger problem for enterprises
• ~80% of enterprise-targeted attacks are via email
10. How does Agari work?
[Diagram: Enterprise Customer, SEG]
• A (targeted) spear-phish email is sent to an individual at your company
• The email content is unique and personalized to a specific employee
11. How does Agari work?
[Diagram: Enterprise Customer, SEG]
• A (targeted) spear-phish email is sent to an individual in your company
• The email content is unique and personalized to a specific employee
• Your company’s SEG (a.k.a. spam detector) lets it through since it doesn’t match known signatures of spam campaigns
12. How does Agari work?
[Diagram: Enterprise Customer, SEG, Email Metadata, AWS Cloud, Build Trust Models & Score]
• Agari’s on-premise interceptor holds the email & sends email headers to its cloud-based SaaS prediction service
• Agari assigns a trust score to the email, building a model on the fly if more information is needed, and records the result in a DB
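The score-or-build flow described on this slide can be sketched roughly as follows. This is a toy illustration, not Agari's implementation: the `TrustModel` and `PredictionService` names, the per-domain model, the neutral 0.5 score for unknown senders, and the in-memory dicts standing in for the model cache and the results DB are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class TrustModel:
    """Toy model (assumption): share of past mail from a domain that was legitimate."""
    legit: int = 0
    total: int = 0

    def score(self) -> float:
        # With no history, fall back to a neutral score (hypothetical choice).
        return 0.5 if self.total == 0 else self.legit / self.total


@dataclass
class PredictionService:
    # domain -> list of past outcomes (True = legitimate); hypothetical training data
    history: dict
    models: dict = field(default_factory=dict)   # cache of already-built models
    results: dict = field(default_factory=dict)  # stand-in for the results DB

    def build_model(self, domain: str) -> TrustModel:
        outcomes = self.history.get(domain, [])
        return TrustModel(legit=sum(outcomes), total=len(outcomes))

    def score_email(self, message_id: str, headers: dict) -> float:
        # Simplified parsing: assumes the From header is a bare address.
        domain = headers["From"].rsplit("@", 1)[-1].lower()
        if domain not in self.models:
            # Build a model on the fly when none exists yet.
            self.models[domain] = self.build_model(domain)
        score = self.models[domain].score()
        self.results[message_id] = score  # record the result
        return score
```

Only headers are passed in, mirroring the slide's point that the interceptor sends metadata rather than the message body.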
13. How does Agari work?
[Diagram: Enterprise Customer, SEG, Email Metadata, AWS Cloud, Build Trust Models & Score, return signal: Quarantine, Label, PassThrough]
• The prediction service sends a control signal back to the on-premise interceptor to quarantine, label, or release the held email message
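One way to picture the control signal is as a mapping from a trust score to one of the three actions. The sketch below is illustrative only: the `Action` enum, the `control_signal` function, the 0-to-1 score range, and both threshold values are assumptions, not values from the talk.

```python
from enum import Enum


class Action(Enum):
    QUARANTINE = "quarantine"
    LABEL = "label"
    PASS_THROUGH = "pass_through"


def control_signal(trust_score: float,
                   quarantine_below: float = 0.2,
                   label_below: float = 0.6) -> Action:
    """Map a trust score in [0, 1] to an action (thresholds are hypothetical)."""
    if not 0.0 <= trust_score <= 1.0:
        raise ValueError("trust_score must be between 0 and 1")
    if trust_score < quarantine_below:
        return Action.QUARANTINE      # low trust: hold the message back
    if trust_score < label_below:
        return Action.LABEL           # uncertain: deliver with a warning label
    return Action.PASS_THROUGH        # trusted: release the held message
```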
14. How does Agari work?
[Diagram: Enterprise Customer, SEG, Email Metadata, AWS Cloud, Build Trust Models & Score, return signal: Quarantine, Label, PassThrough]
• 95th-percentile SLA < 3 seconds (end-to-end)
• Agari also provides near-realtime analytics on the mail flow & actions
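A 95th-percentile SLA means that 95% of emails complete the round trip within the limit. A minimal check over measured end-to-end latencies might look like this. The nearest-rank percentile method and the `percentile` / `meets_sla` helper names are assumptions for illustration; the talk does not say how the percentile is computed.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def meets_sla(latencies_s, pct=95, limit_s=3.0):
    """True if the pct-th percentile latency is under limit_s seconds."""
    return percentile(latencies_s, pct) < limit_s
```

With 100 samples, up to five slow outliers leave the SLA intact, while a sixth breaks it.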
20. … With Nightly Model Building
[Diagram: enterprise A, enterprise B, enterprise C, K (repeated), Importer ASG, Counter, ES Upd8r, Alerter ASG, SQS, SR (repeated), Scorer ASG, S3 Upd8r, S3]
21. Architectural Concepts
• Microservice Architecture - each service does a simple job but does it well
• Decoupled Services - Via Message Buses (Kinesis) & Avro (great support for schema evolution)
• Immutable Services - Nightly models are packaged with code (co-versioned) in a new image that is rolled out via the autoscaler
• Polyglot Persistence - Use the right datastore for the right job
• Postgres for message details
• ES for aggregates & search
• Redis for low-latency, high-frequency windowed counter-style R/W workloads
• Single source(s) of truth - S3 and Postgres are the sources of truth for semi-structured and structured data, respectively. Everything else can be built from them
• Use Lambda/FaaS when possible for light-weight event processing - Note: it’s stateless!
• Leverage CDC - Create a stream of committed data! Avoid Write-then-Read patterns
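The "windowed counter-style R/W workloads" mentioned under Polyglot Persistence can be illustrated with a small in-memory sketch. This is a stand-in, not Redis code and not Agari's design: the `WindowedCounter` class, the fixed-size time buckets, and the parameter names are assumptions chosen to show the access pattern (frequent small increments, reads over a recent window).

```python
from collections import defaultdict


class WindowedCounter:
    """Counts events per key in fixed time buckets and sums the most recent ones."""

    def __init__(self, window_s=60, buckets_kept=5):
        self.window_s = window_s          # width of one bucket in seconds
        self.buckets_kept = buckets_kept  # how many recent buckets a read covers
        self.counts = defaultdict(lambda: defaultdict(int))  # key -> bucket -> count

    def _bucket(self, now_s):
        return int(now_s // self.window_s)

    def incr(self, key, now_s):
        bucket = self._bucket(now_s)
        self.counts[key][bucket] += 1
        # Evict buckets that have fallen out of the window.
        oldest = bucket - self.buckets_kept + 1
        for stale in [b for b in self.counts[key] if b < oldest]:
            del self.counts[key][stale]

    def total(self, key, now_s):
        bucket = self._bucket(now_s)
        oldest = bucket - self.buckets_kept + 1
        return sum(count for b, count in self.counts.get(key, {}).items()
                   if oldest <= b <= bucket)
```

Timestamps are passed in explicitly so the behavior is deterministic; a real deployment would more likely rely on the datastore's own expiry mechanism than on manual eviction.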
22. Architectural Components
Component | Role | Details | Pros | Operability Model
S3 | Data Lake | All data stored in S3 via Kinesis Firehose | Scalable, Available, Performant, Serverless | Serverless
Kinesis | Messaging | Streaming transport modeled on Kafka | Scalable, Available, Serverless | Serverless
(logo not captured) | General Processing | ASG Replacement except for Rails Apps | Scalable, Available, Serverless | Serverless
ASG | General Processing | Used for importing, data cleansing, business logic | Scalable, Available, Managed | Managed
(logo not captured) | Data Science Processing | Model Building | | Agari Operates
(logo not captured) | Workflow Engine | Nightly model builds + some classic Ops cron workloads | Lightweight, DAGs as Code | Agari Operates
DB | Persistence for WebApp | Holds smaller subset of data needed for Web App | Rails + Postgres ‘nuff said | Agari Operates
(logo not captured) | Persistence for WebApp | Aggregation + Search moved from DB to ES; Model Building queries moved to ElastiCache Redis | Faster, more accurate for aggregates, frees up headroom for DB (polyglot persistence) | Managed