Amazon Athena: What's New and How SendGrid Innovates (ANT324) - AWS re:Invent 2018
2. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Roy Hasson
Sr. Mgr, Business Development – Amazon Athena
AWS
Shane Andrade
Principal Engineer I - Email Infra Data Team
SendGrid
Amazon Athena: What’s new and how
SendGrid innovates using Athena
ANT324
3.
Agenda
• Customer trends
• Workload isolation and cost controls
• How SendGrid built email replay using Amazon Athena
4.
Amazon Athena Customers
6.
Understanding our users
Data Consumer
• Easily discover data
• Choice of tools
• Performance
Data Engineer
• Security
• Maintainability & Scale
• Performance
• Cost
7.
Querying Your Data Lake – Ingestion
Devices
Web
Sensors
Social
EDW
Ingest streaming
events in real time
with Amazon Kinesis
Amazon Kinesis Data
Firehose writes
partitioned, optimized
data
S3://bucket/year/month/day/hour/file.parquet
S3://bucket/year/month/day/hour/file.orc
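The year/month/day/hour layout shown above can be sketched in a few lines of Python. This is only an illustration of the key pattern, not how Kinesis Data Firehose builds its prefixes internally; the function names `partitioned_key` and `s3_uri` are hypothetical, and the use of UTC is an assumption.

```python
from datetime import datetime, timezone


def partitioned_key(event_time: datetime, file_name: str) -> str:
    """Build a year/month/day/hour object key for an event timestamp."""
    # Assumption: partitions are derived from the UTC time of the event.
    utc = event_time.astimezone(timezone.utc)
    return f"{utc:%Y/%m/%d/%H}/{file_name}"


def s3_uri(bucket: str, key: str) -> str:
    """Join a bucket name and an object key into an s3:// URI."""
    return f"s3://{bucket}/{key}"


if __name__ == "__main__":
    ts = datetime(2018, 11, 27, 11, 34, tzinfo=timezone.utc)
    print(s3_uri("bucket", partitioned_key(ts, "file.parquet")))
```

Keeping the time components as separate path segments is what lets a query engine skip whole prefixes when a query filters on time.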
8.
Ingestion: Database and Data Warehouse
Devices
Web
Sensors
Social
EDW
Move snapshots and
incremental DB and
DWH tables
S3://bucket/table/LOAD001.csv
S3://bucket/table/20181127-1134010000.csv
S3://bucket/year/month/day/hour/file.parquet
S3://bucket/year/month/day/hour/file.orc
9.
Querying Your Data Lake – Transform & Automate
Devices
Web
Sensors
Social
EDW
Automate routine
tasks such as data
cleansing
Perform unique data
transformations and
ML
10.
Querying Your Data Lake – Catalog
Devices
Web
Sensors
Social
EDW
AWS Glue
Data Catalog
Permissions
Store transformed
data, crawl and
catalog its schema
Restrict access by
defining permissions
on databases and
tables
11.
Catalog: Access Control
AWS Glue
Data Catalog
{
  "Effect": "Allow",
  "Action": [
    "glue:GetTables",
    "glue:GetTable"
  ],
  "Resource": [
    "arn:aws:glue:us-east-1:123456789012:catalog",
    "arn:aws:glue:us-east-1:123456789012:database/example_db",
    "arn:aws:glue:us-east-1:123456789012:table/example_db/*"
  ]
}
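A statement like the one above can also be generated per database, which helps when the same read-only grant is needed for several teams. The sketch below wraps the slide's statement in a full policy document; the helper name `glue_read_policy` is hypothetical, and the `Version` field is the standard IAM policy version rather than something shown on the slide.

```python
import json


def glue_read_policy(region: str, account_id: str, database: str) -> dict:
    """Return an IAM policy allowing table lookups in one Glue database."""
    prefix = f"arn:aws:glue:{region}:{account_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["glue:GetTables", "glue:GetTable"],
                "Resource": [
                    f"{prefix}:catalog",
                    f"{prefix}:database/{database}",
                    f"{prefix}:table/{database}/*",
                ],
            }
        ],
    }


if __name__ == "__main__":
    policy = glue_read_policy("us-east-1", "123456789012", "example_db")
    print(json.dumps(policy, indent=2))
```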
12.
Querying Your Data Lake – Consume
Devices
Web
Sensors
Social
EDW
AWS Glue
Data Catalog
Data Consumer
Data Engineer
13.
Data consumption
Permissions
Data Lake
AWS Cloud
AWS Cloud
Reporting &
Analytics
Machine
Learning
AWS Cloud
Custom
Applications
AWS Glue
Data Catalog
14.
Visualize your data with your favorite tools
Featured Athena Partners
Amazon QuickSight
15.
Data consumption – Data Analyst
AWS Glue
Data Catalog
JDBC/ODBC drivers
connect common BI
and SQL tools
Now 2-5x faster
Create optimized
tables on-demand
using Create Table
As Select
Abstract complex
queries & expose
only needed data
with Athena Views
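As a rough illustration of the two features named above, the following sketch assembles a Create Table As Select (CTAS) statement and a view definition as SQL strings. The helper names, table names, and locations are hypothetical; the `format`, `external_location`, and `partitioned_by` properties are the CTAS table properties as I understand them from the Athena documentation, so check them against the current reference before use.

```python
def ctas_statement(new_table, source_table, external_location, partition_columns=()):
    """Build a CTAS statement that writes Parquet output to external_location."""
    properties = [
        "format = 'PARQUET'",
        f"external_location = '{external_location}'",
    ]
    if partition_columns:
        # Athena expects partition columns to come last in the SELECT list;
        # SELECT * only satisfies that if the source table is laid out that way.
        cols = ", ".join(f"'{c}'" for c in partition_columns)
        properties.append(f"partitioned_by = ARRAY[{cols}]")
    return (
        f"CREATE TABLE {new_table}\n"
        f"WITH ({', '.join(properties)})\n"
        f"AS SELECT * FROM {source_table}"
    )


def view_statement(view_name, select_sql):
    """Build a view definition that exposes only the given SELECT."""
    return f"CREATE OR REPLACE VIEW {view_name} AS\n{select_sql}"


if __name__ == "__main__":
    print(ctas_statement("optimized", "raw", "s3://bucket/optimized/", ["year"]))
    print(view_statement("business_view", "SELECT a FROM optimized"))
```

The helpers do no escaping, so they are only suitable for trusted, hard-coded identifiers.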
16.
Data consumption – Data Analyst
JDBC/ODBC driver
integration with
Microsoft Active
Directory
jdbc:awsathena://AwsRegion=us-east-1;
S3OutputLocation=s3://bucket/path;
AwsCredentialsProviderClass=com.simba.athena.iamsupport.plugin.AdfsCredentialsProvider;
idp_host=example.adfs.server;
idp_port=233;
UID=HOME\jsmith;
PWD=simba12345;
preferred_role=arn:aws:iam::123456789123:role/JSMITH;
17.
Data consumption – Automated Reporting
athena.startQueryExecution("SELECT * FROM business_view")
Query_ID
Email notification
1. Schedule query
2. Track QueryID for status
3. Query results to Amazon S3
4. New file trigger
5. Job complete notification
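Steps 1 to 3 above can be sketched with the AWS SDK for Python. The function below takes an Athena client (for example `boto3.client("athena")`) as a parameter so that it stays independent of credentials; the name `run_query` and the polling defaults are assumptions, and steps 4 and 5 (the new-file trigger and the notification) are left out.

```python
import time


def run_query(athena, sql, output_location, poll_seconds=2.0, max_polls=150):
    """Start a query, track it by its ID, and return the S3 path of the results."""
    started = athena.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    for _ in range(max_polls):
        execution = athena.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]
        state = execution["Status"]["State"]
        if state == "SUCCEEDED":
            # Athena reports the full path of the result file it wrote to S3.
            return execution["ResultConfiguration"]["OutputLocation"]
        if state in ("FAILED", "CANCELLED"):
            reason = execution["Status"].get("StateChangeReason", "no reason given")
            raise RuntimeError(f"query {query_id} ended as {state}: {reason}")
        time.sleep(poll_seconds)

    raise TimeoutError(f"query {query_id} did not finish after {max_polls} polls")


if __name__ == "__main__":
    import boto3  # requires AWS credentials; bucket and view names are placeholders

    client = boto3.client("athena", region_name="us-east-1")
    print(run_query(client, "SELECT * FROM business_view", "s3://bucket/path/"))
```

Polling is the simplest way to track the query ID; a scheduled job could instead store the ID and check it on its next run.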
18.
Data consumption – Data Scientist
AWS Glue
Data Catalog
Use PyAthena to query
Athena tables directly
from Amazon SageMaker
notebooks
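A notebook cell using PyAthena might look roughly like the sketch below. PyAthena exposes a DB-API 2.0 cursor, so the helper `rows_as_dicts` (a hypothetical name) works with any such cursor; the staging location, region, and view name are placeholders, and the `connect` call requires the `pyathena` package and AWS credentials.

```python
def rows_as_dicts(cursor, sql):
    """Run sql on a DB-API 2.0 cursor and return each row as a column-to-value dict."""
    cursor.execute(sql)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


if __name__ == "__main__":
    # Placeholder values; replace with a real staging bucket and region.
    from pyathena import connect

    cursor = connect(
        s3_staging_dir="s3://bucket/path/",
        region_name="us-east-1",
    ).cursor()
    for row in rows_as_dicts(cursor, "SELECT * FROM business_view LIMIT 10"):
        print(row)
```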
19.
Data consumption – Custom Applications
AWS Glue
Data Catalog
Integrate with AWS
AppSync for easy access
to data, on and off-line
Get data to your
applications using AWS
SDK and Athena API
20.
Inspecting AWS service logs – Ingestion
Service logs are
written directly to
Amazon S3
21.
Inspecting AWS service logs – Optimization
Optimize and
partition to improve
performance and
cost
Data stored
partitioned in Apache
Parquet format
22.
Optimization: Small Files
It is recommended to merge many small files into fewer, larger ones
Improves performance by up to 5x when accessing tables containing a large number of small files
Size (Bytes) File name
14408 kfhconnectblog-parquet-1-2018-05-11-16-01-9e7cc0b631d8.parquet
14408 kfhconnectblog-parquet-1-2018-05-11-16-01-206cf7098588.parquet
14408 kfhconnectblog-parquet-1-2018-05-11-16-03-6a3fa4c14e22.parquet
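One way to plan such a merge is to group the small files into batches and rewrite each batch as a single file. The sketch below covers only the grouping step, using a simple greedy pass; the function name `plan_merges` and the 128 MB default target are assumptions for illustration, not figures from the slide.

```python
def plan_merges(files, target_bytes=128 * 1024 * 1024):
    """Group (name, size) pairs into batches whose total size stays within target_bytes.

    A file larger than target_bytes ends up in a batch of its own.
    """
    batches, current, current_size = [], [], 0
    # Largest files first, so oversized files are isolated early.
    for name, size in sorted(files, key=lambda f: f[1], reverse=True):
        if current and current_size + size > target_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    listing = [("a.parquet", 14408), ("b.parquet", 14408), ("c.parquet", 14408)]
    for batch in plan_merges(listing):
        print(batch)
```

Each batch would then be read and rewritten as one Parquet file by whatever job does the compaction; that step is outside this sketch.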
23.
Inspecting AWS service logs – Catalog
AWS Glue
Data Catalog
Permissions
AWS Glue crawler
catalogs data schema
and partitions
Restrict access by
defining permissions
on databases and
tables
24.
Inspecting AWS service logs – Consume
AWS Glue
Data Catalog
Data Consumer
Data Engineer
26.
Athena Workgroups
Athena Workgroups are used to isolate queries
between different teams, workloads, or applications,
and to set limits on the amount of data each query or the
entire workgroup can process
Workload Isolation
Query Metrics
Cost Controls
27.
Workgroups – Workload Isolation
Unique query output
location per
Workgroup
Encrypt results with
unique AWS KMS key
per Workgroup
Collect and publish
aggregated metrics
per Workgroup to
Amazon CloudWatch
Use Workgroup
settings to eliminate the
need to configure
individual users
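The four isolation settings above map onto a single workgroup configuration. The sketch below builds the request for the `create_work_group` call in the AWS SDK for Python; the helper name `workgroup_request` and all argument values are placeholders, and the field names reflect the Athena API as I understand it, so verify them against the current SDK reference.

```python
def workgroup_request(name, output_location, kms_key_arn):
    """Build keyword arguments for athena.create_work_group()."""
    return {
        "Name": name,
        "Configuration": {
            "ResultConfiguration": {
                # Unique query output location for this workgroup.
                "OutputLocation": output_location,
                # Encrypt results with a KMS key dedicated to this workgroup.
                "EncryptionConfiguration": {
                    "EncryptionOption": "SSE_KMS",
                    "KmsKey": kms_key_arn,
                },
            },
            # Workgroup settings override per-user client settings.
            "EnforceWorkGroupConfiguration": True,
            # Publish aggregated query metrics to Amazon CloudWatch.
            "PublishCloudWatchMetricsEnabled": True,
        },
    }


if __name__ == "__main__":
    import boto3  # requires AWS credentials; all values below are placeholders

    request = workgroup_request(
        "analysts",
        "s3://bucket/path/analysts/",
        "arn:aws:kms:us-east-1:123456789012:key/example-key-id",
    )
    boto3.client("athena", region_name="us-east-1").create_work_group(**request)
```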
28.
Workgroups – Metric Reporting
Total bytes scanned
per Workgroup
Total failed queries
per Workgroup
Total successful
queries per
Workgroup
Total query execution
time per Workgroup
29.
Workgroups – Cost Controls
• Per-query data scanned threshold; a query that exceeds it is canceled
• Trigger alarms to notify of increasing usage and cost
• Disable the Workgroup when the data scanned by all of its queries exceeds a maximum threshold
Any Athena metric: successful/failed & total queries, query run time, etc.
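The first and third controls above could be applied through the Athena API roughly as follows. This is a sketch under assumptions: the function name `apply_cost_controls` is hypothetical, the running total of scanned bytes is assumed to come from elsewhere (for example a CloudWatch metric), and the client is any object with boto3's `update_work_group` method.

```python
def apply_cost_controls(athena, workgroup, per_query_limit_bytes,
                        scanned_bytes_total, workgroup_limit_bytes):
    """Set the per-query cutoff; disable the workgroup if its total is exceeded.

    Returns True when the workgroup was disabled.
    """
    # Queries that scan more than this many bytes are canceled by Athena.
    athena.update_work_group(
        WorkGroup=workgroup,
        ConfigurationUpdates={"BytesScannedCutoffPerQuery": per_query_limit_bytes},
    )
    if scanned_bytes_total > workgroup_limit_bytes:
        # A disabled workgroup rejects new queries until it is re-enabled.
        athena.update_work_group(WorkGroup=workgroup, State="DISABLED")
        return True
    return False


if __name__ == "__main__":
    import boto3  # requires AWS credentials; the numbers are placeholders

    client = boto3.client("athena", region_name="us-east-1")
    one_tb = 1024 ** 4
    print(apply_cost_controls(client, "analysts", 10 * 1024 ** 3, 0, one_tb))
```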
30.
Workgroups – Usage Notifications
Define a hierarchy of
alarms to be alerted
as usage increases
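A hierarchy of alarms can be expressed as one CloudWatch alarm per usage threshold. The sketch below only builds the alarm definitions; the helper name `usage_alarms`, the alarm naming scheme, and the one-hour period are hypothetical, and the `AWS/Athena` namespace with the `ProcessedBytes` metric and `WorkGroup` dimension is my understanding of the published Athena metrics, so confirm it in the CloudWatch console before relying on it.

```python
def usage_alarms(workgroup, thresholds_bytes, topic_arn, period_seconds=3600):
    """Build one put_metric_alarm() request per threshold, lowest threshold first."""
    alarms = []
    for level, threshold in enumerate(sorted(thresholds_bytes), start=1):
        alarms.append({
            "AlarmName": f"athena-{workgroup}-scanned-level-{level}",
            "Namespace": "AWS/Athena",
            "MetricName": "ProcessedBytes",
            "Dimensions": [{"Name": "WorkGroup", "Value": workgroup}],
            "Statistic": "Sum",
            "Period": period_seconds,
            "EvaluationPeriods": 1,
            "Threshold": float(threshold),
            "ComparisonOperator": "GreaterThanThreshold",
            # Each level notifies the same SNS topic here; separate topics
            # per level would allow different audiences to be alerted.
            "AlarmActions": [topic_arn],
        })
    return alarms


if __name__ == "__main__":
    import boto3  # requires AWS credentials; ARN and sizes are placeholders

    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
    gb = 1024 ** 3
    topic = "arn:aws:sns:us-east-1:123456789012:example-topic"
    for alarm in usage_alarms("analysts", [100 * gb, 500 * gb, 1000 * gb], topic):
        cloudwatch.put_metric_alarm(**alarm)
```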
32.
Denver, Colorado | San Francisco, California | Irvine, California | London, England
78,000 customers in 100+ countries | 45B emails monthly | 4 offices
33.
OUR CUSTOMERS
Powering the customer engagement for the world’s
leading digital brands
34.
SMTP/API
Transactional
Marketing
Campaigns
Promotional
Email Activity
Parse API
35.
Originally built entirely on-prem
36.
Customers were limited to 7 days of data
37.
High-volume customers were more constrained
38.
New customer base had different needs
39.
Provisioning Risks
41.
Serverless
Elasticity
API Integration
42.
Initial architecture during internal beta
43.
Controlled access to Athena
44.
We tested Athena query times vs file counts
45.
Historical data needed to be handled separately
46.
Athena architecture V3
AWS Glue
DynamoDB
47.
Benefits from using Athena
Scalability
Enables us to support
increased Email Activity data
in the future with little to no
additional cost.
Reduced variable
costs and ops
tickets
Improved customer
satisfaction
regarding access
to email data
48.
Serverless
Elasticity
API Integration
50. Thank you!
Roy Hasson
royon@amazon.com
Shane Andrade
shane.andrade@sendgrid.com