Driving Business Insights with a Modern Data Architecture AWS Summit SG 2017

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Driving Business Insights
with a Modern Data Architecture
Craig Stires
Head of Big Data and Analytics
Amazon Web Services, APAC

Today's conversation
Iterative design for business outcomes
Building for scale and for speed

Should we collect "all the data" and see what's in it?
[Chart: data volume over time (2010, 2015, 2020, 2025), comparing available data with data analyzed for benefit; cost and value regions mark the investment value of analytics]

Starting by amassing "all your data" and dumping
into a large repository for the data gurus to start
finding "insights" is like trying to win the lottery by
buying all the tickets

Three big indicators of individual behavior
Purchases Movement Influence

A platform to build business outcomes from data
Inputs: Purchases, Movement, Influence
Stages: Ingest/Collect, Store, Process/Analyze, Consume/Visualize
Outcomes: Revenue lift, Market acquisition, Customer delight, Brand advocacy, Inventory optimization, Supply chain efficiency, ...

Design an on-demand experimentation sandbox
... and only pay for what you use

Experimentation is fast and cost-efficient
Design once, automatically deploy many times
AWS CloudFormation
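As a minimal sketch of "design once, deploy many times", the Python below defines one template and builds a stack request per experiment from it. The resource choice (a single S3 bucket), the `ExperimentName` parameter, and the `sandbox-` stack names are illustrative assumptions, not taken from the talk. Nothing is sent to AWS here; the request dictionaries use the argument names that a CloudFormation create-stack call expects.

```python
import json


def sandbox_template() -> dict:
    """Return a minimal CloudFormation template for one sandbox bucket."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Hypothetical analytics sandbox: one S3 bucket for raw data",
        "Parameters": {
            "ExperimentName": {"Type": "String"},
        },
        "Resources": {
            "RawDataBucket": {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    # Tag the bucket so each experiment's cost can be tracked.
                    "Tags": [
                        {"Key": "experiment", "Value": {"Ref": "ExperimentName"}}
                    ],
                },
            },
        },
        "Outputs": {
            "RawDataBucketName": {"Value": {"Ref": "RawDataBucket"}},
        },
    }


def stack_requests(experiments):
    """Build one create-stack request per experiment from the same template."""
    body = json.dumps(sandbox_template(), sort_keys=True)
    return [
        {
            "StackName": f"sandbox-{name}",
            "TemplateBody": body,
            "Parameters": [
                {"ParameterKey": "ExperimentName", "ParameterValue": name}
            ],
        }
        for name in experiments
    ]


if __name__ == "__main__":
    for request in stack_requests(["pricing", "churn"]):
        print(request["StackName"])
```

Each experiment gets its own stack, so deleting a stack removes that experiment's resources and leaves the others untouched.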

Quickly find the right outcomes, and turn off
the rest -- Win fast, fail cheap

AWS Cloud is a robust technology infrastructure platform
delivered on-demand, via the internet, with pay-as-you-go pricing
Over 80 services designed for security, scale, and availability

Modern data architectures for business insights at scale

Outcome 1 : Modernize and Consolidate
Enhancing business applications and creating new digital
services involves the modernization and consolidation of
existing legacy applications and operational systems.
Business goals often include becoming an agile, well-run
organization and no longer missing opportunities because
people are making decisions without accurate insights. #FOMO

Common initiatives
• Insights: 360 view of the business
• Digitization: Web-service that gives on-demand insights
• Data monetization: Enrich, aggregate, and sell business data

Start with the business case and the personas
Modernize and consolidate: insights to enhance business applications and new digital services
[Architecture diagram. Layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Personas: Business users, External buyers, Data analysts]

Extract and ingest data from on-premises systems and internet-native sources
[Architecture diagram. Data sources: Transactions, Web logs / cookies, ERP. Ingest: AWS Database Migration Service, AWS Direct Connect, Internet Interfaces. Personas: Business users, External buyers, Data analysts]

Decouple Storage and Compute
Legacy design was large databases or
data warehouses with integrated
hardware
Big Data architectures often benefit
from decoupling storage and compute

Amazon S3
Highly available object storage
99.999999999% data durability
Replicated across 3 facilities
Virtually unlimited scale
Pay only for usage, no pre-provisioning
Event notifications to trigger actions
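To illustrate how an event notification can trigger an action, here is a small sketch of a Lambda-style handler that reads an S3 notification payload and decides which objects should start a processing step. The bucket name, the `raw/` prefix rule, and the `start_etl` action are hypothetical; only the `Records` / `s3` / `bucket` / `object` layout follows the S3 notification message format.

```python
from urllib.parse import unquote_plus


def objects_from_s3_event(event: dict) -> list:
    """Extract (bucket, key) pairs from an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded in the notification.
        key = unquote_plus(s3["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def handler(event, context=None):
    """Lambda-style entry point: choose an action for each new object."""
    actions = []
    for bucket, key in objects_from_s3_event(event):
        # Hypothetical routing rule: only objects under raw/ start an ETL step.
        if key.startswith("raw/"):
            actions.append({"action": "start_etl", "bucket": bucket, "key": key})
    return actions


# Hand-written sample payload with hypothetical names.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-raw-data"},
                "object": {"key": "raw/web+logs/2017-01-01.json"},
            },
        },
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-raw-data"},
                "object": {"key": "tmp/scratch.txt"},
            },
        },
    ]
}

if __name__ == "__main__":
    print(handler(sample_event))
```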

Amazon EMR
Fully managed Hadoop
Optimized with S3
Autoscaling for elasticity
Transient and long running clusters
Integration with AWS Spot Market

Hadoop at scale - best when built for purpose
Large-scale ETL
• "finish before 5a"
• Time insensitive
• Great for leveraging Spot
• Use EMRFS w/S3
Analytic modelling; iterative discovery
• "massive grid processing"
• On-demand from 9a-6p
• Agile binning strategies
• Use EMRFS w/S3
Un/semi-structured data processing
• "process stream chunks"
• Runs 24x7
• Use EC2 RIs
• Use S3/Lambda triggers

Amazon Redshift
Fully managed
MPP SQL database - fully relational
Optimised for analytics
Gigabytes to petabytes
Less than 1/10th the cost of traditional data warehouse technologies

Process data for ETL, cleansing, and tagging, and place it into Staged Data (Data Lake)
[Architecture diagram. Added in this view: Amazon S3 (Raw Data), Amazon EMR (ETL), Amazon S3 (Staged Data / Data Lake)]

Secure all data and services, and enable governance
[Architecture diagram. Added in this view: AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS]

Load the Data Warehouse and other database platforms
[Architecture diagram. Added in this view: Amazon Redshift (Data Warehouse), Amazon RDS (Legacy Apps)]

Serve users through BI tools, dashboards, or API access
[Architecture diagram. Added in this view: Amazon QuickSight, Amazon API Gateway]

Outcome 2 : Innovate for new revenues
Organizations start operating based on what they know
about their customers, and can approach new ventures in
terms of confidence levels.
Product launches, campaigns, supply chain management,
packaged services, and customized offerings are designed
and executed based on predictive models. #KnownUnknown

Common initiatives
• Personalization: Refine market approaches on optimal segments
• Predict demand: Guide business owners to select best scenarios
• Risk measurement: Create freedom to act by quantifying exposures

Start with the business case and the personas
Insights to enhance business applications, new digital services
[Architecture diagram. Pipeline carried over from Outcome 1: data sources, ingest, Amazon S3 (Raw Data), Amazon EMR (ETL), Amazon S3 (Staged Data / Data Lake), Amazon Redshift (Data Warehouse), Amazon RDS (Legacy Apps), with AWS CloudTrail, AWS IAM, Amazon CloudWatch, and AWS KMS. Personas: Data analysts, Data scientists, Business users, Engagement platforms]

Amazon Kinesis
Real-time processing of large, distributed data streams
Elastic capacity scales to millions of events per second
Handle incoming stream events in real time
Stream storage replicated across 3 facilities

Amazon Athena
Interactive query service to analyze data in Amazon S3 directly using standard SQL
No need to move data
No infrastructure to set up and manage
Fast -- results within seconds
Pay only for the queries you run
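A rough sketch of what "standard SQL directly on data in S3" can look like from code: the function below assembles the request for one query. The database, table, column, and results bucket names are hypothetical, and the query is only built, not submitted. The dictionary keys mirror the arguments of Athena's start-query-execution call.

```python
def athena_query_request(database: str, table: str, output_location: str) -> dict:
    """Build the arguments for one Athena query over a table backed by S3."""
    # Hypothetical query: the ten most viewed pages in a web-log table.
    sql = (
        "SELECT page, COUNT(*) AS views "
        f'FROM "{table}" '
        "GROUP BY page ORDER BY views DESC LIMIT 10"
    )
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        # Athena writes query results to an S3 location chosen by the caller.
        "ResultConfiguration": {"OutputLocation": output_location},
    }


if __name__ == "__main__":
    request = athena_query_request(
        "weblogs", "page_views", "s3://example-athena-results/"
    )
    print(request["QueryString"])
```

The data stays where it is in S3; only the query text and the result location change from one analysis to the next.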

Data analysts can process data as "schema on read"
[Architecture diagram. Added in this view: Amazon Kinesis, Amazon Athena, Amazon Elasticsearch Service; new data sources: Connected devices, Social media]
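"Schema on read" means the raw records are stored as they arrive and a schema is applied only when someone reads them. The plain-Python sketch below shows the idea on a few made-up JSON lines; the field names and both schemas are illustrative assumptions, and a real deployment would more likely apply the schema through a query engine than through hand-written code.

```python
import json

# Raw data is stored untouched; these sample lines are made up.
RAW_LINES = [
    '{"user": "u1", "amount": "19.90", "ts": "2017-04-12T08:00:00"}',
    '{"user": "u2", "amount": "5", "channel": "web"}',
    "not json at all",
]

# Each reader brings its own schema: field name -> type conversion.
PURCHASE_SCHEMA = {"user": str, "amount": float}
CHANNEL_SCHEMA = {"user": str, "channel": str}


def read_with_schema(lines, schema):
    """Yield rows projected onto `schema`, skipping lines that do not fit."""
    for line in lines:
        try:
            record = json.loads(line)
            row = {field: cast(record[field]) for field, cast in schema.items()}
        except (ValueError, KeyError, TypeError):
            # Malformed line or missing field: not usable under this schema.
            continue
        yield row


if __name__ == "__main__":
    print(list(read_with_schema(RAW_LINES, PURCHASE_SCHEMA)))
    print(list(read_with_schema(RAW_LINES, CHANNEL_SCHEMA)))
```

The same stored lines answer two different questions, and adding a third schema later requires no change to the stored data.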

Data scientists produce predictive models and other analysis
[Architecture diagram. Added in this view: Amazon EMR (MLlib), Deep Learning, Amazon ML]

Engagement is automated, based on advanced analytics
[Architecture diagram. Shown in this view: Advanced Analytics (MLlib)]

Modern data architectures for real-time analytics and engagement

Outcome 3 : Real-time Engagement
Provide superior customer service by responding to
opportunities in real time. Fulfill requests for products or
services in an automated fashion to create a strong
competitive advantage over those that are unable to.
Adding another layer of opportunity and complexity is the use
of vast streams of data from devices that are measuring
location, video, behaviors, environmental conditions, and
more.
#WindowOfOpportunity

Common initiatives
• Interactive CX: Natural customer journeys with adaptive interfaces
• Event-driven automation: Triggered execution of business process
• Fraud detection: Protect customer and business interests

Events are captured in the speed layer
[Architecture diagram. Added in the speed layer: Event Capture (Amazon Kinesis), Stream Analysis (Amazon EMR), Automation / events]

AWS Lambda
Fully managed serverless compute
Can load data from sources (S3, DynamoDB) automatically into your data architecture (e.g. Amazon Redshift)
Can be triggered in real time by incoming events in Amazon Kinesis, or by changes to Amazon S3 buckets
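As an illustration of a function triggered by incoming Kinesis events, the sketch below decodes the base64-encoded record payloads that Lambda receives from a stream and applies a simple rule to them. The payload fields and the "large purchase" threshold are hypothetical, and the sample event is hand-built for the example rather than captured from a real stream.

```python
import base64
import json


def decode_kinesis_records(event: dict) -> list:
    """Decode the JSON payloads carried in a Kinesis-triggered Lambda event."""
    payloads = []
    for record in event.get("Records", []):
        # Kinesis record data is delivered to Lambda base64-encoded.
        raw = base64.b64decode(record["kinesis"]["data"])
        payloads.append(json.loads(raw))
    return payloads


def handler(event, context=None):
    """Lambda-style entry point: count events and flag large purchases."""
    events = decode_kinesis_records(event)
    # Hypothetical business rule, chosen only for illustration.
    flagged = [e for e in events if e.get("amount", 0) >= 1000]
    return {"received": len(events), "flagged": flagged}


def _encode(payload: dict) -> str:
    """Helper for building sample events: JSON, then base64 text."""
    return base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")


sample_event = {
    "Records": [
        {"kinesis": {"partitionKey": "u1",
                     "data": _encode({"user": "u1", "amount": 1500})}},
        {"kinesis": {"partitionKey": "u2",
                     "data": _encode({"user": "u2", "amount": 20})}},
    ]
}

if __name__ == "__main__":
    print(handler(sample_event))
```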

Amazon Rekognition
Image recognition and analysis powered by deep learning, which lets you search, verify, and organize millions of images
• Easy to use
• Batch analysis
• Real-time analysis
• Continually improving
• Low cost

[Demo image labels: Maple, Villa, Plant, Garden, Water, Swimming Pool, Tree, Potted Plant, Backyard]

[Demo facial analysis: Demographic Data, Facial Landmarks, Sentiment Expressed, Image Quality (Brightness: 25.84, Sharpness: 160), General Attributes]

The event handler sends events for scoring, getting in-flight enrichment signals
[Architecture diagram. Added in the speed layer: Event Handler (AWS Lambda), Event Scoring (Amazon AI)]

Published models are used, or black box services are called
[Architecture diagram. Speed layer: Event Capture (Amazon Kinesis), Stream Analysis, Amazon AI, Event Handler (AWS Lambda), Automation / events]

Responses are pushed for near real-time action
[Architecture diagram. Added in the speed layer: Response Handler (AWS Lambda), Near-Zero Latency (Amazon DynamoDB)]
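The speed-layer flow described in these views (an event handler sends events for scoring, then a response handler pushes a decision for near real-time action) can be sketched with plain functions. Everything below is a stand-in: `rule_based_score` takes the place of a published model or an external scoring service, the in-memory dictionary takes the place of a low-latency store, and the event fields and thresholds are invented for the example.

```python
from typing import Callable, Dict


def rule_based_score(event: dict) -> float:
    """Stand-in for a published model or an external scoring service."""
    score = 0.0
    if event.get("amount", 0) > 1000:
        score += 0.6
    if event.get("country") != event.get("home_country"):
        score += 0.3
    return round(score, 2)


def handle_event(event: dict, score_fn: Callable[[dict], float]) -> dict:
    """Event handler: enrich the in-flight event with a score."""
    return {**event, "risk_score": score_fn(event)}


def handle_response(scored: dict, store: Dict[str, dict],
                    threshold: float = 0.5) -> dict:
    """Response handler: turn the score into a decision and publish it."""
    decision = "review" if scored["risk_score"] >= threshold else "approve"
    response = {"event_id": scored["event_id"], "decision": decision}
    # The store is a plain dict here; a key-value database would fill this role.
    store[scored["event_id"]] = response
    return response


if __name__ == "__main__":
    responses: Dict[str, dict] = {}
    incoming = [
        {"event_id": "e1", "amount": 2500, "country": "SG", "home_country": "SG"},
        {"event_id": "e2", "amount": 20, "country": "SG", "home_country": "SG"},
    ]
    for event in incoming:
        handle_response(handle_event(event, rule_based_score), responses)
    print(responses)
```

Passing the scorer in as a function keeps the handler unchanged when the scoring source is swapped, which matches the choice between published models and black box services.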

Business Outcomes on a Modern Data Architecture
Outcome 1 : Modernize and consolidate
• Insights to enhance business applications and create new digital services
Outcome 2 : Innovate for new revenues
• Personalization, demand forecasting, risk analysis
Outcome 3 : Real-time engagement
• Interactive customer experience, event-driven automation, fraud detection
Outcome 4 : Automate for expansive reach
• Automation of business processes and physical infrastructure

Taking your ideas and building the next big thing

Outcomes from a modern data architecture
Start with the business case
Experiment and iterate
Deploy with automation
Decouple to scale
