© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Pop-up Loft
Uses of Data Lakes (Data Lakes in the Wild)
Marie Yap
marieyap@amazon.com
Solutions Architect
David Roberts
drobemz@amazon.com
Solutions Architect
FINRA: Varied Analytic Use Cases
FINRA: Analytics Architecture
Diagram labels: Validation, Data Management, Linkage, Data Analytics, Normalization
AWS services shown: Amazon EC2, Amazon S3, Amazon Glacier, Amazon Redshift, Amazon EMR, VPC, Amazon RDS, Amazon Machine Learning, AWS KMS
Batch Analytics | Interactive & Visualizations | Data Science
FINRA: Interactive Analytics
FINRA: Universal Data Science Platform
FINRA: Evolution of the Analytics Portfolio
FINRA: Analytics Impacts
• Removed obstacles
“Before, data analysis of this magnitude required intervention from technology.”
“We are now able to see underlying data and visual representation of summaries together
with outliers and anomalies. This reduces our time to market on examinations.”
“We moved away from requesting raw reports to requesting dashboards that provide
meaningful information and tell a story…”
• Lowered the cost of curiosity
“Analysts are able to quickly obtain a full picture of what happens to an order over time,
helping to inform decision making as to whether a rule violation has occurred.”
“[W]ith a click we can now compare firms of our choice or defined peer groups. This helps
us by reducing a lot of noise…”
“Using machine learning algorithms validates our assumptions and makes us data driven”
• Optimize batch and interactive workloads without compromise
• Greater innovation and more engaged staff
BIG DATA IN HEALTHCARE
RWE
• We need to conduct observational studies to support a value prop
• We need to generate comprehensive value dossiers to support marketing access
CLINICAL
• We need to speed up patient recruitment
• We need to closely track our sites
COMMERCIAL
• We need to track our sales vs. forecast
• We need to understand market share
SUPPLY CHAIN
• We need to watch our cycle times
• We need to track our supply on hand
REGULATORY
• We need to track our global launch pad
• We need to track our regulatory status
VERY DIFFICULT TO COORDINATE ACROSS FUNCTIONS AND INFORM KEY DECISIONS
The opportunity
• Predictive Modeling
• Targeted Patient Enrollment
• Real Time Trial Monitoring
• Data Flow Across Boundaries (functional & organizational)
• Scale up CROs to Demand
• Open Collaboration with Partners and Academia
Patient
Data hub architecture
SOURCES
• Structured, Semi-Structured, Unstructured
• 3rd Party Feeds, Licensed Data
• EDW, MDM Servers, Project Servers, Planning Systems, CRO, SAP, SFDC
CORE DATA HUB/DATA LAKE
• Data Lake: Staging Layer, Conformed Layer, Application Layer
• Data Discovery: Data Wrangling, Data Exploration
• Data Access Fabric: Data Access & Data Feeds, ML
• Data Stores: SQL Data Store, NoSQL, In-Memory
• Data Acquisition & Processing: Data Sourcing & Transport, Batch Processing (ELT), Audit Balance & Control, Data Archival
• Data Services: Data Catalog, Data Security, Data Quality, Metadata Mgmt., Data Lineage
DATA CONSUMPTION
• Systems: Consuming Applications
• Conversational Interfaces: Chatbots, Voice Assistants
• Business Intelligence: Dashboards, Reporting, Ad-Hoc Query
• Data Science: Advanced Analytics, Machine Learning
• Visualization
FUNCTIONAL AREAS
• Clinical Operations, Research Operations, Medical Affairs, Pharmaceutical Sciences, Commercial, Regulatory, Portfolio Management, Real World Evidence, Market Access, Supply Chain, Procurement
Solution
STREAMLINED SOURCES
VISUAL OUTPUT (DASHBOARDS)
• Data Transparency
• Data Driven Decision Making
• Systems Integration
• Analytics Standardization
• Accountability
• Automation
DATA HUB PROJECT CAPABILITIES FOR USER
DATA & ANALYTICS ENVIRONMENT
DATA HUB
• Governed Data Entry
• Workflows
• Data Management
• Automated Integration
Business interfaces (sample)
AWS Platform Enables Us to be “Innovation Ready”
Conversational Interfaces (Amazon Lex)
• Chatbots
• Voice Assistants
Machine Learning (TensorFlow on AWS)
• Patient Outcomes
• Disease Pathways
• Disease Understanding
innovative
Lessons learned
SPEED RULES
Achieving business value is the primary focus
PARTNERSHIPS ARE THE NEW NORMAL
More externalization will occur; data exchange and collaboration are critical
AWS CLOUD PLATFORM
With its ecosystem of tool vendors, it enabled us to effectively achieve business value and rapidly adopt innovations
CHANGE MANAGEMENT
Don’t underestimate it, and plan appropriately
Sysco foods
An overview
Sysco is the global leader in selling, marketing, and distributing food products to restaurants, healthcare and
educational facilities, lodging establishments, and other customers who prepare meals away from home.
Sysco operates 197 distribution facilities and serves about half a million customers in 13 countries.
For Fiscal Year 2018 that ended July 1, 2018, Sysco generated sales of more than $55 billion.
The motivation = Three Year Plan
SEED
Sysco Ecosystem for Enterprise Data: an overview
What is Sysco Ecosystem for Enterprise Data (SEED)?
• SEED is an AWS-based ecosystem that allows Sysco to unlock the value from our data and drive our analytics journey forward, while also modernizing our technology landscape to enable scalable enterprise-wide data discovery and insights.
• SEED is envisioned to scale with evolving business needs and provides a foundation for data governance and data security.
• SEED, being cloud native, also helps drive data science and our agile journey forward with the ability to quickly stand up sandbox environments for experimentation.
Why SEED?
• Demand-driven model with predictable and affordable costs
• Stabilization of environments reduced cost of delivery over time
• Broad and deep functionality to support various use cases within data and analytics
• Improved agility and quality with powerful tools for data manipulations and migrations
SEED
Enabling analytics needs
Analytical use cases for the business
Revenue management
• Margins review by market
• Predictive pricing simulations with external economic data
• Pass through predictive pricing analysis at all levels of the organization
• Descriptive model for customer segmentation
Merchandising and supply chain
• Assortment optimization at scale
• Track vendor cost components of items
• Lotting using decision trees
• Forecast vendor price changes
• Market basket analysis
• Warehouse performance analysis
Marketing
• Share of Wallet
• Machine learning for future promotions
• Cross-sell opportunity feeder
• Churn analysis
The capabilities of SEED enable advanced analytics use cases already defined and requested by the various functional areas.
SEED
• Analytical sandboxes
• Quicker time to market
• R integration
• Better performing retrievals
• Large data sets
• Unstructured data
Unlocking the data potential
Through enablement of downstream use cases in analytics
Diagram components: Marketing data sources, Other Source Systems, Generic Configuration based Sync, Raw data storage, ETL Process, Amazon EMR, Transformed Data, Amazon Redshift, Amazon Redshift Spectrum, Amazon Athena
SEED
Sysco Ecosystem for Enterprise Data: a fit for purpose architecture
Layers: Ingestion / Collection Layer, Storage Layer, ELT / Compute Layer, Analyze Layer, Auditing and Monitoring Layer
Sources: WMS, IDS, DPR, Sales, Inventory, Master Data; SWMS
Storage: Amazon S3 (Raw data, Transformed Data, Reportable Data), Amazon Glacier archive
AWS services shown: AWS Lambda, Amazon EMR, AWS Glue, Metastore, Amazon Redshift, Amazon Redshift Spectrum, Amazon RDS, Amazon Athena, Amazon CloudWatch, AWS CloudTrail
Consumers: Extracts, Other BI apps, Internal, External, Data Science
Also labeled: Sygma, Freshpoint
Regions: eu-west-1, us-east-1, us-west-2
Availability zones: eu-west-1a, eu-west-1b, eu-west-1c, us-east-1c, us-east-1d, us-east-1e, us-west-2a, us-west-2b, us-west-2c
Instance families and sizes:
m4: .large, .xlarge, .2xlarge, .4xlarge, .16xlarge
m3: .medium, .large, .xlarge, .2xlarge
r4: .large, .xlarge, .2xlarge, .4xlarge, .8xlarge, .16xlarge
r3: .large, .xlarge, .2xlarge, .4xlarge, .8xlarge
i2: .xlarge, .2xlarge, .4xlarge, .8xlarge
i3: .large, .xlarge, .2xlarge, .4xlarge, .8xlarge, .16xlarge
d2: .xlarge, .2xlarge, .4xlarge, .8xlarge
Three regions, nine availability zones
7 instance families, 35 instance types
5 main accounts
1500+ configs!
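The "1500+ configs" figure appears to come from multiplying the counts stated on the slide. The sketch below shows that arithmetic. Reading it as zones times instance types times accounts is an assumption, since the slide itself does not spell out the formula.

```python
# Counts as stated on the slide. Combining them by multiplication is an
# assumed reading of where "1500+ configs" comes from.
availability_zones = 9   # three regions, three zones each
instance_types = 35
main_accounts = 5

configs = availability_zones * instance_types * main_accounts
print(configs)  # 1575
```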
THE EFFICIENCY HIERARCHY OF NEEDS
• Transparency: Intuitive and interactive dashboards
• Deep Dives: Exploratory analyses and case studies
• Actionable Insights: Targeted alerts, summary emails and personalized dashboards
• Automation: Optimization and Machine Learning
What do you need to know before you can even ask about efficiency?
“That which is measured improves.
That which is measured and reported improves exponentially.”
– Karl Pearson (or Thomas Monson)
Transparency
Transparency through dashboards, with a few important rules:
1. Tailor views to specific use cases
2. Add business context
3. When possible, co-locate with existing tools / workflows
EC2 Alerts (Picsou)
• Compute reservation shortages across all dimensions (accounts x zones x instance families)
• List in descending order of cost
• Attribute to top growing apps
• Also sent as a digest email linked back to Picsou
Actionable Insights
Data: Billing + Tribal Knowledge + Metadata
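The alerting idea above can be sketched roughly as follows: compute the reservation shortage for each (account, zone, instance family) cell and list the shortfalls by descending cost. This is a minimal illustration, not Picsou's implementation. The `Usage` fields, the cost model, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    # All fields are hypothetical stand-ins for billing and metadata inputs.
    account: str
    zone: str
    family: str
    used_units: float         # normalized compute units in use
    reserved_units: float     # normalized units covered by reservations
    on_demand_premium: float  # assumed extra cost per uncovered unit


def reservation_shortages(rows):
    """Return (row, shortage_cost) pairs for under-reserved cells, costliest first."""
    alerts = []
    for row in rows:
        shortage = row.used_units - row.reserved_units
        if shortage > 0:
            alerts.append((row, shortage * row.on_demand_premium))
    return sorted(alerts, key=lambda pair: pair[1], reverse=True)


rows = [
    Usage("prod", "us-east-1c", "r4", 120.0, 100.0, 0.5),
    Usage("prod", "us-west-2a", "m4", 80.0, 90.0, 0.4),   # over-reserved, no alert
    Usage("test", "eu-west-1a", "i3", 60.0, 10.0, 0.3),
]
alerts = reservation_shortages(rows)
for row, cost in alerts:
    print(row.account, row.zone, row.family, round(cost, 2))
```

Attributing each shortage to the fastest-growing applications would need per-application usage history, which this sketch leaves out.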
S3 Storage Class Optimization
• Very similar to the AWS S3 Analytics product.
• In fact, we use AWS S3 access analysis data, but make our own recommendations.
• Every recommendation can be explained from the very same dashboard
Automation
Data: AWS S3 Analytics + Tribal Knowledge
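One way to picture a recommendation that explains itself is a rule that returns both an action and the numbers behind it. The sketch below is illustrative only: the `PrefixStats` fields are not the S3 analytics export schema, and the 5% retrieval threshold is a made-up value, not the rule used in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrefixStats:
    # Hypothetical summary of access-analysis data for one bucket prefix.
    prefix: str
    stored_gb: float
    retrieved_gb_30d: float


def recommend(stats, max_retrieval_ratio=0.05):
    """Return (action, reason); the threshold is an assumed example value."""
    ratio = stats.retrieved_gb_30d / stats.stored_gb if stats.stored_gb else 0.0
    if ratio <= max_retrieval_ratio:
        action = "move to an infrequent-access class"
    else:
        action = "keep in the standard class"
    reason = (
        f"{stats.retrieved_gb_30d:g} GB retrieved of {stats.stored_gb:g} GB stored "
        f"in 30 days (ratio {ratio:.2f}, threshold {max_retrieval_ratio:.2f})"
    )
    return action, reason


cold_action, cold_reason = recommend(PrefixStats("logs/", 1000.0, 10.0))
hot_action, hot_reason = recommend(PrefixStats("hot/", 100.0, 80.0))
print(cold_action, "-", cold_reason)
print(hot_action, "-", hot_reason)
```

Returning the reason alongside the action is what lets a dashboard show why each recommendation was made.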
Self-Service C2G
Give data producers, consumers and caretakers the ability to manage their own efficiency:
• Identify all involved parties along a data topic
• Apportion data-infrastructure cost to all relevant teams
• Quickly notice low-usage data topics
• Estimate data-replication or large sinks to users ratios
Long term: enable data-platform owners to use this tool or underlying data to add some automation.
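Apportioning cost could look something like the sketch below, which splits a data topic's infrastructure cost across teams in proportion to their usage. Proportional splitting, the even-split fallback, and the team names are assumptions for illustration; the slides do not say how cost is actually divided.

```python
def apportion_cost(total_cost, usage_by_team):
    """Split total_cost across teams in proportion to usage (assumed policy)."""
    if not usage_by_team:
        return {}
    total_usage = sum(usage_by_team.values())
    if total_usage == 0:
        # No recorded usage: fall back to an even split (also an assumption).
        even = total_cost / len(usage_by_team)
        return {team: even for team in usage_by_team}
    return {
        team: total_cost * usage / total_usage
        for team, usage in usage_by_team.items()
    }


# Hypothetical teams and usage counts for one data topic.
shares = apportion_cost(1000.0, {"producer": 1, "consumer_a": 3, "consumer_b": 0})
print(shares)
```

A team with a zero share, such as `consumer_b` here, is also a simple signal of low usage on that topic.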
KEY TAKEAWAYS
1. Netflix culture, scale, architecture and priorities require efficiency to be championed by a central team, but enforced by all engineers.
2. This is achieved by implementing the successive layers of our efficiency hierarchy of needs:
2a. Transparency to get context,
2b. Deep Dives to tell compelling stories and assemble puzzles,
2c. Actionable Insights to reduce the cognitive load on your organization,
2d. Automation to scale the impact of efficiency efforts.
aws.amazon.com/activate
Everything and Anything Startups
Need to Get Started on AWS

How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Uses of Data Lakes: Data Analytics Week SF

  • 1. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Pop-up Loft. Uses of Data Lakes (Data Lakes in the Wild). Marie Yap (marieyap@amazon.com), Solutions Architect; David Roberts (drobemz@amazon.com), Solutions Architect.
  • 2.
  • 4. FINRA: Analytics Architecture. Validation, Data Management, Linkage, Data Analytics, Normalization. Amazon EC2, Amazon S3, Amazon Glacier, Amazon Redshift, Amazon EMR, VPC, Amazon EMR, Amazon RDS, Amazon Machine Learning, AWS KMS. Batch Analytics, Interactive & Visualizations, Data Science.
  • 6. FINRA: Universal Data Science Platform
  • 7. FINRA: Evolution of the Analytics Portfolio
  • 8. FINRA: Analytics Impacts. • Removed obstacles: “Before, data analysis of this magnitude required intervention from technology.” “We are now able to see underlying data and visual representation of summaries together with outliers and anomalies. This reduces our time to market on examinations.” “We moved away from requesting raw reports to requesting dashboards that provide meaningful information and tell a story…” • Lowered the cost of curiosity: “Analysts are able to quickly obtain a full picture of what happens to an order over time, helping to inform decision making as to whether a rule violation has occurred.” “[W]ith a click we can now compare firms of our choice or defined peer groups. This helps us by reducing a lot of noise…” “Using machine learning algorithms validates our assumptions and makes us data driven.” • Optimize batch and interactive workloads without compromise • Greater innovation and more engaged staff
  • 9. BIG DATA IN HEALTHCARE. RWE: We need to conduct observational studies to support a value prop. We need to generate comprehensive value dossiers to support marketing access. CLINICAL: We need to speed up patient recruitment. We need to closely track our sites. COMMERCIAL: We need to track our sales vs. forecast. We need to understand market share. SUPPLY CHAIN: We need to watch our cycle times. We need to track our supply on hand. REGULATORY: We need to track our global launch pad. We need to track our regulatory status. VERY DIFFICULT TO COORDINATE ACROSS FUNCTIONS AND INFORM KEY DECISIONS.
  • 10. The opportunity: Predictive Modeling; Targeted Patient Enrollment; Real Time Trial Monitoring; Data Flow Across Boundaries (functional & organizational); Scale up CROs to Demand; Open Collaboration with Partners and Academia; Patient.
  • 11. Data hub architecture. SOURCES: Structured, Semi-Structured, Unstructured, 3rd Party Feeds, Licensed Data, EDW, MDM Servers, Project Servers, Planning Systems, CRO, SAP, SFDC. CORE DATA HUB/DATA LAKE: Data Lake, Staging Layer, Conformed Layer, Application Layer, Data Discovery, Data Wrangling, Data Exploration, Data Access Fabric, Data Access & Data Feeds, ML | Data Stores, SQL Data Store, NoSQL, In-Memory, Data Acquisition & Processing, Data Sourcing & Transport, Batch Processing (ELT), Audit Balance & Control, Data Archival | Data Services, Data Catalog, Data Security, Data Quality, Metadata Mgmt., Data Lineage. DATA CONSUMPTION: Systems, Consuming Applications, Conversational Interfaces, Chatbots, Voice Assistants, Business Intelligence, Dashboards, Reporting, Ad-Hoc Query, Data Science, Advanced Analytics, Machine Learning. FUNCTIONAL AREAS: Clinical Operations, Research Operations, Medical Affairs, Pharmaceutical Sciences, Commercial, Regulatory, Portfolio Management, Real World Evidence, Market Access, Supply Chain, Procurement. Visualization.
  • 12. Solution. STREAMLINED SOURCES; VISUAL OUTPUT (DASHBOARDS). Data Transparency, Data Driven Decision Making, Systems Integration, Analytics Standardization, Accountability, Automation. DATA HUB PROJECT CAPABILITIES FOR USER DATA & ANALYTICS ENVIRONMENT. DATA HUB: Governed Data Entry Workflows, Data Management, Automated Integration.
  • 13. Business interfaces (sample)
  • 14. AWS Platform Enables Us to be “Innovation Ready”. Conversational Interfaces (Amazon Lex): Chatbots, Voice Assistants. Machine Learning (TensorFlow on AWS): Patient Outcomes, Disease Pathways, Disease Understanding. Innovative.
  • 15. Lessons learned. SPEED RULES: Achieving business value is the primary focus. PARTNERSHIPS ARE THE NEW NORMAL: More externalization will occur; data exchange and collaboration are critical. AWS CLOUD PLATFORM: With its ecosystem of tool vendors, it enabled us to effectively achieve business value and rapidly adopt innovations. CHANGE MANAGEMENT: Don’t underestimate it, and plan appropriately.
  • 16. Sysco foods: An overview. Sysco is the global leader in selling, marketing, and distributing food products to restaurants, healthcare and educational facilities, lodging establishments, and other customers who prepare meals away from home. Sysco operates 197 distribution facilities and serves about half a million customers in 13 countries. For Fiscal Year 2018, which ended July 1, 2018, Sysco generated sales of more than $55 billion.
  • 17. The motivation = Three Year Plan
  • 18. SEED: Sysco Ecosystem for Enterprise Data, an overview. What is Sysco Ecosystem for Enterprise Data (SEED)? • SEED is an AWS-based ecosystem that allows Sysco to unlock the value from our data and drive our analytics journey forward, while also modernizing our technology landscape to enable scalable enterprise-wide data discovery and insights. • SEED is envisioned to scale with evolving business needs and provides a foundation for data governance and data security. • SEED, being cloud native, inherently also helps drive the data science and our agile journey forward with the ability to quickly stand up sandbox environments for experimentation. Why SEED? • Demand-driven model with predictable and affordable costs • Stabilization of environments reduced cost of delivery over time • Broad and deep functionality to support various use cases within data and analytics • Improved agility and quality with powerful tools for data manipulations and migrations
  • 19. SEED: Enabling analytics needs. Analytical use cases for the business. Revenue management: • Margins review by market • Predictive pricing simulations with external economic data • Pass through predictive pricing analysis at all levels of the organization • Descriptive model for customer segmentation. Merchandising and supply chain: • Assortment optimization at scale • Track vendor cost components of items • Lotting using decision trees • Forecast vendor price changes • Market basket analysis • Warehouse performance analysis. Marketing: • Share of Wallet • Machine learning for future promotions • Cross-sell opportunity feeder • Churn analysis. The capabilities of SEED allow for the enablement of advanced analytics use cases already defined and requested by the various functional areas. SEED: • Analytical sandboxes • Quicker time to market • R integration • Better performing retrievals • Large data sets • Unstructured data
  • 20. Unlocking the data potential through enablement of downstream use cases in analytics. Marketing data sources, Raw data storage, Amazon Redshift, Transformed Data, Amazon Athena, Amazon EMR, ETL Process, Other Source Systems, Generic Configuration based Sync, Amazon Redshift Spectrum.
  • 21. SEED: Sysco Ecosystem for Enterprise Data, a fit-for-purpose architecture. WMS, IDS, DPR, Sales, Inventory, Master Data, SWMS, Amazon S3, Raw data, Transformed Data, Reportable Data, AWS Lambda, Amazon EMR, Amazon Redshift, Amazon RDS, Extracts, Amazon Athena, Other BI apps, Internal, External, Data Science. Layers: ELT / Compute Layer, Storage Layer, Analyze Layer, Ingestion/Collection Layer, Auditing and Monitoring Layer. Amazon CloudWatch, Extracts, Consumers, Sygma, Freshpoint, AWS Glue, AWS CloudTrail, Amazon Glacier archive, Metastore, Amazon Redshift Spectrum, AWS Glue.
  • 22.
  • 23. Three regions, nine availability zones: eu-west-1 (eu-west-1a, eu-west-1b, eu-west-1c), us-east-1 (us-east-1c, us-east-1d, us-east-1e), us-west-2 (us-west-2a, us-west-2b, us-west-2c). 7 instance families, 35 instance types: m4 (.large, .xlarge, .2xlarge, .4xlarge, .16xlarge), m3 (.medium, .large, .xlarge, .2xlarge), r4 (.large, .xlarge, .2xlarge, .4xlarge, .8xlarge, .16xlarge), r3 (.large, .xlarge, .2xlarge, .4xlarge, .8xlarge), i2 (.xlarge, .2xlarge, .4xlarge, .8xlarge), i3 (.large, .xlarge, .2xlarge, .4xlarge, .8xlarge, .16xlarge), d2 (.xlarge, .2xlarge, .4xlarge, .8xlarge). 5 main accounts. 1500+ configs!
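One plausible reading of the "1500+ configs" figure on slide 23 is the product of the three counts the slide states: availability zones, instance types, and main accounts. The slide does not give the formula, so the sketch below is an assumption; the zone names and the counts 35 and 5 are taken from the slide, while `config_count` is a hypothetical helper name.

```python
from math import prod

# Availability zones exactly as listed on the slide (three per region).
ZONES = [
    "eu-west-1a", "eu-west-1b", "eu-west-1c",
    "us-east-1c", "us-east-1d", "us-east-1e",
    "us-west-2a", "us-west-2b", "us-west-2c",
]
INSTANCE_TYPES = 35  # count stated on the slide
MAIN_ACCOUNTS = 5    # count stated on the slide


def config_count(n_zones: int, n_types: int, n_accounts: int) -> int:
    """Number of (zone, instance type, account) combinations.

    Assumption: a "config" is one such combination; the slide does not say.
    """
    return prod([n_zones, n_types, n_accounts])


total = config_count(len(ZONES), INSTANCE_TYPES, MAIN_ACCOUNTS)
print(total)
```

Under that assumption the product is 9 × 35 × 5 = 1575, which is consistent with the slide's "1500+".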
  • 24. THE EFFICIENCY HIERARCHY OF NEEDS. Transparency: Intuitive and interactive dashboards. Deep Dives: Exploratory analyses and case studies. Actionable Insights: Targeted alerts, summary emails and personalized dashboards. Automation: Optimization and Machine Learning.
  • 25. THE EFFICIENCY HIERARCHY OF NEEDS. Transparency: What do you need to know before you can even ask about efficiency? “That which is measured improves. That which is measured and reported improves exponentially.” – Karl Pearson (or Thomas Monson). Transparency through dashboards, with a few important rules: 1. Tailor views to specific use cases 2. Add business context 3. When possible, co-locate with existing tools / workflows
  • 26. THE EFFICIENCY HIERARCHY OF NEEDS. Actionable Insights: EC2 Alerts (Picsou) • Compute reservation shortages across all dimensions (accounts x zones x instance families) • List in descending order of cost • Attribute to top growing apps • Also sent as a digest email linked back to Picsou. Data: Billing + Tribal Knowledge + Metadata
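The alert logic on slide 26 (find reservation shortages per account, zone and instance family, then list them in descending order of cost) can be sketched roughly as follows. This is not Picsou's implementation, which the slides do not show: the `Usage` record, the `reservation_shortages` function, the sample rows and the hourly prices are all hypothetical, and the sketch ignores instance-size normalization within a family.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    account: str
    zone: str
    family: str
    used: int            # instances currently running
    reserved: int        # instances covered by reservations
    hourly_price: float  # hypothetical on-demand price per instance-hour


def reservation_shortages(rows):
    """Return (usage, shortfall, hourly_cost) for every dimension where
    usage exceeds reservations, sorted by hourly cost, highest first."""
    alerts = []
    for row in rows:
        shortfall = row.used - row.reserved
        if shortfall > 0:
            alerts.append((row, shortfall, shortfall * row.hourly_price))
    alerts.sort(key=lambda alert: alert[2], reverse=True)
    return alerts


# Made-up sample data for illustration only.
rows = [
    Usage("acct-a", "us-east-1c", "m4", used=120, reserved=100, hourly_price=0.25),
    Usage("acct-b", "eu-west-1a", "r4", used=40, reserved=50, hourly_price=0.5),
    Usage("acct-a", "us-west-2b", "i3", used=30, reserved=10, hourly_price=1.0),
]

alerts = reservation_shortages(rows)
for usage, shortfall, cost in alerts:
    print(usage.account, usage.zone, usage.family, shortfall, cost)
```

Rows where reservations already cover usage (the `r4` row here) produce no alert, so the digest stays focused on the dimensions that cost money.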
  • 27. THE EFFICIENCY HIERARCHY OF NEEDS. Automation: S3 Storage Class Optimization • Very similar to the AWS S3 Analytics product. • In fact, we use AWS S3 access analysis data, but make our own recommendations.
  • 28. THE EFFICIENCY HIERARCHY OF NEEDS. Automation: S3 Storage Class Optimization • Every recommendation can be explained from the very same dashboard. Data: AWS S3 Analytics + Tribal Knowledge
  • 29. Self-Service C2G. Give data producers, consumers and caretakers the ability to manage their own efficiency: • Identify all involved parties along a data topic • Apportion data-infrastructure cost to all relevant teams • Quickly notice low-usage data topics • Estimate data-replication or large sinks to users ratios. Long term: enable data-platform owners to use this tool or its underlying data to add some automation.
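Slide 29 mentions apportioning data-infrastructure cost to all relevant teams but does not say how. A minimal sketch, assuming cost is split in proportion to some per-team usage measure (for example, bytes read from a data topic), might look like this; `apportion_cost`, the team names and the numbers are hypothetical.

```python
def apportion_cost(total_cost: float, usage_by_team: dict) -> dict:
    """Split total_cost across teams in proportion to their usage.

    Assumption: proportional allocation. If no team has any recorded
    usage, the cost is split evenly so that it is still fully attributed.
    """
    if not usage_by_team:
        return {}
    total_usage = sum(usage_by_team.values())
    if total_usage == 0:
        even_share = total_cost / len(usage_by_team)
        return {team: even_share for team in usage_by_team}
    return {
        team: total_cost * usage / total_usage
        for team, usage in usage_by_team.items()
    }


# Made-up example: one producer and two consumer teams on a single data topic.
shares = apportion_cost(1000.0, {"producers": 1, "consumers-a": 3, "consumers-b": 0})
print(shares)
```

A team with a zero share, such as `consumers-b` here, is also a candidate signal for the "low-usage data topics" bullet on the same slide.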
  • 30. KEY TAKEAWAYS. 1. Netflix culture, scale, architecture and priorities require efficiency to be championed by a central team, but enforced by all engineers. 2. This is achieved by implementing the successive layers of our efficiency hierarchy of needs: 2a. Transparency to get context, 2b. Deep Dives to tell compelling stories and assemble puzzles, 2c. Actionable Insights to reduce the cognitive load on your organization, 2d. Automation to scale the impact of efficiency efforts.
  • 31. Pop-up Loft. aws.amazon.com/activate. Everything and Anything Startups Need to Get Started on AWS.