Data Analytics Week at the San Francisco Loft
Uses of Data Lakes
Examples of using data lakes from different AWS customers.
Speakers:
John Mallory - Principal Business Development Manager Storage (Object), AWS
Marie Yap - Enterprise Solutions Architect, AWS
8. FINRA: Analytics Impacts
• Removed obstacles
“Before, data analysis of this magnitude required intervention from technology.”
“We are now able to see underlying data and visual representation of summaries together
with outliers and anomalies. This reduces our time to market on examinations.”
“We moved away from requesting raw reports to requesting dashboards that provide
meaningful information and tell a story…”
• Lowered the cost of curiosity
“Analysts are able to quickly obtain a full picture of what happens to an order over time,
helping to inform decision making as to whether a rule violation has occurred.”
“[W]ith a click we can now compare firms of our choice or defined peer groups. This helps
us by reducing a lot of noise…”
“Using machine learning algorithms validates our assumptions and makes us data driven”
• Optimize batch and interactive workloads without compromise
• Greater innovation and more engaged staff
9. BIG DATA IN HEALTHCARE
RWE
• We need to conduct observational studies to support a value prop
• We need to generate comprehensive value dossiers to support marketing access
CLINICAL
• We need to speed up patient recruitment
• We need to closely track our sites
COMMERCIAL
• We need to track our sales vs. forecast
• We need to understand market share
SUPPLY CHAIN
• We need to watch our cycle times
• We need to track our supply on hand
REGULATORY
• We need to track our global launch pad
• We need to track our regulatory status
VERY DIFFICULT TO COORDINATE ACROSS FUNCTIONS AND INFORM KEY DECISIONS
25. THE EFFICIENCY HIERARCHY OF NEEDS
Transparency
What do you need to know before you can even ask about efficiency?
“That which is measured improves.
That which is measured and reported improves exponentially.”
– Karl Pearson (or Thomas Monson)
Transparency through dashboards, with a few important rules:
1. Tailor views to specific use cases
2. Add business context
3. When possible, co-locate with existing tools / workflows
26. THE EFFICIENCY HIERARCHY OF NEEDS
Actionable Insights
EC2 Alerts (Picsou)
• Compute reservation shortages across all dimensions (accounts x zones x instance families)
• List in descending order of cost
• Attribute to top growing apps
• Also sent as a digest email linked back to Picsou
Data: Billing + Tribal Knowledge + Metadata
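The shortage computation described on this slide can be sketched roughly as follows. This is a minimal illustration, not Picsou's actual logic: the `Cell` record, its field names, and the sample figures are all hypothetical, and a shortage is assumed to be simply usage minus reserved capacity for each (account, zone, instance family) combination.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    """One (account, zone, instance family) combination. All fields are hypothetical."""
    account: str
    zone: str
    family: str
    used: float          # average instances in use
    reserved: float      # instances covered by reservations
    hourly_price: float  # assumed price per uncovered instance-hour


def reservation_shortages(cells):
    """Return (cell, shortage, hourly_cost) for under-reserved cells, costliest first."""
    alerts = []
    for c in cells:
        shortage = c.used - c.reserved
        if shortage > 0:  # only under-reserved cells raise an alert
            alerts.append((c, shortage, shortage * c.hourly_price))
    # List in descending order of cost, as the slide describes.
    return sorted(alerts, key=lambda a: a[2], reverse=True)


# Made-up sample data for illustration only.
cells = [
    Cell("acct-a", "us-east-1a", "m5", used=120, reserved=100, hourly_price=0.10),
    Cell("acct-a", "us-east-1b", "r5", used=40, reserved=50, hourly_price=0.13),
    Cell("acct-b", "us-west-2a", "c5", used=300, reserved=200, hourly_price=0.09),
]
alerts = reservation_shortages(cells)
for cell, shortage, cost in alerts:
    print(f"{cell.account} {cell.zone} {cell.family}: short {shortage:.0f}, ~${cost:.2f}/h")
```

Attributing each shortage to the fastest-growing applications would need per-application usage history, which this sketch leaves out.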
27. THE EFFICIENCY HIERARCHY OF NEEDS
Automation
S3 Storage Class Optimization
• Very similar to the AWS S3 Analytics product.
• In fact, we use AWS S3 access analysis data, but make our own recommendations.
28. THE EFFICIENCY HIERARCHY OF NEEDS
Automation
S3 Storage Class Optimization
• Every recommendation can be explained from the very same dashboard
Data: AWS S3 Analytics + Tribal Knowledge
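One way to picture an explainable storage-class recommendation is a rule that returns its reason along with its answer. The sketch below is an assumption-laden illustration: the slides do not state the actual rules, so the thresholds, parameter names, and input metrics here are hypothetical, and the inputs are not the real S3 Analytics export format. Only the storage class names `STANDARD` and `STANDARD_IA` are real S3 values.

```python
def recommend_storage_class(age_days, gets_per_gb_month,
                            min_age_days=30, max_gets_per_gb_month=0.1):
    """Return (storage_class, explanation) for a group of objects.

    Thresholds are hypothetical defaults, not values from the talk.
    """
    if age_days >= min_age_days and gets_per_gb_month <= max_gets_per_gb_month:
        return (
            "STANDARD_IA",
            f"data is at least {min_age_days} days old and read at most "
            f"{max_gets_per_gb_month} times per GB per month",
        )
    return (
        "STANDARD",
        "data is either too recent or read too often for infrequent access",
    )


# Made-up example: a 90-day-old prefix that is almost never read.
storage_class, reason = recommend_storage_class(age_days=90, gets_per_gb_month=0.01)
print(f"{storage_class}: {reason}")
```

Returning the explanation next to the recommendation is what would let a dashboard show why each suggestion was made.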
29. Self-Service C2G
Give data producers, consumers and caretakers the ability to
manage their own efficiency:
• Identify all involved parties along a data-topic
• Apportion data-infrastructure cost to all relevant teams
• Quickly notice low-usage data-topics
• Estimate data-replication or large sinks to users ratios
Long term: enable data-platform owners to use this tool or its underlying
data to add some automation.
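As a rough illustration of cost apportionment, the sketch below splits one data-topic's infrastructure cost across teams in proportion to the bytes each team produced or consumed. The slides do not say how cost is actually apportioned, so the proportional rule, the function name, and the sample numbers are all assumptions.

```python
def apportion_cost(topic_cost, bytes_by_team):
    """Split topic_cost across teams in proportion to their byte volume.

    Proportional-by-bytes is an assumed rule, not the one used in the talk.
    """
    total = sum(bytes_by_team.values())
    if total == 0:
        # No recorded traffic: leave the cost unassigned rather than guess.
        return {team: 0.0 for team in bytes_by_team}
    return {team: topic_cost * b / total for team, b in bytes_by_team.items()}


# Made-up example: a topic costing 1000 per month, shared by three teams.
shares = apportion_cost(
    1000.0,
    {"producer-team": 500, "consumer-a": 300, "consumer-b": 200},
)
for team, cost in shares.items():
    print(f"{team}: {cost:.2f}")
```

A topic with no recorded traffic ends up with nothing assigned to any team, which is itself a useful signal for spotting low-usage data-topics.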
30. KEY TAKEAWAYS
1. Netflix culture, scale, architecture and priorities require efficiency to be
championed by a central team, but enforced by all engineers.
2. This is achieved by implementing the successive layers of our efficiency hierarchy of
needs:
2a. Transparency to get context,
2b. Deep Dives to tell compelling stories and assemble puzzles,
2c. Actionable Insights to reduce the cognitive load on your organization,
2d. Automation to scale the impact of efficiency efforts.