© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Diego Magalhães
Sr. Solutions Architect, Worldwide Public Sector
Amazon Web Services
September 2018
Finding Meaning in the Noise:
Understanding Big Data with AWS Analytics
One tool to
rule them all
AWS Big Data and Analytic Services
Any analytic workload, any scale, at the lowest possible cost
Insights
Analytics
Data Lake
Data Movement
Amazon
QuickSight
Amazon
SageMaker
AWS Glue
(ETL & Data Catalog)
S3/Amazon
Glacier
(Storage)
Amazon
Redshift
+Spectrum
Amazon
EMR
Amazon
Athena
Amazon Elasticsearch
Service
Amazon Kinesis
Analytics
Database Migration Service | Snowball | Kinesis Data Firehose | Kinesis Data Streams
Real-time
Amazon Comprehend
DW Big data processing Interactive
Big Data on AWS
Immediate Availability. Deploy instantly. No hardware to
procure, no infrastructure to maintain & scale
Trusted & Secure. Designed to meet the strictest
requirements. Continuously audited, including certifications
such as ISO 27001, FedRAMP, DoD CSM, and PCI DSS.
Broad & Deep Capabilities. Over 100 services and 100s
of features to support virtually any big data application &
workload
Hundreds of Partners & Solutions. Get help from a
consulting partner or choose from hundreds of tools and
applications across the entire data management stack.
The Modern Data Architecture
Traditionally, Analytics Used to Look Like This
OLTP ERP CRM LOB
Data Warehouse
Business Intelligence Relational data
TBs-PBs scale
Schema defined prior to data load
Operational reporting and ad hoc
Large initial capex + $10K–$50K / TB / Year
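To make the per-terabyte figure concrete, here is a minimal sketch of the arithmetic. The 100 TB warehouse size is an assumption chosen for illustration, not a number from the deck, and the initial capex is left out because the slide gives no value for it.

```python
# Running-cost range quoted on the slide, in US dollars per TB per year.
LOW_PER_TB = 10_000
HIGH_PER_TB = 50_000


def annual_cost_range(terabytes):
    """Return (low, high) yearly running cost in dollars, excluding initial capex."""
    return terabytes * LOW_PER_TB, terabytes * HIGH_PER_TB


# Hypothetical 100 TB warehouse, chosen only for illustration.
low, high = annual_cost_range(100)
print(f"${low:,} to ${high:,} per year")  # prints "$1,000,000 to $5,000,000 per year"
```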
New requirements break the traditional approach
Customers need to:
• Secure and combine data from new and existing sources
• Do new types of analysis (ML, big data & real-time)
• Capture and store new non-relational data at EB scale
Challenges with traditional approach:
• Data exists in silos, ETL does not scale at EB data volumes
• Operational and ad hoc on relational only
• DW is optimized for relational data at PB scale
Data Lakes Extend the Traditional Approach
Relational and non-relational data
TBs-EBs scale
Schema defined during analysis
Diverse analytical engines to gain insights
Designed for low-cost storage and analytics
OLTP ERP CRM LOB
Data Warehouse
Business
Intelligence
Data Lake
Devices Web Sensors Social
Catalog
Machine
Learning
DW
Queries
Big data
processing
Interactive Real-time
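"Schema defined during analysis" (often called schema-on-read) can be sketched in a few lines of plain Python. This is an illustration of the idea only, not how any AWS service implements it. The sample records, the field names, and the `read_with_schema` helper are all hypothetical.

```python
import json

# Raw records land in the lake unchanged; fields vary by source (hypothetical data).
raw_lines = [
    '{"device": "sensor-1", "temp_c": 21.5, "ts": "2018-09-01T10:00:00Z"}',
    '{"device": "sensor-2", "humidity": 40, "ts": "2018-09-01T10:00:05Z"}',
    '{"user": "alice", "page": "/home", "ts": "2018-09-01T10:00:07Z"}',
]


def read_with_schema(lines, schema):
    """Project each raw record onto the requested columns at read time.

    schema maps a column name to the type used to cast its value;
    fields missing from a record become None.
    """
    rows = []
    for line in lines:
        record = json.loads(line)
        row = {}
        for col, cast in schema.items():
            val = record.get(col)
            row[col] = cast(val) if val is not None else None
        rows.append(row)
    return rows


# Two analyses apply two different schemas to the same stored data.
temperature_view = read_with_schema(raw_lines, {"device": str, "temp_c": float})
clickstream_view = read_with_schema(raw_lines, {"user": str, "page": str})
```

The stored data never changes. Each consumer decides which columns and types it needs when it reads, which is the contrast with the warehouse model where the schema is fixed before load.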
Storing is not enough, data needs to be discoverable
“Dark data are the information assets organizations collect, process, and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing).”
Gartner
CRM, ERP, Data warehouse, Mainframe data, Web, Social, Log files, Machine data, Semi-structured, Unstructured
Key Components of a Successful Data Strategy
Attributes of a Modern Data Architecture
1. Automated and Reliable Data Ingestion
2. Preservation of Original Source Data
3. Lifecycle Management and Cold Storage
4. Metadata Capture
5. Managing Governance, Security and Privacy
6. Self-Service Discovery, Search, and Access
7. Managing Data Quality
8. Preparing for Analytics
9. Orchestration and Job Scheduling
10. Capturing Data Change
Key Pillars of a Data Lake
• Storage & Streams
• Catalogue & Search
• Entitlements
• API & UI
Building a Data Lake on AWS
Athena
Query Service
AWS Batch AWS Glue IoT Lambda Amazon SageMaker
Amazon
QuickSight
Amazon
Redshift
Amazon
EMR
Building a Data Lake on AWS
Automated and reliable data ingestion
Building a Data Lake on AWS
Preservation of Original Source Data
Lifecycle Management and Cold Storage
Capturing Data Change
AWS Glue
Amazon Glacier
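Lifecycle management and cold storage could be expressed as an S3 lifecycle rule that moves aging objects to Glacier. The sketch below only builds the configuration dictionary. The `raw/` prefix, the 90-day threshold, the bucket name, and the `glacier_lifecycle` helper are assumptions for illustration and do not come from the deck.

```python
def glacier_lifecycle(prefix, days_until_archive):
    """Build an S3 lifecycle configuration that moves objects under prefix to Glacier."""
    return {
        "Rules": [
            {
                "ID": f"archive-{prefix.strip('/')}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days_until_archive, "StorageClass": "GLACIER"}
                ],
            }
        ]
    }


# Hypothetical policy: archive raw source data 90 days after creation.
config = glacier_lifecycle("raw/", 90)

# With boto3 installed and credentials configured, the dictionary could be applied as:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-data-lake",  # hypothetical bucket name
#       LifecycleConfiguration=config,
#   )
```

Because the rule transitions objects rather than expiring them, the original source data stays preserved while its storage cost drops.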
Building a Data Lake on AWS
Metadata Capture
AWS Glue
Amazon Elasticsearch Service
DynamoDB
Building a Data Lake on AWS
Managing Governance, Security, Privacy
AWS Glue
Building a Data Lake on AWS
Self-Service Discovery, Search, Access
AWS Glue
Amazon
Cognito
Identity & Access
Management
API Gateway
Building a Data Lake on AWS
Managing Data Quality
Lambda
AWS Glue
Building a Data Lake on AWS
Preparing for Analytics
Lambda
AWS Glue
Building a Data Lake on AWS
Orchestration and Job Scheduling
AWS Glue
Lambda
Step Functions
CloudWatch
Simple Workflow Service
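As one possible reading of the orchestration slide, a Step Functions workflow that chains Lambda functions is described in Amazon States Language, a JSON document. The sketch below builds such a definition in Python. The state names, the ARNs (including the placeholder account ID), and the `chain` helper are hypothetical.

```python
import json


def chain(tasks):
    """Build an Amazon States Language definition that runs Task states in sequence.

    tasks is a list of (state_name, resource_arn) pairs.
    """
    states = {}
    for i, (name, arn) in enumerate(tasks):
        state = {"Type": "Task", "Resource": arn}
        if i + 1 < len(tasks):
            state["Next"] = tasks[i + 1][0]  # hand off to the following state
        else:
            state["End"] = True  # last state terminates the workflow
        states[name] = state
    return {"StartAt": tasks[0][0], "States": states}


# Hypothetical Lambda ARNs; the account ID and function names are placeholders.
definition = chain([
    ("ValidateInput", "arn:aws:lambda:us-east-1:123456789012:function:validate"),
    ("TransformData", "arn:aws:lambda:us-east-1:123456789012:function:transform"),
    ("UpdateCatalog", "arn:aws:lambda:us-east-1:123456789012:function:update-catalog"),
])
print(json.dumps(definition, indent=2))
```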
Processing Data for
Analytics on Your
Data Lake
Data Lake
Reference
Architecture
What about serverless?
Serverless Analytics
Deliver cost-effective analytic solutions faster
S3
Data Lake
AWS Glue
(ETL & Data
Catalog)
Amazon
Athena
Amazon
QuickSight
Serverless;
zero infrastructure;
zero administration
Never pay for
idle resources
Availability and
fault tolerance
built in
Automatically
scales resources
with usage
AWS IoT
Devices Web Sensors Social
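In the serverless pattern above, Athena queries data in S3 using table definitions from the Glue Data Catalog. A minimal sketch of preparing such a query follows. The table, database, and results bucket names are hypothetical, and the `athena_request` helper only assembles the arguments without contacting AWS.

```python
def athena_request(sql, database, output_location):
    """Build keyword arguments in the shape boto3's Athena start_query_execution accepts."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


# Table, database, and bucket names are hypothetical.
sql = (
    "SELECT device, avg(temp_c) AS avg_temp "
    "FROM sensor_readings "
    "WHERE dt = '2018-09-01' "
    "GROUP BY device"
)
request = athena_request(sql, "datalake", "s3://example-athena-results/")

# With boto3 installed and credentials configured:
#   boto3.client("athena").start_query_execution(**request)
```

No cluster is started or stopped around the query, which is the "never pay for idle resources" point on the slide.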
What does the customer say?
https://aws.amazon.com/solutions/case-studies/analytics/
https://aws.amazon.com/solutions/case-studies/big-data/
FINRA Analyzes Billions of Transactions Daily
To respond to rapidly changing market dynamics, FINRA moved 75% of its operations to Amazon Web Services, using AWS to analyze 75B records a day.
FINRA uses Amazon EMR and Amazon S3 to process up to 75 billion
trading events per day and securely store over 5 petabytes of data,
attaining savings of $10-20 million per year.
Fraud Detection
• AWS enables you to build sophisticated data strategies and related
analytics applications
• Retrospective, Real-time, Predictive
• You can build incrementally, adding use cases and increasing scale
as you go
• AWS provides a broad range of security and auditing features to
enable you to meet your security requirements
https://aws.amazon.com/big-data/
Resources
• Prescriptive guidance and rapidly deployable solutions to help you store, analyze, and process big data on the AWS Cloud
• Derive Insights from IoT in Minutes using AWS IoT, Amazon Kinesis Firehose, Amazon Athena, and Amazon QuickSight
• Deploying a Data Lake on AWS - March 2017 AWS Online Tech Talks
• Harmonize, Search, and Analyze Loosely Coupled Datasets on AWS with AWS Glue, Amazon Athena, and Amazon QuickSight
• From Data Lake to Data Warehouse: Enhancing Customer 360 with Amazon Redshift Spectrum
• Implement Continuous Integration and Delivery of Apache Spark Applications using AWS
http://amzn.to/2vHIwBq
http://amzn.to/2i9gqZn
http://bit.ly/2qipA8h
http://amzn.to/2qpiFaK
http://amzn.to/2lpbc8p
http://amzn.to/2gIJcj8
Summary
Cloud and Big Data - perfect match: agility and new opportunities
AWS provides comprehensive Analytics, Security, Compliance capabilities
Data Lake requirements and use cases vary
Use AWS Partner Network (APN) and Open Source tools for specific needs
Join us for our first-ever Amazon Web Services Summit
in Ottawa on October 29, 2018
15 sessions featuring various
management and technical
topics
Connect with AWS & our
Canadian public sector partners
in the Solutions Expo
Meet and mingle with other public
sector customers from government,
education, and nonprofits
Register today!
aws.amazon.com/summits/ottawa-public-sector
We value your feedback!
Please share your feedback on the
AWS Public Sector Summit survey.
Survey will be emailed 24-48 hours following event.
