Next-Gen Data Lakes and Analytics Platforms - AWS Summit Sydney

Your business needs the best insights in the hands of decision makers: executives, business users, analysts, and data scientists. Traditional approaches to extracting, preparing, staging, securing, and serving information require a painful amount of heavy lifting. In this session, we share what next-gen data lakes and analytics platforms have become. We look at ML-powered data warehousing with Amazon Redshift, ML-enhanced data collection and preparation with AWS Lake Formation, and building ML development environments on a secure, open data lake.


  1. AWS Summit Sydney
  2. Next Generation Data Lake and Analytics Platform. Raghu Prabhu, Lead Global Business Development for Data Lakes, Amazon Web Services. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  3. What has changed in the last five years?
     • Cloud has changed everything
     • Limitless storage
     • Numerous compute options
     • Cost effective, no contracts
     • There is a lot more data
     • New breed of analysts, statisticians, and data scientists
     • Applications and user experiences are guided by data
  4. Data is a strategic asset for every organisation. The world’s most valuable resource is … (image copyright: The Economist, 2017, David Parkins)
  5. Thinking about data as an asset, not a cost
     • Stop throwing data away
     • Make it available to more users and applications
     • Arm users with better data processing technologies
  6. There is more data than people think. Data grows … every 5 years. Data platforms need to live for … years and to scale.
  7. Ask yourself these 3 questions
  8. If yes, then you need a data lake. Amazon S3
  9. What is a data lake? A data lake is a centralised repository that allows you to store all your structured and unstructured data at any scale.
  10. Traditionally, analytics looked like this
      • Expensive: large initial capex + $10k-$50k/TB/year
      • GBs-TBs scale, not designed for PB/EBs
      • Primarily relational data
      • 90% of data was deleted to reduce cost
      Diagram labels: OLTP, ERP, CRM, LOB; Data Warehouse; Business Intelligence
  11. Analytics operated on isolated data silos: Hadoop Cluster, SQL Database, Data Warehouse Appliance
  12. Open and comprehensive (Amazon S3, Amazon Glacier, AWS Glue). Store data in the format you want:
      • Text files like CSV
      • Columnar like Apache Parquet and Apache ORC
      • Logstash like Grok
      • JSON (simple, nested), Avro
      • and more
  13. Data lakes extend the traditional approach
      Data Lake = the analytical power of data warehouse + the limitless scalability of serverless compute + the distributed processing of big data systems
      Diagram labels: OLTP, ERP, CRM, LOB, Devices, Web, Sensors, Social; Data Warehouse; Business Intelligence; Data Lake; Catalog; Machine Learning; DW Queries; Big data processing; Interactive; Real-time
  14. AWS databases and analytics: broad and deep portfolio, built for builders
      • Databases: Amazon RDS (MySQL, PostgreSQL, MariaDB, Oracle, SQL Server), Amazon Aurora (MySQL, PostgreSQL), Amazon DynamoDB (key value, document), Amazon ElastiCache (Redis, Memcached), Amazon Neptune (graph), Amazon Timestream (time series), Amazon QLDB (ledger database), Amazon RDS on VMware
      • Analytics: Amazon Redshift (data warehousing), Amazon EMR (Hadoop + Spark), Amazon Athena (interactive analytics), Amazon Kinesis Analytics (real-time), Amazon Elasticsearch Service (operational analytics)
      • Business Intelligence & Machine Learning: Amazon QuickSight, Amazon SageMaker, Amazon Comprehend, Amazon Rekognition, Amazon Lex, Amazon Transcribe, AWS DeepLens
      • Data Lake: S3/Amazon Glacier, AWS Glue (ETL & Data Catalog), Lake Formation (data lakes)
      • Data Movement: Database Migration Service | AWS Snowball | AWS Snowmobile | Amazon Kinesis Data Firehose | Amazon Kinesis Data Streams | AWS Data Pipeline | AWS Direct Connect
      • Blockchain: Managed Blockchain, Blockchain Templates
      • AWS Marketplace: 250+ solutions, 730+ database solutions, 600+ analytics solutions, 25+ blockchain solutions, 20+ data lake solutions, 30+ solutions
  15. Data lakes, analytics, and ML portfolio from AWS: broadest, deepest set of analytic services
      • Machine Learning: Amazon SageMaker, AWS Deep Learning AMIs, Amazon Rekognition, Amazon Lex, AWS DeepLens, Amazon Comprehend, Amazon Translate, Amazon Transcribe, Amazon Polly
      • Analytics: Amazon Athena, Amazon EMR, Amazon Redshift, Amazon Elasticsearch Service, Amazon Kinesis, Amazon QuickSight
      • On-premises Data Movement: AWS Direct Connect, AWS Snowball, AWS Snowmobile, AWS Database Migration Service
      • Real-time Data Movement: AWS IoT Core, Amazon Kinesis Data Firehose, Amazon Kinesis Data Streams, Amazon Kinesis Video Streams
      • Data Lake on AWS: Storage | Archival Storage | Data Catalog
  16. (No text on this slide.)
  17. Streaming ingest with Amazon Kinesis Data Services: easily collect, process, and analyse data streams in real time
      • Amazon Kinesis Data Streams: capture, process, and store data streams
      • Amazon Kinesis Data Firehose: load data streams into AWS data stores
      • Amazon Kinesis Data Analytics: analyse data streams in real time
  18. Extracting relational data using AWS DMS. Key words: extract, replicate, consolidate, data lake
  19. Ingest, catalog and secure your data. Key words: build, secure, collection, classification, cleansing, transformation, govern
  20. Amazon Redshift: fast, scalable data warehouse. Key words: analyse, report, data lake
  21. Query all your data: Redshift’s Spectrum feature extends Redshift queries to an Amazon S3 data lake
      • Directly query exabytes in Amazon S3, no loading required
      • Query across Amazon Redshift and Amazon S3
      • Scale compute and storage separately
      • High concurrency
      • Support for Parquet, ORC, Avro, CSV, JSON, Grok, and other open file formats
      • Pay only for the amount of data scanned
      Diagram labels: Amazon Redshift Spectrum, Amazon Redshift Query Engine, Amazon Redshift Data, Amazon S3, Data Lake
  22. Amazon EMR: big data and ML at scale. Key words: run, scale, managed, easy, fast, cost-effective, scalable. Frameworks: Spark, Flink, Presto, MXNet, TensorFlow
  23. Amazon EMR Notebooks: off-cluster notebook based on Jupyter
      Diagram labels: Amazon EMR clusters, User S3 bucket, AWS Management Console for EMR, EMR managed notebook based on Jupyter, notebook users, Customer VPC, EMR VPC
  24. Amazon Athena: query data directly in your lake. Key words: query, analyse, point, schema, SQL
      Diagram labels: Permissions, Data Lake, AWS Lake Formation, Catalog, Amazon Athena, Amazon S3, AWS CloudTrail, Reporting & Analytics, Machine Learning, Custom Apps, AWS Cloud
  25. Amazon QuickSight: create visual and embeddable dashboards. Key words: visualise, interact, fast, business intelligence, insights, everyone
  26. Amazon QuickSight: machine learning to discover new insights. Key words: ML, trends, outliers, business drivers, what-if analysis, forecasting
  27. Build your data lake on AWS today
      Diagram labels: Permissions, Data Lake, AWS Lake Formation, Catalog, Amazon Athena, Amazon S3, AWS CloudTrail, Reporting & Analytics, Machine Learning, Custom Apps, AWS Cloud, Amazon Redshift, Amazon EMR, AWS Tools and SDKs
  28. More places to learn about Amazon analytics and data lakes
      • AWS Data Lakes and Analytics
      • What is a data lake
      • Learn more about AWS Lake Formation
      • AWS Big Data Blog: Data Lakes
      • AWS analytics customer case studies
  29. Thank you! Raghu Prabhu
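
Slide 12 says a data lake lets you store data in the format you want, naming CSV and JSON among others. As a rough illustration of what that means in practice, the sketch below re-encodes the same records from CSV to newline-delimited JSON using only the Python standard library. The sample records and field names are hypothetical and do not come from the talk.

```python
import csv
import io
import json

# Hypothetical sample records; the field names are illustrative only.
CSV_TEXT = "device_id,temperature\nsensor-1,21.5\nsensor-2,19.0\n"


def csv_to_json_lines(csv_text: str) -> str:
    """Re-encode CSV rows as newline-delimited JSON, one object per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return "\n".join(json.dumps(row, sort_keys=True) for row in reader)


json_lines = csv_to_json_lines(CSV_TEXT)
# Parse the JSON lines back to confirm the records survived the round trip.
records = [json.loads(line) for line in json_lines.splitlines()]
```

CSV carries no type information, so every value arrives as a string. Typed columnar formats such as Parquet or ORC would need a third-party library and are left out of this sketch.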
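
Slide 17 describes Amazon Kinesis Data Streams as the service that captures data streams. A minimal sketch of producing one record might look like the following. It assumes the caller supplies a boto3 Kinesis client (for example `boto3.client("kinesis")`) with credentials already configured. The stream name, the event fields, and the helper names `build_put_record_args` and `send_event` are hypothetical.

```python
import json


def build_put_record_args(stream_name: str, event: dict, partition_field: str) -> dict:
    """Build keyword arguments for the Kinesis Data Streams PutRecord call."""
    return {
        "StreamName": stream_name,
        # Kinesis stores the payload as opaque bytes; JSON is one possible encoding.
        "Data": json.dumps(event).encode("utf-8"),
        # Records with the same partition key are routed to the same shard.
        "PartitionKey": str(event[partition_field]),
    }


def send_event(client, stream_name: str, event: dict, partition_field: str) -> dict:
    """Send one event; `client` is assumed to be a boto3 Kinesis client."""
    return client.put_record(**build_put_record_args(stream_name, event, partition_field))


# Hypothetical stream name and event, used only to show the request shape.
args = build_put_record_args(
    "example-stream", {"device_id": "sensor-1", "temperature": 21.5}, "device_id"
)
```

Keeping the request builder separate from the network call lets the payload shape be checked without an AWS account.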
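
Slide 21 notes that with Redshift Spectrum you pay only for the amount of data scanned, and that columnar formats such as Parquet and ORC are supported. The arithmetic below sketches why reading fewer columns can lower the bill. The price per terabyte is a hypothetical placeholder rather than a quoted AWS price, and the model assumes equally sized columns with no compression or partition pruning.

```python
TB = 1024 ** 4  # bytes per terabyte (binary definition; an assumption of this sketch)


def scan_cost(bytes_scanned: int, price_per_tb: float) -> float:
    """Cost of a query that is billed purely on bytes scanned."""
    return bytes_scanned / TB * price_per_tb


def columnar_bytes_scanned(table_bytes: int, total_columns: int, columns_read: int) -> int:
    """Bytes read from a columnar table, assuming equally sized columns."""
    return table_bytes * columns_read // total_columns


PRICE_PER_TB = 5.0  # hypothetical price, for illustration only
table_bytes = 10 * TB

# A row-oriented file such as CSV must be read in full.
full_scan = scan_cost(table_bytes, PRICE_PER_TB)
# A columnar file lets the engine read only the 2 of 20 columns the query needs.
pruned_scan = scan_cost(columnar_bytes_scanned(table_bytes, 20, 2), PRICE_PER_TB)
```

Under these assumptions the pruned query scans one tenth of the data and costs one tenth as much. Real savings depend on column sizes, compression, and partitioning.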
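
Slide 24 presents Amazon Athena as a way to query data directly in the lake with SQL. One way to submit such a query from code is sketched here. It assumes the caller passes in a boto3 Athena client (for example `boto3.client("athena")`). The database, table, and S3 output location are hypothetical, as are the helper names `build_athena_request` and `run_query`.

```python
def build_athena_request(sql: str, database: str, output_location: str) -> dict:
    """Build keyword arguments for Athena's StartQueryExecution call."""
    return {
        "QueryString": sql,
        # The database is looked up in the data catalog.
        "QueryExecutionContext": {"Database": database},
        # Athena writes query results to this S3 location.
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(client, sql: str, database: str, output_location: str) -> str:
    """Start a query and return its execution id; `client` is assumed to be a boto3 Athena client."""
    response = client.start_query_execution(
        **build_athena_request(sql, database, output_location)
    )
    return response["QueryExecutionId"]


# Hypothetical names, used only to show the request shape.
request = build_athena_request(
    "SELECT device_id, AVG(temperature) FROM readings GROUP BY device_id",
    "example_lake_db",
    "s3://example-bucket/athena-results/",
)
```

The call is asynchronous: it returns an execution id immediately, and the caller would poll for completion before reading results, which this sketch leaves out.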