AWS as a Data Platform

Notes on the "AWS as a Data Platform" presentation.

Published in: Technology

AWS as a Data Platform

  1. AWS Government, Education, and Nonprofit Symposium | Washington, DC | June 25-26, 2015. AWS as a Data Platform. Joe Healy, Sr. Consultant, AWS WWPS Professional Services. ©2015, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  2. Unconstrained growth: Big data is moving fast (scale: GB, TB, PB, EB, ZB). 2.8 trillion GB in 2012; 40 trillion GB in 2020; 95% unstructured; 70% user-generated. Source: IDC
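As a rough check on the IDC figures on this slide, the implied yearly growth rate can be worked out directly. This is a minimal sketch: the function name `compound_annual_growth` is illustrative, and it assumes smooth exponential growth between the two data points, which the slide itself does not claim.

```python
def compound_annual_growth(start: float, end: float, years: int) -> float:
    """Return the constant yearly growth rate that turns `start` into `end`."""
    if start <= 0 or end <= 0 or years <= 0:
        raise ValueError("start, end, and years must be positive")
    return (end / start) ** (1 / years) - 1


# Figures from the slide, in trillions of GB (source cited there: IDC).
rate = compound_annual_growth(start=2.8, end=40.0, years=2020 - 2012)
print(f"Implied growth: {rate:.1%} per year")
```

Under that assumption the data volume grows by a little under 40% each year.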
  3. It's not just about size. Data volume: gigabytes, terabytes, petabytes. Velocity: millisecond, second, minute, hour, day. Variety: structured, unstructured, text, binary.
  4. Why AWS? Lower costs; ease of use.
  5. Lower costs: no capital investment; pay as you go; no subscriptions; pay only for what you use.
  6. Ease of use: programmable; zero admin; easy to configure; integrates with existing tools.
  7. One tool to rule them all
  8. Use the right tools
  9. Movement and coordination: Data Pipeline, Direct Connect, Storage Gateway, Import/Export.
  10. Storage and analysis services. EC2: EBS, instance storage. SQL stores: Redshift, RDS. Hadoop: EMR. NoSQL: DynamoDB. Stream: Amazon Kinesis. Storage services: S3, CloudFront, Amazon Glacier, EFS. Machine learning: Amazon Machine Learning.
  11. Movement and coordination
  12. Movement and coordination: Plumbing. Direct Connect: dedicated network pipes. Storage Gateway: storage backup and archiving. Import/Export: ship us your disks.
  13. Movement and coordination: Orchestration. AWS Data Pipeline: resource management; scheduling, execution, and retry; dependency tracking; failure notification; AWS and on-premises.
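The orchestration duties listed on this slide (execution order, retry, dependency tracking, failure notification) can be illustrated with a small standard-library sketch. This is not the AWS Data Pipeline API; `run_pipeline`, its parameters, and the task names are hypothetical and only show the general idea.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(tasks, dependencies, max_attempts=3, notify=print):
    """Run tasks in dependency order, retrying each failing task.

    tasks: dict mapping task name -> zero-argument callable
    dependencies: dict mapping task name -> set of task names it depends on
    Returns the names of the tasks that completed, in execution order.
    """
    graph = {name: set(dependencies.get(name, ())) for name in tasks}
    unknown = {dep for deps in graph.values() for dep in deps} - set(tasks)
    if unknown:
        raise ValueError(f"dependencies on undefined tasks: {sorted(unknown)}")

    completed = []
    # static_order() yields every task after all of its dependencies.
    for name in TopologicalSorter(graph).static_order():
        for attempt in range(1, max_attempts + 1):
            try:
                tasks[name]()
                completed.append(name)
                break
            except Exception as exc:
                if attempt == max_attempts:
                    notify(f"FAILED {name} after {attempt} attempts: {exc}")
                    # Simplification: abort the whole run on the first
                    # task that exhausts its retries.
                    return completed
    return completed


# Hypothetical three-step job whose middle step fails once, then succeeds.
attempts = {"load": 0}


def flaky_load():
    attempts["load"] += 1
    if attempts["load"] < 2:
        raise RuntimeError("transient error")


result = run_pipeline(
    tasks={"extract": lambda: None, "load": flaky_load, "report": lambda: None},
    dependencies={"load": {"extract"}, "report": {"load"}},
)
print(result)
```

A managed service adds what this sketch leaves out, such as time-based scheduling and provisioning the resources each task runs on.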
  14. Data storage and analysis
  15. Storage services: Object store. Amazon S3: store objects (like “files”); objects are stored in buckets; buckets keep data in a single AWS region, replicated across multiple facilities (cross-region replication available); highly durable, highly available, highly scalable; 99.999999999% durability.
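To put the eleven-nines figure on this slide in perspective, here is a minimal arithmetic sketch. It assumes the durability number can be read as an independent per-object, per-year survival probability; the object count of ten million is a hypothetical example.

```python
from decimal import Decimal

DURABILITY = Decimal("0.99999999999")  # 99.999999999%, as quoted on the slide


def expected_annual_losses(object_count: int,
                           durability: Decimal = DURABILITY) -> Decimal:
    """Expected number of objects lost per year under the stated assumption."""
    return (Decimal(1) - durability) * Decimal(object_count)


# Hypothetical bucket holding ten million objects.
losses = expected_annual_losses(10_000_000)
years_per_loss = Decimal(1) / losses
print(f"Expected losses per year: {losses}")
print(f"Roughly one object every {years_per_loss:,.0f} years")
```

`Decimal` is used instead of `float` because subtracting a value this close to 1 in binary floating point introduces visible rounding error.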
  16. Storage services: Archive storage. Amazon Glacier: low-cost, durable archive; “cold storage”; tape replacement; infrequently accessed data; integrated S3 lifecycle policies; 99.999999999% durability; immutable.
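The "integrated S3 lifecycle policies" point means a bucket rule can move objects to Glacier after a set age. The sketch below only builds such a rule as a plain dictionary, in the shape accepted by boto3's `put_bucket_lifecycle_configuration`; it makes no AWS call. The helper name, the `logs/` prefix, and the 90-day threshold are hypothetical.

```python
import json


def glacier_lifecycle_rule(prefix: str, days: int) -> dict:
    """Build a lifecycle configuration that archives objects to Glacier."""
    if days < 0:
        raise ValueError("days must be non-negative")
    return {
        "Rules": [
            {
                # Rule ID derived from the prefix, purely for readability.
                "ID": f"archive-{prefix.strip('/') or 'all'}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


# Hypothetical rule: archive everything under logs/ after 90 days.
config = glacier_lifecycle_rule(prefix="logs/", days=90)
print(json.dumps(config, indent=2))
```

With boto3 installed and credentials configured, the dictionary would be passed as the `LifecycleConfiguration` argument alongside a `Bucket` name.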
  17. Storage services: Content delivery. Amazon CloudFront: simple to use with global footprint (50+ edge locations); streaming support; large file distribution; private content; S3, EC2, and ELB integration; geo restrictions; static and dynamic content.
  18. Instance storage: Options. Amazon EC2 ephemeral storage (“local”): you manage backup/restore; free; high storage instances available (i2.8xlarge: 6.4 TB SSD, 350K IOPS; d2.8xl, hs1.8xl: 48 TB disk storage). Elastic Block Store (EBS): “network attached storage”; snapshot, encryption; provisioned throughput (IOPS); magnetic or SSD; up to 16 TB and 20,000 IOPS per volume.
  19. Instance storage: Build your own. On Amazon EC2: NFS, MongoDB, Cassandra, GraphLab, Titan, Kafka, Lustre, Gluster, Flume, Scribe, Presto, and more.
  20. PUBLIC MATERIAL | TOAN DO
  21. Migrating to AWS: A proven method. Refactor and develop applications for C2S cloud securely with the following requirements: portable across infrastructure, stable platform, and agile. Stable platform; enhanced security; elastic; agile. PUBLIC MATERIAL | TOAN DO
  22. Instance storage: Elastic File System. Amazon EFS: fully managed file system for EC2 instances; works with standard operating system APIs; sharable across thousands of instances; elastically grows to petabyte scale; delivers performance for a variety of workloads; highly available and durable; NFS v4-based.
  23. We focused on changing the game: (1) EFS is simple; (2) EFS is elastic; (3) EFS is scalable.
  24. (1) EFS is simple. Fully managed: no hardware, network, or file layer to manage; create a scalable file system in seconds. Seamless integration with existing tools and apps: NFS v4 (widespread, open); standard file system semantics; works with standard OS file system APIs. Simple pricing = simple forecasting.
  25. (2) EFS is elastic. File systems grow and shrink automatically as you add and remove files; no need to provision storage capacity or performance; you pay only for the storage space you use, with no minimum fee.
  26. (3) EFS is scalable. File systems can grow to petabyte scale; throughput and IOPS scale automatically as file systems grow; consistent low latencies regardless of file system size; support for thousands of concurrent NFS connections.
  27. SQL stores: Managed relational DB. Amazon RDS: MySQL, Aurora, Oracle, SQL Server, PostgreSQL; backup/restore, high availability, encryption; push-button scalability up to 3 TB and 30,000 IOPS.
  28. If you host your databases on-premises: power, HVAC, net; rack and stack; server maintenance; OS patches; DB software patches; database backups; scaling; high availability; DB software installs; OS installation; app optimization.
  29. If you choose a managed DB service: power, HVAC, net; rack and stack; server maintenance; OS patches; DB software patches; database backups; app optimization; high availability; DB software installs; OS installation; scaling.
  30. SQL stores: Petabyte data warehouse. Amazon Redshift: relational data warehouse; massively parallel; petabyte scale; fully managed; $1,000/TB/year.
  31. NoSQL: Dial up capacity. Amazon DynamoDB: NoSQL database; seamless scalability; zero admin; single-digit millisecond latency.
  32. Hadoop: Managed. Amazon Elastic MapReduce: flexible tool and framework support (Hive, Impala, Pig, MapReduce, Presto, Spark); easy to use; fully managed; on-demand and Spot pricing; persistent and transient clusters; deep integration with S3 and other AWS services.
  33. Streaming at scale. Amazon Kinesis: real-time data collection; seamlessly scale to gigabytes/s; low-cost managed service; EMR integration.
  34. Streaming: Amazon Kinesis architecture. Sources: millions of sources producing 100s of terabytes per hour. Front end: authentication and authorization. Storage: durable, highly consistent storage that replicates data across three data centers (Availability Zones). Stream: ordered stream of events; supports multiple readers. Readers: real-time dashboards and alarms; machine learning algorithms or sliding window analytics; aggregate and archive to S3; aggregate analysis in Hadoop or data warehouse. Inexpensive: $0.028 per million puts.
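The per-put price quoted on this slide translates into a monthly figure as follows. This is a sketch under stated assumptions: it uses only the 2015 slide price, ignores any other charges the service may carry, and the workload of 1,000 puts per second is hypothetical.

```python
from decimal import Decimal

PRICE_PER_MILLION_PUTS = Decimal("0.028")  # USD, as quoted on the 2015 slide


def put_cost(put_count: int,
             price_per_million: Decimal = PRICE_PER_MILLION_PUTS) -> Decimal:
    """Cost in USD of `put_count` put operations at the quoted rate."""
    if put_count < 0:
        raise ValueError("put_count must be non-negative")
    return Decimal(put_count) / Decimal(1_000_000) * price_per_million


# Hypothetical workload: 1,000 puts per second, sustained for 30 days.
puts = 1_000 * 60 * 60 * 24 * 30
monthly = put_cost(puts)
print(f"{puts:,} puts cost about ${monthly:.2f}")
```

At that rate, roughly 2.6 billion puts in a month come to about $72.58 in put charges.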
  35. Machine learning made simple. Amazon Machine Learning: managed machine learning service built for developers; robust technology based on Amazon’s internal systems; create models using your existing data in AWS; deploy models to production in seconds.
  36. Smart applications by example. Based on what you know about the user: will the user use your product? Based on what you know about an order: is this order fraudulent? Based on what you know about a news article: what other articles are interesting?
  37. Three supported types of predictions. Binary classification: predict the answer to a yes/no question. Multiclass classification: predict the correct category from a list. Regression: predict the value of a numeric variable.
  38. The right tool. At the right time. At the right scale.
  39. One more related topic: Open data on AWS
  40. What is open data? Open data is data that can be used by anyone for any purpose for free. Many of our customers, such as Esri, the Weather Company, and the Climate Corporation, rely on quality open data as much as they rely on our computing, storage, and other web services.
  41. Open data as a platform: data enrichment; sensemaking; data at rest (object storage); basic APIs; complex APIs; consumer applications; algorithmic policy; data-driven journalism; data catalogs; focused data dashboards; predictive modeling; visualizations; lower cost of knowledge (efficiency).
  42. Open data as a platform: data creation; data enrichment; sensemaking; data at rest (object storage); basic APIs; complex APIs; consumer applications; algorithmic policy; data-driven journalism; data catalogs; focused data dashboards; predictive modeling; visualizations; efficiency.
  43. Open data as a platform (data enrichment and sensemaking on AWS): Amazon Kinesis, Amazon EC2, AWS Data Pipeline, Amazon S3, Amazon RDS, Amazon EMR, Amazon Redshift, Amazon DynamoDB, AWS Lambda.
  44. Public datasets on AWS. To enable more innovation, AWS hosts a selection of datasets that anyone can access for free. Data in our public datasets is available for rapid access to our flexible and low-cost computing resources. Earth science: NASA Earth Exchange (NEX). Life sciences: 1000 Genomes project. Internet science: Common Crawl corpus.
  45. Thank You. This presentation will be loaded to SlideShare the week following the Symposium. http://www.slideshare.net/AmazonWebServices
