AWS Government, Education, and Nonprofits Symposium
Washington, DC | June 24, 2014 - June 26, 2014

AWS as a Data Platform - AWS Symposium 2014 - Washington D.C.

Come hear about the services that AWS provides to manage data and when to use which tools to manage data appropriately. You will learn about both data movement and coordination, as well as data storage and analysis, including when to use relational and NoSQL approaches, Hadoop, and data warehousing. This session will highlight how AWS data services have helped real-world customers.

Published in: Business, Technology
  • Obama for America: In the system, Ruby on Rails (RoR), Python/Django, PHP, and a host of other front- and mid-tier technologies intermingled, creating a robust heterogeneous design. Below that, the use of 10 different structured storage systems reflected a focus on bringing tools suited to the data itself. Intermingling technologies included relational database management services like Amazon Relational Database Service (Amazon RDS) for MySQL, PostgreSQL, and Microsoft SQL Server; NoSQL software like MongoDB, Apache Hadoop, Vertica, and LevelDB; and Amazon S3, Amazon DynamoDB, and Amazon SimpleDB.
  • DynamoDB latency is under 10 ms and highly consistent.
    Most importantly, data in DynamoDB is durable: it is constantly replicated across multiple data centers and persisted to SSD storage.
  • More context: MongoDB, Cassandra

    Variety: can process many different data types, custom SerDes, etc.
    Velocity: certain packages that run on Hadoop help with real-time data ingestion, such as Flume, Storm, Kafka, and Spark Streaming.
    Volume: designed to work on massive data sets.
  • Start an EMR cluster using the console or CLI tools.
    A master instance group is created that controls the cluster.
    A core instance group is created for the life of the cluster.
    Core instances run the DataNode and TaskTracker daemons.
    Optional task instances can be added or removed to perform work (Spot).
    S3 can be used as the underlying "file system" for input/output data.
    The master node coordinates distribution of work and manages cluster state.
    Core and task instances read from and write to S3.
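
The instance-group layout described in the note above can be sketched as plain data. This is a minimal illustration, not the presenter's configuration: the function name `build_instance_groups`, the instance type `m3.xlarge`, and the bid price are all assumptions made for the example. The dictionary keys follow the `InstanceGroups` structure of the EMR `RunJobFlow` request, so a list like this could form part of such a request; nothing here calls AWS.

```python
def build_instance_groups(core_count, task_count,
                          instance_type="m3.xlarge", spot_bid="0.10"):
    """Describe master, core, and optional task groups for an EMR cluster.

    instance_type and spot_bid are illustrative placeholders only.
    """
    groups = [
        # One master instance group controls the cluster.
        {"Name": "master", "InstanceRole": "MASTER", "Market": "ON_DEMAND",
         "InstanceType": instance_type, "InstanceCount": 1},
        # Core instances live for the life of the cluster and hold HDFS data.
        {"Name": "core", "InstanceRole": "CORE", "Market": "ON_DEMAND",
         "InstanceType": instance_type, "InstanceCount": core_count},
    ]
    if task_count > 0:
        # Task instances are optional and can run on Spot capacity.
        groups.append(
            {"Name": "task", "InstanceRole": "TASK", "Market": "SPOT",
             "BidPrice": spot_bid, "InstanceType": instance_type,
             "InstanceCount": task_count})
    return groups


if __name__ == "__main__":
    for group in build_instance_groups(core_count=2, task_count=4):
        print(group["InstanceRole"], group["Market"], group["InstanceCount"])
```

Keeping the task group separate from the core group reflects the note's point that task capacity can be added or removed without touching the instances that hold HDFS data.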


  • Volume – pretty high
    Velocity – very high
    Variety – good if it fits into 40k, otherwise need to do some lifting.
  • AWS as a Data Platform - AWS Symposium 2014 - Washington D.C.

    1. AWS Government, Education, and Nonprofits Symposium, Washington, DC | June 24, 2014 - June 26, 2014. AWS as a Data Platform. Chris Keyser, ckeyser@amazon.com
    2. Why AWS? Ease of use; lower costs.
    3. Ease of use / Lower costs: no capital investment; pay as you go; no subscriptions; only pay for what you use.
    4. Ease of use / Lower costs: programmable; zero admin; easy to configure; integrate with existing tools.
    5. One tool to rule them all
    6. II Use the right tools
    7. Movement and Coordination: Data Pipeline; Direct Connect; Storage Gateway; Import / Export.
    8. Storage and Analysis Services: Instance Storage (EC2, EBS); SQL Stores (Redshift, RDS); Hadoop (EMR); NoSQL (DynamoDB); Stream (Kinesis); Search (CloudSearch); Storage Services (S3, CloudFront, Glacier).
    9. Movement and Coordination
    10. Movement and Coordination - Plumbing: Direct Connect (dedicated network pipes); Storage Gateway (storage backup & archiving); Import / Export (ship us your disks).
    11. Movement and Coordination - Orchestration: AWS Data Pipeline. Resource management; scheduling, execution, and retry; dependency tracking; failure notification.
    12. Data Storage and Analysis
    13. Storage Services - Object Store: Amazon S3. > 1.5 million peak requests/sec; designed for 99.999999999% durability; trillions of objects; stores anything; lifecycle and versioning.
    14. Storage Services - Archive Storage: Amazon Glacier. Low cost, durable archiving; "cold storage"; infrequently accessed data; integrated S3 lifecycle policies.
    15. Storage Services - Edge Caching: Amazon CloudFront. Simple to use with global footprint; streaming support; large file distribution; private content; S3, EC2 and ELB integration; geo restrictions.
    16. (no slide text)
    17. Instance Storage - Options: Amazon EC2. Ephemeral Storage ("local"): you manage backup/restore; High Storage instances available: i2.8xlarge, 6.4 TB SSD (350K IOPS); hs1.8xlarge, 48 TB disk storage. Elastic Block Storage: "network attached storage"; snapshot, encryption; provisioned throughput (IOPS).
    18. Instance Storage - Build Your Own: Amazon EC2. NFS, MongoDB, Cassandra, GraphLab, Titan, Kafka, Lustre, Gluster, Flume, Scribe, Presto, …and more.
    19. SQL Stores - Managed Relational DB: Amazon RDS. MySQL, Oracle, SQL Server, Postgres; backup/restore, high availability; push-button scalability; up to 3 TB and 30K IOPS.
    20. SQL Stores - Petabyte Data Warehouse: Amazon Redshift. Relational data warehouse; massively parallel; petabyte scale; fully managed; $1,000/TB/year.
    21. SQL Stores - Amazon Redshift Architecture. Leader node: SQL endpoint; stores metadata; coordinates query execution. Compute nodes: local, columnar storage; execute queries in parallel; backup and restore via S3; parallel load from S3, EMR, or DynamoDB. Hardware optimized for data processing: DW1, 2 TB to 1.6 PB magnetic; DW2, 160 GB to 256 TB SSD. Diagram labels: JDBC/ODBC; 10 GigE (HPC); ingestion; backup; restore.
    22. NoSQL - Dial Up Capacity: Amazon DynamoDB. NoSQL database; seamless scalability; zero admin; single-digit millisecond latency.
    23. NoSQL - Durable Low Latency at Scale. Writes: continuously replicated to 3 AZs; quorum acknowledgment; persisted to disk (custom SSD). Reads: strongly or eventually consistent; no trade-off in latency.
    24. Hadoop - On Demand: Amazon Elastic MapReduce. Hive, Impala, Spark, Pig, MapReduce; easy to use, fully managed; on-demand and spot pricing; persistent and transient clusters; deep integration with S3.
    25. Hadoop - Tuned for AWS. Diagram labels: master instance group; core instance group (HDFS); task instance group; Amazon S3; Amazon Redshift; Amazon DynamoDB.
    26. (no slide text)
    27. Streaming - at Scale: Amazon Kinesis. Real-time data collection; seamlessly scale to gigabytes/s; low cost managed service; EMR integration.
    28. Streaming - Amazon Kinesis Architecture. Millions of sources producing 100s of terabytes per hour; front end handles authentication and authorization; durable, highly consistent storage replicates data across three data centers (availability zones); ordered stream of events supports multiple readers; inexpensive: $0.028 per million puts. Consumers: real-time dashboards and alarms; machine learning algorithms or sliding window analytics; aggregate and archive to S3; aggregate analysis in Hadoop or a data warehouse.
    29. Search - Made Simple: Amazon CloudSearch. Fully managed search engine; simple to operate; highly available; user-configurable scaling; advanced feature support: 34 languages, algorithmic stemming, geospatial search, faceted search, suggestions, highlighting, field weighting, …
    30. The right tool. At the right time. At the right scale.
    31. Thank You. Chris Keyser, ckeyser@amazon.com
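
Several slides quote unit figures (slide 13: eleven nines of durability for S3; slide 20: $1,000/TB/year for Redshift; slide 28: $0.028 per million Kinesis puts). As a rough, back-of-envelope sketch, the arithmetic they imply looks like the following. The function names are invented for this example, the figures are taken from the 2014 slides as-is, and the durability reading (a per-object, per-year loss probability of 1e-11) is an assumption about how to interpret "designed for 99.999999999% durability". Other charges that may apply to these services are ignored.

```python
# Figures quoted on the slides (2014); treat them as illustrative inputs.
KINESIS_USD_PER_MILLION_PUTS = 0.028   # slide 28
REDSHIFT_USD_PER_TB_YEAR = 1000.0      # slide 20
S3_ANNUAL_LOSS_PROBABILITY = 1e-11     # assumed reading of slide 13


def kinesis_put_cost(puts):
    """Cost in USD of a number of put operations at the quoted rate."""
    return puts / 1_000_000 * KINESIS_USD_PER_MILLION_PUTS


def redshift_annual_cost(terabytes):
    """Annual cost in USD of a warehouse of the given size at the quoted rate."""
    return terabytes * REDSHIFT_USD_PER_TB_YEAR


def expected_objects_lost_per_year(object_count):
    """Expected number of objects lost per year under the assumed reading."""
    return object_count * S3_ANNUAL_LOSS_PROBABILITY


if __name__ == "__main__":
    print("1 billion puts (USD):", kinesis_put_cost(1_000_000_000))
    print("10 TB warehouse per year (USD):", redshift_annual_cost(10))
    print("Expected losses, 10M objects:",
          expected_objects_lost_per_year(10_000_000))
```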
