Storage & Data Architecture on Amazon Web Services



This presentation gives an overview of the storage and data services from Amazon Web Services. It then explores cloud solutions for various storage and data requirements: databases, caching, backups, archival, scalable data/content delivery, and securing your data in the cloud. The session is a primer on optimal and scalable storage and data architecture on the Amazon Web Services cloud.

I have also made some predictions about the new storage-related offerings that will arrive in the cloud market in the months and years to come.

Published in: Technology, Business

Storage & Data Architecture on Amazon Web Services

  1. Storage & Data Architecture on Amazon Web Services
     Kalpak Shah, CEO, Clogeny Technologies
     Tags: Cloud Computing, DevOps, Storage, Enterprise Apps, Scalable Data, Analytics
  2. Trends
     • Data explosion
         • Exabytes, zettabytes…
     • Enterprise cloud adoption
     • Hybrid cloud environments
     • Cloud bursting
     • Connectivity
         • iCloud, Samsung Cloud
     • BYOD
     • Security for cloud storage
     Clogeny Technologies: (US) 408-556-9645, (India) +91 20 661 43 482
  3. Storage Choices in Traditional IT
     • Memory
     • SAN
     • NAS
     • Databases
     • DAS
     • Offline archival
     • Cost-benefit tradeoffs are well understood
  4. AWS Cloud Storage Options
     • Amazon Elastic Block Store (EBS)
     • Ephemeral storage
     • Amazon S3
     • Amazon SQS
     • Amazon SimpleDB
     • Amazon DynamoDB
     • Amazon Relational Database Service (RDS)
     • Amazon ElastiCache
     • AWS Storage Gateway
  5. AWS Global Footprint
     • Use storage across continents as your business requirements dictate
  6. Disk Stores - 1
     • Ephemeral disks
         • Instance storage
         • Non-persistent
         • Typical web-scale concept
     • Amazon Elastic Block Store (EBS)
         • Persistent storage for Amazon EC2 instances
         • Like a hard drive in a physical server
         • Independent volumes that can be attached to and detached from instances
         • EBS-backed instances are persistent and can be shut down and restarted
         • 1 GB to 1 TB in size
  7. Disk Stores - 2
     • Amazon Elastic Block Store (EBS), continued
         • Multiple volumes can be attached to a single instance
         • A volume can be attached to only one instance at a time
         • Fast, incremental snapshots
         • Ideal for filesystems, databases and raw block devices
         • Variable performance, since volumes are network attached
         • Use multiple EBS volumes in a RAID configuration for better performance
         • Larger EBS volumes seem to give better performance
         • Set up snapshot schedules as per your requirements; how will you manage consistency?
         • Test restores before going into production
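To make "incremental snapshot" concrete, here is a minimal Python sketch that compares two images of a volume block by block and reports which blocks a second snapshot would need to store. It is a toy model only: the 4-byte block size and the function names are assumptions for illustration, and it does not describe how EBS tracks changes internally.

```python
import hashlib


def block_digests(volume: bytes, block_size: int = 4) -> list[str]:
    """Split a volume image into fixed-size blocks and hash each block."""
    return [
        hashlib.sha256(volume[i:i + block_size]).hexdigest()
        for i in range(0, len(volume), block_size)
    ]


def changed_blocks(previous: bytes, current: bytes, block_size: int = 4) -> list[int]:
    """Return the indices of blocks that differ from the previous snapshot."""
    old = block_digests(previous, block_size)
    new = block_digests(current, block_size)
    # Blocks past the end of the old image count as changed.
    return [i for i, digest in enumerate(new) if i >= len(old) or old[i] != digest]


base = b"AAAABBBBCCCCDDDD"    # hypothetical volume contents at snapshot 1
later = b"AAAAXXXXCCCCDDDD"   # the same volume after one block was rewritten
print(changed_blocks(base, later))   # only block 1 needs to be stored again
```

The same comparison is also a cheap way to reason about snapshot schedules: the more blocks change between snapshots, the more each snapshot stores.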
  8. Amazon Simple Storage Service - 1
     • Storage for the internet: scalable storage accessed over HTTP
     • Safe storage: 3 copies, plus offsite backup
     • 99.999999999% durability and 99.99% availability
     • Not like your filesystem
         • Folder => bucket
         • File => object
         • No block-based access
     • A bucket can be created in any region
         • Choose the region for latency, cost or regulatory reasons
     • Reduced Redundancy Storage (RRS): cheaper and less durable, for non-critical and reproducible data
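The durability and availability figures are easier to judge as arithmetic. The sketch below assumes both percentages are annual figures (durability per object per year, availability as a fraction of the year); the function names are illustrative and not part of any AWS API.

```python
from fractions import Fraction

# Figures from the slide, read as annual values (an assumption).
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)   # 99.999999999 %
AVAILABILITY = Fraction(9_999, 10_000)                   # 99.99 %


def expected_annual_object_loss(object_count: int) -> Fraction:
    """Average number of objects lost per year at the stated durability."""
    return object_count * (1 - DURABILITY)


def years_per_lost_object(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_annual_object_loss(object_count)


def max_unavailable_minutes_per_year() -> Fraction:
    """Minutes of unavailability per year allowed by the stated availability."""
    return (1 - AVAILABILITY) * 365 * 24 * 60


print(years_per_lost_object(10_000_000))            # 10000
print(float(max_unavailable_minutes_per_year()))    # 52.56
```

Under these assumptions, storing ten million objects means losing one object about every 10,000 years on average, while 99.99% availability still allows roughly 53 minutes of unavailability per year.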
  9. Amazon Simple Storage Service - 2
     • Use it from multiple threads, multiple apps and multiple clients; it will scale to any practical load
     • Data is stored as-is; the application needs to encrypt or compress data before uploading
     • Server-side encryption is supported
     • Delimiters can be used to implement a filesystem-like hierarchy
     • Integrated with the CDN
     • AWS Import/Export can be used to move large datasets physically
     • Pricing tiers for storage and bandwidth usage
     • You can schedule a TTL for objects with predefined time periods
     • It stores close to 1 trillion objects today!
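The delimiter idea can be sketched in a few lines of Python. Keys in a bucket are flat strings; a listing with a prefix and a delimiter rolls up everything beyond the next delimiter into a single "folder" entry. The function below is a local simulation over a list of keys, not a call to S3, and its name and return shape are assumptions.

```python
def list_objects(keys, prefix="", delimiter="/"):
    """Return (objects, common_prefixes) found directly under `prefix`."""
    objects, common_prefixes = [], []
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Roll everything up to the first delimiter into one "folder".
            folder = prefix + rest.split(delimiter, 1)[0] + delimiter
            if folder not in common_prefixes:
                common_prefixes.append(folder)
        else:
            objects.append(key)
    return objects, common_prefixes


# Hypothetical keys in a single flat bucket.
keys = ["logs/2012/01.gz", "logs/2012/02.gz", "logs/readme.txt", "index.html"]
print(list_objects(keys))                   # (['index.html'], ['logs/'])
print(list_objects(keys, prefix="logs/"))   # (['logs/readme.txt'], ['logs/2012/'])
```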
  10. Amazon Simple Storage Service - 3
      • Data protection: multiple access control mechanisms
          • Identity & Access Management (IAM)
          • Access Control Lists (ACLs): grant permissions on individual objects
          • Bucket policies: allow or deny permissions on all objects within the bucket
          • Query string authentication: share specific objects with a specific TTL
      • Access logging
          • Log all requests made to S3 for analysis and audit purposes
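Query string authentication boils down to a link that carries its own expiry time and a signature over it. The Python sketch below shows that general idea with an HMAC; the message layout, parameter names and helper functions are assumptions made for illustration and are not the actual S3 signing format.

```python
import hashlib
import hmac


def sign_link(secret: bytes, path: str, expires_at: int) -> str:
    """Sign the request method, object path and expiry time (illustrative layout)."""
    message = f"GET\n{path}\n{expires_at}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def make_link(secret: bytes, path: str, expires_at: int) -> str:
    """Build a shareable link that embeds the expiry and its signature."""
    signature = sign_link(secret, path, expires_at)
    return f"{path}?expires={expires_at}&signature={signature}"


def is_link_valid(secret: bytes, path: str, expires_at: int,
                  signature: str, now: int) -> bool:
    """Accept the link only before expiry and only if the signature matches."""
    if now > expires_at:
        return False
    expected = sign_link(secret, path, expires_at)
    return hmac.compare_digest(expected, signature)


secret = b"demo-secret"                 # hypothetical shared secret
path = "/my-bucket/report.pdf"          # hypothetical object path
expires_at = 1_700_000_000              # Unix timestamp chosen for the example
signature = sign_link(secret, path, expires_at)
print(make_link(secret, path, expires_at))
```

Because the expiry is part of the signed message, a recipient cannot extend the link's lifetime or reuse the signature for another object without invalidating it.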
  11. SimpleDB
      • Non-relational datastore as a web service
          • Scalable non-relational datastore with a schema-less model
          • Creates and manages multiple geographically distributed replicas for availability and durability
          • Simple data model: domains, items, attributes, values
      • Consistency options
          • Eventually consistent reads: optimize read performance, but because of replication lag you may not get the latest written data
          • Consistent reads
      • Provides an indexing and query service
      • Excellent for flexible schemas
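The difference between the two read modes can be shown with a toy store that has one primary copy and one lagging replica. This is a deliberately simplified model in Python; the class and the explicit `replicate()` step are assumptions for illustration and do not reflect SimpleDB's internals or API.

```python
class ReplicatedStore:
    """Toy datastore with one primary copy and one lagging replica."""

    def __init__(self):
        self.primary = {}
        self.replica = {}

    def put(self, key, value):
        # Writes land on the primary first; the replica catches up later.
        self.primary[key] = value

    def replicate(self):
        # Stand-in for background replication.
        self.replica = dict(self.primary)

    def get(self, key, consistent_read=False):
        # Consistent reads go to the primary; eventually consistent
        # reads are served from the (possibly stale) replica.
        source = self.primary if consistent_read else self.replica
        return source.get(key)


store = ReplicatedStore()
store.put("item1", "v1")
store.replicate()
store.put("item1", "v2")                           # not yet replicated
print(store.get("item1"))                          # v1 (stale)
print(store.get("item1", consistent_read=True))    # v2
```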
  12. DynamoDB
      • NoSQL database service backed by SSDs
      • Replicated and scaled as per load by the service itself
      • Data model: attributes, items, tables
          • Attribute: a name-value pair
          • Item: a collection of attributes
          • Table: contains items, with information organized in discrete areas; each table has a primary key
      • Data is indexed for querying
      • EMR integration
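The attribute/item/table model maps naturally onto plain dictionaries. The Python sketch below is an in-memory illustration only: the `Table` class, its methods and the sample data are hypothetical and are not the DynamoDB API.

```python
class Table:
    """In-memory illustration of a table of items keyed by a primary key."""

    def __init__(self, name, primary_key):
        self.name = name
        self.primary_key = primary_key
        self._items = {}

    def put_item(self, item):
        # An item is a collection of attributes (name-value pairs).
        if self.primary_key not in item:
            raise ValueError(f"item is missing primary key {self.primary_key!r}")
        self._items[item[self.primary_key]] = dict(item)

    def get_item(self, key_value):
        # Lookups go through the primary key.
        return self._items.get(key_value)


users = Table("Users", primary_key="user_id")
users.put_item({"user_id": "u1", "name": "Asha", "plan": "pro"})
users.put_item({"user_id": "u2", "name": "Ravi"})   # items need not share attributes
print(users.get_item("u1"))
```

Only the primary key is mandatory in this sketch; every other attribute can vary from item to item, which is the practical meaning of a schema-less table.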
  13. Amazon Direct Connect
      • Establish a dedicated network connection from your premises to AWS
      • 1 Gbps or 10 Gbps ports available
      • Available in the US, Singapore, Tokyo and London
      • Get excellent network performance
      • Private connectivity to your VPC
      • Enables hybrid environments for enterprises to satisfy regulatory concerns
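A quick back-of-the-envelope calculation helps when choosing between the two port speeds, or between a network transfer and shipping disks. The Python sketch below assumes an ideal link running at full line rate unless an `efficiency` factor is supplied; the function name and the 10 TB example are illustrative.

```python
def transfer_hours(data_bytes: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to move `data_bytes` over a link of `link_gbps` gigabits per second."""
    bits = data_bytes * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


TB = 10 ** 12   # decimal terabyte

for gbps in (1, 10):
    print(f"10 TB over {gbps} Gbps: {transfer_hours(10 * TB, gbps):.1f} hours")
```

Real transfers are slower than this ideal figure because of protocol overhead and shared links, which the `efficiency` parameter can approximate (for example `efficiency=0.8`).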
  14. AWS Storage Gateway - 1
      • On-premises virtual appliance with cloud-based storage
      • Integrates your on-premises environment with AWS storage seamlessly
      • AWS Storage Gateway volumes can be attached as standard iSCSI devices to your on-premises servers
      • Low-latency access on-premises, with asynchronous upload to S3
      • Data on S3 is encrypted and all communication is over SSL
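The "low-latency local access, asynchronous upload" pattern can be sketched as a write-back queue. The Python model below is an assumption-laden illustration: the class, its dictionaries standing in for local disk and cloud storage, and the explicit `upload_pending()` call are all hypothetical and do not describe the gateway's real implementation.

```python
from collections import deque


class GatewayVolume:
    """Toy write-back volume: local writes now, cloud upload later."""

    def __init__(self):
        self.local_blocks = {}     # on-premises copy, serves all reads
        self.pending = deque()     # block ids waiting to be uploaded
        self.remote_blocks = {}    # stand-in for the copy kept in the cloud

    def write(self, block_id, data):
        # The write completes as soon as the local copy is updated.
        self.local_blocks[block_id] = data
        self.pending.append(block_id)

    def read(self, block_id):
        return self.local_blocks[block_id]

    def upload_pending(self, max_blocks=None):
        # Stand-in for the background uploader; returns blocks uploaded.
        uploaded = 0
        while self.pending and (max_blocks is None or uploaded < max_blocks):
            block_id = self.pending.popleft()
            self.remote_blocks[block_id] = self.local_blocks[block_id]
            uploaded += 1
        return uploaded


volume = GatewayVolume()
volume.write(0, b"boot")
volume.write(1, b"data")
print(volume.read(1), len(volume.remote_blocks))   # local read works before any upload
```

The gap between `pending` and `remote_blocks` is the data at risk if the site is lost before the upload finishes, which is worth keeping in mind when sizing the uplink.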
  15. AWS Storage Gateway - 2
      • Volume data is stored on S3 as EBS snapshots
      • A snapshot can be restored as an EBS volume and attached to EC2 instances
      • Data can be restored on-premises or in the cloud
      • Truly enables hybrid and cloud bursting scenarios
      • Couple it with Direct Connect for excellent throughput and reduced network costs
      • Gateway-cached volumes, which keep only frequently accessed data on-premises, will be supported soon
      • Use cases: disaster recovery, offsite storage, data mirroring
  16. AWS Storage Gateway - 3
  17. Quick Overview
      • Amazon S3: static objects for serving, snapshots
      • S3 RRS: static objects that are reproducible
      • EC2 ephemeral disk: transient data
      • EBS: persistent storage, volumes
      • CloudFront: content distribution
      • SimpleDB: simple, scalable data indexing/querying
      • DynamoDB: NoSQL database service with SSDs
      • ElastiCache: in-memory caching service
      • RDS: RDBMS service
      • Storage Gateway: integrate with on-premises storage
  18. Trends & Predictions
      • Here are some predictions on how storage in the cloud will evolve, based on current trends…
  19. Distributed Storage
      • Use of distributed network filesystems to manage data across various clouds and on-premises deployments
      • For example, Gluster (now Red Hat Virtual Storage Appliance) allows aggregating EBS disks and EC2 servers to create large distributed storage systems with a single namespace
      • Lustre, another distributed filesystem, can be used over the WAN with one storage pool on-premises and another in the cloud
      • Replication and backups would be managed by the storage system itself
      • These filesystems can scale to petabytes and billions of objects today; expect them to scale to exabytes in a few years
      • Policies and auditing can be used to manage what data can stay where
  20. Integrated Storage
      • Amazon Elastic MapReduce (EMR) is a hosted Hadoop framework
          • Integrated with S3 and DynamoDB
          • Pulls data from, and pushes results back to, S3 and DynamoDB
      • S3 objects can be stored with Reduced Redundancy Storage (RRS)
      • EBS snapshots are stored on S3
      • There is a trend here towards integrating storage/data tiers: when there are petabytes to manage, moving data around is not an option
      • Expect more integrations from AWS and ISVs in the coming days
  21. Performance
      • Storage performance is one of the concerns about the cloud
      • AWS has already added DynamoDB, which is backed by SSDs and hence offers very high performance
      • The Cluster Compute instances offer much higher network and storage performance
      • Soon, I expect to see SSDs as a service, or more high-performance storage options, in the AWS cloud
          • For example, Zadara Storage provides Virtual Private Storage Arrays, which act like your private SAN in the cloud
          • You can select the type of drives you want (SSD, SAS, SATA) and the RAID level desired
  22. Pure Predictions
      It is safe to assume that anything that is expensive, hard to maintain and slow to provision in the datacenter will be moved into the cloud:
      • Long-term archival
      • e-Discovery of your cloud-based data (AWS already offers Elastic Search service)
      • Pricing based on SLA, reliability and performance
      • Services for storing and managing unstructured data
      • Volume replication across on-premises and cloud for instant DR capability
  23. Thanks!
      Tags: Cloud Computing, Cutting Edge, Fun @ Work, DevOps, Storage, Enterprise Apps, Scalable Data, Analytics