AWS for the Data Professional

Core AWS services for the data professional - EC2, RDS, S3, Kinesis and more

  • https://console.aws.amazon.com/console/home
  • https://www.windowsazure.com/en-us/home/features/overview/
  • http://aws.amazon.com/rds/sqlserver/ and http://aws.amazon.com/rds/faqs/#4
    RDS can scale to larger instances, take backups, and restore to a point in time up to the last 5 minutes; all SQL Server tools work and all patching is managed
  • C:\Program Files (x86)\AWS Tools\Documentation\AWSToolsForWindows.html

    How to use PowerShell -- http://docs.aws.amazon.com/powershell/latest/userguide/pstools-welcome.html
  • Hadoop on AWS - http://wiki.apache.org/hadoop/AmazonEC2
  • http://aws.amazon.com/aws-free-usage-tier0/
  • S3 = $0.12 / GB / month -> about $144 / yr for 100 GB
    EBS = $0.10 / GB / month -> about $120 / yr for 100 GB
    EC2 = $0.12 / hr (small, on-demand, Windows) -> $1,051 to run all year (up to $3.85 / hr, down to $0.01 / hr for spot instances), and there can be charges for other services on top, e.g. CloudWatch…
    RDS = $0.14 / hr (small, on-demand, SQL 2008 STD) -> $1,226 to run all year (up to $3.85 / hr, down to $0.05 / hr for heavy utilization), PLUS up/down data charges
    DynamoDB = $0.01 / 10 writes & $0.01 / 50 reads, PLUS up/down data charges
    Elastic Beanstalk / Windows = starter package $42 / month -> $504 / yr
  • http://aws.amazon.com/usergroups/ & http://aws.amazon.com/aws-training/
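
The yearly figures in the pricing note above come from simple multiplication, which a few lines of Python can reproduce. This is a minimal sketch, not an official AWS calculator: the rates are the ones quoted in the note (a historical snapshot in USD), and the helper names `annual_storage_cost` and `annual_hourly_cost` are made up for this illustration. It ignores data transfer and add-on services, which the note says are billed separately.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years
MONTHS_PER_YEAR = 12


def annual_storage_cost(rate_per_gb_month: float, gigabytes: float) -> float:
    """Yearly cost of keeping `gigabytes` stored at a per-GB monthly rate."""
    return rate_per_gb_month * gigabytes * MONTHS_PER_YEAR


def annual_hourly_cost(rate_per_hour: float, hours: int = HOURS_PER_YEAR) -> float:
    """Yearly cost of a resource billed per hour (default: running all year)."""
    return rate_per_hour * hours


# Rates copied from the slide note; treat them as examples, not current prices.
estimates = {
    "S3, 100 GB": annual_storage_cost(0.12, 100),
    "EBS, 100 GB": annual_storage_cost(0.10, 100),
    "EC2 small, on-demand, Windows": annual_hourly_cost(0.12),
    "RDS small, on-demand, SQL 2008 STD": annual_hourly_cost(0.14),
    "Elastic Beanstalk starter package": 42 * MONTHS_PER_YEAR,
}

for service, cost in estimates.items():
    print(f"{service}: ${cost:,.2f} per year")
```

The results are exact products of the quoted rates, so they can differ slightly from any rounded figure quoted by hand.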

Transcript

  • 1. Amazon Web Services for the SQL Server Professional. Lynn Langit, Architect. Level: Intermediate
  • 2. What and Why AWS? • AWS: Amazon’s cloud • Set of services: Compute, Data, More • Market leader: In market longest, Usually cheapest, Most often used in production
  • 3. Amazon Web Services
  • 4. EC2 - VMs for train, test & production. Pricing: • On-demand • Spot • Reserved
  • 5. Demo - EC2 • Virtual Machines
  • 6. S3 and Glacier
  • 7. About EC2 storage
    S3: • 10 GB max • 3 copies • Usually for data storage
    EBS (expand / snapshot, etc…): • Can store AMIs (persistent) • Can ‘stop’ EC2 instances and ‘re-start’ (saves $$$) • Costs more • Can expand • One copy only (faster)
    SSD (optional): • For high performance • Provisioned IOPS
  • 8. Demo – S3 • File Storage
  • 9. Demo – Glacier • Archival Storage
  • 10. RDS – Managed Relational Data
  • 11. Demo – RDS • SQL Server as a service
  • 12. RDS vs. EC2 for SQL Server. Why RDS costs more: • Provisioned IO – performance guarantees • Scheduled backups • Point in time restores • Scheduled maintenance windows • Full use of all SQL tools, SSMS, Profiler, DTA, etc… • Supports Availability Groups (requires 2012 Enterprise)
  • 13. Redshift – $999 / TB / year
  • 14. Demo – Redshift • Data Warehousing as a Service
  • 15. DynamoDB for fast NoSQL with SSDs
  • 16. Demo – DynamoDB • NoSQL on SSD
  • 17. Elastic MapReduce for easy Hadoop
  • 18. Demo – MapReduce • Hadoop on AWS
  • 19. Kinesis for real-time Big Data Streams
  • 20. Demo – Kinesis • Real-time streaming for Big Data
  • 21. Data Pipelines – automated data transfer
  • 22. Demo – Data Pipeline • Build data flows on AWS
  • 23. Integration w/ Visual Studio – AWS SDK. See also: • AWS Tools for Windows Developers • Includes AWS PowerShell
  • 24. AWS SDK includes AWS PowerShell
  • 25. Demo – AWS SDK • Add-in for Visual Studio and .NET
  • 26. Cloud Database Services by Vendor (AWS | Google | Microsoft)
    RDBMS VMs: EC2 AMIs w/SQL Server, etc… | GCE w/MySQL | Azure VM images w/SQL Server
    Managed RDBMS: RDS - SQL Server, MySQL | Cloud SQL - MySQL | SQL Azure
    NoSQL buckets/databases: S3, EBS, Glacier, DynamoDB | Cloud Storage, HR Datastore on GAE | Azure Blobs & Tables
    Pipelines: Data Pipelines | Data Pipelines (beta) | SSIS?
    Streaming / Machine Learning: Kinesis or custom EC2 | BigQuery & Prediction API | StreamInsight, Azure Machine Learning
    Document: MongoDB on EC2 | MongoDB on GCE | MongoDB on Windows Azure
    Hadoop: MapReduce | Big Query (Dremel) | HDInsight
    Other: Redshift – Data Warehouse, Workspaces & Zocalo | Managed VMs, GAE | Azure Marketplace – premium data
  • 27. Costs - Free Tier for Database Services
  • 28. How much does it cost? Tip: When testing use Billing Alerts to make sure you’ve turned off test services!
  • 29. Creative Financing
    Regular Pricing: • Use what you need and no more, i.e. instance size, storage size… • Watch for price drops – RDS price decrease this week
    Smart EC2 Instance Usage: • Pause EC2 instances to reduce compute charges • Delete EC2 instances to reduce storage charges
    Vanity Pricing: • Set pricing alerts • Use spot pricing • Re-selling compute / storage
  • 30. Usage Summary
    Compute: EC2 (Dev & Test, Train, Prod)
    Storage: S3 (raw storage), Glacier (archiving)
    Data Services: RDS (partially managed RDBMS, HA SQL Server), Redshift (data warehousing), DynamoDB (fast NoSQL on SSDs), EMR (on-demand MapReduce), Kinesis (streaming), Data Pipelines (automation)
  • 32. Keep Learning • Connect – @LynnLangit – www.youtube.com/user/SoCalDevGal • Get started – Sign up for AWS – use ‘Free Tier’ – Email me to get $100 AWS usage credit
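
Slide 29's advice to pause EC2 instances to reduce compute charges can be made concrete with a rough estimate. This is a sketch under stated assumptions: the $0.12 / hr rate is the small on-demand Windows figure from the slide notes, the 10-hours-a-day, 5-days-a-week schedule is a hypothetical dev/test pattern, and `yearly_compute_cost` is a name invented for this example. Storage attached to a stopped instance is still billed, which is why the slide separately suggests deleting instances to reduce storage charges.

```python
WEEKS_PER_YEAR = 52  # 364 days, so totals run slightly below an 8,760-hour year


def yearly_compute_cost(rate_per_hour: float, hours_per_day: float,
                        days_per_week: int) -> float:
    """Yearly compute charge for an instance that runs on a fixed weekly schedule."""
    return rate_per_hour * hours_per_day * days_per_week * WEEKS_PER_YEAR


RATE = 0.12  # USD per hour, taken from the slide notes (historical snapshot)

always_on = yearly_compute_cost(RATE, hours_per_day=24, days_per_week=7)
work_hours_only = yearly_compute_cost(RATE, hours_per_day=10, days_per_week=5)
savings = always_on - work_hours_only

print(f"Always on:       ${always_on:,.2f} per year")
print(f"Work hours only: ${work_hours_only:,.2f} per year")
print(f"Saved by pausing: ${savings:,.2f} ({savings / always_on:.0%})")
```

Changing the schedule arguments shows how quickly the compute charge falls for machines that only need to run during working hours.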