AWS Summit 2013 | India - Petabyte Scale Data Warehousing at Low Cost, Abhishek Sinha and Co-Presented with Jaspersoft

Amazon Redshift is a fast and powerful, fully managed, petabyte-scale data warehouse service in the cloud. In this session we'll give an introduction to the service and its pricing before diving into how it delivers fast query performance on data sets ranging from hundreds of gigabytes to a petabyte or more.

Published in: Technology, Business

  1. Abhishek Sinha, Business Development Manager
     sinhaar@amazon.com, @abysinha
     Petabyte Scale Data Warehousing on the Cloud
  2. Data warehousing done the AWS way
     • No upfront costs, pay as you go
     • Really fast performance at a really low price
     • Open and flexible with support for popular tools
     • Easy to provision and scale up massively
  3. We set out to build…
     A fast and powerful, petabyte-scale data warehouse that is:
     Delivered as a managed service
     A Lot Faster
     A Lot Cheaper
     A Lot Simpler
     Amazon Redshift
  4. We’re off to a good start
  5. We set out to build…
     A fast and powerful, petabyte-scale data warehouse that is:
     Delivered as a managed service
     A Lot Faster
     A Lot Cheaper
     A Lot Simpler
     Amazon Redshift
  6. Amazon Redshift dramatically reduces I/O
     ID   Age  State
     123  20   CA
     345  25   WA
     678  40   FL
     Diagram labels: Row storage, Column storage, Scan Direction
  7. Amazon Redshift dramatically reduces I/O
     • Data compression
     • Zone maps
     • Direct-attached storage
     • Large data block sizes
     ID   Age  State  Amount
     123  20   CA     500
     345  25   WA     250
     678  40   FL     125
     957  37   WA     375
     • With row storage you do unnecessary I/O
     • To get total amount, you have to read everything
  8. Amazon Redshift dramatically reduces I/O
     • Data compression
     • Zone maps
     • Direct-attached storage
     • Large data block sizes
     ID   Age  State  Amount
     123  20   CA     500
     345  25   WA     250
     678  40   FL     125
     957  37   WA     375
     • With column storage, you only read the data you need
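
The row-versus-column comparison on slides 7 and 8 can be sketched in a few lines of Python. This is only an illustration using the sample table from the slides: the function names are hypothetical, and "values read" is a rough stand-in for I/O, not a model of how Amazon Redshift lays out data on disk.

```python
# Sample table from the slides: (ID, Age, State, Amount).
ROWS = [
    (123, 20, "CA", 500),
    (345, 25, "WA", 250),
    (678, 40, "FL", 125),
    (957, 37, "WA", 375),
]
COLUMN_NAMES = ("ID", "Age", "State", "Amount")


def total_amount_row_store(rows):
    """Sum Amount from a row layout; every field of every row is read."""
    amount_index = COLUMN_NAMES.index("Amount")
    values_read = 0
    total = 0
    for row in rows:
        values_read += len(row)  # the whole row is fetched to reach one field
        total += row[amount_index]
    return total, values_read


def to_column_store(rows):
    """Pivot the row tuples into one list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(COLUMN_NAMES)}


def total_amount_column_store(columns):
    """Sum Amount from a column layout; only the Amount column is read."""
    amounts = columns["Amount"]
    return sum(amounts), len(amounts)


row_total, row_reads = total_amount_row_store(ROWS)
col_total, col_reads = total_amount_column_store(to_column_store(ROWS))
print(f"row store:    total={row_total}, values read={row_reads}")
print(f"column store: total={col_total}, values read={col_reads}")
```

Both layouts give the same total, but the row layout touches all sixteen values while the column layout touches only the four in the Amount column.
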
  9. Amazon Redshift dramatically reduces I/O
     • Column storage
     • Data compression
     • Zone maps
     • Direct-attached storage
     • Large data block sizes
     • Columnar compression saves space & reduces I/O
     • Amazon Redshift analyzes and compresses your data
     analyze compression listing;
      Table   | Column         | Encoding
     ---------+----------------+----------
      listing | listid         | delta
      listing | sellerid       | delta32k
      listing | eventid        | delta32k
      listing | dateid         | bytedict
      listing | numtickets     | bytedict
      listing | priceperticket | delta32k
      listing | totalprice     | mostly32
      listing | listtime       | raw
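
To give a feel for why an encoding such as `delta` can save space on a column like `listid`, here is a minimal Python sketch of generic delta encoding. It illustrates the general idea only: the sample values and function names are made up, and nothing here reflects Amazon Redshift's actual on-disk format.

```python
def delta_encode(values):
    """Keep the first value, then store each value as the difference from its predecessor."""
    if not values:
        return []
    encoded = [values[0]]
    for previous, current in zip(values, values[1:]):
        encoded.append(current - previous)
    return encoded


def delta_decode(encoded):
    """Rebuild the original values by accumulating the stored differences."""
    values = []
    running = 0
    for position, item in enumerate(encoded):
        running = item if position == 0 else running + item
        values.append(running)
    return values


# Hypothetical, mostly sequential IDs (not taken from the slides).
list_ids = [100001, 100002, 100004, 100007, 100008]
encoded_ids = delta_encode(list_ids)
print(encoded_ids)
print(delta_decode(encoded_ids) == list_ids)
```

After the first entry, the encoded column holds only small differences, which need fewer bytes than the full IDs.
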
  10. Amazon Redshift dramatically reduces I/O
      • Column storage
      • Data compression
      • Direct-attached storage
      • Large data block sizes
      • Track the minimum and maximum value for each block
      • Skip over blocks that don’t contain the data needed for a given query
      • Minimize unnecessary I/O
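
The zone-map idea on slide 10 (keep a minimum and maximum per block, then skip blocks that cannot match) can be sketched as follows. This is an illustration under stated assumptions: the block size, sample values, and function names are hypothetical, and real blocks are far larger than four values.

```python
def build_zone_map(values, block_size):
    """Split a column into fixed-size blocks and record (min, max) for each block."""
    blocks = [values[i:i + block_size] for i in range(0, len(values), block_size)]
    zone_map = [(min(block), max(block)) for block in blocks]
    return blocks, zone_map


def scan_equal(blocks, zone_map, target):
    """Return matching values and how many blocks had to be read."""
    matches = []
    blocks_read = 0
    for block, (low, high) in zip(blocks, zone_map):
        if target < low or target > high:
            continue  # the zone map proves this block cannot contain the target
        blocks_read += 1
        matches.extend(value for value in block if value == target)
    return matches, blocks_read


# Hypothetical sorted Age column, stored in blocks of four values.
ages = [20, 21, 23, 25, 31, 33, 37, 38, 40, 44, 52, 60]
blocks, zone_map = build_zone_map(ages, block_size=4)
matches, blocks_read = scan_equal(blocks, zone_map, target=37)
print(zone_map)
print(matches, blocks_read)
```

Skipping works best when the column is sorted, because each block then covers a narrow, non-overlapping range of values.
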
  11. Amazon Redshift dramatically reduces I/O
      • Column storage
      • Data compression
      • Zone maps
      • Direct-attached storage
      • Large data block sizes
      • Use direct-attached storage to maximize throughput
      • Hardware optimized for high performance data processing
      • Large block sizes to make the most of each read
      • Amazon Redshift manages durability for you
  12. Amazon Redshift architecture
      • Leader Node
        – SQL endpoint
        – Stores metadata
        – Coordinates query execution
      • Compute Nodes
        – Local, columnar storage
        – Execute queries in parallel
        – Load, backup, restore via Amazon S3
        – Parallel load from Amazon DynamoDB
      • Single node version available
      Diagram labels: 10 GigE (HPC), Ingestion, Backup, Restore, JDBC/ODBC
  13. Amazon Redshift runs on optimized hardware
      HS1.8XL: 128 GB RAM, 16 Cores, 24 Spindles, 16 TB compressed user storage, 2 GB/sec scan rate
      HS1.XL: 16 GB RAM, 2 Cores, 3 Spindles, 2 TB compressed customer storage
      • Optimized for I/O intensive workloads
      • High disk density
      • Runs in HPC - fast network
      • HS1.8XL available on Amazon EC2
  14. Amazon Redshift parallelizes and distributes everything
      • Query
      • Load
      • Backup
      • Restore
      • Resize
      Diagram labels: 10 GigE (HPC), Ingestion, Backup, Restore, JDBC/ODBC
  15. Amazon Redshift parallelizes and distributes everything
      • Query
      • Load
      • Backup/Restore
      • Resize
  16. Amazon Redshift parallelizes and distributes everything
      • Load in parallel from Amazon S3 or Amazon DynamoDB
      • Data automatically distributed and sorted according to DDL
      • Scales linearly with number of nodes
      • Query
      • Load
      • Backup/Restore
      • Resize
  17. Amazon Redshift parallelizes and distributes everything
      • Backups to Amazon S3 are automatic, continuous and incremental
      • Configurable system snapshot retention period
      • Take user snapshots on-demand
      • Streaming restores enable you to resume querying faster
      • Query
      • Load
      • Backup/Restore
      • Resize
  18. Amazon Redshift parallelizes and distributes everything
      • Resize while remaining online
      • Provision a new cluster in the background
      • Copy data in parallel from node to node
      • Only charged for source cluster
      • Query
      • Load
      • Backup/Restore
      • Resize
  19. Amazon Redshift parallelizes and distributes everything
      • Query
      • Load
      • Backup/Restore
      • Resize
      • Automatic SQL endpoint switchover via DNS
      • Decommission the source cluster
      • Simple operation via AWS Console or API
  20. Point and click resize
  21. Amazon Redshift lets you start small and grow big
      Extra Large Node (HS1.XL): 3 spindles, 2 TB, 16 GB RAM, 2 cores
      • Single Node (2 TB)
      • Cluster 2-32 Nodes (4 TB – 64 TB)
      Eight Extra Large Node (HS1.8XL): 24 spindles, 16 TB, 128 GB RAM, 16 cores, 10 GigE
      • Cluster 2-100 Nodes (32 TB – 1.6 PB)
      Note: Nodes not to scale
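
The capacity ranges on slide 21 follow from multiplying the node count by the per-node storage. A small Python sketch of that arithmetic, using only the figures quoted on the slide (the dictionary and function names are hypothetical, and 1 PB is taken as 1,000 TB):

```python
# Per-node storage and allowed cluster sizes, as quoted on slide 21.
NODE_STORAGE_TB = {"HS1.XL": 2, "HS1.8XL": 16}
CLUSTER_NODE_RANGE = {"HS1.XL": (2, 32), "HS1.8XL": (2, 100)}


def cluster_capacity_tb(node_type, node_count):
    """Return total cluster storage in TB, checking the slide's node-count limits."""
    smallest, largest = CLUSTER_NODE_RANGE[node_type]
    if not smallest <= node_count <= largest:
        raise ValueError(f"{node_type} clusters run from {smallest} to {largest} nodes")
    return NODE_STORAGE_TB[node_type] * node_count


for node_type, (smallest, largest) in CLUSTER_NODE_RANGE.items():
    low_tb = cluster_capacity_tb(node_type, smallest)
    high_tb = cluster_capacity_tb(node_type, largest)
    print(f"{node_type}: {low_tb} TB to {high_tb} TB")
```

The largest HS1.8XL cluster comes out at 1,600 TB, which is the 1.6 PB figure on the slide.
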
  22. Amazon Redshift is priced to let you analyze all your data
       Plan               | Price Per Hour for HS1.XL Single Node | Effective Hourly Price Per TB | Effective Annual Price per TB
      --------------------+---------------------------------------+-------------------------------+-------------------------------
       On-Demand          | $0.850                                | $0.425                        | $3,723
       1 Year Reservation | $0.500                                | $0.250                        | $2,190
       3 Year Reservation | $0.228                                | $0.114                        | $999
      Simple Pricing
      Number of Nodes x Cost per Hour
      No charge for Leader Node
      No upfront costs
      Pay as you go
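
The effective per-TB prices on slide 22 can be reproduced from the hourly node price: divide by the 2 TB of an HS1.XL node, then multiply by the hours in a year. A short Python sketch of that arithmetic, assuming a 365-day year (names are hypothetical; prices are the 2013 figures from the slide):

```python
HOURS_PER_YEAR = 24 * 365  # assumes a non-leap year
HS1_XL_STORAGE_TB = 2      # per-node storage quoted on the slides

# Hourly price for a single HS1.XL node, as listed on slide 22.
HOURLY_NODE_PRICE = {
    "On-Demand": 0.850,
    "1 Year Reservation": 0.500,
    "3 Year Reservation": 0.228,
}


def effective_prices(hourly_node_price):
    """Return (price per TB per hour, price per TB per year) for one HS1.XL node."""
    per_tb_hour = hourly_node_price / HS1_XL_STORAGE_TB
    per_tb_year = per_tb_hour * HOURS_PER_YEAR
    return per_tb_hour, per_tb_year


def cluster_hourly_cost(node_count, hourly_node_price):
    """Slide 22's simple pricing rule: number of nodes x cost per hour."""
    return node_count * hourly_node_price


for plan, price in HOURLY_NODE_PRICE.items():
    per_tb_hour, per_tb_year = effective_prices(price)
    print(f"{plan}: ${per_tb_hour:.3f} per TB-hour, ${per_tb_year:,.0f} per TB-year")
```

Rounded to whole dollars, the three annual figures match the slide: $3,723, $2,190 and $999 per TB.
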
  23. Amazon Redshift is easy to use
      • Provision in minutes
      • Monitor query performance
      • Point and click resize
      • Built in security
      • Automatic backups
  24. Provision a data warehouse in minutes
  25. Monitor query performance
  26. Amazon Redshift integrates with multiple data sources
      Diagram labels: Amazon DynamoDB, Amazon Elastic MapReduce, Amazon Simple Storage Service (S3), Amazon Elastic Compute Cloud (EC2), AWS Storage Gateway Service, Corporate Data Center, Amazon Relational Database Service (RDS), Amazon Redshift
      More coming soon…
  27. Amazon Redshift provides multiple data loading options
      • Upload to Amazon S3
      • AWS Import/Export
      • AWS Direct Connect
      • Work with a partner
      Data Integration, Systems Integrators
      More coming soon…
  28. Amazon Redshift works with your existing analysis tools
      Diagram labels: JDBC/ODBC, Amazon Redshift
      More coming soon…
  29. Jaspersoft for AWS Overview
  30. ©2010 Jaspersoft Corporation. Proprietary and Confidential
  31. Competing on Time and Information
      “The New Factors of Production: Time and Information” Brian Gentile, Jaspersoft
      But business users don’t have access to timely, actionable data
      Why? Most don’t spend their day inside a BI tool …nor do they want to!
      ©2013 Jaspersoft Corporation. Proprietary and Confidential
  32. We Need “Intelligence Inside”
      We want information to FIND US, not the other way round
      • Pipeline dashboard inside SaaS CRM app
      • Performance report inside partner portal
      • Salary data visualizations inside HR intranet
      • Portfolio analytics inside client website
      • Tickets crosstab inside custom helpdesk app
      • Interactive charts inside native mobile app
      “To make analytics more actionable and pervasively deployed, BI professionals must make analytics more invisible to their users […] through embedded analytic applications at the point of decision or action.”
  33. Jaspersoft: The Intelligence Inside
      Self-Service BI + Embeddable + Affordable
      “We empower millions of people every day to make better decisions faster by delivering timely, actionable data to them inside their apps and business process through an embeddable, cost-effective reporting and analytics platform.”
  34. Intelligence Inside, Example Customers
      Commercial Apps, Customer Portals, Cloud Apps, Internal Apps, Big Data Analytics
      The Intelligence Inside Business
  35. Strong Partnerships, Broad Recognition
      High Growth Subscription Revenue Company
      World’s Most Widely Deployed BI
      • Commercial Open Source BI Suite
      • Nearly 200 people worldwide
      • 16,000,000 downloads
      • 325,000 community members
      • 130,000 embedded applications
      • 1,800 subscription customers
      Jaspersoft: High Growth and Momentum
      Chart labels: 2010, 2011, 2012, 2013, Magic Quadrants
  36. Winner, Technology of the Year 2013
      • Jaspersoft wins alongside iPad Mini, Hadoop, HTML5
      • Only business intelligence or analytics vendor to win
      “Jaspersoft's powerhouse reporting and analytics platform […] remains a flexible fit for a broad range of use cases. Whether you're looking to scrub petabytes of data with threat analytics, or just knock out some slick dashboards that drill into customer traffic patterns, Jaspersoft has the right stuff.” InfoWorld
  37. Product Overview
  38. Design Any Report . . .
  39. … Dashboard
  40. … or Analytic View
  41. … using Any Data Type
      Diagram labels: POJO files, Relational Files, Relational Big Data Files
  42. … bringing Intelligence to Any App
  43. Jaspersoft for AWS Details
  44. Jaspersoft for AWS Overview
      • Jaspersoft is the first BI service that you can buy per hour
      • No user limitations, no monthly fee
      • Starting at $0.40 an hour
      • First BI service to automatically connect to your AWS data
      • 10 minutes from purchase to analyzing your data in RDS or Redshift
      • AWS Security Integration
  45. Jaspersoft for AWS In Action
      “We've taken the desktop power of data visualization tools, built it to scale on the HTML5 web, and made it embeddable within any app, device or portal”
  46. ©2010 Jaspersoft Corporation. Proprietary and Confidential
  47. Jaspersoft on Amazon AWS: Fast Customer Growth
      Some Early Stats
      - Added 250 paying customers in 3 months
      - Currently ~ 30% staying active
      - Revenue grew 10X over last month
      - Last month usage ~ 70% US, 25% EU, 5% ROW
      “This is truly a disruptive product offering. The pricing is extremely cost effective and I had it setup with dashboards in an hour.” Sage Human Capital
      “Jaspersoft has developed a truly innovative offering with its utility-based pricing model.” Click Travel
      “I’ve been looking at your product offering for an internal project and the experience has been very positive. I think you guys have the right product, right place, right time.” Leading Cloud Provider
  48. Some Early Customers
  49. NEW! Jaspersoft for AWS Promo
      • What?
        – Free Jaspersoft for AWS on XL instance
        – $175 of AWS credits for AWS services
      • When?
        – From June 15, 2013 – July 14, 2013
      • How?
        – Go to www.jaspersoft.com/cloud and sign up
        – Details: https://aws.amazon.com/marketplace/help/201193990
  50. The Intelligence Inside
      Thank You
      www.jaspersoft.com/amazon
      aws-marketplace@jaspersoft.com
