AWS Summit 2013 | India - Petabyte Scale Data Warehousing at Low Cost, Abhishek Sinha and Co-Presented with Jaspersoft


Amazon Redshift is a fast and powerful, fully managed, petabyte-scale data warehouse service in the cloud. In this session we'll give an introduction to the service and its pricing before diving into how it delivers fast query performance on data sets ranging from hundreds of gigabytes to a petabyte or more.


Usage Rights

© All Rights Reserved


Presentation Transcript

  • Petabyte Scale Data Warehousing on the Cloud • Abhishek Sinha, Business Development Manager • sinhaar@amazon.com • @abysinha
  • Data warehousing done the AWS way • No upfront costs, pay as you go • Really fast performance at a really low price • Open and flexible with support for popular tools • Easy to provision and scale up massively
  • We set out to build… A fast and powerful, petabyte-scale data warehouse that is: • Delivered as a managed service • A Lot Faster • A Lot Cheaper • A Lot Simpler • Amazon Redshift
  • We’re off to a good start
  • We set out to build… A fast and powerful, petabyte-scale data warehouse that is: • Delivered as a managed service • A Lot Faster • A Lot Cheaper • A Lot Simpler • Amazon Redshift
  • Amazon Redshift dramatically reduces I/O
    [Figure: scan direction for row storage vs. column storage over the table below]
    ID  | Age | State
    123 | 20  | CA
    345 | 25  | WA
    678 | 40  | FL
  • Amazon Redshift dramatically reduces I/O • Data compression • Zone maps • Direct-attached storage • Large data block sizes • With row storage you do unnecessary I/O • To get total amount, you have to read everything
    ID  | Age | State | Amount
    123 | 20  | CA    | 500
    345 | 25  | WA    | 250
    678 | 40  | FL    | 125
    957 | 37  | WA    | 375
  • Amazon Redshift dramatically reduces I/O • Data compression • Zone maps • Direct-attached storage • Large data block sizes • With column storage, you only read the data you need
    ID  | Age | State | Amount
    123 | 20  | CA    | 500
    345 | 25  | WA    | 250
    678 | 40  | FL    | 125
    957 | 37  | WA    | 375
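The row-versus-column point on these two slides can be sketched with the slide's own four-row table. This is a toy model that counts how many field values each layout has to read to total the Amount column; it is not Redshift's actual storage format, and the function names are made up for illustration.

```python
# Toy model of the slide's four-row table (ID, Age, State, Amount).
# Assumption: "I/O" is approximated by the number of field values read.
ROWS = [
    (123, 20, "CA", 500),
    (345, 25, "WA", 250),
    (678, 40, "FL", 125),
    (957, 37, "WA", 375),
]
COLUMNS = ("id", "age", "state", "amount")
AMOUNT_INDEX = COLUMNS.index("amount")


def total_amount_row_store(rows):
    """Sum Amount from a row layout: every field of every row is read."""
    values_read = 0
    total = 0
    for row in rows:
        values_read += len(row)      # the whole row comes off disk
        total += row[AMOUNT_INDEX]
    return total, values_read


def total_amount_column_store(rows):
    """Sum Amount from a column layout: only the Amount column is read."""
    # Pivot the rows into one list per column, as a column store would keep them.
    columns = {name: [row[i] for row in rows] for i, name in enumerate(COLUMNS)}
    amounts = columns["amount"]
    return sum(amounts), len(amounts)


row_total, row_reads = total_amount_row_store(ROWS)
col_total, col_reads = total_amount_column_store(ROWS)
print("row store:   ", row_total, "values read:", row_reads)
print("column store:", col_total, "values read:", col_reads)
```

Both layouts return the same total, but the row layout touches all sixteen values while the column layout touches only the four amounts.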
  • Amazon Redshift dramatically reduces I/O • Column storage • Data compression • Zone maps • Direct-attached storage • Large data block sizes • Columnar compression saves space & reduces I/O • Amazon Redshift analyzes and compresses your data
    analyze compression listing;
     Table   | Column         | Encoding
    ---------+----------------+----------
     listing | listid         | delta
     listing | sellerid       | delta32k
     listing | eventid        | delta32k
     listing | dateid         | bytedict
     listing | numtickets     | bytedict
     listing | priceperticket | delta32k
     listing | totalprice     | mostly32
     listing | listtime       | raw
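To give a feel for why an encoding such as delta can shrink a column like listid, here is a minimal sketch of the general delta-encoding idea: keep the first value, then store only the difference from the previous value. It illustrates the technique in general, not Redshift's on-disk implementation, and the sample IDs are invented.

```python
def delta_encode(values):
    """Keep the first value, then store each value as a difference from the previous one."""
    if not values:
        return []
    encoded = [values[0]]
    for previous, current in zip(values, values[1:]):
        encoded.append(current - previous)
    return encoded


def delta_decode(encoded):
    """Rebuild the original values by accumulating the stored differences."""
    decoded = []
    running = 0
    for position, delta in enumerate(encoded):
        running = delta if position == 0 else running + delta
        decoded.append(running)
    return decoded


# Hypothetical, mostly sequential IDs (assumed data, not from the slides).
list_ids = [100001, 100002, 100003, 100005, 100008]
encoded_ids = delta_encode(list_ids)
print(encoded_ids)
print(delta_decode(encoded_ids) == list_ids)
```

After the first entry, every stored number is small, so it can fit in far fewer bytes than the original values.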
  • Amazon Redshift dramatically reduces I/O • Column storage • Data compression • Zone maps • Direct-attached storage • Large data block sizes • Zone maps keep track of the minimum and maximum value for each block • Skip over blocks that don’t contain the data needed for a given query • Minimize unnecessary I/O
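The zone-map idea on this slide can be sketched in a few lines: record the minimum and maximum of each block, then skip any block whose range cannot contain the value being searched for. The block size, the data, and the function names below are assumptions chosen for illustration; real blocks are far larger.

```python
def build_zone_map(values, block_size):
    """Split a column into fixed-size blocks and record (min, max) per block."""
    blocks = [values[i:i + block_size] for i in range(0, len(values), block_size)]
    zone_map = [(min(block), max(block)) for block in blocks]
    return blocks, zone_map


def scan_equal(blocks, zone_map, target):
    """Return matching values and the number of blocks that had to be read."""
    matches = []
    blocks_read = 0
    for block, (low, high) in zip(blocks, zone_map):
        if target < low or target > high:
            continue                 # the zone map proves the block cannot match
        blocks_read += 1
        matches.extend(value for value in block if value == target)
    return matches, blocks_read


# Hypothetical sorted column of date IDs, four values per block.
date_ids = list(range(1, 13))
blocks, zone_map = build_zone_map(date_ids, block_size=4)
matches, blocks_read = scan_equal(blocks, zone_map, target=6)
print(zone_map)
print(matches, "blocks read:", blocks_read, "of", len(blocks))
```

The benefit depends on the data being sorted or clustered: on randomly ordered values, most block ranges overlap the target and few blocks can be skipped.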
  • Amazon Redshift dramatically reduces I/O • Column storage • Data compression • Zone maps • Direct-attached storage • Large data block sizes • Use direct-attached storage to maximize throughput • Hardware optimized for high performance data processing • Large block sizes to make the most of each read • Amazon Redshift manages durability for you
  • Amazon Redshift architecture • Leader Node – SQL endpoint – Stores metadata – Coordinates query execution • Compute Nodes – Local, columnar storage – Execute queries in parallel – Load, backup, restore via Amazon S3 – Parallel load from Amazon DynamoDB • Single node version available
    [Diagram labels: JDBC/ODBC, 10 GigE (HPC), Ingestion, Backup, Restore]
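The leader/compute split described on this slide can be sketched as a scatter-gather aggregate: rows are spread across compute nodes, each node sums its own slice in parallel, and the leader combines the partial results. This is a conceptual model only; the modulo distribution rule, the thread pool standing in for nodes, and all function names are assumptions, not Redshift internals.

```python
from concurrent.futures import ThreadPoolExecutor

# Rows from the earlier slide's table: (ID, Age, State, Amount).
ROWS = [
    (123, 20, "CA", 500),
    (345, 25, "WA", 250),
    (678, 40, "FL", 125),
    (957, 37, "WA", 375),
]


def distribute(rows, node_count):
    """Assign each row to a compute node by its ID (assumed distribution rule)."""
    slices = [[] for _ in range(node_count)]
    for row in rows:
        slices[row[0] % node_count].append(row)
    return slices


def compute_node_sum(node_rows):
    """Work done on one compute node: sum Amount over its local rows."""
    return sum(row[3] for row in node_rows)


def leader_total(rows, node_count):
    """Leader node: fan the query out, then combine the partial sums."""
    slices = distribute(rows, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(compute_node_sum, slices))
    return sum(partials), partials


total, partials = leader_total(ROWS, node_count=2)
print("partial sums per node:", partials)
print("total:", total)
```

Only the small partial sums travel back to the leader, which is why an aggregate like this can spread well across many nodes.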
  • Amazon Redshift runs on optimized hardware • Optimized for I/O intensive workloads • High disk density • Runs in HPC - fast network • HS1.8XL available on Amazon EC2
    HS1.8XL: 128 GB RAM, 16 cores, 24 spindles, 16 TB compressed user storage, 2 GB/sec scan rate
    HS1.XL: 16 GB RAM, 2 cores, 3 spindles, 2 TB compressed customer storage
  • Amazon Redshift parallelizes and distributes everything • Query • Load • Backup • Restore • Resize
    [Diagram labels: JDBC/ODBC, 10 GigE (HPC), Ingestion, Backup, Restore]
  • Amazon Redshift parallelizes and distributes everything • Query • Load • Backup/Restore • Resize
  • Amazon Redshift parallelizes and distributes everything • Load in parallel from Amazon S3 or Amazon DynamoDB • Data automatically distributed and sorted according to DDL • Scales linearly with number of nodes • Query • Load • Backup/Restore • Resize
  • Amazon Redshift parallelizes and distributes everything • Backups to Amazon S3 are automatic, continuous and incremental • Configurable system snapshot retention period • Take user snapshots on-demand • Streaming restores enable you to resume querying faster • Query • Load • Backup/Restore • Resize
  • Amazon Redshift parallelizes and distributes everything • Resize while remaining online • Provision a new cluster in the background • Copy data in parallel from node to node • Only charged for source cluster • Query • Load • Backup/Restore • Resize
  • Amazon Redshift parallelizes and distributes everything • Query • Load • Backup/Restore • Resize • Automatic SQL endpoint switchover via DNS • Decommission the source cluster • Simple operation via AWS Console or API
  • Point and click resize
  • Amazon Redshift lets you start small and grow big
    Extra Large Node (HS1.XL): 3 spindles, 2 TB, 16 GB RAM, 2 cores
      Single Node (2 TB)
      Cluster 2-32 Nodes (4 TB – 64 TB)
    Eight Extra Large Node (HS1.8XL): 24 spindles, 16 TB, 128 GB RAM, 16 cores, 10 GigE
      Cluster 2-100 Nodes (32 TB – 1.6 PB)
    Note: Nodes not to scale
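The capacity ranges on this slide follow from multiplying node count by per-node storage. A small sketch of that arithmetic, using only the figures quoted on the slide (the dictionary layout and function names are illustrative, and the HS1.XL single-node option is left out of the cluster limits):

```python
import math

# Per-node storage and cluster size limits as quoted on the slide.
NODE_TYPES = {
    "HS1.XL": {"tb_per_node": 2, "min_nodes": 2, "max_nodes": 32},
    "HS1.8XL": {"tb_per_node": 16, "min_nodes": 2, "max_nodes": 100},
}


def cluster_capacity_tb(node_type, node_count):
    """Compressed storage of a multi-node cluster, in TB."""
    spec = NODE_TYPES[node_type]
    if not spec["min_nodes"] <= node_count <= spec["max_nodes"]:
        raise ValueError(f"{node_type} clusters have "
                         f"{spec['min_nodes']}-{spec['max_nodes']} nodes")
    return node_count * spec["tb_per_node"]


def nodes_needed(node_type, data_tb):
    """Smallest cluster of the given node type that holds data_tb of data."""
    spec = NODE_TYPES[node_type]
    count = max(spec["min_nodes"], math.ceil(data_tb / spec["tb_per_node"]))
    if count > spec["max_nodes"]:
        raise ValueError(f"{data_tb} TB does not fit in a {node_type} cluster")
    return count


print(cluster_capacity_tb("HS1.XL", 32), "TB")    # largest HS1.XL cluster
print(cluster_capacity_tb("HS1.8XL", 100), "TB")  # largest HS1.8XL cluster
print(nodes_needed("HS1.8XL", 500), "nodes for 500 TB")
```

The largest HS1.8XL cluster works out to 1,600 TB, which is the 1.6 PB quoted on the slide.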
  • Amazon Redshift is priced to let you analyze all your data • Simple pricing: number of nodes x cost per hour • No charge for leader node • No upfront costs • Pay as you go
    Plan               | Price per hour for HS1.XL single node | Effective hourly price per TB | Effective annual price per TB
    On-Demand          | $0.850                                | $0.425                        | $3,723
    1 Year Reservation | $0.500                                | $0.250                        | $2,190
    3 Year Reservation | $0.228                                | $0.114                        | $999
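The effective per-TB figures on this slide can be reproduced from the hourly node prices, assuming 2 TB per HS1.XL node (from the hardware slide) and a 365-day year of 8,760 hours. The function names are illustrative, and the prices are the 2013 figures quoted on the slide, not current pricing.

```python
HOURS_PER_YEAR = 24 * 365      # 8,760 hours, assuming a non-leap year
TB_PER_HS1_XL_NODE = 2         # compressed storage per HS1.XL node

# Hourly HS1.XL single-node prices as quoted on the slide (2013).
PRICE_PER_NODE_HOUR = {
    "On-Demand": 0.850,
    "1 Year Reservation": 0.500,
    "3 Year Reservation": 0.228,
}


def effective_prices(price_per_node_hour):
    """Return (hourly price per TB, annual price per TB) for one HS1.XL node."""
    hourly_per_tb = price_per_node_hour / TB_PER_HS1_XL_NODE
    annual_per_tb = hourly_per_tb * HOURS_PER_YEAR
    return hourly_per_tb, annual_per_tb


def cluster_cost_per_hour(node_count, price_per_node_hour):
    """Simple pricing: number of nodes x cost per hour (no leader node charge)."""
    return node_count * price_per_node_hour


for plan, price in PRICE_PER_NODE_HOUR.items():
    hourly_per_tb, annual_per_tb = effective_prices(price)
    print(f"{plan}: ${hourly_per_tb:.3f} per TB-hour, ${annual_per_tb:,.0f} per TB-year")
```

Rounded to whole dollars, the annual figures match the slide's $3,723, $2,190 and $999 per TB.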
  • Amazon Redshift is easy to use • Provision in minutes • Monitor query performance • Point and click resize • Built in security • Automatic backups
  • Provision a data warehouse in minutes
  • Monitor query performance
  • Amazon Redshift integrates with multiple data sources Amazon DynamoDB Amazon Elastic MapReduce Amazon Simple Storage Service (S3) Amazon Elastic Compute Cloud (EC2) AWS Storage Gateway Service Corporate Data Center Amazon Relational Database Service (RDS) Amazon Redshift More coming soon…
  • Amazon Redshift provides multiple data loading options • Upload to Amazon S3 • AWS Import/Export • AWS Direct Connect • Work with a partner Data Integration Systems Integrators More coming soon…
  • Amazon Redshift works with your existing analysis tools JDBC/ODBC Amazon Redshift More coming soon…
  • Jaspersoft for AWS Overview
  • ©2010 Jaspersoft Corporation. Proprietary and Confidential 31
  • Competing on Time and Information • “The New Factors of Production: Time and Information” (Brian Gentile, Jaspersoft) • But business users don’t have access to timely, actionable data • Why? Most don’t spend their day inside a BI tool …nor do they want to!
  • We Need “Intelligence Inside” • We want information to FIND US, not the other way round • Pipeline dashboard inside SaaS CRM app • Performance report inside partner portal • Salary data visualizations inside HR intranet • Portfolio analytics inside client website • Tickets crosstab inside custom helpdesk app • Interactive charts inside native mobile app • “To make analytics more actionable and pervasively deployed, BI professionals must make analytics more invisible to their users […] through embedded analytic applications at the point of decision or action.”
  • Jaspersoft: The Intelligence Inside • Self-Service BI + Embeddable + Affordable • “We empower millions of people every day to make better decisions faster by delivering timely, actionable data to them inside their apps and business process through an embeddable, cost-effective reporting and analytics platform.”
  • The Intelligence Inside Business • Intelligence Inside Example Customers • Commercial Apps • Customer Portals • Cloud Apps • Internal Apps • Big Data Analytics
  • Jaspersoft: High Growth and Momentum • World’s Most Widely Deployed BI • Commercial Open Source BI Suite • Nearly 200 people worldwide • 16,000,000 downloads • 325,000 community members • 130,000 embedded applications • 1,800 subscription customers • High Growth Subscription Revenue Company [chart: 2010, 2011, 2012, 2013] • Strong Partnerships, Broad Recognition [Magic Quadrants]
  • Winner, Technology of the Year 2013 • Jaspersoft wins alongside iPad Mini, Hadoop, HTML5 • Only business intelligence or analytics vendor to win • “Jaspersoft's powerhouse reporting and analytics platform […] remains a flexible fit for a broad range of use cases. Whether you're looking to scrub petabytes of data with threat analytics, or just knock out some slick dashboards that drill into customer traffic patterns, Jaspersoft has the right stuff.” (InfoWorld)
  • Product Overview
  • Design Any Report . . .
  • … Dashboard
  • … or Analytic View
  • … using Any Data Type [diagram labels: Relational, Big Data, Files, POJO]
  • … bringing Intelligence to Any App
  • Jaspersoft for AWS Details
  • Jaspersoft for AWS Overview • Jaspersoft is the first BI service that you can buy per hour • No user limitations, no monthly fee • Starting at $0.40 an hour • First BI service to automatically connect to your AWS data • 10 minutes from purchase to analyzing your data in RDS or Redshift • AWS Security Integration
  • Jaspersoft for AWS In Action • “We've taken the desktop power of data visualization tools, built it to scale on the HTML5 web, and made it embeddable within any app, device or portal”
  • Jaspersoft on Amazon AWS: Fast Customer Growth • Some Early Stats – Added 250 paying customers in 3 months – Currently ~30% staying active – Revenue grew 10X over last month – Last month usage ~70% US, 25% EU, 5% ROW • “This is truly a disruptive product offering. The pricing is extremely cost effective and I had it setup with dashboards in an hour.” (Sage Human Capital) • “Jaspersoft has developed a truly innovative offering with its utility-based pricing model.” (Click Travel) • “I’ve been looking at your product offering for an internal project and the experience has been very positive. I think you guys have the right product, right place, right time.” (Leading Cloud Provider)
  • Some Early Customers
  • NEW! Jaspersoft for AWS Promo • What? Free Jaspersoft for AWS on XL instance; $175 of AWS credits for AWS services • When? From June 15, 2013 – July 14, 2013 • How? Go to www.jaspersoft.com/cloud and sign up • Details: https://aws.amazon.com/marketplace/help/201193990
  • The Intelligence Inside Thank You www.jaspersoft.com/amazon aws-marketplace@jaspersoft.com