Amazon Redshift:
How we managed 300 billion rows with no DBA
Matt Cohen
Founder & President
matt@onespot.com
December 10th, 2013

Copyright © 2013 OneSpot, Proprietary & Confidential

What is OneSpot?
• OneSpot is a content advertising platform that distributes content as ads that people want to click on.
– Fortune 2000 clients
– Realtime ad exchange bidding
– Adaptive machine learning
– Seed funded until $5.3M Series A last month

• Big data, big analysis

What is Redshift?
1. When light from a receding object appears shifted to the red end of the spectrum
– A consequence of the expanding universe

2. A cheap, fast, petabyte-scale, managed SQL data warehouse service from Amazon Web Services
– A consequence of the expanding cloud ecosystem

Why Redshift?
• Cheap
• Fast
• Petabyte-scale
• Managed Service
• SQL
• Data Warehouse
• From AWS

SQL Data Warehouse
• Based on the commercial ParAccel database
– Which is based on Postgres

• Standards-based tools and knowledge
• Built for data warehousing
– Column-oriented
– Cluster architecture
– Read optimized
– No relational integrity
– Almost no SQL extensions

SQL Data Warehouse
• Column-oriented

• 11 different compression techniques
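To make the column-oriented idea concrete, here is a minimal Python sketch. It is not Redshift's storage engine: it pivots a few hypothetical ad rows into per-column lists and applies run-length encoding, one simple scheme of the kind a column store can use. The slide does not name the 11 techniques, and the function names and sample data below are assumptions for illustration.

```python
from itertools import groupby


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def run_length_decode(pairs):
    """Expand (value, count) pairs back into the original list."""
    return [value for value, count in pairs for _ in range(count)]


# Hypothetical sample data, not taken from the talk.
rows = [
    {"campaign": "a", "clicks": 3},
    {"campaign": "a", "clicks": 5},
    {"campaign": "a", "clicks": 1},
    {"campaign": "b", "clicks": 2},
]

columns = to_columns(rows)
encoded = run_length_encode(columns["campaign"])
print(encoded)
```

Values within one column tend to repeat, so storing them together is what lets a simple per-column encoding shrink the data.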

SQL Data Warehouse
• Cluster architecture

SQL Data Warehouse
• Read optimized
– Large block size (1 MB)
– No indexes: sort and distribution keys

• No relational integrity
– Data replication
• 2x live, 1x S3

• Almost no SQL extensions
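The role of sort and distribution keys in place of indexes can be sketched in Python. This is an illustration under stated assumptions, not Redshift's implementation: the CRC32 hash, the slice count, and the table and column names are all hypothetical. The distribution key decides which slice holds a row, and the sort key decides the order of rows within that slice.

```python
import zlib


def slice_for(key, num_slices):
    """Map a distribution-key value to a slice (CRC32 is an assumed hash)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_slices


def distribute(rows, dist_key, sort_key, num_slices):
    """Place each row on the slice chosen by its distribution key,
    then keep every slice ordered by the sort key."""
    slices = [[] for _ in range(num_slices)]
    for row in rows:
        slices[slice_for(row[dist_key], num_slices)].append(row)
    for rows_on_slice in slices:
        rows_on_slice.sort(key=lambda row: row[sort_key])
    return slices


# Hypothetical tables that share the same distribution key.
impressions = [
    {"user_id": 7, "ts": 30},
    {"user_id": 9, "ts": 10},
    {"user_id": 7, "ts": 20},
]
clicks = [{"user_id": 7, "ts": 25}, {"user_id": 9, "ts": 12}]

NUM_SLICES = 4
impression_slices = distribute(impressions, "user_id", "ts", NUM_SLICES)
click_slices = distribute(clicks, "user_id", "ts", NUM_SLICES)
```

Because both tables hash the same key, matching rows land on the same slice and can be joined without moving data between nodes. Keeping each slice sorted is what allows a range filter on the sort key to skip blocks instead of consulting an index.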

Fast = Cheap
• Starts with 1 XL node
– 85¢ an hour ($620/month) on demand
– 50¢ an hour ($365/month) 1 year reserved

• Benchmarks say:
– Scales linearly
– 5-10x faster than Hadoop/Hive
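The monthly figures on this slide follow from the hourly rates. A short Python sketch of that arithmetic, assuming an average month of 730 hours (365 × 24 / 12); the function name and the `nodes` parameter are illustrative, and the rates are the 2013 prices quoted above:

```python
# Assumption: an average month, 365 days * 24 hours / 12 months = 730 hours.
HOURS_PER_MONTH = 365 * 24 / 12


def monthly_cost(hourly_rate_usd, nodes=1):
    """Approximate monthly cost of running `nodes` nodes around the clock."""
    return hourly_rate_usd * HOURS_PER_MONTH * nodes


on_demand = monthly_cost(0.85)  # on-demand XL node rate from the slide
reserved = monthly_cost(0.50)   # 1-year reserved XL node rate from the slide

print(f"on demand: ${on_demand:.2f}/month")
print(f"reserved:  ${reserved:.2f}/month")
```

The results come out at roughly $620 and $365, matching the slide.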

Petabyte scale
• Up to
– 32 XL nodes (64 Terabytes)
– 100 8XL nodes (1.6 Petabytes)
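The cluster limits above imply a storage figure per node. A small Python sketch of that division, assuming decimal units (1 PB = 1000 TB); the function name is illustrative:

```python
TB_PER_PB = 1000  # assumption: decimal units


def storage_per_node_tb(total_tb, node_count):
    """Storage each node contributes, in terabytes."""
    return total_tb / node_count


xl_per_node = storage_per_node_tb(64, 32)
eight_xl_per_node = storage_per_node_tb(1.6 * TB_PER_PB, 100)

print(f"XL node:  {xl_per_node:g} TB")
print(f"8XL node: {eight_xl_per_node:g} TB")
```

That works out to about 2 TB per XL node and about 16 TB per 8XL node.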

Managed Service from AWS
• Managed Service
– Incredibly easy
– Nice UI
– Most SQL tools

• From AWS
– Free data transfer
– Easy load from S3
– Use AWS Data Pipeline

The TL;DR
• Pros
– Standard SQL
– Super easy
– Very fast
– Affordable
– Integrates with AWS
– No DBA
– No Sysadmin

• Cons
– Standard SQL
– Almost no SQL extensions
– Best with Star Schema
• Big joins can be slow
– No MapReduce
– Fixed columns
– Consistency
– 1.6 Pbyte limit

More Related Content

What's hot

Дмитрий Лавриненко "Blockchain for Identity Management, based on Fast Big Data"
Дмитрий Лавриненко "Blockchain for Identity Management, based on Fast Big Data"Дмитрий Лавриненко "Blockchain for Identity Management, based on Fast Big Data"
Дмитрий Лавриненко "Blockchain for Identity Management, based on Fast Big Data"
Fwdays
 
Pyramid vs QlikView
Pyramid vs QlikViewPyramid vs QlikView
Pyramid vs QlikView
Pyramid Analytics
 
Pyramid Analytics vs Sisense
Pyramid Analytics vs SisensePyramid Analytics vs Sisense
Pyramid Analytics vs Sisense
Pyramid Analytics
 
Rethinking the database for the cloud (iJAWS)
Rethinking the database for the cloud (iJAWS)Rethinking the database for the cloud (iJAWS)
Rethinking the database for the cloud (iJAWS)
Rasmus Ekman
 
Introduction to Amazon Athena
Introduction to Amazon AthenaIntroduction to Amazon Athena
Introduction to Amazon Athena
Amazon Web Services
 
Event & Data Mesh as a Service: Industrializing Microservices in the Enterpri...
Event & Data Mesh as a Service: Industrializing Microservices in the Enterpri...Event & Data Mesh as a Service: Industrializing Microservices in the Enterpri...
Event & Data Mesh as a Service: Industrializing Microservices in the Enterpri...
HostedbyConfluent
 
Wix sql on-storm-platform
Wix sql on-storm-platformWix sql on-storm-platform
Wix sql on-storm-platform
alooma
 
Datastax Expedia
Datastax ExpediaDatastax Expedia
Datastax Expedia
Eddie Satterly
 
Big Data Pipeline and Analytics Platform
Big Data Pipeline and Analytics PlatformBig Data Pipeline and Analytics Platform
Big Data Pipeline and Analytics Platform
Sudhir Tonse
 
Análisis de las novedades del Elastic Stack
Análisis de las novedades del Elastic StackAnálisis de las novedades del Elastic Stack
Análisis de las novedades del Elastic Stack
Elasticsearch
 
The Evolution of the Fashion Retail Industry in the Age of AI with Kshitij Ku...
The Evolution of the Fashion Retail Industry in the Age of AI with Kshitij Ku...The Evolution of the Fashion Retail Industry in the Age of AI with Kshitij Ku...
The Evolution of the Fashion Retail Industry in the Age of AI with Kshitij Ku...
Databricks
 
Introduction to AWS Glue
Introduction to AWS Glue Introduction to AWS Glue
Introduction to AWS Glue
Amazon Web Services
 
Optimizing Storage for Big Data/Analytics Workloads
Optimizing Storage for Big Data/Analytics WorkloadsOptimizing Storage for Big Data/Analytics Workloads
Optimizing Storage for Big Data/Analytics Workloads
Amazon Web Services
 
Azure Big Data Story
Azure Big Data StoryAzure Big Data Story
Azure Big Data Story
Lynn Langit
 
Streaming Data in the Cloud with Confluent and MongoDB Atlas | Robert Walters...
Streaming Data in the Cloud with Confluent and MongoDB Atlas | Robert Walters...Streaming Data in the Cloud with Confluent and MongoDB Atlas | Robert Walters...
Streaming Data in the Cloud with Confluent and MongoDB Atlas | Robert Walters...
HostedbyConfluent
 
Lessons learnt - building a data lake with redshift, emr, and athena - aws co...
Lessons learnt - building a data lake with redshift, emr, and athena - aws co...Lessons learnt - building a data lake with redshift, emr, and athena - aws co...
Lessons learnt - building a data lake with redshift, emr, and athena - aws co...
AWSCOMSUM
 
Taking the Performance of your Data Warehouse to the Next Level with Amazon R...
Taking the Performance of your Data Warehouse to the Next Level with Amazon R...Taking the Performance of your Data Warehouse to the Next Level with Amazon R...
Taking the Performance of your Data Warehouse to the Next Level with Amazon R...
Amazon Web Services
 

What's hot (17)

Дмитрий Лавриненко "Blockchain for Identity Management, based on Fast Big Data"
Дмитрий Лавриненко "Blockchain for Identity Management, based on Fast Big Data"Дмитрий Лавриненко "Blockchain for Identity Management, based on Fast Big Data"
Дмитрий Лавриненко "Blockchain for Identity Management, based on Fast Big Data"
 
Pyramid vs QlikView
Pyramid vs QlikViewPyramid vs QlikView
Pyramid vs QlikView
 
Pyramid Analytics vs Sisense
Pyramid Analytics vs SisensePyramid Analytics vs Sisense
Pyramid Analytics vs Sisense
 
Rethinking the database for the cloud (iJAWS)
Rethinking the database for the cloud (iJAWS)Rethinking the database for the cloud (iJAWS)
Rethinking the database for the cloud (iJAWS)
 
Introduction to Amazon Athena
Introduction to Amazon AthenaIntroduction to Amazon Athena
Introduction to Amazon Athena
 
Event & Data Mesh as a Service: Industrializing Microservices in the Enterpri...
Event & Data Mesh as a Service: Industrializing Microservices in the Enterpri...Event & Data Mesh as a Service: Industrializing Microservices in the Enterpri...
Event & Data Mesh as a Service: Industrializing Microservices in the Enterpri...
 
Wix sql on-storm-platform
Wix sql on-storm-platformWix sql on-storm-platform
Wix sql on-storm-platform
 
Datastax Expedia
Datastax ExpediaDatastax Expedia
Datastax Expedia
 
Big Data Pipeline and Analytics Platform
Big Data Pipeline and Analytics PlatformBig Data Pipeline and Analytics Platform
Big Data Pipeline and Analytics Platform
 
Análisis de las novedades del Elastic Stack
Análisis de las novedades del Elastic StackAnálisis de las novedades del Elastic Stack
Análisis de las novedades del Elastic Stack
 
The Evolution of the Fashion Retail Industry in the Age of AI with Kshitij Ku...
The Evolution of the Fashion Retail Industry in the Age of AI with Kshitij Ku...The Evolution of the Fashion Retail Industry in the Age of AI with Kshitij Ku...
The Evolution of the Fashion Retail Industry in the Age of AI with Kshitij Ku...
 
Introduction to AWS Glue
Introduction to AWS Glue Introduction to AWS Glue
Introduction to AWS Glue
 
Optimizing Storage for Big Data/Analytics Workloads
Optimizing Storage for Big Data/Analytics WorkloadsOptimizing Storage for Big Data/Analytics Workloads
Optimizing Storage for Big Data/Analytics Workloads
 
Azure Big Data Story
Azure Big Data StoryAzure Big Data Story
Azure Big Data Story
 
Streaming Data in the Cloud with Confluent and MongoDB Atlas | Robert Walters...
Streaming Data in the Cloud with Confluent and MongoDB Atlas | Robert Walters...Streaming Data in the Cloud with Confluent and MongoDB Atlas | Robert Walters...
Streaming Data in the Cloud with Confluent and MongoDB Atlas | Robert Walters...
 
Lessons learnt - building a data lake with redshift, emr, and athena - aws co...
Lessons learnt - building a data lake with redshift, emr, and athena - aws co...Lessons learnt - building a data lake with redshift, emr, and athena - aws co...
Lessons learnt - building a data lake with redshift, emr, and athena - aws co...
 
Taking the Performance of your Data Warehouse to the Next Level with Amazon R...
Taking the Performance of your Data Warehouse to the Next Level with Amazon R...Taking the Performance of your Data Warehouse to the Next Level with Amazon R...
Taking the Performance of your Data Warehouse to the Next Level with Amazon R...
 

Viewers also liked

WebTrends Partner Program Guide
WebTrends Partner Program GuideWebTrends Partner Program Guide
WebTrends Partner Program Guide
Abed Farhan
 
How Lifecycle Marketing is Transforming Marketing Automation
How Lifecycle Marketing is Transforming Marketing AutomationHow Lifecycle Marketing is Transforming Marketing Automation
How Lifecycle Marketing is Transforming Marketing Automation
Right On Interactive
 
E-commerce Berlin Expo 2017 - Cross Border Ecommerce: Making the Most of Chin...
E-commerce Berlin Expo 2017 - Cross Border Ecommerce: Making the Most of Chin...E-commerce Berlin Expo 2017 - Cross Border Ecommerce: Making the Most of Chin...
E-commerce Berlin Expo 2017 - Cross Border Ecommerce: Making the Most of Chin...
E-Commerce Berlin EXPO
 
Svendsen wevideo social media days 2014 oslo
Svendsen wevideo social media days 2014 osloSvendsen wevideo social media days 2014 oslo
Svendsen wevideo social media days 2014 oslo
Nils Petter Nordskar
 
On Target 2014, Christopher Engman, Vendemore
On Target 2014, Christopher Engman, VendemoreOn Target 2014, Christopher Engman, Vendemore
On Target 2014, Christopher Engman, Vendemore
Vendemore [A Bisnode Company]
 
On target 2015, Christopher Engman, Vendemore
On target 2015, Christopher Engman, VendemoreOn target 2015, Christopher Engman, Vendemore
On target 2015, Christopher Engman, Vendemore
Vendemore [A Bisnode Company]
 

Viewers also liked (6)

WebTrends Partner Program Guide
WebTrends Partner Program GuideWebTrends Partner Program Guide
WebTrends Partner Program Guide
 
How Lifecycle Marketing is Transforming Marketing Automation
How Lifecycle Marketing is Transforming Marketing AutomationHow Lifecycle Marketing is Transforming Marketing Automation
How Lifecycle Marketing is Transforming Marketing Automation
 
E-commerce Berlin Expo 2017 - Cross Border Ecommerce: Making the Most of Chin...
E-commerce Berlin Expo 2017 - Cross Border Ecommerce: Making the Most of Chin...E-commerce Berlin Expo 2017 - Cross Border Ecommerce: Making the Most of Chin...
E-commerce Berlin Expo 2017 - Cross Border Ecommerce: Making the Most of Chin...
 
Svendsen wevideo social media days 2014 oslo
Svendsen wevideo social media days 2014 osloSvendsen wevideo social media days 2014 oslo
Svendsen wevideo social media days 2014 oslo
 
On Target 2014, Christopher Engman, Vendemore
On Target 2014, Christopher Engman, VendemoreOn Target 2014, Christopher Engman, Vendemore
On Target 2014, Christopher Engman, Vendemore
 
On target 2015, Christopher Engman, Vendemore
On target 2015, Christopher Engman, VendemoreOn target 2015, Christopher Engman, Vendemore
On target 2015, Christopher Engman, Vendemore
 

Similar to 2 one spot redshift bigdatacamp 1.02

Data warehousing in the era of Big Data: Deep Dive into Amazon Redshift
Data warehousing in the era of Big Data: Deep Dive into Amazon RedshiftData warehousing in the era of Big Data: Deep Dive into Amazon Redshift
Data warehousing in the era of Big Data: Deep Dive into Amazon Redshift
Amazon Web Services
 
AWS Summit London 2014 | Scaling on AWS for the First 10 Million Users (200)
AWS Summit London 2014 | Scaling on AWS for the First 10 Million Users (200)AWS Summit London 2014 | Scaling on AWS for the First 10 Million Users (200)
AWS Summit London 2014 | Scaling on AWS for the First 10 Million Users (200)
Amazon Web Services
 
Scaling on AWS for the First 10 Million Users
Scaling on AWS for the First 10 Million UsersScaling on AWS for the First 10 Million Users
Scaling on AWS for the First 10 Million Users
Amazon Web Services
 
AWS re:Invent 2016: AWS Database State of the Union (DAT320)
AWS re:Invent 2016: AWS Database State of the Union (DAT320)AWS re:Invent 2016: AWS Database State of the Union (DAT320)
AWS re:Invent 2016: AWS Database State of the Union (DAT320)
Amazon Web Services
 
Big Data & Analytics - Innovating at the Speed of Light
Big Data & Analytics - Innovating at the Speed of LightBig Data & Analytics - Innovating at the Speed of Light
Big Data & Analytics - Innovating at the Speed of Light
Amazon Web Services LATAM
 
AWS as a Data Platform for Cloud and On-Premises Workloads | AWS Public Secto...
AWS as a Data Platform for Cloud and On-Premises Workloads | AWS Public Secto...AWS as a Data Platform for Cloud and On-Premises Workloads | AWS Public Secto...
AWS as a Data Platform for Cloud and On-Premises Workloads | AWS Public Secto...
Amazon Web Services
 
Scaling on AWS to the First 10 Million Users
Scaling on AWS to the First 10 Million Users Scaling on AWS to the First 10 Million Users
Scaling on AWS to the First 10 Million Users
mauerbac
 
A3 transforming data_management_in_the_cloud
A3 transforming data_management_in_the_cloudA3 transforming data_management_in_the_cloud
A3 transforming data_management_in_the_cloud
Dr. Wilfred Lin (Ph.D.)
 
Scaling on AWS for the First 10 Million Users
Scaling on AWS for the First 10 Million UsersScaling on AWS for the First 10 Million Users
Scaling on AWS for the First 10 Million Users
Amazon Web Services
 
Database and Analytics on the AWS Cloud
Database and Analytics on the AWS CloudDatabase and Analytics on the AWS Cloud
Database and Analytics on the AWS Cloud
Amazon Web Services
 
Amazon Redshift with Full 360 Inc.
Amazon Redshift with Full 360 Inc.Amazon Redshift with Full 360 Inc.
Amazon Redshift with Full 360 Inc.
Amazon Web Services
 
Amazon Redshift in Action: Enterprise, Big Data, and SaaS Use Cases (DAT205) ...
Amazon Redshift in Action: Enterprise, Big Data, and SaaS Use Cases (DAT205) ...Amazon Redshift in Action: Enterprise, Big Data, and SaaS Use Cases (DAT205) ...
Amazon Redshift in Action: Enterprise, Big Data, and SaaS Use Cases (DAT205) ...
Amazon Web Services
 
Using Data Lakes
Using Data Lakes Using Data Lakes
Using Data Lakes
Amazon Web Services
 
Weathering the Data Storm – How SnapLogic and AWS Deliver Analytics in the Cl...
Weathering the Data Storm – How SnapLogic and AWS Deliver Analytics in the Cl...Weathering the Data Storm – How SnapLogic and AWS Deliver Analytics in the Cl...
Weathering the Data Storm – How SnapLogic and AWS Deliver Analytics in the Cl...
SnapLogic
 
Data Warehouse Modernization - Big Data in the Cloud Success with Qubole on O...
Data Warehouse Modernization - Big Data in the Cloud Success with Qubole on O...Data Warehouse Modernization - Big Data in the Cloud Success with Qubole on O...
Data Warehouse Modernization - Big Data in the Cloud Success with Qubole on O...
Qubole
 
Deep Dive: Scaling Up to Your First 10 Million Users
Deep Dive: Scaling Up to Your First 10 Million UsersDeep Dive: Scaling Up to Your First 10 Million Users
Deep Dive: Scaling Up to Your First 10 Million Users
Amazon Web Services
 
Building_a_Modern_Data_Platform_in_the_Cloud.pdf
Building_a_Modern_Data_Platform_in_the_Cloud.pdfBuilding_a_Modern_Data_Platform_in_the_Cloud.pdf
Building_a_Modern_Data_Platform_in_the_Cloud.pdf
Amazon Web Services
 
Scaling on AWS for the First 10 Million Users at Websummit Dublin
Scaling on AWS for the First 10 Million Users at Websummit DublinScaling on AWS for the First 10 Million Users at Websummit Dublin
Scaling on AWS for the First 10 Million Users at Websummit Dublin
Amazon Web Services
 
Scaling on AWS for the First 10 Million Users at Websummit Dublin
Scaling on AWS for the First 10 Million Users at Websummit DublinScaling on AWS for the First 10 Million Users at Websummit Dublin
Scaling on AWS for the First 10 Million Users at Websummit Dublin
Ian Massingham
 
Welcome & AWS Big Data Solution Overview
Welcome & AWS Big Data Solution OverviewWelcome & AWS Big Data Solution Overview
Welcome & AWS Big Data Solution Overview
Amazon Web Services
 

Similar to 2 one spot redshift bigdatacamp 1.02 (20)

Data warehousing in the era of Big Data: Deep Dive into Amazon Redshift
Data warehousing in the era of Big Data: Deep Dive into Amazon RedshiftData warehousing in the era of Big Data: Deep Dive into Amazon Redshift
Data warehousing in the era of Big Data: Deep Dive into Amazon Redshift
 
AWS Summit London 2014 | Scaling on AWS for the First 10 Million Users (200)
AWS Summit London 2014 | Scaling on AWS for the First 10 Million Users (200)AWS Summit London 2014 | Scaling on AWS for the First 10 Million Users (200)
AWS Summit London 2014 | Scaling on AWS for the First 10 Million Users (200)
 
Scaling on AWS for the First 10 Million Users
Scaling on AWS for the First 10 Million UsersScaling on AWS for the First 10 Million Users
Scaling on AWS for the First 10 Million Users
 
AWS re:Invent 2016: AWS Database State of the Union (DAT320)
AWS re:Invent 2016: AWS Database State of the Union (DAT320)AWS re:Invent 2016: AWS Database State of the Union (DAT320)
AWS re:Invent 2016: AWS Database State of the Union (DAT320)
 
Big Data & Analytics - Innovating at the Speed of Light
Big Data & Analytics - Innovating at the Speed of LightBig Data & Analytics - Innovating at the Speed of Light
Big Data & Analytics - Innovating at the Speed of Light
 
AWS as a Data Platform for Cloud and On-Premises Workloads | AWS Public Secto...
AWS as a Data Platform for Cloud and On-Premises Workloads | AWS Public Secto...AWS as a Data Platform for Cloud and On-Premises Workloads | AWS Public Secto...
AWS as a Data Platform for Cloud and On-Premises Workloads | AWS Public Secto...
 
Scaling on AWS to the First 10 Million Users
Scaling on AWS to the First 10 Million Users Scaling on AWS to the First 10 Million Users
Scaling on AWS to the First 10 Million Users
 
A3 transforming data_management_in_the_cloud
A3 transforming data_management_in_the_cloudA3 transforming data_management_in_the_cloud
A3 transforming data_management_in_the_cloud
 
Scaling on AWS for the First 10 Million Users
Scaling on AWS for the First 10 Million UsersScaling on AWS for the First 10 Million Users
Scaling on AWS for the First 10 Million Users
 
Database and Analytics on the AWS Cloud
Database and Analytics on the AWS CloudDatabase and Analytics on the AWS Cloud
Database and Analytics on the AWS Cloud
 
Amazon Redshift with Full 360 Inc.
Amazon Redshift with Full 360 Inc.Amazon Redshift with Full 360 Inc.
Amazon Redshift with Full 360 Inc.
 
Amazon Redshift in Action: Enterprise, Big Data, and SaaS Use Cases (DAT205) ...
Amazon Redshift in Action: Enterprise, Big Data, and SaaS Use Cases (DAT205) ...Amazon Redshift in Action: Enterprise, Big Data, and SaaS Use Cases (DAT205) ...
Amazon Redshift in Action: Enterprise, Big Data, and SaaS Use Cases (DAT205) ...
 
Using Data Lakes
Using Data Lakes Using Data Lakes
Using Data Lakes
 
Weathering the Data Storm – How SnapLogic and AWS Deliver Analytics in the Cl...
Weathering the Data Storm – How SnapLogic and AWS Deliver Analytics in the Cl...Weathering the Data Storm – How SnapLogic and AWS Deliver Analytics in the Cl...
Weathering the Data Storm – How SnapLogic and AWS Deliver Analytics in the Cl...
 
Data Warehouse Modernization - Big Data in the Cloud Success with Qubole on O...
Data Warehouse Modernization - Big Data in the Cloud Success with Qubole on O...Data Warehouse Modernization - Big Data in the Cloud Success with Qubole on O...
Data Warehouse Modernization - Big Data in the Cloud Success with Qubole on O...
 
Deep Dive: Scaling Up to Your First 10 Million Users
Deep Dive: Scaling Up to Your First 10 Million UsersDeep Dive: Scaling Up to Your First 10 Million Users
Deep Dive: Scaling Up to Your First 10 Million Users
 
Building_a_Modern_Data_Platform_in_the_Cloud.pdf
Building_a_Modern_Data_Platform_in_the_Cloud.pdfBuilding_a_Modern_Data_Platform_in_the_Cloud.pdf
Building_a_Modern_Data_Platform_in_the_Cloud.pdf
 
Scaling on AWS for the First 10 Million Users at Websummit Dublin
Scaling on AWS for the First 10 Million Users at Websummit DublinScaling on AWS for the First 10 Million Users at Websummit Dublin
Scaling on AWS for the First 10 Million Users at Websummit Dublin
 
Scaling on AWS for the First 10 Million Users at Websummit Dublin
Scaling on AWS for the First 10 Million Users at Websummit DublinScaling on AWS for the First 10 Million Users at Websummit Dublin
Scaling on AWS for the First 10 Million Users at Websummit Dublin
 
Welcome & AWS Big Data Solution Overview
Welcome & AWS Big Data Solution OverviewWelcome & AWS Big Data Solution Overview
Welcome & AWS Big Data Solution Overview
 

More from BigDataCamp

Ingest, Transform & Visualize w Amazon Web Services
Ingest, Transform & Visualize w Amazon Web ServicesIngest, Transform & Visualize w Amazon Web Services
Ingest, Transform & Visualize w Amazon Web Services
BigDataCamp
 
BigDataCamp LA 2014 Schedule
BigDataCamp LA 2014 ScheduleBigDataCamp LA 2014 Schedule
BigDataCamp LA 2014 Schedule
BigDataCamp
 
5 kinesis lightning
5 kinesis lightning5 kinesis lightning
5 kinesis lightning
BigDataCamp
 
4 hadoop for-the-disillusioned
4 hadoop for-the-disillusioned4 hadoop for-the-disillusioned
4 hadoop for-the-disillusioned
BigDataCamp
 
3 analytic strategies shree dandekar dell 12-10-13
3 analytic strategies shree dandekar dell 12-10-133 analytic strategies shree dandekar dell 12-10-13
3 analytic strategies shree dandekar dell 12-10-13
BigDataCamp
 
1 big datacampdell2013
1 big datacampdell20131 big datacampdell2013
1 big datacampdell2013
BigDataCamp
 
Stefan Groschupf of Datameer Gives Lightning Talk at BigDataCamp
Stefan Groschupf of Datameer Gives Lightning Talk at BigDataCampStefan Groschupf of Datameer Gives Lightning Talk at BigDataCamp
Stefan Groschupf of Datameer Gives Lightning Talk at BigDataCamp
BigDataCamp
 
Richard Cole of Amazon Gives Lightning Tallk at BigDataCamp
Richard Cole of Amazon Gives Lightning Tallk at BigDataCampRichard Cole of Amazon Gives Lightning Tallk at BigDataCamp
Richard Cole of Amazon Gives Lightning Tallk at BigDataCamp
BigDataCamp
 
Stefan Groschupf of Datameer Gives Lightning Tallk at BigDataCamp
Stefan Groschupf of Datameer Gives Lightning Tallk at BigDataCampStefan Groschupf of Datameer Gives Lightning Tallk at BigDataCamp
Stefan Groschupf of Datameer Gives Lightning Tallk at BigDataCamp
BigDataCamp
 
Sam Charrington Of Appistry Gives Lighting Talk
Sam Charrington Of Appistry Gives Lighting TalkSam Charrington Of Appistry Gives Lighting Talk
Sam Charrington Of Appistry Gives Lighting Talk
BigDataCamp
 
Steve Woolege Of Aster Data Gives Lightning Talk At BigDataCamp
Steve Woolege Of Aster Data Gives Lightning Talk At BigDataCampSteve Woolege Of Aster Data Gives Lightning Talk At BigDataCamp
Steve Woolege Of Aster Data Gives Lightning Talk At BigDataCamp
BigDataCamp
 

More from BigDataCamp (11)

Ingest, Transform & Visualize w Amazon Web Services
Ingest, Transform & Visualize w Amazon Web ServicesIngest, Transform & Visualize w Amazon Web Services
Ingest, Transform & Visualize w Amazon Web Services
 
BigDataCamp LA 2014 Schedule
BigDataCamp LA 2014 ScheduleBigDataCamp LA 2014 Schedule
BigDataCamp LA 2014 Schedule
 
5 kinesis lightning
5 kinesis lightning5 kinesis lightning
5 kinesis lightning
 
4 hadoop for-the-disillusioned
4 hadoop for-the-disillusioned4 hadoop for-the-disillusioned
4 hadoop for-the-disillusioned
 
3 analytic strategies shree dandekar dell 12-10-13
3 analytic strategies shree dandekar dell 12-10-133 analytic strategies shree dandekar dell 12-10-13
3 analytic strategies shree dandekar dell 12-10-13
 
1 big datacampdell2013
1 big datacampdell20131 big datacampdell2013
1 big datacampdell2013
 
Stefan Groschupf of Datameer Gives Lightning Talk at BigDataCamp
Stefan Groschupf of Datameer Gives Lightning Talk at BigDataCampStefan Groschupf of Datameer Gives Lightning Talk at BigDataCamp
Stefan Groschupf of Datameer Gives Lightning Talk at BigDataCamp
 
Richard Cole of Amazon Gives Lightning Tallk at BigDataCamp
Richard Cole of Amazon Gives Lightning Tallk at BigDataCampRichard Cole of Amazon Gives Lightning Tallk at BigDataCamp
Richard Cole of Amazon Gives Lightning Tallk at BigDataCamp
 
Stefan Groschupf of Datameer Gives Lightning Tallk at BigDataCamp
Stefan Groschupf of Datameer Gives Lightning Tallk at BigDataCampStefan Groschupf of Datameer Gives Lightning Tallk at BigDataCamp
Stefan Groschupf of Datameer Gives Lightning Tallk at BigDataCamp
 
Sam Charrington Of Appistry Gives Lighting Talk
Sam Charrington Of Appistry Gives Lighting TalkSam Charrington Of Appistry Gives Lighting Talk
Sam Charrington Of Appistry Gives Lighting Talk
 
Steve Woolege Of Aster Data Gives Lightning Talk At BigDataCamp
Steve Woolege Of Aster Data Gives Lightning Talk At BigDataCampSteve Woolege Of Aster Data Gives Lightning Talk At BigDataCamp
Steve Woolege Of Aster Data Gives Lightning Talk At BigDataCamp
 

Recently uploaded

Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
tolgahangng
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
Edge AI and Vision Alliance
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
Wouter Lemaire
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Speck&Tech
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
Tomaz Bratanic
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
Tatiana Kojar
 
Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)
Jakub Marek
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
Webinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data WarehouseWebinar: Designing a schema for a Data Warehouse
Webinar: Designing a schema for a Data Warehouse
Federico Razzoli
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
SitimaJohn
 
Mariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceXMariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceX
Mariano Tinti
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
Chart Kalyan
 


2 one spot redshift bigdatacamp 1.02

  • 1. Amazon Redshift: How we managed 300 billion rows with no DBA
      Matt Cohen, Founder & President
      matt@onespot.com
      December 10th, 2013
  • 2. What is OneSpot?
      • OneSpot is a content advertising platform that distributes content as ads that people want to click on.
          – Fortune 2000 clients
          – Realtime ad exchange bidding
          – Adaptive machine learning
          – Seed funded until $5.3M Series A last month
      • Big data, big analysis
  • 3. What is Redshift?
      1. When light from a receding object appears shifted to the red end of the spectrum
          – A consequence of the expanding universe
      2. A cheap, fast, Petabyte-scale, managed SQL data warehouse service from Amazon Web Services
          – A consequence of the expanding cloud ecosystem
  • 4. Why Redshift?
      • Cheap
      • Fast
      • Petabyte-scale
      • Managed Service
      • SQL
      • Data Warehouse
      • From AWS
  • 5. SQL Data Warehouse
      • Based on the commercial ParAccel database
          – Which is based on Postgres
      • Standards-based tools and knowledge
      • Built for data warehousing
          – Column-oriented
          – Cluster architecture
          – Read optimized
          – No relational integrity
          – Almost no SQL extensions
  • 6. SQL Data Warehouse
      • Column-oriented
  • 7. SQL Data Warehouse
      • Column-oriented
      • 11 different compression techniques
  • 8. SQL Data Warehouse
      • Cluster architecture
  • 9. SQL Data Warehouse
      • Read optimized
          – Large block size (1MB)
          – Data replication: 2x live, 1x S3
      • No relational integrity
          – No indexes: sort and distribution keys
      • Almost no SQL extensions
  • 10. Fast = Cheap
      • Starts with 1 XL node
          – 85¢ an hour ($620/month) on demand
          – 50¢ an hour ($365) 1 year reserved
      • Benchmarks say:
          – Scales linearly
          – 5-10x faster than Hadoop/Hive
  • 11. Petabyte scale
      • Up to
          – 32 XL nodes (64 Terabytes)
          – 100 8XL nodes (1.6 Petabytes)
  • 12. Managed Service from AWS
      • Managed Service
          – Incredibly easy
          – Nice UI
          – Most SQL tools
      • From AWS
          – Free data transfer
          – Easy load from S3
          – Use AWS Data Pipeline
  • 13. The TL;DR
      • Pros
          – Standard SQL
          – Super easy
          – Very fast
          – Affordable
          – Integrates with AWS
          – No DBA
          – No Sysadmin
      • Cons
          – Standard SQL
          – Almost no SQL extensions
          – Best with Star Schema (big joins can be slow)
          – No MapReduce
          – Fixed columns
          – Consistency
          – 1.6 Petabyte limit
  • 14. Amazon Redshift: How we managed 300 billion rows with no DBA
      Matt Cohen, Founder & President
      matt@onespot.com
      December 10th, 2013
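
The pricing and capacity figures quoted on slides 10 and 11 can be sanity-checked with a little arithmetic. The sketch below is not from the deck. It assumes a 730-hour month (8,760 hours / 12) and decimal units (1 Petabyte = 1,000 Terabytes), neither of which the slides state, and the function names are purely illustrative. Under those assumptions the hourly rates reproduce the quoted monthly prices, and the cluster totals imply roughly 2 TB per XL node and 16 TB per 8XL node.

```python
# Assumption (not stated in the slides): a month is 8760 / 12 = 730 hours.
HOURS_PER_MONTH = 8760 / 12

# Assumption (not stated in the slides): decimal units, 1 PB = 1000 TB.
TB_PER_PB = 1000


def monthly_cost(hourly_rate_usd: float) -> float:
    """Monthly cost in dollars of one node billed at the given hourly rate."""
    return hourly_rate_usd * HOURS_PER_MONTH


def storage_per_node_tb(total_tb: float, nodes: int) -> float:
    """Per-node storage implied by a cluster's quoted total capacity."""
    return total_tb / nodes


if __name__ == "__main__":
    # Slide 10: 85 cents/hour on demand, 50 cents/hour with a 1-year reservation.
    print(f"On demand: ${monthly_cost(0.85):.2f} per month")
    print(f"Reserved:  ${monthly_cost(0.50):.2f} per month")

    # Slide 11: 32 XL nodes hold 64 TB; 100 8XL nodes hold 1.6 PB.
    print(f"XL node:   {storage_per_node_tb(64, 32):.1f} TB")
    print(f"8XL node:  {storage_per_node_tb(1.6 * TB_PER_PB, 100):.1f} TB")
```

The on-demand figure comes out near $620.50, which the slide rounds to $620, and the reserved figure is $365 per month, matching the slide.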