SlideShare a Scribd company logo
Introduction to
Amazon Redshift
May, 2014
/Abdullah Cetin CAVDAR @accavdar
What's Amazon Redshift?
Amazon Redshift is a fast and powerful, fully
managed, petabyte-scale data warehouse service in
the cloud
https://aws.amazon.com/redshift/
Features
Petabyte scale, massively parallel
Relational data warehouse
Fully managed, zero admin
SSD and HDD platforms
$999/TB/Year
Architecture
Client Applications
Integrates with various data loading and ETL (Extract, Transform, and
Load) tools and business intelligence (BI) reporting, data mining, and
analytics tools
Redshift is based on industry-standard PostgreSQL, so most existing
SQL client applications will work with only minimal changes
Connections
Redshift communicates with client applications by using industry-
standard PostgreSQL JDBC and ODBC drivers
Clusters
A cluster is composed of one or more compute nodes
Leader Node coordinates the compute nodes and handles external
communication
Leader Node
Manage communications with client programs and communications
with compute nodes
Store metadata
Coordinate query execution
Compute Nodes
Execute the compiled code, send intermediate results back to the
leader node for final aggregation
It has own dedicated CPU, memory, and attached disk storage, which
are determined by the node type
Databases
A cluster contains one or more databases
User data is stored on the compute nodes
Amazon Redshift is a Relational Database Management System
(RDBMS)
Amazon Redshift is optimized for high-performance analysis and
reporting of very large datasets
Amazon Redshift is based on PostgreSQL
Redshift reduces I/O
Column storage - read data you need
Data compression - analyzes and compress your data
Zone Map
Keep track of minimum and maximum value for each block
Skip over blocks that don't contain data needed for a given query
Minimize unnecessary I/O
Direct attached storage
Hardware optimized for high performance data processing
Large data block sizes
Large block sizes to make the most of each read
Redshift runs on optimized
hardware
Optimized for I/O intensive workloads
High disk density
Runs in HPC - fast network
Redshift parallelizes and
distributes everything
Query
Load
Backup/Restore
Resize
Redshift is easy to use
Provision in minutes
Monitor query performance
Point and click resize
Built in security
Automatic backups
Redshift has security built-in
SSL to secure data in transit
Encryption to secure data at rest
AES 256 - hardware accelerated
All blocks on disk and in Amazon S3 encrypted
No direct access to compute nodes
Amazon VPC support
Redshift backs up your data
and recovers from failures
Replication within the cluster and backup to Amazon S3
Backup to Amazon S3 are continuous, automatic and incremental
Continuous monitoring and automated recovery from failures
Able to restore snapshots to any Availability Zone
Use Cases
Traditional Enterprise DW
Reduce costs by extending DW rather than adding HW
Migrate completely from existing DW systems
Respond faster to business
Companies with Big Data
Improve performance by an order of magnitude
Make more data available for analysis
Access business data via standard reporting tools
SaaS Companies
Add analytic functionality to applications
Scale DW capacity as demand grows
Reduce HW and SW costs by an order of magnitude
 Use Caseskillpages
Data Architecture
Redshift Implementation
High Storage Extra Large (XL) DW Node
ETL Activities
Approx. 90 minutes including exports from RDBMS, copying to S3,
loading stage tables, loading target tables, vacuuming and
analysing tables
Schema
Compression
Retention
DW Anatomy
Why Redshift works for
SkillPages?
Scale - MPP
Performance - Columnar data access and compression
Platform Integration - S3, Dynamo
Operational Advantages
Ease of Access
Cost
Best Practices
Avoid large number of singleton Data Manipulation Language (DML)
statements if possible
Use COPY for uploading large datasets
Choose SORT and DISTRIBUTION keys with care
Encode data and time with TIMESTAMP data type
Experiment with WLM (Workload Manager) settings
Slides
https://github.com/accavdar/AmazonRedshift
THE END
by Abdullah Cetin CAVDAR / @accavdar

More Related Content

What's hot

Running your database in the cloud presentation
Running your database in the cloud presentationRunning your database in the cloud presentation
Running your database in the cloud presentation
Aravindharamanan S
 
Scalable Data Analytics - DevDay Austin 2017 Day 2
Scalable Data Analytics - DevDay Austin 2017 Day 2Scalable Data Analytics - DevDay Austin 2017 Day 2
Scalable Data Analytics - DevDay Austin 2017 Day 2
Amazon Web Services
 
Caching with DynamoDB and DAX - DevDay Austin 2017 Day 2
Caching with DynamoDB and DAX - DevDay Austin 2017 Day 2Caching with DynamoDB and DAX - DevDay Austin 2017 Day 2
Caching with DynamoDB and DAX - DevDay Austin 2017 Day 2
Amazon Web Services
 
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Amazon Web Services
 
Data Design for Microservices - DevDay Austin 2017 Day 2
Data Design for Microservices - DevDay Austin 2017 Day 2Data Design for Microservices - DevDay Austin 2017 Day 2
Data Design for Microservices - DevDay Austin 2017 Day 2
Amazon Web Services
 

What's hot (20)

Deliver Your Modern Data Warehouse (Microsoft Tech Summit Oslo 2018)
Deliver Your Modern Data Warehouse (Microsoft Tech Summit Oslo 2018)Deliver Your Modern Data Warehouse (Microsoft Tech Summit Oslo 2018)
Deliver Your Modern Data Warehouse (Microsoft Tech Summit Oslo 2018)
 
Microsoft Azure Data Warehouse Overview
Microsoft Azure Data Warehouse OverviewMicrosoft Azure Data Warehouse Overview
Microsoft Azure Data Warehouse Overview
 
Microsoft: Building a Massively Scalable System with DataStax and Microsoft's...
Microsoft: Building a Massively Scalable System with DataStax and Microsoft's...Microsoft: Building a Massively Scalable System with DataStax and Microsoft's...
Microsoft: Building a Massively Scalable System with DataStax and Microsoft's...
 
Running your database in the cloud presentation
Running your database in the cloud presentationRunning your database in the cloud presentation
Running your database in the cloud presentation
 
Lecture1
Lecture1Lecture1
Lecture1
 
Scalable Data Analytics - DevDay Austin 2017 Day 2
Scalable Data Analytics - DevDay Austin 2017 Day 2Scalable Data Analytics - DevDay Austin 2017 Day 2
Scalable Data Analytics - DevDay Austin 2017 Day 2
 
Azure Cosmos DB Pricing 101 Infographic
Azure Cosmos DB Pricing 101 InfographicAzure Cosmos DB Pricing 101 Infographic
Azure Cosmos DB Pricing 101 Infographic
 
Caching with DynamoDB and DAX - DevDay Austin 2017 Day 2
Caching with DynamoDB and DAX - DevDay Austin 2017 Day 2Caching with DynamoDB and DAX - DevDay Austin 2017 Day 2
Caching with DynamoDB and DAX - DevDay Austin 2017 Day 2
 
AWS re:Invent 2016: Migrating a Highly Available and Scalable Database from O...
AWS re:Invent 2016: Migrating a Highly Available and Scalable Database from O...AWS re:Invent 2016: Migrating a Highly Available and Scalable Database from O...
AWS re:Invent 2016: Migrating a Highly Available and Scalable Database from O...
 
NoSQL Migration Technical Pitch Deck
NoSQL Migration Technical Pitch DeckNoSQL Migration Technical Pitch Deck
NoSQL Migration Technical Pitch Deck
 
Big data on AWS
Big data on AWSBig data on AWS
Big data on AWS
 
Easy Analytics on AWS with Amazon Redshift, Amazon QuickSight, and Amazon Mac...
Easy Analytics on AWS with Amazon Redshift, Amazon QuickSight, and Amazon Mac...Easy Analytics on AWS with Amazon Redshift, Amazon QuickSight, and Amazon Mac...
Easy Analytics on AWS with Amazon Redshift, Amazon QuickSight, and Amazon Mac...
 
Co 4, session 2, aws analytics services
Co 4, session 2, aws analytics servicesCo 4, session 2, aws analytics services
Co 4, session 2, aws analytics services
 
re:Invent Round-up, Time Stream, Quantum and Managed Blockchain
re:Invent Round-up, Time Stream, Quantum and Managed Blockchain re:Invent Round-up, Time Stream, Quantum and Managed Blockchain
re:Invent Round-up, Time Stream, Quantum and Managed Blockchain
 
2016 AWS Big Data Solution Days
2016 AWS Big Data Solution Days2016 AWS Big Data Solution Days
2016 AWS Big Data Solution Days
 
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
 
Data Design for Microservices - DevDay Austin 2017 Day 2
Data Design for Microservices - DevDay Austin 2017 Day 2Data Design for Microservices - DevDay Austin 2017 Day 2
Data Design for Microservices - DevDay Austin 2017 Day 2
 
Super charged prototyping
Super charged prototypingSuper charged prototyping
Super charged prototyping
 
Deep Dive on Amazon Redshift - AWS Summit Cape Town 2017
Deep Dive on Amazon Redshift - AWS Summit Cape Town 2017 Deep Dive on Amazon Redshift - AWS Summit Cape Town 2017
Deep Dive on Amazon Redshift - AWS Summit Cape Town 2017
 
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
 

Viewers also liked

Viewers also liked (7)

Introduction to Amazon Redshift and What's Next (DAT103) | AWS re:Invent 2013
Introduction to Amazon Redshift and What's Next (DAT103) | AWS re:Invent 2013Introduction to Amazon Redshift and What's Next (DAT103) | AWS re:Invent 2013
Introduction to Amazon Redshift and What's Next (DAT103) | AWS re:Invent 2013
 
Understanding AWS Database Options (DAT201) | AWS re:Invent 2013
Understanding AWS Database Options (DAT201) | AWS re:Invent 2013Understanding AWS Database Options (DAT201) | AWS re:Invent 2013
Understanding AWS Database Options (DAT201) | AWS re:Invent 2013
 
AWS re:Invent 2016: What’s New with Amazon Redshift (BDA304)
AWS re:Invent 2016: What’s New with Amazon Redshift (BDA304)AWS re:Invent 2016: What’s New with Amazon Redshift (BDA304)
AWS re:Invent 2016: What’s New with Amazon Redshift (BDA304)
 
(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift(DAT201) Introduction to Amazon Redshift
(DAT201) Introduction to Amazon Redshift
 
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...
AWS re:Invent 2016: Best Practices for Data Warehousing with Amazon Redshift ...
 
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
 
(BDT310) Big Data Architectural Patterns and Best Practices on AWS
(BDT310) Big Data Architectural Patterns and Best Practices on AWS(BDT310) Big Data Architectural Patterns and Best Practices on AWS
(BDT310) Big Data Architectural Patterns and Best Practices on AWS
 

Similar to Introduction to Amazon Redshift

Track 3 Session 6_打造應用專屬資料庫 (Purpose-built) 與了解託管服務優勢
Track 3 Session 6_打造應用專屬資料庫 (Purpose-built) 與了解託管服務優勢Track 3 Session 6_打造應用專屬資料庫 (Purpose-built) 與了解託管服務優勢
Track 3 Session 6_打造應用專屬資料庫 (Purpose-built) 與了解託管服務優勢
Amazon Web Services
 
State of the Union: Database & Analytics
State of the Union: Database & AnalyticsState of the Union: Database & Analytics
State of the Union: Database & Analytics
Amazon Web Services
 

Similar to Introduction to Amazon Redshift (20)

Databases - State of the Union
Databases - State of the UnionDatabases - State of the Union
Databases - State of the Union
 
Getting Started with Managed Database Services on AWS - AWS Summit Tel Aviv 2017
Getting Started with Managed Database Services on AWS - AWS Summit Tel Aviv 2017Getting Started with Managed Database Services on AWS - AWS Summit Tel Aviv 2017
Getting Started with Managed Database Services on AWS - AWS Summit Tel Aviv 2017
 
Datavail Accelerates AWS Adoption for Sony DADC New Media Solutions PPT
 Datavail Accelerates AWS Adoption for Sony DADC New Media Solutions PPT Datavail Accelerates AWS Adoption for Sony DADC New Media Solutions PPT
Datavail Accelerates AWS Adoption for Sony DADC New Media Solutions PPT
 
AWS reinvent 2019 recap - Riyadh - Database and Analytics - Assif Abbasi
AWS reinvent 2019 recap - Riyadh - Database and Analytics - Assif AbbasiAWS reinvent 2019 recap - Riyadh - Database and Analytics - Assif Abbasi
AWS reinvent 2019 recap - Riyadh - Database and Analytics - Assif Abbasi
 
AWS Big Data Solution Days
AWS Big Data Solution DaysAWS Big Data Solution Days
AWS Big Data Solution Days
 
Track 3 Session 6_打造應用專屬資料庫 (Purpose-built) 與了解託管服務優勢
Track 3 Session 6_打造應用專屬資料庫 (Purpose-built) 與了解託管服務優勢Track 3 Session 6_打造應用專屬資料庫 (Purpose-built) 與了解託管服務優勢
Track 3 Session 6_打造應用專屬資料庫 (Purpose-built) 與了解託管服務優勢
 
State of the Union: Database & Analytics
State of the Union: Database & AnalyticsState of the Union: Database & Analytics
State of the Union: Database & Analytics
 
Building Analytic Apps for SaaS: “Analytics as a Service”
Building Analytic Apps for SaaS: “Analytics as a Service”Building Analytic Apps for SaaS: “Analytics as a Service”
Building Analytic Apps for SaaS: “Analytics as a Service”
 
Azure Data Factory ETL Patterns in the Cloud
Azure Data Factory ETL Patterns in the CloudAzure Data Factory ETL Patterns in the Cloud
Azure Data Factory ETL Patterns in the Cloud
 
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
 
AWS-DMS-2023.pptx
AWS-DMS-2023.pptxAWS-DMS-2023.pptx
AWS-DMS-2023.pptx
 
What is Amazon Redshift?
What is Amazon Redshift?What is Amazon Redshift?
What is Amazon Redshift?
 
Introducing Azure SQL Data Warehouse
Introducing Azure SQL Data WarehouseIntroducing Azure SQL Data Warehouse
Introducing Azure SQL Data Warehouse
 
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
 
What's New in Amazon RDS for Open Source and Commercial Databases
What's New in Amazon RDS for Open Source and Commercial DatabasesWhat's New in Amazon RDS for Open Source and Commercial Databases
What's New in Amazon RDS for Open Source and Commercial Databases
 
How Globe Telecom does Primary Backups via StorReduce to the AWS Cloud
 How Globe Telecom does Primary Backups via StorReduce to the AWS Cloud How Globe Telecom does Primary Backups via StorReduce to the AWS Cloud
How Globe Telecom does Primary Backups via StorReduce to the AWS Cloud
 
Migrating Your Oracle Database to PostgreSQL - AWS Online Tech Talks
Migrating Your Oracle Database to PostgreSQL - AWS Online Tech TalksMigrating Your Oracle Database to PostgreSQL - AWS Online Tech Talks
Migrating Your Oracle Database to PostgreSQL - AWS Online Tech Talks
 
Migrating Your Oracle Database to PostgreSQL - AWS Online Tech Talks
Migrating Your Oracle Database to PostgreSQL - AWS Online Tech TalksMigrating Your Oracle Database to PostgreSQL - AWS Online Tech Talks
Migrating Your Oracle Database to PostgreSQL - AWS Online Tech Talks
 
Migrating your Databases to AWS: Deep Dive on Amazon RDS and AWS Database Mig...
Migrating your Databases to AWS: Deep Dive on Amazon RDS and AWS Database Mig...Migrating your Databases to AWS: Deep Dive on Amazon RDS and AWS Database Mig...
Migrating your Databases to AWS: Deep Dive on Amazon RDS and AWS Database Mig...
 
SQL Saturday Redmond 2019 ETL Patterns in the Cloud
SQL Saturday Redmond 2019 ETL Patterns in the CloudSQL Saturday Redmond 2019 ETL Patterns in the Cloud
SQL Saturday Redmond 2019 ETL Patterns in the Cloud
 

More from Abdullah Çetin ÇAVDAR (6)

Apache Spark 101
Apache Spark 101Apache Spark 101
Apache Spark 101
 
Big Data Tech Stack
Big Data Tech StackBig Data Tech Stack
Big Data Tech Stack
 
12-Factor App
12-Factor App12-Factor App
12-Factor App
 
Django Best Practices
Django Best PracticesDjango Best Practices
Django Best Practices
 
Internet of Things (IoT) and Google
Internet of Things (IoT) and GoogleInternet of Things (IoT) and Google
Internet of Things (IoT) and Google
 
Multi Screen Hell
Multi Screen HellMulti Screen Hell
Multi Screen Hell
 

Recently uploaded

Recently uploaded (20)

Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
In-Depth Performance Testing Guide for IT Professionals
In-Depth Performance Testing Guide for IT ProfessionalsIn-Depth Performance Testing Guide for IT Professionals
In-Depth Performance Testing Guide for IT Professionals
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
UiPath Test Automation using UiPath Test Suite series, part 1
UiPath Test Automation using UiPath Test Suite series, part 1UiPath Test Automation using UiPath Test Suite series, part 1
UiPath Test Automation using UiPath Test Suite series, part 1
 
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
 
IoT Analytics Company Presentation May 2024
IoT Analytics Company Presentation May 2024IoT Analytics Company Presentation May 2024
IoT Analytics Company Presentation May 2024
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Salesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone KomSalesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
Optimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityOptimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through Observability
 
AI revolution and Salesforce, Jiří Karpíšek
AI revolution and Salesforce, Jiří KarpíšekAI revolution and Salesforce, Jiří Karpíšek
AI revolution and Salesforce, Jiří Karpíšek
 
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya HalderCustom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
 

Introduction to Amazon Redshift