Amazon Redshift
Author: Douglas Bernardini
What is Redshift?
• Cloud-hosted data warehouse service from AWS
• Massively parallel processing (MPP)
• Analytics workloads on large-scale datasets, up to petabytes
• Data stored on a column-oriented DBMS principle
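To illustrate the column-oriented principle above, here is a minimal Python sketch. It is not Redshift's actual storage engine, and the sample rows and field names are made up for illustration. It pivots rows into columns and shows that an aggregate over one field only needs to read that field's values.

```python
# Hypothetical sample data; the field names are illustrative only.
rows = [
    {"song_id": 1, "artist": "A", "duration": 210.0},
    {"song_id": 2, "artist": "B", "duration": 185.5},
    {"song_id": 3, "artist": "A", "duration": 240.5},
]


def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: every whole row is visited just to read one field.
total_row_layout = sum(row["duration"] for row in rows)

# Column layout: only the 'duration' column is read.
total_column_layout = sum(columns["duration"])

print(total_row_layout, total_column_layout)
```

Both layouts give the same answer; the difference is how much data has to be read to get it, which is what matters for analytic queries over wide tables.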
Features and Benefits
• Columnar storage
• Parallelizing queries
• Multiple nodes
• Custom JDBC and ODBC drivers
• Ready-made integrations:
  • Amazon S3
  • Amazon DynamoDB
  • Amazon Elastic MapReduce
  • Amazon Kinesis
  • Any SSH-enabled host
• Fault Tolerant
• Automated Backups
• Fast Restores
• Secure:
  • Encryption
  • Network Isolation
  • Audit and Compliance
• SQL friendly
MarketPlace
BI Tools
• Actian
• Actuate Corporation
• Birst
• Chartio
• ClearStory Data
• Dundas Data Visualization
• Infor
• Jaspersoft
• JReport
• Logi Analytics
• Looker (Software)
• MicroStrategy
• Pentaho
• Periscope.io
• Qlik
• Redrock BI
• SAS (software)
• SiSense
• Spotfire
• Tableau Software
Data Integration Tools
• Attunity
• FlyData
• Informatica
• SnapLogic
• Talend
• Xplenty
Data Load
DynamoDB Integration
DynamoDB Integration
Business Case
Data growing fast!
• Enterprise data is growing at an exponential rate
  • Structured and unstructured data
  • Data requirements change rapidly
• Cost to maintain data is prohibitive
  • Hardware not scalable
  • Expensive to support
• Business agility suffers
  • Reporting unable to change with the pace of business
  • Data silos create bottlenecks
Solution Proposal
• Leverage the flexibility of Amazon Web Services
  • Scalable
  • Flexible
  • Cost-Effective
• AWS Redshift
  • Data Warehouse
• AWS S3
  • Persistent Storage
• AWS Data Pipeline
  • Data Orchestration and ETL
• AWS EC2 / MySQL
  • Transaction Processing
• Qlik Sense Desktop
  • Business Intelligence Reporting
AWS Redshift
Petabyte-Scale Data Warehouse
• Optimized for data warehousing
  • Columnar Storage
  • Data Compression
  • Zone Maps to reduce I/O
• Scalable
  • Easily change the number of nodes
  • 1-32 node configurations
• Cost-Efficient
  • On-Demand pricing starts at $0.25/hr.
  • Run as low as $1,000 per TB/yr.
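The "Zone Maps to reduce I/O" point can be sketched as follows. This is a simplified illustration of the general technique, not Redshift's internal implementation: each block of column values keeps its minimum and maximum, and a range filter skips any block whose range cannot contain a match. The block contents and the function name are hypothetical.

```python
def build_zone_map(blocks):
    """Record (min, max) for each block of column values."""
    return [(min(block), max(block)) for block in blocks]


def scan_with_zone_map(blocks, zone_map, low, high):
    """Return values in [low, high] and the number of blocks actually read."""
    matches = []
    blocks_read = 0
    for block, (block_min, block_max) in zip(blocks, zone_map):
        # Skip the block if its value range cannot overlap the filter range.
        if block_max < low or block_min > high:
            continue
        blocks_read += 1
        matches.extend(v for v in block if low <= v <= high)
    return matches, blocks_read


# Hypothetical column data, split into three blocks of sorted values.
blocks = [[1, 5, 9], [10, 14, 19], [20, 25, 29]]
zone_map = build_zone_map(blocks)

matches, blocks_read = scan_with_zone_map(blocks, zone_map, 12, 21)
print(matches, blocks_read)
```

Here the first block is never read because its maximum is below the filter range. The benefit depends on the data being roughly ordered on the filtered column; with randomly scattered values most blocks would overlap the range and little would be skipped.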
AWS Redshift
Petabyte-Scale Data Warehouse
• Get Started in Minutes
  • Web Console
  • CLI
• Fully Managed
  • Fault Tolerant
  • Automated Backups / Fast Restores
• Encryption
  • Data at Rest – AES-256
  • Can manage own keys
• Compatible
  • SQL
  • Data Integrations
AWS Simple Storage Service (S3)
Online File/Object Storage
• Durable
  • Data redundantly stored across multiple facilities/devices
• Available
  • 99.99% availability
  • Choose from different AWS regions
• Secure
  • SSL – Data Transfer
  • At Rest – Auto-Encrypted
• Scalable
  • Flexible capacity based on data demands
• Low Cost
  • Pay for what you use
AWS Data Pipeline
Data Processing and Transfer Platform
• Reliable
  • Distributed infrastructure ensures activity completion
  • Integrated with SNS for event notifications
• Simple
  • Drag-and-drop console
  • Pre-built templates for other AWS services
  • Visual pipeline editor
• Scalable
  • Dispatch work to one machine or many
  • Serial and/or parallel processing
• Low Cost
  • Charged per pipeline
    • Frequency
    • Volume
AWS Elastic Compute Cloud (EC2) + MySQL
Cloud Infrastructure for Applications & Development
• Flexible
  • Linux and Windows virtual machines
  • Supports multiple instance types, software packages, resource configs
• Elastic
  • Increase/Decrease capacity within minutes
  • Commission any number of server instances simultaneously
• Secure
  • Security Groups / Network ACLs
  • VPC / VPN
• Low Cost
  • On-Demand / Reserved / Spot Instance options
Qlik Sense Desktop
Data Visualization / BI Tool
• Drag-and-drop Visualizations
• Smart Search
• Explore multiple data sources in a single dashboard/report
• Access analytics on multiple device types
• Collaborate and share insights within reports
• Enables self-service simplicity
Architecture
Demo
Tech Demo
• This demonstration covers the setup and use of Amazon Redshift as an on-demand, cloud-based data warehouse solution.
• Our sample data comes from the “Million Song Dataset” available from Columbia University - http://labrosa.ee.columbia.edu/millionsong/
• The BI tool used to create a business-focused dashboard is Qlik Sense Desktop, a Windows-based desktop application - http://www.qlik.com/us/explore/products/sense
• In addition, the following services in the Amazon Web Services stack are used: Amazon Redshift, Amazon S3, Data Pipeline, and EC2 (a Linux AMI running MySQL serves as the transactional database for the demo).
Demo Steps
1. Create new Linux AMI that will host MySQL for transaction data processing.
  • Start new Linux instance and update security groups for MySQL accessibility
  • Install MySQL
  • Create new MySQL users and database, and populate with demonstration dataset (using MySQL Workbench)
2. Create new S3 bucket for Pipeline ETL processes
3. Create Redshift Cluster (data warehouse)
  • Instantiate cluster
  • Connect using SQL Workbench (via JDBC)
  • Create initial data table
4. Create AWS Pipeline(s) for data processing
  • MySQL -> S3
    • Activate Pipeline for initial ETL from MySQL to S3
  • S3 -> Redshift
    • Activate Pipeline for initial ETL from S3 to Redshift
5. Install Qlik Sense Desktop
  • Install Redshift ODBC Drivers locally on desktop
  • Create Qlik Sense “Report” (Included in FP submission for simplicity). Verify initial data in report.
6. Solution Demonstration (Using Amazon CLI – Command Line Interface)
  • Simulate transactional data load in MySQL
  • Verify new data (record count) in MySQL using MySQL Workbench
  • Delete initial data in S3 bucket (from Round 1)
  • Trigger AWS Pipeline that loads data to S3 from MySQL
  • Verify data load (CSV file) in S3 bucket
  • Trigger AWS Pipeline that loads data to Redshift from S3.
  • Verify data load in Redshift (using SQL Workbench)
  • Refresh Qlik report to view analytics of initial data load.
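In the demo the S3 -> Redshift load is run by an AWS Pipeline. As a manual illustration of the same step, the sketch below builds a Redshift COPY statement that loads a CSV file from S3. The bucket name, object key, and IAM role ARN are hypothetical placeholders, and the helper function is an assumption of this sketch, not part of the demo.

```python
def build_copy_statement(table, bucket, key, iam_role_arn):
    """Build a Redshift COPY statement that loads one CSV object from S3.

    The arguments are assumed to be trusted constants; this helper does
    no escaping and should not be fed user input.
    """
    return (
        f"COPY {table} "
        f"FROM 's3://{bucket}/{key}' "
        f"IAM_ROLE '{iam_role_arn}' "
        "FORMAT AS CSV;"
    )


# All values below are placeholders for illustration.
statement = build_copy_statement(
    table="song_data",
    bucket="example-pipeline-bucket",
    key="exports/song_data.csv",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
print(statement)
```

The resulting statement could be run from any SQL client connected to the cluster, such as the SQL Workbench session from step 3.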
Linux AMI hosts MySQL
Redshift Cluster
Pipelines
Qlik Sense Desktop
Add New data into MySQL
Insert songs_data
Count (*)
Checking Redshift
SELECT COUNT(*) FROM song_data;
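The record-count checks in this part of the demo amount to comparing the source and target tables. The sketch below uses two in-memory SQLite databases as stand-ins for MySQL and Redshift, an assumption made so that the example runs anywhere; the demo itself used MySQL Workbench and SQL Workbench. The table columns and sample rows are hypothetical.

```python
import sqlite3


def make_db(rows):
    """Create an in-memory stand-in database with a small song_data table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE song_data (song_id INTEGER, title TEXT)")
    conn.executemany("INSERT INTO song_data VALUES (?, ?)", rows)
    return conn


def row_count(conn, table):
    """Return the number of rows in a table (table name is a trusted constant)."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


sample_rows = [(1, "First"), (2, "Second"), (3, "Third")]
source = make_db(sample_rows)  # stands in for MySQL
target = make_db(sample_rows)  # stands in for Redshift

source_count = row_count(source, "song_data")
target_count = row_count(target, "song_data")
counts_match = source_count == target_count
print(source_count, target_count, counts_match)
```

Matching counts only show that the same number of rows arrived; they do not prove the row contents are identical.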
Qlik Update
Results
• Amazon Web Services provides a powerful platform to extend on-premises infrastructure to the cloud
  • Enables massive data consolidation
  • Efficient ETL orchestration & workflow
  • Simplifies resource management and drives down computing costs across multiple services
• Changing needs of business executives can be met quickly and efficiently
  • AWS supports industry-standard data source connections
  • Existing reporting/dashboards can consume AWS Redshift data with no code changes
douglas.bernardini@d2-data.com
Questions?
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 

AWS Redshift Data Warehouse

  • 2. What is Redshift?
      • Cloud-hosted data warehouse service from AWS
      • Massively parallel processing (MPP)
      • Analytics workloads on large-scale datasets
      • Data stored on a column-oriented DBMS principle
      • Large-scale datasets, up to petabytes
  • 3. Features and Benefits
      • Columnar storage
      • Parallelized queries
      • Multiple nodes
      • Custom JDBC and ODBC drivers
      • Ready integrated with:
          • Amazon S3
          • Amazon DynamoDB
          • Amazon Elastic MapReduce
          • Amazon Kinesis
          • Any SSH-enabled host
      • Fault tolerant
      • Automated backups
      • Fast restores
      • Secure:
          • Encryption
          • Network isolation
          • Audit and compliance
      • SQL friendly
  • 4. Marketplace
      • BI Tools
          • Actian
          • Actuate Corporation
          • Birst
          • Chartio
          • ClearStory Data
          • Dundas Data Visualization
          • Infor
          • Jaspersoft
          • Jreport
          • Logi Analytics
          • Looker (Software)
          • MicroStrategy
          • Pentaho
          • Periscope.io
          • Qlik
          • Redrock BI
          • SAS (software)
          • SiSense
          • Spotfire
          • Tableau Software
      • Data Integration Tools
          • Attunity
          • FlyData
          • Informatica
          • SnapLogic
          • Talend
          • Xplenty
  • 9. Data growing fast!
      • Enterprise data is growing at an exponential rate
      • Structured and unstructured data
      • Data requirements change rapidly
      • Cost to maintain data is prohibitive
      • Hardware not scalable
      • Expensive to support
      • Business agility suffers
      • Reporting unable to change with the pace of business
      • Data silos create bottlenecks
  • 10. Solution Proposal
      • Leverage the flexibility of Amazon Web Services
          • Scalable
          • Flexible
          • Cost-effective
      • AWS Redshift
          • Data warehouse
      • AWS S3
          • Persistent storage
      • AWS Data Pipeline
          • Data orchestration and ETL
      • AWS EC2 / MySQL
          • Transaction processing
      • Qlik Sense Desktop
          • Business intelligence reporting
  • 11. AWS Redshift: Petabyte-Scale Data Warehouse
      • Optimized for data warehousing
          • Columnar storage
          • Data compression
          • Zone maps to reduce I/O
      • Scalable
          • Easily change the number of nodes
          • 1-32 node configurations
      • Cost-efficient
          • On-demand pricing starts at $0.25/hr
          • Run for as low as $1,000 per TB per year
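The "zone maps to reduce I/O" bullet can be illustrated with a small sketch. The idea is that each storage block keeps the minimum and maximum of the values it holds, so a scan can skip any block whose range cannot contain the value being searched for. The `Block` class, the `scan_equal` function, and the sample year values below are hypothetical and only model the concept; they are not Redshift internals.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Block:
    """A toy column block that records min/max metadata (its 'zone map')."""
    values: List[int]
    lo: int = field(init=False)
    hi: int = field(init=False)

    def __post_init__(self) -> None:
        # The zone map is computed once, when the block is written.
        self.lo = min(self.values)
        self.hi = max(self.values)


def scan_equal(blocks: List[Block], target: int) -> Tuple[int, int]:
    """Count rows equal to target; return (matches, blocks actually read)."""
    matches = 0
    blocks_read = 0
    for block in blocks:
        # Skip the block entirely if target is outside its [lo, hi] range.
        if target < block.lo or target > block.hi:
            continue
        blocks_read += 1
        matches += sum(1 for v in block.values if v == target)
    return matches, blocks_read


# Hypothetical column of release years, stored in sorted order across blocks.
blocks = [
    Block([1990, 1991, 1992]),
    Block([1993, 1994, 1995]),
    Block([1996, 1997, 1997]),
]

matches, blocks_read = scan_equal(blocks, 1997)
print(f"matches={matches}, blocks read={blocks_read} of {len(blocks)}")
```

The benefit depends on how well the data is sorted: if matching values were scattered across every block, no block could be skipped.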
  • 12. AWS Redshift: Petabyte-Scale Data Warehouse
      • Get started in minutes
          • Web console
          • CLI
      • Fully managed
          • Fault tolerant
          • Automated backups / fast restores
      • Encryption
          • Data at rest: AES-256
          • Can manage own keys
      • Compatible
          • SQL
          • Data integrations
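As a rough illustration of the CLI route, the sketch below assembles an `aws redshift create-cluster` command as an argument list. The cluster identifier, node type, user name, and password are placeholder assumptions, and the 1-32 node check simply mirrors the range quoted in this deck (actual limits depend on the node type). Nothing is executed here; the command is only printed.

```python
import shlex
from typing import List


def create_cluster_args(cluster_id: str, node_type: str, nodes: int,
                        master_user: str, master_password: str,
                        encrypted: bool = True) -> List[str]:
    """Build the argv for `aws redshift create-cluster` (not executed here)."""
    # Range taken from the slide; real limits vary by node type.
    if not 1 <= nodes <= 32:
        raise ValueError("nodes must be between 1 and 32")
    args = [
        "aws", "redshift", "create-cluster",
        "--cluster-identifier", cluster_id,
        "--node-type", node_type,
        "--master-username", master_user,
        "--master-user-password", master_password,
    ]
    if nodes == 1:
        args += ["--cluster-type", "single-node"]
    else:
        args += ["--cluster-type", "multi-node",
                 "--number-of-nodes", str(nodes)]
    if encrypted:
        # Request encryption of data at rest.
        args.append("--encrypted")
    return args


# All values below are placeholders for illustration only.
args = create_cluster_args("demo-cluster", "dc2.large", 2,
                           "demo_admin", "Example-Passw0rd")
print(shlex.join(args))
```

In practice the password would come from a secret store or an interactive prompt rather than being written into a script.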
  • 13. AWS Simple Storage Service (S3): Online File/Object Storage
      • Durable
          • Data redundantly stored across multiple facilities/devices
      • Available
          • 99.99% availability
          • Choose from different AWS regions
      • Secure
          • SSL for data transfer
          • Auto-encrypted at rest
      • Scalable
          • Flexible capacity based on data demands
      • Low cost
          • Pay for what you use
  • 14. AWS Data Pipeline: Data Processing and Transfer Platform (the slide repeats the S3 title, but its content describes Data Pipeline)
      • Reliable
          • Distributed infrastructure ensures activity completion
          • Integrated with SNS for event notifications
      • Simple
          • Drag-and-drop console
          • Pre-built templates for other AWS services
          • Visual pipeline editor
      • Scalable
          • Dispatch work to one machine or many
          • Serial and/or parallel processing
      • Low cost
          • Charged per pipeline
          • Frequency
          • Volume
  • 15. AWS Elastic Compute Cloud (EC2) + MySQL: Cloud Infrastructure for Applications & Development
      • Flexible
          • Linux and Windows virtual machines
          • Supports multiple instance types, software packages, resource configs
      • Elastic
          • Increase/decrease capacity within minutes
          • Commission any number of server instances simultaneously
      • Secure
          • Security groups / network ACLs
          • VPC / VPN
      • Low cost
          • On-Demand / Reserved / Spot Instance options
  • 16. Qlik Sense Desktop: Data Visualization / BI Tool
      • Drag-and-drop visualizations
      • Smart search
      • Explore multiple data sources in a single dashboard/report
      • Access analytics on multiple device types
      • Collaborate and share insights within reports
      • Enables self-service simplicity
  • 19. Tech Demo
      • During this demonstration, we will discuss the setup and execution of using Amazon Redshift as an on-demand, cloud-based data warehouse solution.
      • Our sample data comes from the “Million Song Dataset” available from Columbia University - http://labrosa.ee.columbia.edu/millionsong/
      • The BI tool used to create a business-focused dashboard is Qlik Sense Desktop, a Windows-based desktop application - http://www.qlik.com/us/explore/products/sense
      • In addition, the following services in the Amazon Web Services stack are used: Amazon Redshift, Amazon S3, Pipeline, and EC2 (a Linux AMI running MySQL serves as a transactional database for the demo).
  • 20. Demo Steps
      1. Create new Linux AMI that will host MySQL for transaction data processing.
          • Start new Linux instance and update security groups for MySQL accessibility
          • Install MySQL
          • Create new MySQL users and database, and populate with demonstration dataset (using MySQL Workbench)
      2. Create new S3 bucket for Pipeline ETL processes
      3. Create Redshift cluster (data warehouse)
          • Instantiate cluster
          • Connect using SQL Workbench (via JDBC)
          • Create initial data table
      4. Create AWS Pipeline(s) for data processing
          • MySQL -> S3
          • Activate Pipeline for initial ETL from MySQL to S3
          • S3 -> Redshift
          • Activate Pipeline for initial ETL from S3 to Redshift
      5. Install Qlik Sense Desktop
          • Install Redshift ODBC drivers locally on desktop
          • Create Qlik Sense “Report” (included in FP submission for simplicity). Verify initial data in report.
      6. Solution demonstration (using Amazon CLI, the command line interface)
          • Simulate transactional data load in MySQL
          • Verify new data (record count) in MySQL using MySQL Workbench
          • Delete initial data in S3 bucket (from Round 1)
          • Trigger AWS Pipeline that loads data to S3 from MySQL
          • Verify data load (CSV file) in S3 bucket
          • Trigger AWS Pipeline that loads data to Redshift from S3
          • Verify data load in Redshift (using SQL Workbench)
          • Refresh Qlik report to view analytics of initial data load
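In the demo, the S3 -> Redshift load is handled by a Pipeline. For readers who want to see roughly what such a load looks like in SQL, the sketch below builds a Redshift `COPY` statement for a CSV file in S3. The bucket name, key prefix, and IAM role ARN are hypothetical placeholders, the header-row setting is an assumption about the CSV file, and the deck does not show the statement its Pipeline actually issues.

```python
import re


def build_copy_statement(table: str, bucket: str, key_prefix: str,
                         iam_role_arn: str, ignore_header: int = 1) -> str:
    """Return a Redshift COPY statement that loads CSV data from S3."""
    # Allow only simple identifiers so the table name cannot inject SQL.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError(f"unsafe table name: {table!r}")
    return (
        f"COPY {table}\n"
        f"FROM 's3://{bucket}/{key_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS CSV\n"
        f"IGNOREHEADER {int(ignore_header)};"
    )


# Placeholder bucket, prefix, and role; substitute real values before use.
statement = build_copy_statement(
    table="song_data",
    bucket="example-demo-bucket",
    key_prefix="exports/song_data.csv",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftCopyRole",
)
print(statement)
```

The resulting text could be run from any SQL client connected to the cluster, such as the SQL Workbench session mentioned in step 3.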
  • 21. Linux AMI hosts MySQL
  • 25. Add new data into MySQL: insert into songs_data, then check count(*)
  • 26. Checking Redshift: SELECT COUNT(*) FROM song_data
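The record-count checks on the MySQL and Redshift sides can be sketched as a single comparison. To keep the example self-contained, two in-memory SQLite databases stand in for the MySQL source and the Redshift target; the column names and sample rows are invented for illustration, and the same table name is used on both sides for simplicity.

```python
import sqlite3


def row_count(conn: sqlite3.Connection, table: str) -> int:
    """Run SELECT COUNT(*) against a table and return the count."""
    # Table names cannot be bound as parameters, so validate before formatting.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    cur = conn.cursor()
    cur.execute(f"SELECT COUNT(*) FROM {table}")
    return cur.fetchone()[0]


def make_stand_in(rows):
    """Create an in-memory database with a hypothetical song_data table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE song_data (title TEXT, artist TEXT)")
    conn.executemany("INSERT INTO song_data VALUES (?, ?)", rows)
    conn.commit()
    return conn


# Invented sample rows; the real demo uses the Million Song Dataset.
rows = [("Song A", "Artist 1"), ("Song B", "Artist 2"), ("Song C", "Artist 1")]
source = make_stand_in(rows)   # stands in for MySQL
target = make_stand_in(rows)   # stands in for Redshift

src_count = row_count(source, "song_data")
tgt_count = row_count(target, "song_data")
print("counts match" if src_count == tgt_count
      else f"mismatch: source={src_count}, target={tgt_count}")
```

With real connections, `row_count` would receive a MySQL or Redshift DB-API connection instead of the SQLite stand-ins; a matching count is a quick sanity check, not proof that the rows themselves are identical.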
  • 28. Results
      • Amazon Web Services provides a powerful platform to extend on-premises infrastructure to the cloud
          • Enables massive data consolidation
          • Efficient ETL orchestration and workflow
          • Simplifies resource management and drives down computing costs across multiple services
      • Changing needs of business executives can be met quickly and efficiently
          • AWS supports industry-standard data source connections
          • Existing reports and dashboards can consume AWS Redshift data with no code changes