© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
BDA307: Amazon Redshift Update and How Equinox Fitness Clubs Migrated to a Modern Data Warehouse
Greg Khairallah, Amazon Web Services
Elliott Cordo, VP Data Analytics, Equinox
Ryan Kelly, Data Architect, Equinox
Agenda
Amazon Web Services
• Amazon Redshift overview
• Recently released & upcoming features
Equinox Fitness
• Becoming data driven
• Evolution of our data warehouse
• Future directions
AWS Analytics Portfolio
Collect | Store | Analyze
Services shown: Amazon Kinesis Data Firehose, AWS Direct Connect, AWS Snowball, Amazon Kinesis Data Analytics, Amazon Kinesis Data Streams, Amazon S3, Amazon Glacier, Amazon CloudSearch, Amazon RDS, Amazon Aurora, Amazon DynamoDB, Amazon ES, Amazon EMR, Amazon Redshift, Amazon QuickSight, AWS Database Migration Service, AWS Glue, Amazon Athena, Amazon AI
Amazon Redshift
10x faster at 1/10th the cost
Fast: Delivers fast results for all types of workloads
Cost-effective: No upfront costs, start small, and pay as you go
Secure: Audit everything; encrypt data end-to-end; extensive certification and compliance
Integrated: Integrated with Amazon S3 data lakes, AWS services, and third-party tools
Simple: Create and start using a data warehouse in minutes
Scalable: Gigabytes to petabytes to exabytes
Amazon Redshift Spectrum
Extend the data warehouse to your Amazon S3 data lake
Scale compute and storage separately
Join data across Amazon Redshift and Amazon S3
Exabyte-scale Amazon Redshift SQL queries against Amazon S3
Stable query performance and unlimited concurrency
Parquet, ORC, JSON, Grok, Avro, & CSV formats
Pay only for the amount of data scanned
[Diagram: Amazon Redshift data, Redshift Spectrum query engine, S3 data lake]
Thousands of Companies Run Mission Critical Workloads on Amazon Redshift
Forrester Wave Big Data Warehouse Q2 2017
“Amazon Redshift has the largest adoption of BDW in the cloud.”
“With more than 5,000 deployments, Amazon Redshift has the largest data warehouse deployments in the cloud – some over 10 petabytes in size.”
AWS received a score of 5/5 (the highest score possible) in the customer base, market awareness, ability to execute, road map, support, and partners criteria.
The Forrester Wave™ is copyrighted by Forrester Research, Inc. Forrester and Forrester Wave™ are trademarks of Forrester Research, Inc. The Forrester Wave™ is a graphical representation of Forrester's call on a market and is plotted using a detailed spreadsheet with exposed scores, weightings, and comments. Forrester does not endorse any vendor, product, or service depicted in the Forrester Wave. Information is based on best available resources. Opinions reflect judgment at the time and are subject to change.
Selected Amazon Redshift Partners
Data Integration | Systems Integrators | Business Intelligence
Recently Released Features
Performance | Ease of Use | Data Lake
Dense Compute Nodes (DC2)
2x the performance of DC1 at the same price
3x more I/O with 30% better storage utilization than DC1
Upgrade at no cost
NVMe SSD | DDR4 memory | Intel E5-2686 v4 (Broadwell)
“Amazon Redshift’s new DC2 node is giving us a 100 percent performance increase, allowing us to provide faster insights for our retailers, more cost effectively, to drive incremental revenue.”
Performance | Ease of Use | Data Lake
“Amazon Redshift allows us to quickly spin up clusters and provide our data scientists with a fast and easy method to access data and generate insights,” said Bradley Todd, Liberty Mutual’s Technology Architect. “We saw a 9x reduction in month-end reporting time with Amazon Redshift DC2 nodes as compared to DC1.”
“Analytical queries are 10 times faster in Amazon Redshift than they were with our previous data warehouse. Our data science team can get to the data faster and then analyze that data to find new ways to reduce costs, market products, and enable new business,” said Yuki Moritani, Manager, Innovation Management Department, NTT Docomo.
Short Query Acceleration
Express Lane for Short Queries
• Machine learning predicts the runtime of queries
• Short queries are routed to an express queue
• Resources are dynamically dedicated to short queries
• Enable it today from your AWS Management Console
How it works: [Diagram: analytics and BI / dashboard tools; Amazon Redshift; machine learning classifier]
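The routing idea on this slide can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the `Query` fields, the toy `predict_runtime_seconds` heuristic (a stand-in for the machine learning classifier), and the 5-second threshold are hypothetical and are not how Amazon Redshift implements the feature.

```python
from dataclasses import dataclass

# Hypothetical cutoff: queries predicted to finish within this many
# seconds are treated as "short". Not a documented Redshift value.
SHORT_QUERY_THRESHOLD_S = 5.0


@dataclass
class Query:
    sql: str
    scanned_rows: int  # hypothetical feature: estimated rows scanned
    num_joins: int     # hypothetical feature: number of joins in the plan


def predict_runtime_seconds(query: Query) -> float:
    """Toy stand-in for a trained runtime classifier (assumption)."""
    return query.scanned_rows / 1_000_000 + 2.0 * query.num_joins


def route(query: Query) -> str:
    """Send predicted-short queries to the express queue, others to standard."""
    if predict_runtime_seconds(query) <= SHORT_QUERY_THRESHOLD_S:
        return "express"
    return "standard"


dashboard_query = Query("SELECT ... FROM checkins", scanned_rows=500_000, num_joins=1)
batch_query = Query("SELECT ... FROM clickstream", scanned_rows=50_000_000, num_joins=3)
print(route(dashboard_query), route(batch_query))
```

The point of the design is that short dashboard-style queries do not wait behind long-running ones; the prediction only needs to be good enough to separate the two groups.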
Result-set Caching
Subsecond repeat queries
• Amazon Redshift customers can now serve 35% more queries on average, using the same compute resources
• Tens of thousands of compute hours are freed up daily to serve the remaining queries and data ingestion
• Transparent – it just works!
“With Amazon Redshift result caching, 20 percent of our queries now complete in less than one second,” said Greg Rokita, Executive Director for Technology, Edmunds
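As a rough illustration of why repeat queries can return in under a second, here is a minimal Python sketch of a result cache keyed by query text. The class name `ResultCache`, the whitespace-only normalization, and the manual `invalidate` call are assumptions for illustration; the slides do not describe how Amazon Redshift keys or invalidates its cache.

```python
class ResultCache:
    """Toy result-set cache: repeat queries skip execution entirely."""

    def __init__(self, run_query):
        self._run_query = run_query  # callable that actually executes SQL
        self._cache = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _normalize(sql):
        # Collapse whitespace so trivially different spellings share a key.
        # (A real system would need far more careful normalization.)
        return " ".join(sql.split())

    def execute(self, sql):
        key = self._normalize(sql)
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        result = self._run_query(sql)
        self._cache[key] = result
        return result

    def invalidate(self):
        """Drop cached results, e.g. after the underlying data changes."""
        self._cache.clear()


executed = []


def fake_run(sql):
    # Stand-in for the warehouse: records each real execution.
    executed.append(sql)
    return [("row",)]


cache = ResultCache(fake_run)
first = cache.execute("SELECT count(*) FROM checkins")
second = cache.execute("SELECT  count(*)   FROM checkins")  # served from cache
cache.invalidate()
third = cache.execute("SELECT count(*) FROM checkins")      # executed again
print(cache.hits, cache.misses, len(executed))
```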
Commit Enhancements
50% faster data commits for busy clusters
16% faster data ingestion and insertion
[Chart: Commit Duration Per Transaction for Busy Clusters; total commit time by month (Nov, Jan, Mar); series p99, p95, p90, p50, and linear trend of p99; ds2.8xlarge, cluster size 10 and up, us-west-2; clusters with more than 90 backups a day]
Query Performance Improvements
• Faster hash joins
  • Improvements to hash algorithm (Jan '18)
  • Significant improvement in memory utilization (Feb '18)
  • Cache line prefetching to improve join performance (Mar '18)
• Join-intensive workloads like TPC-H and TPC-DS show a performance improvement ranging from 28% to 2x for several queries
• 64x reduction of memory footprint fleet wide for hash joins and aggregations. Significant improvement to overall throughput
• Read and write queries can now hop WLM queues
Performance | Ease of Use | Data Lake
“With Amazon Redshift and Tableau, anyone in the company can set up any queries they like, from how users are reacting to a feature, to growth by demographic or geography, to the impact sales efforts had in different areas”
“Provides an easy-to-use mechanism for querying data with quick and uniform response times that analysts can use to run research projects and perform in-depth analysis…We don’t have to pre-allocate resources and can easily scale up to meet demand and then scale down for efficiency”
Amazon Redshift Advisor
Advisor provides automated recommendations to help you optimize database performance and decrease operating costs.
Shows up to seven recommendations to help you optimize your cluster.
Available via the Amazon Redshift console at no charge.
New Amazon CloudWatch Metrics for Easy Visualization of Cluster Performance
• Monitor the performance and health of your Amazon Redshift cluster with two new CloudWatch metrics, Query Throughput and Query Duration.
• Query Throughput measures the average number of queries completed per second. Query Duration measures the average time taken to complete a query. By observing these metrics, you can easily determine how your cluster is performing at any time.
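The two metric definitions above reduce to simple arithmetic. The following Python sketch computes them from a list of query durations; the function name `query_metrics` and the sample numbers are hypothetical, and the sketch does not call CloudWatch or reflect how the service aggregates its data points.

```python
def query_metrics(durations_s, window_seconds):
    """Return (throughput, average duration) for queries completed in a window.

    durations_s: run time in seconds of each query completed in the window.
    window_seconds: length of the observation window in seconds.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    # Query Throughput: average number of queries completed per second.
    throughput = len(durations_s) / window_seconds
    # Query Duration: average time taken to complete a query.
    avg_duration = sum(durations_s) / len(durations_s) if durations_s else 0.0
    return throughput, avg_duration


# Hypothetical sample: three queries completed during a 60-second window.
throughput, avg_duration = query_metrics([0.5, 1.5, 4.0], window_seconds=60)
print(throughput, avg_duration)
```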
Performance | Ease of Use | Data Lake
“Redshift Spectrum enables us to directly operate on our data in its native format in Amazon S3 with no preprocessing or transformation. Our data pipeline is much simpler now, and our execution time has been lowered significantly,” said Vladimir Barkov, Director of Data Architecture and Engineering at Time Inc.
“We use Redshift Spectrum for interactive online queries. The new DC2 node from Amazon Redshift has given us a 70 percent performance boost for running Redshift Spectrum queries. As a result, we can analyze far more data for our customers and deliver results much faster,” said Hyung-Joon Kim, Principal Software Engineer, BrandVerity.
Redshift Spectrum Enhancements
• Added support for processing scalar JSON and ION file formats in S3
  • In addition to Parquet, ORC, Avro, CSV, Grok, RCFile, RegexSerDe, OpenCSV, SequenceFile, TextFile, and TSV
• Support for DATE data type
• Support for IAM role-chaining to assume cross-account roles
• Nested data support
• COPY from Parquet and ORC files
Nested Data Support
• Analyze nested and semi-structured data in Amazon S3 with Spectrum
• Allows easy ETL of nested data into Amazon Redshift using CTAS
• Support for open file formats: Parquet, ORC, JSON, and Ion
• Uses dot notation to extend your existing SQL
Example: Find click frequency for links on “/home”:
s3data.clickStream: <<
  { "session_time": "20171013 14:05:00",
    "clicks": [ {"page": "/home", "referrer": ""},
                {"page": "/products", "referrer": "/home"} ]
  },
  { "session_time": "20171013 14:06:00",
    "clicks": [ {"page": "/contact", "referrer": "/home"} ]
  } >>
SELECT c.page,
       COUNT(*) AS count
FROM s3data.clickStream s,
     s.clicks c
WHERE s.session_time > '2017-10-01 00:00:00'
  AND c.referrer = '/home'
GROUP BY c.page;
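To make the semantics of the nested query concrete, here is a small Python sketch that performs the same unnest, filter, and group steps over the two sample sessions from the slide. The function name `click_frequency` is hypothetical, and the cutoff is written in the same `YYYYMMDD HH:MM:SS` layout as the sample data (an assumption; the slide's SQL literal uses dashes) so that a plain string comparison orders correctly.

```python
from collections import Counter

# Sample rows copied from the slide's s3data.clickStream example.
click_stream = [
    {"session_time": "20171013 14:05:00",
     "clicks": [{"page": "/home", "referrer": ""},
                {"page": "/products", "referrer": "/home"}]},
    {"session_time": "20171013 14:06:00",
     "clicks": [{"page": "/contact", "referrer": "/home"}]},
]


def click_frequency(sessions, referrer, since):
    """Count clicks per page for a given referrer in sessions after `since`."""
    counts = Counter()
    for session in sessions:                 # FROM s3data.clickStream s
        if session["session_time"] > since:  # WHERE s.session_time > ...
            for click in session["clicks"]:  # , s.clicks c (unnest the array)
                if click["referrer"] == referrer:
                    counts[click["page"]] += 1  # GROUP BY c.page, COUNT(*)
    return dict(counts)


result = click_frequency(click_stream, "/home", "20171001 00:00:00")
print(result)  # {'/products': 1, '/contact': 1}
```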
Nested Data Support
Improve query performance by analyzing nested data
Before: two tables that must be joined

Orders
OrderID  CustomerID  OrderTime  ShipMode
5        23          10.00      12.50
8        32          1.00       5.60

OrderItems
OrderID  ItemID  Quantity  Price
5        23      10.00     12.50
8        32      1.00      5.60
5        16      1.00      1.99
8        24      5.00      26.50

After: one Orders table with a nested OrdersWithItems column

Orders
OrderID  CustomerID  OrderTime  ShipMode  OrdersWithItems (ItemID, Quantity, Price)
5        23          10.00      12.50     (23, 10.00, 12.50), (16, 1.00, 1.99)
8        32          1.00       5.60      (32, 1.00, 5.60), (24, 5.00, 26.50)

To improve query performance, the new Orders table includes the OrdersWithItems as a nested column, eliminating join processing
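The nesting step described on this slide can be sketched in Python as follows. The item rows come from the slide's OrderItems table; the helper names `nest_items` and `order_total` are hypothetical, and the sketch only illustrates the data shape, not how Redshift Spectrum stores or scans nested columns.

```python
from collections import defaultdict

orders = [
    {"OrderID": 5, "CustomerID": 23},
    {"OrderID": 8, "CustomerID": 32},
]
order_items = [
    {"OrderID": 5, "ItemID": 23, "Quantity": 10.00, "Price": 12.50},
    {"OrderID": 8, "ItemID": 32, "Quantity": 1.00, "Price": 5.60},
    {"OrderID": 5, "ItemID": 16, "Quantity": 1.00, "Price": 1.99},
    {"OrderID": 8, "ItemID": 24, "Quantity": 5.00, "Price": 26.50},
]


def nest_items(orders, order_items):
    """Attach each order's items as a nested OrdersWithItems list (done once)."""
    items_by_order = defaultdict(list)
    for item in order_items:
        nested = {k: v for k, v in item.items() if k != "OrderID"}
        items_by_order[item["OrderID"]].append(nested)
    return [{**order, "OrdersWithItems": items_by_order[order["OrderID"]]}
            for order in orders]


def order_total(order):
    """Read-time query: no join needed, the items travel with the order."""
    return sum(i["Quantity"] * i["Price"] for i in order["OrdersWithItems"])


nested_orders = nest_items(orders, order_items)
totals = {o["OrderID"]: round(order_total(o), 2) for o in nested_orders}
print(totals)  # {5: 126.99, 8: 138.1}
```

The join cost is paid once when the nested table is built, so later queries that need an order together with its items read a single row.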
Equinox Case Study
WHO ARE WE
EQUINOX is a company with integrated luxury and lifestyle offerings centered on movement, nutrition, and regeneration
Inclusive of our other brands – Blink, Pure Yoga, SoulCycle, Furthermore, Hotel – Equinox operates more than 200 locations within every major city across the country in addition to London & Canada
And more…
HOW COMPLICATED COULD THIS BE?
People check in?
Members?
???
YOU CAN MAKE ANYTHING COMPLICATED
Many lines of business across 98 Equinox clubs, 200+ in total
- Personal Training
- Pilates
- Spa
- Group Fitness
- Membership/Sales
- Retail
- Food Services
…and Central Supporting Functions
- Digital Product
- CRM
- Marketing
- Creative
- Development/Building
- Finance
- Member Services
- Maintenance
IT IS ALL CONNECTED
Digital Products
- User Applications
- Connecting to Apple Health
Equipment
- Pursuit (Gamified cycling)
- Cardio
- Digital Scale
- Location Tracking
Our Data Journey
THE HISTORY OF DATA
First there was LIFE…
This was Equinox’s first data warehouse and was created in 2008
A TRADITIONAL LIFE
Informatica
SQL Server
Rigorously Kimball
https://www.amazon.com/Data-Warehouse-Toolkit-Complete-Dimensional/dp/0471200247
LIFE WAS GOOD..
- Reliable reporting
- Analytics, sometimes self-serviced!
- Customer Profile
- CRM, Email Marketing, Personalization
…AND SOMETIMES BAD
- Direct integration with applications, tight coupling
- Difficult SDLC, testing cycle, release management
- Functional debt
- No place to put NEW data
- Inflexibility for Data Science
- Expensive commercial software
VERSION 1.X TERADATA
- About 3 years ago we purchased Teradata
- Several apps running in beta…
- Lots of platform specific knowledge
- Limited integration
- Was very expensive
RE-CENTERING ON OUR GOALS
- Providing business value
- Reduce cost and go all-in on cloud technology
- Building technology that differentiates
- Immortal systems
- Make scalable components
- Use ephemeral stateless resources
- Use distributed databases
- Less focus on individual servers
THE NEW SCHOOL
- Just put everything in Hadoop or an Amazon S3 data lake
- You don’t need a data warehouse at all
- Everything can just be late bound
Doesn’t work for everything!
DATA WAREHOUSE VS DATA LAKE
Data Warehouse:
- Reliable high SLA reporting
- Developer and analyst friendly
- Efficient for specific types of data pipelines: updates/mutable data
Data Lakes:
- Large immutable datasets
- Semi-structured/unstructured
PROJECT COSMO
- 2-week proof of concept
- Re-platformed 1 Teradata app to Amazon Redshift + Amazon S3
Did it work?
Yes!
BYE BYE!
TERADATA
JARVIS Data Warehouse
JARVIS IS BORN
- Data Warehouse
- Data Lake
- Data Services
JARVIS ARCHITECTURE
AMAZON REDSHIFT
- Flexible cloud-based MPP data warehouse platform
- Cost effective - $1K to $5K per TB per year on reserve
- Mostly Postgres compatible
- Fast and performant
- Ease of maintenance
- Low barriers for developers
WHAT DO OUR MODELS LOOK LIKE
- Somewhat like star schemas
- Flattened, no junk dimensions, no bridge tables
- Amazon Redshift is columnar so wide tables are ok!
- Distributed joins can be expensive
- Rational and conservative use of traditional “Type 2” slowly changing dimensions
Basically, get answer and put in table!
PROCESSING ON Amazon Redshift
- We perform light transformation via ELT scripts
- Orchestrated by Maximilian
- Big crunches and semi-structured data processing
- Happen outside of Amazon Redshift
- Reserves query capacity
Data Lake
WHY DO WE HAVE
- Utilize Amazon S3 benefits of high performance, low cost, blob storage
- Functioning analytic store (not a dumping ground)
- Employ flexible, late bind strategies where appropriate
- Extremely quick setup for external tables
- Easily implement DR strategies
WHAT DO WE STORE
- Clickstream data
- Cycling logs data
- Club management software data
- Data from software that enhances our services
MAKING IT WORK
- Tools
- AWS Glue to describe the data
- Amazon Athena, Amazon EMR, or Amazon Redshift Spectrum to query
- Tips
- Leveraging self-described high compression Parquet files
- Lighten compute load on Amazon Redshift through Amazon EMR / Athena
- Amazon S3 to Amazon S3 transforms using Redshift Spectrum and UNLOAD
- Easier delta queries from daily snapshots
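The "delta queries from daily snapshots" tip can be illustrated with a short Python sketch that compares two full snapshots keyed by an identifier. The function name `snapshot_delta`, the `member_id` key, and the sample rows are hypothetical; in the setup the slide describes, the comparison would more likely be a SQL query over external tables, which the deck does not show.

```python
def snapshot_delta(previous, current, key):
    """Classify rows as inserted, updated, or deleted between two snapshots."""
    prev = {row[key]: row for row in previous}
    curr = {row[key]: row for row in current}
    return {
        "inserted": [curr[k] for k in curr if k not in prev],
        "updated": [curr[k] for k in curr if k in prev and curr[k] != prev[k]],
        "deleted": [prev[k] for k in prev if k not in curr],
    }


# Hypothetical daily snapshots of a club-management table.
monday = [
    {"member_id": 1, "status": "active"},
    {"member_id": 2, "status": "active"},
    {"member_id": 3, "status": "frozen"},
]
tuesday = [
    {"member_id": 1, "status": "active"},
    {"member_id": 2, "status": "cancelled"},
    {"member_id": 4, "status": "active"},
]

delta = snapshot_delta(monday, tuesday, key="member_id")
print(delta)
```

Because each snapshot is immutable once written, the delta can be recomputed for any pair of days without touching the source system.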
SAMPLE AWS GLUE DEFINITION
Automation & DevOps
SUPPORTING ACTORS
- Batchy – Batch and state, DAG execution
- HAMbot – Data quality monitoring
- Teletraan1, Robopager – Ops monitoring
- Rundeck – Scheduling
- Jenkins
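For readers unfamiliar with DAG execution, the general idea behind a tool like Batchy can be sketched with Python's standard `graphlib` module (Python 3.9+). This is not Batchy's implementation, which the slides do not describe; the task names and the `run_dag` helper are hypothetical.

```python
from graphlib import TopologicalSorter


def run_dag(tasks, dependencies):
    """Run each task only after all of its prerequisites have run.

    tasks: task name -> zero-argument callable.
    dependencies: task name -> set of prerequisite task names.
    Returns the order in which tasks were executed.
    """
    executed = []
    for name in TopologicalSorter(dependencies).static_order():
        tasks[name]()
        executed.append(name)
    return executed


# Hypothetical four-step batch: each step depends on the one before it.
log = []
tasks = {
    "extract": lambda: log.append("extract"),
    "load": lambda: log.append("load"),
    "transform": lambda: log.append("transform"),
    "quality_check": lambda: log.append("quality_check"),
}
dependencies = {
    "load": {"extract"},
    "transform": {"load"},
    "quality_check": {"transform"},
}

order = run_dag(tasks, dependencies)
print(order)  # ['extract', 'load', 'transform', 'quality_check']
```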
HOW WE DO DEPLOYMENTS
- Jenkins workflows
- Spin up ephemeral Amazon Redshift clusters and Maximilian assets
- Run major transformations
- Run HAMbot checks
V.I.N.CENT BOT
- Hero for our engineers
- Ops interaction via Slack
- Much easier than using the console
- Can start cluster in seconds
- Reduces console access requirement
MAXIMILIAN BOT
- Every hero needs a villain
- Further ops interaction via Slack
- Bot to bot communication
- No human interaction
- Saves money on unused infrastructure
Results
THINGS ARE GOOD!
- Have increased productivity since
- Re-platformed and productionalized 2 apps in 4 months
- Finished re-platform in under a year
- Dependable – very few operational issues
- Faster time-to-benefit via automated regression
- Huge cost savings over Teradata
BLINK DATA WAREHOUSE
- It worked so well we built Blink a brand new data platform too!
- The entire re-platform only took 4 months
LESSONS LEARNED
- Try to use an S3/data-lake-first approach whenever possible
- Strive to decouple
- Plan for flexibility to help embrace change as it comes
- One size doesn’t fit all – each tool serves a purpose
- Automate everything – leverage automated test and deployment to your analytic environment
Shameless Careers Pitch
WE INNOVATE
- Cloud forward strategy
- Micro-service architecture
- Gamification & metric driven programming
- IoT - Connected cardio, beacons, wearables
- Integrated single view of customer & advanced CRM
- Machine Learning
- Recommendations, Predictions, NLP, Chat-bots
- Data Platform
- Amazon Redshift, Amazon EMR / Apache Spark, Amazon S3 / AWS Glue / Redshift Spectrum / Athena
Submit Session Feedback
1. Tap the Schedule icon.
2. Select the session you attended.
3. Tap Session Evaluation to submit your feedback.
Thank you!

What's New with Amazon Redshift ft. McDonald's (ANT350-R1) - AWS re:Invent 2018Amazon Web Services
 
Analyze your Data Lake, Fast @ Any Scale - AWS Online Tech Talks
Analyze your Data Lake, Fast @ Any Scale - AWS Online Tech TalksAnalyze your Data Lake, Fast @ Any Scale - AWS Online Tech Talks
Analyze your Data Lake, Fast @ Any Scale - AWS Online Tech TalksAmazon Web Services
 
Data Warehousing in the Cloud - AWS Summit Sydney
Data Warehousing in the Cloud - AWS Summit SydneyData Warehousing in the Cloud - AWS Summit Sydney
Data Warehousing in the Cloud - AWS Summit SydneyAmazon Web Services
 

Similar to Amazon Redshift Update and How Equinox Fitness Clubs Migrated to a Modern Data Warehouse - BDA307 - Chicago AWS Summit (20)

BDA306 Building a Modern Data Warehouse: Deep Dive on Amazon Redshift
BDA306 Building a Modern Data Warehouse: Deep Dive on Amazon RedshiftBDA306 Building a Modern Data Warehouse: Deep Dive on Amazon Redshift
BDA306 Building a Modern Data Warehouse: Deep Dive on Amazon Redshift
 
Leadership Session: AWS Database and Analytics (DAT206-L) - AWS re:Invent 2018
Leadership Session: AWS Database and Analytics (DAT206-L) - AWS re:Invent 2018Leadership Session: AWS Database and Analytics (DAT206-L) - AWS re:Invent 2018
Leadership Session: AWS Database and Analytics (DAT206-L) - AWS re:Invent 2018
 
Building a Modern Data Warehouse - Deep Dive on Amazon Redshift
Building a Modern Data Warehouse - Deep Dive on Amazon RedshiftBuilding a Modern Data Warehouse - Deep Dive on Amazon Redshift
Building a Modern Data Warehouse - Deep Dive on Amazon Redshift
 
Modern Cloud Data Warehousing ft. Intuit: Optimize Analytics Practices (ANT20...
Modern Cloud Data Warehousing ft. Intuit: Optimize Analytics Practices (ANT20...Modern Cloud Data Warehousing ft. Intuit: Optimize Analytics Practices (ANT20...
Modern Cloud Data Warehousing ft. Intuit: Optimize Analytics Practices (ANT20...
 
Citrix Moves Data to Amazon Redshift Fast with Matillion ETL
 Citrix Moves Data to Amazon Redshift Fast with Matillion ETL Citrix Moves Data to Amazon Redshift Fast with Matillion ETL
Citrix Moves Data to Amazon Redshift Fast with Matillion ETL
 
Modern Cloud Data Warehousing ft. Equinox Fitness Clubs: Optimize Analytics P...
Modern Cloud Data Warehousing ft. Equinox Fitness Clubs: Optimize Analytics P...Modern Cloud Data Warehousing ft. Equinox Fitness Clubs: Optimize Analytics P...
Modern Cloud Data Warehousing ft. Equinox Fitness Clubs: Optimize Analytics P...
 
Building a Modern Data Warehouse: Deep Dive on Amazon Redshift - SRV337 - Chi...
Building a Modern Data Warehouse: Deep Dive on Amazon Redshift - SRV337 - Chi...Building a Modern Data Warehouse: Deep Dive on Amazon Redshift - SRV337 - Chi...
Building a Modern Data Warehouse: Deep Dive on Amazon Redshift - SRV337 - Chi...
 
What’s new with Amazon Redshift, featuring ZS Associates - ADB205 - Chicago A...
What’s new with Amazon Redshift, featuring ZS Associates - ADB205 - Chicago A...What’s new with Amazon Redshift, featuring ZS Associates - ADB205 - Chicago A...
What’s new with Amazon Redshift, featuring ZS Associates - ADB205 - Chicago A...
 
How TrueCar Gains Actionable Insights with Splunk Cloud PPT
How TrueCar Gains Actionable Insights with Splunk Cloud PPTHow TrueCar Gains Actionable Insights with Splunk Cloud PPT
How TrueCar Gains Actionable Insights with Splunk Cloud PPT
 
Choose the right DB for the Job - Builders Day Israel
Choose the right DB for the Job - Builders Day IsraelChoose the right DB for the Job - Builders Day Israel
Choose the right DB for the Job - Builders Day Israel
 
Non-Relational Revolution
Non-Relational RevolutionNon-Relational Revolution
Non-Relational Revolution
 
AWS re:Invent 2016: What’s New with Amazon Redshift (BDA304)
AWS re:Invent 2016: What’s New with Amazon Redshift (BDA304)AWS re:Invent 2016: What’s New with Amazon Redshift (BDA304)
AWS re:Invent 2016: What’s New with Amazon Redshift (BDA304)
 
Database Freedom. Database migration approaches to get to the Cloud - Marcus ...
Database Freedom. Database migration approaches to get to the Cloud - Marcus ...Database Freedom. Database migration approaches to get to the Cloud - Marcus ...
Database Freedom. Database migration approaches to get to the Cloud - Marcus ...
 
Fanatics Ingests Streaming Data to a Data Lake on AWS
Fanatics Ingests Streaming Data to a Data Lake on AWSFanatics Ingests Streaming Data to a Data Lake on AWS
Fanatics Ingests Streaming Data to a Data Lake on AWS
 
Non-Relational Revolution: Database Week SF
Non-Relational Revolution: Database Week SFNon-Relational Revolution: Database Week SF
Non-Relational Revolution: Database Week SF
 
雲上打造資料湖 (Data Lake):智能化駕馭商機 (Level 300)
雲上打造資料湖 (Data Lake):智能化駕馭商機 (Level 300)雲上打造資料湖 (Data Lake):智能化駕馭商機 (Level 300)
雲上打造資料湖 (Data Lake):智能化駕馭商機 (Level 300)
 
ABD327_Migrating Your Traditional Data Warehouse to a Modern Data Lake
ABD327_Migrating Your Traditional Data Warehouse to a Modern Data LakeABD327_Migrating Your Traditional Data Warehouse to a Modern Data Lake
ABD327_Migrating Your Traditional Data Warehouse to a Modern Data Lake
 
What's New with Amazon Redshift ft. McDonald's (ANT350-R1) - AWS re:Invent 2018
What's New with Amazon Redshift ft. McDonald's (ANT350-R1) - AWS re:Invent 2018What's New with Amazon Redshift ft. McDonald's (ANT350-R1) - AWS re:Invent 2018
What's New with Amazon Redshift ft. McDonald's (ANT350-R1) - AWS re:Invent 2018
 
Analyze your Data Lake, Fast @ Any Scale - AWS Online Tech Talks
Analyze your Data Lake, Fast @ Any Scale - AWS Online Tech TalksAnalyze your Data Lake, Fast @ Any Scale - AWS Online Tech Talks
Analyze your Data Lake, Fast @ Any Scale - AWS Online Tech Talks
 
Data Warehousing in the Cloud - AWS Summit Sydney
Data Warehousing in the Cloud - AWS Summit SydneyData Warehousing in the Cloud - AWS Summit Sydney
Data Warehousing in the Cloud - AWS Summit Sydney
 

More from Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Amazon Redshift Update and How Equinox Fitness Clubs Migrated to a Modern Data Warehouse - BDA307 - Chicago AWS Summit

  • 1. BDA307: Amazon Redshift Update and How Equinox Fitness Clubs Migrated to a Modern Data Warehouse
      Greg Khairallah, Amazon Web Services
      Elliott Cordo, VP Data Analytics, Equinox
      Ryan Kelly, Data Architect, Equinox
      © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 2. Agenda
      Amazon Web Services
      - Amazon Redshift overview
      - Recently released & upcoming features
      Equinox Fitness
      - Becoming data driven
      - Evolution of our data warehouse
      - Future directions
  • 3. AWS Analytics Portfolio
      Stages: Collect, Store, Analyze
      Services shown: Amazon Kinesis Data Firehose, AWS Direct Connect, AWS Snowball, Amazon Kinesis Data Analytics, Amazon Kinesis Data Streams, Amazon S3, Amazon Glacier, Amazon CloudSearch, Amazon RDS, Amazon Aurora, Amazon DynamoDB, Amazon ES, Amazon EMR, Amazon Redshift, Amazon QuickSight, AWS Database Migration Service, AWS Glue, Amazon Athena, Amazon AI
  • 4. Amazon Redshift: 10x faster at 1/10th the cost
      - Fast: Delivers fast results for all types of workloads
      - Cost-effective: No upfront costs, start small, and pay as you go
      - Integrated: Integrated with Amazon S3 data lakes, AWS services, and third-party tools
      - Secure: Audit everything; encrypt data end-to-end; extensive certification and compliance
      - Simple: Create and start using a data warehouse in minutes
      - Scalable: Gigabytes to petabytes to exabytes
  • 5. Amazon Redshift Spectrum: extend the data warehouse to your Amazon S3 data lake
      - Scale compute and storage separately
      - Join data across Amazon Redshift and Amazon S3
      - Exabyte-scale Amazon Redshift SQL queries against Amazon S3
      - Stable query performance and unlimited concurrency
      - Parquet, ORC, JSON, Grok, Avro, & CSV formats
      - Pay only for the amount of data scanned
      Diagram: S3 data lake, Amazon Redshift data, Redshift Spectrum query engine
  • 6. Thousands of Companies Run Mission Critical Workloads on Amazon Redshift
  • 7. Forrester Wave Big Data Warehouse Q2 2017
      “Amazon Redshift has the largest adoption of BDW in the cloud.”
      “With more than 5,000 deployments, Amazon Redshift has the largest data warehouse deployments in the cloud – some over 10 petabytes in size.”
      AWS received a score of 5/5 (the highest score possible) in the customer base, market awareness, ability to execute, road map, support, and partners criteria.
      The Forrester Wave™ is copyrighted by Forrester Research, Inc. Forrester and Forrester Wave™ are trademarks of Forrester Research, Inc. The Forrester Wave™ is a graphical representation of Forrester's call on a market and is plotted using a detailed spreadsheet with exposed scores, weightings, and comments. Forrester does not endorse any vendor, product, or service depicted in the Forrester Wave. Information is based on best available resources. Opinions reflect judgment at the time and are subject to change.
  • 8. Selected Amazon Redshift Partners: Data Integration, Systems Integrators, Business Intelligence
  • 9. Recently Released Features: Performance | Ease of Use | Data Lake
  • 10. Dense Compute Nodes (DC2)
      - 2x the performance of DC1 at the same price
      - 3x more I/O with 30% better storage utilization than DC1
      - Upgrade at no cost
      - Hardware: NVMe SSD, DDR4 memory, Intel E5-2686 v4 (Broadwell)
      “Amazon Redshift’s new DC2 node is giving us a 100 percent performance increase, allowing us to provide faster insights for our retailers, more cost effectively, to drive incremental revenue.”
  • 11. Performance | Ease of Use | Data Lake
      “Amazon Redshift allows us to quickly spin up clusters and provide our data scientists with a fast and easy method to access data and generate insights,” said Bradley Todd, Liberty Mutual’s Technology Architect. “We saw a 9x reduction in month-end reporting time with Amazon Redshift DC2 nodes as compared to DC1.”
      “Analytical queries are 10 times faster in Amazon Redshift than they were with our previous data warehouse. Our data science team can get to the data faster and then analyze that data to find new ways to reduce costs, market products, and enable new business,” said Yuki Moritani, Manager, Innovation Management Department, NTT Docomo.
  • 12. Short Query Acceleration: express lane for short queries
      - Machine learning predicts the runtime of queries
      - Short queries are routed to an express queue
      - Resources are dynamically dedicated to short queries
      - Enable it today from your AWS Management Console
      How it works (diagram): Analytics and BI / Dashboard tools, Amazon Redshift, Machine Learning Classifier
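The routing idea on this slide can be sketched in a few lines of Python. This is only an illustration, not how Amazon Redshift implements it: the deck does not describe the classifier's features or thresholds, so `predict_runtime_seconds`, the per-row cost constant, the 5-second cutoff, and the table names in the sample queries are all hypothetical stand-ins.

```python
from dataclasses import dataclass

# Hypothetical cutoff: queries predicted to finish faster than this use the express queue.
SHORT_QUERY_THRESHOLD_SECONDS = 5.0


@dataclass
class Query:
    sql: str
    estimated_rows_scanned: int  # hypothetical input feature


def predict_runtime_seconds(query: Query) -> float:
    """Stand-in for the machine learning classifier: a made-up linear cost model."""
    seconds_per_million_rows = 0.5  # assumed constant, for illustration only
    return query.estimated_rows_scanned / 1_000_000 * seconds_per_million_rows


def route(query: Query) -> str:
    """Return the name of the queue the query would be sent to."""
    if predict_runtime_seconds(query) < SHORT_QUERY_THRESHOLD_SECONDS:
        return "express"
    return "default"


dashboard_query = Query("SELECT COUNT(*) FROM checkins WHERE day = CURRENT_DATE", 2_000_000)
batch_query = Query("SELECT * FROM checkins c JOIN members m ON c.member_id = m.id", 500_000_000)

short_queue = route(dashboard_query)  # predicted 1 second, so "express"
long_queue = route(batch_query)       # predicted 250 seconds, so "default"
```

The point of the design is that the routing decision is made before execution, from a prediction, so a dashboard query does not have to wait behind a long-running batch job.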
  • 13. Result-set Caching: subsecond repeat queries
      - Amazon Redshift customers can now serve 35% more queries on average, using the same compute resources
      - Tens of thousands of compute hours are freed up daily to serve the remaining queries and data ingestion
      - Transparent: it just works!
      “With Amazon Redshift result caching, 20 percent of our queries now complete in less than one second,” said Greg Rokita, Executive Director for Technology, Edmunds
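A toy model helps show why repeat queries can return in under a second. The sketch below assumes a cache keyed on the exact query text that is cleared after any write; the deck does not describe Redshift's actual cache keys or invalidation rules, and `ResultCache`, `run_on_warehouse`, and the `daily_checkins` table are hypothetical names.

```python
from typing import Callable, Dict, List, Tuple

Row = Tuple


class ResultCache:
    """Toy result-set cache: keyed by query text, dropped whenever data changes."""

    def __init__(self, execute: Callable[[str], List[Row]]) -> None:
        self._execute = execute  # function that actually runs the query
        self._cache: Dict[str, List[Row]] = {}
        self.hits = 0
        self.misses = 0

    def query(self, sql: str) -> List[Row]:
        if sql in self._cache:
            self.hits += 1
            return self._cache[sql]
        self.misses += 1
        rows = self._execute(sql)
        self._cache[sql] = rows
        return rows

    def invalidate(self) -> None:
        """Call after any write so stale results are never served."""
        self._cache.clear()


table = [("2018-05-01", 120), ("2018-05-02", 135)]


def run_on_warehouse(sql: str) -> List[Row]:
    # Hypothetical executor: ignores the SQL text and returns a copy of the table.
    return list(table)


cache = ResultCache(run_on_warehouse)
first = cache.query("SELECT day, checkins FROM daily_checkins")
second = cache.query("SELECT day, checkins FROM daily_checkins")  # served from cache

table.append(("2018-05-03", 150))  # simulated data load
cache.invalidate()
third = cache.query("SELECT day, checkins FROM daily_checkins")   # recomputed
```

Only the first and third calls reach the executor; the second is answered from memory, which is the compute that the slide describes as being freed up for other work.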
  • 14. Commit Enhancements
      - 50% faster data commits for busy clusters
      - 16% faster data ingestion and insertion
      Charts: Commit Duration Per Transaction for Busy Clusters; Total Commit Time by Month (Nov, Jan, Mar); series p50, p90, p95, p99, and Linear (p99); ds2.8xlarge, cluster size 10 and up, us-west-2; clusters with more than 90 backups a day
  • 15. Query Performance Improvements
      - Faster hash joins
        - Improvements to hash algorithm (Jan '18)
        - Significant improvement in memory utilization (Feb '18)
        - Cache line prefetching to improve join performance (Mar '18)
      - Join-intensive workloads like TPC-H and TPC-DS show a performance improvement ranging from 28% to 2x for several queries
      - 64x reduction of memory footprint fleet wide for hash joins and aggregations, with significant improvement to overall throughput
      - Read and write queries can now hop WLM queues
  • 16. Performance | Ease of Use | Data Lake
      “With Amazon Redshift and Tableau, anyone in the company can set up any queries they like, from how users are reacting to a feature, to growth by demographic or geography, to the impact sales efforts had in different areas”
      “Provides an easy-to-use mechanism for querying data with quick and uniform response times that analysts can use to run research projects and perform in-depth analysis…We don’t have to pre-allocate resources and can easily scale up to meet demand and then scale down for efficiency”
  • 17. Amazon Redshift Advisor
      - Advisor provides automated recommendations to help you optimize database performance and decrease operating costs.
      - Shows up to seven recommendations to help you optimize your cluster.
      - Available via the Amazon Redshift console at no charge.
  • 18. New Amazon CloudWatch Metrics for Easy Visualization of Cluster Performance
      - Monitor the performance and health of your Amazon Redshift cluster with two new CloudWatch metrics, Query Throughput and Query Duration.
      - Query Throughput measures the average number of queries completed per second.
      - Query Duration measures the average time taken to complete a query.
      - By observing these metrics, you can easily determine how your cluster is performing at any time.
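The two metric definitions above reduce to simple averages. The sketch below computes them from made-up query timings; the sample data and the 60-second reporting window are assumptions for illustration, and the code does not call CloudWatch.

```python
from typing import List, Tuple

# Each tuple is (start_second, end_second) for one completed query; made-up sample data.
completed_queries: List[Tuple[float, float]] = [
    (0.0, 2.0),
    (1.0, 1.5),
    (3.0, 9.0),
    (10.0, 11.5),
]

WINDOW_SECONDS = 60.0  # assumed reporting window


def query_throughput(queries: List[Tuple[float, float]], window_seconds: float) -> float:
    """Average number of queries completed per second over the window."""
    return len(queries) / window_seconds


def query_duration(queries: List[Tuple[float, float]]) -> float:
    """Average time, in seconds, taken to complete a query."""
    return sum(end - start for start, end in queries) / len(queries)


throughput = query_throughput(completed_queries, WINDOW_SECONDS)  # 4 queries / 60 s
duration = query_duration(completed_queries)                      # (2 + 0.5 + 6 + 1.5) / 4
```

Read together, a falling throughput with a rising duration suggests the cluster is saturating, while a falling throughput with a flat duration suggests there is simply less work arriving.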
  • 19. Performance | Ease of Use | Data Lake
      “Redshift Spectrum enables us to directly operate on our data in its native format in Amazon S3 with no preprocessing or transformation. Our data pipeline is much simpler now, and our execution time has been lowered significantly,” said Vladimir Barkov, Director of Data Architecture and Engineering at Time Inc.
      “We use Redshift Spectrum for interactive online queries. The new DC2 node from Amazon Redshift has given us a 70 percent performance boost for running Redshift Spectrum queries. As a result, we can analyze far more data for our customers and deliver results much faster,” said Hyung-Joon Kim, Principal Software Engineer, BrandVerity.
  • 20. Redshift Spectrum Enhancements
      - Added support for processing scalar JSON and ION file formats in S3
        - In addition to Parquet, ORC, Avro, CSV, Grok, RCFile, RegexSerDe, OpenCSV, SequenceFile, TextFile, and TSV
      - Support for DATE data type
      - Support for IAM role-chaining to assume cross-account roles
      - Nested data support
      - COPY from Parquet and ORC files
  • 21. Nested Data Support
      - Analyze nested and semi-structured data in Amazon S3 with Spectrum
      - Allows easy ETL of nested data into Amazon Redshift using CTAS
      - Support for open file formats: Parquet, ORC, JSON, and Ion
      - Uses dot notation to extend your existing SQL
      Sample data, s3data.clickStream:
        << { "session_time": "20171013 14:05:00", "clicks": [ {"page": "/home", "referrer": ""}, {"page": "/products", "referrer": "/home"} ] },
           { "session_time": "20171013 14:06:00", "clicks": [ {"page": "/contact", "referrer": "/home"} ] } >>
      Example: find click frequency for links on "/home":
        SELECT c.page, COUNT(*) AS count
        FROM s3data.clickStream s, s.clicks c
        WHERE s.session_time > '2017-10-01 00:00:00' AND c.referrer = '/home'
        GROUP BY c.page;
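To make the semantics of the nested query concrete, the same logic can be traced in plain Python over the two sample sessions from the slide. This is a sketch of what the query computes, not of how Redshift Spectrum executes it; the helper name `clicks_from` is hypothetical, and the cutoff is written in the same compact format as the sample `session_time` values so that string comparison is meaningful.

```python
from collections import Counter

# The two sample sessions from the slide, written as Python literals.
click_stream = [
    {
        "session_time": "20171013 14:05:00",
        "clicks": [
            {"page": "/home", "referrer": ""},
            {"page": "/products", "referrer": "/home"},
        ],
    },
    {
        "session_time": "20171013 14:06:00",
        "clicks": [
            {"page": "/contact", "referrer": "/home"},
        ],
    },
]


def clicks_from(sessions, referrer, since):
    """Count pages reached from `referrer` in sessions later than `since`."""
    counts = Counter()
    for s in sessions:                     # FROM s3data.clickStream s
        if s["session_time"] <= since:     # WHERE s.session_time > ...
            continue
        for c in s["clicks"]:              # , s.clicks c  (unnests the array)
            if c["referrer"] == referrer:  # AND c.referrer = '/home'
                counts[c["page"]] += 1     # GROUP BY c.page with COUNT(*)
    return dict(counts)


result = clicks_from(click_stream, "/home", "20171001 00:00:00")
```

The inner loop is the part that dot notation adds to ordinary SQL: `s.clicks c` iterates over the array inside each session row, so no separate clicks table or join key is needed.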
  • 22. Nested Data Support: improve query performance by analyzing nested data
      Orders (values as shown on the slide):
        OrderID | CustomerID | OrderTime | ShipMode
        5       | 23         | 10.00     | 12.50
        8       | 32         | 1.00      | 5.60
      OrderItems:
        OrderID | ItemID | Quantity | Price
        5       | 23     | 10.00    | 12.50
        8       | 32     | 1.00     | 5.60
        5       | 16     | 1.00     | 1.99
        8       | 24     | 5.00     | 26.50
      OrdersWithItems: the Orders columns above (OrderID, CustomerID, OrderTime, ShipMode) plus a nested OrderItems column:
        ItemID | Quantity | Price
        23     | 10.00    | 12.50
        16     | 1.00     | 1.99
        32     | 1.00     | 5.60
        24     | 5.00     | 26.50
      To improve query performance, the new Orders table includes the OrdersWithItems as a nested column, eliminating join processing
  • 23. Equinox Case Study
  • 24. WHO ARE WE
      EQUINOX is a company with integrated luxury and lifestyle offerings centered on movement, nutrition, and regeneration.
      Inclusive of our other brands (Blink, Pure Yoga, SoulCycle, Furthermore, Hotel), Equinox operates more than 200 locations within every major city across the country in addition to London & Canada.
      And more…
  • 25. HOW COMPLICATED COULD THIS BE? People check in? Members? ???
  • 26. YOU CAN MAKE ANYTHING COMPLICATED
      Many lines of business across 98 Equinox clubs, 200+ in total
      - Personal Training
      - Pilates
      - Spa
      - Group Fitness
      - Membership/Sales
      - Retail
      - Food Services
      …and Central Supporting Functions
      - Digital Product
      - CRM
      - Marketing
      - Creative
      - Development/Building
      - Finance
      - Member’s Services
      - Maintenance
  • 27. IT IS ALL CONNECTED
      Digital Products
      - User Applications
      - Connecting to Apple Health
      Equipment
      - Pursuit (Gamified cycling)
      - Cardio
      - Digital Scale
      - Location Tracking
  • 28. Our Data Journey
  • 29. THE HISTORY OF DATA
      First there was LIFE… This was Equinox’s first data warehouse and was created in 2008
  • 30. A TRADITIONAL LIFE
      - Informatica
      - SQL Server
      - Rigorously Kimball: https://www.amazon.com/Data-Warehouse-Toolkit-Complete-Dimensional/dp/0471200247
  • 31. LIFE WAS GOOD..
      - Reliable reporting
      - Analytics, sometimes self-serviced!
      - Customer Profile
      - CRM, Email Marketing, Personalization
  • 32. …AND SOMETIMES BAD
      - Direct integration with applications, tight coupling
      - Difficult SDLC, testing cycle, release management
      - Functional debt
      - No place to put NEW data
      - Inflexibility for Data Science
      - Expensive commercial software
  • 33. VERSION 1.X TERADATA
      - About 3 years ago we purchased Teradata
      - Several apps running in beta…
      - Lots of platform specific knowledge
      - Limited integration
      - Was very expensive
  • 34. RE-CENTERING ON OUR GOALS
      - Providing business value
      - Reduce cost and go all-in on cloud technology
      - Building technology that differentiates
      - Immortal systems
      - Make scalable components
      - Use ephemeral stateless resources
      - Use distributed databases
      - Less focus on individual servers
  • 35. THE NEW SCHOOL
      - Just put everything in Hadoop or an Amazon S3 data lake
      - You don’t need a data warehouse at all
      - Everything can just be late-bound
      Doesn’t work for everything!
  • 36. DATA WAREHOUSE VS DATA LAKE
      Data Warehouse:
      - Reliable high SLA reporting
      - Developer and analyst friendly
      - Efficient for specific types of data pipelines: updates/mutable data
      Data Lakes:
      - Large immutable datasets
      - Semi-structured/unstructured
  • 37. PROJECT COSMO
      - 2 week proof of concept
      - Re-platformed 1 Teradata app to Amazon Redshift + Amazon S3
      Did it work? Yes!
  • 38. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. BYE BYE! TERADATA
  • 39. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. JARVIS Data Warehouse
  • 40. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. JARVIS IS BORN - Data Warehouse - Data Lake - Data Services
  • 41. JARVIS ARCHITECTURE
  • 42. AMAZON REDSHIFT - Flexible cloud-based MPP data warehouse platform - Cost effective: $1K to $5K per TB per year with reserved instances - Mostly Postgres compatible - Fast and performant - Ease of maintenance - Low barriers for developers
  • 43. WHAT DO OUR MODELS LOOK LIKE - Somewhat like star schemas - Flattened, no junk dimensions, no bridge tables - Amazon Redshift is columnar, so wide tables are OK! - Distributed joins can be expensive - Rational and conservative use of traditional “Type 2” (slowly changing dimension) history - Basically, get the answer and put it in a table!
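A minimal sketch of what a traditional "Type 2" change looks like, written in plain Python rather than SQL so it can be run anywhere. The `MemberRow` fields, the club names, and the `apply_type2` helper are hypothetical and are not taken from the talk; the only idea carried over is that a changed attribute closes the current row and appends a new one instead of overwriting.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class MemberRow:
    """One version of a member in a hypothetical Type 2 dimension."""
    member_id: int
    home_club: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


def apply_type2(history: List[MemberRow], member_id: int,
                home_club: str, as_of: date) -> List[MemberRow]:
    """Return a new history with the change applied as a Type 2 update."""
    current = next(
        (r for r in history if r.member_id == member_id and r.valid_to is None),
        None,
    )
    if current is not None and current.home_club == home_club:
        return list(history)  # attribute unchanged, keep history as it is

    # Close the current version (if any) and append the new current version.
    updated = [replace(r, valid_to=as_of) if r is current else r for r in history]
    updated.append(MemberRow(member_id, home_club, as_of))
    return updated


history = [MemberRow(1, "Club A", date(2017, 1, 1))]
history = apply_type2(history, 1, "Club B", date(2018, 6, 1))
```

Every change adds a row, which is one reason to apply Type 2 tracking conservatively and only to attributes whose history is actually queried.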
  • 44. PROCESSING ON AMAZON REDSHIFT - We perform light transformation via ELT scripts, orchestrated by Maximilian - Big crunches and semi-structured data processing happen outside of Amazon Redshift, which reserves query capacity
  • 45. Data Lake
  • 46. WHY DO WE HAVE - Utilize Amazon S3 benefits of high-performance, low-cost blob storage - Functioning analytic store (not a dumping ground) - Employ flexible, late-binding strategies where appropriate - Extremely quick setup for external tables - Easily implement DR strategies
  • 47. WHAT DO WE STORE - Clickstream data - Cycling log data - Club management software data - Data from software that enhances our services
  • 48. MAKING IT WORK - Tools: AWS Glue to describe the data; Amazon Athena, Amazon EMR, or Amazon Redshift Spectrum to query - Tips: leverage self-describing, highly compressed Parquet files; lighten compute load on Amazon Redshift through Amazon EMR / Athena; Amazon S3 to Amazon S3 transforms using Redshift Spectrum and UNLOAD; easier delta queries from daily snapshots
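The "delta queries from daily snapshots" tip can be sketched as a set difference between two snapshot dates. The example below is an illustration only: it uses Python's built-in SQLite as a stand-in for the query engine, and the `member_snapshot` table, its columns, and the sample rows are hypothetical rather than anything shown in the deck.

```python
import sqlite3

# In-memory SQLite stands in for the real query engine in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE member_snapshot (snapshot_date TEXT, member_id INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO member_snapshot VALUES (?, ?, ?)",
    [
        ("2018-11-01", 1, "active"),
        ("2018-11-01", 2, "active"),
        ("2018-11-02", 1, "active"),   # unchanged
        ("2018-11-02", 2, "frozen"),   # changed
        ("2018-11-02", 3, "active"),   # new
    ],
)

# Rows present today that were not present, in identical form, yesterday.
DELTA_SQL = """
SELECT member_id, status FROM member_snapshot WHERE snapshot_date = :today
EXCEPT
SELECT member_id, status FROM member_snapshot WHERE snapshot_date = :yesterday
ORDER BY member_id
"""


def snapshot_delta(connection, yesterday, today):
    """Return new or changed rows between two daily snapshots."""
    params = {"yesterday": yesterday, "today": today}
    return connection.execute(DELTA_SQL, params).fetchall()


delta = snapshot_delta(conn, "2018-11-01", "2018-11-02")
```

Because each day's snapshot is immutable, the delta can be recomputed for any pair of dates without tracking change flags in the source system.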
  • 49. SAMPLE AWS GLUE DEFINITION
  • 50. Automation & DevOps
  • 51. SUPPORTING ACTORS - Batchy: batch and state, DAG execution - HAMbot: data quality monitoring - Teletraan1, Robopager: ops monitoring - Rundeck: scheduling - Jenkins
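For readers unfamiliar with DAG execution, the idea is that each task runs only after everything it depends on has finished. The sketch below shows this with Python's standard-library `graphlib` (Python 3.9+). It is not Batchy's implementation or API; the task names and the `run_dag` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dag = {
    "load_clickstream": set(),
    "load_club_data": set(),
    "build_member_model": {"load_clickstream", "load_club_data"},
    "quality_checks": {"build_member_model"},
}


def run_dag(graph, run_task):
    """Run every task after its dependencies; return the execution order."""
    order = list(TopologicalSorter(graph).static_order())
    for task in order:
        run_task(task)
    return order


executed = []
order = run_dag(dag, executed.append)
```

A real batch tool would also persist task state so a failed run can resume from the failed task, which is the "batch and state" half of the description.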
  • 52. HOW WE DO DEPLOYMENTS - Jenkins workflows - Spin up ephemeral Amazon Redshift clusters and Maximilian assets - Run major transformations - Run HAMbot checks
  • 53. V.I.N.CENT BOT - Hero for our engineers - Ops interaction via Slack - Much easier than using the console - Can start cluster in seconds - Reduces console access requirement
  • 54. MAXIMILIAN BOT - Every hero needs a villain - Further ops interaction via Slack - Bot to bot communication - No human interaction - Saves money on unused infrastructure
  • 55. Results
  • 56. THINGS ARE GOOD! - Have increased productivity since - Re-platformed and productionalized 2 apps in 4 months - Finished re-platform in under a year - Dependable: very few operational issues - Faster time-to-benefit via automated regression - Huge cost savings over Teradata
  • 57. BLINK DATA WAREHOUSE - It worked so well we built Blink a brand-new data platform too! - The entire re-platform only took 4 months
  • 58. LESSONS LEARNED - Try to use an S3/data-lake-first approach whenever possible - Strive to decouple - Plan for flexibility to help embrace change as it comes - One size doesn’t fit all: each tool serves a purpose - Automate everything: bring automated testing and deployment to your analytic environment
  • 59. Shameless Careers Pitch
  • 60. WE INNOVATE - Cloud forward strategy - Micro-service architecture - Gamification & metric driven programming - IoT: connected cardio, beacons, wearables - Integrated single view of customer & advanced CRM - Machine Learning: recommendations, predictions, NLP, chat-bots - Data Platform: Amazon Redshift, Amazon EMR / Apache Spark, Amazon S3 / AWS Glue / Redshift Spectrum / Athena
  • 61. Submit Session Feedback 1. Tap the Schedule icon. 2. Select the session you attended. 3. Tap Session Evaluation to submit your feedback.
  • 62. Thank you!