Metail's Redshift Experience and Why the 'Like' in 'Postgres Like' Is Important
Gareth Rogers, Data Engineer
Metail lets you try on clothes online
• Discover clothes on your body shape
• Create, save outfits and share
• Shop with confidence of size and fit
Proven impact as validated by American business schools and A/B tests
• “…customers who had access to the fitting tool are more likely to come back to the site, and this effect is statistically significant…”
• “…shows approximately a 5.1 percent reduction in returns compared to the control group… In other words, providing fit information reduces average fulfilment costs”
• “…sales for users with access to the tool were substantially higher overall - 22.32 percent larger”
Source: “The Value of Fit Information in Online Retail: Evidence from a Randomized Field Experiment” by Prof Santiago Gallino (Dartmouth College - Tuck School of Business) & Prof Antonio Moreno (Northwestern University), Oct 21, 2015
• 3M data points
• 1000+ garments
Architecture
Compare with a more modern flow: http://tech.metail.com/elastic-mapreduce-metail-aws-loft-london/
[Architecture diagram; surviving labels: User DB, DynamoDB]
Creating the Cluster
• Compute capacity vs storage capacity
– Tight coupling of compute and storage
– We load everything into Redshift; so far >3TB of data
– At 1GB per day the cluster lasts 10 sprints; at 30GB per day, not so long :S
– Six-node dc1.8xlarge cluster costs $957.60 per week at on-demand pricing
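The runway and cost figures on this slide reduce to simple arithmetic. The Python sketch below is a rough illustration rather than the author's own calculation. The ingest rates, the "10 sprints" figure and the weekly cost come from the slide. The two-week sprint length and the function names are assumptions introduced here.

```python
HOURS_PER_WEEK = 7 * 24


def hourly_rate(weekly_cost: float) -> float:
    """Convert a weekly on-demand bill into an hourly cluster rate."""
    return weekly_cost / HOURS_PER_WEEK


def runway_days(headroom_gb: float, ingest_gb_per_day: float) -> float:
    """Days until the spare storage is used up at a constant ingest rate."""
    return headroom_gb / ingest_gb_per_day


# Assumption (not stated on the slide): a sprint is two weeks long.
SPRINT_DAYS = 14

# From the slide: 1 GB/day lasts 10 sprints, which implies this much headroom.
headroom_gb = 1 * 10 * SPRINT_DAYS  # 140 GB under the sprint-length assumption

if __name__ == "__main__":
    print(f"hourly cluster rate: ${hourly_rate(957.60):.2f}")
    print(f"runway at 30 GB/day: {runway_days(headroom_gb, 30):.1f} days")
```

Under the two-week assumption, the same headroom that lasts 10 sprints at 1GB per day is gone in under a week at 30GB per day. Because compute and storage are tightly coupled, adding storage means paying for more nodes.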
It’s Postgres Like – Connecting to the Server
• Being a Postgres-like system meant that from day one there were mature tools and Stack Overflow help
• The Redshift ecosystem is now more mature, and optimised tooling and help exist
• The Redshift JDBC/ODBC driver is now recommended over the PostgreSQL driver
It’s Postgres Like – My First Query
-- Order events in the date range, identified by the name field in the JSON payload
WITH order_events AS (
  SELECT collector_tstamp, event_id, ue_properties
  FROM events
  WHERE collector_tstamp >= '2015-09-20' AND collector_tstamp < '2015-10-02' AND event = 'unstruct'
    AND JSON_EXTRACT_PATH_TEXT(ue_properties,'data','data','name') = 'Order'),
-- Daily order counts for the "in" bin
in_orders AS (
  SELECT DATE(collector_tstamp) AS order_date,
    COUNT(event_id) AS orders,
    COUNT(DISTINCT event_id) AS orders_distinct
  FROM order_events
  WHERE ue_properties ILIKE '%"bin":"in",%'
  GROUP BY DATE(collector_tstamp)
  ORDER BY DATE(collector_tstamp)),
-- Daily order counts for the "out" bin
out_orders AS (
  SELECT DATE(collector_tstamp) AS order_date,
    COUNT(event_id) AS orders,
    COUNT(DISTINCT event_id) AS orders_distinct
  FROM order_events
  WHERE ue_properties ILIKE '%"bin":"out",%'
  GROUP BY DATE(collector_tstamp)
  ORDER BY DATE(collector_tstamp))
-- Compare the two bins side by side, one row per day
SELECT bin_in.order_date,
  bin_in.orders AS bin_in_orders,
  bin_out.orders AS bin_out_orders
FROM in_orders AS bin_in
INNER JOIN out_orders AS bin_out ON bin_in.order_date = bin_out.order_date
ORDER BY bin_in.order_date;
Not so Postgres Like – Schema Design
• For day-to-day querying even power users won’t notice the difference
• For the schema designers the differences matter and will bite you from the start
• Redshift = columnar; Postgres = row; very different optimisation considerations
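To make the row-versus-column distinction concrete, here is a toy sketch in Python. It does not show how Redshift or Postgres is implemented, and the table contents and helper names are invented for illustration. It only shows why an aggregate over a single column touches far fewer values when the data is laid out column by column.

```python
# Toy illustration only: invented data, not Redshift internals.
rows = [
    {"event_id": 1, "event": "unstruct", "order_value": 20.0},
    {"event_id": 2, "event": "page_view", "order_value": 0.0},
    {"event_id": 3, "event": "unstruct", "order_value": 35.5},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column."""
    return {name: [r[name] for r in records] for name in records[0]}


def values_read_row_layout(records):
    """In a row layout, reaching one column means reading every field of every row."""
    return sum(len(r) for r in records)


def values_read_column_layout(columns, column):
    """In a column layout, only the requested column's values are read."""
    return len(columns[column])


columns = to_columns(rows)
total = sum(columns["order_value"])

if __name__ == "__main__":
    print("sum(order_value):", total)
    print("values read, row layout:", values_read_row_layout(rows))
    print("values read, column layout:", values_read_column_layout(columns, "order_value"))
```

The gap widens with every extra column in the table. The flip side is that fetching whole rows tends to be comparatively expensive in a columnar layout.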
Summary
• Redshift gives you all the usual AWS goodies
• Day-to-day you don’t care that Redshift is Postgres-like
• When designing the schema forget about row databases, experiment with columnar stores
