WMFRA # 46: Case Study - In-Store Analysis

This case study of an in-store analysis in retail was presented at WebMonday Frankfurt #46.

Published in: Technology

  1. 1. CC 2.0 by Per Olesen | http://flic.kr/p/7pVCgZ
  2. 2. CC 2.0 by Franck BLAIS | http://flic.kr/p/cwVnSy
  3. 3. CC 2.0 by John Steven Fernandez | http://flic.kr/p/a8uTzz
  4. 4. CC 2.0 by Ian Carroll | http://flic.kr/p/6NWoGm
  5. 5. CC 2.0 by Perry French | http://flic.kr/p/8wDMJS
  6. 6. CC 2.0 by John Mitchell | http://flic.kr/p/5UaPg8
  7. 7. Preparation: How do we answer these questions? (March 8, 2013)
     Before we started designing a blueprint solution, we first asked ourselves:
       1. Who would be asked to answer questions like this?
       2. Who is this person?
       3. What tools does this person expect to use?
       4. What is a typical skill set of this person?
       5. How do they work?
  8. 8. Traditional Data Management System Approach
     From a high level of abstraction, the answer is simple. We need a data management system with three pieces: ingest, store and process.
     Data Source → Data Ingestion → Data Storage → Data Processing
     So, how do we answer these questions as a
  9. 9. Blueprint for a Data Management System with Hadoop
     We take this basic architecture and replace the generic terms by mapping it onto the Hadoop ecosystem.
     Data Source → Flume → HDFS → Hive, Impala → BI/Analysis/Reporting
     With this Hadoop architecture, a data scientist should be able to answer the questions without any programming environment. He/she can also use familiar BI, analysis and reporting tools.
  10. 10. Setup: Ingredients
      1. 2 WiFi access points to simulate two different stores, with OpenWRT, a Linux-based firmware for routers, installed
      2. Flume to move all log messages to HDFS without any manual intervention (no transformation, no filtering)
      3. A 4-node CDH4 cluster
      4. Pentaho Data Integration's graphical designer for data transformation, parsing, filtering and loading into the warehouse
      5. Hive as the data warehouse system on top of Hadoop to project structure onto the data
      6. Impala for querying data from Hive in real time
      7. A tool to visualize the results
  11. 11. CC 2.0 by Qi Wei Fong | http://flic.kr/p/7w8vfq
  12. 12. Analysis Result: Visits for stores number one & two
      The plot indicates that about 85% of the visits were detected in store number one and about 15% in store number two. One might draw the conclusion that store number one is in a much better location with more occasional customers.
      But let's gain more insights by analysing the number of unique visitors.
  13. 13. Analysis Result: Unique visitors
      This plot gives us more details about the customers. It turns out that the 135 visits in store number one were caused by just 9 unique visitors, while store number two encountered 5 unique visitors.
  14. 14. Analysis Result: New vs. returning users
      This plot indicates that we have more returning than new users in both stores. In store number two we didn't see a single new user over the past 4 days.
      It's probably a good idea to start a marketing campaign aimed at new customers, e.g. giving out vouchers for the first purchase.
  15. 15. Analysis Result: Visit duration over the past 4 days
      The plot for the last 4 days shows that the visit duration in store number one was evenly distributed, while the distribution in store number two shows some peaks.
      We can also see that visitors tend to stay in store number one much longer.
  16. 16. Analysis Result: Avg. Duration Between Visits of one particular user
      There is a lot of useful information that can be derived from this plot.
      1. There is a repeating pattern of step-ins and step-outs within a short period of time.
      2. There was a step-out of store number one and a step-in into store number two within just 28 seconds.
  17. 17. CC 2.0 by Aurelien Guichard | http://flic.kr/p/cjg9yw
  18. 18. Links
      1. Presentation, Video and Post Series: http://bit.ly/YgtIMK
      2. http://sentric.ch
      3. http://www.bigdata-usergroup.ch
      4. http://about.me/jpkoenig
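
The metrics on the Analysis Result slides (visits, unique visitors, visit duration per store) can be illustrated with a small Python sketch. This is not the Hive/Impala setup from the talk. It assumes that each access point log line has already been parsed into a record of store, device identifier and timestamp, and that a visit is a run of sightings of the same device in the same store with no gap longer than `max_gap` seconds. The record layout, the names `Sighting`, `Visit`, `sessionize` and `summarize`, the 300-second threshold and the sample data are all hypothetical; the slides do not state how a visit was defined.

```python
from collections import defaultdict
from typing import NamedTuple


class Sighting(NamedTuple):
    store: str   # hypothetical store label, e.g. "store1"
    device: str  # hypothetical device identifier, e.g. a hashed MAC address
    ts: int      # Unix timestamp in seconds


class Visit(NamedTuple):
    store: str
    device: str
    start: int
    end: int


def sessionize(sightings, max_gap=300):
    """Group sightings into visits; a gap above max_gap seconds starts a new visit."""
    by_key = defaultdict(list)
    for s in sightings:
        by_key[(s.store, s.device)].append(s.ts)

    visits = []
    for (store, device), stamps in by_key.items():
        stamps.sort()
        start = prev = stamps[0]
        for ts in stamps[1:]:
            if ts - prev > max_gap:
                # The previous visit ended at the last sighting before the gap.
                visits.append(Visit(store, device, start, prev))
                start = ts
            prev = ts
        visits.append(Visit(store, device, start, prev))
    return visits


def summarize(visits):
    """Return per-store visit count, unique visitors and mean visit duration."""
    grouped = defaultdict(list)
    for v in visits:
        grouped[v.store].append(v)
    return {
        store: {
            "visits": len(vs),
            "unique_visitors": len({v.device for v in vs}),
            "avg_duration_s": sum(v.end - v.start for v in vs) / len(vs),
        }
        for store, vs in grouped.items()
    }


# Made-up sample data, not the figures from the slides.
sample = [
    Sighting("store1", "aa", 0),
    Sighting("store1", "aa", 60),
    Sighting("store1", "aa", 1000),  # gap > 300 s: second visit by "aa"
    Sighting("store1", "bb", 30),
    Sighting("store2", "aa", 1100),
]
summary = summarize(sessionize(sample))
```

With the sample data, `summary["store1"]` reports three visits from two unique visitors. The choice of `max_gap` drives the visit count directly, so a pattern such as the short step-ins and step-outs on slide 16 would merge into one visit or split into many depending on that threshold.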
