From the Big Bang to Ecommerce, a journey in making sense of Big Data

Big Data Business Forum

    From the Big Bang to Ecommerce, a journey in making sense of Big Data: Presentation Transcript

    • How old are you?
    • FROM THE BIG BANG TO ECOMMERCE, A JOURNEY IN MAKING SENSE OF BIG DATA Patrick Deglon Director of Global Traffic Analytics pdeglon@ebay.com linkd.in/pdeglon
    • Agenda: 1. Introduction: CERN & eBay; 2. eBay Infrastructure; 3. Examples of Analysis; 4. Partnership & Trust
    • Image: CERN
    • From 1996 to 2002, I worked at CERN (the European Laboratory for Particle Physics) for my MS and PhD at the University of Geneva. Map labels: Mont Blanc; Geneva, Switzerland; the 17-mile underground tunnel for the LEP & LHC accelerators. Image source: CERN
    • Image: CERN
    • Example of a particle collision
    • Solving the puzzle… which particles go together? 1. AB + CD? 2. AC + BD? 3. AD + BC? (Diagram: particles A, B, C and D around an unknown “?”)
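The three candidate groupings on this slide are all the ways to split four particles into two pairs. The following is a minimal sketch of that enumeration, not the code used at CERN (which the talk says was Fortran); the function name `pairings` is hypothetical.

```python
def pairings(items):
    """Yield every way to split an even-sized list into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    # Pair the first item with each remaining item, then recurse on what is left.
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


candidates = list(pairings(["A", "B", "C", "D"]))
for number, combo in enumerate(candidates, start=1):
    print(number, " + ".join(a + b for a, b in combo))
```

With four particles this yields exactly the three options listed on the slide; the count grows quickly with more particles, which is why combining all possibilities becomes a large-scale computation.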
    • PAW – Physics Analysis Workstation (image source: Wikipedia). Tape robot (image source: CERN). Data collection & analysis was done in Fortran. Advanced analysis/statistics was done through PAW. [1996-2002]
    • Solution: Big Data infrastructure enables large-scale computations such as combining all possibilities (cross-product). Panels: Schematic View; CERN Example (discovery of a new particle bb). Chart labels: Signal (particle resonance); Statistical Noise. Source: http://www.atlas.ch/news/2011/ATLAS-discovers-its-first-new-particle.html
    • Size of the electron? R < 5.1 x 10^-19 m *** (*** Patrick Deglon, Etude de la diffusion Bhabha avec le détecteur L3 au LEP, Th. phys. Genève, 2002; Sc. 3332)
    • Extra dimension? M_S > 1.1 TeV *** Diagram labels: graviton; extra dimension; e+ and e- particles; our universe in 4 dimensions. (*** Patrick Deglon, Etude de la diffusion Bhabha avec le détecteur L3 au LEP, Th. phys. Genève, 2002; Sc. 3332)
    • In 2004, I joined the eBay European HQ in Bern, Switzerland
    • $68 billion in merchandise traded in 2011 ... or $1.3 million every 10 minutes
    • eBay: The World's Online Marketplace®. Every 26 min. a Ford Mustang is sold; every 2 min. a major appliance is sold; every 4 sec. a pair of shoes is sold
    • CERN vs eBay
        – CERN: Write kilometers-long Fortran code. eBay: Write miles-long SQL code.
        – CERN: Analysis can run for many hours… before a batch robot error. eBay: Queries can run for many hours… before a spool space error.
        – CERN: Study billions of collision data. eBay: Study billions of transactional data.
        – Both: Great depth of data structure & complexity.
        – Both: Know your local expert for questions, but try to find the solution by yourself… much quicker.
        – CERN: Remove “bad runs” (unclean data batch). eBay: Remove “wackos” (non-material transactions).
        – Both: Transform a complex system into insights.
        – CERN: Communicate findings to conferences. eBay: Communicate recommendations to business reviews.
        – CERN: Strong competitive landscape (4 distinct experiments competing to be the first to publish, or to publish better results). eBay: Strong competitive landscape.
    • Analytics at eBay. Chart labels: “CIO”, “CDO”, “CAO”, “CMO”; Analytics Platforms & Delivery (APD); Analytics; Marketing; Technology; Finance; Business Units; End Users of Big Data
    • What my friends think I do; What my mum thinks I do; What the BU thinks I do; What I think I do; What the BU wants me to do; What I really do. Source: Pierre Donzier
    • EBAY INFRASTRUCTURE (agenda: 1. Introduction: CERN & eBay; 2. eBay Infrastructure; 3. Examples of Analysis; 4. Partnership & Trust)
    • Core Analytics Data Platform (diagram)
        – Data Access, from Business Centric to Technology Centric: DataHub; MS Excel; Tableau; SAS/R; OBIEE; MicroStrategy; SOA/DAL; Purpose Built Apps; SQL
        – Usage, from Analyze & Report to Discover & Explore
        – Systems: EDW (enterprise-class system); “Singularity” (low-end enterprise-class system); Hadoop clusters (commodity hardware system)
        – System labels, in slide order: Teradata 55xx and 66xx Series; Relational Data; Dual System; 10+ PB; Semi Structured & Relational Data; Deep Storage; Unstructured Data; Pattern Detection; Deep Storage; 40+ PB; 40+ PB
        – Data Integration: Ab Initio; Informatica; Golden Gate; UC4; BES; MapReduce
    • DW Sandbox enables agile analytics. Analytics teams have access to sandboxes within eBay Teradata data warehouses (~100 GB per sandbox). (Diagram labels: analyst’s sandbox; Teradata Data Warehouse)
        – Keeps the “Single Point of Truth” philosophy
        – Improved Time To Market: Days / Weeks vs Months
        – Enables the business to do agile prototyping
        – Enables the users to “Fail Fast”: makes it easy to try out new ideas
        – Eliminates isolated Data Marts
    • SO… WHERE DO WE GO FROM HERE? (agenda: 1. Intro: CERN & eBay; 2. eBay Infrastructure; 3. Examples of Analysis; 4. Partnership & Trust)
    • Measuring impact of initiatives
        – Chart: A/B test (illustrative example, simulation): test group vs control group from Aug 1st to Oct 1st, with “Initiative launched” and “Impact of the initiative” markers
        – Chart: Pre/Post analysis (illustrative example, simulation): 2011 vs 2012, pre and post periods, points A, B, C and D, with “Initiative launched” and “Impact of the initiative” markers
        – Axis labels across the two charts: Number of purchases; Number of listings
        – Randomized Test/Control group methodology is a gold standard in research
        – Used to measure the impact of an initiative in a full market or a market segment
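As a hedged illustration of the test/control comparison on this slide, the sketch below computes the difference in purchase rate between a test group and a control group, plus a two-proportion z statistic. The slide does not say which statistic was used at eBay; the function name `ab_lift` and the sample numbers are assumptions made for illustration only.

```python
import math


def ab_lift(test_buyers, test_users, control_buyers, control_users):
    """Return (absolute lift in purchase rate, two-proportion z statistic)."""
    rate_test = test_buyers / test_users
    rate_control = control_buyers / control_users
    lift = rate_test - rate_control
    # Pooled rate under the hypothesis that the initiative has no effect.
    pooled = (test_buyers + control_buyers) / (test_users + control_users)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / test_users + 1 / control_users))
    return lift, lift / std_error


# Hypothetical numbers, not taken from the slide.
lift, z = ab_lift(test_buyers=450, test_users=10_000,
                  control_buyers=400, control_users=10_000)
print(f"lift = {lift:.4f}, z = {z:.2f}")
```

A z value well above roughly 2 would suggest the measured impact is unlikely to be noise; the randomization is what lets the difference be read as the impact of the initiative.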
    • Marketing 101 (decision diagram). Branches: “Don’t Do Marketing” vs “Do Marketing”; outcomes: Purchase / No Purchase. Labels: Cost; Direct Return; Incremental Return; markers L, C and D; the incremental outcomes are marked “?”
    • Medici Effect
        – New ideas proliferate when professional or cultural fields collide. That’s the “Medici Effect.”
        – During the Renaissance, the Medici family enabled such collisions by funding various fields and facilitating interdisciplinary creativity.
        – Images: House of Medici; Michelangelo
    • Remember this physics problem? 1. AB + CD? 2. AC + BD? 3. AD + BC? (Diagram: particles A, B, C and D around an unknown “?”)
    • Solution: Big Data infrastructure enables large-scale computations such as combining all possibilities (cross-product). Panels: Schematic View; CERN Example (discovery of a new particle bb). Chart labels: Signal (particle resonance); Statistical Noise. Combining correlated and uncorrelated events produces a distribution made of statistical noise (which is simple enough to extract) plus the signal being searched for. Source: http://www.atlas.ch/news/2011/ATLAS-discovers-its-first-new-particle.html
    • Big Data technologies enable the full Cartesian product of Marketing actions & Revenue-generating events. Chart: Clicks – Conversion Playground; axis: Marketing Events (Clicks or Impressions)
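To make the Cartesian-product idea concrete, here is a minimal sketch that pairs every marketing click with every purchase and keeps the delay between them. The data, the tuple layout and the 120-second window are hypothetical; at eBay's scale this would run on the warehouse or Hadoop rather than in memory.

```python
from itertools import product

# Hypothetical events: (user id, timestamp in seconds).
clicks = [("u1", 100), ("u1", 400), ("u2", 250)]
purchases = [("u1", 460), ("u2", 1000)]


def click_purchase_pairs(clicks, purchases):
    """Pair every click with every purchase; return (click user, purchase user, delay)."""
    return [
        (click_user, purchase_user, purchase_time - click_time)
        for (click_user, click_time), (purchase_user, purchase_time)
        in product(clicks, purchases)
    ]


pairs = click_purchase_pairs(clicks, purchases)

# Correlated pairs (same user, purchase shortly after the click) form the signal;
# every other combination contributes to the statistical noise.
signal = [delay for click_user, purchase_user, delay in pairs
          if click_user == purchase_user and 0 <= delay <= 120]
print(len(pairs), signal)
```

As with the particle example, the uncorrelated combinations give a smooth background, and the click-to-purchase relationship shows up as an excess on top of it.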
    • Alternative way to understand customer behavior & incrementality: geographic experimentation. Chart: Revenues / Cost, shown as 3-period moving averages for Groups 1 to 4 across the Baseline, Phase 1 and Phase 2 periods
    • CREATING VALUE THROUGH THE ORGANIZATION (agenda: 1. Introduction: CERN & eBay; 2. eBay Infrastructure; 3. Examples of Analysis; 4. Partnership & Trust)
    • Analytics as a function?
        – Embedded Model: “I’m following my BU leader, but can’t get promoted”
        – Functional Model: “I’m a partner of business execution”
        – Takeaway: we need to track the satisfaction/loyalty/trust of our partnership
    • Net Promoter Score (NPS): How likely is it that you will recommend [Brand Name] to a friend or a colleague? Answers range from 0 (very unlikely) to 10 (very likely): Detractors (0 to 6), Passives (7 or 8), Promoters (9 or 10). NPS = % Promoters - % Detractors
    • The logic behind NPS. To improve NPS, a company needs to work on 2 fronts:
        – Move Detractors into Passives (i.e. fix the holes, i.e. no more unacceptable bad experiences)
        – Move Passives into Promoters (i.e. improve the whole experience, best-in-class buyer experience)
    • Side note: Error on NPS measurement
        – NPS follows a multinomial distribution with: p the probability to answer 0 to 6; q the probability to answer 7 or 8; r the probability to answer 9 or 10; N the number of answers
        – The expected value of the Net Promoter Score is then E(NPS) = r - p
        – The variance is then V(NPS) = V(r - p) = V(r) + V(p) - 2 Cov(r, p) = r (1 - r) / N + p (1 - p) / N + 2 r p / N, since Cov(r, p) = - r p / N for multinomial proportions
        – Hence the error on NPS, i.e. the standard deviation, is σ(NPS) = SQRT [ r (1 - r) / N + p (1 - p) / N + 2 r p / N ]
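The formula on this slide translates directly into a few lines of code. This is a minimal sketch, assuming survey answers arrive as a list of integers from 0 to 10; the function name `nps_with_error` and the sample answers are hypothetical.

```python
import math


def nps_with_error(scores):
    """Return (NPS, standard deviation of NPS) for a list of 0-10 answers."""
    n = len(scores)
    p = sum(1 for s in scores if s <= 6) / n   # share of Detractors (0 to 6)
    r = sum(1 for s in scores if s >= 9) / n   # share of Promoters (9 or 10)
    nps = r - p
    # V(NPS) = r(1-r)/N + p(1-p)/N + 2rp/N, as on the slide.
    variance = (r * (1 - r) + p * (1 - p) + 2 * r * p) / n
    return nps, math.sqrt(variance)


# Hypothetical survey: 50 promoters, 30 passives, 20 detractors.
answers = [10] * 50 + [8] * 30 + [3] * 20
nps, error = nps_with_error(answers)
print(f"NPS = {nps:+.0%} ± {error:.1%}")
```

With only 100 answers the error is already close to 8 points, which is why a low N makes an NPS reading directional rather than precise.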
    • NPS is a measurement of Loyalty in a free environment. In a paid environment, it’s more a measurement of Trust between co-workers/partners. Net Promoter Score question: How likely is it that you would recommend working with Analyst XXX to a friend or colleague? (scale from 0 to 10)
    • eNPS Survey
        – Team eNPS Survey: identify how to better work together as a team
        – Partner eNPS Survey: identify opportunities to better partner with the business
        – Enables a directional assessment of eNPS, keeping in mind the biases: low N, subjective question, unlikely to promote an unknown entity, partner <> client (i.e. Finance vs Agency)
        – Now that we have a measurement, how to improve it?
    • What is Trust? How to improve it? Trust = Credibility × Reliability × Intimacy × Unselfishness (http://www.collieassociates.com/common/Trust_Equation.pdf)
        – Credibility: Words: convincing & believable
        – Reliability: Actions: consistently good in quality & performance
        – Intimacy: Emotions: feel comfortable talking to you about the sensitive, personal issues connected to the surface issue
        – Unselfishness: Motives: know that you care about serving higher interests
    • Build Trust: Trust Equation. Trust = R × C × I × U. Each row gives the Trust Component, the eBay Success Factor, the Insights Discovery® Color and the Hartman Personality Profile:
        – Reliability (Actions = consistently good in quality & performance): Lead completely; Fiery RED “Do it now!”; RED, Power Wielders
        – Credibility (Words = convincing & believable): Practice judgment; Cool BLUE “Do it right!”; BLUE, The Do-gooders
        – Intimacy (Emotions = feel comfortable talking to you about the sensitive/personal issues connected to the surface issue): Keep it human; Earth GREEN “Do it harmoniously!”; WHITE, The Peacekeepers
        – Unselfishness (Motives = know that you care about serving our higher interests): Trust each other; Sunshine YELLOW “Do it together!”; YELLOW, The Fun Lovers
        – Image: Carl Jung, Swiss psychologist
    • Example of an internal partners survey on the Trust foundation. Instructions: Please complete each of the following statements using the rating guide. Try to provide a rating for every statement and be honest with your feedback. Rating guide: Weak in this area=1, Some concerns=2, A minor shortfall=3, Competent=4, Better than competent=5, Outstanding=6
        – Reliability (4.9): Translates ideas and concepts into action: 4.9; Turns around requests effectively: 5.0; Is comfortable with change: 5.0; Is adept at prioritizing tasks: 4.9
        – Credibility (5.3): Does what one says one will do: 5.2; Tells the truth: 5.6; Is genuine in saying ‘Thank you’ or ‘I don’t know’: 5.5; Is comfortable saying ‘no’ at the beginning rather than being unable to deliver in the end: 5.0
        – Intimacy (5.2): Creates an environment to address potential conflicts openly: 5.0; Seeks help when facing difficulties: 5.3; Has an appropriate sense of humor: 5.3; Responds to and understands the feelings/needs of others: 5.4
        – Unselfishness (5.3): Uses ‘we’ rather than ‘they’ or ‘I’: 5.2; Makes time for others: 5.4; Supports ideas for innovation from others: 5.3; Trusts others to make decisions and get things done for them: 5.2
    • Trust Equation assessment by the team and our partners. Scatter plot: Partner average answer vs Team average answer (both axes from 60 to 90), split into an under-confidence zone and an over-confidence zone. Points: Intimacy, Keep It Human; Credibility, Meets Quality; Non Political, Unselfishness; Reliability, Meets Deadline
    • Reliability: Value of an Analysis. Keep It Simple & Stupid. Chart: Total Cost, Direct Return and Net Return (Profit) against Complexity of Analytics, with markers for the Optimal level of complexity, the Preferred analyst’s level of complexity and the Individual Limit
    • Credibility: Principle Of Least Surprise (POLS). Don’t surprise executives & partners with new metrics, new definitions, new formats or anything new… without a proper business reason. Set up Insights & Recommendations in a natural, logical, global & agreed-upon framework.
    • Credibility: Fixed Standard… or Flexible Chaos? On one side, Standardized Global Metrics; on the other, store anything to enable measuring any metric to answer any question. Chaos enables flexibility, but requires a strong process to maintain credibility
    • (Business) Intimacy
        – Keep It Human: meet people, talk to people, walk to their desk, pick up the phone
        – Seek help when needed
        – Have a good sense of humor: “It’s just a website…”
        – Create an environment where people can open up and discuss the underlying issue
        – Respond to the needs/feelings of others
        – CONNECT with people (Avatar’s “I see you”)
    • Unselfishness
        – Don’t work in silos
        – Consider “we” rather than “I” or “they”
        – Support ideas for innovation from others (improv’s “yes, and…”)
        – Trust others to make the right decision, and live with it
        – Be AVAILABLE: make time for others
    • Wrapping Up
        – How complexity can spark innovation, but also kill effectiveness: Medici principle; KISS; Managing chaos
        – Why an embedded or client-centric Analytics organization is not necessarily a great idea: Enable a career path within an Analytics organization; Partner vs Client; eNPS: keep the pulse on internal-client/partner satisfaction
        – Why analyst creativity is antagonistic to executive reporting: Trust pillars (Reliability, Credibility, Intimacy, Unselfishness); POLS
    • Q&A
    • FROM THE BIG BANG TO ECOMMERCE, A JOURNEY IN MAKING SENSE OF BIG DATA Patrick Deglon Director of Global Traffic Analytics pdeglon@ebay.com linkd.in/pdeglon
    • Credibility: Key Phases of an Analytics Project. Diagram labels, in slide order: Move the Business; Follow-up / Implementation; Readout; Executive Summary; Scoping; Hypothesis to be verified; Scoping the question; Measurement set up; Measuring; Query; Data check; Guiding the Business; Story Line / Deck; Driving Insights; Facts / Slides; Review hypothesis; Data manipulation; Interpretation; Statistics; Graphs
    • James, 32, lives in Pittsburgh, married, 1 child, Electronics Enthusiast. Chart: Loyalty Level (i.e. likelihood to purchase on eBay) over Time
        – Touch points: Site Visit; Site Visit; YouTube Display Click; Site Visit; Offline Store Visit; Google Search on “Digital Camera”, click on eBay PS Ad; Google Search on “eBay Digital Camera”, click on NS link; Purchase
        – Thoughts along the way: “Whoa… They really have nice deals on eBay”; “That’s really expensive in a store”; “Ah… yes, eBay was a good idea, what do they have?”; “Let’s get that camera now”
    • Marketing Attribution Logic. Touch points: YouTube Display Impression; Google Search on “Digital Camera”, click on eBay PS Ad; Google Search on “eBay Digital Camera”, click on NS link; then Purchase ($). How does the purchase correlate to the customer touch points? How “close”/“distant” are the clicks & the purchase? Which one is the most important?
    • What is more important: the front wheel or the back wheel?
    • Marketing Attribution Management. Touch points: YouTube Display Impression; Google Search on “Digital Camera”, click on eBay PS Ad; Google Search on “eBay Digital Camera”, click on NS link; then Purchase. Define the correlation (“distance”) between customer touch points and the purchase, and the likelihood that it happens:
        – Distance in time. Latency: time between click and ROI event (2 minutes? 2 hours? 2 days?)
        – Distance in KW space. Relevancy: difference between Search keyword and Item purchased (KW-Title relevancy, KW-Vertical relevancy)
        – Distance in Mindset. Loyalty: mindset of the customer, i.e. RFM segment (Reactivation or Top Buyer)
        – …
    • Marketing Attribution Management: share of the Purchase credited to each touch point
        – Touch points, in order: (1) YouTube Display Impression; (2) Google Search on “Digital Camera”, click on eBay PS Ad; (3) Google Search on “eBay Digital Camera”, click on NS link
        – Last Click: 100% to touch point 3
        – First Click: 100% to touch point 1
        – All Clicks: 33% / 33% / 33%
        – Model: 60% / 35% / 5%
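The three rule-based schemes on this slide (Last Click, First Click, All Clicks) can be sketched in a few lines. The function name `attribute` and the shortened touch point labels are hypothetical, and the fourth scheme ("Model", 60% / 35% / 5%) is not reproduced because the slide does not say how those weights are derived.

```python
def attribute(touchpoints, rule):
    """Return (touch point, share of purchase credit) pairs for a simple rule."""
    n = len(touchpoints)
    if n == 0:
        return []
    if rule == "last_click":
        weights = [0.0] * (n - 1) + [1.0]
    elif rule == "first_click":
        weights = [1.0] + [0.0] * (n - 1)
    elif rule == "all_clicks":
        weights = [1.0 / n] * n   # equal credit to every touch point
    else:
        raise ValueError(f"unknown rule: {rule}")
    return list(zip(touchpoints, weights))


journey = ["YouTube display impression", "Google PS ad click", "Google NS link click"]
for rule in ("last_click", "first_click", "all_clicks"):
    print(rule, attribute(journey, rule))
```

Multiplying each share by the purchase value gives the revenue credited to each channel, which is what drives the diverging ROI readings on the next slide.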
    • … So what?
        – Last Click: Channel A: GMB 8%, ROI +20%; Channel B: GMB 5%, ROI -10%; Channel C: GMB 1%, ROI +10%. Conclusions: reduce spend on channel B; invest in channel A; when prioritizing, ignore channel C
        – All Clicks Model: Channel A: GMB 7%, ROI -20%; Channel B: GMB 6%, ROI +30%; Channel C: GMB 12%, ROI +60%. Conclusions: reduce spend on channel A; invest heavily in channel C; marketing actually accounts for 25% of the site
        – Last Click <> All Clicks Model: the two attributions lead to different conclusions
    • Example of the International Weekly Variance Infrastructure (2007). Diagram labels: Core DW database; Automated SQL; Excel inputs; PET*; Modular Back-end; single pivot table; Flexible Front-end; PPT & Excel report; PDF print-out. * PET is a small database inside the Teradata Data Warehouse for building prototypes.
    • Example of Automated Quarterly Market Review deck (2007). PowerPoint chart object with a “SQL” field containing an EXEC MACRO to refresh the data content of the chart; linked to an Excel file that can be refreshed when needed. PowerPoint table object with a “SQL” field containing an EXEC MACRO to refresh the table content
    • PowerPoint Reporting Tool (2012). Toolbar actions: Login to DW; Add a “SQL” tag to objects (table or chart) and edit the SQL; Update the content of the selected objects (table or chart); Update the content of all objects in the PowerPoint; Create a dummy chart
    • Example of BI report using Tableau