A More Transparent Interpretation of Health Club Surveys (YMCA)
 

Presentation Transcript

  • A More Transparent Interpretation of Health Club Surveys
    Salford Analytics and Data Mining Conference 2012, San Diego, CA, May 24, 2012
    Dean Abbott, Abbott Analytics, Inc.
    URL: http://www.abbottanalytics.com
    Blog: http://abbottanalytics.blogspot.com
    Twitter: @deanabb
    Copyright © 2004-2012, Abbott Analytics, Inc. and Seer Analytics, Inc. All rights reserved.
  • About Seer Analytics
    • Seer Analytics, LLC
      – Founded in 2001, based in Tampa, FL
      – Produces actionable intelligence to help clients make smarter decisions and drive business performance
      – Reports embed sophisticated analytics yet are designed to be accessible and meaningful to a non-technical audience
    • Bill Lazarus, Founder, President and CEO
      – BA from University of Wisconsin
      – MA from University of Toronto
      – SM and PhD from Massachusetts Institute of Technology
  • About Abbott Analytics
    • Abbott Analytics
      – Founded in 1999, based in San Diego, CA
      – Dedicated to data mining consulting and training
    • Principal: Dean Abbott
      – Applied Data Mining for 22+ years in
        • Direct Marketing, CRM, Survey Analysis, Tax Compliance, Fraud Detection, Predictive Toxicology, Biological Risk Assessment
      – Course Instruction
        • Public 1-, 2-, and 3-day Data Mining Courses
        • Conference Tutorials and Workshops (next: ACM Data Mining Bootcamp, November 13th in San Francisco)
      – Customized Training and Knowledge Transfer
        • Data mining methodology (CRISP-DM)
        • Training services for software products, including CART, Clementine, Affinium Model, Insightful Miner
  • Talk Outline
    • Health Club Survey Analysis Problem Description
      – Overview of Survey Analysis and Approaches
    • Solution 1: traditional approach
      – Statistical approach, fancy visualization
    • Solution 2: solution aligning models to business objectives
    • Results and conclusions
  • Problem Setup: Member Survey
    • Question:
      – What are the characteristics of members who indicated the highest overall satisfaction with their Y?
    • Data:
      – 32,811 records containing survey answers
      – No demographic data except what was on survey (marital status, children, age, gender)
    • Approach:
      – Create supervised learning models with target variable “overall_satisfaction = 1”
  • Some Notes
    • It is very unusual to have so many records
      – The 31K responses were for one year
      – Responses are collected from across the country
    • Seer tracks survey responses longitudinally as well (not discussed in this talk)
      – Began collecting survey responses and storing in a database in 2001 => 10 years of data
      – Seer has moved beyond modeling satisfaction to include a more complete view of the YMCA member experience
  • Data Preparation
    • Begin with 57 candidate inputs to model
      – All survey questions are multiple choice
        • Treated as categories, not numbers
        • Typically 6 categories per question (1-5, plus unknown)
        • Unknown initially coded as “0”
      – No text comment fields included as inputs to model
    • Create new column for target variable
      – If overall_satisfaction = 1, variable value = 1; otherwise, variable value = 0
    • Data very clean with respect to NULLs
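
The target-variable step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample rows, the `add_target` and `as_category` helper names, and the `cat_` labels are hypothetical and are not taken from the talk.

```python
# Hypothetical survey rows: each answer is a category code 1-5, with 0 for unknown.
rows = [
    {"overall_satisfaction": 1, "Q25": 1},
    {"overall_satisfaction": 3, "Q25": 2},
    {"overall_satisfaction": 0, "Q25": 0},  # unknown answers are coded as 0
]


def add_target(row, source="overall_satisfaction", target="target"):
    """Return a copy of the row with a 0/1 target column added."""
    out = dict(row)
    out[target] = 1 if row[source] == 1 else 0
    return out


def as_category(code):
    """Treat an answer code as a category label, not a number (labels are illustrative)."""
    return "unknown" if code == 0 else f"cat_{code}"


prepared = [add_target(r) for r in rows]
targets = [r["target"] for r in prepared]
```

Keeping the answers as labels rather than numbers means a model cannot assume that the gap between codes 1 and 2 equals the gap between 4 and 5.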
  • Member Survey Question Categories
  • Sampling and Target Populations
    • Begin with 32,811 responses
    • Set aside about half for validation (not used during modeling): 16,379 records
      – These records will be used to provide final summaries of the segments
    • Q1 - Satisfaction = 1: 31%
      – 86% have Recommend to Friend = 1
    • Q48 - Recommend to Friend = 1: 54%
      – 49% have Overall Satisfaction = 1
      – 26.0% have both Overall Satisfaction and Recommend to Friend equal to 1
    • Q32 - Likelihood to Renew = 1: 46%
    • Implications
      – All three are interesting, but Recommend is so high already that there is not much room for growth
  • Objective and Data Challenges
    • Project Objective
      – Interpret results of survey for YMCA
    • Challenges
      – Missing data (some questions either N/A or blank)
        • Solution: Impute values that least affect the information communicated by the question (not a mean or median!)
      – Question responses highly correlated with one another
        • Multi-collinearity makes interpretation of results problematic
        • Must reduce dimensionality without losing interpretation of results
        • Solution: Factor analysis
  • Objective and Data Challenges
    • Challenges, cont’d
      – Target variable
        • Three questions pointed to the important actionable information (related to how satisfied members were)
        • No one question fully characterized the value of a member
        • Solution: combine all three into a new “index of excellence” (IOE)
          – IOE = additive weighted sum of Q1, Q32, Q48
          – Reverse scale so higher IOE is better
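
A minimal sketch of the IOE construction described above, assuming answers on a 1-5 scale where 1 is the best response. The talk does not give the weights, so the equal weights below are placeholders, and the `index_of_excellence` name is hypothetical.

```python
# Placeholder weights: the talk says "weighted sum" but does not state the weights.
WEIGHTS = {"Q1": 1.0, "Q32": 1.0, "Q48": 1.0}
SCALE_MAX = 5  # assumption: answers run from 1 (best) to 5 (worst)


def index_of_excellence(answers, weights=WEIGHTS):
    """Weighted sum of reversed answers, so a higher index is better."""
    total = 0.0
    for question, weight in weights.items():
        reversed_score = (SCALE_MAX + 1) - answers[question]  # 1 -> 5, 5 -> 1
        total += weight * reversed_score
    return total


best = index_of_excellence({"Q1": 1, "Q32": 1, "Q48": 1})
worst = index_of_excellence({"Q1": 5, "Q32": 5, "Q48": 5})
```

With equal weights the index runs from 3.0 (all answers 5) to 15.0 (all answers 1). Unknown answers (code 0) would need to be imputed or dropped first; this sketch does not handle them.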
  • Why Factor Analysis?
    • “Traditional approach to survey analysis involves the use of frequency counts, t-test, correlation, and measures of central tendency.”
    • “Factor analysis is a variable-reduction statistical technique capable of probing underlying relationships in variables”
      – Santos, J.R.A., Clegg, M.D. (1999), "Factor analysis adds new dimension to extension surveys", Journal of Extension, http://www.joe.org/joe/1999october/rb6.php
    • Our use of Factor Analysis
      – Traditional view: there is an underlying “truth” that exists, and the survey is a redundant measure of that truth.
      – Our use: just a derived variable that reduces dimensionality
  • Factor Analysis: Key Factors
    Chart, Factor 1: top question loadings for Q2 through Q12 (loading value axis 0.00 to 1.00)
    Chart, Factor 2: top question loadings for Q12 through Q20 and Q23 (loading value axis 0.00 to 0.80)
  • Member Survey Factor Analysis Loadings
    Factor descriptions: Factor1 = Staff Cares; Factor2 = Facilities clean/safe; Factor4 = Registration; Factor6 = Friendly / Competent Staff; Factor7 = Financial Assistance; Factor8 = Parking. Factor3 and Factor5 are the two equipment factors (“Condition of Equipment” and “Specific Equipment”).

    Question  Factor1  Factor2  Factor3  Factor4  Factor5  Factor6  Factor7  Factor8
    Q2          0.295    0.238    0.115    0.458    0.054    0.380  (0.016)    0.095
    Q3          0.217    0.143    0.093    0.708    0.094    0.077    0.033    0.048
    Q4          0.298    0.174    0.106    0.601    0.068    0.266    0.002    0.062
    Q5          0.442    0.198    0.087    0.173    0.025    0.613  (0.021)    0.053
    Q6          0.417    0.254    0.142    0.318    0.044    0.584  (0.008)    0.058
    Q7          0.406    0.277    0.167    0.252    0.045    0.461    0.003    0.092
    Q8          0.774    0.058    0.041    0.093    0.052    0.113    0.036    0.061
    Q9          0.733    0.175    0.108    0.145    0.052    0.260    0.024    0.052
    Q10         0.786    0.139    0.079    0.110    0.060    0.218    0.029    0.046
    Q11         0.765    0.120    0.101    0.132    0.089    0.015    0.038    0.047
    Q12         0.776    0.090    0.049    0.087    0.041    0.014    0.042    0.053
    Q13         0.145    0.728    0.174    0.112    0.106    0.110    0.006    0.018
    Q14         0.191    0.683    0.163    0.151    0.053    0.124    0.013    0.089
    Q15         0.102    0.598    0.141    0.090    0.162    0.070    0.029    0.152
    Q16         0.100    0.370    0.133    0.082    0.028    0.035    0.009    0.843
    Q17         0.128    0.567    0.229    0.102    0.116    0.080    0.018    0.224
    Q18         0.148    0.449    0.562    0.116    0.132    0.114    0.010    0.042
    Q19         0.129    0.315    0.811    0.101    0.102    0.103    0.002    0.063
    Q20         0.171    0.250    0.702    0.086    0.145    0.078    0.016    0.149
    Q23         0.271    0.220    0.188    0.316    0.121    0.046    0.069    0.019
    Q24         0.363    0.165    0.128    0.140    0.080    0.095    0.035    0.076
  • Reduce Variables using Regression
    • Already beginning with only 13 variables
    • Question: how many of these are useful predictors?
    • Decided to retain 5 factors for final model
    Chart: Regression Rankings of Questions/Factors (Regression Coefficient, 0 to 0.6, by Question/Factor)
  • Predictive Modeling Approach (flow diagram)
    • 50+ survey questions
    • Identify key questions: 3 questions with high association with target (3 key questions)
    • Factor analysis: 10 factors, or the 10 variables that loaded highest on each factor
    • 13 fields
    • Regression model: find significant variables (down to 7)
    • Regression model: find significant variables (variable ranks)
  • One Further Note on Final Regression Models
    • Empirical comparison: factors as inputs vs. top-loading question in factor as input
      – Top-loading or most interesting question on factor as representative of that factor produced slightly better models
      – Use of top-loading question makes final model more easily understood
      – This flies in the face of traditional theory, but worked better operationally
    • Final regression model contained these fields:
  • Key: Explaining Results
    • Visualization shows key variables in survey associated with “excellence”, and performance metrics for each Y
      – How well did this Y do?
      – What is the change over last year’s result?
      – This is a 45-dimensional visualization (don’t ask me to name them all!)
    • Shows which attributes the Y needs to improve to improve customer satisfaction
    Chart: Drivers of Satisfaction, Current Year vs. Last Year; labels: Staff 1, Staff 2, relationships, goals, equipment, value, facility; Prior year
  • Interpreting Index/Drivers of Excellence Analysis
    • All factors listed are important
    • Position on ‘x’ axis indicates relative importance of factors in driving IOE
    • Above the line = “better than peers”; below the line = “worse than peers”
    • Small dot indicates position last year
    • R-Y-G indicates magnitude of change from previous year
    • Size of bubble indicates magnitude of score on factor
  • Drivers of Excellence, May 2003: Gardena YMCA, 2003 vs. 2002 (chart)
    Axes: Relative Performance vs. Importance, with Peer Average line
    Legend: Better / Same / Worse; Prior year (2002)
    Drivers plotted: Staff cares, Meet fitness goals, Staff competence, Feel welcome, Value, Equipment, Facilities
    © 2003 Seer Analytics, LLC, Tampa, FL 33602
  • Drivers of Excellence, May 2003: Montebello YMCA, 2003 vs. 2002 (chart)
    Axes: Relative Performance vs. Importance, with Peer Average line
    Legend: Better / Same / Worse; Prior year (2002)
    Drivers plotted: Equipment, Staff cares, Facilities, Feel welcome, Staff competence, Meet fitness goals, Value
  • Drivers of Excellence, May 2003: Torrance South YMCA, 2003 vs. 2002 (chart)
    Axes: Relative Performance vs. Importance, with Peer Average line
    Legend: Better / Same / Worse; Prior year (2002)
    Drivers plotted: Equipment, Staff competence, Feel welcome, Value, Staff cares, Facilities, Meet fitness goals
  • Drivers of Excellence, May 2003: Culver YMCA, 2003 vs. 2002 (chart)
    I love this visualization!
    Axes: Relative Performance vs. Importance, with Peer Average line
    Legend: Better / Same / Worse; Prior year (2002)
    Drivers plotted: Feel welcome, Staff cares, Value, Facilities, Staff competence, Meet fitness goals, Equipment
  • What’s the Problem with That?
    • Customer was not interested in “techno” solutions
    • Customer was interested in what actions could be taken as a result of the data mining models
      – Which characteristics are most correlated with best customers?
        • What do they like and dislike about the Y?
        • Is it equipment? relationships? facility? staff?
      – Show key contributors, how each Y compared with other Y locations, and if Y is improving
  • So What’s the Problem with That? (cont’d)
    • Regression and Neural Networks are “global” estimators
      – They operate over the entire data space
      – Descriptors of Regression represent average influence
      – Neither technique provides explicit localized characteristics
    • Customer would like actionable analytics
      – Clear characteristics of subgroups
      – Different strategies for subgroups
    • Conclusion: In Round 2, use another approach
  • Who cares about “satisfaction”?
    • Issue: The YMCA is a cause-driven charity
      – It’s not about running “satisfactory” gyms
      – It’s about improving lives and building communities
    • Question: How can the member survey data help Ys achieve mission goals?
    • Answer: Develop a tool that is:
      – Grounded in solid social science
      – Accessible/understandable
      – Diagnostic/predictive
      – A driver of performance and change
  • Satisfaction Model Performance
    Gains chart: % Class (0 to 100) vs. % Population (0 to 100)

    Misclassification for Learn Data
    Class   N Cases   N Misclassed   Pct Error   Cost
    0        33,220          8,178       24.62   0.25
    1        14,845          2,622       17.66   0.18

    Node   Cases        % of Node    %            Cum %        Cum %    %        Cases     Cum     Lift
           Tgt. Class   Tgt. Class   Tgt. Class   Tgt. Class   Pop      Pop      in Node   Lift    Pop
    1           7,289       72.788       49.101       49.101   20.834   20.834    10,014   2.357   2.357
    9             904       51.984        6.09        55.19    24.452    3.618     1,739   2.257   1.683
    2           2,317       50.612       15.608       70.798   33.977    9.525     4,578   2.084   1.639
    3             471       46.45         3.173       73.971   36.087    2.11      1,014   2.05    1.504
    10            431       43.186        2.903       76.874   38.163    2.076       998   2.014   1.398
    4             349       40.819        2.351       79.225   39.942    1.779       855   1.984   1.322
    12            462       38.404        3.112       82.337   42.445    2.503     1,203   1.94    1.243
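
The lift columns in the gains table can be reproduced from the node counts alone. The sketch below assumes the usual definition (a node's share of target-class cases divided by its share of all cases) and uses the first three rows of the table; the `gains` function name and the tuple layout are illustrative only.

```python
# (node id, target-class cases in node, all cases in node), first three table rows.
NODES = [
    (1, 7289, 10014),
    (9, 904, 1739),
    (2, 2317, 4578),
]
TOTAL_TARGET = 14845          # class 1 cases in the learn data
TOTAL_CASES = 33220 + 14845   # class 0 plus class 1 cases


def gains(nodes, total_cases, total_target):
    """Return (node id, lift, cumulative lift) for nodes in the given order."""
    rows = []
    cum_target = 0
    cum_cases = 0
    for node_id, target_cases, node_cases in nodes:
        cum_target += target_cases
        cum_cases += node_cases
        lift = (target_cases / total_target) / (node_cases / total_cases)
        cum_lift = (cum_target / total_target) / (cum_cases / total_cases)
        rows.append((node_id, lift, cum_lift))
    return rows


table = gains(NODES, TOTAL_CASES, TOTAL_TARGET)
```

For node 1 this gives a lift of about 2.357, matching the table: the node holds roughly 49% of the highly satisfied members but only about 21% of all respondents.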
  • Satisfaction Model from CART®
    • Q25: Feel Welcome
      – Surrogate: Q24 (can relate to other members)
    • Q13: Facilities are clean
      – Surrogate: Q14 (facilities safe and secure)
    • Q22: Value for Money
      – Surrogates: Q21 (convenient schedule) and Q23 (quality classes/programs)
    • Q6: Staff Competent
      – Surrogates: Q5 (friendly staff) and Q7 (enough staff)
    Tree diagram: splits on Q25, Q13, Q22, Q6, Q2, Q31, Q19, Q36, Q34; terminal nodes 1, 2, 3, 8, 9, 10, 15
  • Member Satisfaction Model: Key Rules
    • Terminal Node 1
      – 10,014 surveys (20.8%)
      – 7,289 highly satisfied (72.8%)
      – 49% of all highly satisfied
      – RULE: If strongly agree that facilities are clean and strongly agree that member feels welcome, then highly satisfied
    • Terminal Node 2
      – 4,578 surveys (9.5%)
      – 2,317 highly satisfied (50.6%)
      – 15.6% of all highly satisfied
      – RULE: If strongly agree that feel welcome and strongly agree Y is value for money, even if don’t strongly agree facilities are clean, then highly satisfied
    • Terminal Node 3
      – 1,739 surveys (3.6%)
      – 904 highly satisfied (52.0%)
      – 6.1% of all highly satisfied
      – RULE: If strongly agree that Y has the right equipment and strongly agree that feel welcome, and somewhat agree that facilities are clean, even though don’t strongly feel Y is good value for the money, then highly satisfied
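
Rules like these translate directly into code, which is part of what makes tree output actionable. The sketch below encodes only the Terminal Node 1 and Terminal Node 2 rules as stated on the slide. It assumes that “strongly agree” is answer code 1, and the function name is hypothetical; it is not the CART model itself.

```python
STRONGLY_AGREE = 1  # assumption: code 1 is the top ("strongly agree") category


def satisfaction_terminal_node(answers):
    """Return terminal node 1 or 2 if its rule matches, otherwise None."""
    welcome = answers["Q25"] == STRONGLY_AGREE  # Q25: feel welcome
    clean = answers["Q13"] == STRONGLY_AGREE    # Q13: facilities are clean
    value = answers["Q22"] == STRONGLY_AGREE    # Q22: value for money
    if clean and welcome:
        return 1  # clean facilities and feel welcome
    if welcome and value and not clean:
        return 2  # welcome and good value, even though not strongly "clean"
    return None   # some other terminal node; not covered by this sketch


node_a = satisfaction_terminal_node({"Q25": 1, "Q13": 1, "Q22": 3})
node_b = satisfaction_terminal_node({"Q25": 1, "Q13": 2, "Q22": 1})
node_c = satisfaction_terminal_node({"Q25": 2, "Q13": 2, "Q22": 2})
```

Each rule is a conjunction of conditions on a handful of questions, so a branch manager can read off which subgroup a member falls into without any scoring software.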
  • Member Satisfaction Model: Other Rules
    • Terminal Node 10
      – 998 surveys (2.1%)
      – 431 highly satisfied (43.2%)
      – 2.9% of all highly satisfied
      – RULE (weakest of top 5): If strongly agree that loyal to Y and strongly agree that facilities are clean, even though don’t strongly agree that feel welcome nor strongly agree that staff is competent, then highly satisfied
    • Terminal Node 9
      – 1,739 surveys (3.6%)
      – 904 highly satisfied (52.0%)
      – 6.1% of all highly satisfied
      – RULE: If strongly agree that facilities are clean, and strongly agree that staff is competent, even if don’t strongly agree feel welcome, then highly satisfied
  • Member Satisfaction Model: Unsatisfied Rules
    • Terminal Node 15
      – 19,323 surveys (40.2%)
      – 1,231 highly satisfied (6.4%)
      – 8.3% of highly satisfied
      – 58.2% of all not highly satisfied
      – RULE: If don’t strongly agree that staff is efficient and don’t strongly agree that feel welcome, and don’t strongly agree that the facilities are clean, then member isn’t highly satisfied
    • Terminal Node 8
      – 1,364 surveys (2.8%)
      – 141 highly satisfied (10.3%)
      – 1.0% of all highly satisfied
      – RULE: If don’t strongly agree that facilities are clean and don’t strongly agree that the Y is good value for the money, even though strongly agree that feel welcome, member isn’t highly satisfied
  • Recommend to Friend Model from CART®
    • Q31: Loyal
      – Surrogates: Q25, Q44, Q22, Q24 (can relate to other members)
    • Q25: Feel Welcome
      – Surrogates: Q24, Q5 (friendly staff)
    • Q22: Value for Money
      – Surrogates: Q23 (quality classes/programs)
    • Q44: Helps meet fitness goals
    Tree diagram: splits on Q31, Q25, Q22, Q44; terminal nodes 1, 2, 4, 5, 7
  • Recommend to Friend Model: Key Rules
    • Terminal Node 1
      – 13,678 surveys (28.5%)
      – 12,122 recommend (88.6%)
      – 47.0% of all strong recommends
      – RULE: If strongly agree that loyal to Y and strongly agree that feel welcome, then strongly agree that will recommend to friend
    • Terminal Node 2
      – 6,637 surveys (13.8%)
      – 4,744 recommend (71.5%)
      – 18.4% of all strong recommends
      – RULE: If strongly agree that loyal to Y and agree that Y is a good value for the money, even though don’t strongly agree feel welcome, strongly agree will recommend to friend
    • Terminal Node 4
      – 2,628 surveys (5.5%)
      – 1,932 recommend (73.5%)
      – 6.1% of all strong recommends
      – RULE: If strongly agree that Y is a good value for the money and strongly agree that feel welcome, even though not strongly loyal to Y, strongly agree will recommend to friend
  • Recommend to Friend Model: Other Rules
    • Terminal Node 7
      – 21,865 surveys (45.5%)
      – 5,461 highly recommend (25.0%)
      – 21.2% of all highly recommend
      – RULE: If don’t strongly agree that loyal to Y and don’t strongly agree that Y is value for the money, then will not highly recommend to a friend
    • Terminal Node 5
      – 814 surveys (1.7%)
      – 509 highly recommend (62.5%)
      – 2.0% of all highly recommend
      – RULE: If strongly agree that Y is good value for the money, and strongly agree that Y helps meet fitness goals, even though not strongly loyal to the Y and don’t strongly feel welcome, will highly recommend to a friend
  • Intend to Renew Model from CART®
    • Q25: Feel Welcome
      – Surrogate: Q24 (can relate to other members)
    • Q44: Helps meet fitness goals
      – Surrogate: Q51 (visit frequency)
    • Q22: Value for Money
      – Left split surrogates: Q21 (convenient schedule) and Q23 (quality classes/programs)
      – Right split surrogates: Q25 (feel welcome = 2 or 3)
    • Q47: Would be donor
      – Surrogate: Q45A (have been donor)
    • Q27: Feel sense of belonging
      – Surrogates: Q25, Q24
    Tree diagram: splits on Q25, Q44, Q22, Q47, Q27; terminal nodes 1, 2, 3, 5, 7, 8
  • Intend to Renew Model: Key Rules
    • Terminal Node 1
      – 13,397 surveys (27.9%)
      – 9,903 renew (73.9%)
      – 48.4% of all intend to renew
      – RULE: If strongly agree that feel welcome and strongly agree that Y helps meet fitness goals, then strongly agree that intend to renew
    • Terminal Node 2
      – 3,051 surveys (6.3%)
      – 1,823 renew (59.8%)
      – 8.9% of all intend to renew
      – RULE: If strongly agree Y is good value for the money and strongly agree that feel welcome, even if don’t strongly agree that Y helps meet fitness goals, then strongly agree that intend to renew
    • Terminal Node 5
      – 5,704 surveys (11.9%)
      – 3,201 renew (56.1%)
      – 15.6% of all intend to renew
      – RULE: If strongly agree that feel sense of belonging, and agree that Y is value for the money, and strongly agree that Y helps meet fitness goals, even if don’t feel welcome, then strongly agree intend to renew
  • Intend to Renew Model: Other Rules
    • Terminal Node 8
      – 18,547 surveys (38.6%)
      – 3,130 strongly intend to renew (16.9%)
      – 15.3% of all strongly intend to renew
      – RULE: If don’t strongly agree that feel welcome and don’t strongly agree that Y helps meet fitness goals, then don’t strongly agree that intend to renew
    • Terminal Node 7
      – 2,178 surveys (4.5%)
      – 578 strongly intend to renew (26.5%)
      – 2.8% of all strongly intend to renew
      – RULE: If don’t strongly agree that Y is good value for money and don’t strongly agree that feel welcome, even if strongly agree Y helps meet fitness goals, don’t strongly agree that intend to renew
  • Summary of Key Questions in Models
    • Feel Welcome was root splitter (or surrogate) for each model
    • Satisfaction is different from Recommend and Renew in other respects
      – Helps meet fitness goals was in Recommend and Renew models, but not Satisfaction
      – Facilities clean appeared only in the Satisfaction model
  • Key Differences Between Targets, Put Another Way
    • Satisfaction
      – Feel Welcome
      – Clean Facility
    • Renewal
      – Feel Welcome
      – Y Helps Meet Fitness Goals, Value for $$
    • Recommend to Friend
      – Feel Welcome
      – Loyal to Y, Value for $$
  • Top Terminal Nodes Comprise More than 70% of Hits
  • Subsequent Results

    Measure               2002   2009   Percent Improvement
    Satisfaction           31%    41%                   32%
    Recommend to Friend    54%    57%                    6%

    • Rules from Models are still in use today
    • Trees and Factors can help reduce # questions in survey
      – Employee ruleset (using same methodology) resulted in a new “short-form” survey using only questions in the splits
      – Not yet implemented in Member survey
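
The Percent Improvement column is a relative change, not a difference in percentage points. A small check, with a hypothetical helper name:

```python
def percent_improvement(before, after):
    """Relative change from before to after, expressed in percent."""
    return 100.0 * (after - before) / before


satisfaction = percent_improvement(31, 41)  # 10 points on a base of 31
recommend = percent_improvement(54, 57)     # 3 points on a base of 54
```

Rounded to whole percents these give 32 and 6, the values shown in the table. In percentage-point terms the gains are 10 and 3.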
  • Index Construction and Scaling
    • Begin with Factor Analysis
    • Cluster attribute groupings to be managerially meaningful
    • Z-normalize the variables, cast all in units of variance
    • Run tests for deviation from Standard Normal by variable and factor
    • Create z-index for each factor
    • Re-scale to nation-wide percentile
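
The normalization and rescaling steps can be sketched with the Python standard library. The slide does not say how the z-index combines variables or how percentiles are defined, so averaging the z-scores within a factor and using a "percent at or below" rank are assumptions made here for illustration; the function names are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Z-normalize one variable: subtract the mean, divide by the standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    return [(v - mu) / sigma for v in values]


def factor_z_index(columns):
    """Assumed z-index: the average z-score across the variables in one factor."""
    normalized = [z_scores(col) for col in columns]
    return [mean(per_unit) for per_unit in zip(*normalized)]


def percentile_ranks(scores):
    """Assumed percentile: percent of scores less than or equal to each score."""
    n = len(scores)
    return [100.0 * sum(s <= x for s in scores) / n for x in scores]


index = factor_z_index([[1, 2, 3], [10, 20, 30]])
ranks = percentile_ranks([10, 20, 30, 40])
```

Note that z-scores are expressed in standard deviations. The percentile step then puts every factor on a common 0 to 100 scale relative to all the units being compared.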
  • Analysis of Hierarchy Claim

    Pearson Correlations, Existing Order (n=425)
                  Facilities  Support  Value  Engagement  Impact  Involvement
    Facilities          1.00     0.53   0.65        0.37    0.25         0.04
    Support                      1.00   0.72        0.78    0.46         0.16
    Value                               1.00        0.65    0.55         0.25
    Engagement                                      1.00    0.61         0.53
    Impact                                                  1.00         0.53
    Involvement                                                          1.00

    Pearson Correlations, Reverse Value and Support
                  Facilities  Value  Support  Engagement  Impact  Involvement
    Facilities          1.00   0.65     0.53        0.37    0.25         0.04
    Value                      1.00     0.72        0.65    0.55         0.25
    Support                             1.00        0.78    0.46         0.16
    Engagement                                      1.00    0.61         0.53
    Impact                                                  1.00         0.53
    Involvement                                                          1.00
  • Summary from “Power of Habit”
    In 2000, for instance, two statisticians were hired by the YMCA, one of the nation’s largest nonprofit organizations, to use the powers of data-driven fortune-telling to make the world a healthier place. The YMCA has more than 2,600 branches in the United States, most of them gyms and community centers. About a decade ago, the organization’s leaders began worrying about how to stay competitive. They asked a social scientist and a mathematician, Bill Lazarus and Dean Abbott, for help. The two men gathered data from more than 150,000 YMCA member satisfaction surveys that had been collected over the years and started looking for patterns. At that point, the accepted wisdom among YMCA executives was that people wanted fancy exercise equipment and sparkling, modern facilities. The YMCA had spent millions of dollars ...
  • Summary from “Power of Habit”: YMCA Satisfaction
    Retention, the data said, was driven by emotional factors, such as whether employees knew members’ names or said hello when they walked in. People, it turns out, often go to the gym looking for a human connection, not a treadmill. If a member made a friend at the YMCA, they were much more likely to show up for workout sessions. In other words, people who join the YMCA have certain social habits. If the YMCA satisfied them, members were happy. So if the YMCA wanted to encourage people to exercise, it needed to take advantage of patterns that already existed, and teach employees to remember visitors’ names. It’s a variation of the lesson learned by Target and radio DJs: to sell a new habit (in this case exercise), wrap it in something that people already know and like, such as the instinct to go places where it’s easy to make friends. “We’re cracking the code on how to keep people at the gym,” Lazarus told me. “People want to visit places that satisfy their
  • Conclusions
    • The “best” solutions are not always “good” solutions
      – There is often more than one way to approach a solution
      – It is often unclear even to the end customer what solution is best until the solution exists on paper
    • Interactions are the Key (or why trees improve regression models)
      – Main effects are interesting, but deeper insights are gained from subgroups
    • Don’t give up
      – Matching data to decisions is difficult business
      – Get feedback; make sure the story the model tells is understood by decision-makers