Predicting Enrollment in Health & Wellness Programs
1. shapeup.com
Copyright 2015 ShapeUp Inc.
All Rights Reserved. Proprietary & Confidential
Samir A. Batla
Director, BI & Data
Analytics
Johanna Kincaid
Data Scientist
PREDICTING
ENROLLMENT IN
HEALTH &
WELLNESS
PROGRAMS
3. 3
OUR HISTORY
Drs. Rajiv Kumar and Brad
Weinberg founded ShapeUp
in 2006 while studying at
Brown Medical School
Launched initially as a non-
profit community challenge
called Shape Up RI, their
social wellness idea went
viral and reached over 10%
of the adult population in
Rhode Island
Six outcome-based studies
and 600 customers later,
ShapeUp now serves 10
Fortune 50 and more than
twenty Fortune 500
companies worldwide
ShapeUp operates in 138
countries and is translated
into 25 languages, reaching
1.6 million participants
globally
4. 4
WHAT IS PREDICTIVE ANALYTICS
Many definitions, that include:
• Data Mining
• Machine Learning
• Advanced Analytics
• Artificial Intelligence
• Etc.
The practice of analyzing existing data to make
predictions about the future.
5. 5
INSIGHTS FROM
ANALYTICS
• Predict enrollment, engagement, activity
• Gain deeper population insight
  • Identify trends
  • Identify influencing attributes
• Drive strategic decision making
  • Address under-performing areas
  • Reward above average performance
• Score participants to assess their likelihood of
meeting milestones
The things we learn from predictive analytics will allow us to design better competitions and
programs to meet our mission of making healthier, happier, and more successful workplaces.
6. PREDICTIVE ANALYTICS AT SHAPEUP
We have just embarked on our journey: we ran an
experiment to predict future competition
enrollment.
We focused on three predictive attributes:
• Postal Codes
• Divisions
• Labor Codes (Unions, non-Unions)
Why these three variables? The answer is simple:
completeness. Next: were any missing attributes
significant?
Trained model on previous enrollments to predict
enrollment in 2015
8. 8
PROCESS
Training Data (2 yrs; ~350k)
Classifications
New Data (~200k)
New Classification
Exploratory Data Analysis / Algorithm Development
Predictive Model
14. 14
MODEL PERFORMANCE – UNTIL THINGS CHANGE
Years of Training Data | Training | Predicted | Actual   | Error %
2                      | .5549871 | .6206801  | .6192993 | 0.14%
1                      | .5354515 | .6510264  | .6192993 | 3.17%
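The Error % column appears to be the absolute gap between the predicted and actual enrollment rates, expressed in percentage points. That reading is an assumption, but it reproduces both figures on the slide. A minimal Python sketch (the original analysis used R; the function name is illustrative):

```python
def error_pct(predicted, actual):
    """Absolute gap between two rates, in percentage points."""
    return abs(predicted - actual) * 100

# Rates taken from the slide, keyed by years of training data.
actual_2015 = 0.6192993
predicted = {2: 0.6206801, 1: 0.6510264}

errors = {years: round(error_pct(rate, actual_2015), 2)
          for years, rate in predicted.items()}
print(errors)
```

With two years of training data the gap rounds to 0.14; with one year it rounds to 3.17.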
15. 15
BEWARE
Unknown unknowns.
But predictive analytics is not a dubious
endeavor. It’s flawed when it’s overly
complex. What we don’t want to do is make
guarantees; however, we can, in fact, use it
to understand what’s happening and what might
happen, and to drive decision making.
16. 16
IN CLOSING…
Predictive Analytics is helping ShapeUp:
Help our customers understand their
population and prepare for future programs
and competitions
Better understand book-of-business
customer populations to design and offer
new products that aren’t obvious
Enrich the relationship between Account
Managers and customers – become
consultative
More questions are coming to light
Diagram labels: Business Goal; Understand Data, Prepare Data, Model Data, Evaluate, Deploy, Monitor
17. 17
PARTICIPANT
SENTIMENT
“It help shift a focus of
care for our employees.
Also, when our
employees are healthier
they function better at
their jobs.”
“I like that we do this as
a company...working
with others is always
easier....Thank you!”
“Company giving those
employees, who wish to
participate, an
opportunity to improve
their health. Thank
You!”
ShapeUp uses the latest in technology, gamification, social dynamics, behavioral economics, and social psychology to transform traditional wellness into something truly cutting edge. We call this approach “social wellness,” and it is completely changing the industry.
We leverage group support, peer coaching, friendly competition, and social accountability to empower people to achieve their goals, improve their health, and enhance their overall well-being.
Talk about competitions, programs, social platform
There are many definitions on the web for “predictive analytics”; many define it by naming methods such as data mining, machine learning, advanced analytics, and artificial intelligence.
I like to keep it simple
I bolded a few key phrases: practice, existing data, future
I believe predictive analytics (or any analytics, for that matter) is a practice: not just a belief, but the application of a belief; not a method, but the application of a method. And it’s repetitive, so we can get better at it.
Assumption: similar behavior can be expected in the future
If you think about it, we humans do analytics all the time, and often we practice a form of predictive analytics. Think of your average kitchen table conversation (of course, if you have children, it may be utter chaos). What we do with each other is share information, analyze it (applying our own built-in models), and then help each other understand what might happen in the future given the current set of circumstances (the existing data), or how we can potentially change the outcome by changing those circumstances.
Image: http://www.iqworkforce.com/wp-content/uploads/2015/07/future.jpeg
Any foray into analytics must start with a business context and questions – usually driven by the client. Of course, an exploration of data allows questions to come up as well; but we are in fact being asked direct questions by our clients.
The one question we constantly get is: “ShapeUp, what is your recommendation for our enrollment goal for the next competition?”
Billion-dollar companies spend millions on health & wellness programs and millions more on incentives. ROI is difficult to measure: today, it is hard to tie improvements in population health or reductions in healthcare costs to health & wellness solutions. So, what is the key success measure of the program?
The question we consistently get is, “What was the enrollment percentage?” (or what our clients typically call “engagement”). In addition, we are asked to predict or suggest a goal for future programs. Because organizations spend millions on these programs, they need a plan for funding marketing, incentives, and other activities around the program, such as identifying staff (wellness leaders) to manage local execution and marketing. This planning and preparation happens several months before the next competition, often just after the previous one has ended.
Deeper population insight – identify influencing attributes, drivers across different business dimensions: line of business, job function, geography
Drive strategic decision making – address areas falling behind, reward above average behaviors, derive greater value from investment
Image: http://images.techhive.com/images/article/2015/05/predictive_analytics-100587585-primary.idge.jpg
Our project considered data from one client who used our platform for three competitions. We trained our model on 2013 & 2014 enrollments and predicted enrollment for 2015.
There are many more attributes we could have considered, but because of the degree of missing data, including other variables made our error rates much higher.
Scarcity of data is the Achilles' heel of analytics
R libraries:
Plotting:
ggplot2 - graphing package, makes plotting less of a hassle, good for multi-layered graphics.
grid - used for putting multiple graphs on one page
gridExtra - arrange multiple graphs on one page
Data:
RMySQL - MySQL driver for R
plyr - splitting, combining data, good for breaking large problems down into smaller pieces
Algorithm - Simple Naïve Bayes classification: Naïve Bayes assumes the predictors are conditionally independent given the class, which keeps the model simple and makes it an effective classification tool that is easy to interpret.
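To make the Naïve Bayes idea concrete, here is a minimal sketch of a categorical Naïve Bayes classifier over the three attributes named in the deck. It is written in plain Python for illustration; the original work used R, and the function names, the Laplace smoothing, and the toy rows below are all assumptions rather than the authors' implementation or data.

```python
import math
from collections import Counter, defaultdict

def train(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature, label) -> Counter of values
    vocab = defaultdict(set)             # feature -> distinct values seen
    for row, label in zip(rows, labels):
        for feature, value in row.items():
            value_counts[(feature, label)][value] += 1
            vocab[feature].add(value)
    return class_counts, value_counts, vocab

def predict_proba(model, row, alpha=1.0):
    """Posterior over classes, with Laplace smoothing (alpha) for unseen values."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, n_label in class_counts.items():
        score = math.log(n_label / total)        # log prior
        for feature, value in row.items():
            count = value_counts[(feature, label)][value]
            k = len(vocab[feature]) + 1          # +1 slot for unseen values
            score += math.log((count + alpha) / (n_label + alpha * k))
        log_scores[label] = score
    top = max(log_scores.values())               # subtract max for stability
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    norm = sum(weights.values())
    return {label: w / norm for label, w in weights.items()}

# Hypothetical training rows (the real training set had ~350k records).
train_rows = [
    {"postal_bin": "02", "division": "Ops", "labor_code": "union"},
    {"postal_bin": "02", "division": "Ops", "labor_code": "union"},
    {"postal_bin": "02", "division": "Sales", "labor_code": "non-union"},
    {"postal_bin": "10", "division": "Sales", "labor_code": "non-union"},
    {"postal_bin": "10", "division": "Ops", "labor_code": "union"},
    {"postal_bin": "10", "division": "Sales", "labor_code": "non-union"},
]
train_labels = ["yes", "yes", "no", "no", "yes", "no"]
model = train(train_rows, train_labels)

# Hypothetical new eligibles; the predicted enrollment rate is the mean P(yes).
new_rows = [
    {"postal_bin": "02", "division": "Ops", "labor_code": "union"},
    {"postal_bin": "10", "division": "Sales", "labor_code": "non-union"},
]
probs = [predict_proba(model, row) for row in new_rows]
predicted_rate = sum(p["yes"] for p in probs) / len(probs)
print(predicted_rate)
```

Averaging the per-person probability of "yes" is one plausible way to turn individual classifications into a population-level enrollment estimate; the deck does not say which aggregation was used.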
2015 Enrollment: blue is “yes”; red is “no”
What makes Labor Code an interesting variable for our model is the unevenness between the yes (blue) and no (red) bars.
Labor Code was the most predictive variable.
2015 Enrollment: blue is “yes”; red is “no”
We have to strike a balance between adding information and condensing information. More information about users should lead to improved outcomes, but sometimes too much information dilutes the influence of the important attributes.
Given more time, sitting with the client to create conceptual labor code groupings (i.e., bins) would potentially give us an improved outcome.
2015 Enrollment: Y is log of counts
Another variable we looked at was postal codes. But postal codes are a little funny: using the full 5-digit postal code (ignoring, of course, the 4-digit extension), we got groups with extremely small numbers of eligibles, which is not useful. So we binned the postal codes by their first two digits, which allows for geographic breakdowns with a larger eligible population.
Again, the key here is the difference between the yes and no counts. Very few population bins come close to equal enrollment and non-enrollment.
Simply stated, all five digits is too fine a breakdown.
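The binning step can be sketched as follows. This is an illustrative Python version (the original work used R); the function names and sample records are hypothetical. Postal codes are kept as strings so that leading zeros survive.

```python
from collections import defaultdict

def postal_bin(postal_code, digits=2):
    """Return the leading digits of a postal code, dropping any 4-digit extension."""
    base = postal_code.strip().split("-")[0]
    return base[:digits]

def enrollment_by_bin(records, digits=2):
    """Tally (enrolled, eligible) counts per postal bin."""
    tallies = defaultdict(lambda: [0, 0])
    for postal_code, enrolled in records:
        tally = tallies[postal_bin(postal_code, digits)]
        tally[0] += int(enrolled)  # enrolled count
        tally[1] += 1              # eligible count
    return {bin_id: tuple(t) for bin_id, t in tallies.items()}

# Hypothetical records: (postal code, enrolled?)
records = [
    ("02903", True),
    ("02906-1234", True),
    ("02860", False),
    ("10001", False),
    ("10027", True),
]
bins = enrollment_by_bin(records)
print(bins)
```

Changing `digits` makes it easy to compare coarser and finer groupings against the size of the eligible population in each bin.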
Another exploratory graph that suggested postal code would be a good variable in our model.
The overall consistency in distribution over the years does increase accuracy of prediction.
What is interesting about this series of graphs, however, is that although the year-to-year consistency is very strong, the small differences are what account for small shifts in the predicted enrollment percentages.
For example, a noticeable increase in eligible people in a postal code with low enrollment leads to a slight decrease in the overall enrollment rate.
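A small worked example shows why this happens: the overall rate is an average of the per-bin rates weighted by eligible counts. The numbers below are made up for illustration, and the Python function name is hypothetical.

```python
def overall_rate(bins):
    """bins maps a postal bin to (eligible count, enrollment rate)."""
    eligible = sum(count for count, _ in bins.values())
    enrolled = sum(count * rate for count, rate in bins.values())
    return enrolled / eligible

# Hypothetical populations: bin "10" has the lower enrollment rate.
before = {"02": (1000, 0.70), "10": (500, 0.40)}
after = {"02": (1000, 0.70), "10": (800, 0.40)}  # more eligibles in the low bin

rate_before = overall_rate(before)
rate_after = overall_rate(after)
print(rate_before, rate_after)
```

Per-bin rates are unchanged, yet the overall rate drops because the low-enrollment bin now carries more weight.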
2015 Enrollment: Y is log of counts
The Division variable helped improve the model when combined with postal codes and labor codes.
We need to find a balance between adding information and condensing information. More information about users should lead to improved outcomes, but sometimes too much information dilutes the influence of the important attributes.
Again, the key here is the difference between the yes and no counts. Very few population bins come close to equal enrollment and non-enrollment.
We ran several tests and saw an average error rate of about 1.5%.
For our question about 2015 enrollment, we tested the model with one year of training data (in this case, 2013) and two years of training data (2013 and 2014) to predict enrollment in 2015.
Analogy of a big meeting.
We can’t anticipate every unknown. Modeling for many unknowns creates a false sense of trust in the model: because there are so many variables, we assume we must be right.
Think of the Black Swan theory: up until 1697, the presumption was that all swans have white feathers, because at no time in recorded history had a swan of any other color been seen. In 1697, a Dutch explorer discovered a black swan. What was once considered impossible was overturned by an unknown unknown.
What is perceived as impossible might later be proven possible.
We can’t build a model that accounts for Black Swans. This has to be taken into consideration when working with predictive analytics.
See Nassim Nicholas Taleb
At the most basic level, ShapeUp is turning customer data into knowledge and into action. It’s important to note, this is a cycle. Remember, we are trying to make decisions now about what the future might hold from understanding the past. This implies that we must constantly monitor, collect, understand our data and make the necessary adjustments as new information comes in. This is really a conversation – with our customers and indeed amongst ourselves.
Analogous to real life.
More questions are coming to light: I believe one axiom of analytics is that it creates more questions than it does answers, and that’s OK. We have to remember why we are doing this: we are trying to meet business objectives by making more informed decisions, then learning from those decisions. So what this exercise has done is allow us to conceive of more questions:
Questions:
Is the time of year the competition starts a significant factor?
Does changing the incentives have a significant impact?
Does a change in someone’s role have a significant impact? (Think of large organizations that constantly change their org structure.)
Does a change in product / offering have a significant impact?
These are the types of questions that will allow ShapeUp to provide greater value to our customers and prospects.