
User modelling challenge ideatory 2014


Data Analytics Contest organised by
Stage 1: Predict users' survey ratings for various events
Stage 2: Recommend a predictive modeling idea based on issues in daily routine life

Published in: Data & Analytics

  1. Submitted by: Parindsheel S Dhillon. Contest organized by:
  2. User Modeling Data Analytics Contest, Stage 1
     Survey ratings for various events by users to be predicted
     • Geospatial data of users provided
     • Event details & geospatial details of events provided
     • 25 million observations of training data
     • 0.3 million unique user-event combination ratings to be predicted
     Adopted analytics techniques
     • Data pre-processing
     • Ordinal linear regression, clustering, decision trees
     Analytics tools
     • R, WEKA, MS Office
  3. User Modeling Data Analytics Contest, Stage 2
     Recommend a predictive modeling idea based on daily routine life
     • Problem scenario
     • Predictive modeling idea
     • Data collection & data dictionary
     • Data sample (representative)
     • Data analytics & result delivery
  4. Scenario
     Obesity is one of the biggest problems
     • More than 1.4 billion adults were overweight in 2008 (WHO)
     • More than 40 million children under the age of 5 were overweight or obese in 2012 (WHO)
     • More than 2/3 of the current USA population is overweight
     Being overweight is a leading factor in various diseases
     • Cardiovascular diseases, type 2 diabetes, osteoarthritis & some cancers such as endometrial, breast, and colon (WHO)
     Changing lifestyle & eating habits
     • Overuse of packaged food containing trans fat, sugar & salt
     • Sedentary lifestyle with increased use of television, computers, mobiles & desk jobs
  5. Predictive modeling idea
     By predicting body weight change over 3 months based on some daily activities
     • Many people will foresee their overweight future & its associated problems
     • Create awareness against obesity to save lives
     • Gyms, health centers etc. can also be strategically involved to capitalize on this opportunity by using weight change predictive modeling
     Idea adoption
     • Increasing awareness regarding health issues
     • Zero-figure culture among the female population
  6. Data Collection
     Data will be collected from the activities below
     • Lifestyle & food intake
     • Work profile
     • Additional workout (if any)
     • Personal & demographic data
     Data collection process
     • Food intake data will be collected using a smartphone app.
     • Daily workouts, e.g. walking, cycling, running & swimming, could also be collected using the smartphone app.
     • Personal & demographic data will be collected when a user signs up for the app.
  7. Representative Data
     28 explanatory variables that affect body weight, along with demographic variables, have been suggested. Imaginary data for two observations is shown below, one variable per row; weightChange is the value to be predicted.

     Variable        | Observation 1 | Observation 2
     Date            | 1/8/2014      | 1/8/2014
     CustID          | 1             | 2
     Age             | 35            | 48
     Sex             | M             | F
     Weight          | 65            | 80
     Height          | 165           | 162
     Place           | Chandigarh    | Los Angeles
     Origin          | Indian        | American
     regBreakfast    | 1000          | 700
     regLunch        | 1000          | 1500
     regDinner       | 1200          | 1400
     regOther        | 300           | 500
     totalCal        | 3500          | 4100
     sugarFreq       | 1-2 times     | 3-5 times
     junkFreq        | 3-5 times     | more than 5
     alcoholFreq     | 3-5 times     | Nil
     alcoholQty      | 60ml          | Nil
     softDrinkFreq   | very rarely   | more than 500ml
     skipBreakfast   | Nil           | Nil
     parentOverwt    | yes           | yes
     medicalProb     | no            | No
     medication      | no            | No
     sleepHours      | 7             | 6
     quitSmoke       | no            | no
     workProfile     | sedentary     | light
     workHrs         | 8             | 10
     fitnessActivity | No            | gym
     minsFitness     | 0             | 60
     weightChange    | ?             | ?
  8. Analytics
     Data pre-processing
     • Variable transformation, e.g. net calories stored in the body
     • Calorie counts might need to be calculated for energy used and energy intake
     • Outlier detection & certain medical obesity issues
     Statistical techniques for data analytics
     • Linear regression (stepwise)
     • Akaike information criterion (AIC) will be used to compare relative model quality
     • Analysis of variance (ANOVA)
     • Time series can also be used for long-term prediction
  9. Analytical Result & Delivery
     Analytical result
     • Body weight change over 3 months, based on daily activities, will be predicted for any individual
     • For longer-duration prediction, time series can be used along with Markov chain analysis
     Result delivery
     • A feature like the FatBooth phone application needs to be built into the original phone app to show the prediction along with a weight bar; a photo needs to be taken for this additional activity.
     • The address & contact details of a strategic-alliance health centre can be forwarded along with the results
     • General advice, such as reducing sugary content or soft drinks, can be given to the customer based on the data
  10. Thanks. Parindsheel Singh Dhillon
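
The Analytics slide names two steps that a short sketch can make concrete: deriving a "net calorie" variable from energy intake and energy used, and choosing regression variables stepwise using AIC. The Python below is only an illustration under stated assumptions, not the author's method (the deck names R and WEKA as its tools). Assumptions not found in the slides: weight and height are taken to be in kg and cm; energy used is estimated with the Mifflin-St Jeor resting-energy formula multiplied by commonly quoted activity factors; the stepwise search is a simple forward selection; AIC is computed as n·ln(RSS/n) + 2k, which drops a constant term; and the regression data are synthetic, invented purely to exercise the loop.

```python
import math
import random

# Assumed activity multipliers; the slides do not specify any.
ACTIVITY_FACTOR = {"sedentary": 1.2, "light": 1.375}

def net_calories(rec):
    """Daily intake minus estimated expenditure (Mifflin-St Jeor x activity factor)."""
    bmr = 10 * rec["weight"] + 6.25 * rec["height"] - 5 * rec["age"]
    bmr += 5 if rec["sex"] == "M" else -161
    return rec["totalCal"] - bmr * ACTIVITY_FACTOR[rec["workProfile"]]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x

def aic(rows, y, cols):
    """Least-squares fit with an intercept on the chosen columns; returns n*ln(RSS/n) + 2k."""
    X = [[1.0] + [row[c] for c in cols] for row in rows]
    k = len(cols) + 1
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * t for x, t in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    rss = sum((t - sum(b * v for b, v in zip(beta, x))) ** 2 for x, t in zip(X, y))
    n = len(y)
    return n * math.log(rss / n) + 2 * k

def forward_stepwise(rows, y, names):
    """Add one variable at a time, keeping the addition only while AIC decreases."""
    selected = []
    best = aic(rows, y, selected)  # intercept-only baseline
    while len(selected) < len(names):
        scored = [(aic(rows, y, selected + [c]), c)
                  for c in range(len(names)) if c not in selected]
        score, col = min(scored)
        if score >= best:
            break
        selected.append(col)
        best = score
    return [names[c] for c in selected], best

# The two imaginary observations from the Representative Data slide (kg, cm assumed).
obs1 = {"age": 35, "sex": "M", "weight": 65, "height": 165,
        "totalCal": 3500, "workProfile": "sedentary"}
obs2 = {"age": 48, "sex": "F", "weight": 80, "height": 162,
        "totalCal": 4100, "workProfile": "light"}
print(net_calories(obs1), net_calories(obs2))  # about 1686.5 and 2159.19

# Synthetic data, made up for this sketch: weight change depends on netCal and
# minsFitness, while sleepHours has no true effect.
random.seed(0)
names = ["netCal", "minsFitness", "sleepHours"]
rows = [[random.uniform(-500, 1500), random.uniform(0, 90), random.uniform(4, 9)]
        for _ in range(200)]
y = [0.004 * r[0] - 0.03 * r[1] + random.gauss(0, 0.5) for r in rows]
selected, score = forward_stepwise(rows, y, names)
print(selected)
```

On real app data the rows would come from the collected daily records rather than a random generator, and a statistics package would normally replace the hand-written solver. The point of the sketch is only the shape of the procedure: a derived energy-balance variable feeds a regression whose variable set is grown while AIC keeps improving.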