May 2012
                          Maria Lupetini
Engineering Asset Management & Analytics
                  Qualcomm Incorporated
   Advantages of MARS Modeling
   Predicting Demand for an Asset
   Capturing Trends and Seasonal Effects
   Finding Interactive Effects
   Weighting More Recent Data
   Autoregressive Model for Time Series
   Using Lag Variables
   Don’t be Afraid of Missing Values
   Summary of Findings
   Regression: Linear, Logistic, GLM, MARS
   ARIMA Time Series
   Decision Trees
   Neural Networks
   Support Vector Machines
   And more

Need to pick one or more approaches tailored
 to the problem you are tackling
   Sales - Dollars, Number of Chips

   Resources - People, Software Assets

    Performance of a Semiconductor - Seconds
    to load a web page

   …You name it.
   Data contains continuous numbers
      $123,456.00
      Number of employees
   Understand influences of categories
      Geographical regions
      Operating system: Windows, Android
   Seasonal or repeated trends
      Months of the year
      Christmas season
   Special Effects
      Consumer Promotions and Advertising
      Switch turned on
What do you do if you want to predict a trend or find a pattern in data….and

   There are hundreds of possible variables that influence your outcome -
    ◦ Which ones matter?

   What if the variables interact with each other and affect the outcome
    ◦ How do you find those relationships?

   What if variables are not linearly related to the outcome
    ◦ How do you determine what the relationship curves will look like?
    ◦ Threshold or plateau relationship

   What if the data you are using to predict is a mixture of numbers and categories
    ◦ How do you build a prediction formula?

   How do I build a prediction model that is easy to understand?


                                              … USE MARS
   MARS short for Multivariate Adaptive Regression Splines

   Technique introduced in 1991 by Jerome Friedman, Stanford
    University

   Nonparametric, data driven algorithm

   Prediction is a regression model with additional side
    equations (basis functions)

   Uses piecewise regression splines to build the prediction

   Provides data reduction to select which variables matter
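
A minimal sketch of the "piecewise regression spline" idea, assuming a single predictor and a single knot. The function names, the knot, and the coefficients are hypothetical and chosen only for illustration; MARS itself searches for the knots and selects the terms from the data.

```python
def hinge_pair(x, knot):
    """Return the two mirrored basis functions MARS builds around a knot."""
    above = max(0.0, x - knot)   # active only when x is above the knot
    below = max(0.0, knot - x)   # active only when x is below the knot
    return above, below


def piecewise_prediction(x, intercept, knot, slope_above, slope_below):
    """Regression model built from one hinge pair (illustrative values only)."""
    above, below = hinge_pair(x, knot)
    return intercept + slope_above * above + slope_below * below


if __name__ == "__main__":
    # Hypothetical coefficients: the slope changes at the knot x = 3.
    for x in (1, 3, 5):
        print(x, piecewise_prediction(x, intercept=10, knot=3,
                                      slope_above=2, slope_below=-1))
```

Each basis function is zero on one side of its knot, which is how a sum of such terms can represent threshold and plateau relationships.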
Software Used in Designing Semiconductor Chips

   Is the use of the software growing?

   What time of day are the software licenses most
    demanded?

   Does demand change over the weekend?

   How many copies do we need next week?
[Figure: Number of Software Licenses Used in an Hour, from Aug 2011 to April 2012. Y-axis: licenses used, 0 to 350. X-axis: date and hour, 8/28/2011 to 4/10/2012.]

How do you forecast this time series of demand data?
Time              Actual Licenses Used  Week Number  WeekDay  Day Name  Weekend  Holiday  Hour
9/4/2011 9 PM     58                    37           1        Sun       1        Y        21
9/4/2011 10 PM    75                    37           1        Sun       1        Y        22
9/4/2011 11 PM    88                    37           1        Sun       1        Y        23
9/5/2011 12 AM    81                    37           2        Mon       0        Y        0
9/5/2011 1 AM     74                    37           2        Mon       0        Y        1
9/5/2011 2 AM     80                    37           2        Mon       0        Y        2
9/5/2011 3 AM     81                    37           2        Mon       0        Y        3

  •   Real Continuous or Integer Variables: License Counts, Week Number
  •   Categorical Text Variables: Holiday flag, Day Name
  •   Binary Numbers: Weekend flag
  •   Choice of Categorical or Real Number: Week Day, Hour
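
The calendar predictors in the sample rows can be derived from the timestamp alone. The sketch below is one way to do it, under assumptions: the helper name `calendar_features` is hypothetical, the holiday dates are supplied by the caller, and the week-number convention (weeks start on Sunday, January 1 falls in week 1) is inferred only because it reproduces the value 37 shown for 9/4/2011.

```python
from datetime import date, datetime

DAY_NAMES = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]


def calendar_features(ts, holidays=frozenset()):
    """Build the calendar predictors for one hourly observation."""
    sunday_index = ts.isoweekday() % 7                 # 0 = Sunday ... 6 = Saturday
    jan1_index = date(ts.year, 1, 1).isoweekday() % 7  # same scale for January 1
    day_of_year = ts.timetuple().tm_yday
    # Assumed convention: Sunday-based weeks, with January 1 in week 1.
    week_number = (day_of_year + jan1_index - 1) // 7 + 1
    return {
        "week_number": week_number,
        "weekday": sunday_index + 1,                   # 1 = Sunday, 2 = Monday, ...
        "day_name": DAY_NAMES[sunday_index],
        "weekend": 1 if sunday_index in (0, 6) else 0,
        "holiday": "Y" if ts.date() in holidays else "N",
        "hour": ts.hour,
    }


if __name__ == "__main__":
    # Holiday dates taken from the sample rows (both flagged "Y" there).
    sample_holidays = {date(2011, 9, 4), date(2011, 9, 5)}
    print(calendar_features(datetime(2011, 9, 4, 21), sample_holidays))
```

Keeping `weekday` and `hour` as integers leaves the choice open to pass them to the model either as continuous or as categorical predictors.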
Can we build a prediction model of this form?

Demand =
        Constant Base+
        Baseline trend +
        Hour of day effect +
        Day of Week effect +
        Holiday effect
Setting Up Model in MARS
Trend line captures:
• Growing use of this software product from Sep 2011 to Apr 2012
• Deadlines of semiconductor chip projects (Jan. and March)
[Figure: Additional licenses needed as a function of hour of the day]

        Hour Predictor Captures:
        • Highest use of licenses from 10 AM to 1 PM US Pacific time
        • Effect of use in European and Indian time zones
[Figure: Additional licenses needed as a function of day of the week]

Weekday was coded as a continuous variable (1 = Sunday, 2 = Monday, etc.).
Coding it as a categorical variable can also work here.




        Day of Week Predictor Captures:
        • Highest use of licenses during Wednesday to Friday
Possible Interactive Effects Between Variables




         Look for an interactive effect
         between hour of day
         and day of week.

         Interactive effects between the
         week_number and holiday variables
         and other variables were not allowed.
[Figure: Additional licenses needed as a function of hour and day]




            Interactive effect
            • Work patterns are different on the weekends when
               compared to the work week.
[Figure: Additional licenses needed on non-holidays]




           Holiday Predictor Captures:
           • The difference in demand in an hour if it is a holiday
Weighting of Observations
[Figure: Weight Applied to Observations (x-axis, 0 to 4) for each Day and Hour Observation (y-axis, 7/26/2011 to 5/21/2012)]


MARS can use a designated “variable” as a weighting factor for the observations.
Here, the observations in April 2012 were 3 times
more important than observations in Sep 2011.
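
One way to build such a weight column is a linear ramp over time. This is a sketch under assumptions: the slides state only the 3-to-1 ratio between April 2012 and Sep 2011, so the linear shape, the function name `linear_recency_weights`, and the example dates are all hypothetical.

```python
from datetime import datetime


def linear_recency_weights(timestamps, oldest_weight=1.0, newest_weight=3.0):
    """Weight each observation by how recent it is, ramping linearly over time."""
    start, end = min(timestamps), max(timestamps)
    span = (end - start).total_seconds()
    if span == 0:
        # All observations share one timestamp, so they get the same weight.
        return [newest_weight for _ in timestamps]
    step = newest_weight - oldest_weight
    return [oldest_weight + step * (t - start).total_seconds() / span
            for t in timestamps]


if __name__ == "__main__":
    stamps = [datetime(2011, 9, 1), datetime(2011, 12, 31), datetime(2012, 4, 30)]
    print(linear_recency_weights(stamps))
```

The resulting list would be stored as an extra column and named as the weight variable when the model is set up.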
[Figure: Number of Software Licenses Used and Predicted, 4/8/2012 to 4/21/2012. Y-axis: 0 to 350 licenses. Regions: Part of the Training Dataset; Prediction on Unseen Data.]
Blue line: Actual Licenses Used
Red line: MARS fit on Training Data for 4/8 to 4/15 and Prediction on 4/15 to 4/21
[Figure: Number of Software Licenses Used, Training Dataset, 8/28/2011 to 4/10/2012. Y-axis: 0 to 350 licenses. Series: Actual; Prediction Model.]

MARS was able to capture:
• Overall trend
• Hourly and Week Day effect
• Somewhat captured US holidays
MARS tells you which variables are most important.

Variable                                Importance         -gcv
---------------------------------------------------------------
WEEKDAY                                  100.00000   2713.86182
HOUR                                      93.20326   2418.96997
WEEK_NUMBER                               44.00605    903.06390
HOLIDAY$                                  21.76427    574.55463

==============================
N: 15217.52                                R-SQUARED: 0.90281
MEAN DEP VAR: 158.15640                ADJ R-SQUARED: 0.90214
                  UNCENTERED R-SQUARED = R-0 SQUARED: 0.98493
F-STATISTIC = 1344.99320                    S.E. OF REGRESSION =   35.12427
    P-VALUE = 0.00000                  RESIDUAL SUM OF SQUARES =   .678790E+07
  [MDF,NDF] = [ 38, 5502 ]           REGRESSION SUM OF SQUARES =   .630548E+08

Great R-Squared of 90%. Other diagnostics, not presented here, looked good too.

Actual Used:   Range 45 to 344 Licenses
               Average 95
               Standard Dev. 70
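
The reported fit statistics can be cross-checked from the sums of squares and degrees of freedom in the output. The sketch below uses the standard regression formulas; treating the total degrees of freedom as MDF + NDF (one more is used by the intercept) is an assumption about how the software counts them, and the function name is hypothetical.

```python
import math


def fit_statistics(regression_ss, residual_ss, model_df, residual_df):
    """Recompute R-squared, adjusted R-squared, F and S.E. from sums of squares."""
    total_ss = regression_ss + residual_ss
    r_squared = regression_ss / total_ss
    # Assumption: total degrees of freedom (n - 1) equal MDF + NDF.
    total_df = model_df + residual_df
    adj_r_squared = 1 - (1 - r_squared) * total_df / residual_df
    f_statistic = (regression_ss / model_df) / (residual_ss / residual_df)
    se_regression = math.sqrt(residual_ss / residual_df)
    return {
        "r_squared": r_squared,
        "adj_r_squared": adj_r_squared,
        "f_statistic": f_statistic,
        "se_regression": se_regression,
    }


if __name__ == "__main__":
    # Values copied from the MARS output: sums of squares and [MDF, NDF].
    stats = fit_statistics(0.630548e8, 0.678790e7, 38, 5502)
    for name, value in stats.items():
        print(f"{name}: {value:.5f}")
```

Under that assumption the recomputed values agree with the printed R-SQUARED, ADJ R-SQUARED, F-STATISTIC and S.E. OF REGRESSION to within rounding.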
Can we build a prediction model of the
autoregressive form?

Demand =
   Constant Base+
   Baseline trend +
   Effect of Licenses Used from a week ago +
   Workweek vs. Weekend effect +
   Holiday effect
Set Up Autoregressive Model, Part 2



                    Creating the lag variable “Used Lag168.”
                    This predictor is the number of licenses
                    used in the same hour, on the same day of
                    the week, in the prior week (168 hours earlier).
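
A minimal sketch of building the 168-hour lag outside of MARS, assuming the hourly series is complete (no skipped hours) and ordered by time. The function name `lag_series` is hypothetical; `None` stands in for a missing value.

```python
def lag_series(values, hours=168):
    """Shift an hourly series back by `hours`; early rows have no lagged value."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    n = len(values)
    missing = [None] * min(hours, n)          # no observation a week earlier yet
    return missing + list(values[:max(n - hours, 0)])


if __name__ == "__main__":
    used = list(range(200))                   # stand-in for hourly license counts
    used_lag168 = lag_series(used)
    # Row 168 is the first row that has a value from one week before.
    print(used_lag168[167], used_lag168[168], used_lag168[199])
```

The first week of observations has no lagged value, which is why the missing-value handling in the fitted model matters.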
MARS found underlying trend when adjusting for other
factors in the Autoregressive model version.




           Adjusting for underlying trend makes series
           stationary. This is necessary for ARIMA models.
MARS captures contribution of Used Lag 168 hours
variable
Selected MARS Output Showing Model Form and Fit

Basis Functions and Prediction Equation from MARS.
Note the handling of missing values.

BF1 = ( USED<168> ne . );
BF2 = ( USED<168> = . );
BF3 = max( 0, USED<168> - 42) * BF1;
BF4 = max( 0, 42 - USED<168>) * BF1;
BF5 = (HOLIDAY$ in ( "Y" ));
BF7 = (MON_TO_FRI in ( 0 ));
BF9 = max( 0, WEEK_NUMBER - 50) * BF1;
BF10 = max( 0, 50 - WEEK_NUMBER) * BF1;
BF11 = max( 0, USED<168> - 137) * BF1;
BF13 = max( 0, USED<168> - 265) * BF1;
BF15 = (MON_TO_FRI in ( 0 )) * BF2;

Number of Licenses Needed = 134 - 39 * BF1 + 0.58 * BF3 - 2.12 * BF4
- 42 * BF5 - 21.6 * BF7 - 0.235 * BF9 - 1.598 * BF10 + 0.338 * BF11
- 0.535 * BF13 - 38 * BF15;

N: 15055.88                                  R-SQUARED: 0.82525
 MEAN DEP VAR: 158.75413                  ADJ R-SQUARED: 0.82493

F-STATISTIC =   2533.14901                S.E. OF REGRESSION =   47.37796

Reasonable fit with 82% R-squared.
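
The basis functions and prediction equation above translate directly into a scoring function. This is a sketch: the coefficients and knots are copied from the MARS output, while the function name, the argument names, and the use of `None` for a missing lag value are assumptions made for illustration.

```python
from typing import Optional


def hinge(value: float) -> float:
    """max(0, value), the building block of every spline term."""
    return max(0.0, value)


def licenses_needed(used_lag168: Optional[float], week_number: float,
                    holiday: str, mon_to_fri: int) -> float:
    """Score one observation with the autoregressive MARS equation."""
    bf1 = 0.0 if used_lag168 is None else 1.0      # lag value is present
    bf2 = 1.0 - bf1                                # lag value is missing
    lag = 0.0 if used_lag168 is None else used_lag168  # ignored when bf1 == 0
    bf3 = hinge(lag - 42) * bf1
    bf4 = hinge(42 - lag) * bf1
    bf5 = 1.0 if holiday == "Y" else 0.0
    bf7 = 1.0 if mon_to_fri == 0 else 0.0          # weekend indicator
    bf9 = hinge(week_number - 50) * bf1
    bf10 = hinge(50 - week_number) * bf1
    bf11 = hinge(lag - 137) * bf1
    bf13 = hinge(lag - 265) * bf1
    bf15 = bf7 * bf2                               # weekend with missing lag
    return (134 - 39 * bf1 + 0.58 * bf3 - 2.12 * bf4 - 42 * bf5
            - 21.6 * bf7 - 0.235 * bf9 - 1.598 * bf10 + 0.338 * bf11
            - 0.535 * bf13 - 38 * bf15)


if __name__ == "__main__":
    # A non-holiday weekday in week 50 with 100 licenses used a week earlier.
    print(licenses_needed(100, 50, "N", 1))
    # The same hour when no value from a week earlier is available.
    print(licenses_needed(None, 50, "N", 1))
```

Because every spline term is multiplied by BF1, an observation with a missing lag drops those terms and falls back to the constant, holiday and weekend effects instead of being discarded.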
For observations where the 168 lag of the “Used” variable is not missing:

Holiday = 1 if it’s a holiday, else 0
Weekend = 1 if it’s Saturday or Sunday, else 0

Autoregressive Splines:
A = max( 0, USED<168> - 42)
B = max( 0, 42 - USED<168>)
C = max( 0, USED<168> - 137)
D = max( 0, USED<168> - 265)

Trend line Splines:
E = max( 0, WEEK_NUMBER - 50)
F = max( 0, 50 - WEEK_NUMBER)

    Forecasted License Need = 95 - 42 * Holiday - 22 * Weekend +
       [0.6 * A - 2.1 * B + 0.3 * C - 0.5 * D] +
       [- 0.2 * E - 1.6 * F]
[Figure: USED vs. Predicted licenses, 9/4/2011 to 4/16/2012. Y-axis: 0 to 400 licenses.]
[Figure: Number of Licenses Used and Predicted, 4/8/2012 to 4/21/2012. Y-axis: 0 to 400 licenses. Regions: Part of Training Dataset; Forecasting Unseen Data.]
Blue line is Actual Used
Red line is MARS fit on Training data for 4/8 to 4/14 and Prediction on 4/15 to 4/21 data
[Figure: Compare Forecast of Two Models to Actual Licenses Used, 4/8/2012 to 4/21/2012. Y-axis: Number of Licenses, 0 to 400. Series: Actual Used; Predicted_AutoRegressive; Predicted Not Auto Reg.]
Mathematically
 MARS is versatile; it models most data types
 Selects best predictors
 Models nonlinear relationships
 Easily finds selective interactive effects
 Simple to create lag variables as predictors
 Flexible weighting schemes for observations
 Can handle missing values


Operationally
 Don’t call me for more software license copies on
  Thursday at noon; everyone else is!

More Related Content

Viewers also liked

Drinks Menu_2015 NAM
Drinks Menu_2015 NAMDrinks Menu_2015 NAM
Drinks Menu_2015 NAM
Marci Mellor
 
General Additive Models in R
General Additive Models in RGeneral Additive Models in R
General Additive Models in R
Noam Ross
 
Мероприятия как инструмент работы с молодыми специалистами и продвижения брен...
Мероприятия как инструмент работы с молодыми специалистами и продвижения брен...Мероприятия как инструмент работы с молодыми специалистами и продвижения брен...
Мероприятия как инструмент работы с молодыми специалистами и продвижения брен...
FutureToday
 
Mars
MarsMars
Mars
raja1233
 
Uncle Ben's Recipe Video Contest Flyer
Uncle Ben's Recipe Video Contest FlyerUncle Ben's Recipe Video Contest Flyer
Uncle Ben's Recipe Video Contest Flyer
aeiser
 
Seattle SeaHawks
Seattle SeaHawksSeattle SeaHawks
Seattle SeaHawks
1apinedo
 
Introduction to MARS (1999)
Introduction to MARS (1999)Introduction to MARS (1999)
Introduction to MARS (1999)
Salford Systems
 
Mars Business report
Mars  Business reportMars  Business report
Mars Business report
Yiqiao Song
 
Sara's m&m slideshow
Sara's m&m slideshowSara's m&m slideshow
Sara's m&m slideshow
reidhns1
 
Mars incorporated interview questions and answers
Mars incorporated interview questions and answersMars incorporated interview questions and answers
Mars incorporated interview questions and answers
PenelopeCruz99
 
Customer Success Story: Mars Inc. [New York]
Customer Success Story: Mars Inc. [New York]Customer Success Story: Mars Inc. [New York]
Customer Success Story: Mars Inc. [New York]
SAP Ariba
 
M&M's case study
M&M's case studyM&M's case study
M&M's case study
Pat Velayo
 
WALTHAM Puppy Growth Charts
WALTHAM Puppy Growth ChartsWALTHAM Puppy Growth Charts
WALTHAM Puppy Growth Charts
Waltham Centre for Pet Nutrition
 
Mars Incorporated Marketing Analysis
Mars Incorporated Marketing AnalysisMars Incorporated Marketing Analysis
Mars Incorporated Marketing Analysis
Emily Crowther
 
Mars, incorporated strategic swot analysis review
Mars, incorporated   strategic swot analysis reviewMars, incorporated   strategic swot analysis review
Mars, incorporated strategic swot analysis review
CompanyProfile123
 
Whiskas presentation updated
Whiskas presentation updatedWhiskas presentation updated
Whiskas presentation updated
rebeccashackley
 

Viewers also liked (16)

Drinks Menu_2015 NAM
Drinks Menu_2015 NAMDrinks Menu_2015 NAM
Drinks Menu_2015 NAM
 
General Additive Models in R
General Additive Models in RGeneral Additive Models in R
General Additive Models in R
 
Мероприятия как инструмент работы с молодыми специалистами и продвижения брен...
Мероприятия как инструмент работы с молодыми специалистами и продвижения брен...Мероприятия как инструмент работы с молодыми специалистами и продвижения брен...
Мероприятия как инструмент работы с молодыми специалистами и продвижения брен...
 
Mars
MarsMars
Mars
 
Uncle Ben's Recipe Video Contest Flyer
Uncle Ben's Recipe Video Contest FlyerUncle Ben's Recipe Video Contest Flyer
Uncle Ben's Recipe Video Contest Flyer
 
Seattle SeaHawks
Seattle SeaHawksSeattle SeaHawks
Seattle SeaHawks
 
Introduction to MARS (1999)
Introduction to MARS (1999)Introduction to MARS (1999)
Introduction to MARS (1999)
 
Mars Business report
Mars  Business reportMars  Business report
Mars Business report
 
Sara's m&m slideshow
Sara's m&m slideshowSara's m&m slideshow
Sara's m&m slideshow
 
Mars incorporated interview questions and answers
Mars incorporated interview questions and answersMars incorporated interview questions and answers
Mars incorporated interview questions and answers
 
Customer Success Story: Mars Inc. [New York]
Customer Success Story: Mars Inc. [New York]Customer Success Story: Mars Inc. [New York]
Customer Success Story: Mars Inc. [New York]
 
M&M's case study
M&M's case studyM&M's case study
M&M's case study
 
WALTHAM Puppy Growth Charts
WALTHAM Puppy Growth ChartsWALTHAM Puppy Growth Charts
WALTHAM Puppy Growth Charts
 
Mars Incorporated Marketing Analysis
Mars Incorporated Marketing AnalysisMars Incorporated Marketing Analysis
Mars Incorporated Marketing Analysis
 
Mars, incorporated strategic swot analysis review
Mars, incorporated   strategic swot analysis reviewMars, incorporated   strategic swot analysis review
Mars, incorporated strategic swot analysis review
 
Whiskas presentation updated
Whiskas presentation updatedWhiskas presentation updated
Whiskas presentation updated
 

Similar to Predictions from MARS

Feature Engineering for IoT
Feature Engineering for IoTFeature Engineering for IoT
Feature Engineering for IoT
NUS-ISS
 
DREAM Principles & User Guide 1.0
DREAM Principles & User Guide 1.0DREAM Principles & User Guide 1.0
DREAM Principles & User Guide 1.0
Marcus Drost
 
Project Output PowerPoint Presentation Slides
Project Output PowerPoint Presentation Slides Project Output PowerPoint Presentation Slides
Project Output PowerPoint Presentation Slides
SlideTeam
 
Android Fundamentals & Figures of 2012
Android Fundamentals & Figures of 2012Android Fundamentals & Figures of 2012
Android Fundamentals & Figures of 2012
NAILBITER
 
Project Output Powerpoint Presentation Slides
Project Output Powerpoint Presentation SlidesProject Output Powerpoint Presentation Slides
Project Output Powerpoint Presentation Slides
SlideTeam
 
What is the Lifecycle Modeling Language?
What is the Lifecycle Modeling Language?What is the Lifecycle Modeling Language?
What is the Lifecycle Modeling Language?
SarahCraig7
 
Moodle mootjapan2013 「Moodle±5年」 English Version
Moodle mootjapan2013 「Moodle±5年」 English VersionMoodle mootjapan2013 「Moodle±5年」 English Version
Moodle mootjapan2013 「Moodle±5年」 English Version
鈴鹿工業高等専門学校
 
Developer Workshop Summary
Developer Workshop SummaryDeveloper Workshop Summary
Developer Workshop Summary
health2dev
 
Developer workshop summary
Developer workshop summaryDeveloper workshop summary
Developer workshop summary
health2dev
 
Developer workshop summary
Developer workshop summaryDeveloper workshop summary
Developer workshop summary
health2dev
 
LoCloud - D3.4: Vocabulary services
LoCloud - D3.4: Vocabulary servicesLoCloud - D3.4: Vocabulary services
LoCloud - D3.4: Vocabulary services
locloud
 
TESCO Meter Testing
TESCO Meter TestingTESCO Meter Testing
Unlv user forum_jennifermercer_theraisersedge
Unlv user forum_jennifermercer_theraisersedgeUnlv user forum_jennifermercer_theraisersedge
Unlv user forum_jennifermercer_theraisersedge
carolinestallings
 
Demystifying Cloud Security
Demystifying Cloud SecurityDemystifying Cloud Security
Demystifying Cloud Security
Ben Clay, CSP (IoT - Expert)
 
PhoneGap build
PhoneGap buildPhoneGap build
PhoneGap build
hardeepshoker
 
From 12 to 3500 deployments per year in production
From 12 to 3500 deployments per year in production From 12 to 3500 deployments per year in production
From 12 to 3500 deployments per year in production
Archie Cowan
 
Project Result PowerPoint Presentation Slides
Project Result PowerPoint Presentation Slides Project Result PowerPoint Presentation Slides
Project Result PowerPoint Presentation Slides
SlideTeam
 
Projects
ProjectsProjects
Projects
Eray Diler
 
Android Workshop Session 1
Android Workshop Session 1Android Workshop Session 1
Android Workshop Session 1
NAILBITER
 
Innovative it project management practices
Innovative it project management practicesInnovative it project management practices
Innovative it project management practices
Tathagat Varma
 

Similar to Predictions from MARS (20)

Feature Engineering for IoT
Feature Engineering for IoTFeature Engineering for IoT
Feature Engineering for IoT
 
DREAM Principles & User Guide 1.0
Predictions from MARS

  • 1. May 2012. Maria Lupetini, Engineering Asset Management & Analytics, Qualcomm Incorporated
  • 2.
      • Advantages of MARS Modeling
      • Predicting Demand for an Asset
      • Capturing Trends and Seasonal Effects
      • Finding Interactive Effects
      • Weighting More Recent Data
      • Autoregressive Model for Time Series
      • Using Lag Variables
      • Don’t be Afraid of Missing Values
      • Summary of Findings
  • 3.
      • Regression: Linear, Logistic, GLM, MARS
      • ARIMA Time Series
      • Decision Trees
      • Neural Networks
      • Support Vector Machines
      • And more
    Need to pick one or more approaches tailored to the problem you are tackling
  • 4.
      • Sales - Dollars, Number of Chips
      • Resources - People, Software Assets
      • Performance of a Semiconductor - Seconds to load a web page
      • …You name it.
  • 5.
      • Data contains continuous numbers
          ◦ $123,456.00
          ◦ Number of employees
      • Understand influences of categories
          ◦ Geographical regions
          ◦ Operating system: Windows, Android
      • Seasonal or repeated trends
          ◦ Months of the year
          ◦ Christmas season
      • Special Effects
          ◦ Consumer Promotions and Advertising
          ◦ Switch turned on
  • 6. What do you do if you want to predict a trend or find a pattern in data… and
      • There are hundreds of possible variables that influence your outcome
          ◦ Which ones matter?
      • What if the variables interact with each other and affect the outcome?
          ◦ How do you find those relationships?
      • What if variables are not linearly related to the outcome?
          ◦ How do you determine what the relationship curves will look like?
          ◦ Threshold or plateau relationship
      • What if the data you are using to predict is a mixture of numbers and categories?
          ◦ How do you build a prediction formula?
      • How do I build a prediction model that is easy to understand?
    … USE MARS
  • 7.
      • MARS is short for Multivariate Adaptive Regression Splines
      • Technique introduced in 1991 by Jerome Friedman, Stanford University
      • Nonparametric, data-driven algorithm
      • Prediction is a regression model with additional side equations (basis functions)
      • Uses piecewise regression splines to build the prediction
      • Provides data reduction to select which variables matter
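As a rough illustration of the "piecewise regression splines" idea, the sketch below builds a prediction from a pair of hinge basis functions of the form max(0, x - knot) and max(0, knot - x). The intercept, knot, and slopes are made-up numbers chosen for illustration only; they do not come from the slides, and this is not the MARS fitting algorithm itself (which also searches for knots and prunes terms).

```python
def hinge_pos(x, knot):
    """max(0, x - knot): zero up to the knot, then rises with slope 1."""
    return max(0.0, x - knot)


def hinge_neg(x, knot):
    """max(0, knot - x): falls with slope 1 until the knot, then zero."""
    return max(0.0, knot - x)


def piecewise_fit(x, intercept=10.0, knot=5.0, slope_above=2.0, slope_below=-1.0):
    """Hypothetical two-term model: a different slope on each side of one knot.

    All default coefficients are illustrative assumptions, not fitted values.
    """
    return (intercept
            + slope_above * hinge_pos(x, knot)
            + slope_below * hinge_neg(x, knot))


if __name__ == "__main__":
    for x in (3.0, 5.0, 7.0):
        print(x, piecewise_fit(x))
```

Each hinge contributes only on one side of its knot, which is how a sum of such terms can trace thresholds and plateaus without any hand-specified curve.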
  • 8.
      • Software Used in Designing Semiconductor Chips
      • Is the use of the software growing?
      • What time of day are the software licenses most demanded?
      • Does demand change over the weekend?
      • How many copies do we need next week?
  • 9. [Chart: Number of Software Licenses Used in an Hour, from Aug 2011 to April 2012; hourly observations from 8/28/2011 to 4/10/2012, y-axis 0 to 350.] How do you forecast this time series of demand data?
  • 10.
      Time             Actual Licenses Used   Week Number   WeekDay   Day Name   Weekend   Holiday   Hour
      9/4/2011 9 PM    58                     37            1         Sun        1         Y         21
      9/4/2011 10 PM   75                     37            1         Sun        1         Y         22
      9/4/2011 11 PM   88                     37            1         Sun        1         Y         23
      9/5/2011 12 AM   81                     37            2         Mon        0         Y         0
      9/5/2011 1 AM    74                     37            2         Mon        0         Y         1
      9/5/2011 2 AM    80                     37            2         Mon        0         Y         2
      9/5/2011 3 AM    81                     37            2         Mon        0         Y         3
      • Real Continuous or Integer Variables: License Counts, Week Number
      • Categorical Text Variables: Holiday flag, Day Name
      • Binary Numbers: Weekend flag
      • Choice of Categorical or Real Number: Week Day, Hour
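The calendar predictors in the table above could be derived from the timestamp along the following lines. This is a minimal sketch, not the author's preparation code: the function name is hypothetical, and the week-number convention (weeks start on Sunday, week 1 contains January 1) is an assumption picked because it yields 37 for the sample rows; the slides do not state which convention was used. The Holiday flag is left out because it needs a holiday calendar that the slides do not provide.

```python
from datetime import datetime

DAY_NAMES = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]


def calendar_features(ts):
    """Derive week number, weekday (1 = Sunday), day name, weekend flag, hour."""
    # Python's weekday() is Monday=0..Sunday=6; shift to Sunday=1..Saturday=7.
    weekday = (ts.weekday() + 1) % 7 + 1
    # Offset of January 1 within its Sunday-based week (0 if Jan 1 is a Sunday).
    jan1_offset = (datetime(ts.year, 1, 1).weekday() + 1) % 7
    day_of_year = ts.timetuple().tm_yday
    # Assumed convention: week 1 contains Jan 1, weeks roll over on Sunday.
    week_number = (day_of_year + jan1_offset - 1) // 7 + 1
    return {
        "week_number": week_number,
        "weekday": weekday,
        "day_name": DAY_NAMES[weekday - 1],
        "weekend": 1 if weekday in (1, 7) else 0,
        "hour": ts.hour,
    }


if __name__ == "__main__":
    print(calendar_features(datetime(2011, 9, 4, 21)))
    print(calendar_features(datetime(2011, 9, 5, 0)))
```

Keeping WeekDay and Hour as plain integers leaves open the choice the slide mentions: they can be passed to the model either as real numbers or as categories.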
  • 11. Can we build a prediction model of the form: Demand = Constant Base + Baseline trend + Hour of day effect + Day of Week effect + Holiday effect?
  • 12. Setting Up Model in MARS
  • 13. Trend line captures:
      • Growing use of this software product from Sep 2011 to Apr 2012
      • Deadlines of semiconductor chip projects (Jan. and March)
  • 14. [Chart: Additional licenses needed as a function of hour of the day.] Hour Predictor Captures:
      • Highest use of licenses between 10 AM and 1 PM US Pacific time
      • Effect of use in European/Indian time zones
  • 15. [Chart: Additional licenses needed as a function of day of the week.] Weekday was coded as a continuous variable (1 = Sunday, 2 = Monday, etc.). Coding it as a categorical can also work here. Day of Week Predictor Captures:
      • Highest use of licenses during Wednesday to Friday
  • 16. Possible Interactive Effects Between Variables. Look for interactive effects between hour of day and day of week. Did not want to allow interactive effects between the week_number and holiday variables and other variables.
  • 17. [Chart: Additional licenses needed as a function of hour and day.] Interactive effect:
      • Work patterns are different on the weekends when compared to the work week.
  • 18. [Chart: Additional licenses needed on non-holidays.] Holiday Predictor Captures:
      • The difference in demand in an hour if it is a holiday
  • 19. Weighting of Observations. [Chart: Weight Applied to Observations (0 to 4) against Day and Hour of Observation (7/26/2011 to 5/21/2012).] MARS will consider a “variable” as a weighting factor. Here, the observations in April 2012 were 3 times more important than observations in Sep 2011.
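A weighting variable of this kind might be built as in the sketch below. The slide only says that April 2012 observations counted 3 times as much as September 2011 ones; the linear ramp between those two ends, and both function names, are assumptions made for illustration. The weighted mean is included only to show how such weights shift a summary toward recent data.

```python
def linear_weights(n, start=1.0, end=3.0):
    """Weights rising linearly from `start` (oldest row) to `end` (newest row).

    The linear shape is an assumption; the slides give only the 3-to-1 ratio.
    """
    if n <= 0:
        return []
    if n == 1:
        return [end]
    step = (end - start) / (n - 1)
    return [start + i * step for i in range(n)]


def weighted_mean(values, weights):
    """Average in which each value counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


if __name__ == "__main__":
    w = linear_weights(5)
    print(w)
    # The newest observation pulls the mean upward because it carries 3x weight.
    print(weighted_mean([100.0, 200.0], [1.0, 3.0]))
```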
  • 20. [Chart: Number of Software Licenses Used and Predicted, 4/8/2012 to 4/21/2012; y-axis 0 to 350. Panels: Part of the Training Dataset; Prediction on Unseen Data.] Blue line is Actual Licenses Used. Red line is MARS fit on Training Data for 4/8 to 4/15 and Prediction on 4/15 to 4/21.
  • 21. [Chart: Number of Software Licenses Used, Actual vs. Prediction Model, Training Dataset, 8/28/2011 to 4/10/2012; y-axis 0 to 350.] MARS was able to capture:
      • Overall trend
      • Hourly and Week Day effect
      • Somewhat captured US holidays
  • 22. MARS tells you which variables are most important. Great R-Squared of 90%. Other diagnostics, not presented here, looked good too.
      Variable       Importance    -gcv
      ---------------------------------------------------------------
      WEEKDAY        100.00000     2713.86182
      HOUR            93.20326     2418.96997
      WEEK_NUMBER     44.00605      903.06390
      HOLIDAY$        21.76427      574.55463
      ==============================
      N: 15217.52                      R-SQUARED: 0.90281
      MEAN DEP VAR: 158.15640          ADJ R-SQUARED: 0.90214
      UNCENTERED R-SQUARED = R-0 SQUARED: 0.98493
      F-STATISTIC = 1344.99320         S.E. OF REGRESSION = 35.12427
      P-VALUE = 0.00000                RESIDUAL SUM OF SQUARES = .678790E+07
      [MDF,NDF] = [ 38, 5502 ]         REGRESSION SUM OF SQUARES = .630548E+08
    Actual Used: Range 45 to 344 Licenses, Average 95, Standard Dev. 70
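The fit statistics in this output hang together arithmetically, which the short check below illustrates. It assumes the usual decomposition (total sum of squares = regression + residual) and the textbook formulas for R-squared, the F-statistic, and the standard error of regression; the variable names are mine, and the inputs are the rounded values printed on the slide, so the results match the reported figures only approximately.

```python
import math

# Values copied from the MARS output on the slide (rounded as printed).
residual_ss = 0.678790e7      # RESIDUAL SUM OF SQUARES
regression_ss = 0.630548e8    # REGRESSION SUM OF SQUARES
mdf, ndf = 38, 5502           # [MDF, NDF]: model and residual degrees of freedom

# Share of total variation explained by the model.
r_squared = regression_ss / (regression_ss + residual_ss)

# Ratio of explained to unexplained variation, each per degree of freedom.
f_statistic = (regression_ss / mdf) / (residual_ss / ndf)

# Typical size of a residual, in licenses.
se_regression = math.sqrt(residual_ss / ndf)

if __name__ == "__main__":
    print(round(r_squared, 5), round(f_statistic, 2), round(se_regression, 3))
```

A standard error of roughly 35 licenses against an observed range of 45 to 344 gives a feel for how tight the 90% R-squared fit is in practical terms.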
  • 23. Can we build a prediction model of the autoregressive form: Demand = Constant Base + Baseline trend + Effect of Licenses Used from a week ago + Workweek vs. Weekend effect + Holiday effect?
  • 24.
  • 25. Set Up Autoregressive Model, Part 2. Creating the lag variable “Used Lag168.” This predictor is the number of licenses used in the same hour of the same day in the prior week.
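Outside of the MARS software, a 168-hour lag (7 days × 24 hours) could be built as in the sketch below. It assumes the series is a plain list of hourly counts with no gaps in the timestamps; the function name is hypothetical, and `None` stands in for the missing values that the first week necessarily has.

```python
def lag_variable(series, k=168):
    """Return a list where position i holds series[i - k], or None if i < k.

    Assumes evenly spaced hourly observations with no missing hours, so that
    a shift of 168 positions means "same hour, same day, prior week".
    """
    return [series[i - k] if i >= k else None for i in range(len(series))]


if __name__ == "__main__":
    hourly_used = list(range(400))          # stand-in data, not real counts
    used_lag168 = lag_variable(hourly_used)
    print(used_lag168[167], used_lag168[168], used_lag168[399])
```

The leading `None` values are exactly the kind of missing data the deck says not to be afraid of: the model can carry an indicator for "lag is missing" instead of dropping those rows.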
  • 26. MARS found the underlying trend when adjusting for other factors in the autoregressive version of the model. Adjusting for the underlying trend makes the series stationary. This is necessary for ARIMA models.
  • 27. MARS captures contribution of Used Lag 168 hours variable
  • 28. Selected MARS Output Showing Model Form and Fit. Basis Functions and Prediction Equation from MARS. Note the handling of missing values. Reasonable fit with 82% R-squared.
      BF1 = ( USED<168> ne . );
      BF2 = ( USED<168> = . );
      BF3 = max( 0, USED<168> - 42) * BF1;
      BF4 = max( 0, 42 - USED<168>) * BF1;
      BF5 = (HOLIDAY$ in ( "Y" ));
      BF7 = (MON_TO_FRI in ( 0 ));
      BF9 = max( 0, WEEK_NUMBER - 50) * BF1;
      BF10 = max( 0, 50 - WEEK_NUMBER) * BF1;
      BF11 = max( 0, USED<168> - 137) * BF1;
      BF13 = max( 0, USED<168> - 265) * BF1;
      BF15 = (MON_TO_FRI in ( 0 )) * BF2;
      Number of Licenses Needed = 134 - 39 * BF1 + 0.58 * BF3 - 2.12 * BF4 - 42 * BF5 - 21.6 * BF7 - 0.235 * BF9 - 1.598 * BF10 + 0.338 * BF11 - 0.535 * BF13 - 38 * BF15;
      N: 15055.88                      R-SQUARED: 0.82525
      MEAN DEP VAR: 158.75413          ADJ R-SQUARED: 0.82493
      F-STATISTIC = 2533.14901         S.E. OF REGRESSION = 47.37796
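To make the basis functions concrete, here is a sketch that transcribes the prediction equation from this slide into a plain function. The coefficients and knots are copied from the MARS output above; the function and argument names are my own, and representing a missing lag as `None` is an assumption about how the data would be held outside the MARS software.

```python
def hinge(value):
    """max(0, value), the building block of every spline term below."""
    return max(0.0, value)


def licenses_needed(used_lag168, week_number, holiday, mon_to_fri):
    """Evaluate the slide's equation for one hour.

    used_lag168: licenses used 168 hours earlier, or None if missing.
    holiday: "Y" or "N".  mon_to_fri: 1 on workdays, 0 on weekends.
    """
    missing = used_lag168 is None
    bf1 = 0.0 if missing else 1.0           # lag is present
    bf2 = 1.0 - bf1                         # lag is missing
    lag = 0.0 if missing else float(used_lag168)

    bf3 = hinge(lag - 42) * bf1
    bf4 = hinge(42 - lag) * bf1
    bf5 = 1.0 if holiday == "Y" else 0.0
    bf7 = 1.0 if mon_to_fri == 0 else 0.0   # weekend indicator
    bf9 = hinge(week_number - 50) * bf1
    bf10 = hinge(50 - week_number) * bf1
    bf11 = hinge(lag - 137) * bf1
    bf13 = hinge(lag - 265) * bf1
    bf15 = bf7 * bf2                        # weekend and lag missing

    return (134 - 39 * bf1 + 0.58 * bf3 - 2.12 * bf4 - 42 * bf5
            - 21.6 * bf7 - 0.235 * bf9 - 1.598 * bf10
            + 0.338 * bf11 - 0.535 * bf13 - 38 * bf15)


if __name__ == "__main__":
    print(licenses_needed(42, 50, "N", 1))     # lag present, at both knots
    print(licenses_needed(None, 50, "N", 0))   # lag missing, on a weekend
```

With the lag present, the constant 134 and the -39 on BF1 combine to a base of 95; with the lag missing, every spline term multiplied by BF1 drops out and only the holiday and weekend terms adjust the constant.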
  • 29. For observations where the 168 lag of the “Used” variable is not missing:
      Holiday = 1 if it’s a holiday, else 0
      Weekend = 1 if it’s Saturday or Sunday, else 0
      Autoregressive Splines:
      A = max( 0, USED<168> - 42)
      B = max( 0, 42 - USED<168>)
      C = max( 0, USED<168> - 137)
      D = max( 0, USED<168> - 265)
      Trend line Splines:
      E = max( 0, WEEK_NUMBER - 50)
      F = max( 0, 50 - WEEK_NUMBER)
      Forecasted License Need = 95 - 42 * Holiday - 22 * Weekend + [0.6 * A - 2.1 * B + 0.3 * C - 0.5 * D] + [- 0.2 * E - 1.6 * F]
  • 30. [Chart: USED vs. Predicted, 9/4/2011 to 4/16/2012; y-axis 0 to 400.]
  • 31. [Chart: Number of Licenses Used and Predicted, 4/8/2012 to 4/21/2012; y-axis 0 to 400. Panels: Part of Training Dataset; Forecasting Unseen Data.] Blue line is Actual Used. Red line is MARS fit on Training data for 4/8 to 4/14 and Prediction on 4/15 to 4/21 data.
  • 32. Compare Forecast of Two Models to Actual Licenses Used. [Chart: Number of Licenses (0 to 400), 4/8/2012 to 4/21/2012. Series: Predicted_AutoRegressive, Actual Used, Predicted Not Auto Reg.]
  • 33. Mathematically
      • MARS is versatile; it models most data types
      • Selects best predictors
      • Models nonlinear relationships
      • Easily finds selective interactive effects
      • Simple to create lag variables as predictors
      • Flexible weighting schemes for observations
      • Can handle missing values
    Operationally
      • Don’t call me for more software license copies on Thursday at noon; everyone else is!