Like any other retail company in the food industry, this company's revenue may be affected by many
factors, such as season, customer awareness, location, and the age of the company. There are many
ways to predict future sales from past experience.
One way to predict sales is to build a regression model based on the significant factors.
Over the last year and a half, the company collected data about its revenue: the day, the revenue
for that day, and the number of orders.
Building a multiple regression model required adding information to the data set. For instance,
each “day” was converted into three variables: “year”, “month”, and “day of the week”. In addition,
dummy variables were added for 11 months and 6 days of the week. An additional variable, “holiday”,
which can potentially influence sales, was also added to the data set.
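The encoding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `encode_day`, the coding of “Year” as the number of years since the first year of data, and the example dates are hypothetical and are not taken from the company's data set. January and Friday get no dummy variable because they serve as the baseline.

```python
from datetime import date

ALL_MONTHS = ["January", "February", "March", "April", "May", "June", "July",
              "August", "September", "October", "November", "December"]
ALL_DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday",
            "Saturday", "Sunday"]  # same order as date.weekday()

# Baseline categories get no dummy variable: 11 month and 6 weekday dummies remain.
MONTH_DUMMIES = [m for m in ALL_MONTHS if m != "January"]
DAY_DUMMIES = [d for d in ALL_DAYS if d != "Friday"]


def encode_day(day, first_year, holidays=frozenset()):
    """Turn one calendar date into the 19 predictor variables of the model.

    Assumption: "Year" is coded as years elapsed since `first_year`.
    `holidays` is a hypothetical set of dates flagged as holidays.
    """
    row = {"Year": day.year - first_year}
    month_name = ALL_MONTHS[day.month - 1]
    for m in MONTH_DUMMIES:
        row[m] = int(m == month_name)
    day_name = ALL_DAYS[day.weekday()]
    for d in DAY_DUMMIES:
        row[d] = int(d == day_name)
    row["Holiday"] = int(day in holidays)
    return row


# Hypothetical usage: a Saturday in July of the first year, not a holiday.
example = encode_day(date(2015, 7, 4), first_year=2015)
```

Each encoded row has 19 entries (Year, 11 months, 6 weekdays, Holiday), which matches the 19 model degrees of freedom reported in Table 2.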
The response variable for the multiple regression model is “Revenue”, and the predictor variables are
Year, February, March, April, May, June, July, August, September, October, November, December,
Holiday, Monday, Tuesday, Wednesday, Thursday, Saturday, and Sunday.
After running the multiple regression in the software, the following equation was built:
Revenue = 60782.467 + 6299.3567*Year
  + 64.674037*February + 1111.6888*March + 1612.0077*April + 5245.2574*May
  + 7535.7937*June + 18703.282*July + 16775.916*August + 13967.956*September
  + 10346.342*October + 9214.7979*November + 5881.4187*December
  + 31711.132*Holiday
  - 27277.269*Monday - 24640.78*Tuesday - 11204.269*Wednesday + 10031.046*Thursday
  - 27260.071*Saturday - 28896.612*Sunday
where January and Friday were chosen as the baseline categories, so their effect is included in the intercept.
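As a minimal sketch of how the fitted equation turns into a prediction, the coefficients from Table 1 can be placed in lookup tables and summed. The function name `predict_revenue` is hypothetical, and the coding of `year` (for example 0 for the first year of data and 1 for the second) is an assumption, since the report does not state how Year was coded.

```python
INTERCEPT = 60782.467
YEAR_COEF = 6299.3567
HOLIDAY_COEF = 31711.132

# January and Friday are the baseline categories, so their effect is 0.
MONTH_EFFECT = {
    "January": 0.0, "February": 64.674037, "March": 1111.6888,
    "April": 1612.0077, "May": 5245.2574, "June": 7535.7937,
    "July": 18703.282, "August": 16775.916, "September": 13967.956,
    "October": 10346.342, "November": 9214.7979, "December": 5881.4187,
}
DAY_EFFECT = {
    "Monday": -27277.269, "Tuesday": -24640.78, "Wednesday": -11204.269,
    "Thursday": 10031.046, "Friday": 0.0, "Saturday": -27260.071,
    "Sunday": -28896.612,
}


def predict_revenue(year, month, weekday, holiday=False):
    """Predicted daily revenue from the fitted coefficients.

    Assumption: `year` uses the same coding as the Year variable in the
    original data set, which the report does not specify.
    """
    total = INTERCEPT + YEAR_COEF * year
    total += MONTH_EFFECT[month]
    total += DAY_EFFECT[weekday]
    if holiday:
        total += HOLIDAY_COEF
    return total


baseline = predict_revenue(0, "January", "Friday")
summer_sunday = predict_revenue(0, "July", "Sunday")
```

For the baseline case (Year = 0, January, Friday, no holiday) every dummy variable is zero, so the prediction equals the intercept, 60782.467.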
Parameter    Estimate      Std. Err.    Alternative   DF    T-Stat        P-value
Intercept    60782.467     2433.0199    ≠ 0           588   24.982314     <0.0001
Year         6299.3567     1025.005     ≠ 0           588   6.1456838     <0.0001
February     64.674037     2096.0356    ≠ 0           588   0.03085541    0.9754
March        1111.6888     2055.2442    ≠ 0           588   0.54090352    0.5888
April        1612.0077     2092.8673    ≠ 0           588   0.77023882    0.4415
May          5245.2574     2045.5038    ≠ 0           588   2.5642864     0.0106
June         7535.7937     2066.248     ≠ 0           588   3.6470907     0.0003
July         18703.282     2076.2135    ≠ 0           588   9.0083619     <0.0001
August       16775.916     2078.3279    ≠ 0           588   8.071833      <0.0001
September    13967.956     2592.4881    ≠ 0           588   5.3878572     <0.0001
October      10346.342     2574.7308    ≠ 0           588   4.0184171     <0.0001
November     9214.7979     2592.5503    ≠ 0           588   3.5543371     0.0004
December     5881.4187     2557.4269    ≠ 0           588   2.2997407     0.0218
Holiday      31711.132     1955.9221    ≠ 0           588   16.212881     <0.0001
Monday       -27277.269    1717.2913    ≠ 0           588   -15.883892    <0.0001
Tuesday      -24640.78     1715.4465    ≠ 0           588   -14.364062    <0.0001
Wednesday    -11204.269    1715.3106    ≠ 0           588   -6.5319183    <0.0001
Thursday     10031.046     1713.7723    ≠ 0           588   5.8531962     <0.0001
Saturday     -27260.071    1713.6991    ≠ 0           588   -15.907152    <0.0001
Sunday       -28896.612    1719.6078    ≠ 0           588   -16.804188    <0.0001
Table 1. Parameter estimates
Source   DF    SS             MS             F-stat      P-value
Model    19    1.9297489e11   1.0156573e10   79.572676   <0.0001
Error    588   7.5051704e10   1.2763895e8
Total    607   2.6802659e11
Table 2. Analysis of variance table for multiple regression model
Summary of fit:
Root MSE: 11297.741
R-squared: 0.72
R-squared (adjusted): 0.7109
A multiple regression model shows the importance of each chosen parameter (through its p-value)
and the share of the variation in revenue that the model explains (R-squared and adjusted R-squared).
R-squared is 0.72 and adjusted R-squared is 0.7109, meaning that about 71% of the variation in daily
revenue is explained by the fitted regression model and almost 29% is not.
Overall, the p-value of the model is < 0.0001, which indicates that at least one of the coefficients is not equal to 0.
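The summary-of-fit figures follow directly from the analysis of variance table. The short sketch below recomputes them from the sums of squares and degrees of freedom in Table 2, using the standard definitions; the variable names are chosen here for illustration only.

```python
import math

# Values copied from Table 2 (analysis of variance).
SS_MODEL, DF_MODEL = 1.9297489e11, 19
SS_ERROR, DF_ERROR = 7.5051704e10, 588
SS_TOTAL, DF_TOTAL = 2.6802659e11, 607

ms_model = SS_MODEL / DF_MODEL   # mean square for the model
ms_error = SS_ERROR / DF_ERROR   # mean square error (MSE)

# Share of the total variation in revenue explained by the model.
r_squared = 1 - SS_ERROR / SS_TOTAL

# Adjusted R-squared penalizes the number of predictors.
adj_r_squared = 1 - ms_error / (SS_TOTAL / DF_TOTAL)

# Overall F-statistic and the typical size of a residual.
f_stat = ms_model / ms_error
root_mse = math.sqrt(ms_error)

print(round(r_squared, 4), round(adj_r_squared, 4),
      round(f_stat, 3), round(root_mse, 3))
```

The recomputed values agree with the reported R-squared of 0.72, adjusted R-squared of 0.7109, F-statistic of about 79.57, and root MSE of about 11297.74.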
Some of the individual p-values, for instance those for February, March, and April, are higher than
0.05. Typically, the coefficient p-values determine which parameters to keep in the regression
model. At the same time, excluding a couple of months from the model would remove the ability to
predict sales specifically for those months, so all months are kept.
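To make the link between a coefficient and its p-value concrete, the sketch below recomputes the t-statistic as estimate divided by standard error for three rows of Table 1. The p-value here uses a normal approximation, which is an assumption made to keep the example in the Python standard library; the software behind Table 1 presumably used the t-distribution with 588 degrees of freedom, so small differences in the last digits are expected.

```python
import math

# (estimate, standard error) pairs copied from Table 1.
ROWS = {
    "February": (64.674037, 2096.0356),
    "May": (5245.2574, 2045.5038),
    "July": (18703.282, 2076.2135),
}


def t_statistic(estimate, std_err):
    """t-statistic for the hypothesis that the coefficient equals 0."""
    return estimate / std_err


def two_sided_p_normal(t):
    """Two-sided p-value under a standard normal approximation.

    With 588 degrees of freedom the t-distribution is very close to normal,
    so this is only an approximation of the reported p-values.
    """
    return math.erfc(abs(t) / math.sqrt(2))


results = {}
for name, (estimate, std_err) in ROWS.items():
    t = t_statistic(estimate, std_err)
    results[name] = (t, two_sided_p_normal(t))
```

February's estimate is tiny relative to its standard error, which gives a p-value close to 1, while July's estimate is about nine standard errors away from zero.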
To be sure that the model is valid, the following conditions should be checked. The first
assumption is that the errors around the idealized regression model at any specified values of the
X-variables follow a Normal model. Graph 1 is a histogram of the residuals. It suggests that the
residuals are approximately normally distributed.
Graph 1.
The second condition is the Plot Thickness Condition. The scatterplot (Graph 2) of residuals against
predicted values shows no obvious changes in the spread about the line.
Graph 2.
The last condition is the Nearly Normal Condition: a histogram of the residuals (Graph 3) is unimodal and symmetric.
Graph 3.
Meeting these conditions allows us to use the results of the multiple regression model.
The index/time plot (Graph 4) shows the revenues and the predicted values for the whole period.
Graph 4.
The predicted values are more stable than the revenues. With the same mean, they have noticeably different
standard deviations (Table 3). The positive differences can be explained by unaccounted
parameters such as promotions, sales, or coupons distributed before that day. Similarly, the
negative differences could be explained by hardware failures or bad weather, when
customers do not want to leave their houses to shop.
Column         Mean        Std. dev.   Coef. of var.
Revenues       63194.778   21013.316   33.251666
Pred. Values   63194.778   17830.193   28.21466
Table 3. Summary statistics
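The numbers in Table 3 can be cross-checked with a short calculation. The coefficient of variation is the standard deviation expressed as a percentage of the mean. In addition, for a least-squares model with an intercept, the fitted values have the same mean as the response, and the ratio of their variance to the variance of the response equals R-squared. The sketch below only uses the values printed in Table 3; the variable names are illustrative.

```python
# Values copied from Table 3.
MEAN = 63194.778
SD_REVENUE = 21013.316
SD_PREDICTED = 17830.193


def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    return 100.0 * std_dev / mean


cv_revenue = coefficient_of_variation(SD_REVENUE, MEAN)
cv_predicted = coefficient_of_variation(SD_PREDICTED, MEAN)

# For least squares with an intercept, Var(predicted) / Var(actual) = R-squared.
variance_ratio = (SD_PREDICTED / SD_REVENUE) ** 2

print(round(cv_revenue, 3), round(cv_predicted, 3), round(variance_ratio, 3))
```

The variance ratio comes out at about 0.72, the same as the reported R-squared, which is why the predicted values are necessarily less spread out than the actual revenues.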
In addition, it is good to remember that the model is built on past experience. In the future, at any
point in time, any of the parameters, or the relationships between them, may change. The regression
model will then need to change as well.

Sale prediction
