PRERNA SHARMA
July 18, 2016
FusionOps
DS Developer Test
FusionOps|7/18/2016
Solution -1
Flow Chart
Sales Data Analysis
Summary:-
Sales Date Sales Quantity Unit Cost Unit Price Marketing Spent Discount Flag
Count 4000 4000 4000 4000 4000 4000
Mean 20132640.4 5047.348838 4.12248 12.1791675 193.22525 0.3215
Std 9731.630793 26503.76132 8.65621441 25.80803025 318.6396473 0.467110584
Min 20120131 9 0.07 0.32 3 0
25% 20121105.25 408.75 0.91 2.9 60 0
50% 20130880.5 963 1.46 4.32 108 0
75% 20140655.25 2117 2.2 6.51 203.25 1
Max 20150430 390630 52.38 166.43 3966 1
TABLE 1:- SUMMARY OF THE SALES LEVEL DATA
The above table gives the central tendency and variation of the given data.
The Mean of the Sales Quantity is 5047.348838 and the Median (the 50% row) is 963. The Mean of the Sales Quantity is therefore much higher than the Median, which indicates a right-skewed distribution with a few very large orders.
Similarly, the Mean of the Marketing Spent is 193.22525 and the Median of the Marketing Spent is 108, so the Mean of the Marketing Spent is also higher than the Median.
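The comparison of mean and median can be illustrated with a small sketch. The sample below is hypothetical (it is not taken from the test data) and only shows how a single very large order pulls the mean far above the median:

```python
import statistics

# Hypothetical right-skewed sample: most orders are small, one is very large.
quantities = [9, 400, 900, 1000, 2100, 390630]

mean_q = statistics.mean(quantities)      # pulled upwards by the large order
median_q = statistics.median(quantities)  # middle of the sorted values

# A mean far above the median signals a long right tail.
is_right_skewed = mean_q > median_q
print(mean_q, median_q, is_right_skewed)
```

When the two measures differ this much, the median is usually the more representative "typical" value.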
Uni-Variate and Bi-Variate Analysis:-
FIGURE 1:-PAIR WISE PLOT OF SALES LEVEL ANALYSIS
FIGURE 2:- BARPLOT OF ABC CLASS
From the above Figure 2, we found the total count of each category of the ABC Class, as shown below:-
ABC Class Total Counts Discount Flags
A 1080 351
B 1480 453
C 1440 482
TABLE 2 CATEGORIES OF CLASS ABC
Summary of the Categories of the Class ABC (Product Segmentation):-
Sales Date Sales Quantity Unit Cost Unit Price Marketing Spent Discount Flag
Count 1080 1080 1080 1080 1080 1080
Mean 20132640.4 16049.80648 4.633481481 17.26198148 453.6935185 0.325
Std 9734.921816 49349.07531 8.36362872 35.78886613 520.6330711 0.468591841
Min 20120131 47 0.07 0.32 54 0
25% 20121105.25 1266.75 0.45 1.55 190.5 0
50% 20130880.5 4185.5 1.03 3.12 297 0
75% 20140655.25 10848 2.9 6.725 479.25 1
Max 20150430 390630 32.39 166.43 3966 1
TABLE 3:- SUMMARY OF CLASS A
Sales Date Sales Quantity Unit Cost Unit Price Marketing Spent Discount Flag
count 1480 1480 1480 1480 1480 1480
mean 20132640.4 1460.31 4.752783784 12.00458108 132.7682432 0.306081081
std 9733.702872 992.2210921 9.222109011 22.10871481 71.55556198 0.461019588
min 20120131 33.9 0.45 1.78 27 0
25% 20121105.25 783.5 0.8875 2.77 75 0
50% 20130880.5 1394 1.2 3.29 117 0
75% 20140655.25 1937.5 2.32 7.65 173.25 1
max 20150430 6951 39.21 110.8 531 1
TABLE 4:- SUMMARY OF CLASS B
Sales Date Sales Quantity Unit Cost Unit Price Marketing Spent Discount Flag
count 1440 1440 1440 1440 1440 1440
mean 20132640.4 482.1844097 3.091416667 8.546493056 60.01041667 0.334722222
std 9733.794272 294.1944846 8.167083891 18.85617808 34.8026187 0.472057205
min 20120131 9 0.66 3.02 3 0
25% 20121105.25 248 1.29 4.12 33 0
50% 20130880.5 464.5 1.74 5.01 53 0
75% 20140655.25 646.25 2.02 6.4725 78 1
max 20150430 1895 52.38 134.41 216 1
TABLE 5:- SUMMARY OF CLASS C
From the above summaries, we observe that:-
• Sales Quantity is higher in Class A than in Class B and Class C (mean 16049.81 versus 1460.31 and 482.18).
• Marketing Spent is higher in Class A than in Class B and Class C (mean 453.69 versus 132.77 and 60.01).
• Class A contains the most expensive sales records: its mean Unit Price (17.26) and its maximum Unit Price (166.43) are the highest of the three classes.
• The Standard Deviation of Marketing Spent in Class A is about 520, far larger than the respective values for Class B (about 72) and Class C (about 35).
• The mean Discount Flag is highest in Class C (0.335, versus 0.325 for Class A and 0.306 for Class B).
• Products in Class A are in much higher demand than products in Class B and Class C, as the total sales below show.
Class Total Sales
A 17333791.0
B 2161258.8
C 694345.55
TABLE 6 TOTAL SALES
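The totals in Table 6 can be cross-checked from the class summaries, since a total equals the row count multiplied by the mean. The sketch below copies the counts and mean Sales Quantity values from the class summary tables; the variable names are illustrative only:

```python
# (row count, mean Sales Quantity) per class, copied from the class summaries
summary = {
    "A": (1080, 16049.80648),
    "B": (1480, 1460.31),
    "C": (1440, 482.1844097),
}

# Total sales per class = count * mean
totals = {cls: count * mean for cls, (count, mean) in summary.items()}

# Share of each class in the overall sales quantity
grand_total = sum(totals.values())
share = {cls: total / grand_total for cls, total in totals.items()}

for cls in summary:
    print(cls, round(totals[cls], 2), round(100 * share[cls], 1))
```

Under these figures Class A accounts for well over four fifths of the total Sales Quantity.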
BARPLOT OF DISCOUNT FLAGS
FIGURE 3:- BARPLOT OF DISCOUNT FLAG
From the above barplot of Discount Flag, we see that more sales records fall under Discount than under Promotion, as shown in the table below:-
Discount Flag Total Counts
Promotion 1286
Discount 2714
TABLE 7 TOTAL COUNTS OF DISCOUNT FLAGS
From the above result we see that 67.85 percent of the sales records fall under Discount and the remaining 32.15 percent fall under Promotion.
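The percentages follow directly from the counts in Table 7. A minimal sketch of the arithmetic, using only the two counts reported there:

```python
# Counts copied from Table 7
counts = {"Promotion": 1286, "Discount": 2714}

total = sum(counts.values())
# Percentage share of each label, rounded to two decimals
shares = {label: round(100 * n / total, 2) for label, n in counts.items()}
print(total, shares)
```

The Promotion share of 32.15 percent also matches the mean of the Discount Flag column (0.3215) in Table 1.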
Correlation of Sales Level Data:-
The table below shows a notable positive correlation of 0.622740066 between Marketing Spent and Sales Quantity, which means that higher Marketing Spent tends to go together with higher Sales Quantity.
The correlation between Unit Price and Unit Cost is very strong at 0.940819548, which means that records with a higher Unit Cost tend to have a higher Unit Price as well.
Sales Date Sales Quantity Unit Cost Unit Price Marketing Spent Discount Flag
Sales Date 1 0.018289693 0.000642644 0.000710024 0.022292488 -0.006249567
Sales Quantity 0.018289693 1 -0.079925357 -0.077920384 0.622740066 0.000442116
Unit Cost 0.000642644 -0.079925357 1 0.940819548 0.106927754 0.00759269
Unit Price 0.000710024 -0.077920384 0.940819548 1 0.237334958 -0.035743076
Marketing Spent 0.022292488 0.622740066 0.106927754 0.237334958 1 -0.150267176
Discount Flag -0.006249567 0.000442116 0.00759269 -0.035743076 -0.150267176 1
TABLE 8 CORRELATION OF SALES LEVEL DATA
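Each entry of the correlation table is a Pearson correlation coefficient between two columns. The sketch below computes it by hand on hypothetical values (not the test data); in practice a library call such as pandas `DataFrame.corr()` would normally be used instead:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values: the price is exactly three times the cost,
# so the correlation is perfectly positive.
unit_cost = [1.0, 2.0, 3.0, 4.0]
unit_price = [3.0, 6.0, 9.0, 12.0]
r_up = pearson(unit_cost, unit_price)

# Reversing one series gives a perfectly negative correlation.
r_down = pearson(unit_cost, list(reversed(unit_price)))
print(r_up, r_down)
```

Values near 1 or -1 indicate a strong linear relationship, values near 0 indicate little or none.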
Product Level Analysis
Summary Of Product Level Data:-
Unit Cost Sales Quantity Unit Price Discount Flag Marketing Spent
count 100 100 100 100 100
mean 4.12248 201893.9535 12.1791675 12.86 7729.01
std 8.689409387 1036170.382 25.76945362 1.675853469 11271.81413
min 0.0775 1116 0.3785 10 931
25% 0.9100625 17890 2.892375 11 2897.25
50% 1.43425 38601.5 4.379625 13 4688
75% 2.157125 85749.45 6.6428125 14 8529.75
max 51.201 10290924 147.223 17 78490
TABLE 9 SUMMARY OF THE PRODUCT LEVEL DATA
The above table summarises the data at product level:-
The Mean of the Sales Quantity is 201893.9535 and the Median (the 50% row) is 38601.5.
The Mean of the Sales Quantity is therefore much higher than the Median, which again indicates a right-skewed distribution dominated by a few very large products.
Barplot of Class ABC (Product Level)
FIGURE 4:- BARPLOT OF CLASS ABC (PRODUCT LEVEL DATA)
From the above figure, we found the total count of each category of the ABC Class, as shown below:-
Abc Class Total Counts
A 27
B 37
C 36
TABLE 10 TOTAL COUNTS OF THE CLASS ABC
Summary of The Class ABC(Product Level Data):-
Unit Cost Sales Quantity Unit Price Discount Flag Marketing Spent
count 27 27 27 27 27
mean 4.633481481 641992.2593 17.26198148 13 18147.74074
Std 8.509789031 1951783.957 36.22357567 1.61721508 17879.64391
Min 0.0775 3933 0.3785 11 7353
25% 0.454375 56306.5 1.671125 11.5 9903.5
50% 1.005 177058 2.99625 13 12979
75% 2.666375 432854 6.612875 14.5 15234
Max 31.823 10290924 147.223 16 78490
TABLE 11 SUMMARY OF CLASS A (PRODUCT LEVEL)
Unit Cost Sales Quantity Unit Price Discount Flag Marketing Spent
count 37 37 37 37 37
mean 4.752783784 58412.4 12.00458108 12.24324324 5310.72973
Std 9.335749421 37122.25589 22.24332308 1.516674092 1464.675687
Min 0.485 1754.8 2.1505 10 3181
25% 0.88975 35531 2.81275 11 4255
50% 1.179 61624 3.155 12 5090
75% 2.27975 72011 7.22575 13 5879
Max 38.32575 152059 95.622 15 8979
TABLE 12 SUMMARY OF CLASS B (PRODUCT LEVEL)
Unit Cost Sales Quantity Unit Price Discount Flag Marketing Spent
count 36 36 36 36 36
mean 3.091416667 19287.37639 8.546493056 13.38888889 2400.416667
std 8.271471763 10679.29319 19.00472153 1.711770642 755.2505876
min 0.716 1116 3.60075 11 931
25% 1.369125 10757.75 4.29525 12 1789.25
50% 1.7075 20211.5 5.016625 13 2385.5
75% 2.0568125 25934 6.2524375 14.25 2921.75
max 51.201 38983 119.047 17 3811
TABLE 13 SUMMARY OF CLASS C (PRODUCT LEVEL)
From the above summaries of the PRODUCT LEVEL DATA, we observe that:-
• Sales Quantity is higher in Class A than in Class B and Class C (mean 641992.26 versus 58412.40 and 19287.38).
• Marketing Spent is higher in Class A than in Class B and Class C (mean 18147.74 versus 5310.73 and 2400.42).
• Class A contains the most expensive products: its mean Unit Price (17.26) and its maximum Unit Price (147.223) are the highest of the three classes.
• The Standard Deviation of Marketing Spent in Class A is about 17880, far larger than the respective values for Class B (about 1465) and Class C (about 755).
• The mean Discount Flag value is highest in Class C (13.39, versus 13.00 for Class A and 12.24 for Class B).
• Products in Class A are in much higher demand than products in Class B and Class C.
Correlation of Product Level Data
The table below again shows a notable correlation between Marketing Spent and Sales Quantity (0.694245221) and a very strong correlation between Unit Price and Unit Cost (0.947731937).
Unit Cost Sales Quantity Unit Price Discount Flag Marketing Spent
Unit Cost 1 -0.082260483 0.947731937 0.092942086 0.12113583
Sales Quantity -0.082260483 1 -0.080638021 0.017179196 0.694245221
Unit Price 0.947731937 -0.080638021 1 0.130603683 0.258419279
Discount Flag 0.092942086 0.017179196 0.130603683 1 0.053700919
Marketing Spent 0.12113583 0.694245221 0.258419279 0.053700919 1
TABLE 14 CORRELATION OF PRODUCT LEVEL DATA
Solution-2
Time Series Analysis
Product-1
1. Data
2. Models
3. Comparison
Product-2
1. Data
2. Models
3. Comparison
Product-3
1. Data
2. Models
3. Comparison
Product-4
1. Data
2. Models
3. Comparison
Product-5
1. Data
2. Models
3. Comparison
Product-6
1. Data
2. Models
3. Comparison
Product-7
1. Data
2. Models
3. Comparison
Product-8
1. Data
2. Models
3. Comparison
Product-9
1. Data
2. Models
3. Comparison
Product-10
1. Data
2. Models
3. Comparison
Results And Conclusions:-
Across the products, ARIMA stands out as the best of the models we fitted in the R statistical programming language. The auto.arima() function automatically selects ‘p’ (AR order), ‘d’ (degree of differencing) and ‘q’ (MA order), comparing candidate models with an information criterion such as AIC (Akaike’s Information Criterion).
Please find the attached result of the time series forecasting for the months of January, February and March for all the products.
In the AIC comparison bar plots, ARIMA has the minimum AIC for most products; in the remaining products Exponential Smoothing is slightly better than ARIMA.
Product 2, Product 4, Product 5, Product 6 and Product 7 show an increasing trend in Demand, whereas most of the other products show Demand fluctuating around a constant rolling mean.
Product 3, in contrast, shows a decreasing trend in Demand over time.
Product 4, Product 5, Product 6, Product 7, Product 8 and Product 9 also show seasonal behaviour.
Product January February March
Prod_1 540 476 466
Prod_2 2935 2941 2951
Prod_3 3279 3225 3229
Prod_4 8033 8108 7849
Prod_5 12450 12174 12255
Prod_6 1578 1609 1679
Prod_7 21174 20865 20204
Prod_8 233 284 257
Prod_9 82 125 60
Prod_10 1 0 2
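The model comparison described above picks, for each product, the model with the smallest AIC, where AIC = 2k - 2 ln(L) for a model with k estimated parameters and maximised likelihood L. The sketch below illustrates that selection rule only; the log-likelihoods and parameter counts are hypothetical and are not results from the test data:

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical (log-likelihood, number of parameters) per fitted model
candidates = {
    "ETS": (-120.0, 3),
    "ARIMA": (-115.0, 4),
    "TBATS": (-118.0, 6),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
# The preferred model is the one with the smallest AIC
best_model = min(scores, key=scores.get)
print(scores, best_model)
```

The penalty term 2k means that a model with more parameters must fit noticeably better to be preferred.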
Code :
library(forecast)  # ets(), auto.arima(), tbats(), forecast()
library(zoo)       # as.yearmon()

# Import the data for DataSet-2
demandingdata <- read.csv('/home/prerna/Documents/DSDeveloperTest/DataSet-2.csv')
demandingdata <- transform(demandingdata, ym = as.yearmon(as.character(demandingdata$Date), "%Y%m"))
pr <- c()
for (i in unique(demandingdata$Product)) {
  # Subset the rows of the current product
  temp <- demandingdata[demandingdata$Product == i, ]
  # Monthly Demand series starting in January 2013
  timedata <- ts(temp$Demand, start = 2013, frequency = 12)
  plot(timedata)
  # Exponential Smoothing (ETS) model, forecast 3 months into the future
  m_ets <- ets(timedata)
  f_ets <- forecast(m_ets, h = 3)
  plot(f_ets)
  # ARIMA model chosen by auto.arima(), forecast 3 months into the future
  m_aa <- auto.arima(timedata)
  f_aa <- forecast(m_aa, h = 3)
  plot(f_aa)
  # TBATS model, forecast 3 months into the future
  m_tbats <- tbats(timedata)
  f_tbats <- forecast(m_tbats, h = 3)
  plot(f_tbats)
  # Barplot comparing the AIC of ETS, ARIMA and TBATS
  barplot(c(ETS = m_ets$aic, ARIMA = m_aa$aic, TBATS = m_tbats$AIC), col = "light blue", ylab = "AIC")
  # Collect the ARIMA point forecasts of the current product as one row
  x <- as.data.frame(f_aa)
  pr <- rbind(pr, x$`Point Forecast`)
}
# Export the forecasts to CSV
write.csv(pr, file = 'predictedresult.csv')
Solution-3
Task-1
1) Creating the Database
a) $ mysql -u root -p ******
b) mysql> create database FusionOps;
2) Pushing the CSVs to MySQL Using Python:-
#!/usr/bin/python
import MySQLdb
import pandas as pd

# Open database connection
db = MySQLdb.connect("localhost", "root", "prerna1289", "FusionOps")
cursor = db.cursor()

# Load the sales CSV, parse the order dates and write the SalesData table.
# flavor='mysql' is the legacy pandas option for a raw MySQLdb connection.
df = pd.read_csv("/home/prerna/Desktop/SalesData.csv")
df['Sales Order Date'] = pd.to_datetime(df['Sales Order Date'])
df.to_sql(con=db, name='SalesData', if_exists='replace', flavor='mysql')
# cursor.execute() returns the number of rows selected
sd = cursor.execute("select * from SalesData;")
print(sd)

# Same steps for the purchasing CSV and the PurchasingData table
df = pd.read_csv("/home/prerna/Desktop/PurchasingData.csv")
df['Replenishment Date'] = pd.to_datetime(df['Replenishment Date'])
df.to_sql(con=db, name='PurchasingData', if_exists='replace', flavor='mysql')
pd1 = cursor.execute("select * from PurchasingData;")
print(pd1)
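The load step above (read a CSV, write its rows to a SQL table, then count the rows) can be tried without a MySQL server. The sketch below deliberately swaps in the standard-library `sqlite3` module and an in-memory database instead of MySQL; the three-column CSV content is hypothetical and much smaller than the real SalesData file:

```python
import csv
import io
import sqlite3

# Hypothetical CSV content standing in for SalesData.csv
csv_text = "Part_Number,ShopNo,Sales_Quantity\nP1,S1,5\nP1,S2,3\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# In-memory SQLite database used here in place of the MySQL server
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SalesData (Part_Number TEXT, ShopNo TEXT, Sales_Quantity INTEGER)"
)
# Named placeholders are filled from the keys of each CSV row
conn.executemany(
    "INSERT INTO SalesData VALUES (:Part_Number, :ShopNo, :Sales_Quantity)", rows
)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM SalesData").fetchone()[0]
total_quantity = conn.execute("SELECT SUM(Sales_Quantity) FROM SalesData").fetchone()[0]
print(row_count, total_quantity)
```

Checking the row count after the load is a quick way to confirm that every CSV line reached the table.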
Task-2
1)Creating DailySalesAndStockData table:-
#!/usr/bin/python
from datetime import date
import MySQLdb
import pandas as pd

# Open database connection
db = MySQLdb.connect("localhost", "root", "prerna1289", "FusionOps")
out_cols = ['Date', 'PartNo', 'ShopNo', 'Sales_Quantity', 'Sales_Quantity_Cum', 'End-of-day Stock']
data_you_need = pd.DataFrame(columns=out_cols)

# One iteration per distinct (part, shop) combination
df1 = pd.read_sql('select distinct Part_Number, ShopNo from SalesData;', con=db)
for index, row in df1.iterrows():
    key = [str(row['Part_Number']), str(row['ShopNo'])]
    # Parameterised queries keep the quoting of the values correct
    df_sale = pd.read_sql('select * from SalesData where Part_Number=%s and ShopNo=%s;', con=db, params=key)
    df_purchase = pd.read_sql('select * from PurchasingData where Part_Number=%s and ShopNo=%s;', con=db, params=key)
    # Daily calendar from 15 Dec 2014 to 31 Mar 2015
    date1 = pd.date_range(date(2014, 12, 15), date(2015, 3, 31), freq='D')
    df = pd.DataFrame(date1, index=date1, columns=['Date'])
    df['PartNo'] = key[0]
    df['ShopNo'] = key[1]
    # Attach the sales and the replenishments to each calendar day (0 if none)
    result = pd.merge(df, df_sale[['Sales_Order_Date', 'Sales_Quantity']], how='left',
                      left_on=['Date'], right_on=['Sales_Order_Date'])
    df = result.fillna(0)
    result = pd.merge(df, df_purchase[['Replenishment_Date', 'Quantity_Produced/Bought']], how='left',
                      left_on=['Date'], right_on=['Replenishment_Date'])
    result = result[['Date', 'PartNo', 'ShopNo', 'Sales_Quantity', 'Quantity_Produced/Bought']].fillna(0)
    # End-of-day stock = cumulative replenishment - cumulative sales
    result['Sales_Quantity_Cum'] = result.Sales_Quantity.cumsum()
    result['Quantity_Produced/Bought_Cum'] = result['Quantity_Produced/Bought'].cumsum()
    result['End-of-day Stock'] = result['Quantity_Produced/Bought_Cum'] - result['Sales_Quantity_Cum']
    result = result[out_cols]
    # Keep only the days after 31 Dec 2014
    data_you_need = pd.concat([data_you_need, result[result['Date'] > pd.Timestamp(2014, 12, 31)]], ignore_index=True)

# Export the combined table to CSV and to MySQL
data_you_need.to_csv('out.csv', date_format='%d %b %Y')
data_you_need.to_sql(con=db, name='DailySalesAndStockData', if_exists='replace', flavor='mysql')
2)Please Find the attached Solution4.csv
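The stock logic used in Task-2 is a difference of two running totals: the end-of-day stock is the cumulative quantity replenished minus the cumulative quantity sold. The sketch below shows that calculation on five hypothetical days for one part and shop, using only the standard library:

```python
from itertools import accumulate

# Hypothetical daily figures for one part in one shop
sales = [0, 5, 3, 0, 4]
replenished = [10, 0, 0, 6, 0]

# Running totals, equivalent to cumsum() on a pandas column
sales_cum = list(accumulate(sales))
replenished_cum = list(accumulate(replenished))

# End-of-day stock = cumulative replenishment - cumulative sales
end_of_day_stock = [r - s for r, s in zip(replenished_cum, sales_cum)]
print(sales_cum, replenished_cum, end_of_day_stock)
```

Starting the calendar before the reporting period, as the script does with 15 Dec 2014, lets earlier replenishments count towards the opening stock.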
