In this paper, feature-tracking-based and histogram-based traffic congestion detection systems are developed. All developed systems are designed to run as real-time applications. In this work, the ORB (Oriented FAST and Rotated BRIEF) feature extraction method has been used to develop the feature-tracking-based traffic congestion solution. ORB is a rotation-invariant, fast and noise-resistant method that combines the strengths of the FAST and BRIEF feature extraction methods. In addition, two different approaches, standard deviation and weighted average, have been applied to derive congestion information from the image histogram in the histogram-based traffic congestion solution. Both systems have been tested under different weather conditions, such as cloudy, sunny and rainy, to cover varied illumination at both daytime and night. Performance results for all developed systems are examined to show their advantages and drawbacks.
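The two histogram-based cues named above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the tiny synthetic "frame" (a 2-D list of grayscale values) and any thresholding one might apply afterwards are assumptions for the example.

```python
def histogram(frame, bins=256):
    """Count how many pixels fall into each grayscale bin."""
    hist = [0] * bins
    for row in frame:
        for px in row:
            hist[px] += 1
    return hist

def weighted_average(hist):
    """Average grayscale level, weighted by pixel counts."""
    total = sum(hist)
    return sum(level * count for level, count in enumerate(hist)) / total

def histogram_std(hist):
    """Standard deviation of grayscale levels, weighted by pixel counts."""
    mean = weighted_average(hist)
    total = sum(hist)
    var = sum(count * (level - mean) ** 2 for level, count in enumerate(hist)) / total
    return var ** 0.5

# Tiny synthetic frame: half dark road surface, half bright vehicles.
frame = [[30] * 8, [30] * 8, [220] * 8, [220] * 8]
h = histogram(frame)
print(round(weighted_average(h), 1))  # 125.0
print(round(histogram_std(h), 1))     # 95.0
```

A crowded scene tends to shift both statistics relative to an empty-road baseline, which is the kind of signal a congestion detector can threshold on.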
Thesis - Mechanizing optimization of warehouses by implementation of machine ... — Shrikant Samarth
Task: As the Research Project is part of a postgraduate course, it is also required that students employ and develop their research knowledge and skills in an applied fashion. The Research Project must involve the identification, generation, or collation of relevant primary or secondary data and the ability to analyze them in a meaningful and critical manner.
Approach: Data was taken from a working organization to resolve an issue with the space optimization of its warehouse, which was resulting in losses to the company.
Findings: The AdaBoost algorithm works best for identifying blowout products beforehand, which would help the warehouse manager apply strategies to products that take time to sell, so that the losses associated with such products can be avoided.
Tools: Python programming, Excel visualizations, Overleaf (LaTeX)
Developing a Forecasting Model for Retailers Based on Customer Segmentation u... — ijtsrd
The purpose of this paper is to develop a forecasting model for retailers based on customer segmentation, to improve the performance of inventory. The research attempts to capture the knowledge of segmenting customers based on various attributes as an input to demand forecasting in a retail store. The paper suggests a data mining model that has been used for forecasting demand. The proposed model has been applied to forecasting grocery items in a supermarket. Based on the proposed forecasting model, the inventory performance has been studied by simulation. Hence, the proposed model results in improved inventory performance. Retailers can make use of the proposed model for demand forecasting of various items to improve inventory performance and the profitability of operations. The advent of data mining systems has given rise to the use of business intelligence in various domains. Kayalvizhi Subramanian | Gunasekar Thangarasu, "Developing a Forecasting Model for Retailers Based on Customer Segmentation using Data Mining Techniques", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Special Issue | International Conference on Advanced Engineering and Information Technology, November 2018, URL: https://www.ijtsrd.com/papers/ijtsrd19127.pdf
Paper URL: https://www.ijtsrd.com/computer-science/data-miining/19127/developing-a-forecasting-model-for-retailers-based-on-customer-segmentation-using-data-mining-techniques/kayalvizhi-subramanian
NEW MARKET SEGMENTATION METHODS USING ENHANCED (RFM), CLV, MODIFIED REGRESSIO... — ijcsit
A widely used approach for gaining insight into the heterogeneity of consumers' buying behavior is market segmentation. Conventional market segmentation models often ignore the fact that consumers' behavior may evolve over time. Therefore, retailers consume limited resources attempting to serve unprofitable consumers. This study looks into the integration between enhanced Recency, Frequency, Monetary (RFM) scores and a Consumer Lifetime Value (CLV) matrix for a medium-sized retailer in the State of Kuwait. A modified regression algorithm investigates the consumer purchase trend, gaining knowledge from a point-of-sale data warehouse. In addition, this study applies an enhanced normal distribution formula to remove outliers, followed by the soft-clustering Fuzzy C-Means and hard-clustering Expectation Maximization (EM) algorithms, to the analysis of consumer buying behavior. Cluster quality assessment shows that the EM algorithm scales much better than the Fuzzy C-Means algorithm, owing to its ability to assign good initial points in the smaller dataset.
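Basic RFM scoring, the starting point the study enhances, can be sketched as follows. The field names, the 1-3 scoring bands and the cutoff values are illustrative assumptions for the example, not the paper's enhanced formulation.

```python
from datetime import date

def rfm_scores(transactions, today):
    """transactions: {customer: [(date, amount), ...]} -> {customer: (R, F, M)}."""
    recency = {c: (today - max(d for d, _ in tx)).days for c, tx in transactions.items()}
    frequency = {c: len(tx) for c, tx in transactions.items()}
    monetary = {c: sum(a for _, a in tx) for c, tx in transactions.items()}

    def band(value, cutoffs, reverse=False):
        # Map a raw value onto a 1-3 score; for recency, smaller is better.
        score = 1 + sum(value > cut for cut in cutoffs)
        return 4 - score if reverse else score

    scores = {}
    for c in transactions:
        scores[c] = (
            band(recency[c], (30, 90), reverse=True),  # recent buyer -> high R
            band(frequency[c], (2, 5)),                # frequent buyer -> high F
            band(monetary[c], (100, 500)),             # big spender -> high M
        )
    return scores

tx = {
    "alice": [(date(2024, 6, 1), 250), (date(2024, 6, 20), 400)],
    "bob": [(date(2024, 1, 5), 40)],
}
print(rfm_scores(tx, today=date(2024, 7, 1)))  # {'alice': (3, 1, 3), 'bob': (1, 1, 1)}
```

The resulting (R, F, M) triples are what a CLV matrix or a clustering step (Fuzzy C-Means, EM) would then consume.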
Understanding the Applicability of Linear & Non-Linear Models Using a Case-Ba... — ijaia
This paper uses a case-based study, "product sales estimation", on real-time data to help us understand the applicability of linear and non-linear models in machine learning and data mining. A systematic approach has been used to address the given problem statement of sales estimation for a particular set of products in multiple categories by applying both linear and non-linear machine learning techniques on a data set of features selected from the original data set. Feature selection is a process that reduces the dimensionality of the data set by excluding those features which contribute minimally to the prediction of the dependent variable. The next step is training the model, which is done using multiple techniques from the linear and non-linear domains, each among the best in its respective area. Data remodeling has then been done to extract new features from the data set by changing its structure, and the performance of the models is checked again. Data remodeling often plays a crucial role in boosting classifier accuracy by changing the properties of the given dataset. We then explore and analyze the various reasons why one model performs better than the other, and hence develop an understanding of the applicability of linear and non-linear machine learning models. With the above as our primary goal, we also aim to find the classifier with the best possible accuracy for product sales estimation in the given scenario.
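One simple flavour of the feature selection step described above is a filter on correlation with the dependent variable. This is a hedged sketch; the abstract does not specify this exact criterion, and the toy data and threshold below are illustrative.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

def select_features(features, target, threshold=0.3):
    """features: {name: [values]} -> names kept, ordered by |correlation|."""
    kept = [(abs(pearson(v, target)), name) for name, v in features.items()]
    return [name for score, name in sorted(kept, reverse=True) if score >= threshold]

sales = [10, 20, 30, 40, 50]
features = {
    "ad_spend": [1, 2, 3, 4, 5],   # strongly predictive of sales
    "store_id": [3, 1, 4, 5, 1],   # essentially noise
}
print(select_features(features, sales))  # ['ad_spend']
```

A linear model and a non-linear model would then both be trained on the surviving columns, which is what makes the subsequent comparison fair.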
Sales Performance Deep Dive and Forecast: A ML Driven Analytics Solution — Pranov Mishra
Problem Statement
One of Unilever’s brands is going through a steep decline in revenues and requires major changes in its business execution plans. The management expects a thorough analysis of historical performance culminating in the identification of the key factors driving sales.
Data Summary and Product Life Cycle Overview
The data provided constituted more than 30 years of information on sales and related variables.
The training data suggested that the product has gone through a life cycle of launch, growth and maturity. There were indications of a decline phase in the last few periods of the training data.
The test data corroborated these indications, showing a sharp decline (more than 25%) since 2016.
Key Insights & Driver Analysis
The factors with a significant positive impact on sales volumes were identified to be promotion expenditure, volumes produced or in stock, inflation, rainfall and visibility through social search impressions.
The factors with a significant negative impact on sales volumes were identified to be brand equity, competitor prices, fuel price and digital impressions.
Forecasting
Multiple approaches were attempted, including ARIMA, Holt-Winters double exponential smoothing, a Bayesian approach (BSTS) and LSTM.
The best results were achieved when the training data was combined with 2 years of test data to capture the decline phases. A MAPE of 25% was achieved with Holt-Winters, followed by ARIMA with a MAPE of 33%.
For the second problem statement, which required training on test data only, the best results were achieved with the BSTS model, followed by LSTM, with MAPEs of 5% and 13% respectively.
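Two of the building blocks above, double exponential smoothing (level plus trend, no seasonal term) and the MAPE metric used to score the forecasts, can be sketched in pure Python. The smoothing parameters and the toy series are illustrative assumptions, not the project's settings.

```python
def holt_forecast(series, alpha=0.5, beta=0.3, horizon=1):
    """Holt's double exponential smoothing: track a level and a trend."""
    level, trend = series[0], series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + (h + 1) * trend for h in range(horizon)]

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

history = [100, 110, 120, 130, 140]          # steadily growing sales
forecast = holt_forecast(history, horizon=2)
print([round(f, 1) for f in forecast])       # [150.0, 160.0] on a pure linear trend
print(round(mape([150, 160], forecast), 2))  # 0.0
```

On a perfectly linear series the method extrapolates the trend exactly; on real sales data the smoothing parameters trade responsiveness against noise.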
Prediction of customer propensity to churn - Telecom Industry — Pranov Mishra
The aim of this project is to provide a telecom company with insights on customer behavior that would be useful for customer retention. The specific goals are given below.
1. Identification of the top variables driving the likelihood of churn
2. Build a predictive model to identify the customers with the highest probability of terminating services with the company.
3. Build a lift chart to optimize effort by targeting most of the potential churners with the least contact effort. Here, with 30% of the total customer pool, the model accurately identifies 33% of the total potential churn candidates.
The models tried in order to arrive at the best one are:
1. Simple models such as Logistic Regression and Discriminant Analysis with different classification thresholds
2. Random Forest after balancing the dataset using the Synthetic Minority Oversampling Technique (SMOTE)
3. An ensemble of five individual models, predicting the output by averaging the individual output probabilities
4. The XGBoost algorithm
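The lift-chart idea in goal 3 can be sketched directly: rank customers by predicted churn probability, contact the top fraction, and measure what share of the actual churners that list captures. The scores and labels below are made up for illustration.

```python
def captured_share(probs, labels, contact_fraction):
    """Share of all churners found in the top `contact_fraction` of scores."""
    ranked = sorted(zip(probs, labels), reverse=True)
    k = int(len(ranked) * contact_fraction)
    found = sum(label for _, label in ranked[:k])
    return found / sum(labels)

probs  = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1,   1,   0,   1,   0,   0,   1,   0,   0,   0]  # 1 = churned
print(captured_share(probs, labels, 0.3))  # top 30% captures 2 of 4 churners -> 0.5
```

Plotting `captured_share` against the contact fraction yields the lift (cumulative gains) curve; a model with no skill sits on the diagonal.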
This paper investigates whether forecasting models based on Machine Learning (ML) algorithms are capable of predicting intraday prices in the small, frontier stock market of Romania. The results show that this is indeed the case. Moreover, the prediction accuracy of the various models improves as the forecasting horizon increases. Overall, ML forecasting models are superior to the passive buy-and-hold strategy, as well as to a naïve strategy that always predicts that the last known price action will continue. However, we also show that this superior predictive ability cannot be converted into “abnormal”, economically significant profits once transaction costs are considered. This implies that intraday stock prices incorporate information within the accepted bounds of weak-form market efficiency, and cannot be “timed” even by sophisticated investors equipped with state-of-the-art ML prediction models.
Customer Clustering Based on Customer Purchasing Sequence Data — IJERA Editor
Customer clustering has become a priority for enterprises because of the importance of customer relationship management. Customer clustering can improve understanding of the composition and characteristics of customers, thereby enabling the creation of appropriate marketing strategies for each customer group. Previously, different customer clustering approaches have been proposed according to data type, namely customer profile data, customer value data, customer transaction data, and customer purchasing sequence data. This paper considers the customer clustering problem in the context of customer purchasing sequence data. However, two major aspects distinguish this paper from past research: (1) in our model, a customer sequence contains itemsets, which is a more realistic configuration than previous models, which assume a customer sequence would merely consist of items; and (2) in our model, a customer may belong to multiple clusters or no cluster, whereas in existing models a customer is limited to only one cluster. The second difference implies that each cluster discovered using our model represents a crucial type of customer behavior and that a customer can exhibit several types of behavior simultaneously. Finally, extensive experiments are conducted on a retail data set, and the results show that the clusters obtained by our model can provide more accurate descriptions of customer purchasing behaviors.
The Destruction of Price-Representativeness — AJHSSR Journal
ABSTRACT: The development of Industry 4.0 and e-commerce destroys the traditional mechanism of price determination, the rigidity of supply in the short run and the idea of price representativeness. Industry 4.0 has changed the traditional view of price formation. Firms know the individual purchasing history of customers. Thanks to big data, firms can extract the reserve price for each individual. Price is no longer the encounter of supply and demand; rather, it is determined considering the maximum amount that individuals can pay. The combination of data, dynamic pricing and price discrimination has destroyed one of the pillars of mainstream economics: price representativeness. Dynamic pricing is the ability to change prices. Price discrimination is the ability to apply different prices to different customers for the same product or service.
Building solid marketing strategies in today’s competitive market is impossible without sound market research. The right market information can boost your sales, position your product more effectively, and help you speak more persuasively to your audience.
Reinforce and focus your marketing research skills. This highly interactive program, facilitated by an experienced marketing research professional, can provide you with the knowledge and tools you need to develop and manage research projects to meet your specific goals. Furthermore, the workshop debunks the myth that you have to spend a lot of money to gain valuable information for decision making. No prior marketing research experience is required!
AUTOMATION OF BEST-FIT MODEL SELECTION USING A BAG OF MACHINE LEARNING LIBRAR... — ijaia
Sales forecasting has become crucial for industries in recent decades with rapid globalization, the widespread adoption of information technology for e-business, understanding market fluctuations, meeting business plans, and avoiding loss of sales. This research precisely predicts automotive industry sales using a bag of multiple machine learning and time series algorithms coupled with historical sales and auxiliary features. Three years of historical sales data (2017 to 2020) were used for model building and training, and one year (2020-2021) of predictions was computed for 900 unique SKUs (stock-keeping units). In the present study, an SKU is a combination of sales office, core business field, and material customer group. Various data cleaning and exploratory data analysis algorithms were applied to the raw datasets before modeling. The mean absolute percentage error (MAPE) was estimated for the individual predictions from the time series and machine learning models. The best model was selected for each unique SKU based on the lowest MAPE value.
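The per-SKU selection step described above reduces to: score every candidate model's forecasts with MAPE and keep the lowest. The model names and numbers below are illustrative assumptions, not the study's data.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

def best_model_per_sku(actuals, predictions):
    """predictions: {sku: {model: [forecasts]}} -> {sku: winning model name}."""
    best = {}
    for sku, by_model in predictions.items():
        best[sku] = min(by_model, key=lambda m: mape(actuals[sku], by_model[m]))
    return best

actuals = {"SKU-1": [100, 120, 110]}
predictions = {"SKU-1": {
    "arima":   [90, 130, 100],    # off by ~10 each period
    "xgboost": [101, 119, 111],   # off by ~1 each period
}}
print(best_model_per_sku(actuals, predictions))  # {'SKU-1': 'xgboost'}
```

Running this over 900 SKUs simply means populating the two dictionaries from the backtest results; the "bag of libraries" only has to expose each model's forecasts in a common shape.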
DATA MINING MODEL PERFORMANCE OF SALES PREDICTIVE ALGORITHMS BASED ON RAPIDMI... — ijcsit
By applying RapidMiner workflows, a dataset originating from different data files and containing information about three years of sales for a large chain of retail stores has been processed. Subsequently, a Deep Learning model implementing a predictive algorithm suitable for sales forecasting has been constructed. This model is based on an artificial neural network (ANN) algorithm able to learn the model from historical sales data after pre-processing the data. The best model built uses a multilayer neural network together with an “optimized operator” able to automatically find the best parameter setting of the implemented algorithm. In order to identify the best-performing predictive model, other machine learning algorithms have been tested. The performance comparison has been carried out between Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Gradient Boosted Trees, Decision Trees, and Deep Learning algorithms. Comparison of the degree of correlation between real and predicted values, the average absolute error and the relative average error showed that the ANN exhibited the best performance. The Gradient Boosted Trees approach is an alternative with the second-best performance. The case study has been developed within the framework of an industry project oriented to the integration of high-performance data mining models able to predict sales using ERP and customer relationship management (CRM) tools.
UNDERSTANDING THE APPLICABILITY OF LINEAR & NON-LINEAR MODELS USING A CASE-BA...ijaia
This paper uses a case-based study, product sales estimation on real-time data, to help us understand the applicability of linear and non-linear models in machine learning and data mining. A systematic approach is used to address the problem of sales estimation for a particular set of products in multiple categories by applying both linear and non-linear machine learning techniques to a set of features selected from the original dataset. Feature selection is a process that reduces the dimensionality of the dataset by excluding features that contribute minimally to the prediction of the dependent variable. The next step is training models using multiple techniques from the linear and non-linear domains, one of the best from each area. Data remodeling is then performed to extract new features by changing the structure of the dataset, after which model performance is checked again; data remodeling often plays a crucial role in boosting classifier accuracy by changing the properties of the dataset. We then explore and analyze the reasons why one model performs better than another, and thereby develop an understanding of the applicability of linear and non-linear machine learning models. Beyond this primary goal, we also aim to find the classifier with the best possible accuracy for product sales estimation in the given scenario.
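Feature selection as described above, dropping features that contribute minimally to predicting the dependent variable, can be sketched with a simple correlation filter. The synthetic data, the threshold, and the feature names are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
# Synthetic sales table: two informative features and one pure-noise feature.
price = rng.uniform(5, 15, n)
promo = rng.integers(0, 2, n).astype(float)
noise = rng.normal(size=n)
sales = 100 - 4.0 * price + 25.0 * promo + rng.normal(scale=2.0, size=n)

features = {"price": price, "promo": promo, "noise": noise}

# Keep only features whose absolute correlation with the target clears a
# (hypothetical) threshold; the rest add dimensionality without signal.
selected = [name for name, col in features.items()
            if abs(np.corrcoef(col, sales)[0, 1]) > 0.15]
print(selected)
```

Real pipelines typically use richer criteria (mutual information, wrapper methods), but the dimensionality-reduction idea is the same.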
Competitive advantage in e-business requires more accurate information and precise decisions to help international companies analyze and predict sales forecasting trends, optimizing potential profits and reducing losses. We propose an improved model that minimizes forecast error for time series of daily sales information. We use real data from an international restaurant business to demonstrate the performance of our sales prediction model. Multiple time series are combined in this model to substantially improve the forecasting outcome: data series from the EPARK company are merged with open data, such as weather, into a multi-series dataset, extending the model's predecessor considerably. Various residual computations used during the process are compared and discussed. We applied the model to data from different areas to compare the differences. The results show that a proper selection of the computation method is more effective than a fixed method for shops in different geographic areas, even within the same company. In addition, the analysis shows a significant reduction in forecast error when open data such as weather information is included in the regression process. International businesses can thus be more agile and flexible in just-in-time stock inventory and resource allocation strategy.
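The key finding above, that merging open weather data into the regression reduces forecast error, can be reproduced on synthetic data. The series below (a weekend effect plus a rainfall term) and its coefficients are assumptions for illustration, not the paper's data:

```python
import numpy as np

rng = np.random.default_rng(1)
days = 300
# Hypothetical daily series: a weekly cycle plus rainfall (the "open data"
# the abstract merges in). Rain depresses restaurant sales here.
weekday = np.arange(days) % 7
rain_mm = rng.gamma(shape=1.5, scale=3.0, size=days)
sales = 200 + 15 * (weekday >= 5) - 4.0 * rain_mm + rng.normal(scale=5, size=days)

def fit_mae(columns):
    """Least-squares fit of sales on the given columns; return in-sample MAE."""
    X = np.column_stack([np.ones(days)] + columns)
    beta, *_ = np.linalg.lstsq(X, sales, rcond=None)
    return float(np.mean(np.abs(sales - X @ beta)))

mae_without = fit_mae([(weekday >= 5).astype(float)])
mae_with = fit_mae([(weekday >= 5).astype(float), rain_mm])
print(mae_without, mae_with)  # the weather term should cut the error
```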
Over- or underestimating sales is detrimental to marketing and sales efforts as well as to inventory and cash-flow management. The purpose of this investigation is therefore to evaluate the forecasting accuracy of three competing multivariate time-series models that take into account existing
Predicting future sales helps control existing stock levels, so that shortages and excess stock can be minimized. When sales can be accurately predicted, consumer demand can be fulfilled in a timely manner and cooperation with supplier companies can be maintained properly, so the company avoids losing sales and customers. This study proposes a model to predict sales quantities for multiple products by adopting the Recency-Frequency-Monetary (RFM) concept and the Fuzzy Analytic Hierarchy Process (FAHP) method. Prediction accuracy is measured using the Mean Absolute Percentage Error (MAPE), the most important criterion in analyzing prediction accuracy. The results show that the model achieved an average MAPE of 3.22%, indicating high accuracy, so it can serve as a sales prediction model.
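MAPE, the accuracy measure used in the study above, is straightforward to compute; the monthly sales figures below are made up for illustration:

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent: the average of
    |actual - predicted| / |actual| over all observations."""
    return 100.0 * sum(abs(a - p) / abs(a)
                       for a, p in zip(actual, predicted)) / len(actual)

# Illustrative monthly sales quantities vs. a model's predictions.
actual = [250, 300, 280, 320]
predicted = [240, 310, 285, 310]
print(round(mape(actual, predicted), 2))  # -> 3.06
```

A MAPE in the low single digits, as reported in the abstract, means the model's predictions deviate from actual sales by only a few percent on average.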
MODEL OF MULTIPLE ARTIFICIAL NEURAL NETWORKS ORIENTED ON SALES PREDICTION AND...ijscai
In this paper the authors propose different Multilayer Perceptron (MLP) models of artificial neural networks (ANNs) suitable for visual merchandising in large-scale retail (GDO) applications involving supermarket product facing. The models predict different attributes related mainly to shelf product allocation, applying a time-series forecasting approach. The study establishes the validity range of the sales predictions by analysing different products allocated on a testing shelf, and describes how the training and test datasets should be processed to obtain reliable results. The prediction results are useful for designing a monthly planogram that takes into account shelf allocations, the general sales trend, and promotional activities. A preliminary correlation analysis provides an innovative reading of the predicted outputs. Testing was performed with the Weka and RapidMiner tools, each predicting every attribute of the experimental dataset with an MLP ANN. Finally, an innovative hybrid model is formulated that feeds Weka's prediction outputs into the MLP ANN algorithm in RapidMiner. This makes it possible to use an artificial testing dataset when the experimental dataset contains few records, thus accelerating the model's learning process. The study was developed within the framework of an industry project.
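The time-series forecasting approach applied here starts by framing the sales history as supervised input/output windows that an MLP can train on; a minimal sketch (the lag count and figures are invented):

```python
import numpy as np

def make_windows(series, n_lags):
    """Frame a univariate series as supervised pairs:
    n_lags consecutive values as input -> the next value as target."""
    X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    y = np.array(series[n_lags:])
    return X, y

# Hypothetical weekly sales of one shelf product.
weekly_sales = [12, 14, 13, 15, 16, 18, 17, 19]
X, y = make_windows(weekly_sales, n_lags=3)
print(X.shape, y.shape)  # (5, 3) and (5,)
```

The resulting (X, y) pairs are exactly what an MLP regressor in Weka or RapidMiner consumes for this kind of attribute prediction.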
Benchmarking the Turkish apparel retail industry through data envelopment ana...Gurdal Ertek
This paper presents a benchmarking study of the Turkish apparel retailing industry. We have applied the Data Envelopment Analysis (DEA) methodology to determine the efficiencies of the companies in the industry. In the DEA model the number of stores, number of corners, total sales area and number of employees were included as inputs and annual sales revenue was included as the output. The efficiency scores obtained through DEA were visualized for gaining insights about
the industry and revealing guidelines that can aid in strategic decision making.
http://research.sabanciuniv.edu.
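Full DEA solves a linear program per company to choose the input/output weights that put that company in the best possible light. With a single output (annual sales revenue), the idea degenerates to a normalized output-to-input ratio, sketched below with invented figures and a single fixed input, which is an assumption, not the paper's multi-input model:

```python
# Hypothetical companies: (number of stores, annual revenue).
stores = {"A": (40, 900), "B": (25, 700), "C": (60, 1000)}

# Revenue per store, normalized so the best performer scores 1.0.
# (Real DEA would weight all four inputs via a linear program per company.)
ratios = {name: revenue / n for name, (n, revenue) in stores.items()}
best = max(ratios.values())
efficiency = {name: round(r / best, 3) for name, r in ratios.items()}
print(efficiency)
```

A score of 1.0 marks the efficiency frontier; scores below 1.0 indicate how much less output a company produces per unit of input than the best performer.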
ANOMALY DETECTION AND ATTRIBUTION USING AUTO FORECAST AND DIRECTED GRAPHSIJDKP
In the business world, decision makers rely heavily on data to back their decisions. With the quantum of
data increasing rapidly, traditional methods used to generate insights from reports and dashboards will
soon become intractable. This creates a need for efficient systems which can substitute human intelligence
and reduce time latency in decision making. This paper describes an approach to efficiently process time-series data with multiple dimensions, such as geographies, verticals, and products, to detect anomalies in the data and, further, to explain the potential reasons for their occurrence. The algorithm implements automatic selection of forecast models to make reliable forecasts and detect such anomalies. Depth-First Search (DFS) is applied to analyse each anomaly and find its root causes. The algorithm filters out redundant causes and reports the insights to stakeholders. Apart from being a hair-trigger KPI tracking mechanism, the algorithm can also be customized for problems like A/B testing, campaign tracking, and product evaluations.
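The two stages described, forecast-based anomaly flagging and DFS root-cause attribution, can be sketched as follows. The z-score detector and the toy KPI hierarchy are simplified assumptions, not the paper's auto-selected forecast models:

```python
def detect_anomaly(history, latest, k=3.0):
    """Flag `latest` if it deviates from the historical mean by more than
    k standard deviations (a stand-in for a forecast-residual test)."""
    mean = sum(history) / len(history)
    std = (sum((x - mean) ** 2 for x in history) / len(history)) ** 0.5
    return abs(latest - mean) > k * std

def root_causes(tree, node, is_anomalous, path=()):
    """DFS through the dimension hierarchy starting from an anomalous root;
    report the deepest anomalous nodes as root causes."""
    hits = [c for c in tree.get(node, []) if is_anomalous(c)]
    if not hits:
        return [path + (node,)]
    out = []
    for c in hits:
        out.extend(root_causes(tree, c, is_anomalous, path + (node,)))
    return out

# Hypothetical KPI hierarchy: total -> regions -> products.
tree = {"total": ["emea", "apac"], "emea": ["emea/prod1", "emea/prod2"], "apac": []}
anomalous = {"total", "emea", "emea/prod2"}
print(root_causes(tree, "total", lambda n: n in anomalous))
```

Filtering redundant causes then amounts to reporting only the deepest anomalous nodes, as the DFS above already does.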
Visual and analytical mining of sales transaction data for production plannin...Gurdal Ertek
Recent developments in information technology have paved the way for collecting large amounts of data on various aspects of an enterprise. The greatest challenge in processing these massive amounts of raw data is managing them effectively so as to derive meaningful information from them. This paper illustrates the combination of visual and analytical data mining techniques for planning marketing and production activities. The primary phases of the proposed framework consist of filtering, clustering, and comparison steps, implemented using interactive pie charts, the K-Means algorithm, and parallel coordinate plots respectively. A prototype decision support system is developed, and a sample analysis session is conducted to demonstrate the applicability of the framework.
http://research.sabanciuniv.edu.
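The clustering step of the framework uses K-Means, which alternates nearest-centroid assignment with centroid updates; a self-contained sketch on invented transaction data (the deterministic first-k initialization is a simplification of the usual random seeding):

```python
import numpy as np

def kmeans(points, k, iters=20):
    """Plain K-Means: assign each point to the nearest centroid, then move
    each centroid to the mean of its cluster. Initialized from the first k
    points for determinism in this sketch."""
    centroids = points[:k].astype(float).copy()
    for _ in range(iters):
        d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return labels, centroids

# Toy transactions: (quantity, unit price) with two obvious groups.
pts = np.array([[1, 5], [2, 6], [1, 4], [40, 100], [42, 98], [41, 103]], float)
labels, centroids = kmeans(pts, k=2)
print(labels)
```

The resulting cluster labels are what would feed the parallel-coordinate comparison step of the framework.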
Similar to: Application of Facebook's Prophet Algorithm for Successful Sales Forecasting Based on Real-world Data
In the era of data-driven warfare, the integration of big data and machine learning (ML) techniques has
become paramount for enhancing defence capabilities. This research report delves into the applications of
big data and ML in the defence sector, exploring their potential to revolutionize intelligence gathering,
strategic decision-making, and operational efficiency. By leveraging vast amounts of data and advanced
algorithms, these technologies offer unprecedented opportunities for threat detection, predictive analysis,
and optimized resource allocation. However, their adoption also raises critical concerns regarding data
privacy, ethical implications, and the potential for misuse. This report aims to provide a comprehensive
understanding of the current state of big data and ML in defence, while examining the challenges and
ethical considerations that must be addressed to ensure responsible and effective implementation.
Cloud Computing, being one of the most recent innovative developments of the IT world, has been
instrumental not just to the success of SMEs but, through their productivity and innovative contribution to
the economy, has even made a remarkable contribution to the economic growth of the United States. To
this end, the study focuses on how cloud computing technology has impacted economic growth through
SMEs in the United States. Relevant literature connected to the variables of interest in this study was
reviewed, and secondary data was generated and utilized in the analysis section of this paper. The findings
of this paper revealed that there have been meaningful contributions that the usage of virtualization has
made in the commercial dealings of small firms in the United States, and this has also been reflected in the
economic growth of the country. This paper further revealed that as important as cloud-based software is,
some SMEs are still skeptical about how it can help improve their business and increase their bottom line
and hence have failed to adopt it. Apart from the SMEs, some notable large firms in different industries,
including information and educational services, have adopted cloud computing technology and hence
contributed to the economic growth of the United States. Lastly, findings from our inferential statistics
revealed that no discernible change has occurred in innovation between small and big businesses in the
adoption of cloud computing. Both categories of businesses adopt cloud computing in the same way, and
their contribution to the American economy has no significant difference in the usage of virtualization.
Energy-constrained Wireless Sensor Networks (WSNs) have garnered significant research interest in
recent years. Multiple-Input Multiple-Output (MIMO), or Cooperative MIMO, represents a specialized
application of MIMO technology within WSNs. This approach operates effectively, especially in
challenging and resource-constrained environments. By facilitating collaboration among sensor nodes,
Cooperative MIMO enhances reliability, coverage, and energy efficiency in WSN deployments.
Consequently, MIMO finds application in diverse WSN scenarios, spanning environmental monitoring,
industrial automation, and healthcare applications.
The AIRCC's International Journal of Computer Science and Information Technology (IJCSIT) is devoted to the fields of Computer Science and Information Systems. IJCSIT is an open-access, peer-reviewed scientific journal published in both electronic and print form. The mission of the journal is to publish original contributions in its field in order to propagate knowledge amongst its readers and to be a reference publication. IJCSIT publishes original research and review papers, as well as auxiliary material such as case studies and technical reports.
Demand for car parking grows with the number of car users, and with the widespread use of smartphones and their applications, users increasingly prefer mobile-phone-based solutions. This paper proposes a Smart Parking Management System (SPMS) based on Arduino components, an Android application, and IoT. It gives the client the ability to check available parking spaces and reserve a parking spot. IR sensors are used to detect whether a parking space is occupied. The occupancy data is transmitted via a Wi-Fi module to the server and retrieved by the mobile application, which presents many options attractively and at no cost to users, and lets the user check reservation details. With IoT technology, the smart parking system can be connected wirelessly to easily track available locations.
Computer-Assisted Language Learning (CALL) systems are computer-based tutoring systems that address linguistic skills. Adding intelligence to such systems relies mainly on Natural Language Processing (NLP) tools to diagnose student errors, especially in language grammar. However, most such systems do not model student competence in linguistic skills, especially for the Arabic language. In this paper, we deal with the basic grammar concepts of the Arabic language taught in the fourth grade of elementary school in Egypt, through the Arabic Grammar Trainer (AGTrainer), an intelligent CALL system. AGTrainer trains students through questions that cover different concepts at different difficulty levels. The constraint-based student modeling (CBSM) technique is used as a short-term student model: CBSM defines the different grammar skills at a fine-grained level through the defined skill structures. The main contribution of this paper is the hierarchical representation of the system's basic grammar skills as domain knowledge. That representation serves as a mechanism for efficiently checking constraints to model student knowledge, diagnose student errors, and identify their causes. In addition, the satisfied constraints, the number of trials the student needs to answer each question, and a fuzzy-logic decision system determine the student's learning level for each lesson as a long-term model. The evaluation results showed the system's effectiveness in learning, as well as the satisfaction of students and teachers with its features and abilities.
In the realm of computer security, the importance of efficient and reliable user authentication methods has
become increasingly critical. This paper examines the potential of mouse movement dynamics as a
consistent metric for continuous authentication. By analysing user mouse movement patterns in two
contrasting gaming scenarios, "Team Fortress" and "Poly Bridge," we investigate the distinctive
behavioral patterns inherent in high-intensity and low-intensity UI interactions. The study extends beyond
conventional methodologies by employing a range of machine learning models. These models are carefully
selected to assess their effectiveness in capturing and interpreting the subtleties of user behavior as
reflected in their mouse movements. This multifaceted approach allows for a more nuanced and
comprehensive understanding of user interaction patterns. Our findings reveal that mouse movement
dynamics can serve as a reliable indicator for continuous user authentication. The diverse machine
learning models employed in this study demonstrate competent performance in user verification, marking
an improvement over previous methods used in this field. This research contributes to the ongoing efforts to
enhance computer security and highlights the potential of leveraging user behavior, specifically mouse
dynamics, in developing robust authentication systems.
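Behavioral features of the kind such classifiers consume are typically summarized per movement segment. A minimal sketch of three common descriptors follows; the descriptor names and the sample stroke are illustrative assumptions, not the paper's feature set:

```python
import math

def movement_features(points):
    """Summarize one mouse stroke as (path length, average speed,
    straightness). points = [(t, x, y), ...] in time order."""
    path = sum(math.dist(points[i][1:], points[i + 1][1:])
               for i in range(len(points) - 1))
    duration = points[-1][0] - points[0][0]
    net = math.dist(points[0][1:], points[-1][1:])  # start-to-end distance
    return {"path": path, "speed": path / duration, "straightness": net / path}

# A perfectly straight, constant-speed stroke for illustration.
stroke = [(0.0, 0, 0), (0.1, 3, 4), (0.2, 6, 8)]
print(movement_features(stroke))
```

Vectors like these, aggregated over many strokes, are what the machine learning models would be trained on to verify a user continuously.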
This research aims to further understanding in the field of continuous authentication using behavioural biometrics. We contribute a novel dataset comprising the gesture data of 15 users, each playing Minecraft on a Samsung tablet for 15 minutes. Using this dataset, we employed machine learning (ML) binary classifiers, namely Random Forest (RF), K-Nearest Neighbors (KNN), and Support Vector Classifier (SVC), to determine the authenticity of specific user actions. Our most robust model was the SVC, which achieved an average accuracy of approximately 90%, demonstrating that touch dynamics can effectively distinguish users. However, further studies are needed to make this a viable option for authentication systems. Our dataset is available at: https://github.com/AuthenTech2023/authentech-repo
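A nearest-neighbor decision of the kind KNN makes can be sketched in a few lines; the gesture features and labels below are invented stand-ins for the paper's touch-dynamics features:

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Minimal K-Nearest-Neighbors: majority vote among the k training
    samples closest to the query. train = [(features, label), ...]."""
    nearest = sorted(train, key=lambda fl: math.dist(fl[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical gesture features: (mean pressure, swipe speed) per sample.
train = [((0.20, 1.0), "user"), ((0.25, 1.1), "user"), ((0.22, 0.9), "user"),
         ((0.80, 3.0), "impostor"), ((0.75, 2.8), "impostor"), ((0.85, 3.2), "impostor")]
print(knn_predict(train, (0.3, 1.2)))  # near the genuine-user cluster
```

RF and SVC replace the vote with tree ensembles and a maximum-margin boundary respectively, but all three consume the same per-action feature vectors.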
This paper discusses the capabilities and limitations of GPT-3, a state-of-the-art language model, in the context of text understanding. We begin by describing the architecture and training process of GPT-3 and provide an overview of its impressive performance across a wide range of natural language processing tasks, such as language translation, question answering, and text completion. During this research project, a summarizing tool was also created to help retrieve content from any type of document, specifically IELTS Reading Test data in this project. We also aimed to improve the accuracy of the summarizing, as well as the question-answering capabilities of GPT-3, via long text
Image segmentation and classification tasks in computer vision have proven highly effective with neural networks, specifically Convolutional Neural Networks (CNNs). These tasks have numerous practical applications, such as medical imaging, autonomous driving, and surveillance. CNNs can learn complex features directly from images and achieve outstanding performance across several datasets. In this work, we used three different datasets to investigate the efficacy of various pre-processing and classification techniques in accurately segmenting and classifying different structures within MRI and natural images. We used both sample gradient and Canny edge detection methods for pre-processing, and K-means clustering was applied to segment the images. Image augmentation increases the size and diversity of the datasets used to train the classification models. This work highlights the effectiveness of transfer learning in image classification using CNNs and VGG16, providing insights into the selection of pre-trained models and hyperparameters for optimal performance. We propose a comprehensive approach to image segmentation and classification that incorporates pre-processing techniques, the K-means algorithm for segmentation, and deep learning models such as CNN and VGG16 for classification.
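The gradient-based pre-processing step mentioned above reduces, at its core, to a finite-difference gradient magnitude (Canny adds Gaussian smoothing, non-maximum suppression, and hysteresis thresholding on top). The tiny synthetic image below is an assumption for illustration:

```python
import numpy as np

def gradient_magnitude(img):
    """Finite-difference gradient magnitude: responds strongly at
    intensity edges and is ~0 in flat regions."""
    gy, gx = np.gradient(img.astype(float))  # per-axis central differences
    return np.hypot(gx, gy)

# Tiny synthetic image: dark left half, bright right half.
img = np.zeros((4, 6))
img[:, 3:] = 255.0
mag = gradient_magnitude(img)
print(mag[:, 2:4])  # the edge columns light up
```

Edge maps like this one, or K-means labels over pixel values, are the intermediate representations the segmentation stage would pass on to the CNN/VGG16 classifiers.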
The security of Electric Vehicle (EV) charging has gained momentum with the increase in EV adoption over the past few years. Mobile applications have been integrated into EV charging systems, which mainly use a cloud-based platform to host their services and data. Like many complex systems, cloud systems are susceptible to cyberattacks if the organization does not take proper measures to secure them. In this paper, we explore the security of key components of the EV charging infrastructure, including the mobile application and its cloud service. We conducted an experiment that mounted a Man-in-the-Middle attack between an EV app and its cloud services. Our results showed that it is possible to launch attacks against the connected infrastructure by exploiting vulnerabilities, with potentially substantial economic and operational ramifications for the EV charging ecosystem. We conclude with mitigation suggestions and future research directions.
This paper describes the outcome of an attempt to implement the same transitive closure (TC) algorithm
for Apache MapReduce running on different Apache Hadoop distributions. Apache MapReduce is a
software framework used with Apache Hadoop, which has become the de facto standard platform for
processing and storing large amounts of data in a distributed computing environment. The research
presented here focuses on the variations observed among the results of an efficient iterative transitive
closure algorithm when run against different distributed environments. The results from these comparisons
were validated against the benchmark results from OYSTER, an open source Entity Resolution system. The
experiment results highlighted the inconsistencies that can occur when using the same codebase with
different implementations of MapReduce.
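The iterative algorithm's fixpoint logic can be shown independently of Hadoop; a semi-naive sketch in plain Python, joining only newly discovered pairs each round (the toy edge set is illustrative):

```python
def transitive_closure(edges):
    """Semi-naive iterative transitive closure: join the newly found pairs
    with the base edges until no new pairs appear (the fixpoint that the
    MapReduce implementation iterates toward)."""
    closure = set(edges)
    frontier = set(edges)
    while frontier:
        # Join step: (a, b) joined with base edge (b, d) yields (a, d).
        new = {(a, d) for (a, b) in frontier for (c, d) in edges if b == c}
        frontier = new - closure  # keep only genuinely new pairs
        closure |= frontier
    return closure

edges = {(1, 2), (2, 3), (3, 4)}
print(sorted(transitive_closure(edges)))
```

In the MapReduce version each join round is one map/reduce job, so differences between Hadoop distributions surface as differences in how these rounds shuffle and deduplicate pairs.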
Saudi Arabia stands as a titan in the global energy landscape, renowned for its abundant oil and gas resources. It's the largest exporter of petroleum and holds some of the world's most significant reserves. Let's delve into the top 10 oil and gas projects shaping Saudi Arabia's energy future in 2024.
Student information management system project report ii.pdfKamal Acharya
Our project concerns student management. It covers the various actions related to student details, makes adding, editing, and deleting student records straightforward, and provides a less time-consuming process for viewing, adding, editing, and deleting students' marks.
About
Indigenized remote-control interface card suitable for MAFI-system CCR equipment. Compatible with the IDM8000 CCR. Backplane-mounted serial and TCP/Ethernet communication module for CCR remote access, providing IDM8000 CCR remote control over serial and TCP protocols.
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Cosmetic shop management system project report.pdfKamal Acharya
Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it is tough to interpret those ingredient lists unless you have a background in chemistry. Instead of buying and hoping for the best, we can use data science to help us predict which products may be good fits for us. The project includes various function programs to carry out the tasks mentioned above, and data file handling is used effectively in the program.
The automated cosmetic shop management system deals with the automation of the general workflow and administration processes of the shop. The main processes focus on customer requests: the system searches for the most appropriate products and delivers them to the customer. It helps employees quickly identify the cosmetic products that have reached their minimum quantity, keeps track of the expiry date of each product, and helps employees find the rack number where a product is placed, making the workflow faster and more efficient.
Explore the innovative world of trenchless pipe repair with our comprehensive guide, "The Benefits and Techniques of Trenchless Pipe Repair." This document delves into the modern methods of repairing underground pipes without the need for extensive excavation, highlighting the numerous advantages and the latest techniques used in the industry.
Learn about the cost savings, reduced environmental impact, and minimal disruption associated with trenchless technology. Discover detailed explanations of popular techniques such as pipe bursting, cured-in-place pipe (CIPP) lining, and directional drilling. Understand how these methods can be applied to various types of infrastructure, from residential plumbing to large-scale municipal systems.
Ideal for homeowners, contractors, engineers, and anyone interested in modern plumbing solutions, this guide provides valuable insights into why trenchless pipe repair is becoming the preferred choice for pipe rehabilitation. Stay informed about the latest advancements and best practices in the field.
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxR&R Consult
CFD analysis is incredibly effective at solving mysteries and improving the performance of complex systems!
Here's a great example: At a large natural gas-fired power plant, where they use waste heat to generate steam and energy, they were puzzled that their boiler wasn't producing as much steam as expected.
R&R and Tetra Engineering Group Inc. were asked to solve the issue with reduced steam production.
An inspection had shown that a significant amount of hot flue gas was bypassing the boiler tubes, where the heat was supposed to be transferred.
R&R Consult conducted a CFD analysis, which revealed that 6.3% of the flue gas was bypassing the boiler tubes without transferring heat. The analysis also showed that the flue gas was instead being directed along the sides of the boiler and between the modules that were supposed to capture the heat. This was the cause of the reduced performance.
Based on our results, Tetra Engineering installed covering plates to reduce the bypass flow. This improved the boiler's performance and increased electricity production.
It is always satisfying when we can help solve complex challenges like this. Do your systems also need a check-up or optimization? Give us a call!
Work done in cooperation with James Malloy and David Moelling from Tetra Engineering.
More examples of our work https://www.r-r-consult.dk/en/cases-en/
Application of Facebook's Prophet Algorithm for Successful Sales Forecasting Based on Real-world Data
International Journal of Computer Science & Information Technology (IJCSIT) Vol 12, No 2, April 2020
DOI: 10.5121/ijcsit.2020.12203
APPLICATION OF FACEBOOK'S PROPHET ALGORITHM FOR SUCCESSFUL SALES FORECASTING BASED ON REAL-WORLD DATA

Emir Žunić1,2, Kemal Korjenić1, Kerim Hodžić2,1 and Dženana Đonko2

1 Info Studio d.o.o. Sarajevo, Bosnia and Herzegovina
2 Faculty of Electrical Engineering, University of Sarajevo, Bosnia and Herzegovina
ABSTRACT
This paper presents a framework capable of accurately forecasting future sales in the retail industry and
classifying the product portfolio according to the expected level of forecasting reliability. The proposed
framework, that would be of great use for any company operating in the retail industry, is based on
Facebook's Prophet algorithm and backtesting strategy. Real-world sales forecasting benchmark data
obtained experimentally in a production environment in one of the biggest retail companies in Bosnia and
Herzegovina is used to evaluate the framework and demonstrate its capabilities in a real-world use case
scenario.
KEYWORDS
Sales forecasting, Real-world dataset, Prophet, Backtesting, Classification
1. INTRODUCTION
Generating product-level sales forecasts is a crucial factor in the retail industry since inventory
control and production planning play an important role in the competitiveness of any company
that provides goods for its customers. While accurate and reliable forecasts can lead to huge
savings and cost reductions by facilitating better production and inventory planning,
competitive pricing and timely promotion planning, poor sales estimations are proven to be
costly in this domain since it is well-known that goods shortages cause lower profits and can
easily lead to customer dissatisfaction. Furthermore, excess inventory may not only force the
store to sell goods at lower prices, or even worse lead to inventory write-offs; higher-than-needed
inventory levels also increase warehousing costs.
In the real-world scenario, the business environment in the retail industry is highly dynamic and
often volatile, which is predominantly caused by holiday effects and competitor behaviour. As a
result, contrary to the widely available academic datasets used to demonstrate and benchmark
various time-series forecasting methods, real-world sales data in this domain carry various
challenges, such as highly non-stationary historical data, irregular sales patterns, and highly
intermittent sales data.
A module that would be able to forecast sales with a reasonably high accuracy, augmented by
the module for highly-reliable classification of the product portfolio according to the expected
level of forecastability, would be of great use for any company operating in the retail industry.
To bridge the gap towards the application of time-series forecasting in the real-world scenario in
the retail industry, the focus of this work is set on the development of the module for reliable
classification of the product portfolio according to the expected level of forecasting reliability.
The results are presented on the example of dataset experimentally obtained in a production
environment in one of the biggest retail companies in Bosnia and Herzegovina. Although
generating sales forecasts anywhere between the daily and annual horizon is certainly possible,
a particular focus has been put on monthly and quarterly sales forecasts since it was concluded
from discussions with clients that the aforementioned period is of the greatest interest for
production and inventory planning.
The structure of this paper is as follows: section Literature review offers a general description of
previous studies relating to the use of the different approaches and algorithms in related retail
sales forecasting problems, and methods for solving them. Section Methodology gives an overview of
the framework by briefly explaining a structure of input dataset, data filtering and preprocessing
steps, the process of product portfolio selection, the Prophet tool, performance metrics and
forecastability analysis using backtesting experiments, and guidelines for classifying the product
portfolio, whereas section Results shows the capabilities of the proposed forecasting framework
in a real-world use case scenario. The conclusions drawn from the results in terms of the
proposed objective are given in section Conclusions, with a brief description of directions for
future work, further development and application of the proposed framework.
2. LITERATURE REVIEW
Sales forecasting is the process of estimating future sales. Accurate forecasts help companies
predict performance and make important business decisions. Company forecasts can be based on
economic trends, past sales data and industry comparisons. Established companies can predict
future sales relatively easily from historical business data, while new companies must base their
forecasts on less verified information such as competitive intelligence and market research. Sales
forecasting gives insight into a company's workforce, resources and cash flow, and predictive
sales data is crucial for businesses seeking investment capital.
Retail businesses must use their resources efficiently and make strategic decisions to keep their
revenues growing and stable, especially as conditions become more competitive. There are three
main types of retail sales forecasting:
Time-series sales forecasting,
Sales forecasting based on Artificial Neural Networks,
Using complex hybrid methods.
There are many studies in the literature with different simple and complex methods used for
modelling sales data to forecast future sales.
The paper by Aras et al. [1] gives a thorough literature overview and a comparative study of
single and combination methods for retail sales forecasting. The results obtained in that paper
suggest that combination methods achieve better results than individual ones. The methods were
also compared against the company's then-current system. Several other interesting findings from
its literature review section are summarized below.
Sales data from the period of 10 years (from 1979 to 1989) were analysed by Ansuj et al. [2] in
terms of ARIMA (Autoregressive Integrated Moving Averages) model with interventions and
the ANN (Artificial Neural Network) model. Forecasts of the ANN model were more
appropriate than the ARIMA ones. Comparative studies were made of traditional methods and
ARIMA models with ANN models by Alon et al. [3], as well as a multivariate regression for
aggregate retail sales in economic conditions that are stable and winter’s exponential smoothing.
Results showed that the ANN models were the best. Frank et al. [4] modelled women's apparel
sales using an ANN model (whose results were the best in terms of the R2 evaluation statistic),
Winter's three-parameter model and single seasonal exponential smoothing; in sales forecasting,
the ANN models again performed best of all. Aburto and Weber [5] created a replenishment system
for a Chilean supermarket using a two-stage hybrid methodology whose forecasts were better than
those of ANN and ARIMA models; the hybrid methodology produced fewer sales failures and lower
inventory levels.
Au et al. [6] compared the performance of evolutionary neural networks for sales forecasting
with the ARIMA seasonal model and totally connected neural network. Evolutionary neural
networks produced more accurate forecasts. For forecasting retail sales, Pan et al. [7] suggested
a hybrid method that integrates empirical mode decomposition with a neural network (EMD-NN) to
forecast retail sales. The conclusion was that the seasonal ARIMA model and the classical ANN
model were outperformed by the EMD-NN one, and that the hybrid method performs better under
volatile economic conditions. In a comparison made by Dwivedi et al. [8], the ANFIS (Adaptive
Network-based Fuzzy Inference System) method was the most appropriate of all the compared
methods, which included ANN, linear regression and a neuro-fuzzy modelling approach. Aye et al.
[9] evaluated the performance of 26 models (ANN, ARIMA, ARFIMA, etc.) in forecasting South
Africa's aggregate seasonal retail sales; the results showed that the nonlinear ANN model was
outperformed by other nonlinear models.
In the comparison by Ramos et al. [10], the results did not show any difference between state
space models and ARIMA models with automatic algorithms in forecasting sales of women's footwear
products. Fabianová et al. [11] analysed refrigerator sales from a retail store; better total
revenue estimates were achieved by Monte Carlo simulation combined with sensitivity analysis for
variable identification. Kolassa [12] observed that common forecast accuracy measures are not
appropriate for count data and considered discrete predictive distributions for forecasting daily
sales. Ma et al. [13] examined the case of Stock Keeping Unit (SKU) level retail store sales
using a four-step methodological framework; this research showed that improvements were achieved
by exploiting intra- and inter-category information, and a detailed review of sales forecasting
was provided along the way. Jiménez et al. [14] proposed a novel feature selection methodology to
obtain more accurate forecasts for online sales and to identify the relevant product features
affecting the sales.
Retail product sales data contain multiple seasonal cycles of different lengths. For example,
daily beer sales shown in one experiment exhibit both weekly and annual cycles: sales are high
during the weekends and low during the weekdays, high in summer and low in winter, and high
around Christmas. Some sales patterns depend on the nature of the business and its locations, so
models used in forecasting must handle multiple seasonal patterns.
Ramos and Fildes [15] use models with additional flexibility but parsimonious complexity to
capture the seasonality of weekly retail data: trigonometric functions prove sufficient.
Papacharalampous and Tyralis [16] consider the performance of random forests and Facebook’s
Prophet in forecasting daily streamflow up to seven days ahead in a river in the US. Both these
forecasting methods use past streamflow observations, while random forests additionally use
past precipitation information. They use a naïve method based on the previous streamflow
observation, as well as a multiple linear regression model utilizing the same information as
random forests. The obtained results suggest that random forests perform better in general
terms, while Prophet outperforms the naïve method for forecast horizons longer than three days.
Based on the knowledge about the sales forecasting, it is worth mentioning that we are currently
working on integrating this framework into our previously developed products for the retail
industry as a big Smart supply chain management (SCM) concept (Zunic et al. [17, 18, 19]).
Also, the proposed forecasting method together with the global positioning system (GPS) data
could be successfully used to determine some parameters and constants of the real-world
vehicle routing problems (VRP), such as unloading time, road and time distances between
customers and so on (Zunic et al. [20, 21, 22]).
3. METHODOLOGY
A module that would be able to forecast sales with a reasonably high accuracy, augmented by
the module for highly-reliable classification of the product portfolio according to the expected
level of forecastability, would be of great use for any company operating in the retail industry.
The proposed model for successful sales forecasting based on real-world data is shown in
Figure 1.
Figure 1. Proposed sales forecasting model.
The upper part (1) of the illustrated model can be regarded as the "offline" segment of the whole
approach, used for assessing model accuracy and for classification. The second part (2)
represents the sub-process of successful sales forecasting.
3.1. Input Dataset
To develop a framework for sales forecasting, the following columns were assumed to be
available in the real-world input dataset which is structured as a table of records:
1) item_code - unique identifier of the product in a portfolio
2) date - date of transaction
3) quantity - the quantity sold in a given transaction
4) unit_price - the unit price at which the product was sold (optional, not used in
forecasting)
A sample of input dataset is shown in Table 1.
Table 1. A sample of the input dataset
item_code    | date       | quantity | unit_price
501001000001 | 2010-01-02 | 399      | 1.3300
501001000001 | 2010-01-04 | 812      | 1.3380
501001000001 | 2010-01-05 | 516      | 1.3310
3.2. Data Filtering and Preprocessing
Several steps were taken during the preprocessing phase to transform a table of records into a
convenient form:
filtering by date, in order to remove irrelevant historical data (e.g. it was decided not to
use more than six years of historical data)
conversion of quantities into the same unit (e.g. pieces, packs, bundles, pallets, etc.)
data aggregation in time domain at the product level (i.e. daily sales data were
converted into monthly sales)
This process is illustrated in Figure 2.
Figure 2. The illustration of data filtering and preprocessing step for a single item in
the product portfolio
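The three preprocessing steps can be sketched with pandas as follows (an illustrative helper; column names match subsection 3.1, and the unit-conversion step is dataset-specific, so it is only marked by a comment):

```python
import pandas as pd

def preprocess(df, max_history_years=6):
    """Illustrative sketch of the data filtering and preprocessing steps."""
    df = df.copy()
    df["date"] = pd.to_datetime(df["date"])
    # 1) filter by date: drop history older than the chosen cut-off
    cutoff = df["date"].max() - pd.DateOffset(years=max_history_years)
    df = df[df["date"] >= cutoff]
    # 2) unit conversion (pieces, packs, bundles, ...) would happen here
    # 3) aggregate daily transactions into monthly sales per item
    monthly = (df.set_index("date")
                 .groupby("item_code")["quantity"]
                 .resample("MS")       # month-start frequency
                 .sum()
                 .reset_index())
    return monthly
```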
3.3. Product Portfolio Selection
To simplify the analysis by limiting it to a reasonable number of products from the portfolio,
while at the same time properly dealing with the long-tail phenomenon present in the sales data
of any company in the retail industry, the products are first sorted by their importance.
Based on industry experience and discussions with clients in the field, it was concluded that
several different criteria can be used to sort the product portfolio by relevance:
1) the total profit per product over the last year (if the profit per sale is available, which is
a rarity),
2) the total financial turnover (i.e. net sales) per product over the last year (if the unit price
per sale is available, which is often a case),
3) the total quantity sold per item over the last year (if none of the aforementioned data is
available).
One can conclude that the second criterion is a quite good approximation of the first one since
the percentage profit per product is usually comparable across the product portfolio. On the
other hand, the third criterion is a fairly loose approximation since the unit price per item may
vary significantly, but it is the best that can be done if the price data is not available.
Afterwards, to perform forecastability analysis and product portfolio classification, as well as to
present capabilities of sales forecasting framework and initial results to the client, the focus is
set on the Top N products that cover 90% of the total profit/turnover/quantity over the last year.
In a practical application of sales forecasting framework, after obtaining initial results in this
way, one can easily re-run the experiment for the entire product portfolio and deploy the model.
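The ranking and Top N selection by turnover can be sketched as follows (an illustrative helper; the column name `turnover` and the 90% threshold follow the description above):

```python
import pandas as pd

def top_products(yearly, share=0.90):
    """Keep the highest-turnover products that together cover `share`
    of total turnover over the last year (illustrative sketch)."""
    s = yearly.sort_values("turnover", ascending=False).reset_index(drop=True)
    # cumulative share of total turnover covered *before* adding each product
    covered_before = s["turnover"].cumsum().shift(fill_value=0.0) / s["turnover"].sum()
    # keep a product while the share covered before it is still below the target
    return s[covered_before < share]
```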
In the next step, products that are not suitable for product portfolio classification framework are
filtered out according to the following requirements:
minimum length of observation horizon: 39 months (at least 24 months of historical
data is required for reliable estimation of trend and/or seasonal effects, additional 12
months of data for repeated backtesting experiments and three months of data to
measure the accuracy of quarterly forecasts),
maximum allowed production/sales downtime: 3 months (the product is considered to
be inactive if zero sales are recorded for more than 3 months of the most recent history).
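The two eligibility requirements above can be sketched as a simple check on a monthly sales series (an illustrative helper; the exact downtime rule in a production system may differ):

```python
def eligible(monthly_sales, min_horizon=39, max_downtime=3):
    """Illustrative check of the two filtering requirements on a list of
    monthly sales quantities (most recent month last)."""
    # requirement 1: at least `min_horizon` months of history
    if len(monthly_sales) < min_horizon:
        return False
    # requirement 2: no more than `max_downtime` trailing months of zero sales
    downtime = 0
    for quantity in reversed(monthly_sales):
        if quantity != 0:
            break
        downtime += 1
    return downtime <= max_downtime
```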
Since there is usually a non-negligible number of products with historical data available for
more than 24 months but less than 39 months (i.e. products for which forecasts can be
generated, but backtesting cannot be done as described in the following subsection since there is
not enough historical data available), it is desirable to explain to clients that forecasting is
still possible but a reliable estimation of forecasting accuracy cannot be calculated for these
products. In that case, it is suggested to use sales forecasting results in a semi-automated
manner (i.e. with a human-in-the-loop).
3.4. Prophet - a Tool for Time Series Forecasting at Scale
The basic building block of the proposed framework for sales forecasting and product portfolio
classification is a tool/method for generating high-quality time-series forecasts. Despite the fact
that there are numerous tools/methods that can be applied, it was decided to use Facebook’s
Prophet tool for this research since it is capable of generating forecasts of a reasonable quality at
scale.
Prophet, an open-source software released by Facebook’s Core Data Science team, is a
procedure developed for forecasting time series data based on an additive model where non-
linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best
with time series that have strong seasonal effects and several seasons of historical data. Prophet
is robust to missing data and shifts in the trend, and typically handles outliers well. According to
Taylor and Letham's research [23], Prophet is used in many applications across Facebook for
producing reliable forecasts and performs better than any other approach in the majority of
cases.
In this paper, Facebook’s Prophet tool is used for modelling the dynamics of sales for items in a
product portfolio without using additional regressors, with the aim of generating monthly and
quarterly sales forecasts. It is worth mentioning that an empirical method for tweaking model
parameters is used to incorporate domain knowledge into the proposed framework, but the same
parameters are used for the entire product portfolio to avoid overfitting. It is empirically
concluded that at least 24 months of historical data is required for reliable estimation of trend
and/or seasonal effects. An example of using Facebook's Prophet tool to forecast the next three
months of sales, for the product whose monthly sales time series is shown in Figure 2, is
illustrated in Figure 3.
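A minimal sketch of the typical Prophet workflow on monthly data follows (the toy series, the three-month horizon and the seasonality switches are illustrative assumptions, not the paper's tuned configuration):

```python
import pandas as pd

# Prophet expects a two-column frame: 'ds' (dates) and 'y' (values).
history = pd.DataFrame({
    "ds": pd.date_range("2015-01-01", periods=36, freq="MS"),
    "y": [100 + 10 * (m % 12) for m in range(36)],  # toy seasonal sales
})

try:
    from prophet import Prophet  # open-source release by Facebook's Core Data Science team
    model = Prophet(yearly_seasonality=True,
                    weekly_seasonality=False,
                    daily_seasonality=False)
    model.fit(history)
    future = model.make_future_dataframe(periods=3, freq="MS")
    forecast = model.predict(future)[["ds", "yhat"]].tail(3)  # next 3 months
except ImportError:
    forecast = None  # prophet not installed; the sketch still shows the API shape
```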
Figure 3. The illustration of using Prophet tool to forecast sales of the product with sales per month
time series illustrated in Figure 2
3.5. Performance Metrics
Two performance metrics are used for measuring forecasting accuracy, calculating the expected
level of forecasting accuracy and classifying product portfolio accordingly:
the relative or percentage error (PE) for individual monthly/quarterly forecasts, and
the mean absolute percentage error (MAPE) for quantifying the overall accuracy.
The percentage error (PE), which can be calculated as:
PE = (y_forecast − y_true) / y_true · 100%    (1)
is mainly used to measure the accuracy of individual monthly/quarterly forecasting outputs
generated by the model, while the mean absolute percentage error (MAPE), calculated as:
MAPE = (1/n) · Σ_{i=1..n} |y_i^forecast − y_i^true| / y_i^true · 100%    (2)
is used to quantify the overall accuracy of the forecasting framework and calculate the expected
level of reliability useful for classifying the product portfolio.
These metrics are selected for use because of their simplicity, very intuitive interpretation, as
well as the fact that they work well if there are no extremes in the data. Based on interaction with
clients, it is concluded that a clear and intuitive interpretation of forecasting accuracy metrics in
terms of relative error plays a crucial role in the acceptance of the sales forecasting framework
as a decision-making tool in the retail industry.
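The two metrics above translate directly into code; the following is an illustrative implementation:

```python
def percentage_error(y_forecast, y_true):
    """Relative (percentage) error of a single forecast, Eq. (1)."""
    return (y_forecast - y_true) / y_true * 100.0

def mape(y_forecast, y_true):
    """Mean absolute percentage error over n forecasts, Eq. (2)."""
    n = len(y_true)
    return sum(abs(f - t) / t for f, t in zip(y_forecast, y_true)) / n * 100.0
```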
3.6. Forecastability Analysis using Backtesting Experiments
To calculate a reliable estimation of the expected level of forecasting accuracy in terms of
percentage error, an expanding window backtesting strategy is implemented. Past conditions are
simulated by setting the present moment anywhere in the past, building a model using historical
data to forecast future sales, and seeing how accurately it would have predicted the actual data.
One can assume that, with enough repetitions of simulating past conditions, a highly-reliable
estimation of the expected level of forecastability can be calculated without having to wait for a
new event to happen in order to compare it with previously generated forecasts and draw
conclusions. To take into account yearly seasonality effects, it is recommended to carry out at
least 12 repetitions of the backtesting experiment with a one-month step size, which is
the reason why the limit for the minimum length of observation horizon is set to 39 months (i.e.
at least 24 months of historical data is required for reliable estimation of trend and/or seasonal
effects, 12 months for repeated backtesting experiments and three months of data to measure
accuracy of quarterly forecasts).
In this research, the backtesting experiment is repeated 12 times with one-month step size for
items selected from the product portfolio. At each step, a Prophet model is fitted to the
historical data (i.e. the observation horizon) and the monthly sales forecasts for the next three
months (i.e. the forecasting horizon) are compared with the observed sales for the same period in
order to calculate percentage error (PE) for monthly and quarterly forecasts. Then, the expected
level of forecastability is calculated as the mean absolute percentage error (MAPE) over
repeated backtesting experiments, which is used to quantify the expected level of forecasting
reliability.
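The expanding-window strategy can be sketched as an index-splitting helper (illustrative; the exact placement of the first split within the 39-month minimum horizon is an assumption):

```python
def expanding_window_splits(series, min_train=24, n_repeats=12, horizon=3):
    """Illustrative expanding-window backtesting splits: the 'present moment'
    starts after `min_train` months and moves forward one month per repeat."""
    for step in range(n_repeats):
        split = min_train + step
        yield series[:split], series[split:split + horizon]
```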
An example of performing forecastability analysis for a single item in the product portfolio is
illustrated in Figures 4 and 5.
Figure 4. Step 1/12 of backtesting experiment illustrated on an item in a product portfolio
Figure 5. Step 2/12 of backtesting experiment illustrated on an item in a product portfolio
For the analysed item (Chamomile Filter Tea, 200g) the monthly mean absolute percentage
error (MAPE) is:
MAPE = (1/n) · Σ_{i=1..n} |y_i^forecast − y_i^true| / y_i^true · 100% = 8.00%    (3)
The quarterly forecast MAPE for the same item is:
MAPE = (1/n) · Σ_{i=1..n} |y_i^forecast − y_i^true| / y_i^true · 100% = 6.06%    (4)
The obtained results for all the analysed items (real-world use case scenario) are presented in
the next section.
4. RESULTS
The proposed sales forecasting and product portfolio classification framework is evaluated in a
real-world use case scenario with Real-world sales forecasting benchmark data published by
Žunić [24], which is obtained experimentally in a production environment in one of the biggest
retail companies in Bosnia and Herzegovina. The dataset is published on 4TU.ResearchData so that
it is available to other researchers as new benchmark data.
In this dataset, a total of 581 items in the product portfolio are observed, while 400 of these
were active (i.e. non-zero sales are observed at least once) over the past year.
According to the guidelines proposed in the previous section, items in a product portfolio are
ordered by the total financial turnover (i.e. net sales) per product over the last year and Top 200
items are selected with the aim of covering approximately 90% of total financial turnover.
4.1. Product Portfolio Classification
The first criterion used to classify the product portfolio is based on the observation horizon
length and the recent sales downtime, so the following categories might be identified:
1) Inactive products: a subset of the product portfolio with items that have recent sales
downtime of three or more months
2) Products with observation horizon shorter than 24 months: a subset of the product
portfolio with items that cannot be forecasted nor classified according to the expected
level of forecastability
3) Products with observation horizon shorter than 39 months: a subset of the product
portfolio with items that can be forecasted but cannot be classified according to the
expected level of forecastability
4) Products with observation horizon at least 39 months long: a subset of the product
portfolio with items that can be both forecasted and classified according to the expected
level of forecastability
Classification of Top 200 products in a portfolio by this criterion is illustrated in Figure 6.
Figure 6. Classification of Top 200 products by the criterion based on the observation
horizon length and the recent sales downtime
The second criterion for classifying product portfolio, which can be applied to a subset of
products with observation horizon at least 39 months long, is based on the expected level of
forecasting accuracy calculated as mean absolute percentage error (MAPE) for repeated
backtesting experiments.
For example, binning products into the following class intervals might be interesting:
1) MAPE ≤ 15%
2) 15% < MAPE ≤ 30%
3) 30% < MAPE ≤ 50%
4) MAPE > 50%
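The binning can be sketched as a simple classifier (an illustrative helper mirroring the intervals above):

```python
def mape_class(mape_value):
    """Bin a product's backtesting MAPE (in percent) into the four
    proposed class intervals."""
    if mape_value <= 15:
        return "MAPE <= 15%"
    if mape_value <= 30:
        return "15% < MAPE <= 30%"
    if mape_value <= 50:
        return "30% < MAPE <= 50%"
    return "MAPE > 50%"
```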
Classification of 113/200 products with observation horizon at least 39 months long (from the
list of Top 200 products in a portfolio) by this criterion both for monthly and quarterly forecasts
is illustrated in Figures 7 and 8.
Figure 7. Classification of 113/200 products with observation horizon at least 39 months long:
MAPE of monthly forecasts
Figure 8. Classification of 113/200 products with observation horizon at least 39 months long:
MAPE of quarterly forecasts
5. CONCLUSIONS AND FUTURE WORK
By evaluating its performance in a real-world use case scenario, the proposed framework
demonstrated capabilities of generating reasonably accurate monthly and quarterly sales
forecasts, as well as a great potential for classification of the product portfolio into several
categories according to the expected level of forecasting reliability: approximately 50% of the
product portfolio (with a sufficiently long historical data) can be forecasted with MAPE < 30%
on a monthly basis, while approximately 70% can be forecasted with MAPE < 30% on a
quarterly basis (40% of which with MAPE < 15%).
It is important to mention that the approximately 40% of the product portfolio which can be
forecasted with MAPE < 15% on a quarterly basis are mostly the best-selling items of the
aforementioned retail company, accounting for more than 80% of the annual financial share of the
whole portfolio. Based on these facts, the obtained results are more than satisfactory for a
real-world sales forecasting scenario.
With the aim of expanding a set of products to which the proposed framework can be applied,
future work will also include the development of appropriate sales forecasting method
applicable to products with observation horizon shorter than 24 months. This group of products
represents a non-negligible part of the product portfolio, which is a major limitation of the
proposed framework that will be tackled in the future. Further development of the proposed
framework may include an automated approach for hyper-parameters tuning and optimization,
modelling the impact of price changes, promotional activities and changes in the product
portfolio, integrating multiple forecasting tools besides Prophet (such as X-13ARIMA-SEATS)
and addressing some of its limitations described in this paper.
ACKNOWLEDGEMENTS
The authors want to thank the company “Info Studio d.o.o." from Sarajevo, Bosnia and
Herzegovina, for making this research possible through funding and providing access to
necessary data.
International Journal of Computer Science & Information Technology (IJCSIT) Vol 12, No 2, April 2020
AUTHORS
Emir Žunić is a PhD candidate in Electrical Engineering with over 10 years of
experience in the fields of Software Engineering, IT, Data Mining, Business Process
Management, Document Management and Optimization. He currently works as the
Head of the AI/ML Department at Info Studio d.o.o. Sarajevo. He is also the Co-
Founder and CIO of edu720 d.o.o. Sarajevo. In academia, he has experience
working as a Teaching Assistant/Industry Expert at the Faculty of Electrical
Engineering, University of Sarajevo. In the past, he also worked as an Industry
Expert at the Sarajevo School of Science and Technology. To date, he has
published 44 scientific papers in prestigious conferences and journals. He is an Editorial Board Member
of several scientific conferences and journals.
Kemal Korjenić is a research engineer and data scientist at Info Studio d.o.o. Sarajevo. He earned his
bachelor's and master's degrees at the Faculty of Electrical Engineering, University of Sarajevo.
Kerim Hodžić is a Teaching Assistant at the Faculty of Electrical Engineering, University of Sarajevo
and a research engineer at Info Studio d.o.o. Sarajevo. He earned his bachelor's and master's degrees at
the same faculty, where he is currently a PhD student.
Dženana Đonko is a Full Professor at the Faculty of Electrical Engineering, University of Sarajevo, with
extensive experience in the fields of data mining, machine learning and software engineering. She has
published more than 50 scientific papers in prestigious conferences and journals. She is an Editorial
Board Member of several scientific conferences and journals.