Mechanizing Optimization of Warehouses by
Implementation of Machine Learning
Methodologies and Pricing Policies
MSc Research Project
Data Analytics
Shrikant Samarth
Student ID: x18129137
School of Computing
National College of Ireland
Supervisor: Paul Liard
National College of Ireland
Project Submission Sheet
School of Computing
Student Name: Shrikant Samarth
Student ID: x18129137
Programme: Data Analytics
Year: 2019-2020
Module: MSc Research Project
Supervisor: Paul Liard
Submission Due Date: 11/08/2019
Project Title: Mechanizing Optimization of Warehouses by Implementation
of Machine Learning Methodologies and Pricing Policies
Word Count: XXX
Page Count: 25
I hereby certify that the information contained in this (my submission) is information
pertaining to research I conducted for this project. All information other than my own
contribution will be fully referenced and listed in the relevant bibliography section at the
rear of the project.
ALL internet material must be referenced in the bibliography section. Students are
required to use the Referencing Standard specified in the report template. To use other
author’s written or electronic work is illegal (plagiarism) and may result in disciplinary
action.
Signature:
Date: 28th January 2020
PLEASE READ THE FOLLOWING INSTRUCTIONS AND CHECKLIST:
Attach a completed copy of this sheet to each project (including multiple copies).
Attach a Moodle submission receipt of the online project submission, to
each project (including multiple copies).
You must ensure that you retain a HARD COPY of the project, both for
your own reference and in case a project is lost or mislaid. It is not sufficient to keep
a copy on computer.
Assignments that are submitted to the Programme Coordinator office must be placed
into the assignment box located outside the office.
Office Use Only
Signature:
Date:
Penalty Applied (if applicable):
Mechanizing Optimization of Warehouses by
Implementation of Machine Learning Methodologies
and Pricing Policies
Shrikant Samarth
x18129137
Abstract
In this fast-evolving world, demand and supply should behave complementarily, and the warehouse plays a vital role in connecting the two. It is important to manage warehouses with better space optimization solutions so that the market's trending products keep flowing into the depot. When this factor is overlooked, products stagnate in the warehouse and the warehouse manager incurs losses. Heuristics and data analytics have been major factors in increasing the efficiency of warehouse storage. This research paper addresses space optimization of warehouses and shows how machine learning can be applied to increase warehouse flexibility. Linear regression models and ensemble algorithms are implemented, and their evaluation metric scores are compared to identify the best performing algorithm. After the final evaluation with metrics such as R², mean absolute error and others, gradient boosting emerged as the best performing algorithm among those implemented in this research work. It can assist the warehouse manager in predicting the sale rate and calculating the blow-out period of electronic products with better precision than heuristic analytics.
Keywords: E-commerce, Warehouse Optimization, Space Optimization, En-
hancement, Machine Learning, Pricing strategy, Blowout products
1 Introduction
The E-commerce industry is the most vital and versatile industry when it comes to
dealing with products online. It has the capability to attract customer attention even
if the market for products is not strong. E-commerce is a shift from a specific way of
thinking which has an equal effect on marketers and customers (Bhat et al.; 2016). It
represents a sweeping change from the traditional way of conducting commerce. The various types of e-commerce business relationships are business-to-business (e.g. Amazon Business, Alibaba), business-to-consumer (e.g. sellers on Walmart), consumer-to-business (e.g. Google AdSense) and consumer-to-consumer (e.g. eBay). The focus of this paper is the business-to-business area, where Klodawski et al. (2017) have rightly pointed out the issues a warehouse manager can face that eventually result in overstocking. Based on factors such as market trends, sales forecasts and fast-moving products, the warehouse manager orders a series of products; not all of them get sold, and the remaining products become stagnant. Klodawski et al.'s (2017) analysis is purely heuristic-based, which works well for the small-scale warehouse industry. Applied at a larger scale, however, such analytics gives rise to revenue loss for the manager and to product stagnation, which is termed blow-out. It therefore becomes very important to understand the trend of the products to be ordered, taking into consideration not only heuristic configurations but also categorical values that have an equally significant effect on the outcome. This is readily facilitated by applying data analytics in the warehouse industry.
1.1 Motivation
Having been part of the warehouse industry for more than five years, the author has observed that superintending product vacancy in the warehouse involves a considerable level of complexity. Automation is the next possible solution, as it reduces manual errors and increases the efficiency of the warehouse. Even though automation has proven more delivery-oriented than manual effort, it must be implemented with certain protocols to maximize productivity. Baker and Halim (2007) explain the shortcomings of unproductive automation implementations in warehouse management. Their findings indicate that an important reason for automation is to accommodate growth with effective costing and service improvement, and that there is an evident risk of service-level failures, cost-ineffectiveness and flexibility concerns. These concerns can be addressed by implementing machine learning strategies. The working model of traditional warehouse management is easy to implement and understand: the warehouse manager applies previous experience and orders the quantities of products expected to recur in the warehouse so that the available volume is used optimally. But this approach breaks down for larger warehouses that must deal with tonnes of a wide range of products, resulting in mismanagement of entities and monetary losses (Laudon and Laudon; 2015). To deal with these issues, Reyes et al. (2019) directed analysts to address operations such as product quantity and volume, demand uncertainty and rapid customer service. Ai-min and Jia (2011) explained how optimization can be achieved with the assistance of a genetic algorithm, using decimal encoding and a weight-addition method to evaluate the fitness function and mutation. Artificial intelligence can also play a vital role in understanding different aspects of warehouse management. Kartal et al. (2016) specifically discussed the application of machine learning and artificial intelligence algorithms to study the patterns of stocked products in a repository. Knoll et al. (2016) studied inbound logistics operations and implemented machine learning algorithms to predict tactical strategies that are very difficult to manage through heuristics. Machine learning has the capability to understand the sequences and patterns in the data and provide suggestions that are valuable in live-market scenarios. Building on these benefits, this research work pedals in the direction of stock management to find the time-period patterns of blowout products, which helps the warehouse manager plan stock clearance before products go out of trend and become a burden on the warehouse.
1.2 Research Question
“Can machine learning predict blowout time better than legacy heuristic methods and thereby attain warehouse space optimization?”
1.3 Research Objective
This research paper answers the research question through the following set of objectives:
• Objective 1: Data Collection from a working organization.
• Objective 2: Understanding the data through exploratory data analysis.
• Objective 3: Performing pre-processing techniques on the collected data.
• Objective 4: Implement linear regression and ensemble models.
• Objective 5: Conduct data evaluation on the algorithms to select the best per-
forming model.
• Objective 6: Calculate the blowout time period using the best performing model.
• Objective 7: Execute dynamic pricing strategies to attain warehouse space op-
timization.
2 Literature Review
In the last two decades, logistics warehouse services have become a significant player for various businesses and supply chain industries. New difficulties have arisen with the advancement of technology, so proper decision making requires knowledge of how decisions are to be made and of the key parameters affecting warehouse performance. To ensure inventory management under space restrictions, it is important to improve decision making, storage allocation and the scheduling of warehouse operations. Krauth et al. (2005) introduce a framework that clusters performance estimation across various streams. Based on their empirical validation, they recommend a list of performance indicators that could assist warehouse management in re-examining its operations and compel it to think beyond cost minimization. Effective decisions can only be made if managers have insight into these key parameters of warehouse operations. One such key indicator is inventory management, which is essential for smooth warehouse operations. With the introduction of online shopping, ordering has changed the coordination of the production network, leading to dramatic changes in warehouse inventory management. Van den Berg and Zijm (1999) explained the relation between inventory control decisions, product allocation and assignment problems by comparing various warehouse systems. A more sophisticated class-assignment approach, a higher warehouse service level and shorter response times could additionally help in saving funds and warehouse space. Karásek (2013) further examined issues regarding warehouse layout, which depends on the effective use of space; real-world challenges such as vehicle collisions and shared employees were studied, and a technique for achieving optimization was introduced that combines shop scheduling techniques with Vehicle Routing Problem solving techniques. In the competitive e-commerce sector, where sellers assure on-time deliveries, there is constant pressure to improve response time and keep trending products in the warehouse. As a result, enthusiasm for new, complex planning strategies is taking root in this sector.
The ideal warehouse operation is achieved when customers get their orders in due time and when all warehouse and logistics processes finish in the shortest possible time with minimum cost and resources under dynamically changing conditions. Kim (2018) studies different priority rules to improve the delivery process based on order time. The choice of priority rule involves a trade-off among numerous performance attributes of the outcome, for instance the order volume handled, service level and operational cost (Chen et al.; 2010). A popular technique to support this type of decision is data envelopment analysis (Hackman et al.; 2001; De Koster et al.; 2012). Mangano and De Marco (2014), however, found that limited research has been done on the maintenance of logistics and operational warehouse performance, so there is a need for further optimization in this area. To address this, data-scientific solutions prove to be a logical pick and are illustrated in the following sections.
2.1 Efficiency Improvement for Warehouse Optimization
In general, warehouses today are getting more complex, as there is constant pressure on the warehouse manager to deliver orders in time at minimum cost. The variety of tools for evaluating warehouse performance has also increased, yet the metrics used to evaluate performance in different scenarios are not clear (Staudt et al.; 2015), so warehouse managers have to perform regular analysis. Warehouses generally aim at reducing cost while optimizing warehouse performance and customer responsiveness. Estimating warehouse performance gives feedback on how the warehouse performs compared with the requirements and with peers. Johnson and McGinnis (2010) argue that, when evaluating warehouse performance, technical criteria (output generated per resource utilized) give a clearer picture than financial criteria, since a warehouse most of the time does not generate revenue itself and there are many losses associated with stagnant (blowout) products that move more slowly than predicted. On the other hand, Staudt et al. (2015) found that far less research has been done on cost-related performance indicators than on other operational performance indicators (i.e. time, quality and productivity). Since warehouses are usually an integrated part of larger supply chains, traditional performance measures such as productivity, quality and delivery are more applicable (Schmenner and Swink; 1998; Boyer and Lewis; 2002). Researchers mainly focus on technical performance measures, such as order-line pickups per hour per person, related products, shipment errors and special-request orders (Van Goor et al.; 2019). The issue with these indicators is that they are not independent and each of them relies upon various input indicators (De Koster and Balk; 2008).
These literature studies mostly focus on the technical aspects of the warehouse, while very little research addresses the financial aspects, which depend on warehouse product management. The above studies help us understand the direction of this research work towards further improvement in blowout-product identification, which could help in maintaining the financial goals of the warehouse. The next section describes the various automation techniques utilized for optimizing warehouse performance in different areas of warehouse management.
2.2 Automation Today in Warehouse Optimization
Much research has been done on strategies to automate various areas of the warehouse in order to reduce time and organize the warehouse structure. Zunic et al. (2018) worked on optimal product placement for an operational warehouse where no optimization mechanism had been implemented before. They developed an algorithm that calculates the grade of an item by considering the current product placement and the frequency of the product. As a result, there was a 17.37% reduction in the average picking length for more than 40 orders, where the picking process consumes 50% of the overall cost. In earlier work, Zunic et al. (2017) used a statistics-based fitting algorithm for an optimal strategic method and quantitative product placement in the warehouse picking zone, demonstrated on a real-world case study. That study addresses the issue that, if a product is not present in the picking zone, workers depend on a forklift to take down whole pallets from the top racks, which ultimately decreases warehouse efficiency. The results show that items are moved to the picking zone with an average shifted median accuracy of 91.73%, which indicates that any improvement in this area substantially improves the efficiency of the warehouse. Both papers worked on effective item-picking strategies based on distance, product placement and the product's order frequency, but did not consider auto-replenishment product placement and storage assignment problems. Chan and Chan (2011) addressed this issue in a real-world case study by implementing class-based storage for manual order picking in a multi-level rack distribution warehouse. Performance is measured based on travel distance and order retrieval time. The results show that different combinations of factors yield different performance indicators. As future work, Chan and Chan (2011) suggest working on the product congestion problem; this congestion can be avoided if products that take longer to retrieve from the warehouse than expected are identified. The stagnancy issue faced by Chan and Chan (2011) is resolved in this research paper by identifying such products beforehand and informing the warehouse manager, so that the necessary actions can be taken in the first place.
With the emergence of the e-commerce sector, worldwide orders flooded distribution centres and made the order-fulfilment job more complex. This created a need for automation, which was met by the Kiva system for order fulfilment: back in 2012, Amazon introduced the Kiva system into its e-commerce warehouses (Bogue; 2016). This robotic system utilizes hundreds of robots that move around the warehouse to pick products and make them ready for delivery. More recently, Tejesh and Neeraja (2018) addressed the issue of locating a product in the warehouse, which otherwise requires manual searching; in such cases a warehouse inventory management system that utilizes RFID comes to the rescue. Tejesh and Neeraja (2018) developed a warehouse inventory management system based on the Internet of Things (IoT) that tracks products with timestamps for verification. All the information was made available on a web-page interface so that users can obtain remote information dynamically.
All the mentioned research papers work towards warehouse optimization using various automation techniques, but they lack the detection and maintenance of effective warehouse inventory. These limitations, and the benefits of implementing machine learning, have been identified. The next section reviews research works that have adopted machine learning techniques in different areas to improve warehouse performance.
2.3 Warehouse Management using Machine Learning
This section provides an overview and critical analysis of the major research works that use supervised and unsupervised machine learning techniques in different fields of the warehouse.
2.3.1 Using Artificial Neural Network (ANN)
For any optimal warehouse, the operational priority is to keep the inventory updated and reordered. Various studies have calculated reordering points using heuristic methods, but calculating the reordering points of all products with a mathematical function is time consuming (Inprasit and Tanachutiwat; 2018). Inprasit and Tanachutiwat (2018) therefore implemented an artificial neural network (ANN) for reordering-point determination and safety stock management. Various algorithms were trained, tested, built and compared for all products based on factors such as lead time, demand, standard deviation (SD) of demand, SD of lead time and service level. As a result, the ANN gave the most accurate results, with an MSE of 0.03 × 10⁻⁴ and an adjusted R² of 0.999 for the overall data. Rezaei (2012) used an ANN along with clustering techniques for the prediction of safety stock inventory. In that paper, a prediction model was developed by combining K-means clustering and multi-layer perceptron methods; moreover, sensitivity analysis was applied to reduce the input vector, which improved the accuracy of the ANN in identifying the safety stock. These two papers worked on reordering-point determination and the prediction of safety stock, but even after stock prediction, determining product stagnancy can close the gap between stock prediction and the storage of new stock.
2.3.2 Using Genetic Algorithm
A unique approach was used by Nastasi et al. (2018) on an existing steel-making warehouse for multi-objective optimization of storage strategies. For this purpose, three genetic algorithms, i.e. the Niched Pareto Genetic Algorithm, the Non-dominated Sorting Genetic Algorithm II and the Strength Pareto Genetic Algorithm II, were implemented; along with these algorithms, traditional reordering and product allocation procedures were also implemented. The warehouse optimization issue was tested on every algorithm by exploring the simulation system¹ presented in the paper. Ai-min and Jia (2011), working on a pharmaceutical logistics centre, mention that such medical storage centres need special attention in slotting optimization, as the requirement for medicines changes with the season or the influence of flu. To deal with such issues, Ai-min and Jia (2011) utilize MATLAB and a genetic algorithm to resolve multi-objective optimization issues of the pharmaceutical logistics centre. The overall results show that the approach is effective but needs further improvement for real-world scenarios, as the paper uses very few goods in its approach.

¹The approach to understanding the simulation process is provided under the section Numerical Approach in Nastasi et al. (2018).
2.3.3 Using Regression Models
Faber et al. (2018) examine the fit between warehouse management structure and performance. A hypothesis was developed and tested on 111 storage warehouses in Belgium and the Netherlands using a linear regression model. To ensure the robustness of the regression results, a bootstrapping method was employed, which hardly changed the p-values except for the food-products dummy variable. The ordinary least squares (OLS) regression model performs well and is statistically valid for drawing conclusions. Faber et al. (2018) mention that the results obtained from this model could further be used by warehouse managers to select an appropriate planning system. To check the performance of linear regression models, Vastrad et al. (2013) apply lasso, ridge and elastic net, which use penalty-based shrinkage to handle datasets with more predictors than observations. It was found that elastic net and least angle regression (LARS) give similar results, but lasso outperformed ridge regression in terms of R² and RMSE.
de Santis et al. (2017) focus on identifying material back-order issues prior to their occurrence, which gives the business sufficient time to take actions that could increase its overall performance. The study addresses a predictive model for a class-imbalance problem, where the frequency of products that go on backorder is low compared to products that do not. Initially, a logistic regression (LOGIST) algorithm was used to create a baseline score, followed by a classification tree (CART) for outlier robustness and feature selection. Machine learning techniques such as random forest, gradient boosting and blagging (balanced bagging) were then applied. The results give an area-under-the-curve score of 0.9441 for random forest using the bagging ensemble and 0.9481 for gradient boosting. In practice, BLAG identifies 60% of the positive-class items and 20% of the products that could become backorders. In the same year, Larco et al. (2017) used linear regression for managing workers' discomfort through warehouse storage assignment decisions; the results suggest a 21% cycle-time improvement in the picking zone for the warehouse optimization process. These research works rightly point towards adopting linear regression and ensemble techniques for this experiment.
From all the above studies, it is understood that there are very few research works that implement warehouse space optimization by identifying products that could possibly go out of trend. Moreover, the use of an electronic-products warehouse dataset is unique and has not been used in any other project. The next section gives an idea of effective dynamic pricing, which is important for this project once the blowout products are identified.
2.4 Dynamic Pricing for Warehouse Maintenance
In any supply chain industry, an effective pricing strategy is important to give products good momentum in the market depending on market scenarios, especially in the B2C electronics industry, where fast-changing trends and a competitive market affect the sale of products. Minga et al. (2003) propose a price-setting algorithm as an application of a dynamic price-setting strategy for warehouse stock optimization. The proposed algorithm for a demand-sensitive model helps the warehouse manager maximize profit while the marginal cost decreases as the quantity ordered increases. The paper further mentions websites that allow buyers to register and let the seller set a minimum threshold; when enough buyers register, the price drops to a little above the threshold, which helps the seller make more money. This strategy can be used in the present research work to sell the blowout products effectively once they are identified. Ancarani and Shankar (2004) propose hypotheses on how costs and price dispersion compare between traditional retailers and multi-channel retailers and test them through statistical analysis; a price-sensitive multidimensional recommender system (PSRS) and collaborative filtering were used. Their research shows how price setting affects business performance and can be applied to sell products in bulk once blowout products are identified. After a thorough analysis of the literature from various sectors, the detailed explanation of the adopted methodology is drafted in the next section (Section 3) to answer the research question stated in Section 1.2.
3 Research Methodology and Specification
After the literature is understood, it is important to chalk out the methodology of the research. For this experiment it is very important to understand the business perspective of the data. The data is then modelled with different regression strategies and evaluated to achieve the ultimate result of finding the blow-out time. CRISP-DM turns out to be the best methodology for this process, as all the process steps are described among the typical phases explained in the CRISP-DM guide. It also has the advantageous capability of making large data mining projects cheaper, more manageable and more reliable (Wirth and Hipp; 2000). The process is shown in Figure 1 below.
Figure 1: Modified CRISP-DM Process Model, Data Source: Wirth and Hipp (2000)
3.1 Business Understanding
The most important thing is to understand the business at which this product is targeted. Due to the increase in customer demand, the business has to be more flexible and provide a solution that is both user-friendly and business-friendly. From the literature review it is understood that warehouse space is limited and stagnant products make the problem worse: they take more time to sell than usual, which blocks revenue and space and results in losses to warehouse management. This research work mainly focuses on a space optimization solution for the warehouse manager, so that the manager does not have to bear losses on products that have worn out. Producing the blow-out time of electronic products therefore becomes the business understanding of this research work.
3.2 Data Understanding
Data collection is an important step, as it initiates the execution of the plan. The data for the research should be live data so that it is easy to understand which issues warehouse controllers face on a regular basis. This data understanding section is divided into the following three subsections.
3.2.1 Data Collection
The data is collected from a live working organization to sustain the objective of assisting organizations where heuristics are applied to understand the blow-out duration of products. The data was obtained from a startup e-commerce organization, Price Save, that mainly deals in electronic products. It covers around 4200 unique products across 3 months and has parameters such as cost price, selling price and profit, plus an additional feature called Sale Rate, which is explained in the data preparation phase.
3.2.2 Data Ethics
To maintain the confidentiality of the data, prior consent was obtained to use the data for educational and research purposes, and the data was gathered only with the business interest in mind. The letter of consent is attached in the configuration manual.
3.2.3 Exploratory Data Analysis
The collected data is summarized in terms of statistics such as the mean and standard deviation. The summary of some columns is shown in Figure 2 below.
Figure 2: Column Summary
The only purpose of this is to identify any anomalies in the data so that they can be struck from the data at an early stage.
1) Validation of Data Linearity: The scatter plot in Figure 3 shows the data distribution of the columns and indicates that the column data is not linear.
Figure 3: Scatter-plot
2) Multicollinearity: Multicollinearity arises when features are strongly dependent on each other. In the dataset under consideration, cost price, profit and price are strongly dependent features. To understand their interdependence, the correlation matrix has been plotted for these features (Figure 4).
Figure 4: Correlation Matrix
From the above correlation matrix it is found that the cost price, profit and price columns are correlated. The profit column was calculated from the price (i.e. selling price) and the cost price. Thus, to increase the efficiency of model training, the profit feature has been dropped and the model is fitted using the cost price and price features, as these are important parameters for any product and are helpful in training the model.
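The correlation check and the dropping of the derived profit column could, for example, be carried out with pandas; the file path and column names below are assumptions, not taken from the project's code.

```python
import pandas as pd

# Load the warehouse master file; the path and column names are assumptions.
df = pd.read_csv("warehouse_master.csv")

# Correlation matrix of the strongly related price features (cf. Figure 4).
price_features = ["cost_price", "price", "profit"]
print(df[price_features].corr())

# Profit is derived from price and cost price, so it is dropped before
# modelling to reduce multicollinearity.
df = df.drop(columns=["profit"])
```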
3.3 Design Specification
A key aspect of any project is to understand it from both the logical and the business point of view. This section therefore gives an insightful overview of the project; the steps involved are shown in Figure 5 using a two-tier architecture.
Figure 5: Project Flow
3.4 Data Preparation and Transformation
This phase mainly deals with the pre-processing of the collected data. As declared earlier, Sale Rate is used as an additional parameter; its only use in the data is to calculate the blow-out time of the products in heuristic analytics. To create this additional column, a master file was created assimilating per-day data over 90 days. The inventory and the number of the day on which the stock gets sold out are recorded; if certain products record no sale for 90 days, they get priority for clearance and are labelled infinite blowout to distinguish them from other products. The formula for calculating the Sale Rate is given below:
SaleRate = (Initial Inventory − Remaining Inventory) / N    (1)
where N is the number of the day on which the stock gets sold out.
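A minimal sketch of Equation (1) in pandas is shown below; the column names and the zero-sale handling for the infinite-blowout case are assumptions based on the description above.

```python
import pandas as pd

def add_sale_rate(df: pd.DataFrame) -> pd.DataFrame:
    """Compute Sale Rate as in Equation (1); column names are assumptions."""
    sold = df["initial_inventory"] - df["remaining_inventory"]
    df["sale_rate"] = sold / df["days_to_sellout"]  # N: day on which stock sold out

    # Products with no sale over the 90-day window are flagged as
    # 'infinite blowout' and prioritised for clearance.
    no_sale = sold <= 0
    df.loc[no_sale, "sale_rate"] = 0.0
    df["infinite_blowout"] = no_sale
    return df
```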
It is very important to pre-process the data before using it to fit the model, as the nature of the information and the data quality have a direct influence over the capacity of the model to adapt (Nalić and Švraka; 2018). The following factors have been taken into consideration before the data modelling.
3.4.1 Managing NULL values
The data was validated for NULL values. Where present, these values were replaced with a 'default' value that does not make any significant change to the database and does not misguide the applied models into irrelevant outputs.
3.4.2 Handling missing values
Missing values cause major issues when applying machine learning algorithms to the data, as the data becomes prone to outliers, which return inaccurate results². This is handled in the dataset by replacing the missing values with default values that do not add much variance to the dataset.
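A hedged sketch of the default-value replacement described in Sections 3.4.1 and 3.4.2 is given below; the chosen defaults (0 for numeric columns, the string 'default' for text columns) are illustrative assumptions.

```python
import pandas as pd

def fill_defaults(df: pd.DataFrame) -> pd.DataFrame:
    """Replace NULL/missing entries with low-variance defaults (illustrative choices)."""
    numeric_cols = df.select_dtypes(include="number").columns
    object_cols = df.select_dtypes(include="object").columns

    # Numeric gaps get 0 so they add little variance; text gaps get a
    # 'default' placeholder so the encoders do not fail on NaN.
    df[numeric_cols] = df[numeric_cols].fillna(0)
    df[object_cols] = df[object_cols].fillna("default")
    return df
```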
3.4.3 Dealing with categorical values
Categorical values are discrete rather than continuous. It is important for an algorithm to receive numeric values, as it cannot directly judge the outcome from categorical values. There are two types of categorical values, namely ordinal and nominal. As the dataset is free of ordinal variables, only nominal variables have been taken into consideration.
• One-Hot Encoding:
The dataset has many categorical values that were required to improve the accuracy of the machine learning models. During feature engineering, categorical data was treated as nominal and numerical data as ordinal. One-hot encoding was performed on the nominal data: the dataset gains n additional binary columns, where n is the number of unique nominal categories. As there are 458 different category combinations in the nominal data, 458 additional columns have been added. The data after encoding is shown in Figure 6.
Figure 6: One-hot encoding
²How to manage missing data: https://towardsdatascience.com/introduction-to-data-preprocessing-in-machine-learning-a9fa83a5dc9d
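One way to produce this encoding is pandas get_dummies, as sketched below; the file path and the nominal column names are assumptions and stand in for the 458-category combination described above.

```python
import pandas as pd

# Path and nominal column names are illustrative assumptions.
df = pd.read_csv("warehouse_master.csv")
nominal_cols = ["category", "brand"]

# get_dummies adds one binary column per unique category value, which is how
# the additional 458 columns described above arise.
encoded = pd.get_dummies(df, columns=nominal_cols, prefix=nominal_cols)
print(encoded.shape)
```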
3.5 Modelling
After the extensive literature survey reviewed in Section 2, it was seen that the identification of blowout products can be done using regression and ensemble methods. The objective is to predict the continuous dependent variable Sale Rate, from independent variables such as category and price, in order to figure out the blow-out duration using the best performing algorithm. Because the dependent variable is continuous, linear regression analysis is used rather than logistic regression. For this purpose, algorithms such as LASSO, Ridge, Elastic Net, Gradient Boosting, AdaBoost and Random Forest regressors have been modelled.
3.5.1 Linear Regression
1) LASSO (Least Absolute Shrinkage and Selection Operator):
LASSO is a linear regression method that performs variable selection to increase the prediction accuracy of the statistical model. It uses shrinkage, which shrinks the data values towards the centre, and penalizes the sum of the absolute values of the weights. The objective of the lasso is to find the subset of predictors that minimizes the prediction error for a quantitative variable. It reduces model complexity and avoids overfitting.
2) Ridge Regression:
Ridge regression estimates are usually little affected by small changes in the data on which the fitted regression is based. It penalizes the sum of squared values of the weights, so the weights are more evenly distributed and closer to zero. Ridge regression is generally used to deal with multicollinearity in the dataset, as it reduces the impact of the correlations in the data; this is why it is part of this research work.
3) Elastic Net:
Another type of linear regression model is the elastic net, which combines the penalties of both lasso and ridge. It is a hybrid technique in which the absolute penalty and the squared penalty are both included and regulated with a mixing coefficient. It works best with larger, scaled data (i.e. datasets with more than 100K rows), so it is a good choice to test as an improvement over the lasso and ridge algorithms.
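A minimal scikit-learn sketch of the three regularised linear models is given below; the library choice, the stand-in data and the alpha values are assumptions rather than the tuned settings used in the experiment.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data; in the project X is the encoded feature matrix and y is Sale Rate.
X, y = make_regression(n_samples=500, n_features=20, noise=0.1, random_state=42)

models = {
    "lasso": make_pipeline(StandardScaler(), Lasso(alpha=0.01)),
    "ridge": make_pipeline(StandardScaler(), Ridge(alpha=1.0)),
    "elastic_net": make_pipeline(StandardScaler(), ElasticNet(alpha=0.01, l1_ratio=0.5)),
}
for name, model in models.items():
    model.fit(X, y)
    print(name, round(model.score(X, y), 3))  # in-sample R², for illustration only
```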
3.6 Ensemble Models
1) Gradient Boosting Regressor:
Boosting is a procedure that converts weak learners into strong learners. Gradient boosting trains models in an additive and sequential manner. It identifies the limitations of weak learners by using gradients of the loss function, where the loss function indicates how well the model coefficients fit the underlying data. Ke et al. (2017) popularized a highly efficient gradient-boosting decision tree by proposing two novel techniques, namely Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB). Given its strong performance, with a training time cost of 0.28 seconds per training iteration, this algorithm should suit the warehouse data well.
2) Ada Boosting Regressor:
The AdaBoost regressor is a meta-estimator that starts by fitting a regressor on the dataset and then fits additional copies of the regressor on the same dataset, with the weights of the observations adjusted according to the error of the current prediction. It is initialized with equal weight given to each observation. Solomatine and Shrestha (2004) compared M5 model trees with AdaBoost.RT and found that the latter performed better on most of the considered datasets. The researchers showed practically that the AdaBoost algorithm performs better even in the presence of noise in the data and outperforms the individual machine with a confidence level higher than 99%. Considering all these factors, it has a high capability to perform well on the warehouse data.
3) Random Forest Regressor:
Random forest is a bagging ensemble that operates by constructing a series of decision trees during training and aims to reduce variance by building each tree on a random (and thereby de-correlated) subset of the dataset. It is an ensemble of different regression trees and is utilized for nonlinear multiple regression.
All the above algorithms have been selected based on their exceptional performance in their respective fields. The main motive for selecting linear regression and ensemble techniques is that the target data is continuous rather than discrete.
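The three ensemble regressors could be instantiated in scikit-learn as sketched below; the hyperparameter values are illustrative assumptions, not the tuned values of the experiment.

```python
from sklearn.ensemble import (AdaBoostRegressor, GradientBoostingRegressor,
                              RandomForestRegressor)

# Hyperparameters here are placeholders; the tuned values come from the
# grid search described in Section 3.6.1.
ensembles = {
    "gradient_boosting": GradientBoostingRegressor(n_estimators=200,
                                                   learning_rate=0.1,
                                                   random_state=42),
    "ada_boost": AdaBoostRegressor(n_estimators=100, random_state=42),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=42),
}
# Each model is fitted on the training split in the same way as the linear models:
# ensembles["gradient_boosting"].fit(X_train, y_train)
```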
3.6.1 Test Design
Before building the model, it is very important to understand which splitting technique is best suited for the dataset. There are two main types of data splitting, namely the train-test split and the train-validation-test split, and the train-validation-test split has various proven advantages over the plain train-test split. Guyon (1997) provides a splitting measure by relating the ratio of validation-set size to training-set size to the minimization of the validation error and the training error rate. The data split used for the warehouse dataset is 80% train, 10% validation and 10% test. After this split is performed, the models are built and evaluated. The three best performing algorithms are subjected to tuning with the help of a grid search over their parameters.
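The 80/10/10 split and the grid search could be realised with scikit-learn as in the sketch below; the stand-in data and the parameter grid are assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Stand-in data; in the project X is the encoded feature matrix and y is Sale Rate.
X, y = make_regression(n_samples=500, n_features=20, noise=0.1, random_state=42)

# 80/10/10 train/validation/test split via two successive splits.
X_train, X_hold, y_train, y_hold = train_test_split(X, y, test_size=0.2, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_hold, y_hold, test_size=0.5,
                                                random_state=42)

# Grid search used to tune the best performing models; this grid is an assumption.
param_grid = {"n_estimators": [100, 200], "learning_rate": [0.05, 0.1]}
search = GridSearchCV(GradientBoostingRegressor(random_state=42), param_grid,
                      scoring="r2", cv=3)
search.fit(X_train, y_train)
print(search.best_params_, round(search.score(X_val, y_val), 3))
```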
3.7 Evaluation
While preparing a model, it is important to consider how the model generalizes to unseen data. In this research a hold-out strategy is used, which is the simplest kind of cross-validation and requires less computational power than k-fold cross-validation; it provides an unbiased estimate of learning performance. In this method the dataset is divided into three subsets, i.e. train, test and validation. The validation subset provides a platform for fine-tuning the models' parameters and selecting the best performing model. For this research the data is divided into 80% train, 10% test and 10% validation. All the models are first evaluated on the validation set, and the predictions of the best performing model are then calculated on the test set. While training the models, their performance is observed by iterating over the data in batches, and the models are compared on these evaluations during testing. Moreover, for further validation, modelling was done separately on products which were sold and the evaluations checked, whereas for unsold products the best algorithm was tested to check the R² score. All the graphs are shown in the subsections below.
3.7.1 R-Square Criterion
The coefficient of determination (R²) is a regression statistic that determines the proportion of the variance in the dependent variable that is explained by the predictor variables. R² can vary between 0 and 1, where 1 means the model fits the data perfectly. The formula for R-squared is given below:

R² = Variance explained by the model / Total Variance    (2)
Graphs obtained after modelling are shown below:
Figure 7: R2 score - full dataset
Figure 8: R2 score - soldout dataset
From Figures 7 and 8 it is clear that GBM outperforms all other algorithms for the modelling performed on both the full dataset and the soldout dataset.
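The five metrics used in this section (R², MAE, MSE, median absolute error and explained variance) can all be computed from the same pair of true and predicted values; the sketch below assumes scikit-learn's metrics module, which the report does not name explicitly.

```python
from sklearn.metrics import (explained_variance_score, mean_absolute_error,
                             mean_squared_error, median_absolute_error, r2_score)

def evaluate(y_true, y_pred):
    """Return the five evaluation metrics reported in Section 3.7 for one model."""
    return {
        "r2": r2_score(y_true, y_pred),
        "mae": mean_absolute_error(y_true, y_pred),
        "mse": mean_squared_error(y_true, y_pred),
        "median_ae": median_absolute_error(y_true, y_pred),
        "explained_variance": explained_variance_score(y_true, y_pred),
    }

# Example with dummy values:
print(evaluate([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```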
3.7.2 Mean absolute error
As the data under use is continuous, the differences between the continuous variables can be measured using the mean absolute error. MAE shows how big an error can be anticipated from the forecast on average; it is the average distance between each point and the line of equality. The formula for MAE is:

MAE = (1/n) Σ |x_i − x|

where n is the number of errors and |x_i − x| is the absolute error.
Figure 9: MAE - Full dataset
Figure 10: MAE - soldout dataset
All six models were evaluated on both datasets using the mean absolute error. Gradient boosting has the lowest MAE for the soldout dataset, whereas the random forest regressor gives the lowest MAE when evaluated on the full dataset.
3.7.3 Mean squared error
The mean squared error indicates how close the data points are to the regression line; it is calculated using the square of the distance from the regression line. Squaring removes negative differences and gives more weight to large differences. The smaller the MSE, the closer the points are to the line of best fit. The comparison of MSE for all algorithms is shown in the graphs below.
Figure 11: MSE - full dataset
Figure 12: MSE - soldout dataset
From Figures 11 and 12 it can be seen that RFR gives the lowest error on the full dataset, but GBM shows the least error for the soldout dataset.
3.7.4 Median absolute error
The median absolute error is very robust to outliers. The loss is calculated as the median of the absolute differences between predictions and targets; a lower value indicates higher accuracy. The built models were evaluated using the median absolute error and the algorithm comparison is plotted below.
Figure 13: Median absolute error - full dataset
Figure 14: Median absolute error - soldout dataset
From Figures 13 and 14, RFR shows the lowest median absolute error for the full dataset, while GBM has the lowest error on the soldout dataset.
3.7.5 Variance score
The variance score, or explained variance, measures how well a mathematical model accounts for the variation in the given dataset; it captures the discrepancy between the predicted and the actual values. A higher explained variance indicates a stronger association. The built models were evaluated and compared using the variance score, as shown below.
Figure 15: Variance score - full dataset
Figure 16: Variance score - soldout dataset
From the above comparison, RFR shows a stronger association than the rest of the algorithms for the full dataset, but GBM shows better results for the soldout one.
Looking at all the evaluations on both datasets, RFR shows a lower error rate than the other algorithms, but the difference between its train and test scores is wide for most of the evaluations. Checking this condition for the remaining algorithms, GBM has a low error rate and a smaller score difference, which makes it best suited for the application.
The GBM algorithm is further tested against the unsold subset to compare performance. There is a reduction in the test R² score compared to the full dataset, whereas the mean squared error increases; the anticipated reason is that this subset contains noise, i.e. the outliers of the full dataset. The graphs for R² and MAE are plotted in Figure 17 below.
Figure 17: Full data vs unsold subset
3.8 Deployment
This section gives an overall idea of the implementation of the project, as well as how the prediction of the blowout time period and the application of pricing strategies are implemented to achieve warehouse automation for space optimization.
3.8.1 Implementation
To identify an accurate blowout time period using different machine learning algorithms, the data was first collected, following proper ethics, from a startup organization named 'Price Save'; the consent letter is provided in the configuration manual of this project. The dataset contains 4200 unique products per day for a period of 3 months. Initially, exploratory data analysis was performed, as explained in Section 3.2. After understanding the data, pre-processing techniques such as scaling, one-hot encoding and feature extraction (explained in Section 3.4) were applied. The data was split into train, test and validation sets (80%, 10% and 10% respectively) to maintain the integrity of the dataset and was trained with all six models. The models were first trained by iterating over the dataset in batches and assessed on the validation set to check their performance. Based on the R² score, the best model was further tuned to improve the scores; unfortunately, the tuning did not find better parameters and the resulting R² scores were worse than the training/validation ones. Hence, it was decided to use the models trained with the validation procedure for predicting the scores on the test set. A new column, "Sold Status", was then added to the dataset, giving a logical value for whether the product was exhausted. This sold subset was trained against all the models, performing the same tasks as for the full dataset, and the results from both experiments were noted to determine the best performing algorithm. After the comparative evaluation, the gradient boosting regressor was identified as giving the best performance in terms of R² and the other metrics explained in Section 3.7. The already trained GBM model was then tested against the unsold dataset to calculate the comparative error rate. The best algorithm was further used to predict the blow-out time period for all products; how to apply effective pricing strategies is explained in Sections 3.8.2 and 3.8.3.
3.8.2 Prediction of Blowout Time-Period
This is the application of the machine learning strategy for the benefit of the warehouse industry. Identifying the blowout time period not only saves the warehouse manager's time but also prevents the warehouse from overstocking, which ultimately helps in space optimization. The best performing algorithm identified from the above analysis was used to predict the sale rate, which is calculated from the sales obtained per day. Based on the accuracy obtained from testing, an accurate sale rate is obtained and the blowout day can be calculated using the formula:
Blowout Day n = Initial Inventory / Predicted SaleRate    (3)

where the predicted sale rate is obtained using the best performing algorithm.
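A minimal sketch of Equation (3) is given below; rounding up to whole days and mapping a zero sale rate to the infinite-blowout case are assumptions.

```python
import numpy as np

def blowout_days(initial_inventory, predicted_sale_rate):
    """Equation (3): days until a product's stock is expected to be exhausted."""
    inventory = np.asarray(initial_inventory, dtype=float)
    rate = np.asarray(predicted_sale_rate, dtype=float)
    # A zero predicted sale rate corresponds to the 'infinite blowout' case.
    return np.where(rate > 0, np.ceil(inventory / rate), np.inf)

# Example: 120 units in stock at a predicted 2.5 units/day -> 48 days.
print(blowout_days([120], [2.5]))
```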
The results after calculating the blowout with the best performing algorithm are shown in Figure 18, where y pred is the blowout predicted using the gradient boosting algorithm and blowout test is calculated using the heuristic method.

Figure 18: Blowout prediction

It can be seen from the blowout prediction output that the blowout calculated through the machine learning algorithm is far more accurate than the heuristic one. Thus, after identifying the time required for every product to be exhausted, the warehouse manager can apply pricing strategies to release these products from the warehouse effectively. The pricing strategy is explained in the next section.
3.8.3 Application of Efficient Pricing Strategies for Warehouse Optimization
The application of pricing strategies is an additional layer of the project, following the prediction of the blow-out time period for each product using the best performing algorithm as explained in Section 3.8.2. This part is important for achieving the objective of automating warehouse products by clearing them out with an optimal pricing strategy. As discussed in Section 2.4 of the literature survey, a dynamic price-setting strategy can be applied using the price-setting algorithm mentioned in Minga et al. (2003). If a predicted blowout product has a large quantity still to be sold and a long predicted time period, the profit can be dropped further, up to a threshold limit, until the product gains momentum. When dealing with a wide range of products, it is not possible to reprice every product manually, and such products are prone to overstocking. Hence the method described below can be used to price all these products with ease, freeing attention for other warehouse products that generate profit. The formula below uses the strategy of dropping the price by setting it as a percentage above the cost price. A minimum profit is defined first, and then, by trial and testing, a certain amount of profit percentage is dropped each time until the product gains sales momentum. This is a method invented and tested by the author while working in a previous organization. The cost price consists of various parameters, i.e. the cost of the product, shipping, warehouse fees (cost of warehousing) and website commission (e.g. for a third-party e-commerce website). The idea is explained below using an example formula.
Example Formula:

Selling price = (1 + 15/100) × (product cost + shipping + commission + warehouse fees)    (4)

This sets the selling price at 15% above the cost price, i.e. the formula gives the value to be set in order to take 15% above the cost price. Simply by changing the percentage, a selling price can be obtained for each product. Hence the general formula:

Selling price = (1 + n/100) × (total cost price involved)    (5)

where n symbolizes the percentage to be set. In this way, mass pricing can be done using the master file for products that are slow moving (blowout products).
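Equation (5) translates directly into a small helper, sketched below; the cost components and the example numbers are illustrative assumptions. Lowering the percentage step by step, down to the defined minimum profit, reproduces the trial-and-testing price drops described above.

```python
def selling_price(product_cost, shipping, commission, warehouse_fees, percent_above_cost):
    """Equation (5): price the product at n% above its total landed cost."""
    total_cost = product_cost + shipping + commission + warehouse_fees
    return (1 + percent_above_cost / 100) * total_cost

# Example: a 15% margin over a total cost of 100 gives a selling price of 115.
print(round(selling_price(80, 10, 6, 4, 15), 2))
```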
4 Discussion
In this research work on applying machine learning strategies to optimize warehouse space, it has been found that machine learning methods perform better at predicting the sale rate, which helps this experiment identify the blow-out period. Machine learning algorithms such as elastic net, lasso, ridge, gradient boosting, AdaBoost and the random forest regressor were applied to the full dataset and evaluated on different metrics. Amongst all algorithms, gradient boosting performs best, with R² = 0.9197 and a mean squared error of 0.177, followed by the AdaBoost regressor with an R² of 0.9064 and an MAE of 0.2120. The trained GBM model was tested against the unsold subset to validate its performance on unsold products; the results show a reduction in R² compared to the R² obtained on the full dataset. Moreover, all the algorithms were retrained and retested on the sold subset; the results improved, as the noise, i.e. the unsold data containing outliers, was removed from the dataset. This study will help the warehouse manager understand the pattern of unsold products while re-ordering for the next iteration, and apply dynamic pricing strategies to effectively optimize the warehouse space. The blow-out times calculated heuristically and those calculated with GBM have distinct values; some examples are plotted in Figure 19 below. At this moment there are no huge differences in the blow-out durations, but in the long run heuristics (i.e. decisions based on experience and knowledge) will not prove effective, whereas GBM will keep improving with an increasing amount of pattern data and products. Thus, the automation can be marked as fruitful and executable for warehouse space optimization.
Figure 19: Blowout ML vs actual blowout
5 Conclusion and Future work
A competitive comparison of heuristic analytics against different machine learning strategies was conducted, and it can be concluded that heuristics can prove very effective for small-scale organizations, but to handle the amount of data generated by large organizations, only machine learning strategies can set a stepping stone and provide better results. The experiment shows that warehouse space optimization can very well be achieved by implementing machine learning strategies with a good degree of accuracy.

There is always scope for improvement in the research work. The most important would be the provision of factors such as varied seasonal data, an increased number of products and additional sales domains. These factors would increase the quality of the data, which in turn would train the algorithms better and yield more accurate predictions of products' blow-out times. There are various studies that use applied deep learning, viz. ANNs, which provide higher accuracy; but in this research the amount of data is inadequate and is not exposed to seasonal patterns, so such models could not be successfully built to predict accurate patterns. Nevertheless, space optimization is achievable, and this research paper adequately answers the research question with great success.
References
Ai-min, D. and Jia, C. (2011). Research on slotting optimization in automated warehouse
of pharmaceutical logistics center, 2011 International Conference on Management Sci-
ence & Engineering 18th Annual Conference Proceedings, IEEE, pp. 135–139.
Ancarani, F. and Shankar, V. (2004). Price levels and price dispersion within and across
multiple retailer types: Further evidence and extension, Journal of the academy of
marketing Science 32(2): 176.
Baker, P. and Halim, Z. (2007). An exploration of warehouse automation implementa-
tions: cost, service and flexibility issues, Supply Chain Management: An International
Journal 12(2): 129–138.
Bhat, S., Kansana, K. and Khan, J. (2016). A review paper on e-commerce, Asian
Journal of Technology & Management Research [ISSN: 2249–0892] 6(1).
Bogue, R. (2016). Growth in e-commerce boosts innovation in the warehouse robot
market, Industrial Robot: An International Journal 43(6): 583–587.
Boyer, K. K. and Lewis, M. W. (2002). Competitive priorities: investigating the need for
trade-offs in operations strategy, Production and operations management 11(1): 9–20.
Chan, F. T. and Chan, H. K. (2011). Improving the productivity of order picking of a
manual-pick and multi-level rack distribution warehouse through the implementation
of class-based storage, Expert Systems with Applications 38(3): 2686–2700.
Chen, C.-M., Gong, Y., De Koster, R. B. and Van Nunen, J. A. (2010). A flexible eval-
uative framework for order picking systems, Production and Operations Management
19(1): 70–82.
De Koster, M. and Balk, B. M. (2008). Benchmarking and monitoring international
warehouse operations in europe, Production and Operations Management 17(2): 175–
183.
De Koster, R. B., Le-Duc, T. and Zaerpour, N. (2012). Determining the number of zones
in a pick-and-sort order picking system, International Journal of Production Research
50(3): 757–771.
de Santis, R. B., de Aguiar, E. P. and Goliatt, L. (2017). Predicting material backorders in
inventory management using machine learning, 2017 IEEE Latin American Conference
on Computational Intelligence (LA-CCI), IEEE, pp. 1–6.
Faber, N., De Koster, R. B. and Smidts, A. (2018). Survival of the fittest: the impact
of fit between warehouse management structure and warehouse context on warehouse
performance, International Journal of Production Research 56(1-2): 120–139.
Guyon, I. (1997). A scaling law for the validation-set training-set size ratio, AT&T Bell
Laboratories pp. 1–11.
Hackman, S. T., Frazelle, E. H., Griffin, P. M., Griffin, S. O. and Vlasta, D. A. (2001).
Benchmarking warehousing and distribution operations: an input-output approach,
Journal of Productivity Analysis 16(1): 79–100.
Inprasit, T. and Tanachutiwat, S. (2018). Reordering point determination using ma-
chine learning technique for inventory management, 2018 International Conference on
Engineering, Applied Sciences, and Technology (ICEAST), IEEE, pp. 1–4.
Johnson, A. and McGinnis, L. (2010). Performance measurement in the warehousing
industry, IIE Transactions 43(3): 220–230.
Karásek, J. (2013). An overview of warehouse optimization, International journal of
advances in telecommunications, electrotechnics, signals and systems 2(3): 111–117.
Kartal, H., Oztekin, A., Gunasekaran, A. and Cebi, F. (2016). An integrated decision
analytic framework of machine learning with multi-criteria decision making for multi-
attribute inventory classification, Computers & Industrial Engineering 101: 599–613.
Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q. and Liu, T.-Y.
(2017). LightGBM: A highly efficient gradient boosting decision tree, Advances in
Neural Information Processing Systems, pp. 3146–3154.
Kim, T. Y. (2018). Improving warehouse responsiveness by job priority management: A
European distribution centre field study, Computers & Industrial Engineering.
Klodawski, M., Jacyna, M., Lewczuk, K. and Wasiak, M. (2017). The issues of selection
warehouse process strategies, Procedia Engineering 187: 451–457.
Knoll, D., Prüglmeier, M. and Reinhart, G. (2016). Predicting future inbound logistics
processes using machine learning, Procedia CIRP 52: 145–150.
Krauth, E., Moonen, H., Popova, V. and Schut, M. (2005). Performance indicators in
logistics service provision and warehouse management–a literature review and frame-
work, Euroma international conference, pp. 19–22.
Larco, J. A., De Koster, R., Roodbergen, K. J. and Dul, J. (2017). Managing warehouse
efficiency and worker discomfort through enhanced storage assignment decisions, In-
ternational Journal of Production Research 55(21): 6407–6422.
Laudon, K. C. and Laudon, J. P. (2015). E-commerce: Digital markets, digital goods, in
K. C. Laudon and J. P. Laudon (eds), Management Information Systems: Managing
the Digital Firm Plus MyMISLab with Pearson eText–Access Card Package, Prentice
Hall Press, chapter 10, pp. 415–425.
Mangano, G. and De Marco, A. (2014). The role of maintenance and facility management
in logistics: a literature review, Facilities 32(5/6): 241–255.
Minga, L. M., Feng, Y.-Q. and Li, Y.-J. (2003). Dynamic pricing: ecommerce-oriented
price setting algorithm, Proceedings of the 2003 International Conference on Machine
Learning and Cybernetics (IEEE Cat. No. 03EX693), Vol. 2, IEEE, pp. 893–898.
Nalić, J. and Švraka, A. (2018). Importance of data pre-processing in credit scoring
models based on data mining approaches, 2018 41st International Convention on In-
formation and Communication Technology, Electronics and Microelectronics (MIPRO),
IEEE, pp. 1046–1051.
Nastasi, G., Colla, V., Cateni, S. and Campigli, S. (2018). Implementation and compar-
ison of algorithms for multi-objective optimization based on genetic algorithms applied
to the management of an automated warehouse, Journal of Intelligent Manufacturing
29(7): 1545–1557.
Reyes, J., Solano-Charris, E. and Montoya-Torres, J. (2019). The storage location as-
signment problem: A literature review, International Journal of Industrial Engineering
Computations 10(2): 199–224.
Rezaei, H. R. (2012). A novel approach for safety stock prediction based on clustering
artificial neural network, Paul Bharath Bhushan Petlu Chaylasy Gnophanxay p. 107.
Schmenner, R. W. and Swink, M. L. (1998). On theory in operations management,
Journal of operations management 17(1): 97–113.
Solomatine, D. P. and Shrestha, D. L. (2004). AdaBoost.RT: a boosting algorithm for
regression problems, 2004 IEEE International Joint Conference on Neural Networks
(IEEE Cat. No. 04CH37541), Vol. 2, IEEE, pp. 1163–1168.
Staudt, F. H., Alpan, G., Di Mascolo, M. and Rodriguez, C. M. T. (2015). Warehouse
performance measurement: a literature review, International Journal of Production
Research 53(18): 5524–5544.
Tejesh, B. S. S. and Neeraja, S. (2018). Warehouse inventory management system using
iot and open source framework, Alexandria engineering journal 57(4): 3817–3823.
Van den Berg, J. P. and Zijm, W. H. (1999). Models for warehouse management: Classi-
fication and examples, International journal of production economics 59(1-3): 519–528.
Van Goor, A. R., van Amstel, W. P. and van Amstel, M. P. (2019). European distribution
and supply chain logistics, Routledge.
Vastrad, C. et al. (2013). Performance analysis of regularized linear regression models for
oxazolines and oxazoles derivative descriptor dataset, arXiv preprint arXiv:1312.2789.
Wirth, R. and Hipp, J. (2000). CRISP-DM: Towards a standard process model for data
mining, Proceedings of the 4th international conference on the practical applications of
knowledge discovery and data mining, Citeseer, pp. 29–39.
Zunic, E., Hasic, H., Hodzic, K., Delalic, S. and Besirevic, A. (2018). Predictive analysis
based approach for optimal warehouse product positioning, 2018 41st International
Convention on Information and Communication Technology, Electronics and Micro-
electronics (MIPRO), IEEE, pp. 0950–0954.
Zunic, E., Hodzic, K., Hasic, H., Skrobo, R., Besirevic, A. and Donko, D. (2017). Ap-
plication of advanced analysis and predictive algorithm for warehouse picking zone
capacity and content prediction, 2017 XXVI International Conference on Information,
Communication and Automation Technologies (ICAT), IEEE, pp. 1–6.

Thesis - Mechanizing optimization of warehouses by implementation of machine learning methodologies and pricing policies

  • 1. Mechanizing Optimization of Warehouses by Implementation of Machine Learning Methodologies and Pricing Policies MSc Research Project Data Analytics Shrikant Samarth Student ID: x18129137 School of Computing National College of Ireland Supervisor: Paul Liard
  • 2. National College of Ireland Project Submission Sheet School of Computing Student Name: Shrikant Samarth Student ID: x18129137 Programme: Data Analytics Year: 2019-2020 Module: MSc Research Project Supervisor: Paul Liard Submission Due Date: 11/08/2019 Project Title: Mechanizing Optimization of Warehouses by Implementation of Machine Learning Methodologies and Pricing Policies Word Count: XXX Page Count: 25 I hereby certify that the information contained in this (my submission) is information pertaining to research I conducted for this project. All information other than my own contribution will be fully referenced and listed in the relevant bibliography section at the rear of the project. ALL internet material must be referenced in the bibliography section. Students are required to use the Referencing Standard specified in the report template. To use other author’s written or electronic work is illegal (plagiarism) and may result in disciplinary action. Signature: Date: 28th January 2020 PLEASE READ THE FOLLOWING INSTRUCTIONS AND CHECKLIST: Attach a completed copy of this sheet to each project (including multiple copies). Attach a Moodle submission receipt of the online project submission, to each project (including multiple copies). You must ensure that you retain a HARD COPY of the project, both for your own reference and in case a project is lost or mislaid. It is not sufficient to keep a copy on computer. Assignments that are submitted to the Programme Coordinator office must be placed into the assignment box located outside the office. Office Use Only Signature: Date: Penalty Applied (if applicable):
  • 3. Mechanizing Optimization of Warehouses by Implementation of Machine Learning Methodologies and Pricing Policies Shrikant Samarth x18129137 Abstract In this fast-evolving world, demand and supply should behave complimentarily and the warehouse plays a vital role in connecting the two aspects. It is important to manage warehouses by providing better space optimization solutions to main- tain the inflow of the markets trending products to the depot. When this factor is turned a blind eye on, it gives rise to the stagnancy of products in the warehouse thus incurring losses to the warehouse manager. Heuristic and data analytics have been a major turn over factors to increase the efficiency of any warehouse storage. This research paper uncovers the subject of space optimization of warehouse and has a great potential to revolutionize the way machine learning can be implemented to increase the warehouses flexibility. By the implementation of linear regression models and the application of ensemble algorithms, there has been an outstanding competitive comparison of evaluation metric scores to predict the best performing algorithm. After conducting the final evaluation with different evaluation metrics like R2, mean absolute error, etc., gradient boosting emerged as the best performing algorithm among all other participants implemented in this research work. This can assist the warehouse manager to predict the sale-rate and calculate the blow-out period of the electronic products with better precision than heuristic analytics. Keywords: E-commerce, Warehouse Optimization, Space Optimization, En- hancement, Machine Learning, Pricing strategy, Blowout products 1 Introduction The E-commerce industry is the most vital and versatile industry when it comes to dealing with products online. It has the capability to attract customer attention even if the market for products is not strong. E-commerce is a shift from a specific way of thinking which has an equal effect on marketers and customers (Bhat et al.; 2016). It has a whitewash change over the traditional way of dealing with commerce. The vari- ous types of e-commerce business relationships are business-to-business (ex: Amazon business, Alibaba, etc.), business-to-consumer (ex: sellers on Walmart, etc.), consumer- to-business (ex: Google Adsense, etc.) and consumer-to-consumer (ex: eBay, etc.). The focus of this paper is a business-to-business area where Klodawski et al. (2017) has rightly pointed out the issues warehouse manager can face that would eventually result in over- stocking. Based on factors like the market trend, sale forecast, fast-moving products, 1
  • 4. etc., the warehouse manager orders the series of products. But not all products get sold and remaining products becomes stagnant. Klodawski et al.’s (2017) analysis research is purely heuristic-based which is a grand success against the small scale warehouse in- dustry. Whereas, the application of this analytics on a greater scale gives rise to revenue loss to the manager and product-stagnation which is termed as blow-out. Thus, it be- comes very important to understand the trend of the products to be ordered which takes into consideration not only the heuristic configurations but also some categorical values that have an equal and significant outcome. This is easily facilitated with the help of data analytical implementation in the field of the warehouse industry. 1.1 Motivation Being a part of the warehouse industry for more than 5 years, it has been understood that there is a level of complexity to superintend product vacancy in the warehouse. Auto- mation can be the next possible solution which will reduce manual errors and increase the efficiency of the warehouse. Automation, even if proven to be more delivery ori- ented than manual efforts, must be implemented with certain protocols to maximize pro- ductivity. Baker and Halim (2007) has explained about short-comings for non-productive automation implementation in warehouse management. Their findings indicate that an important reason for automation is to adjust growth with effective costing and service im- provement. There is an evident risk of discrepancy in service level failures, cost-ineffective and flexibility concerns. This can only be satisfied with the implementation of machine learning strategies. It is easy to implement and understand the working model of tradi- tional warehouse management, where the warehouse manager applies a level of previous experience and orders the number of products that would be recurring in the warehouse and volume will be used in an optimum way. But there is a great disadvantage in work- ing with larger warehousing when there is a need to deal with tonnes of a wide range of products, resulting in mismanagement of entities and monetary losses (Laudon and Laudon; 2015). To deal with these issues, Reyes et al. (2019) directed the analysts to face operations like quantity and volume of products, demand uncertainty and rapid cus- tomer service. Ai-min and Jia (2011) explained how optimization can be achieved with the assistance of genetic algorithm, replication of optimism with the help of decimal en- coding and weight addition method to evaluate fitness function and mutation. Artificial Intelligence can also play a vital role to understand different aspects of warehouse man- agement. Kartal et al. (2016) has specifically mentioned about the application of machine learning and artificial intelligence algorithms to study the patterns for stocked products in a repository. Knoll et al. (2016) studied the inbound logistics operations and imple- mented machine learning algorithms to predict different tactical strategies which are thus very difficult to manage through heuristics. Machine learning has the capability to go beyond the way to understand the sequence and pattern in the data and provide immense suggestions that are valuable in the scenarios of the live market. 
By understanding the benefits of machine learning strategies, this research work pedals in the direction of stock management for finding the time period patterns of blowout products, that would help warehouse manager to get an idea of stock clearance before products go out of trend and such products are no longer a concern of any warehouse. 2
  • 5. 1.2 Research Question ”Can machine learning improve the prediction of blowout time better than legacy heuristic methods to attain warehouse space optimization?” 1.3 Research Objective This research paper is a complete answer to the following set objectives for the research question: • Objective 1: Data Collection from a working organization. • Objective 2: Understanding the data through exploratory data analysis. • Objective 3: Performing pre-processing techniques on the collected data • Objective 4: Implement linear regression and ensemble models. • Objective 5: Conduct data evaluation on the algorithms to select the best per- forming model. • Objective 6: Calculate the blowout time period using the best performing model. • Objective 7: Execute dynamic pricing strategies to attain warehouse space op- timization. 2 Literature Review In the last two decades, logistic warehouse services have turned out to be significant player for the various businesses and the supply chain industries. New difficulties arose due to the advancement of the technology. So, to make a proper decision it is important to have knowledge of how decision to be made and what are the key parameters affecting the warehouse performance. To ensure inventory management under space restrictions, it is important to improve the decision making, storage allocation and scheduling the warehouse operations. Krauth et al.’s (2005) introduces a framework that clusters a performance estimation of various streams. Based on their empirical validation, they re- commend a list of performance indicators that could assist the warehouse management, reexamining their operations and compelling them to think apart from cost minimiza- tion. Effective decisions can only be made if the managers have an insight of these key parameters of warehouse operations. One such key indicator discussed is inventory man- agement which holds importance to make sure smooth warehouse operations. With the introduction in the online shopping, ordering has changed coordination of the production network that lead to a dramatic changes in the warehouse inventory management. Van den Berg and Zijm’s (1999) explained about the relation between inventory control decisions, product allocation and assignment problems by comparing various warehouse systems. Thus, identified with the sophisticated class assignment approach, higher warehouse ser- vice level and shorter response times could additionally help in saving more funds and warehouse space. Kar´asek (2013) further examined the issues regarding the warehouse layout which depends on the effective use of space. The real-world challenges such as collision of vehicles and shared employees were studied and a technique was introduced 3
  • 6. for achieving the optimization by utilizing the shop scheduling techniques combined with Vehicle Routing Problem solving techniques. In the competitive e-commerce sector where sellers assure in-time deliveries, there is a constant pressure of improving response time and maintaining trending products in the warehouse. As a result, an enthusiasm for new, complex arranging strategies is taking roots in this sector. The ideal warehouse operation is achieved when customers get their order in due time and when all the warehouse and logistic processes finish in the shortest possible time with minimum utilization of cost and resources under dynamically changing conditions. Kim (2018) studies different priority rules to improvise the delivery process based on the order time. The decision which priority rule involves a trade-off amongst a numerous performance attributes of the outcomes, for instance example, handling order volume, service level and operational cost (Chen et al.; 2010). A popular technique to help this type of decision is data envelopment analysis (Hackman et al.; 2001; De Koster et al.; 2012). But in 2014, Mangano and De Marco’s (2014) uncovers that there is a limited research done in the area of maintenance of logistic and operational warehouse performance, hence there is a need for further optimization in this area. To deal with this optimization, data scientific solutions prove to be a logical pick and have been illustrated in the further sections. 2.1 Efficiency Improvement for Warehouse Optimization In general, warehouse today is getting more complex as there is a constant pressure on the warehouse manager to deliver the orders in time with minimum cost. Therefore, variety of tools and performance of warehouse evaluation has also increased. The matrix that are used for evaluating the performance for different scenarios is not clear (Staudt et al.; 2015), for which warehouse manager have to perform regular analysis. Warehouses generally aims towards reducing cost optimizing warehouse performance and customer responsiveness. Estimating warehouse performance gives input about how the warehouse performs compared with the requirements and with peers. Johnson and McGinnis (2010) discusses that while evaluating warehouse performance technical criteria (that generates output after utilizing resource) gives more clear picture than the financial criteria, as warehouse most of the times does not generates revenues as there are many loses involves regarding the stagnant products (blowout products) which have the slow movement than predicted. On the other hand, Staudt et al.’s (2015) found that there are very few re- search work done in cost related performance indicator than other operational perform- ance indicators (i.e time, quality and productivity). Usually warehouses are integrated part of large supply chains, traditional performance evaluation such as productivity, qual- ity delivery are more applicable (Schmenner and Swink; 1998; Boyer and Lewis; 2002). The researchers mainly focus on technical performance evaluation measurement, such as order line pickup per hour/person, related products, shipment errors and special request orders (Van Goor et al.; 2019). The issue with these indicators is that they are not inde- pendent indicators and that each of them relies upon various input indicators (De Koster and Balk; 2008). 
These literature studies mostly focuses on technical aspects of the warehouse but very few research work present on the warehouse Financial aspects which depends on the warehouse product management. The above studies helps us to understand the direction of research work for further improvisation in blowout product identification that could help in maintaining financial goals of warehouse. The next section describes the various 4
  • 7. automation techniques utilize for optimizing warehouse performance in different areas of warehouse management. 2.2 Automation Today in Warehouse Optimization There are many research that work on different strategies to automate the various areas of the warehouse for reducing the time and to organize the warehouse structure. Zunic et al. (2018) worked on optimal product placement for an operational warehouse where no optimization mechanism was implemented before. They developed algorithm that cal- culates the grade of an item by considering current product placement and frequency of a product. As a result, there was a 17.37% reduction in the average picking length for more than 40 orders and the picking process consumes 50% of the overall total cost. Whereas, in his earlier work, Zunic et al. (2017) uses a statistic based fitting algorithm for optimal strategic method and quantitative product placement in the warehouse picking zone with the help of a real-world case study. This study addresses the issue that if the product is not present in the picking zone, the workers get dependent on the forklift to take down the total palettes from the top racks which ultimately decreases the warehouse effi- ciency. The result shows that items to be moved to the picking zone with average 91.73% shifted median accuracy which indicates any improvement in this area substantially im- proves the efficiency of the warehouse. Both papers worked on the effective item picking strategy based on the distance, product placement, and its product’s order frequency, but have not considered auto-replenishment products placement and storage assignment problems.Chan and Chan’s (2011) research paper, worked on a real-world case study and addresses this issue by implementing class-based storage on manual order picking for multilevel racks distribution for a real-world case study. Based on travel distance and order retrieval, time performance is measured. Results obtained in this paper show that a combination of different factors has different performance indicators. As a future work Chan and Chan (2011) suggests to work on products congestion problem; this congestion can be avoided if products are identify that take long retrieval time from the warehouse than expected. The stagnancy issue faced by Chan and Chan (2011) has been resolved in this research paper by identifying and informing the warehouse manager about such products beforehand so that necessary actions could be taken in the first place. With the emergence of the e-commerce sector and the worldwide order flooded the distribution center, made the order fulfillment job more complex. Thus, there was a need for automation which was solved by utilizing the Kiva System for order fulfillment. Amazon back in 2012, introduced a Kiva system in the e-commerce warehouses (Bogue; 2016). This robotic system utilizes hundreds of robots that move around the warehouse for picking the product and make it ready for delivery. Now in recent years, for the process of automation Tejesh and Neeraja’s (2018) paper addressed the issue of finding the product in the warehouse as it required manual searching, thus in such cases warehouse inventory management system which utilizes RFID comes to rescue. Tejesh and Neeraja (2018) developed a warehouse inventory management system based on the Internet of Things (i.e IOT) for tracking products with its timestamp for verifying products. 
All the information was made available on a web page interface for users which can be used dynamically for getting remote information. All the mentioned research papers work towards optimization of the warehouse using various automation techniques, but these papers lack the detection and maintenance of effective warehouse inventory. These limitations and the benefits of machine learning 5
  • 8. implementation are identified. In the next section, some of the research works are drafted that have adopted machine learning techniques, and work in different areas to improve warehouse performance. 2.3 Warehouse Management using Machine Learning This section is about the overview and critical analysis of major research work that uses supervised and unsupervised machine learning techniques in the different fields of a warehouse. 2.3.1 Using Artificial Neural Network (ANN) For any optimal warehouse, the priority for operations is to be updated and reorder its warehouse inventory. There are various researches done for calculating reordering points using heuristic methods. But calculating these reordering points of all products using mathematical function is time consuming (Inprasit and Tanachutiwat; 2018); Thus, Inprasit and Tanachutiwat’s (2018) implemented artificial neural network (ANN) for re- ordering point determination and safety stock management. Various algorithms were used for training, testing, building and for comparison of all products based on factors such as lead time, demand, standard deviation (SD) of demand, SD lead time and service. As a result, ANN gave the most accurate results with MSE 0.03 × [10−4 ] and 0.999 adjusted R2 value for overall data. Rezaei (2012) uses ANN along with clustering techniques for prediction of safety stock inventory. In this paper, prediction model was developed with the combination of clustering K-mean and multi-layer perception methods; moreover for data reduction in input vector, sensitivity analysis was applied that improves the accur- acy of prediction by ANN in identifying the safety stock. The above two papers worked on the reordering point determination and for the prediction of safety stock, but even after stock prediction, determination of product stagnancy can improve the gap after the stock prediction and storage of new stock. 2.3.2 Using Genetic Algorithm A unique approach was used by Nastasi et al. (2018) on an existing steel-making ware- house for multi-objective optimization of storage strategies. For this purpose, three ge- netic algorithms i.e Niched Pareto Genetic Algorithm, the Non-dominated Sorting Ge- netic Algorithm II and the Strength Pareto Genetic Algorithm II were implemented; along with these algorithms, traditional reordering and product allocation procedures have also been implemented in this research work. The issue of warehouse optimization was tested on every algorithm by exploring the simulation system1 which was presented in the paper. Ai-min and Jia (2011) worked for pharmaceutical logistics center, mentions the issue with such medical storage centres need special attention in slotting optimiz- ation, as requirement of medicines change according to the season or the influence of flue. Hence to deal with such issues Ai-min and Jia (2011) utilize MATLAB and genetic algorithm to resolve multi objective optimization issues of pharmaceutical logistic center. The overall results shows that the approach is effective but needs further improvement for real world scenarios as this paper uses very few goods in this approach. 1 The approach to understand the simulation process is provided under the section Numerical Approach in (Nastasi et al.; 2018) 6
  • 9. 2.3.3 Using Regression Models Faber et al. (2018) uncovers the fit between warehouse management structure and per- formance. A hypothesis was developed and tested among 111 storage warehouses of Belgium and Netherlands using linear regression model. To ensure the robustness of re- gression results, bootstrapping method was employed which hardly changes the p-value except for food products dummy variable section. The ordinary least squares (OLS) re- gression model performs well and statistically valid for drawing conclusion. Faber et al. (2018) mentions the results obtained from this model could be further used by warehouse manager for selecting appropriate planning system. To check the performance of linear regression models, Vastrad et al. (2013) applies methods lasso, ridge, elastic net which uses penalty-based shrinkage to handle data set with more predictors than observations. It has been found that, elastic-net and least angle regression (LARS) gives similar results but lasso outperformed ridge regression in terms of R2 and RMSE. Whereas, de Santis et al. (2017) focuses on identifying the material back order is- sue prior to its occurrence, provides the business sufficient time to take actions that could increase its overall performance. This study addresses, a predictive model for class imbalance issue, where frequency of products which gets backorder is low compare to products that do not. Initially, logistic regression (LOGIST) algorithm was used to create a baseline score for this issue followed by classification tree (CART) for outlier ro- bustness and feature selection. Machine learning techniques like Random forest, Gradient Boost and Blagging (under bagging) were then applied. Result gives RF = .9441 (area under curve) score by utilizing bagging ensemble, Gradient Boosting = 0.9481 (area under curve) score. Practically, BLAG identifies 60% of positive class items and 20% products that could become backorder. In the same year, Larco et al. (2017) uses linear regression for managing the workers discomfort through warehouse storage assignment decisions. The results suggests the 21% in terms of cycle time improvement in picking zone for the process of warehouse optimization.These research works rightly point towards adapting linear regression and ensemble techniques for this experiment. From all above studies, it has been understood that there are very few research works that implement warehouse space optimization by identifying products which could pos- sible go out of trend. Moreover, the use of electronic warehouse data-set is very unique and has not been implemented by any other project. The next section gives the idea of effective dynamic pricing, which is important for this project once the blow out products are identified. 2.4 Dynamic Pricing for Warehouse Maintenance In any supply chain industry, effective pricing strategy is an important aspect to provide products a good momentum in the market depending upon market scenarios especially for the B2C electronics industry; where fast changing trends and competitive market affects the sale of products. Minga et al. (2003) proposes a price setting algorithm which is an application of dynamic price setting strategy for stock optimization of warehouse. The proposed algorithm for demand sensitive model helps warehouse manager to maximize the profit while decreasing the marginal cost with increase in quantity ordered. It further mentions some websites which allows buyers to register and let seller to set a minimum threshold. 
When enough buyers register on the website, the price drops to just above the threshold, which helps the seller make more money. The proposed strategy can be reused in this research work to sell the blow-out products effectively once they have been identified.
Ancarani and Shankar (2004) propose a hypothesis on how costs and price dispersion compare between traditional retailers and multi-channel retailers, and test it through statistical analysis. A price-sensitive multidimensional recommender system (PSRS) and collaborative filtering were used. The research shows how price setting affects business performance and can inform selling products in bulk once blow-out products are identified. After a thorough analysis of the literature across these sectors, the adopted methodology is drafted in detail in the next section (section 3) to answer the research question stated in section 1.2.

3 Research Methodology and Specification

Once the literature is understood, it is important to chalk out the methodology of the research. For this experiment it is essential to understand the business perspective of the data. The data is then modelled with different regression strategies and evaluated to achieve the ultimate result of finding the blow-out time. CRISP-DM turns out to be the most suitable methodology for this process, as the process steps map onto the typical phases described in the CRISP-DM guide. It also has the advantage of making large data mining projects cheaper, more manageable and more reliable (Wirth and Hipp; 2000). The process is shown in Figure 1 below.

Figure 1: Modified CRISP-DM Process Model, Data Source: Wirth and Hipp (2000)
3.1 Business Understanding

The most important step is to understand the business for which this product is targeted. Due to increasing customer demand, the business has to be more flexible and provide solutions that are both user-friendly and business-friendly. From the literature review it is understood that warehouse space is limited, and stagnant products make the problem worse: they take longer than usual to sell, blocking revenue and space and resulting in losses for the warehouse management. This research work focuses on a space optimization solution for the warehouse manager, so that the manager does not have to bear losses on products that have gone stale. Producing the blow-out time of the electronic products therefore becomes the business goal of this research work.

3.2 Data Understanding

Data collection is an important step, as it initiates the execution of the plan. The data for the research should be live operational data, so that it reflects the issues warehouse controllers face on a regular basis. This data understanding phase is divided into the following three sections.

3.2.1 Data Collection

The data is collected from a live working organization to support the objective of assisting organizations where heuristics are currently applied to estimate the blow-out duration of products. The data was obtained from a start-up e-commerce organization, Price Save, which mainly deals in electronic products. It covers roughly 4,200 unique products across 3 months and contains parameters such as cost price, selling price and profit, plus an additional engineered feature called Sale Rate, which is explained in the data preparation phase.

3.2.2 Data Ethics

To maintain the confidentiality of the data, a prior consent letter was obtained to use the data for educational and research purposes. The collected data was gathered only with the business interest in mind. The letter of consent is attached in the configuration manual.

3.2.3 Exploratory Data Analysis

The collected data is summarized using descriptive statistics such as the mean and standard deviation. The summary of some columns is shown in Figure 2 below.
Figure 2: Column Summary

The purpose of this summary is to identify, at an early stage, any anomalies in the data so that they can be removed.

1) Validation of Data Linearity: The scatter plot in Figure 3 shows the distribution of the columns and indicates that the column data is not linear.

Figure 3: Scatter-plot

2) Multicollinearity: Multicollinearity arises when features are strongly dependent on each other. In the dataset under consideration, cost price, profit and price are strongly dependent features. To understand their interdependence, the correlation matrix has been plotted for these features (Figure 4); a short sketch of how this check might be reproduced follows the figure.

Figure 4: Correlation Matrix
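As a minimal illustration (not the project's actual code), the column summary and correlation check above could be reproduced in pandas roughly as follows; the file name and the column names cost_price, price and profit are assumptions made for the sketch.

```python
import pandas as pd

# Load the warehouse extract; file name and column names are illustrative
# assumptions, not the actual project artefacts.
df = pd.read_csv("warehouse_products.csv")

# Column summary (mean, standard deviation, quartiles), as in Figure 2.
print(df[["cost_price", "price", "profit"]].describe())

# Correlation matrix for the strongly dependent features, as in Figure 4.
print(df[["cost_price", "price", "profit"]].corr())
```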
From the correlation matrix above, it is clear that the cost price, profit and price columns are correlated. The profit column was calculated from the price (i.e. the selling price) and the cost price. To improve the efficiency of model training, the profit feature has therefore been dropped, and the models are fitted using the cost price and price features, as these are important parameters for any product and are helpful in training the models.

3.3 Design Specification

A key aspect of any project is to understand it from both the logical and the business point of view. This section therefore gives an insightful overview of the project; the steps involved are shown in Figure 5 using a two-tier architecture.

Figure 5: Project Flow

3.4 Data Preparation and Transformation

This phase mainly deals with pre-processing the collected data. As stated earlier, Sale Rate is used as an additional parameter; its sole purpose is to calculate the blow-out time of the products in the heuristic analysis. To create this column, a master file was built by assimilating per-day data over 90 days, recording the inventory and the day number on which the stock sold out. If certain products record no sale over the 90 days, they are prioritised for clearance and labelled "infinite blowout" to distinguish them from the other products. The formula for calculating the Sale Rate is given below:

Sale Rate = (Initial Inventory − Remaining Inventory) / N    (1)

where N is the number of the day on which the stock sold out.
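A minimal sketch of how the Sale Rate column of equation (1) might be derived from the daily master file, assuming a pandas implementation; the column names (initial_inventory, remaining_inventory, sellout_day) and the exact encoding of the "infinite blowout" flag are illustrative assumptions based on the description above.

```python
import pandas as pd

def add_sale_rate(df: pd.DataFrame) -> pd.DataFrame:
    """Add a Sale Rate column following equation (1); column names are assumed."""
    out = df.copy()
    sold_units = out["initial_inventory"] - out["remaining_inventory"]

    # N is the day number on which the stock sold out.
    out["sale_rate"] = sold_units / out["sellout_day"]

    # Products with no sales over the 90-day window are flagged for clearance
    # ("infinite blowout") rather than given a meaningful sale rate.
    out["infinite_blowout"] = sold_units == 0
    out.loc[out["infinite_blowout"], "sale_rate"] = 0.0
    return out
```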
It is very important to pre-process the data before fitting the models, as the nature of the information and the data quality have a direct influence on how well a model can adapt (Nalić and Švraka; 2018). The following factors were taken into consideration before modelling.

3.4.1 Managing NULL values

The data was checked for null values. Where present, these were replaced with a default value that does not make any significant change to the database but also does not mislead the applied models into irrelevant outputs.

3.4.2 Handling missing values

Missing values cause major issues when applying machine learning algorithms, as the data becomes prone to outliers and the results become inaccurate (see https://towardsdatascience.com/introduction-to-data-preprocessing-in-machine-learning-a9fa83a5dc9d). This was handled by replacing missing values with default values that do not add much variance to the dataset.

3.4.3 Dealing with categorical values

Categorical values are discrete rather than continuous, and the chosen algorithms require numeric inputs, so categorical features cannot be used directly. There are two types of categorical values, ordinal and nominal. As the dataset is free of ordinal variables, only nominal variables have been taken into consideration.

• One-Hot Encoding: The dataset contains many categorical values that are needed to improve the accuracy of the machine learning models. During feature engineering, categorical data was treated as nominal and numerical data as ordinal. One-hot encoding was performed on the nominal data: the nominal features are expanded into n binary columns, where n is the number of unique category combinations. As there are 458 distinct category combinations in the nominal data, 458 additional columns were added. The data after encoding is shown in Figure 6.

Figure 6: One-hot encoding
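A short sketch of the pre-processing steps described in this phase, assuming a pandas implementation; the column names (category, brand, cost_price) and the default fill values are illustrative assumptions.

```python
import pandas as pd

df = pd.read_csv("warehouse_products.csv")

# Null / missing values are replaced with neutral defaults so that they do not
# mislead the models (assumed defaults for the sketch).
df["cost_price"] = df["cost_price"].fillna(0.0)
df["category"] = df["category"].fillna("unknown")

# One-hot encode the nominal features: each unique category combination
# becomes its own binary column.
encoded = pd.get_dummies(df, columns=["category", "brand"])
print(encoded.shape)
```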
3.5 Modelling

From the literature surveyed in section 2, it was seen that the identification of blow-out products can be performed using regression and ensemble methods. The objective is to predict the continuous dependent variable Sale Rate, and hence the blow-out duration, from independent variables such as category and price, using the best performing algorithm. Because the dependent variable is continuous, linear regression analysis is used rather than logistic regression. For this purpose, LASSO, ridge, elastic net, gradient boosting, AdaBoost and random forest regressors have been modelled.

3.5.1 Linear Regression

1) LASSO (Least Absolute Shrinkage and Selection Operator):
LASSO is a linear regression method that performs variable selection to increase the predictive accuracy of the statistical model. It uses shrinkage, pulling coefficient values towards zero, and penalises the sum of the absolute values of the weights. The objective of the lasso is to select a subset of predictors while minimising the prediction error for a quantitative variable. It reduces model complexity and helps avoid overfitting.

2) Ridge Regression:
Ridge regression estimates are usually little affected by small changes in the data on which the regression is fitted. It penalises the sum of the squared values of the weights, so the weights are more evenly distributed and closer to zero. Ridge regression is generally used to deal with multicollinearity in a dataset, as it dampens the effect of strongly correlated predictors; this is why it is part of this research work.

3) Elastic Net:
Elastic net is a linear regression model that combines the penalties of both lasso and ridge: the absolute penalty and the squared penalty are both included and balanced by an additional coefficient ratio. It is reported to work best on larger, scaled datasets (on the order of 100K rows or more), and since it is an improvement over lasso and ridge it is a reasonable candidate to test here.

3.6 Ensemble Models

1) Gradient Boosting Regressor:
Boosting is a procedure that converts weak learners into strong learners. Gradient boosting trains models in an additive, sequential manner. It identifies the limitations of weak learners through the gradients of the loss function, which indicates how well the model coefficients fit the underlying data. Ke et al. (2017) popularised a highly efficient gradient-boosted decision tree by proposing two novel techniques,
namely Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB). Given the reported training cost of around 0.28 seconds per training iteration, this family of algorithms should suit the warehouse data well.

2) AdaBoost Regressor:
The AdaBoost regressor is a meta-estimator that starts by fitting a regressor on the dataset and then fits additional copies of the regressor on the same dataset, with the weights of the observations adjusted according to the error of the current prediction; initially every observation receives an equal weight. Solomatine and Shrestha (2004) compared M5 model trees with AdaBoost.RT and found that AdaBoost.RT performed better on most of the considered datasets. They showed empirically that AdaBoost performs well even in the presence of noise in the data and is better than an individual machine with a confidence level above 99%. Considering all these factors, it is a strong candidate for the warehouse data.

3) Random Forest Regressor:
Random forest is a bagging ensemble that constructs a collection of decision trees during training and aims to reduce variance by building each tree on a random sample of the data and features, which de-correlates the trees. It is an ensemble of regression trees and is used here for non-linear multiple regression.

All of the above algorithms were selected because of their strong performance in their respective application areas. The main motive for choosing linear regression and ensemble techniques is that the target variable is continuous rather than discrete.

3.6.1 Test Design

Before building the model, it is important to decide which splitting technique is best suited to the dataset. The two main options are the train-test split and the train-validation-test split, and the latter has several proven advantages over the former. Guyon (1997) provides a scaling law relating the ratio of validation-set size to training-set size to the validation error and the error rate on the training set. The split used for the warehouse dataset is 80% train, 10% validation and 10% test. After splitting, the models are built and evaluated, and the three best performing algorithms are tuned using a grid search over their parameters.
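The modelling and test-design steps could be sketched as below, assuming a scikit-learn implementation with default hyper-parameters (the actual tuned parameters are not reproduced here); the target column name sale_rate and the encoded feature table from the earlier preparation sketch are assumptions.

```python
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Lasso, Ridge, ElasticNet
from sklearn.ensemble import (AdaBoostRegressor,
                              GradientBoostingRegressor,
                              RandomForestRegressor)

# Features and target taken from the earlier preparation sketch (assumed names).
X = encoded.drop(columns=["sale_rate"])
y = encoded["sale_rate"]

# 80% train, 10% validation, 10% test.
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.20,
                                                  random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.50,
                                                random_state=42)

models = {
    "lasso": Lasso(),
    "ridge": Ridge(),
    "elastic_net": ElasticNet(),
    "gbm": GradientBoostingRegressor(),
    "adaboost": AdaBoostRegressor(),
    "random_forest": RandomForestRegressor(),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, model.score(X_val, y_val))  # R2 on the validation subset
```

The best performing candidates could then be tuned with scikit-learn's GridSearchCV, in line with the grid search step mentioned above.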
3.7 Evaluation

When preparing a model, it is important to consider how it generalises to unseen data. In this research, a hold-out strategy is used, which is the simplest form of cross-validation and requires less computational power than k-fold cross-validation while still providing a reasonably unbiased estimate of learning performance. The dataset is divided into three subsets: train (80%), validation (10%) and test (10%). All models are first evaluated on the validation set, which provides a platform for fine-tuning the models' parameters and selecting the best performer, and predictions from the best performing model are then calculated on the test set. While training, the performance of the models was observed by iterating over the data in batches, and the models were compared on these evaluations at testing time. To validate further, modelling was performed separately on products that were sold, and for the unsold products the best algorithm was tested to check its R2 score. The corresponding graphs are shown in the subsections below.

3.7.1 R-Square Criterion

The coefficient of determination (R2) is a regression statistic that determines the proportion of variance in the dependent variable explained by the predictor variables. R2 varies between 0 and 1, where 1 means the model fits the data perfectly. The formula for R-square is:

R2 = Variance explained by the model / Total variance    (2)

The graphs obtained after modelling are shown below.

Figure 7: R2 score - full dataset
Figure 8: R2 score - soldout dataset

From Figure 7 and Figure 8, it is clear that GBM outperforms all other algorithms on both the full dataset and the sold-out dataset.

3.7.2 Mean Absolute Error

As the data under use is continuous, the differences between predicted and actual values of the continuous variable can be measured with the mean absolute error. MAE shows how large an error can be anticipated from the forecast on average; it is the average distance between each point and the line of equality. The formula for MAE is:

MAE = (1/n) Σ |x_i − x|
where n is the number of errors and |x_i − x| is the absolute error between the predicted and actual values.

Figure 9: MAE - Full dataset
Figure 10: MAE - soldout dataset

All six models were evaluated on both datasets using the mean absolute error. Gradient boosting has the lowest MAE on the sold-out dataset, whereas the random forest regressor gives the lowest MAE on the full dataset.

3.7.3 Mean Squared Error

The mean squared error indicates how close the data points are to the regression line; it is computed from the squared distances to the regression line. Squaring removes the sign of the deviations and gives more weight to large differences. The smaller the MSE, the closer the points are to the line of best fit. The comparison of MSE across all algorithms is shown in the graphs below.

Figure 11: MSE - full dataset
Figure 12: MSE - soldout dataset

From Figure 11 and Figure 12, it can be seen that the RFR gives the lowest error on the full dataset, but GBM shows the least error on the sold-out dataset.

3.7.4 Median Absolute Error

The median absolute error is very robust to outliers. The loss is calculated as the median of the absolute differences between prediction and target; a lower value indicates higher accuracy. The built models were evaluated using the median absolute error, and the algorithm comparison is plotted below.

Figure 13: Median absolute error - full dataset
Figure 14: Median absolute error - soldout dataset

From Figure 13 and Figure 14, RFR shows the lowest median absolute error on the full dataset, while GBM has the lowest error on the sold-out dataset.

3.7.5 Variance Score

The variance score, or explained variance, measures how much of the variation in the dataset is captured by the model; it quantifies the discrepancy between the predicted and actual values. Higher explained variance indicates a stronger association. The built models were evaluated and compared using the variance score, as shown below.
Figure 15: Variance score - full dataset
Figure 16: Variance score - soldout dataset

From the comparison above, RFR shows a stronger association than the rest of the algorithms on the full dataset, but GBM shows better results on the sold-out one. Looking at all the evaluations on both datasets, RFR shows a lower error rate than the other algorithms, but the gap between its train and test scores is wide for most of the evaluations. Checking the same condition for the remaining algorithms, GBM combines a low error rate with a smaller train-test score difference, which makes it the best suited for this application. The GBM model was further tested against the unsold subset to compare performance: its test R2 score drops relative to the full dataset while the mean squared error increases, which can be attributed to the subset containing the noise, i.e. the outliers, of the full dataset. The R2 and MAE graphs are plotted in Figure 17 below.

Figure 17: Full data vs unsold subset
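Continuing the earlier modelling sketch, the five evaluation measures used above map directly onto scikit-learn metric functions; the helper below is an illustrative assumption, not the project's actual evaluation script.

```python
from sklearn.metrics import (explained_variance_score, mean_absolute_error,
                             mean_squared_error, median_absolute_error,
                             r2_score)

def evaluate(model, X_eval, y_eval):
    """Compute the five evaluation metrics discussed above for one fitted model."""
    y_pred = model.predict(X_eval)
    return {
        "r2": r2_score(y_eval, y_pred),
        "mae": mean_absolute_error(y_eval, y_pred),
        "mse": mean_squared_error(y_eval, y_pred),
        "median_ae": median_absolute_error(y_eval, y_pred),
        "explained_variance": explained_variance_score(y_eval, y_pred),
    }

# Example: compare every fitted model from the earlier sketch on the validation set.
scores = {name: evaluate(m, X_val, y_val) for name, m in models.items()}
print(scores)
```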
3.8 Deployment

This section gives an overall picture of the implementation of the project, as well as how the prediction of the blow-out time period and the application of pricing strategies are implemented to achieve the objective of automating the warehouse for space optimization.

3.8.1 Implementation

To identify the blow-out time period accurately using different machine learning algorithms, the data was first collected, following proper ethics, from a start-up organization named Price Save; the consent letter is provided in the configuration manual of this project. The dataset contains around 4,200 unique products per day over a period of 3 months. Exploratory data analysis was performed first, as explained in section 3.2. After understanding the data, pre-processing techniques such as scaling, one-hot encoding and feature extraction were applied (explained in section 3.4). The data was split into train, test and validation sets (80%, 10% and 10% respectively) to maintain the integrity of the dataset, and all six models were trained. The models were first evaluated on the validation set by iterating over the dataset in batches to observe their performance. Based on the R2 score, the best model was further tuned to improve the scores; unfortunately, tuning did not find better parameters and the resulting R2 scores were worse than those obtained on the training and validation runs, so the models trained on the validation setup were retained for predicting scores on the test set. A new column, "Sold Status", was then added to give a logical value for whether a product's stock was exhausted. The sold subset was trained against all the models using the same procedure as for the full dataset, and the results of both experiments were compared to find the best performing algorithm. After this comparative evaluation, the gradient boosting regressor was identified as the best performer in terms of R2 and the other metrics explained in section 3.7. The trained GBM model was then tested against the unsold dataset to calculate the comparative error rate. Finally, the best algorithm was used to predict the blow-out time period for all products; the application of effective pricing strategies is explained in sections 3.8.2 and 3.8.3.

3.8.2 Prediction of Blowout Time-Period

This is the application of the machine learning strategy for the benefit of the warehouse industry. Identifying the blow-out time period not only saves the warehouse manager's time but also prevents the warehouse from overstocking, which ultimately helps with space optimization. The best performing algorithm identified in the analysis above is used to predict the sale rate, which is derived from the sales obtained per day. The blow-out day can then be calculated using the formula:

Blowout Day N = Initial Inventory / Predicted Sale Rate    (3)

where the predicted sale rate is obtained from the best performing algorithm. The results after calculating the blow-out with the best performing algorithm are shown in the output below: y_pred is the blow-out predicted with the gradient boosting algorithm, and blowout_test is calculated with the heuristic method.

Figure 18: Blowout prediction

It can be seen from the blow-out prediction output above that the blow-out calculated through the application of the machine learning algorithm is far more accurate than the heuristic one. Thus, after identifying the time required for every product to be exhausted, a warehouse manager can apply pricing strategies to release these products from the warehouse effectively. The pricing strategy is explained in the next section.
3.8.3 Application of Efficient Pricing Strategies for Warehouse Optimization

The application of pricing strategies is an additional layer of the project, following the prediction of the blow-out time period for each product with the best performing algorithm as explained in section 3.8.2. This part is important for achieving the objective of automating the clear-out of warehouse products with an optimal pricing strategy. As discussed in section 2.4 of the literature survey, a dynamic price-setting strategy can be applied using the price-setting algorithm of Minga et al. (2003). If a predicted blow-out product has a large quantity still to be sold and a long predicted time period, the profit margin can be dropped further, down to a threshold limit, until the product regains its momentum. When dealing with a wide range of products, it is not feasible to price every product individually, and such products are prone to overstocking. The method below can therefore be used to price all of these products at once, leaving the manager free to focus on other warehouse products for generating profit. The formula uses a strategy of setting the price as a percentage above the cost price: a minimum profit is defined first, and then, by trial and testing, a certain amount of profit percentage is dropped each time until the product gains sales momentum. This is a method devised and tested by the author while working in a previous organization. The cost price consists of several components: the cost of the product, shipping, warehouse fees (cost of warehouse) and website commission (e.g. a third-party e-commerce website). The idea is explained using the example formula below:

Example Formula:
Selling Price = (1 + 15/100) * (product cost + shipping + commission + warehouse fees)    (4)

This means the selling price is set at 15% above the cost price; i.e. the formula gives the value to set in order to take 15% above the cost price. By changing the percentage, a selling price can be obtained for each product. Hence, in general:

Selling Price = (1 + n/100) * (total cost price involved)    (5)

where n is the percentage to be set. In this way, mass pricing can be applied through the master file for products that are slow moving (blow-out products).
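Equations (3) to (5) can be illustrated with the short sketch below; the inventory, sale rate, cost and markup values in the example are hypothetical and are used only to show how the formulas combine.

```python
def blowout_day(initial_inventory: float, predicted_sale_rate: float) -> float:
    """Equation (3): days needed to exhaust the stock at the predicted sale rate."""
    if predicted_sale_rate <= 0:
        return float("inf")  # no predicted sales: "infinite blowout"
    return initial_inventory / predicted_sale_rate

def selling_price(total_cost: float, markup_percent: float) -> float:
    """Equations (4)/(5): price the product at n percent above the total cost price."""
    return (1 + markup_percent / 100.0) * total_cost

# Example: a slow-moving product with 120 units in stock, a predicted sale rate
# of 1.5 units/day, and a total landed cost of 80 priced at a 15% markup.
print(blowout_day(120, 1.5))    # 80.0 days to clear at the predicted rate
print(selling_price(80.0, 15))  # 92.0
```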
4 Discussion

In this research work on applying machine learning strategies for optimizing warehouse space, it has been found that machine learning methods perform better at predicting the sale rate, which helps this experiment identify the blow-out period. Machine learning algorithms such as elastic net, lasso, ridge, gradient boosting, AdaBoost and the random forest regressor were applied to the full dataset and evaluated on different metrics. Among all the algorithms, gradient boosting performs best with R2 = 0.9197 and a mean squared error of 0.177, followed by the AdaBoost regressor with an R2 of 0.9064 and an MAE of 0.2120. The trained GBM model was tested against the unsold subset to validate its performance on products that remain unsold; the results show a reduction in R2 compared with the R-squared obtained on the full dataset. Moreover, all the algorithms were retrained and tested on the sold subset, and the results improved because the noise, i.e. the unsold data containing outliers, was removed from the dataset. This study will help the warehouse manager understand the pattern of unsold products when re-ordering for the next iteration, and apply dynamic pricing strategies to optimize the warehouse space effectively. The blow-out times calculated heuristically and those obtained with GBM differ; some examples are plotted in Figure 19 below. At present the differences in blow-out duration are not large, but in the long run heuristics (i.e. decisions based on experience and knowledge) will not remain effective, whereas GBM will keep improving as the amount of pattern data and the number of products grow. Thus, the automation can be considered fruitful and executable for warehouse space optimization.

Figure 19: Blowout ML vs actual blowout

5 Conclusion and Future Work

A competitive comparison was conducted between heuristic analytics and different machine learning strategies. It can be concluded that heuristics can prove very effective for small-scale organizations, but to handle the amount of data generated by large organizations, only machine learning strategies provide a stepping stone to better results. The experiment shows that warehouse space optimization can be achieved through the implementation of machine learning strategies with a good degree of accuracy. There is always scope for improvement in the research work: the most significant would be the provision of factors such as varied seasonal data, an increased number of products and additional sales domains. These factors would improve the quality of the data, which in turn would train the algorithms further and yield more accurate predictions of products' blow-out times. There are also studies that apply deep learning, for example artificial neural networks, to obtain higher accuracy; however, for this research the amount of data is inadequate and not exposed to seasonal variation, so such models could not be built and validated reliably. Nevertheless, space optimization is achievable, and this research paper adequately answers the research objective with success.
References

Ai-min, D. and Jia, C. (2011). Research on slotting optimization in automated warehouse of pharmaceutical logistics center, 2011 International Conference on Management Science & Engineering 18th Annual Conference Proceedings, IEEE, pp. 135–139.

Ancarani, F. and Shankar, V. (2004). Price levels and price dispersion within and across multiple retailer types: Further evidence and extension, Journal of the Academy of Marketing Science 32(2): 176.

Baker, P. and Halim, Z. (2007). An exploration of warehouse automation implementations: cost, service and flexibility issues, Supply Chain Management: An International Journal 12(2): 129–138.

Bhat, S., Kansana, K. and Khan, J. (2016). A review paper on e-commerce, Asian Journal of Technology & Management Research [ISSN: 2249–0892] 6(1).

Bogue, R. (2016). Growth in e-commerce boosts innovation in the warehouse robot market, Industrial Robot: An International Journal 43(6): 583–587.

Boyer, K. K. and Lewis, M. W. (2002). Competitive priorities: investigating the need for trade-offs in operations strategy, Production and Operations Management 11(1): 9–20.

Chan, F. T. and Chan, H. K. (2011). Improving the productivity of order picking of a manual-pick and multi-level rack distribution warehouse through the implementation of class-based storage, Expert Systems with Applications 38(3): 2686–2700.

Chen, C.-M., Gong, Y., De Koster, R. B. and Van Nunen, J. A. (2010). A flexible evaluative framework for order picking systems, Production and Operations Management 19(1): 70–82.

De Koster, M. and Balk, B. M. (2008). Benchmarking and monitoring international warehouse operations in Europe, Production and Operations Management 17(2): 175–183.

De Koster, R. B., Le-Duc, T. and Zaerpour, N. (2012). Determining the number of zones in a pick-and-sort order picking system, International Journal of Production Research 50(3): 757–771.
de Santis, R. B., de Aguiar, E. P. and Goliatt, L. (2017). Predicting material backorders in inventory management using machine learning, 2017 IEEE Latin American Conference on Computational Intelligence (LA-CCI), IEEE, pp. 1–6.

Faber, N., De Koster, R. B. and Smidts, A. (2018). Survival of the fittest: the impact of fit between warehouse management structure and warehouse context on warehouse performance, International Journal of Production Research 56(1-2): 120–139.

Guyon, I. (1997). A scaling law for the validation-set training-set size ratio, AT&T Bell Laboratories, pp. 1–11.

Hackman, S. T., Frazelle, E. H., Griffin, P. M., Griffin, S. O. and Vlasta, D. A. (2001). Benchmarking warehousing and distribution operations: an input-output approach, Journal of Productivity Analysis 16(1): 79–100.

Inprasit, T. and Tanachutiwat, S. (2018). Reordering point determination using machine learning technique for inventory management, 2018 International Conference on Engineering, Applied Sciences, and Technology (ICEAST), IEEE, pp. 1–4.

Johnson, A. and McGinnis, L. (2010). Performance measurement in the warehousing industry, IIE Transactions 43(3): 220–230.

Karásek, J. (2013). An overview of warehouse optimization, International Journal of Advances in Telecommunications, Electrotechnics, Signals and Systems 2(3): 111–117.

Kartal, H., Oztekin, A., Gunasekaran, A. and Cebi, F. (2016). An integrated decision analytic framework of machine learning with multi-criteria decision making for multi-attribute inventory classification, Computers & Industrial Engineering 101: 599–613.

Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q. and Liu, T.-Y. (2017). LightGBM: A highly efficient gradient boosting decision tree, Advances in Neural Information Processing Systems, pp. 3146–3154.

Kim, T. Y. (2018). Improving warehouse responsiveness by job priority management: A European distribution centre field study, Computers & Industrial Engineering.

Klodawski, M., Jacyna, M., Lewczuk, K. and Wasiak, M. (2017). The issues of selection warehouse process strategies, Procedia Engineering 187: 451–457.

Knoll, D., Prüglmeier, M. and Reinhart, G. (2016). Predicting future inbound logistics processes using machine learning, Procedia CIRP 52: 145–150.

Krauth, E., Moonen, H., Popova, V. and Schut, M. (2005). Performance indicators in logistics service provision and warehouse management – a literature review and framework, EurOMA International Conference, pp. 19–22.

Larco, J. A., De Koster, R., Roodbergen, K. J. and Dul, J. (2017). Managing warehouse efficiency and worker discomfort through enhanced storage assignment decisions, International Journal of Production Research 55(21): 6407–6422.

Laudon, K. C. and Laudon, J. P. (2015). E-commerce: Digital markets, digital goods, in K. C. Laudon and J. P. Laudon (eds), Management Information Systems: Managing the Digital Firm Plus MyMISLab with Pearson eText–Access Card Package, Prentice Hall Press, chapter 10, pp. 415–425.
Mangano, G. and De Marco, A. (2014). The role of maintenance and facility management in logistics: a literature review, Facilities 32(5/6): 241–255.

Minga, L. M., Feng, Y.-Q. and Li, Y.-J. (2003). Dynamic pricing: ecommerce-oriented price setting algorithm, Proceedings of the 2003 International Conference on Machine Learning and Cybernetics (IEEE Cat. No. 03EX693), Vol. 2, IEEE, pp. 893–898.

Nalić, J. and Švraka, A. (2018). Importance of data pre-processing in credit scoring models based on data mining approaches, 2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), IEEE, pp. 1046–1051.

Nastasi, G., Colla, V., Cateni, S. and Campigli, S. (2018). Implementation and comparison of algorithms for multi-objective optimization based on genetic algorithms applied to the management of an automated warehouse, Journal of Intelligent Manufacturing 29(7): 1545–1557.

Reyes, J., Solano-Charris, E. and Montoya-Torres, J. (2019). The storage location assignment problem: A literature review, International Journal of Industrial Engineering Computations 10(2): 199–224.

Rezaei, H. R. (2012). A novel approach for safety stock prediction based on clustering artificial neural network, Paul Bharath Bhushan Petlu Chaylasy Gnophanxay p. 107.

Schmenner, R. W. and Swink, M. L. (1998). On theory in operations management, Journal of Operations Management 17(1): 97–113.

Solomatine, D. P. and Shrestha, D. L. (2004). AdaBoost.RT: a boosting algorithm for regression problems, 2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No. 04CH37541), Vol. 2, IEEE, pp. 1163–1168.

Staudt, F. H., Alpan, G., Di Mascolo, M. and Rodriguez, C. M. T. (2015). Warehouse performance measurement: a literature review, International Journal of Production Research 53(18): 5524–5544.

Tejesh, B. S. S. and Neeraja, S. (2018). Warehouse inventory management system using IoT and open source framework, Alexandria Engineering Journal 57(4): 3817–3823.

Van den Berg, J. P. and Zijm, W. H. (1999). Models for warehouse management: Classification and examples, International Journal of Production Economics 59(1-3): 519–528.

Van Goor, A. R., van Amstel, W. P. and van Amstel, M. P. (2019). European Distribution and Supply Chain Logistics, Routledge.

Vastrad, C. et al. (2013). Performance analysis of regularized linear regression models for oxazolines and oxazoles derivative descriptor dataset, arXiv preprint arXiv:1312.2789.

Wirth, R. and Hipp, J. (2000). CRISP-DM: Towards a standard process model for data mining, Proceedings of the 4th International Conference on the Practical Applications of Knowledge Discovery and Data Mining, Citeseer, pp. 29–39.
Zunic, E., Hasic, H., Hodzic, K., Delalic, S. and Besirevic, A. (2018). Predictive analysis based approach for optimal warehouse product positioning, 2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), IEEE, pp. 0950–0954.

Zunic, E., Hodzic, K., Hasic, H., Skrobo, R., Besirevic, A. and Donko, D. (2017). Application of advanced analysis and predictive algorithm for warehouse picking zone capacity and content prediction, 2017 XXVI International Conference on Information, Communication and Automation Technologies (ICAT), IEEE, pp. 1–6.