Although stochastic gradient descent (SGD) resolves some issues of gradient descent (GD), which is slow and costly, it introduces fluctuations around the local minimum because each update is computed from a single sample. Mini-batch gradient descent was proposed as a middle ground, reducing both the fluctuations observed in SGD and the per-update cost of GD. The benefit of mini-batch gradient descent is most often reported in the deep learning (DL) setting, since DL models are usually trained on large datasets.
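As a minimal sketch of the idea, the following NumPy snippet applies mini-batch gradient descent to least-squares linear regression; the specific values (batch_size, lr, n_epochs) are illustrative assumptions, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = 3*x + 2 plus noise (illustrative assumption).
X = rng.uniform(-1.0, 1.0, size=(1000, 1))
y = 3.0 * X[:, 0] + 2.0 + 0.1 * rng.standard_normal(1000)

w, b = 0.0, 0.0    # model parameters
lr = 0.1           # learning rate
batch_size = 32    # between 1 (pure SGD) and len(X) (full-batch GD)
n_epochs = 50

for epoch in range(n_epochs):
    perm = rng.permutation(len(X))  # reshuffle the data each epoch
    for start in range(0, len(X), batch_size):
        idx = perm[start:start + batch_size]
        xb, yb = X[idx, 0], y[idx]
        err = (w * xb + b) - yb
        # Gradient of the mean-squared error over the mini-batch only,
        # which is what lowers the per-update cost relative to full GD.
        grad_w = 2.0 * np.mean(err * xb)
        grad_b = 2.0 * np.mean(err)
        w -= lr * grad_w
        b -= lr * grad_b

print(f"learned w={w:.3f}, b={b:.3f}  (true w=3, b=2)")
```

Setting batch_size to 1 recovers SGD and setting it to the full dataset size recovers GD, which makes the trade-off between update noise and per-update cost explicit.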