Mining Loyalty Card Data for Increased Competitiveness: Case of a leading Retail Store of Kolkata, India
Present affiliation of Authors
Dr. Atish Chattopadhyay
Professor of Marketing, SPJIMR, India
atishc@spjimr.org
And
Dr. Kalyan Sengupta
Professor of IT and Systems, IISW&BM, Kolkata, India
kalyansen2002@yahoo.co.uk
(Paper Presented at the Conference on Global Competitiveness at IIM-Kozhikode, 25-26
March, 2006)
1. Introduction and Background
During the past decade, loyalty programs have been tried extensively around the globe,
mostly to create a new generation of CRM tactics (Brown, 2000; Kalokota and
Robinson, 1999; Field, 1997). Evidence comes from many settings, including Japanese
retailing, US airlines and hotels, French banks and UK groceries. In India,
Shoppers’ Stop, a leading retail chain, managed to achieve 60 percent of its
sales from repeat customers (as against the Indian average of 30 percent) by virtue of its
heavily promoted loyalty programs.
However, a group of researchers (Uncles et al., 2003; Miranda et al., 2004; Stauss et al.,
2005) observed from empirical research that loyalty in repeat-purchase markets
results from passive acceptance of brands rather than from positive efforts to improve
customer attitudes. A recent study (Noordhoff et al., 2004) examined the fate of
loyalty programs in the long run. Store customers in the Netherlands and Singapore were
compared in terms of behavioral and attitudinal loyalty with respect to loyalty cards. The
study concluded that the efficacy of store loyalty programs appeared to diminish as the
number of alternative card programs in the market increased, and also as customers
became habituated to these cards. While the sustainability of loyalty schemes is in
question, marketers need to be clear about the relative importance of data collection and
of rewarding loyal customers for achieving sustainable loyalty (O’Malley, 1998).
It is essential to understand the factors that could build a cordon around the customers.
Organized and regular feedback from the marketplace may reveal customers’ latent needs
on an ongoing basis. A well-designed loyalty scheme could be considered a useful
instrument for continuous tracking of customers, which may enable successful CRM and
hence a sustainable loyalty improvement system. The present study addresses these
issues in the Indian context with respect to a leading retail chain in India.
2. Methodology
Computerized billing data for a well-known lifestyle retail chain was gathered from two
of its retail outlets in the city of Kolkata, India. Only transactions of loyalty card
holders were collected, for the period from 1st August 2004 to 28th February 2005,
constituting 334,093 bill lines for the different items purchased. A total of 12,990
customers with loyalty cards made purchases during this period. The transaction dataset
was merged with the customer profile dataset to generate knowledge on purchase behavior
by way of classifications and associations. The entire dataset was cleaned to remove null
data and outliers. Preprocessing such as data generalization, aggregation and relevance
analysis was performed, and finally a relevant and compact dataset was extracted.
From this dataset, attempts were made to estimate the purchase value of individual
customers using the demographic factors recorded in the customers’ loyalty card profiles
together with behavioral measures such as frequency, recency, etc.
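The per-customer behavioral measures described above could be derived from bill lines roughly as follows. This is a minimal sketch and not the authors' procedure: the field layout, the sample rows, the definition of frequency as the number of distinct bills, and the definition of recency as days between a customer's last bill and the end of the study window are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical bill lines: (card_id, bill_no, bill_date, line_value_rs).
BILL_LINES = [
    ("C001", "B1", date(2004, 8, 5), 1200.0),
    ("C001", "B1", date(2004, 8, 5), 300.0),
    ("C001", "B2", date(2005, 1, 20), 2500.0),
    ("C002", "B3", date(2004, 11, 2), 800.0),
]

STUDY_END = date(2005, 2, 28)  # end of the study window stated in the paper


def summarize_customers(lines, study_end=STUDY_END):
    """Aggregate bill lines into per-customer value, frequency and recency."""
    totals = defaultdict(float)   # total purchase value per customer
    bills = defaultdict(set)      # distinct bill numbers per customer
    last_seen = {}                # most recent bill date per customer
    for card_id, bill_no, bill_date, value in lines:
        totals[card_id] += value
        bills[card_id].add(bill_no)
        if card_id not in last_seen or bill_date > last_seen[card_id]:
            last_seen[card_id] = bill_date
    return {
        card_id: {
            "value": totals[card_id],
            "frequency": len(bills[card_id]),  # assumed: one bill = one visit
            "recency_days": (study_end - last_seen[card_id]).days,
        }
        for card_id in totals
    }


summary = summarize_customers(BILL_LINES)
for card_id, measures in sorted(summary.items()):
    print(card_id, measures)
```

A table of this shape, joined to the card profile by card id, would supply the inputs that the regression, neural network and CART models need.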
The two shops of the chain located at two different locations of the city were analyzed
separately in order to compare the buying patterns in the two different locations.
Stepwise regression models were adopted to investigate the relationship of purchase value
with the input variables discussed above. An artificial neural network model was also
applied for the same purpose. A multilayer perceptron model was chosen, using eight
input variables against a single output variable, the value of purchase of a customer. In
a further analysis, based on the CART classification model, an attempt was made to
investigate purchase behavior by extracting a rule set from the data as prepared and
pre-processed for our models.
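The core step of a CART regression tree is the search for the single binary split that most reduces squared error in the target. The sketch below illustrates that step on one input variable only. The frequency and purchase values are invented, and a full CART model would repeat the search over all inputs and recurse on each branch to build the rule set.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, total_sse) of the best binary split 'x <= threshold'."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best


# Hypothetical data: visits per customer and total purchase value (Rs.).
frequency = [1, 2, 2, 3, 8, 9, 10]
purchase = [1500, 2400, 2600, 3100, 15000, 17500, 18000]

threshold, error = best_split(frequency, purchase)
print(f"Rule: frequency <= {threshold} vs. frequency > {threshold}")
```

Each split found this way reads directly as a rule of the form "if frequency is above the threshold, expect a higher purchase value".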
3. Results and Discussions
A k-means cluster analysis of the entire dataset (including both the shops) revealed five
reasonable clusters of customers. The cluster centers and cluster sizes revealed a sharp
peak in the customer pyramid (table 1). The top two levels of the pyramid constituted
only 2.4 percent of the total customers, who spent heavily during the seven months under
study. The mean values of these two clusters were Rs.79360 and Rs.36780 respectively,
against an average of only Rs.6500 for all customers. The top 2.4 percent of customers
contributed nearly 15 percent of the total revenue, and the top 11.5 percent of customers
contributed 41 percent of the revenue.
Table 1: Customer Value Pyramid
Cluster   Percentage of Total Customers   Average Value Purchase (Rs.)
1         0.22                            79360
2         2.15                            36780
3         9.16                            18765
4         27.24                           8770
5         61.23                           2400
Source: Billing data
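The revenue shares quoted in the text follow from Table 1 by weighting each cluster's average purchase by its share of customers. The short sketch below recomputes them. The function name is illustrative, and the only inputs are the table values themselves.

```python
# (cluster, percent of customers, average purchase value in Rs.) from Table 1.
TABLE_1 = [
    (1, 0.22, 79360),
    (2, 2.15, 36780),
    (3, 9.16, 18765),
    (4, 27.24, 8770),
    (5, 61.23, 2400),
]


def cumulative_shares(rows):
    """Return a list of (cluster, cumulative % customers, cumulative % revenue)."""
    # A cluster's revenue is proportional to its customer share times its mean.
    weights = [pct * mean for _, pct, mean in rows]
    total = sum(weights)
    cum_customers = 0.0
    cum_revenue = 0.0
    result = []
    for (cluster, pct, _), weight in zip(rows, weights):
        cum_customers += pct
        cum_revenue += 100.0 * weight / total
        result.append((cluster, cum_customers, cum_revenue))
    return result


shares = cumulative_shares(TABLE_1)
overall_mean = sum(pct * mean for _, pct, mean in TABLE_1) / 100.0

print(f"overall average purchase: Rs.{overall_mean:.0f}")
for cluster, customers, revenue in shares:
    print(f"clusters 1-{cluster}: {customers:6.2f}% of customers, {revenue:5.1f}% of revenue")
```

Under this weighting the top two clusters (about 2.4 percent of customers) account for roughly 15 percent of revenue and the top three (about 11.5 percent) for roughly 41 percent, with an overall average close to Rs.6500, which agrees with the figures reported above.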
It was thus important for the organization to identify the typical characteristics of
high-value customers so that CRM could be implemented effectively. To identify such
characteristics, three different approaches were adopted: a regression model, a CART
decision tree model and a neural network model were applied to the dataset for
classification and prediction purposes.
A stepwise regression model confirmed (R-square value of 0.52) that the amount of
purchase by a customer was strongly and positively related to the frequency of visits to
the shop and to the discount amount the customer enjoyed from the shop. Recency was also
positively related, but only weakly. The standardized Beta coefficients were 0.528, 0.350
and 0.035 respectively. It was also interesting to note that the dummy variables gender
(female = 1) and type (non-Bengali = 1) had small negative impacts on customer revenue.
The final regression model had a high F-value of 2799, indicating a significance level of
.000. The model did not, however, include other demographic variables such as age and
marital status.
Table 2: Stepwise Regression – Both Shops
Coefficients a
Model  Term         B (unstd.)  Std. Error  Beta (std.)        t   Sig.
1      (Constant)     1097.858      77.199                14.221   .000
       Frequency      1626.298      16.902         .649   96.219   .000
2      (Constant)      836.337      70.442                11.873   .000
       Frequency      1276.345      16.826         .510   75.857   .000
       bargain           1.753        .034         .345   51.324   .000
3      (Constant)     1076.600      77.169                13.951   .000
       Frequency      1291.442      16.908         .516   76.382   .000
       bargain           1.745        .034         .343   51.195   .000
       Gender         -750.892      99.560        -.047   -7.542   .000
4      (Constant)      614.180     117.528                 5.226   .000
       Frequency      1326.465      18.178         .530   72.973   .000
       bargain           1.773        .034         .349   51.445   .000
       Gender         -722.547      99.606        -.045   -7.254   .000
       Recency           4.502        .864         .036    5.213   .000
5      (Constant)      763.921     130.619                 5.848   .000
       Frequency      1322.731      18.229         .528   72.562   .000
       bargain           1.777        .034         .350   51.524   .000
       Gender         -722.889      99.583        -.045   -7.259   .000
       Recency           4.321        .866         .035    4.989   .000
       Type           -252.771      96.306        -.016   -2.625   .009
a. Dependent Variable: BILL_VALUE_sum_sum
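To make the final model concrete, the unstandardized coefficients of model 5 can be applied to a single customer. The sketch below is only an illustration: the customer is hypothetical, and the units assumed for bargain (Rs. of discount) and recency (days) are not stated in the paper.

```python
# Unstandardized coefficients of model 5 in Table 2 (both shops).
MODEL_5 = {
    "constant": 763.921,
    "frequency": 1322.731,
    "bargain": 1.777,
    "gender": -722.889,   # dummy: female = 1
    "recency": 4.321,
    "type": -252.771,     # dummy: non-Bengali = 1
}


def predict_bill_value(frequency, bargain, gender, recency, customer_type):
    """Predicted total purchase value (Rs.) for one customer under model 5."""
    return (
        MODEL_5["constant"]
        + MODEL_5["frequency"] * frequency
        + MODEL_5["bargain"] * bargain
        + MODEL_5["gender"] * gender
        + MODEL_5["recency"] * recency
        + MODEL_5["type"] * customer_type
    )


# Hypothetical customer: 4 visits, Rs.500 discount, male, Bengali, recency 30.
estimate = predict_bill_value(
    frequency=4, bargain=500, gender=0, recency=30, customer_type=0
)
print(f"Predicted purchase value: Rs.{estimate:.2f}")
```

Because the frequency coefficient is about Rs.1323 per visit, one additional visit shifts the prediction far more than a change in either dummy variable, which is consistent with the ordering of the standardized coefficients.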
A comparison of the two shops, one located at Camac Street and the other at Gariahat,
showed that the demographic patterns of customers were more or less the same, except
that the Gariahat shop had high Bengali patronage (75 percent) whereas fewer than half
of the Camac Street shop’s customers (45 percent) were Bengali (table 3).
Table 3: Important Demographic profiles of two shops
a. Gariahat
Marital Status Frequency Percent
Unmarried 644 34
Married 1268 66
Total 1912 100
Gender Frequency Percent
Male 1155 60
Female 752 40
Total 1907 100
Type Frequency Percent
Bengali 1426 75
Non Bengali 481 25
Total 1907 100
b. Camac Street
Marital Status Frequency Percent
Unmarried 3626 33
Married 7411 67
Total 11037 100
Gender Frequency Percent
Male 6879 62
Female 4142 38
Total 11021 100
Type Frequency Percent
Bengali 4924 45
Non Bengali 6096 55
Total 11020 100
The shopping behavior of customers of the two outlets of the chain was somewhat
different because of the location factor. Separate regression models were fitted for
the two outlets. Only three variables (frequency, bargain and recency) were needed to
explain the amount of purchase at the Gariahat shop, whereas two extra variables
(gender and type) were necessary to predict the purchase value of customers at the
Camac Street shop. The R-square value was 0.522 for the first shop and 0.524 for
the second. The coefficients of the independent variables and their significance are
presented in table 4 for both models.
Table 4: Regression outputs of Outlets
Coefficients a,b
Model  Term         B (unstd.)  Std. Error  Beta (std.)        t   Sig.
1      (Constant)      898.196     162.659                 5.522   .000
       Frequency      1479.870      38.259         .663   38.680   .000
2      (Constant)      718.403     151.115                 4.754   .000
       Frequency      1244.466      37.877         .558   32.855   .000
       bargain           1.767        .100         .300   17.693   .000
3      (Constant)       46.099     256.662                  .180   .857
       Frequency      1302.100      41.770         .584   31.173   .000
       bargain           1.824        .101         .310   18.029   .000
       Recency           6.148       1.900         .060    3.236   .001
a. Dependent Variable: BILL_VALUE_sum_sum
b. SHOP_CODE = Gariahat
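The t column in these outputs is the coefficient divided by its standard error, which is why the constant of the Gariahat model 3 (small relative to its standard error) is not significant while the three predictors are. The sketch below recomputes t from the printed values. Small differences from the reported figures, as for bargain, are expected because the printed B and Std. Error are rounded.

```python
# (term, B, Std. Error, reported t) for model 3 of the Gariahat outlet.
GARIAHAT_MODEL_3 = [
    ("(Constant)", 46.099, 256.662, 0.180),
    ("Frequency", 1302.100, 41.770, 31.173),
    ("bargain", 1.824, 0.101, 18.029),
    ("Recency", 6.148, 1.900, 3.236),
]


def t_statistic(coefficient, std_error):
    """t = B / Std. Error, the test statistic for a regression coefficient."""
    return coefficient / std_error


for term, b, se, reported in GARIAHAT_MODEL_3:
    print(f"{term:<11} t = {t_statistic(b, se):7.3f} (reported {reported:.3f})")
```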
A further analysis on the two outlets showed that the top cluster of the customer group of
Camac Street had average purchase value of Rs.75760 during the period under study,
while it was Rs.44380 at the Gariahat outlet. It is interesting to note that for both the
shops, frequency of visits and the amount of bargain earned were the two most important
factors for total amount of purchase.
Coefficients a,b
Model  Term         B (unstd.)  Std. Error  Beta (std.)        t   Sig.
1      (Constant)     1153.637      85.959                13.421   .000
       Frequency      1644.660      18.602         .648   88.414   .000
2      (Constant)      863.243      78.436                11.006   .000
       Frequency      1281.528      18.573         .505   68.999   .000
       bargain           1.745        .037         .347   47.433   .000
3      (Constant)     1119.015      85.901                13.027   .000
       Frequency      1297.378      18.659         .511   69.531   .000
       bargain           1.737        .037         .346   47.319   .000
       Gender         -803.974     111.388        -.048   -7.218   .000
4      (Constant)      683.233     129.567                 5.273   .000
       Frequency      1329.873      19.998         .524   66.500   .000
       bargain           1.762        .037         .351   47.497   .000
       Gender         -776.810     111.454        -.047   -6.970   .000
       Recency           4.300        .958         .034    4.490   .000
5      (Constant)      881.868     146.781                 6.008   .000
       Frequency      1324.895      20.066         .522   66.026   .000
       bargain           1.766        .037         .351   47.586   .000
       Gender         -774.438     111.419        -.047   -6.951   .000
       Recency           4.089        .960         .032    4.259   .000
       Type           -311.117     108.154        -.019   -2.877   .004
a. Dependent Variable: BILL_VALUE_sum_sum
b. SHOP_CODE = Camac Street
Artificial Neural Network (ANN) Model
A neural network based model was tried on the data with bill amount as the output
variable and eight input variables, namely frequency, gender code, recency, shop code,
type (Bengali or non-Bengali), age, bargain and marital status. The model could estimate
the bill amount with 96 percent accuracy, using 1:3 neurons in the hidden layers. Further
analysis of the model revealed that the mean error of estimate was Rs.27 (figures 1 and 2).
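For readers unfamiliar with the model family, the sketch below shows the forward pass of a small multilayer perceptron with eight inputs and one output. It is not the Clementine model used in the study: the single hidden layer of three sigmoid neurons is an assumed reading of "1:3 neurons", the weights are random and untrained, and the input scaling and sample customer are hypothetical.

```python
import math
import random

INPUTS = ["frequency", "gender", "recency", "shop", "type", "age", "bargain", "marital"]
HIDDEN_NEURONS = 3  # assumed size; the paper only mentions "1:3 neurons"


def init_network(n_inputs, n_hidden, seed=0):
    """Random small weights for one hidden layer and one linear output neuron."""
    rng = random.Random(seed)
    # Each weight vector carries one extra trailing entry that acts as the bias.
    hidden = [
        [rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
        for _ in range(n_hidden)
    ]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output


def forward(network, features):
    """Forward pass: sigmoid hidden units followed by a linear output unit."""
    hidden, output = network
    activations = []
    for weights in hidden:
        z = sum(w * x for w, x in zip(weights, features)) + weights[-1]
        activations.append(1.0 / (1.0 + math.exp(-z)))
    return sum(w * a for w, a in zip(output, activations)) + output[-1]


network = init_network(len(INPUTS), HIDDEN_NEURONS)
# Hypothetical customer with all inputs already scaled to the range [0, 1].
scaled_customer = [0.4, 0.0, 0.2, 1.0, 0.0, 0.35, 0.1, 1.0]
prediction = forward(network, scaled_customer)
print(f"Untrained network output (scaled units): {prediction:.4f}")
```

Training would adjust these weights to minimize the error between the output and the observed bill amount. The sketch only shows how the eight inputs are combined into one estimate.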
Figure 1: Neural Modeling using Clementine 9.0
The estimated relative importance of the input variables varied: bargain was the most
important factor, followed by frequency and recency. The least important factors were
shop code, type of customer and marital status.
It is interesting to note that both bargain and frequency were important parameters in
both models (regression and ANN), although the regression model ranked frequency first
whereas the ANN model ranked bargain first. Recency was important in both models,
while marital status was not important in either.
Figure 2: Results of Neural Modeling using Clementine 9.0
4. Managerial Implications
Analyses of the billing database and customer profiles reveal valuable knowledge about
customer purchase behavior, which may be used to formulate realistic marketing programs
to improve revenue and market share. In this particular case, the more frequently people
visit the shop, the higher the revenue. Likewise, the more discount or bargain the
customers are offered, the higher the revenue. These two factors are therefore critically
important for the chain to increase its sales. Marketing investments should align with
such findings.
Although recency has a low but positive impact on the amount of sales, it is extremely
useful to maintain a low average recency for the customers. Management actions may
be devised to target, follow up and encourage those customers whose recency values are
above the expected threshold.
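Such a follow-up list could be produced with a simple filter over per-customer recency values. The sketch below assumes recency is measured in days since the last purchase. The card ids, the recency values and the 60-day threshold are hypothetical.

```python
def customers_to_follow_up(recency_by_customer, threshold_days):
    """Return card ids whose recency exceeds the threshold, most lapsed first."""
    lapsed = [
        (days, card_id)
        for card_id, days in recency_by_customer.items()
        if days > threshold_days
    ]
    return [card_id for days, card_id in sorted(lapsed, reverse=True)]


# Hypothetical recency values (days since last purchase) and threshold.
recency = {"C001": 12, "C002": 95, "C003": 61, "C004": 140}
follow_up = customers_to_follow_up(recency, threshold_days=60)
print(follow_up)
```

Sorting the most lapsed customers first lets a limited follow-up budget be spent where the gap since the last visit is largest.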
Loyalty cards generate a large amount of valuable customer data, which makes it possible
to track and monitor customers effectively and so enhance sustainability. One may
conclude that loyalty programs thus become a means of acquiring valuable customer
information with which to shape an appropriate marketing mix at the modest cost of
reward points.
References:
Brown, S.A. (2000): Customer Relationship Management, John Wiley & Sons, Toronto.
Dasgupta, S. (2005): Who’s Afraid of Wal-Mart?, Business Standard (India), Dec 4, 2005
Field, C. (1997): Data goes to Market, Computer Weekly, Jan 16, 1997, pp.44-5
Kalokota, R. and Robinson, M. (1999): “e-Business”, Addison-Wesley, Reading, MA.
Miranda, M.J.; Konya, L. and Havrila, I. (2004): “Shoppers’ satisfaction levels are not the
only key to store loyalty”, Marketing Intelligence and Planning, Vol.23, No.2,
pp.220-232
Noordhoff, C.; Pauwels, P. and Schroder, O.G. (2004): “The effect of customer card
programs – A comparative study in Singapore and The Netherlands”, International
Journal of Service Industry Management, Vol.15, No.4, pp.351-364
Stauss, B.; Schmidt, M. and Schoeler, A. (2005): “Customer frustration in loyalty
programs”, International Journal of Service Industry Management, Vol.16, No.3,
pp.229-252
Uncles, M.D.; Dowling, G.R. and Hammond, K. (2003): “Customer loyalty and customer
loyalty programs”, Journal of Consumer Marketing, Vol.20, No.4, pp.294-316
O’Malley, L. (1998): “Can loyalty schemes really build loyalty?”, Marketing Intelligence
and Planning, Vol.16, No.1, pp.47-55