Data Analytics in Irish Retail: Is Data Analytics Used in its Full
Potential?
Paul Flood
Institute of Technology Carlow
MSc in Information Technology Management 2016
Submitted in partial fulfilment of requirements for the
MSc in Information Technology Management 2016 Institute of
Technology Carlow
LIFELONG LEARNING CENTRE
Work submitted for assessment which does not
include this declaration will not be assessed.
DECLARATION
*I declare that all material in this submission e.g.
thesis/essay/project/assignment is entirely my/our own work except where
duly acknowledged.
*I have cited the sources of all quotations, paraphrases, summaries of
information, tables, diagrams or other material; including software and other
electronic media in which intellectual property rights may reside.
*I have provided a complete bibliography of all works and sources used in the
preparation of this submission.
*I understand that failure to comply with the Institute’s regulations governing
plagiarism constitutes a serious offence.
Student Name: (Printed)
____________________________________________
Student Number(s): ____________________________________________
Programme Title & Yr.: ____________________________________________
Module: ____________________________________________
Signature(s): ____________________________________________
Date: ____________________________________________
Please note:
a) Individual declaration is required by each student for joint projects.
b) Where projects are submitted electronically, students are required to type their
name under signature.
c) The Institute regulations on plagiarism are set out in Section 10 of Examination
and Assessment Regulations published each year in the Student Handbook.
Acknowledgements
I would first like to thank everybody who helped me throughout the college
year. I would like to thank Martin my advisor who guided and advised me.
The feedback was brilliant and I learnt so much on how to design a
dissertation. Thank you for bringing my dissertation from being too broad to
actually putting my questions on paper and going from there.
I would like to thank the college in Carlow, especially the lecturers and the
Lifelong Learning Centre, who made the year that bit easier. It was strange
returning to the college after 20 years.
Thank you to all the retailers who completed my survey, and to David for
helping me get the names and email addresses of the people surveyed. You know
a lot of people.
Thanks to the proof readers, Edel and John, my eyes for finding errors, and to
all my class for making me feel welcome, especially John, Niall and Ross.
Thanks to all my family in Tipperary and my in-laws in Offaly.
And finally to my wife Niamh, my rock this year. You convinced me to do the
course and helped me through the hard times, the weeks away from home and
the drives from Carlow to Balbriggan at night. I could not have done it
without you.
Contents
Acknowledgements
Glossary
Abstract
1.0 Introduction
1.1 Research Study and Aims
1.2 Thesis Structure
2.0 Literature Review
2.1 Introduction
2.1 What size shops use Data Analytics?
2.2 Shops using Data analytics to its fullest Potential
2.3 Cost for retailers using or not using big data
2.4 Data
2.4.1 Information Hierarchy
2.4.2 Algorithms
2.5 Models
2.5.1 K - Nearest Neighbour Model
2.5.2 C4.5
2.6 Data Security Risks
2.7 Methods of Collection
2.7.1 Loyalty Cards
2.7.2 Security
2.7.3 Dublin Airport
2.7.4 On-Line Recommendations
2.7.5 Amazons Similarity Algorithm
2.7.6 In-store Tracking
3.0 Methodology
3.1 Research
3.1.1 Research Philosophy
3.1.2 Analysing Research Data
3.1.3 Qualitative versus Quantitative
3.2 Research Objectives
3.3 Qualitative Methodology
3.4 Sampling Techniques
3.4.1 Qualitative Sampling
3.4.2 Snowball Sampling
3.4.3 Convenience Sampling
3.4.4 Judgemental Sampling
3.4.5 Quota Sampling
3.5 Semi Structured
3.5.1 Unstructured
3.5.2 Structured
3.6 Quantitative Methodology
3.6.1 Independent Variable
3.6.2 Dependent Variable
3.6.3 Moderating Variable
3.7 Mixed Methodology
3.8 Ethics
3.8.1 Data Collection
3.8.2 Restrictions
3.8.3 Value
4.0 Results
4.1 Survey Results
4.2 Themes That Emerged
4.3 Results Conclusion
5.0 Discussion
6.0 Conclusion
6.1 Summary
6.2 Future Directions
6.3 Practical significance
6.4 Reflection
7.0 Bibliography
7.0.1 Email Invitation for Survey
7.0.2 Survey
Glossary
BI - Business Intelligence
KDD - Knowledge Discovery and Data Mining
SME - Small to Medium Enterprises
CeADAR - Centre for Applied Data Analytics Research
MGI - McKinsey Global Institute
TSSG - Telecommunications Software & Systems Group
ECR - Efficient Consumer Response
ICHEC - Irish Centre for High-End Computing
SKU - Stock Keeping Unit
CRM - Customer Relationship Management
DPA - Data Protection Act
ODPC - Office of the Data Protection Commissioner
POS - Point of Sale
Abstract
The main goal of this research paper was to look at retail in Ireland and the
use of data analytics. Data analytics comes in many forms, and the objective
was to describe what size of retailer uses the different tools available.
Data analysis has a pivotal role to play in how retailers in Ireland will look at
customer information going forward. Even now the smallest stores are looking
at their customers’ data via social media or in-store tracking. Stores do not
have to be bricks and mortar anymore, and online shops are growing at a rapid
pace.
Three questions were asked in this research paper:
1. What size shops are using data analytics?
2. Is cost the reason for not using data analytics?
3. Are Irish retailers using data analytics to its full potential?
To answer the research questions, a literature review was conducted looking
at data models, methods of collecting customer data and security. This
material was gathered to inform a decision on the three questions.
A questionnaire was created using SurveyMonkey to answer the three
questions stated in the methodology section. Nine survey questions were
created and sent to twenty six retailers. The answers were put into SPSS, a
statistical software package that can perform complex data analysis.
Once the data was analysed, the results were compared to the information in
the literature review. Themes emerging from the research were examined and
put to the three research questions. The discussion then identifies results
common to both methods, from which the research questions are answered.
Finally, a conclusion summarises the results and discusses future directions
and practical significance, and ends with a reflection on the research paper.
1.0 Introduction
Big data, or data analytics, is a concept where organisations use a huge
amount of data. Rouse defined big data analytics as the
process of examining large data sets containing a variety of data types -- i.e., big
data -- to uncover hidden patterns, unknown correlations, market trends,
customer preferences and other useful business information.
(Rouse, 2014)
Big data can be characterised by the 3 V’s: data variety, data volume and data
velocity, as seen in Figure 1. The amount of data is now so huge that it is
talked about in petabytes and exabytes. Relational databases are not used for
analysis of big data as doing so is too costly and time consuming. Instead, new
methods of storing and analysing data have developed that rely less on data
schema and data quality and more on raw data gathered in a data lake or
storage repository. Machine learning and artificial intelligence (AI) programs
use complex algorithms to look for repeatable patterns.
Figure 1 Rouse, M.,
Source: http://searchcloudcomputing.techtarget.com/definition/big-data-Big-Data
Companies are using platform tools such as Hadoop, Tableau and Oracle.
In recent years retailers have been tracking customers’ purchases with loyalty
cards to create a profile of each customer. Every day millions of transactions
go through Irish stores, whether in store or online. The data collected is known
as big data.
A major challenge for retailers is to understand their customers’ needs and
wants while increasing the company’s own sales. Retailers today have a wide
variety of tools available to them to predict the next growing trend. According
to (McHugh, R., 2015), 33% of Irish businesses use Big Data in their strategic
decision making process and 81% of businesses put data at the centre of their
decision making, but only 31% of Irish businesses have restructured their
operations to reflect this.
Some of the Business Intelligence (BI) tools available to retailers for analysing
the data are Tableau, IBM, SAS, SAP and many more. A breakdown of the
leaders, challengers, visionaries and niche players is shown in Figure 2.
Big data falls under the umbrella of BI. These tools can be used to answer
Customer Relationship Management (CRM) questions such as (Doran, 2007):
1. Who are the most valuable and least valuable customers?
2. What aspects affect a sale?
3. How profitable are promotional offers?
4. What are the differences between outlets’ profits in various
geographical locations?
Figure 2 Market Leaders Source: Hopkins, R.
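As an illustration of the first CRM question above, the following is a minimal sketch of ranking customers by total spend. The transaction data, the customer identifiers and the function names are hypothetical and are not taken from any of the BI tools named in this section.

```python
from collections import defaultdict

# Hypothetical loyalty-card transactions: (customer_id, basket_value_eur).
transactions = [
    ("C001", 42.50),
    ("C002", 8.99),
    ("C001", 61.20),
    ("C003", 15.00),
    ("C002", 12.01),
    ("C003", 5.00),
]


def total_spend_by_customer(rows):
    """Sum basket values per customer."""
    totals = defaultdict(float)
    for customer_id, value in rows:
        totals[customer_id] += value
    return dict(totals)


def rank_customers(rows):
    """Return customer ids ordered from highest to lowest total spend."""
    totals = total_spend_by_customer(rows)
    return sorted(totals, key=totals.get, reverse=True)


ranking = rank_customers(transactions)
most_valuable, least_valuable = ranking[0], ranking[-1]
print("Most valuable:", most_valuable, "Least valuable:", least_valuable)
```

Total spend is only one possible measure of customer value; a retailer could equally rank by margin or visit frequency using the same structure.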
Big data touches nearly every aspect of retail, so it is very important for
retailers to keep up with this ever changing technology. Most people see Big
Data in use every day through loyalty cards. Today 87% of Irish people are
signed up to a loyalty card scheme, according to (Armstrong 2016).
1.1 Research Study and Aims
Current studies show that data mining is becoming bigger in Ireland from the
data science side, but does that reflect retailers’ use of customer data? This
research paper aims to study Big Data and how it is used in retail in Ireland
today. Do smaller retailers use big data or is it just the larger retailers? Is cost
one of the main issues holding Irish retailers back or are there other barriers
such as lack of knowledge or issues with databases? Can Irish retailers do
more to use big data to its full potential?
There are three questions asked in the research proposal.
1. What size shops are using data analytics?
2. Is cost the reason for not using data analytics?
3. Are Irish retailers using data analytics to its full potential?
These questions will be put to Irish retailers of different sizes to get a broad
sense of the use of data analytics in Ireland today and the barriers standing in
retailers’ way. Are Irish retailers being left behind by retailers in other
countries?
1.2 Thesis Structure
The literature review will cover past research and discussions on big data in
retail in Ireland. It will cover algorithms and models, current security risks
and ways of tracking customer data. The methodology section will describe
the method used and how it was used, the methods not used and the reasons for
not using them, the research philosophy and how the data was analysed. In the
results section, the findings of the survey will be analysed and displayed,
including graphs and a brief summary of the results. The discussion will then
relate the findings of the survey back to the literature review, where common
results will be discussed and argued. Finally, the conclusion will set out future
directions and recommendations for data analytics in retail in Ireland and
answer the three questions set out in the research. At the back of this paper are
the appendices: bibliography, pictures, interview questions, sample survey,
consent letters, tables and graphs.
2.0 Literature Review
2.1 Introduction
A huge amount of data, both structured and unstructured, is being processed every
day in Irish retail through Point of Sale (POS), loyalty cards, social media and
online sales. Today’s retail environment is tough for retailers as consumers face
choices from numerous channels and demand a personalised shopping experience.
Businesses have always used data to create business value. For organisations to
make better, fact based decisions, new tools and platforms have been created
to meet this demand for data knowledge.
In 1995 the very first International Conference on Knowledge Discovery and
Data Mining (KDD) took place.
Data mining is the process of exploration and analysis, by automatic or semi-
automatic means, of large quantities of data in order to discover meaningful
patterns and rules.
(Berry and Linoff, 1997)
Figure 3 shows KDD, Knowledge Discovery in Databases.
Figure 3 KDD Source: Zaiane, O.,
https://webdocs.cs.ualberta.ca/~zaiane/courses/cmput690/notes/Chapter1/
Why do we need KDD?
The question has been asked: is data mining important? (Gartner, 2016) said
that data mining is the process of discovering meaningful correlations, patterns
and trends by sifting through large amounts of data stored in repositories. Data
mining employs pattern recognition technologies, as well as statistical and
mathematical techniques.
Data mining is a growing industry in Ireland, and the Irish Government Action
Plan for Jobs 2013 (IPP Ireland, 2016) identified “Big Data” as an area where
Ireland would have an unequivocal advantage over other countries. With its
high ICT skill level and research capabilities, Ireland could reap the
meaningful benefits of job growth from global organisations in the “Big Data”
sector.
Ireland is committed to the funding of research that facilitates Big Data. Some
of the research centres involved are the Centre for Applied Data Analytics
Research (CeADAR), the Telecommunications Software & Systems Group (TSSG),
The National Centre for Data Analytics (The INSIGHT Centre) and The Irish
Centre for High-End Computing (ICHEC).
Are Irish retail stores that use data mining making more profit than retail
stores that do not use data mining? Stores are now monitoring customer data
for:
o Predicting customer trends
o Tracking customer loyalty
o Up-selling strategies
o Tracking and targeting profitable items
Now more than ever, data mining is important for retail stores small and large
that are trying to gain a competitive advantage over their rivals.
2.1 What size shops use Data Analytics?
Stores of many sizes use data analytics, with each store obtaining its
customers’ information in different ways. Data analytics varies from taking a
customer’s details on a POS to Amazon’s similarity algorithm. It is not only
large multinational retailers that use data analytics to their benefit; smaller
stores too can use data analytics to their advantage.
(Hawkins 2012) wrote that the largest retailers, who can afford to have the
most sophisticated data collection software, are becoming the leading shopping
destinations, while smaller retailers who in the past were competitors of the
larger companies are falling behind. It is a war to keep customers and gain
new customers from competitors by using customer data collected by
software. (Hawkins 2012) is worried that, if not all retailers convert to using
customer data, the large retailers will leave small, independent retailers
behind. (McKenna 2015) stated that 24% of UK retailers are using data
effectively enough to enhance sales.
(Schaeffer 2016) disagreed, stating that the retail pecking order is determined
less by the size of an organisation’s IT budget and more by the retailer's
inclination towards innovation and agility. The retail business is rapidly
changing and smaller shops are showing more agility than larger retailers.
(Davis 2015) claimed that 95% of Small to Medium Enterprises (SMEs) have a
defined big data agenda.
(Simon 2013) wrote that in a recent study of 541 small to medium UK
companies, none were thinking of taking advantage of big data. This puts
these organisations at a serious disadvantage against competitors that use data
mining to predict market trends and customer behaviour.
Even large retailers can make mistakes when it comes to exploiting customer
data. Tesco have been at the forefront of technology since 1995 when they
introduced the loyalty card. By doing this Tesco changed the landscape of
retailing. All supermarkets wanted to be like Tesco. Even in America retailers
like Walmart took on the Tesco model of customer analytics.
But as (Schrage 2014) stated, for all Tesco’s customer data, analytics,
segmentation, customisation and promotion, its dominance was waning as
disillusioned shoppers left to shop in German discount stores such as Lidl
and Aldi. These stores were simple, with no gimmicks such as club cards,
which customers had started to believe benefit the retailer more than the
customer.
Tesco’s dramatic decline is a warning that even retailers with powerful, data
rich loyalty programs cannot fight off smaller companies with lower prices and
a comparable shopping experience. Another reason is that Tesco now lacks the
innovation and vision that got it to the top. None of this is new, a large
company falling from grace, but is this a company that could not get through
tough times or, in an age of big data, predictive analysis and customer
knowledge, is this technology as powerful as we are led to believe?
Not all organisations use big data. About 61% of senior executives now use
big data to help make decisions, wrote (Conley 2016). This leaves a large
figure of 39% who do not use data analysis. Many see this as a massive
disadvantage. The large number of organisations not using big data could be
down to old habits dying hard. Some senior executives are afraid of changing
with the times and will not embrace change. Some organisations do not know
the benefits that come with changing to data analytics.
2.2 Shops using Data analytics to its fullest Potential
According to research carried out by McKinsey Global Institute (MGI) and
McKinsey Business Technology Office, big data will become key in
productivity, growth and competition. Big data will affect every sector for the
foreseeable future due to the rise in popularity of social media, the internet of
things and multimedia.
(MGI 2011) carried out a study in Europe and the United States of 5 domains:
healthcare in the US, the public sector in Europe, retail in the US, and
manufacturing and personal-location data globally. This study showed that if
big data was used to its fullest in each sector, the increase in monetary terms
would be substantial. If a retailer used big data to its full potential, it could
increase its profit margin by more than 60%. In another example from the same
report, European government administration could save €100 billion through
operating efficiency improvements.
While MGI says that profit margins can be increased by more than 60%, a
(Forrester Consulting 2014) report claimed that 68% of retail CIOs said they
collect data but agreed they were not maximising its full potential. The report
also conveyed that only 47% of the survey participants have invested in cross
channel analytics to effectively engage customers, while only 25% of retailers
plan to invest in big data in the future.
(Bishop 2013) stated that retailers need to move beyond the term ‘Big Data’,
or its full potential will not be fulfilled. Part of the reason is the lack of
meaning in the name ‘Big Data’, but the name is not the only thing holding
back its development.
People who comprehend big data struggle to convey its value to retail leaders.
(Bishop 2013) wrote that besides communication there are also 3 other
significant factors stopping big data reaching its full potential:
1. An absence of the ‘Big Picture’ is holding retailers back.
o For supply, the initial work on Efficient Consumer Response
(ECR) describes a continuous stream of customer demand
information to the producer and a continuous stream of product
to the customer to meet that demand.
o For demand, if practised correctly, one to one marketing can
increase the value of a retailer’s customer base.
2. Inclination not to share data freely
o Fear is one of the factors behind the lack of end to end data
sharing. Retailers fear that their company will become
irrelevant and fall behind.
3. Acceptance of new performance metrics such as Just-in-Time.
o Developing metrics such as Just-in-Time will not be an issue;
the problem will be integrating the new metrics and getting
other companies to accept them.
It is only now that retailers are using data analytics techniques to catch up
with their online competition. Physical shops know which customers visit
and what they buy; online retailers also know which section is viewed first,
which items are viewed but not purchased and how long each item has been
viewed for. This information allows online retailers to use big data to
better potential than physical shops.
(Lacy 2013), in an interview with Marc Andreessen, wrote that
Retail guys are going to go out of business and ecommerce will become
the place everyone buys. You are not going to have a choice.
(Andreessen 2013)
While 95% of all retail in the United States is in physical stores, they are
growing at one quarter the rate of online stores.
(Baldwin 2014) wrote that only 23% of UK retailers feel they can instantly make
sense of the data made accessible to them to make the correct business
decision, while 50% of retailers believe their existing BI tools fall short of
their needs, with only 16% of retailers confident that the data analytic tools
used give the organisational visibility they require.
(Warner 2014) explained that data sharing is important between retailer and
supplier. It can cut costs and make savings.
o Data Sharing
- Retailers and suppliers should share and analyse a common
set of data.
- By measuring and reporting the data, both parties will
have the same level of information.
(Warner 2014) also claimed there are two main types of data that should be
shared.
o Sell-thru data
- This is data that a retailer uses to make decisions on
what to order from a supplier. These include;
- Scanned sales or Register Sales
- Quantity stock on order
- Quantity stock on hand
- Orders categorised by SKU
- Orders categorised by store
Giving suppliers access to this information benefits both supplier and retailer.
This also gives less chance of a store being over or under stocked.
o Product categories
- This lets suppliers and retailers focus on groups of
products instead of individual products
- By measuring the categories and sharing the results,
they both can satisfy a customer’s needs.
- They can find slow selling and fast selling categories.
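The two kinds of shared data described above can be sketched in a few lines. This is a minimal illustration only: the record layout, SKU and store names, the figures and the "weeks of cover" measure are all assumptions made for the example and do not come from (Warner 2014).

```python
from collections import defaultdict

# Hypothetical shared sell-through records, one per SKU and store:
# (sku, store, category, units_sold_last_week, units_on_hand, units_on_order)
records = [
    ("SKU-1", "Carlow", "Dairy", 120, 60, 100),
    ("SKU-1", "Dublin", "Dairy", 200, 50, 150),
    ("SKU-2", "Carlow", "Bakery", 30, 90, 0),
    ("SKU-3", "Dublin", "Bakery", 10, 80, 0),
]


def weeks_of_cover(units_sold, on_hand, on_order):
    """Weeks the available stock would last at last week's rate of sale."""
    if units_sold == 0:
        return float("inf")
    return (on_hand + on_order) / units_sold


def units_sold_by_category(rows):
    """Total units sold per product category across all stores."""
    totals = defaultdict(int)
    for _sku, _store, category, sold, _hand, _order in rows:
        totals[category] += sold
    return dict(totals)


# Sell-thru view: stock cover by SKU and store, to spot over or under stocking.
cover = {
    (sku, store): weeks_of_cover(sold, hand, order)
    for sku, store, _category, sold, hand, order in records
}

# Category view: fast and slow selling categories.
category_sales = units_sold_by_category(records)
fastest = max(category_sales, key=category_sales.get)
slowest = min(category_sales, key=category_sales.get)
print(cover)
print("Fastest:", fastest, "Slowest:", slowest)
```

Because both parties would compute these figures from the same records, supplier and retailer see the same picture of stock and demand.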
2.3 Cost for retailers using or not using big data
With data analytics so ingrained in gaining customer information, can retailers
really afford not to move to using big data? (Liozu 2014) explained that with
online, mobile, social media and in-store channels, retailers are struggling to
keep up with customers’ ever changing needs.
(Liozu 2014) also suggested that retailers should invest in the price
intelligence and predictive analytics side of data analytics, as it helps them
figure out what customers want to purchase and when, and helps decide the
optimal price range for each customer. (Gartner 2013) forecast that by 2014 the
big data market would grow by 9% ($80 billion) per year and that 50% of the
growth would be in predictive analysis.
Should retailers purchase data analytics at any cost? From what Gartner is
saying, big data will be hugely significant for retailers going forward.
(Bantleman 2012) wrote about the cost of big data, saying a petabyte Hadoop
cluster will need 125 to 250 nodes, which cost $1 million. Supporting
Hadoop will cost $1 million. Using data analytics is expensive, but there are
more inexpensive alternatives for smaller retailers. (Amazon) explained that
Amazon released Amazon Redshift, a fast, inexpensive, petabyte scale
data warehouse that makes it easy to analyse an organisation’s data using its
existing BI tools. It can cost from as little as 25c per hour and scale up to
petabytes for $1,000 per terabyte per year, less than the traditional data
analytic tools. (Jain 2014) disagreed with (Bantleman 2012) and explained
why retailers do not need to spend big on data analytic tools. A simple trend
focused on one or two key variables, employed the right way, will start a
smaller retailer on the right track. Online customers and customers on social
media can be easily and inexpensively tracked through ‘liked’ clicks on social
media, purchasing history and browsing behaviour.
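To put the Redshift prices quoted above on a yearly footing, the arithmetic can be sketched as follows. This is only a rough calculation under stated assumptions: the entry-level option is taken to run every hour of a 365 day year, a petabyte is taken as 1,000 terabytes, and the function names are hypothetical.

```python
# Assumption: the service runs continuously for a non-leap year.
HOURS_PER_YEAR = 24 * 365


def yearly_cost_hourly(rate_per_hour):
    """Yearly cost in dollars of an option billed per hour, running all year."""
    return rate_per_hour * HOURS_PER_YEAR


def yearly_cost_storage(terabytes, rate_per_tb_year=1_000):
    """Yearly cost in dollars at a flat rate per terabyte per year."""
    return terabytes * rate_per_tb_year


entry_level = yearly_cost_hourly(0.25)      # 25c per hour, as quoted above
one_petabyte = yearly_cost_storage(1_000)   # assumption: 1 PB = 1,000 TB
print(entry_level, one_petabyte)
```

Under these assumptions the 25c per hour option comes to $2,190 per year, and a full petabyte at $1,000 per terabyte comes to $1,000,000 per year.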
The Data Protection Acts of 1988 and 2003 were put in place to protect
customers’ information collected by organisations. These Acts impose
regulatory requirements when it comes to handling private data. (AL Goodbody)
wrote about some of the key issues that need to be considered:
1. Who controls the data
2. Appropriate security measures
3. Consent of the data subjects
4. De-identification of data
2.4 Data
2.4.1 Information Hierarchy
The hierarchy referred to variously as the ‘Knowledge Hierarchy’, the
‘Information Hierarchy’ and the ‘Knowledge Pyramid’ is one of the
fundamental, widely recognized and ‘taken-for-granted’ models in the
information and knowledge literatures. It is often quoted, or used implicitly,
in definitions of data, information and knowledge in the information, information
systems and knowledge management literatures
(Rowley, J., 2007)
(Russell Ackoff), a systems theorist and professor of organizational change,
classified the content of the human mind into 5 categories, as seen in Figure 4
(Bellinger et al 2004):
o Data – Symbols.
o Information – Processed data that returns answers to “who”, “what”,
“where” and “when” questions.
o Intelligence - Application of data and information; answers "how"
questions
o Knowledge - Appreciation of "why"
o Wisdom – Evaluated understanding
Figure 4 Information Hierarchy, Flood, P 2016
The first four categories deal with the past, with only the final category,
wisdom, dealing with the future.
IBM claimed that unstructured and multi-structured data make up 80% of the
data within an organisation (Savvas, A., 2011). With more and more people now
purchasing on-line, retailers need to be able to extract relevant information to
stay competitive, by better predicting what customers want from their habits,
preferences and expectations in order to provide a better shopping experience.
2.4.2 Algorithms
(Microsoft, 2016) wrote that a data mining algorithm is a set of heuristics and
calculations that creates a data mining model from data. The algorithm
analyses the data provided and searches for specific types of patterns or
trends. The results of the analysis are then used by the algorithm to
determine the optimal parameters for creating the mining model. The
parameters are then applied across the entire data set to extract
actionable patterns and detailed statistics.
2.5 Models
2.5.1 K - Nearest Neighbour Model
The K – Nearest Neighbour Model (KNN) is a simple, versatile model used in
classification. (Thirumuruganathan, S., 2014) said that KNN is a
non-parametric lazy learning algorithm. Non-parametric means that KNN does not
make assumptions on the underlying data distribution.
1. Nearest neighbour outcome is a plus
2. Nearest neighbour outcome is an unknown
4. Nearest neighbour outcome is a minus
Figure 5 K-Nearest Neighbour Model
Figure 5 demonstrates K-nearest neighbour analysis, in which a new object is
classified among a number of known examples. In figure 5, the aim is to
determine whether the query point (the orange dot) should be classified as a
plus or a minus sign.
Consider first the outcome of KNN based on 1 nearest neighbour. It is clear
from figure 5 that the result will be a minus since the nearest point is a
minus sign. If the number of nearest neighbours is increased to 2, KNN will not
be able to classify the query point: the second closest point is a plus, so
both outcomes receive the same number of votes. If the number of nearest
neighbours is increased to 4, this identifies the nearest neighbour region,
indicated in the figure with a circle around it. Since the region contains 3
plus signs and 1 minus sign, the outcome of the query point will be a plus.
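The voting procedure described above can be sketched in a few lines of Python. This is a minimal illustration only: the coordinates in `TRAINING` and the helper `knn_classify` are hypothetical, chosen so that the closest point to the query is a minus, the second closest is a plus, and the four closest contain three pluses and one minus.

```python
from collections import Counter
from math import dist

# Hypothetical labelled points (not taken from figure 5): ((x, y), label).
TRAINING = [
    ((1.0, 0.0), "-"),
    ((0.0, 1.5), "+"),
    ((2.0, 0.0), "+"),
    ((0.0, -2.5), "+"),
    ((4.0, 4.0), "-"),
    ((-5.0, 3.0), "-"),
]

def knn_classify(query, training, k):
    """Return the majority label among the k nearest points, or None on a tie."""
    nearest = sorted(training, key=lambda item: dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest).most_common()
    if len(votes) > 1 and votes[0][1] == votes[1][1]:
        return None  # both outcomes received the same number of votes
    return votes[0][0]

query = (0.0, 0.0)
for k in (1, 2, 4):
    print(k, knn_classify(query, TRAINING, k))
```

With these assumed points, k = 1 gives a minus, k = 2 gives no decision and k = 4 gives a plus, which is why an odd value of k is often preferred for two-class problems.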
2.5.2 C4.5
The C4.5 model creates a classifier in the form of a decision tree. To do
this, C4.5 is given a set of data representing items that are already
classified.
A decision tree is similar to a flow chart. Each tree structure has a root node,
which is Age in figure 6, and branches, which are shown by Young, Middle Aged
and Senior. Further decision nodes are represented by Student? and
Credit_Rating?, and the leaf nodes by Yes and No.
Figure 6 C4.5 Decision Tree (root node: Age; branches: Young, Middle Aged,
Senior; decision nodes: Student?, Credit_Rating?; leaf nodes: Yes, No)
Using a C4.5 decision tree has benefits:
o Small decision trees are easy to understand
o Does not require domain knowledge
o Works well with redundant attributes
o The classification steps of a decision tree are straightforward and
rapid
Disadvantages of using C4.5 are:
o Irrelevant attributes may affect the creation of a decision tree
o Slight changes in the data can create very different looking decision
trees
o Too many classes can cause errors
o Sub trees can be replicated many times
o Inadequate for forecasting the value of a continuous class attribute
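To make concrete how a tree such as the one in figure 6 comes to have a particular attribute at its root, the sketch below computes the gain ratio, the splitting criterion C4.5 uses to choose between attributes. The records in `ROWS` and the target name `buys` are hypothetical, and the sketch leaves out the other parts of C4.5 (continuous attributes, missing values and pruning).

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())

def gain_ratio(rows, attribute, target):
    """Information gain of splitting on `attribute`, divided by the split information."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    gain = entropy([row[target] for row in rows]) - remainder
    split_info = entropy([row[attribute] for row in rows])
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical, already classified records (assumed for illustration only).
ROWS = [
    {"age": "young", "student": "yes", "credit_rating": "fair", "buys": "yes"},
    {"age": "young", "student": "no", "credit_rating": "fair", "buys": "no"},
    {"age": "middle aged", "student": "no", "credit_rating": "fair", "buys": "yes"},
    {"age": "middle aged", "student": "yes", "credit_rating": "excellent", "buys": "yes"},
    {"age": "senior", "student": "no", "credit_rating": "fair", "buys": "yes"},
    {"age": "senior", "student": "yes", "credit_rating": "excellent", "buys": "no"},
]

for attribute in ("age", "student", "credit_rating"):
    print(attribute, round(gain_ratio(ROWS, attribute, "buys"), 3))
```

The attribute with the highest gain ratio would be placed at the current node, and the same calculation would then be repeated on each branch.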
2.6 Data Security Risks
With the increased use of business intelligence comes a bigger concern for the
security of the company’s data. (McHugh, R., 2015) wrote that 44% of Irish
companies are worried that big data will increase security risks. Companies
have been working on computer security for the past 30 years. In that time a
huge number of successful advances have been made, such as public key
encryption, multi-level security and cryptographic protocols, but
unfortunately attackers also advance, at faster rates. With more and more
customers’ information being saved by retailers, customers need to feel
secure in knowing that their data is safely stored.
2.7 Methods of Collection
2.7.1 Loyalty Cards
Superquinn introduced loyalty cards to Irish retail in 1993 with their
SuperClub loyalty card scheme. Loyalty cards did not grow in popularity with
other retailers until Tesco introduced their loyalty card scheme in 1997. Most
large retailers that have Business Intelligence (BI) also have a loyalty program
for their customers, offering vouchers, coupons and discounts off items. This is
a small incentive in return for the information that customers are giving the
retailers. (Sharpe, 97) asked, do loyalty programs increase loyalty? Retailers
use BI to keep existing customers by tracking what they purchase and aiming
discount vouchers at their purchases. This helps the retailers keep track of the
customers and increase loyalty to their brand as seen in figure 7. Retailers
achieve this through deep purchase history analysis and customer attribution
(Sciencesoft, 2014).
In a study by Northwestern University, one in five people studied used social
media as a form of communication with the retailers they used most frequently.
Among that 20%, Facebook was the most dominant channel for discounts and brand
news (Beck et al). Retailers are trying to find new and unique ways of
attracting new customers and keeping existing customers.
Figure 7 Analysing Customer Data Source: Slideshare.net
In the UK £1 in every £7 is spent on Tesco products or services (Smith, Woods,
2008). This commanding position has been supported by Tesco’s popular
Clubcard, which is in 1 out of every 2 households in the UK (Davey, 2009).
(Armstrong, 2016) wrote that 9 out of 10 people (87%) in Ireland are signed
up to loyalty schemes and the average person participates in up to 4
programs as seen in figure 8.
Figure 8 Loyalty Cards Ireland Source:
http://irishtechnews.net/ITN3/new-shopping-loyalty-app-is-launched-for-the-irish-market/
2.7.2 Security
Most shoppers are unaware that the loyalty cards they use every day are being
used by the retailer to build a profile of their shopping patterns for later
use. The data collected can be sent out to telemarketers and business partners.
Retailers, however, do not say how or how often the discount card scheme is
outsourced to other companies.
Safeway's privacy policy, for instance, states the following:
The information we receive depends on what you do when you visit our stores
and use your card. We collect and store your name, address, home telephone
number, and birth date if provided by you. If you are in an area where we
offer electronic checking and apply for this service, we also ask for
information such as your driver's license number and bank and credit card
account numbers. When you make purchases, we record data about the
transaction; including the amount and content of your purchases and the time
and place these purchases are made
(Flaherty 2011).
2.7.3 Dublin Airport
In May 2013 Dublin Airport introduced facial recognition technology at its
biometric gates to speed passengers through border control. This allowed
passengers to be verified using facial recognition and processed in 7.5 seconds
each (futuretravelexperience.com, 2013). Not only will this enhance the
travelling experience for passengers but facial recognition will also
strengthen border security in the airport.
2.7.4 On-Line Recommendations
Some on-line companies try to predict their customers’ wants based on
searches they made on their site. Companies like Amazon use
recommendation algorithms to tailor recommendations for their
customers by following searches, purchases, length of time on site, duration of
view of an item, hovered-over items, wish lists, shopping cart activities and
rated items (Ulanoff, 2014). How Amazon’s algorithm works is illustrated below.
2.7.5 Amazon’s Similarity Algorithm
The algorithm works by finding groups of customers whose purchased and rated
items overlap with the user’s purchased and rated items (Schafer, Konstan,
Reidl, 2001). The algorithm then gathers the items from similar customers and
disregards the items that the user has already rated or purchased. The
algorithm then recommends the remaining items to the user.
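The steps in the paragraph above can be sketched as follows. This is an illustrative simplification, not Amazon's implementation: the customer names, the item names in `HISTORIES` and the function `recommend` are hypothetical, and similarity is reduced to "shares at least one item with the user".

```python
from collections import Counter

# Hypothetical purchase/rating histories: customer -> set of items.
HISTORIES = {
    "user": {"kettle", "toaster"},
    "cust_a": {"kettle", "toaster", "blender"},
    "cust_b": {"kettle", "mugs", "blender"},
    "cust_c": {"garden hose"},
}

def recommend(user, histories, top_n=3):
    """Recommend items held by overlapping customers that the user lacks."""
    own = histories[user]
    counts = Counter()
    for other, items in histories.items():
        if other == user or not (items & own):
            continue  # skip the user and customers with no overlapping items
        counts.update(items - own)  # disregard items the user already has
    return [item for item, _ in counts.most_common(top_n)]

print(recommend("user", HISTORIES))
```

Items are ranked here by how many overlapping customers hold them, which is one simple way of ordering the remaining items.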
The algorithm represents a customer as an N-dimensional vector of items, where
N is the number of distinct items in the catalogue. The components of the
vector are positive for items that have been purchased or positively rated and
negative for items that have been negatively rated. To compensate for
Amazon’s bestselling items, the algorithm multiplies the vector components by
the inverse frequency. This makes less known products more important (Linden,
Smith, York, 2003).
Figure 9 Amazon Recommendation Algorithm Source:
http://www.cin.ufpe.br/~idal/rs/Amazon-Recommendations.pdf
The algorithm bases its recommendations on a few customers who are most
similar to the user. The algorithm measures the similarity of customers A
and B in figure 9 as the cosine of the angle between their vectors.
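The cosine measure can be illustrated with a short sketch. The two vectors below are hypothetical customers over an assumed four-item catalogue (1 for purchased or positively rated, -1 for negatively rated, 0 for no interaction); the inverse-frequency weighting mentioned earlier is left out for brevity.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a customer with no history is treated as dissimilar
    return dot / (norm_a * norm_b)

# Hypothetical customers A and B over a four-item catalogue.
customer_a = [1, 1, 0, -1]
customer_b = [1, 0, 1, -1]

print(cosine_similarity(customer_a, customer_b))
```

A value close to 1 means the two customers' vectors point in nearly the same direction, so their purchasing and rating behaviour is similar; a value near 0 means they have little in common.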
2.7.6 In-store Tracking
Retailers are now using more technology than ever to track customer
behaviour while in the shop. This is becoming popular with younger shoppers
who like the interaction in their shopping experience. Carrefour in France uses
tracking technology to send coupons to smart devices as customers walk
down aisles passing certain products (Pope, 2015). According to software firm
CSC, 30% of retailers are now using facial recognition to track customers
while in the store (McDonald, 2015). The report also stated that 74% of
stores use technology to track customers while in the store, while a quarter
of customers say it gives a good shopping experience.
Many people accept that younger people are happier to have their information
used by retailers for this purpose, while not fully understanding how the
information is being used and the security risks that are associated with it.
Retailers need to explain to their customers that they collect the data for the
benefit of the customer and not for the retailer.
Retailers in America are using facial recognition not only to track customers
in store but also to help with security and to stop theft, which costs the
retail industry in America $32 billion (Wahba, 2015). Facial recognition used
as a security measure will recognise known thieves.
Facial recognition is not used in Ireland for shopping or security. Is Ireland
slow to catch on to new technology in retail or do Irish shoppers not want to
change? Irish customers are known for their brand loyalty. An example of this
is that in 1996 Superquinn rolled out the concept of a ‘super scanner’ that
would end the dreaded queues in their stores as seen in Figure 10. This was
based on trust as customers would scan the items’ bar codes as they were put
in the basket or trolley. The ‘super scanner’ would total up the amount and
the customer would pay from their account.
A random selection of customers’ trolleys would be checked, in which case the
customer would have to check out the old way. This was to help avoid
overcharging. Mr Eamonn Quinn said the project was self-financing as it would
attract new customers and increase business (Yeates, 1996).
Figure 10 Super Scanner used in Superquinn Source:
http://ecir2011.dcu.ie/ecommpracticums/main/2006/G.pdf
While Irish retailers seem to be slow to advance, Superquinn seems to be an
innovator in this area. They have piloted some of the world’s most advanced
technology in retail such as self-scanning shopping as seen above,
multi-function kiosks, digital shelf-labels and mobile checkout technology
(Labarre, 2001).
3.0 Methodology
The strategy of the research that is to be undertaken will be outlined in this
chapter, in relation to the distinct research questions, reaffirmed below:
1. What size shops are using data analytics?
2. Is it cost based not to use data analytics?
3. Are Irish retailers using data analytics to its full potential?
To answer the above questions, the researcher used a quantitative
methodology as it allowed retailers of different sizes within Ireland to be
interviewed with the aim of answering the research questions. Quantitative
methodology is described by Burns and Groves:
Quantitative research is a formal, objective, systematic process in
which numerical data are used to obtain information about the
world.
(Burns, Groves,2005)
Other methods were considered such as qualitative and mixed methodology.
Qualitative methodology as described by Pope and Mays:
The goal of qualitative research is the development of concepts
which help us to understand social phenomena in natural (rather
than experimental) settings, giving due emphasis to the meanings,
experiences, and views of all participants.
(Pope C, Mays N, 1995)
Mixed methodology was described by (Leech and Onwuegbuzie 2006):
Because of its logical and intuitive appeal, providing a bridge between the qualitative
and quantitative paradigms, an increasing number of researchers are utilizing mixed
methods research to undertake their studies.
(Leech and Onwuegbuzie 2006)
Reasons for and against adopting these methodologies are explained below.
3.1 Research
Research can at times be mistaken for gathering information, documenting
facts, and rummaging for information (Leedy & Ormrod, 2001). Research is
the process of gathering and measuring information on variables of interest, in
a traditional, structured fashion that permits one to answer stated research
questions, test hypotheses, and evaluate outcomes.
This data collection component of research is common to all fields of study
(Jacob, 2015).
(Brannick & Roche, 1997) wrote that
good research is purposeful, has clearly defined goals, and significant, the
methodology procedures are defensible, evidence is systematically analysed,
statistical techniques are correctly followed and the objectivity of the researcher is
clearly evident.
(Brannick & Roche, 1997)
Qualitative and quantitative are two ways of conducting research studies.
(Minchiello 1990) stated that qualitative research is where data is collected
through participation and interviews. The interviewee’s data is analysed from
descriptions, and the interviewee’s language is the data used. Quantitative
research is the measuring of data. The data is analysed by using numerical
comparisons and statistical inferences.
3.1.1 Research Philosophy
For this research paper the researcher will be using quantitative research. Due
to time constraints and the cost of conducting individual interviews, the
researcher decided to use quantitative research in the form of a questionnaire.
SurveyMonkey was used to create and distribute the questionnaire.
SurveyMonkey is an on-line survey software provider, with tools for creating,
modifying, distributing and analysing questionnaires sent out to candidates.
Distribution of the survey will be carried out by text, email, social media and
web links.
The initial draft was created using the tools on SurveyMonkey and distributed
to numerous retailers using SurveyMonkey’s own website. The following
drafts were created on SurveyMonkey’s website and distributed individually
and by group via social media, text and email. An issue did occur when
sending the survey to respondents via SurveyMonkey’s own website: the
respondents were not receiving the emails. To counteract this problem, the
questionnaires were sent out individually by personal email to the respondents.
3.1.2 Analysing Research Data
Once all the questionnaires are completed on SurveyMonkey, the results will
be inputted into IBM SPSS, a software package for statistical analysis. By
using SPSS to input the results of the questionnaire, the researcher will look
for variables in the results to create the answers for the research questions.
These answers will then be analysed and broken down by the 9 questions
asked. The results will be put to the three research questions and presented
in the discussion section.
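The kind of breakdown described above can be illustrated with a small cross-tabulation. The sketch below is written in Python rather than SPSS, and the responses in `RESPONSES` are invented for illustration; they are not results from the questionnaire.

```python
from collections import Counter

# Hypothetical questionnaire responses: (shop size, uses data analytics?).
RESPONSES = [
    ("small", "no"), ("small", "no"), ("small", "yes"),
    ("medium", "yes"), ("medium", "no"),
    ("large", "yes"), ("large", "yes"),
]

def crosstab(responses):
    """Count the responses for each (shop size, answer) pair."""
    return Counter(responses)

def share_using_analytics(responses, size):
    """Fraction of respondents of the given shop size who answered 'yes'."""
    answers = [answer for s, answer in responses if s == size]
    return answers.count("yes") / len(answers) if answers else 0.0

table = crosstab(RESPONSES)
for size in ("small", "medium", "large"):
    print(size, table[(size, "yes")], table[(size, "no")],
          round(share_using_analytics(RESPONSES, size), 2))
```

A table of this shape relates directly to the first research question, since it shows the uptake of data analytics for each size of shop.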
The expected outcome of the research is:
1. To find out what size retailers in Ireland are using big data.
2. Is cost the main reason not to use big data?
3. Are Irish retailers using big data to its full potential?
3.1.3 Qualitative versus Quantitative
Primary data can be quantitative or qualitative as shown in Figure 11.
(Malhotra, 2000) wrote that the distinction between qualitative and
quantitative research closely matches the distinction between exploratory and
conclusive research.
Conceptual
Qualitative:
o Concerned with understanding human behaviour from the informant’s
perspective.
o Assumes a dynamic and negotiated reality.
Quantitative:
o Concerned with discovering facts about social phenomena.
o Assumes a fixed and measurable reality.
Methodology
Qualitative:
o Data are collected through participant observation and interviews.
o Data are analysed by themes from descriptions by informants.
o Data are reported in the language of the informant.
Quantitative:
o Data are collected through measuring things.
o Data are analysed through numerical comparison and statistical inferences.
o Data are reported through statistical analysis.
Source Minchiello et al (1990, p5)
Figure 11 Classification of Market Research Data (boxes: Marketing Research
Data; Secondary Data; Primary Data; Qualitative Data; Quantitative Data;
Exploration; Description; Cause and Effect) Source: Marketing Research
Malhotra, N
3.2 Research Objectives
This dissertation aims to identify whether Irish retailers are using data
analytics to its full potential. The following are the objectives that this
research will address:
o What size shops are using data analytics?
o Is it cost based not to use data analytics?
o Are Irish retailers using data analytics to its full potential?
3.3 Qualitative Methodology
The goal of qualitative research is the development of concepts which help us to
understand social phenomena in natural (rather than experimental) settings, giving
due emphasis to the meanings, experiences, and views of all participants.
(Pope C, Mays N, 1995)
The categorisation of qualitative research procedures is shown in Figure 12.
These are categorised as either direct or indirect, based on whether the
interviewees know the true purpose of the assignment. The direct method is not
disguised: the reason for the assignment is revealed to the interviewees or is
obvious to the respondents from the questions asked. The main direct interview
techniques are in-depth interviews and focus groups.
The indirect method conceals the exact reason for the assignment. The main
indirect interview techniques are projective and observation techniques.
Qualitative Research Procedures
o Direct (Non-Disguised): Group Interviews, Depth Interviews
o Indirect (Disguised): Observation Techniques, Projective Techniques
Figure 12 Classification of Qualitative Research Procedures Source: Marketing
Research Malhotra, N
(Travers, M. 2002) wrote about different methods of qualitative research:
1. Observation – numerous sociological studies on courtrooms have been
built on observations.
2. Interviewing – individually or in focus groups e.g. to get the different
viewpoints of police officers, lawyers, probation officers.
3. Ethnographic fieldwork – frequently requires devoting a lot of time
to a group e.g. an anthropologist studying another culture.
4. Discourse analysis – close study of communication e.g. a recording of
a doctor’s advice to a patient.
5. Textual analysis – analysis of textual and multimedia items e.g. letters
written, diary, files, web sites, notice message boards.
3.4 Sampling Technique
Sampling Techniques
o Non-probability sampling techniques: Convenience sampling, Judgemental
sampling, Quota sampling, Snowball sampling
o Probability sampling techniques: Simple random sampling, Systematic
sampling, Stratified sampling, Cluster sampling, Other sampling techniques
Figure 13 Classification of sampling techniques Source: Marketing Research p353
Sampling techniques as seen in figure 13 can be classified as probability and
non-probability sampling. Probability sampling is where sampling units
are selected by chance.
o It is possible to pre-specify every potential sample of a given size.
o Each potential sample need not have the same probability of
selection.
o It is possible to specify the probability of selecting any particular
sample of a given size.
o It requires an accurate definition of the target population and also a
general specification of the sampling frame.
o It is possible to determine the accuracy of the sample estimates of the
characteristics of interest.
o Confidence intervals, which contain the true population value with
a given level of certainty, can be calculated.
o This allows the researcher to make inferences or projections about the
target population from which the sample is obtained.
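The confidence calculation mentioned in the list above, commonly called a confidence interval, can be sketched as follows. The sample values and the function name are hypothetical, and the sketch assumes a normal approximation with the usual multiplier of 1.96 for a 95% level; for very small samples a t-distribution multiplier would be more appropriate.

```python
from math import sqrt
from statistics import mean, stdev

def confidence_interval_95(sample):
    """Approximate 95% confidence interval for the population mean."""
    centre = mean(sample)
    standard_error = stdev(sample) / sqrt(len(sample))
    margin = 1.96 * standard_error  # normal approximation (an assumption)
    return centre - margin, centre + margin

# Hypothetical sample, e.g. weekly spend in euro reported by five shoppers.
sample = [10, 12, 14, 16, 18]
low, high = confidence_interval_95(sample)
print(round(low, 2), round(high, 2))
```

The interval narrows as the sample size grows, which is the sense in which probability sampling lets the researcher state how accurate a sample estimate is.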
The classification of probability sampling techniques is based on:
1. Element versus cluster sampling
2. Equal unit probability versus unequal probabilities
3. Unstratified versus stratified selection
4. Random versus systematic selection
5. Single-stage versus multistage techniques
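Three of the probability sampling techniques shown in figure 13 can be sketched with Python's standard `random` module. The population of 100 numbered retailers and the two strata below are hypothetical, and the systematic sampler assumes the requested sample size is no larger than the population.

```python
import random

def simple_random_sample(population, n, rng):
    """Every element has the same chance of selection."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Pick a random start, then take every step-th element (assumes n <= len)."""
    step = len(population) // n
    start = rng.randrange(step)
    return [population[start + i * step] for i in range(n)]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample separately within each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

rng = random.Random(42)          # fixed seed so the sketch is repeatable
POPULATION = list(range(100))    # hypothetical retailer identifiers
STRATA = {"small": POPULATION[:80], "large": POPULATION[80:]}

print(simple_random_sample(POPULATION, 5, rng))
print(systematic_sample(POPULATION, 5, rng))
print(stratified_sample(STRATA, 2, rng))
```

In each case selection is left to chance rather than to the researcher's judgement, which is what distinguishes these techniques from the non-probability techniques that follow.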
Non-probability sampling relies on the individual judgement of the researcher
rather than on chance.
o The interviewer can deliberately and subjectively decide what
elements to put in the sample.
o Non-probability sampling may provide good approximations of population
characteristics.
o It does not permit an objective assessment of the accuracy of
sample results.
o Commonly used techniques include quota sampling, snowball
sampling, convenience sampling and judgemental sampling.
3.4.1 Qualitative Sampling
Researchers can use different strategies for qualitative sampling as seen in
Figure 14. Each gives its own advantages and disadvantages.
Figure 14 Qualitative Research Design, Source Maxwell J 2013, Qualitative Research Design an
Interactive Approach
3.4.2 Snowball Sampling
1. This is the most common method of sampling
2. An advantage of this method is that one interviewee refers the
researcher to another interviewee.
3. A disadvantage of snowball sampling is that the sample may be
limited because it consists of interviewees who belong to the
networks of the index cases. (Hardon, A, Hodgkin, C, Fresle, D,
2004)
4. After the interview the respondents are asked to identify others
who belong to the same targeted population.
5. Subsequent respondents are selected based on
recommendations.
6. By obtaining recommendations from recommendations, this
procedure is carried out in waves, thus leading to a snowball
effect.
7. Even though the initial respondents may be selected by
probability sampling, the final sample is a non-probability
sample.
8. The people recommended will have psychographic and
demographic characteristics more similar to the person
referring them than would occur by chance (Frankwick, G.L. et
al 1994)
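The wave-by-wave procedure in the list above can be sketched as a walk over a referral network. The network in `REFERRALS` and the respondent names are hypothetical; in practice the referrals only become known as each interview is carried out.

```python
# Hypothetical referral network: respondent -> people they recommend.
REFERRALS = {
    "retailer_1": ["retailer_2", "retailer_3"],
    "retailer_2": ["retailer_4"],
    "retailer_3": ["retailer_4", "retailer_5"],
    "retailer_4": [],
    "retailer_5": ["retailer_1"],
}

def snowball_sample(seeds, referrals, max_waves):
    """Grow a sample in waves by following each respondent's recommendations."""
    sampled = list(seeds)
    current_wave = list(seeds)
    for _ in range(max_waves):
        next_wave = []
        for respondent in current_wave:
            for referred in referrals.get(respondent, []):
                if referred not in sampled:
                    sampled.append(referred)
                    next_wave.append(referred)
        if not next_wave:
            break  # no new recommendations, so the snowball stops growing
        current_wave = next_wave
    return sampled

print(snowball_sample(["retailer_1"], REFERRALS, max_waves=2))
```

Because everyone after the seed is reached through a recommendation, the final sample depends on the network of the first respondents, which is the limitation noted in point 3.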
3.4.3 Convenience Sampling
1. Getting a sample from whoever is available to give the sample
2. An advantage of this method is getting a different view of an
interviewee who might not ordinarily be interviewed
3. A disadvantage would be you could interview a person who has
no information on your sample.
4. Convenience sampling is the least time consuming and least
expensive of all sampling techniques.
5. The sampling units are easy to measure, accessible and co-
operative.
6. Convenience sampling has serious limitations.
7. Many potential sources of selection bias are present, including
respondent self-selection.
8. Convenience samples are not representative of any definable
population.
9. They are not suitable for marketing research assignments
involving population inferences.
10. Convenience samples are not suitable for descriptive or causal
research but can be used in investigative research for creating
ideas, hypotheses or understandings.
3.4.4 Judgemental Sampling
1. Judgemental sampling is a type of convenience sampling in
which the researcher selects the population elements based on
their judgement.
2. The researcher, exercising knowledge or judgement, selects the
elements to be involved in the sample as they believe that they
are representative of the population of interest.
3. Judgemental sampling is used when there is a limited number
of individuals.
4. It is the only viable sampling technique in obtaining
information from a very specific group of individuals.
3.4.5 Quota Sampling
1. Quota sampling can be regarded as two-stage restricted
judgemental sampling that is used extensively in street
interviewing.
2. Stage one comprises of creating control characteristics, or
quotas, of population elements such as age and gender.
3. The researcher lists relevant control characteristics and
determines the distribution of these characteristics in the target
population.
4. In the second stage, sample elements are selected based on
convenience or judgement.
5. Once the quotas have been allocated, there is significant
freedom in choosing the elements to be included in the sample.
6. The only obligation is that the elements selected fit the control
characteristics.
7. It may appear that the quota sampling technique is totally
representative of the population. In some cases, it is not.
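The two stages described above can be sketched as follows: the quotas are fixed first, and candidates are then accepted in whatever order they are encountered until each quota is full. The names, the control characteristic (gender) and the quota sizes are hypothetical.

```python
def quota_sample(candidates, quotas, key):
    """Accept candidates in the order met until every quota is filled."""
    remaining = dict(quotas)
    selected = []
    for candidate in candidates:
        group = key(candidate)
        if remaining.get(group, 0) > 0:
            selected.append(candidate)
            remaining[group] -= 1
    return selected

# Hypothetical passers-by in the order the interviewer meets them.
CANDIDATES = [
    ("Anna", "female"), ("Brian", "male"), ("Ciara", "female"),
    ("Dara", "male"), ("Emma", "female"), ("Fionn", "male"),
]
QUOTAS = {"female": 2, "male": 1}  # stage one: control characteristics

print(quota_sample(CANDIDATES, QUOTAS, key=lambda person: person[1]))
```

The sample matches the quotas exactly, yet who is chosen within each quota is decided by convenience, which is why the result is not guaranteed to be representative.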
3.5 Semi Structured
A semi structured interview uses a list of open ended questions that can change
from interview to interview. (Kvale, 1996) wrote that semi structured
interviews are a form of human interaction in which knowledge evolves through
dialogue. Face to face interviews are thought to give the interviewer the
highest response rates (Nueman, 2007). The interviewer can also have a list of
follow up probing questions to gather further information on any answer given
to the questions (Curran, 2014).
(Saunders, Lewis, Thornhill, 1997) wrote that in semi structured interviews:
o Questions can vary from interview to interview
o Questions can be left out and added
o The order of questions may vary.
Advantages and disadvantages of semi structured interviews
Advantages
o Acquires relevant information
o The interviewees are specifically targeted
o Structured so as to allow contrast
o Gives the freedom to explore general views or opinions in more detail
o Can be used for sensitive topics
o Can use an external organisation so as to retain independence
Disadvantages
o Interviewing skills are essential
o Need to meet sufficient people in order to make general contrasts
o Preparation must be carefully planned so as not to make the questions
prescriptive or leading
o Need to have skills at analysing the data
o Time consuming and resource intensive
o You have to be able to ensure confidentiality
3.5.1 Unstructured
Unstructured interviews are in-depth exploration interviews. The interviewer
has no predetermined list of questions. This allows the interviewee to freely
talk about the questions asked.
The questions are open ended allowing the respondents to answer in their own
words. Open ended questions are good questions to start an interview. This
allows the respondents to express opinions that enables the interviewer to
interpret the responses to the questions. These questions have a less biased
influence on the interviewee’s answers than structured questions. The
interviewees are free to express any views (Malhotra, Birks, 2000). The
researcher gets a substantial insight from these comments and explanations.
(Malhotra, Birks 2000) argued that the main disadvantage of unstructured
interviewing is that the potential for interviewer bias is high. Another
disadvantage of unstructured interviewing is that coding the responses is
costly and time consuming (Jones, 1981) and (MacDonald, 1982). Unstructured or
open ended questions give extra weight to interviewees who are more articulate.
Unstructured questions are not suitable for self-administered questionnaires
such as post and telephone interviews, as respondents tend to be briefer in
writing than in speaking.
3.5.2 Structured
Structured interviewing is the use of questionnaires with a
predetermined set of questions.
Strengths of Structured Interviews
o It allows the researcher to examine the knowledge a respondent has about
the topic.
o This is an important form of formative assessment. It can be used to assess
a respondent’s feelings on a particular topic before using a second method.
o All the interviewees are asked the same questions.
o Offers a reliable source of quantitative data.
o The interviewer is able to contact large numbers of people quickly, easily
and efficiently.
o It is fast and straightforward to generate, code and interpret.
o A formal relationship is created between the interviewer and the
respondent.
o The interviewer does not have to worry about incomplete questionnaires,
biased questionnaires and response rates with structured interviews.
Weaknesses/Limitations of Structured Interviews
o This can be time consuming if the group is very large.
o The quality and practicality of the information received is dependent on
the quality of the questions asked. The assessor cannot add or subtract
questions.
o A considerable amount of preplanning is needed.
o The design of the questionnaire makes it problematical for the interviewer
to inspect difficult issues and views.
o Answering the questions gives limited scope to expand in detail or depth.
o The interviewer can influence answers of a respondent by their presence,
making the responses biased.
o The interviewer by designing the questionnaire has decided in advance
which questions they consider important and unimportant.
(Source: http://www.sociology.org.uk/methsi.pdf)
3.6 Quantitative Methodology
Quantitative research is a formal, objective, systematic process in which numerical
data are used to obtain information about the world.
(Burns, Groves,2005)
This research method is used:
o to describe variables;
o to examine relationships among variables;
o to determine cause-and-effect interactions between variables.
Quantitative observation involves studying the behavioural patterns of
people, objects and events in a systematic manner to obtain information
about the phenomenon of interest.
3.6.1 Independent Variable
Independent variables are variables or alternatives that are manipulated by
the researcher and whose effects are measured and compared. They are also
known as treatments. These could be price levels, marketing themes and
package design.
1. This is what the researcher expects will affect the dependent variable.
2. The researcher sets out to control the effect on the dependent variable.
3.6.2 Dependent Variable
Dependent variables are variables that measure the effect of the independent
variables. These variables can be sales, profits and market shares.
1. The researcher expects that this will be affected by manipulating the
independent variable
2. It can be measured.
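The relationship between the two kinds of variable can be illustrated with a simple least-squares slope, which estimates how much the dependent variable changes for each unit change in the independent variable. The price levels and sales figures below are invented for illustration, and the function name is hypothetical.

```python
from statistics import mean

def least_squares_slope(independent, dependent):
    """Estimated change in the dependent variable per unit of the independent one."""
    x_bar, y_bar = mean(independent), mean(dependent)
    numerator = sum((x - x_bar) * (y - y_bar)
                    for x, y in zip(independent, dependent))
    denominator = sum((x - x_bar) ** 2 for x in independent)
    return numerator / denominator

# Hypothetical experiment: price level set by the researcher, sales observed.
price_levels = [1.0, 2.0, 3.0, 4.0]   # independent variable (treatment)
weekly_sales = [100, 90, 80, 70]      # dependent variable (measured effect)

print(least_squares_slope(price_levels, weekly_sales))
```

In this invented data each one-unit increase in price is associated with ten fewer sales; a moderating variable, discussed next, would be something that changes the size of that effect.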
3.6.3 Moderating Variable
(Olsen, W, K)
1. A moderating variable represents a process or a factor that alters the
impact of an independent variable.
2. Has a strong contingent effect on the independent and the dependent
variable relationship (Hussey, J. and Hussey, R., 1997).
3.7 Mixed Methodology
Mixed methodology involves both quantitative and qualitative research as
seen in Figure 15. To include only qualitative or quantitative methods falls
short of the major approaches used in social and human sciences (Creswell,
2003). It includes philosophical assumptions from the approaches of
quantitative and qualitative methodologies.
One of the key benefits of mixed methodology is the capability to match the
method to the requirements of the study. (Migiro, Magangi, 2011) wrote that
mixed methodology, which lets the researcher get a sense for the vital issues
on the subject before embarking on further development in a study, can be very
valuable. (Creswell & Plano Clark, 2007) wrote that mixed methodologies are
more than simply accumulating and analysing both kinds of data; they also
comprise the use of both methods in tandem so that the overall strength of a
study is greater than either qualitative or quantitative research alone.
Figure 15 Methods of Interviewing Source: Creswell 2003
Mixed methodologies are evolving into a dominant form of research.
However, (Symonds, Gorard, 2008) wrote that the concept of mixed
methodologies has logical foundations rooted more in philosophy than in
pragmatic reality. (Morse 2005), while accepting the new field, admitted to a
sense of hearsay because the “sudden faddishness of mixed methods” had
brought to the fore awkward and unanswered questions about mixing
qualitative and quantitative methods within a single set. Others, such as
(Bazeley 2004), warned about the risks of this “new” methodology and the
paradigmatic and methodological issues that could arise. (Giddings and Grant
2007) called mixed methods a bastardisation of positivism. (Giddings 2006)
had issued this frank warning:
Clothed in a semblance of inclusiveness, mixed methods could serve as a
cover for the continuing hegemony of positivism, and maintain the
marginalisation of non-positivist research methodologies. I argue here that
mixed methods as it is currently promoted is not a methodological movement,
but a pragmatic research approach that fits most comfortably within a post
positivist epistemology.
(Giddings, 2006)
The case for and against the use of mixed methodology is argued below.
For
o All singular methods (interview, survey) and all data types (audio,
visual) can be categorised under one of two paradigms (quantitative,
qualitative).
o Elements from both paradigms can cohabit in a single investigation.
o A third classification is required to refer to investigations which use
elements of both paradigms.
Against
o Qualitative and quantitative research are separate, contrasting paradigms.
o Integrated research approaches can lead to the threat of discounting
assumptions underlying research methods.
o It can be time consuming and expensive.
o Researchers need to have experience in both qualitative and quantitative
methods.
3.8 Ethics
Organisations using big data reap the benefits of customers’ data: detecting
fraud, customising customer services and using the company’s resources
efficiently. Why then are people so concerned with the ethics that come with
data analytics? Chessell (2014) wrote that the technology itself is inherently
ethics-agnostic, but it does push the art of the possible to new limits in terms of:
o The availability of a wide variety of data from several sources.
o The ability to correlate the data to understand the bigger picture.
o The ability to accurately identify and target individuals.
o The ability to identify someone’s location for surveillance.
Ethics in quantitative research is extremely important. This ranges from the
anonymity of the participants to the non-disclosure of sensitive information.
Ethical issues arise in market research for numerous reasons, because market
research involves communicating with respondents and the general public
through data collection, distribution of the research conclusions and
advertising campaigns based on those conclusions.
This gives the potential to abuse the information by taking advantage of the
participants involved. Data protection and big data are very important, and
organisations should think about them hand in hand. In Ireland, private data
is covered by the Data Protection Act (DPA) of 1988 and 2003, which sets
regulatory requirements for handling private data. (AL Goodbody) wrote about
some of the key issues that need to be considered:
1. Who controls the data
2. Appropriate security measures
3. Consent of the data subjects
4. De-identification of data
3.8.1 Data Collection
1. All information received on SurveyMonkey will be destroyed at the
end of this research project.
2. All records taken for this research paper will be kept confidential and
will only be used by the researcher.
3. A duty of care is owed to the weak and vulnerable.
4. The rights of the weak and vulnerable will be protected and defended.
5. All interviewees partake willingly in the research.
(Shamoo, Resnik 2015) gave a rough summary of ethical guidelines that
should be followed.
o Honesty
o Objectivity
o Integrity
o Carefulness
o Openness
o Respect for Intellectual Property
o Confidentiality
o Human Subjects Protection
o Responsible Publication
o Responsible Mentoring
o Respect for Colleagues
o Social Responsibility
o Non-discrimination
o Competence
o Legality
o Animal Care
3.8.2 Restrictions
The restrictions on this research paper arose from using quantitative
methodology. Even though this methodology has restrictions, such as categories
that may not reflect the candidates’ understanding, it has many benefits and
its restrictions were not as limiting as those of the other methods.
Quantitative methodology is also less time consuming and less expensive than
the other methodologies.
Mixed methodology was first considered, as it covers both quantitative and
qualitative methods. Time constraints and the cost of conducting individual
interviews meant that mixed methodology was not used. Qualitative methodology
was also considered, but conducting the interviews and collecting the data
would be time consuming and expensive. The data would also be more difficult
to analyse.
3.8.3 Value
This research looked at whether Irish retailers are using big data to its best
capabilities. While larger multinationals are using big data, the smaller
independent Irish retailers were always going to be playing catch-up in
adopting it and the new technologies that come with it. The research also
considered the barriers, such as cost and training, facing Irish retailers.
4.0 Results
In this chapter the results of the survey will be presented, following the
methodology set out in the previous chapter. 30 surveys were sent out and 26
responses were received, 9 via SurveyMonkey email invite and 17 via
SurveyMonkey mobile link. All retailers replied in full, skipping no parts of
the questionnaire. After issues with sending out the questionnaire via email
invitation, the mobile link was used instead. This form of sending out the
questionnaire brought back faster responses from clients. In the results below,
(n=) refers to the number of respondents, out of 26, who gave that answer. The
total is (n=26); n is also used in K Nearest Neighbour (KNN), as discussed in
the literature review in an earlier chapter. Figure 16 shows the two methods by
which the survey was sent out and Figure 17 shows the responses by the day
returned.
Figure 16 Methods of sending out survey through SurveyMonkey
Figure 17 Responses by Volume Source: SurveyMonkey
4.1 Survey Results
Question 1: What is the size of your company?
All 26 respondents responded to this question, skipping no parts. This question
in the survey was linked to Question 1 of the methodology section: what size
shops are using data analytics? This research is being carried out on Irish
retailers. As seen in Figure 18, most of the Irish retailers questioned in the
survey were smaller retailers of 0-50 employees, accounting for 61.54% (n=16).
Medium sized retailers of 50-250 employees accounted for 15.38% (n=4) and the
larger retailers of 250 plus employees for 23.08% (n=6). Retail is Ireland’s
largest employer, employing 275,000 people; it is currently 90% Irish owned,
with 77% of these businesses family owned (Gleeson & Lynam).
Employees    Percentage   Respondents
0-50         61.54%       16
50-250       15.38%       4
250 plus     23.08%       6
Table 18
Figure 18 Question 1: What is the size of your company?
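The percentages reported for each question follow directly from the answer counts. As a minimal sketch (the function name `share_of_respondents` is a hypothetical helper, not part of the survey tooling used in this research), the Question 1 shares can be reproduced from the respondent counts as follows:

```python
def share_of_respondents(counts):
    """Return each answer's share of all respondents as a percentage (2 d.p.)."""
    total = sum(counts.values())
    return {answer: round(100 * n / total, 2) for answer, n in counts.items()}


# Respondent counts reported for Question 1 (n=26 in total).
company_size = {"0-50": 16, "50-250": 4, "250 plus": 6}
shares = share_of_respondents(company_size)

for answer, pct in shares.items():
    print(f"{answer}: {pct}% (n={company_size[answer]})")
```

The same helper could be applied to the counts of any other question in this chapter, since every question was answered by all 26 respondents.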
Question 2: Does your shop use Data Analytics?
This question was to determine which of the retailers questioned actually used
data analytics. It was linked to Question 1 of the methodology: what size
shops are using data analytics? All 26 respondents responded to this question,
skipping no parts.
Out of the 26 respondents 69% (n=18) used data analytics while 31% (n=8)
did not use data analytics in their shop as seen in Table 19.
Answer   Percentage   Respondents
Yes      69.23%       18
No       30.77%       8
Table 19
As seen in Figure 20, when a result comes back in minus figures it means there
is a problem with the data entered: either the sample size was too small, or
there were too many open-ended questions and no scale questions.
Figure 19 Question 2: Does your shop use data analytics
Figure 20 What size shops are using data analytics?
Question 3: What form of data collection do you use?
This question was to see what form of collection the retailers use when
collecting their customers’ data. It was linked to Question 3 of the
methodology section: are Irish retailers using data analytics to its full
potential? All 26 respondents responded to this question, skipping no
parts. As seen in Figure 21, nearly 8% use loyalty cards (n=2). By far the
largest group, 38%, use in-store tracking (n=10), such as recording
customers’ details and their purchasing history. 19% use on-line purchasing
tracking (n=5) and another 19% use social media (n=5) to collect customer
data, while 15% (n=4) were unsure what kind of collection they use.
Answer                        Percentage   Respondents
Loyalty Cards                 7.69%        2
In Store Tracking             38.46%       10
On-line Purchasing Tracking   19.23%       5
Facial Recognition            0%           0
Social Media                  19.23%       5
Unsure                        15.38%       4
Table 21
Figure 21 Question 3: What form of data collection do you use?
Question 4: In your opinion, has Data Analytics changed how your
organisation collects customer information?
This question was to find out whether using data analytics has changed how
the shop collects customer information. It is linked with Question 3 in the
methodology section: are Irish retailers using data analytics to its full
potential? All 26 respondents responded to this question, skipping no parts. As
seen in Figure 22, nearly 31% (n=8) said yes, while a majority of 54% (n=14)
said no, data analytics has not changed how they collect customers’ data. A
further 15% (n=4) said yes and gave a reason, as listed below.
1. More emphasis on different types of marketing according to the
information.
2. Better structured and organized.
3. It allows us to see what are the most popular SKU’s by region and by
month to allow us to project forward manufacturing.
4. Offers target at certain demographics.
Figure 22 Question 4 Has data analytics changed how your organisation collects customer information
Answer                     Percentage   Respondents
Yes                        30.77%       8
No                         53.85%       14
Yes (Please give reason)   15.38%       4
Table 22
Question 5: What, if any, do you think are the benefits of retailers sharing data
(such as POS, inventory, and customer loyalty) with suppliers?
This question was linked with Question 3 of the methodology section: are
Irish retailers using data analytics to its full potential? All 26 respondents
responded to this question, skipping no parts. As seen in Figure 23, the spread
of answers is very close to that for what companies use to collect data.
‘Strengthens relationships with suppliers’ received 19%, as did ‘Retailers
strategies can benefit from suppliers' product knowledge’ (both n=5). ‘Helps
increase sales’ received 23% (n=6). ‘Suppliers can forecast and meet customer
demand’ received the highest share with 35% (n=9). Only 1 retailer answered
that there are no benefits, 4% (n=1).
Figure 23 What, if any, do you think are the benefits of retailers sharing data (such as POS,
inventory, and customer loyalty) with suppliers
Answer                                                               Percentage   Respondents
Strengthens relationships with suppliers                             19.23%       5
Helps increase sales.                                                23.08%       6
Suppliers can forecast and meet customer demand.                     34.62%       9
Retailer’s strategies can benefit from suppliers' product knowledge. 19.23%       5
There are no benefits.                                               3.85%        1
Table 23
Question 6: How does Data Analytics help retailers manage product
availability for customers?
This question, as seen in Figure 24, asks how data analytics helps with product
availability in stores: by keeping stock to a minimum, by reducing out of stock
situations or by predicting future demand. All 26 respondents responded to this
question, skipping no parts. This question was linked with Question 3: are Irish
retailers using data analytics to its full potential? ‘By predicting future
demand’ was by far the most common response with 65% of respondents (n=17),
while ‘By reducing out of stock situations’ had 23% of the respondents (n=6).
‘Ensuring the shop is not over stocked’ was not as popular with 12% (n=3).
‘Other’ had no respondents.
Answer                                  Percentage   Respondents
By predicting future demand             65.38%       17
By reducing out of stock situations     23.08%       6
Ensuring the shop is not over stocked   11.54%       3
Other                                   0%           0
Table 24
Figure 24 How does Data Analytics help retailers manage product availability for
customers?
Question 7: How important do you consider the use of Data Analytics for
retailers to gain a competitive advantage over their competitors?
This question was linked with Question 3: are Irish retailers using data
analytics to its full potential? All 26 respondents responded to this question,
skipping no parts. This question looked into how important the retailers
questioned considered the use of data analysis to gain an advantage over
competitors. Not surprisingly, 65% (n=17) found data analytics very important
in gaining an advantage over competitors. 23% (n=6) considered it moderately
important, while 8% (n=2) considered it important. Only 4% (n=1) considered
the use of data analysis for retailers to gain a competitive advantage of
little importance, while 0% answered not important, as seen in Table 25.
Answer                 Percentage   Respondents
Very Important         65.38%       17
Moderately Important   23.08%       6
Important              7.69%        2
Of little importance   3.85%        1
Not important          0%           0
Table 25
Figure 25 How important do you consider the use of Data Analytics for retailers to
gain a competitive advantage over their competitors?
As seen in Figure 2, Cronbach’s Alpha is .607,
which is within the suitable parameters.
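Cronbach’s Alpha for k scale items is commonly computed as k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the respondents’ total scores. The following is a minimal sketch of that formula using sample variances. The function name and the scores are hypothetical illustrations; they are not the survey data or the software used for this research.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: list of k lists, each holding one item's scores per respondent."""
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sum of the sample variances of the individual items.
    item_variances = sum(variance(scores) for scores in items)
    # Total score per respondent is the sum across the k items.
    totals = [sum(respondent) for respondent in zip(*items)]
    return (k / (k - 1)) * (1 - item_variances / variance(totals))


# Hypothetical scores for 5 respondents on 3 Likert-style items.
example_items = [
    [4, 5, 3, 4, 2],
    [5, 5, 3, 4, 1],
    [4, 4, 2, 5, 2],
]
alpha = cronbach_alpha(example_items)
print(round(alpha, 3))
```

Items that move together across respondents push the value towards 1, while unrelated items push it towards 0 or below.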
Question 8: How important is cost in implementing Data Analytics?
This question, as seen in Figure 27, asks about the importance of cost when
implementing data analysis. This question is linked with Question 2 in the
methodology section: is it cost based not to use data analytics? The graph
shows that 42% (n=11) think that cost is very important when it comes to
implementing data analysis. This was about 12 percentage points higher than
those who thought it was important at 31% (n=8), which was just higher than
moderately important at 27% (n=7). No respondents (0%) thought it was of
little importance or not important, or chose other.
Figure 27 How important is cost in implementing Data Analytics?
Figure 26 How important do you consider the use of Data Analytics for retailers to
gain a competitive advantage over their competitors?
Answer                 Percentage   Respondents
Very Important         42.31%       11
Moderately Important   26.92%       7
Important              30.77%       8
Of little importance   0%           0
Not important          0%           0
Other Please specify   0%           0
Table 27
Question 9: What are the barriers to adopting Data Analytics?
This question is linked to Question 2 in the methodology section: is it cost
based not to use data analytics? All 26 respondents responded to this question,
skipping no parts. As seen in Figure 28, the answers to question 9 show one of
the issues facing Irish retailers: the barriers to adopting data analysis. 12
respondents, or 46.15%, find a lack of understanding of big data and its uses
to be a barrier. That is worrying for Irish retailers, who risk being left
behind by larger international retailers that have a better understanding of
data analytics and for whom cost is no object. 19.23% of retailers questioned
said that cost is a barrier in adopting data analytics.
Figure 28 Barriers adopting data analysis Source: Flood, P.
(Millman 2014) reported a 2014 EMC survey of 300 UK businesses in which a
worrying 62% of retailers were without the required skills to comprehend the
ethical, responsible and compliant use of customer data. While 62% of
retailers lack understanding of data analytics, the same survey (Millman 2014)
also claimed that the key to new growth in retail is in the better use of
customer insights, data trends and customer information.
Answer                                          Percentage   Respondents
Cost                                            19.23%       5
Staff Training                                  15.38%       4
Lack of understanding of big data and its uses  46.15%       12
Problems with database software                 15.38%       4
Not usable for customer                         0%           0
Other (please specify)                          3.85%        1
Table 28
Training staff was also high on the list at just over 15%. (Ridge et al 2015)
wrote that executives who lacked an understanding of big data felt that the
return on investment did not warrant the investment in the first place.
As seen in Figure 29, Cronbach’s Alpha reliability is .46.
Figure 29 Cronbachs Reliability
4.2 Themes That Emerged
o Out of 26 retailers questioned, well over half, 18 retailers, use data
analytics.
o While data analytics is popular, smaller retailers are more likely not to
have data analytics.
o Cost was a main factor when it came to implementing data analytics,
as some retailers saw it as a barrier.
o 46% of retailers saw a lack of understanding of data analysis as a
barrier.
4.3 Results Conclusion
The conclusions from the results for the 3 methodology questions are as follows.
1. What size shops are using data analytics?
This question was answered in questions 1 and 2 in the questionnaire.
o 69% of companies questioned used data analytics.
o 10 retailers, or 62% of the retailers questioned that have a
workforce of 0-50, use data analysis.
o Of the 8 retailers questioned that do not use data analytics, 6 were small
retailers of 0-50 workers.
2. Is it cost based not to use data analytics?
This question was answered in questions 8 and 9 in the questionnaire.
o Retailers had data analytics in the stores but still had a lack of
understanding of it.
o 4 retailers had issues with database software, which acted as a barrier.
o 1 retailer found ensuring adherence to the strict data protection
laws to be a barrier.
o All 26 respondents said cost was an important factor in implementing
data analysis.
3. Are Irish retailers using data analytics to its full potential?
This question was answered in questions 3, 4, 5, 6 and 7 in the
questionnaire.
o In-store tracking was the most popular form of tracking for
retailers with 0-50 workers.
o Out of the 6 stores with 250 workers or more, 2 were unsure what
form of data analytics they used, and 1 store each used loyalty cards,
in-store tracking, on-line purchasing and social media.
o Surprisingly, over half the retailers questioned did not think data
analytics changed the way they collected customer data.
o The largest group of retailers saw forecasting and meeting customer
demand as the most important benefit of sharing data, while only 1 saw
no benefit.
o 17 out of 26 respondents use data analysis to predict future
demand.
o Again, 17 respondents say they use data analytics as a tool to
gain a competitive advantage over their rivals.
5.0 Discussion
The objective of this chapter is to discuss the results of the survey in detail and
relate them back to the findings in the literature review. The findings will be
linked to past theory on the topic. The research questions from the
methodology section will be answered to give a current idea of how big data is
used in Irish retailers.
Question 1: What size shops are using data analytics?
This question was made up of 2 questions in the survey, question 1 and
question 2.
o What is the size of your company?
o Does your organisation use Data Analytics?
With 69% of the retailers questioned using some form of big data, this
research paper shows that data analytics is popular in Ireland.
Figure 30 Question 1 What is the size of your company?
In the literature review, (Hawkins 2012) claimed that larger retailers who can
afford to exploit big data by using sophisticated data collection software will
leave smaller independent retailers behind. The results collected show that:
o Retailers with 0-50 employees: 10 of the 16 small retailers surveyed
use data analytics, with 6 retailers not using data analytics, as
seen in figure 32.
o Retailers with 50-250 employees: all 4 medium retailers surveyed use
data analytics, as seen in figure 33.
o Retailers with 250 plus employees: 4 out of 6 larger retailers
surveyed use data analytics, with 2 not using data analytics, as
seen in figure 34.
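The breakdown above is a cross-tabulation of company size against data analytics use. A minimal sketch of how the adoption rate for each size band can be derived from those counts is given here; the dictionary layout and the name `adoption_rate` are assumptions made for illustration only.

```python
# Counts from the survey: retailers using / not using data analytics, by size.
usage_by_size = {
    "0-50": {"yes": 10, "no": 6},
    "50-250": {"yes": 4, "no": 0},
    "250 plus": {"yes": 4, "no": 2},
}


def adoption_rate(group):
    """Percentage of retailers in one size band that use data analytics."""
    return round(100 * group["yes"] / (group["yes"] + group["no"]), 1)


rates = {size: adoption_rate(group) for size, group in usage_by_size.items()}

# Overall adoption across all size bands, for comparison with Question 2.
total_yes = sum(group["yes"] for group in usage_by_size.values())
total_all = sum(group["yes"] + group["no"] for group in usage_by_size.values())
overall = round(100 * total_yes / total_all, 1)

print(rates, overall)
```

Comparing the per-band rates against the overall rate shows which size bands sit above or below the survey-wide figure.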
These results disagree with earlier research in the literature review, where
(Simon 2013) found that, of 541 small to medium UK companies, none were
thinking of taking advantage of big data.
Figure 31 Question 2 Does your shop use data analytics?
This shows that since Hawkins spoke in 2012, smaller independent retailers
have kept up with larger retailers in using big data. Using big data does not
require a retailer to spend large amounts of money to purchase sophisticated
data collection software. This also agrees with (Schaeffer 2016), who claimed
that the retailers’ big data pecking order is less about the size of the IT
budget and more about the retailer's inclination towards innovation and agility.
Figure 32 Retailers with 0-50 employees that use data analytics
As seen in Figure 33, all medium sized retailers questioned used data analytics.
(Davis 2015) wrote that 95% of small to medium enterprises (SME) use some form
of big data, and in this research all medium retailers questioned used some
form of data analytics. Medium retailers tend to have a POS system that gives
the option of collecting customer data, or will have a loyalty card system, for
example the system that Superquinn introduced in 1993, as written in the
literature review.
Figure 33 Retailer with 50 - 250 employees that use data analytics
What was very surprising among the retailers with 250 plus employees was that
2 out of 6, or 33% of respondents, do not use data analytics. This should be
a disadvantage to those retailers, with competitors able to use customer data,
sometimes from the same customer, against them for competitive advantage.
While 69% of the retailers surveyed in the questionnaire used data
analytics, 31%, or 8 out of the 26 surveyed, had no form of data analytics.
Analytics are essential today to sustain a competitive advantage. (Conley
2016) explained that technology, data and analysis can now be merged to give
a powerful understanding of employees, associates and customers. (Conley
2016) explained in the literature review that 61% of senior executives use data
analysis to make corporate decisions, which backs up the figures in this
research paper. The reason for the high proportion not changing could be down
to old habits and an unwillingness to change: ‘if it’s not broken don’t fix it’.
Thinking like this could have a detrimental effect on an organisation.
Larger retailers could also have an old database that is unsuitable for
data analytics without a considerable amount of investment.
Figure 34 Retailer with 250 plus employees that use data analytics
Question 2: Is it cost based not to use data analytics?
This question was made up of 2 questions in the survey, question 8 and
question 9.
Q.8 Is it cost based not to use data analytics?
o Retailers with 0 – 50 employees, as seen in figure 35
- 6 said cost was very important
- 4 said cost was moderately important
- 6 said cost was important
o Retailers with 50 – 250 employees, as seen in figure 35
- 1 said cost was very important
- 2 said cost was moderately important
- 1 said cost was important
o Retailers with 250 plus employees, as seen in figure 35
- 3 said cost was very important
- 0 said cost was moderately important
- 3 said cost was important.
Figure 35 Retailer Size for Cost Importance
It is clear from the responses to question 8 that all 26 retailers
surveyed think cost is important when implementing data analytics in retail.
42% of all respondents thought it was very important, as seen in figure 36.
(Bantleman 2012) wrote about the price of having Hadoop as the main tool, noting
that only large retailers could afford the millions that it would cost to run
it. A petabyte Hadoop cluster at $1 million per year and another $1 to run is
out of the budget for most Irish retailers. That is why cost is so important.
As seen in question 3 in the survey, a lot of the retailers questioned use
in-store tracking and social media. Furthermore, (Jain 2014) disclosed other
cost effective ways of having big data in retail. Social media, though not a BI
tool, can be an easy form of data collection for analytics: customers click
‘like’ on an item on social media, or online tracking records when a customer
clicks on an item, purchases an item, or simply browses.
The main barrier that the 26 retailers reported was the lack of
understanding of data analytics and its uses. Retailers of all sizes had this
as an issue. (Baldwin 2014) said 23% of UK retailers can instantly make
sense of the data made accessible to them.
Figure 36 How important is cost in implementing Data Analytics?
Q.9 What are the barriers to adopting Data Analytics?
o Retailers with 0 – 50 employees, as seen in figure 37
- 2 said cost was a barrier
- 3 said staff training was a barrier
- 7 said a lack of understanding of data analytics and its uses was a
barrier
- 3 had problems with database software
- 1 had other: ensuring that you adhere to the strict data
protection laws
o Retailers with 50 – 250 employees, as seen in figure 37
- 2 said cost was a barrier
- 2 said a lack of understanding of data analytics and its uses was a
barrier
- All the others were 0
o Retailers with 250 plus employees
- 1 said cost was a barrier
- 1 said staff training was a barrier
- 3 said a lack of understanding of data analytics and its uses was a
barrier
- 1 had problems with database software
Figure 37 Question 9 Barriers in adopting data analysis?
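The share of each barrier within a size band can be derived from the counts listed for question 9. The following is a minimal sketch; the data layout and the helper name `barrier_shares` are hypothetical and only illustrate the arithmetic.

```python
# Barrier counts by retailer size, as listed for question 9.
barriers = {
    "0-50": {"Cost": 2, "Staff training": 3, "Lack of understanding": 7,
             "Database issues": 3, "Other": 1},
    "50-250": {"Cost": 2, "Lack of understanding": 2},
    "250 plus": {"Cost": 1, "Staff training": 1, "Lack of understanding": 3,
                 "Database issues": 1},
}


def barrier_shares(counts):
    """Share of each barrier within one size band, as a percentage."""
    total = sum(counts.values())
    return {barrier: 100 * n / total for barrier, n in counts.items()}


shares_by_size = {size: barrier_shares(c) for size, c in barriers.items()}

# Totals across all size bands, for comparison with the overall results table.
totals = {}
for counts in barriers.values():
    for barrier, n in counts.items():
        totals[barrier] = totals.get(barrier, 0) + n

for size, band_shares in shares_by_size.items():
    print(size, {b: f"{pct:.0f}%" for b, pct in band_shares.items()})
print(totals)
```

Summing the bands back to the overall totals is a quick consistency check between the per-size breakdown and the overall question 9 results.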
As seen in figure 38, a lack of understanding is especially high in smaller
independent retailers at 44%. Most of these retailers will have data
analytics in their stores. Should the retailers spend more on staff training
to alleviate the lack of understanding? Even among the larger retailers
questioned, 50% had a lack of understanding of big data as a barrier. These
retailers, you would think, would have better training or have the pick of
better graduates who understand and have been trained in data analytics.
Again, smaller retailers had database problems as a barrier. This is no surprise,
as a retailer of that size may not have the IT database structure needed for
data analytics.
Figure 38 What are the barriers to adopting Data Analytics?
To break figure 38 down further into each separate section in percentages, it is
clearly visible that a lack of understanding is the biggest barrier, followed
closely by staff training and database issues. As claimed by (Forrester
Consulting 2014), using data analytics can increase profit margins by 60%. Is
investing in cost, staff training and databases worth it for retailers to get
a good end product?
Figure 39 Retailers with 0 - 50 employees
The lack of understanding is also stated by (Forrester Consulting 2014) in the
same article: 68% of retail CIOs said they collect data, but agreed they
were not maximising its full potential. Staff training would help reduce this
figure. The ‘other’ barrier was ensuring that you adhere to the strict data
protection laws.
Data protection and big data are very important, and organisations should think
about them hand in hand. In Ireland, private data is covered by the Data
Protection Act (DPA) of 1988 and 2003, which sets regulatory requirements for
handling private data. Data protection is vitally important for customers and
retailers alike. It protects the customer’s data and gives retailers guidelines
to follow when collecting the data in a lawful way. (AL Goodbody) gave key
guidelines for organisations to follow:
o Who controls the data:
- An organisation should have a data controller to regulate the
rationale for which the data is processed.
- A data processor is a third party who administers the data on
behalf of the data controller.
- A written agreement needs to be in place between the data
controller and the data processor in relation to obtaining,
retaining, retrieving and deleting data.
o Registration with the Office of the Data Protection Commissioner
(ODPC):
- Normally all data controllers and data processors have to
register with the ODPC unless exempt.
o Consent with data subjects:
- Consent is the standard way to legitimise data processing.
- The Article 29 Working Party, which includes several data
protection regulators of the EU, said that consumers’ “specific,
explicit, consent” is needed for organisations to use customer
data in big data projects.
o Appropriate security measures:
- Data controllers must ensure that suitable security measures are
in place for the nature of the data stored.
- This includes a level of security appropriate to the harm that
might result from any unauthorised or unlawful processing, or
accidental or unlawful destruction, of data.
o Transfer of data:
- Under the DPA, the transfer of customers’ data to a country
outside the EU is generally prohibited unless that country is
permitted and guarantees an acceptable level of protection for
the confidentiality and fundamental rights and freedoms of the
data subjects, or certain other conditions are met.
o Appropriate data protection policy:
- Organisations that collect and process customers’ data must
have in place an appropriate data protection policy.
- Some data may have been collected when no data protection
policies were in place.
- For particular big data projects, a customised data protection
policy may be put in place.
o De-identification of data:
- Under Irish data protection law, it is possible for
organisations to achieve legally permissible de-identification.
- Nevertheless, organisations looking at de-identification to
circumvent privacy and data protection laws should proceed
carefully.
The retailer who found adhering to data protection laws a barrier should look
at the laws themselves. The laws are there to protect the customers who use
that shop and to protect the retailer itself. These laws are not intended to
hold organisations back but to help them understand the importance of privacy
when it comes to customer data.
For retailers with 50 – 250 employees, the barriers are cost and lack of
understanding, as seen in Figure 40.
Figure 40 Retailers with 50 - 250 Employees (pie chart: Cost 50%, Staff Training 0%, Lack of Understanding 50%, Database Issues 0%, Other 0%)
Figure 41 Retailers with employees of 250 plus (pie chart: Cost 16%, Staff Training 17%, Lack of Understanding 50%, Database Issues 17%)
Question 3: Are Irish retailers using data analytics to its full potential?
This question was covered by five questions in the survey: questions 3, 4, 5, 6
and 7.
o What form of data collection do you use?
o In your opinion, has Data Analytics changed how your organisation
collects customer information?
o What, if any, do you think are the benefits of retailers sharing data
(such as POS, inventory, and customer loyalty) with suppliers?
o How does Data Analytics help retailers manage product availability for
customers?
o How important do you consider the use of Data Analytics for retailers
to gain a competitive advantage over their competitors?
Data Analytics in Irish Retail Is Data Analytics Used in its Full Potential

  • 1. 1 Data Analytics in Irish Retail: Is Data Analytics Used in its Full Potential? Paul Flood Institute of Technology Carlow MSc in Information Technology Management 2016
  • 2. 2 Data Analytics in Irish Retail: Is Data Analytics Used in its Full Potential? Paul Flood Submitted in partial fulfilment of requirements for the MSc in Information Technology Management 2016 Institute of Technology Carlow
  • 3. 3 LIFELONG LEARNING CENTRE Work submitted for assessment which does not include this declaration will not be assessed. DECLARATION *I declare that all material in this submission e.g. thesis/essay/project/assignment is entirely my/our own work except where duly acknowledged. *I have cited the sources of all quotations, paraphrases, summaries of information, tables, diagrams or other material; including software and other electronic media in which intellectual property rights may reside. *I have provided a complete bibliography of all works and sources used in the preparation of this submission. *I understand that failure to comply with the Institute’s regulations governing plagiarism constitutes a serious offence. Student Name: (Printed) ____________________________________________ Student Number(s): ____________________________________________ Programme Title & Yr.: ____________________________________________ Module: ____________________________________________ Signature(s): ____________________________________________ Date: ____________________________________________ ---------------------------------------------------------------------------------------------- --------- Please note: a) Individual declaration is required by each student for joint projects. b) Where projects are submitted electronically, students are required to type their name under signature. c) The Institute regulations on plagiarism are set out in Section 10 of Examination and Assessment Regulations published each year in the Student Handbook.
  • 4. 4 Acknowledgements I would first like to thank everybody who helped me throughout the college year. I would like to thank Martin my advisor who guided and advised me. The feedback was brilliant and I learnt so much on how to design a dissertation. Thank you for bringing my dissertation from being too broad to actually putting my questions on paper and going from there. The college in Carlow especially the lecturers and lifelong learning who made the year that bit easier. It was strange returning to the college after 20 years. To all the retailers who completed my survey and to David for help getting me the names and email addresses of the people used. You know a lot of people. The proof readers, Edel and John, my eyes for finding errors. To all my class for making me feel welcome especially John, Niall and Ross. To all my family in Tipperary and my in-laws in Offaly. And finally to my wife Niamh, my rock this year. You convinced me to do the course and helped me through the hard times, the weeks away from home and the drives from Carlow to Balbriggan at night. I could not do it without you.
  • 5. 5 Contents Acknowledgements .....................................................................................................4 Glossary .......................................................................................................................7 Abstract........................................................................................................................7 1.0 Introduction.....................................................................................................9 1.1 Research Study and Aims ..................................................................................11 1.2 Thesis Structure..................................................................................................12 2.0 Literature Review ...............................................................................................12 2.1 Introduction.........................................................................................................12 2.1 What size shops use Data Analytics? 
................................................................14 2.2 Shops using Data analytics to its fullest Potential............................................16 2.3 Cost for retailers not using or not using big data.............................................19 2.4 Data ......................................................................................................................20 2.4.1 Information Hierarchy................................................................................20 2.4.2 Algorithms....................................................................................................21 2.5 Models..................................................................................................................21 2.5.1 K - Nearest Neighbour Model.....................................................................21 2.5.2 C4.5................................................................................................................22 2.6 Data Security Risks.............................................................................................23 2.7 Methods of Collection.........................................................................................24 2.7.1 Loyalty Cards...............................................................................................24 2.7.2 Security .........................................................................................................25 2.7.3 Dublin Airport..............................................................................................25 2.7.4 On-Line Recommendations.........................................................................26 2.7.5 Amazons Similarity Algorithm...................................................................26 2.7.6 In-store Tracking .........................................................................................27 3.0 
Methodology........................................................................................................29 3.1 Research...............................................................................................................30 3.1.1 Research Philosophy....................................................................................30 3.1.2 Analysing Research Data ............................................................................31 3.1.3 Qualitative versus Quantitative..................................................................31 3.2 Research Objectives............................................................................................33 3.3 Qualitative Methodology....................................................................................33 3.4 Sampling Techniques..........................................................................................34
  • 6. 6 3.4.1 Qualitative Sampling ...................................................................................36 3.4.2 Snowball Sampling ......................................................................................36 3.4.3 Convenience Sampling.................................................................................37 3.4.4. Judgemental Sampling ...............................................................................38 3.4.5 Quota Sampling............................................................................................38 3.5. Semi Structured .................................................................................................39 3.5.1 Unstructured ................................................................................................40 3.5.2. Structured....................................................................................................41 3.6 Quantitative Methodology..................................................................................42 3.6.1 Independent Variable..................................................................................42 3.6.2 Dependent Variable .....................................................................................43 3.6.3 Moderating Variable ...................................................................................43 3.7 Mixed Methodology ............................................................................................43 3.8 Ethics....................................................................................................................45 3.8.1 Data Collection.............................................................................................46 3.8.2 Restrictions...................................................................................................47 3.8.3 Value .............................................................................................................47 4.0 
Results..................................................................................................................48 4.1 Survey Results.....................................................................................................49 4.2 Themes That Emerged .......................................................................................59 4.3 Results Conclusion..............................................................................................60 5.0 Discussion ............................................................................................................61 6.0 Conclusion ...........................................................................................................82 6.1 Summary..............................................................................................................82 6.2 Future Directions................................................................................................83 6.3 Practical significance ..........................................................................................83 6.4 Reflection.............................................................................................................84 7.0 Bibliography........................................................................................................86 7.0.1 Email Invitation for Survey ........................................................................97 7.0.2 Survey ...........................................................................................................98
  • 7. 7 Glossary BI - Business Intelligence KDD - Knowledge Discovery and Data Mining SME - Small to Medium Enterprises CeADAR - Centre for Applied Data Analytics Research MGI - McKinsey Global Institute TSSG - Telecommunications Software & Systems Group ECR - Efficient Consumer Response ICHEC - Telecommunications Software & Systems Group SKU - Stock Keeping Unit CRM - Customer Relationship Management DPA - Data Protection Act ODPC - Office of the Data Protection Commissioner POS – Point of Sale Abstract The main goal of this research paper was to look at retail in Ireland and the use of data analytics. Data analytics comes in many ways and forms and the objective was to look at and describe what size retailers use the different tools available to them. Data analysis has a pivotal role to play in how retailers in Ireland will look at customer information going forward. Even now the smallest stores are looking at their customer’s data via social media or instore tracking. Stores do not have to be bricks and mortar anymore and now, online shops are growing at a rapid pace. There was three question asked in this research paper: 1. What size shops are using data analytics? 2. Is it cost based not to use data analytics? 3. Are Irish retailers using data analytics to its full potential?
  • 8. 8 To get a final answer to the research, a literature review was conducted looking at data models, methods of collecting customer data, security. This data will be collected to make a decision on the three questions. A questionnaire is created by using SurveyMonkey, to answer the three questions stated in the methodology section. To answer these questions nine questions were created and sent to twenty six retailers. The answers to these questions were put into SPSS, a statistical software that can perform complex data analysis. Once the data is analysed the results will be compared to information in the literature review. Themes emerging from the research will be looked at and put to the three methodology questions. The discussion will then find common results in both methods where then the research questions will be answered. Finally a conclusion will be written up where the results will be summarised and future directions will be spoken about. Practical significance and finally a reflective journal on the research paper.
  • 9. 9 1.0 Introduction Big data or data analytics is a concept where organisations use a huge amount of data. Rouse defined big data as the process of examining large data sets containing a variety of data types -- i.e., big data -- to uncover hidden patterns, unknown correlations, market trends, customer preferences and other useful business information. (Rouse, 2014) Big data can be characterised by the 3 V’s, data variety, and data volume and data velocity, as seen in figure 1. The amount of data is now so huge it is talked about in petabytes and exabytes. Relationship databases are not used for analysis of big data as it is too costly and time consuming. Instead, new methods of storing and analysing data have developed that rely less on data schema and data quality and more on raw data gathered in a data lake or storage repository. Machine learning and artificial intelligence (AI) programs use difficult algorithms to look for repeatable patterns. Companies are using platform tools such as Hadoop, Tableau and Oracle. In recent years’ retailers have been tracking customers purchases with loyalty cards to create a profile on that customer. Everyday millions of transactions go Figure 1 Rouse, M., Source:http://searchcloudcomputing.techtarget.com/definition/big-data- Big-Data Figure 2 Markey Leaders Source: Hopkins, R.Figure 3 Rouse, M., Source:http://searchcloudcomputing.techtarget.com/definition/big-data- Big-Data
  • 10. 10 through Irish stores, whether in store or online. The data collected is known as big data. A major challenge for retailers is to understand their customers’ needs and wants while increasing the company’s own sales. Retailers today have a wide variety of tools available to them to predict the next growing trend. According to (McHugh, R., 2015), 33% of Irish businesses use Big Data in their strategic decision making process and 81% of businesses use data at the centre of their decision making, but only 31% of Irish businesses have restructured their operations to reflect this. Some of the Business Intelligence (BI) tools available to retailers for analysing the data are Tableau, IBM, SAS, SAP and many more. A breakdown of the leaders, challengers, visionaries and niche players is shown in Figure 2. Big data falls under the umbrella of BI. These tools can be used to answer Customer Relationship Management (CRM) questions such as (Doran, 2007): 1. Who are the most valuable and least valuable customers? 2. What aspects affect a sale? 3. How profitable are promotional offers? 4. What are the differences between outlets’ profits in various geographical locations? Figure 2 Market Leaders Source: Hopkins, R.
  • 11. 11 Big data touches nearly every aspect of retail, so it is very important for retailers to keep up with this ever-changing technology. Most people see Big Data in use every day through loyalty cards. Today 87% of Irish people are signed up to a loyalty card scheme, according to (Armstrong 2016). 1.1 Research Study and Aims Current studies show that data mining is growing in Ireland on the data science side, but is that reflected in retailers’ use of customers’ data? This research paper aims to study Big Data and how it is used in retail in Ireland today. Do smaller retailers use big data or is it just the larger retailers? Is cost one of the main issues holding Irish retailers back, or are there other barriers such as lack of knowledge or issues with databases? Can Irish retailers do more to use big data to its full potential? There are three questions asked in the research proposal. 1. What size shops are using data analytics? 2. Is it cost based not to use data analytics? 3. Are Irish retailers using data analytics to its full potential? These questions will be put to Irish retailers of different sizes to get a broad sense of the use of data analytics in Ireland today and the barriers standing in retailers’ way. Are Irish retailers being left behind by retailers in other countries?
  • 12. 12 1.2 Thesis Structure The literature review covers past research and discussions on big data in retail in Ireland. It discusses algorithms and models, current security risks and ways of tracking customer data. The methodology section describes the method used and how it was applied, the methods not used and the reasons for not using them, the research philosophy and how the data was analysed. In the results section, the findings of the survey are analysed and displayed, including graphs and a brief summary of the results. The discussion then relates the findings of the survey back to the literature review, where common results are discussed and argued. Finally, the conclusion sets out future directions and recommendations for data analytics in retail in Ireland and answers the three questions set out in the research. At the back of this paper are the Appendices: bibliography, pictures, interview questions, sample survey, consent letters, tables and graphs. 2.0 Literature Review 2.1 Introduction A huge amount of data, both structured and unstructured, is processed every day in Irish retail through Point of Sale (POS) systems, loyalty cards, social media and online sales. Today’s retail environment is tough for retailers, as consumers face choices from numerous channels and demand a personalised shopping experience.
  • 13. 13 Businesses have always used data to create business value. To help organisations make better, fact-based decisions, new tools and platforms have been created to meet this demand for data knowledge. In 1995 the very first International Conference on Knowledge Discovery and Data Mining (KDD) took place. Data mining is the process of exploration and analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns and rules (Berry and Linoff, 1997). The KDD (Knowledge Discovery in Databases) process is shown in Figure 3. Figure 3 KDD Source: Zaiane, O., https://webdocs.cs.ualberta.ca/~zaiane/courses/cmput690/notes/Chapter1/ Why do we need KDD? Questions have been asked about whether data mining is important. (Gartner, 2016) said that data mining is the process of discovering meaningful correlations, patterns and trends by sifting through large amounts of data stored in repositories. Data mining employs pattern recognition technologies, as well as statistical and mathematical techniques. Data mining is a growing industry in Ireland, and the Irish Government Action Plan for Jobs 2013 (IPP Ireland, 2016) identified “Big Data” as an area where Ireland would have an unequivocal advantage over other countries. With its high ICT skill level and research capabilities, Ireland could reap meaningful benefits in job growth from global organisations in the “Big Data” sector. Ireland is committed to funding research that facilitates Big Data. Some of the organisations involved are the Centre for Applied Data Analytics Research
  • 14. 14 (CeADAR), the Telecommunications Software & Systems Group (TSSG), the National Centre for Data Analytics (the INSIGHT Centre) and the Irish Centre for High-End Computing (ICHEC). Are Irish retail stores that use data mining making more profit than retail stores that do not? Stores are now monitoring customer data for: o Predicting customer trends o Tracking customer loyalty o Up-selling strategies o Tracking and targeting profitable items Data mining is now more important than ever for retail stores, small and large, seeking a competitive advantage over their rivals. 2.1 What size shops use Data Analytics? Stores of many sizes use data analytics, with each store obtaining customers’ information in different ways. Data analytics ranges from taking a customer’s details at a POS to Amazon’s similarity algorithm. It is not only large multinational retailers that benefit from data analytics; smaller stores can also use it to their advantage. (Hawkins 2012) wrote that the largest retailers, who can afford the most sophisticated data collection software, are becoming the leading shopping destinations, while smaller retailers who in the past competed with the larger companies are falling behind. It is a war to keep customers and to win new customers from competitors by using customer data collected by software. (Hawkins, 2012) is worried that, because not all retailers will convert to using customer data, the large retailers will leave small, independent retailers behind. (McKenna 2015) stated that 24% of UK retailers are using data effectively enough to enhance sales.
  • 15. 15 (Schaeffer 2016) disagreed, stating that the retail pecking order is determined less by the size of an organisation’s IT budget and more by the retailer's inclination towards innovation and agility. The retail business is changing rapidly and smaller shops are showing more agility than larger retailers. (Davis 2015) claimed that 95% of Small to Medium Enterprises (SMEs) have a defined big data agenda. In contrast, (Simon 2013) wrote that in a recent study of 541 small to medium UK companies, none were thinking of taking advantage of big data. This puts these organisations at a serious disadvantage against competitors that use data mining to predict market trends and customer behaviour. Even large retailers can make mistakes when it comes to exploiting customer data. Tesco have been at the forefront of technology since 1995, when they introduced the loyalty card. By doing this Tesco changed the landscape of retailing. All supermarkets wanted to be like Tesco. Even in America, retailers like Walmart took on the Tesco model of customer analytics. But as (Schrage 2014) stated, for all Tesco’s customer data, analytics, segmentation, customisation and promotion, their dominance was waning as disillusioned shoppers left to shop in German discount stores such as Lidl and Aldi. These stores were simple, with no gimmicks such as club cards, which customers had started to believe benefit the retailer more than the customer. Tesco’s dramatic decline is a warning that retailers with powerful, data-rich loyalty programs cannot fight off smaller companies with lower prices and a comparable shopping experience. Another reason is that Tesco now lack the innovation and vision that got them to the top. A large company falling from grace is nothing new, but is this simply a company that could not get through tough times, or, in an age of big data, predictive analysis and customer knowledge, is this technology as powerful as we are led to believe?
  • 16. 16 Not all organisations use big data. About 61% of senior executives now use big data to help make decisions, wrote (Conley 2016). This leaves a large share, 39%, that do not use data analysis. Many see this as a massive disadvantage. The large number of organisations not using big data could be down to old habits dying hard. Some senior executives are afraid of changing with the times and will not embrace change. Some organisations do not know the benefits that come with changing to data analytics. 2.2 Shops using Data analytics to its fullest Potential According to research carried out by McKinsey Global Institute (MGI) and McKinsey Business Technology Office, big data will become key to productivity, growth and competition. Big data will affect every sector for the foreseeable future due to the rise in popularity of social media, the internet of things and multimedia. (MGI 2011) carried out a study in Europe and the United States of 5 domains: healthcare in the US, the public sector in Europe, retail in the US, and manufacturing and personal-location data globally. This study showed that if big data were used to its fullest in each sector, the monetary gain would be substantial. A retailer using big data to its full potential could increase its profit margin by more than 60%. In another example from the same report, European government administration would save €100 billion in operating efficiency improvements. While MGI says that profit margins can be increased by more than 60%, a (Forrester Consulting 2014) report claimed that 68% of retail CIOs said they collect data but agreed they were not maximising its full potential. The report also conveyed that only 47% of the survey participants have invested in cross-channel analytics to effectively engage customers, while only 25% of retailers plan to invest in big data in the future. 
(Bishop 2013) stated that retailers need to move beyond the term ‘Big Data’, or its full potential will not be fulfilled. Part of the reason is that the name ‘Big Data’ lacks meaning, but the name is not the only thing holding back its development.
  • 17. 17 People who comprehend big data struggle to convey its value to retail leaders. (Bishop 2013) wrote that besides communication there are also 3 other significant factors stopping big data reaching its full potential: 1. An absence of the ‘Big Picture’ is holding retailers back. o For supply, the initial work on Efficient Consumer Response (ECR) describes a continuous stream of customer demand information to the producer and a continuous stream of product to the customer to meet that demand. o For demand, if practised correctly, one-to-one marketing can increase the value of a retailer’s customer base. 2. Inclination not to share data freely o Fear is one of the factors behind the lack of end-to-end data sharing. Retailers fear that their company will become irrelevant and fall behind. 3. Acceptance of new performance metrics such as Just-in-Time. o Developing metrics such as Just-in-Time will not be an issue; the problem will be integrating them and getting other companies to accept them. It is only now that retailers are using data analytics techniques to catch up with their online competition. Physical shops know which customers visit and what they buy; online retailers also know which section is viewed first, which items are viewed but not purchased and how long each item has been viewed for. This information allows online retailers to use big data to better effect than physical shops. (Lacy 2013), in an interview with Marc Andreesen, quoted him as saying: “Retail guys are going to go out of business and ecommerce will become the place everyone buys. You are not going to have a choice.” (Andreesen 2013) While physical stores account for 95% of all retail in the United States, they are growing at one quarter the rate of online stores.
  • 18. 18 (Baldwin 2014) wrote that only 23% of UK retailers feel they can instantly make sense of the data made accessible to them in order to make the correct business decision, while 50% of retailers believe their existing BI tools fall short of their needs, and only 16% of retailers are confident that the data analytic tools they use give the organisational visibility they require. (Warner 2014) explained that data sharing between retailer and supplier is important. It can cut costs and make savings. o Data Sharing - Retailers and suppliers should share and analyse a common set of data. - By measuring and reporting the data, both parties will have the same level of information. (Warner 2014) also claimed there are two main types of data that should be shared. o Sell-thru data - This is data that a retailer uses to make decisions on what to order from a supplier. It includes: - Scanned sales or register sales - Quantity of stock on order - Quantity of stock on hand - Orders categorised by SKU - Orders categorised by store Giving suppliers access to this information benefits both supplier and retailer. It also reduces the chance of a store being over or under stocked. o Product categories - This lets suppliers and retailers focus on groups of products instead of individual products - By measuring the categories and sharing the results, they can both satisfy a customer’s needs.
  • 19. 19 - They can find slow selling and fast selling categories. 2.3 Cost for retailers using or not using big data With data analytics so ingrained in gaining customer information, can retailers really afford not to move to using big data? (Liozu 2014) explained that with online, mobile, social media and in-store channels, retailers are struggling to keep up with customers’ ever-changing needs. (Liozu 2014) also suggested that retailers should invest in the price intelligence and predictive analytics side of data analytics, as it helps them work out what customers want to purchase and when, and helps decide the optimal price range for that customer. (Gartner 2013) forecast that by 2014 the big data market would grow by 9% ($80 billion) per year and that 50% of the growth would be in predictive analysis. Should retailers purchase data analytics at any cost? From what Gartner is saying, big data will be hugely significant for retailers going forward. (Bantleman 2012) wrote about the cost of big data, saying a petabyte Hadoop cluster will need 125 to 250 nodes, which cost $1 million. Supporting Hadoop will also cost $1 million. Using data analytics is expensive, but there are less expensive alternatives for smaller retailers. (Amazon) explained that Amazon Redshift is a fast, inexpensive, petabyte-scale data warehouse that makes it easy to analyse an organisation’s data using existing BI tools. It can cost as little as 25c per hour and scale up to petabytes for $1,000 per terabyte per year, less than traditional data analytic tools. (Jain 2014) disagreed with (Bantleman 2012) and explained why retailers do not need to spend big on data analytic tools. A simple trend focused on one or two key variables, employed the right way, will start a smaller retailer on the right track. Online customers and customers on social media can be easily and inexpensively tracked through ‘liked’ clicks on social media, purchasing history and browsing behaviour. 
The Data Protection Acts of 1988 and 2003 were put in place to protect customers’ information collected by organisations. These Acts set regulatory requirements for the handling of private data. (AL Goodbody) wrote about some of the key issues that need to be considered:
  • 20. 20 1. Who controls the data 2. Appropriate security measures 3. Consent of the data subjects 4. De-identification of data 2.4 Data 2.4.1 Information Hierarchy The hierarchy referred to variously as the ‘Knowledge Hierarchy’, the ‘Information Hierarchy’ and the ‘Knowledge Pyramid’ is one of the fundamental, widely recognized and ‘taken-for-granted’ models in the information and knowledge literatures. It is often quoted, or used implicitly, in definitions of data, information and knowledge in the information management, information systems and knowledge management literatures (Rowley, J., 2007). (Russell Ackoff), a systems theorist and professor of organizational change, organised the content of the human mind into 5 categories, as seen in Figure 4 (Bellinger et al 2004): o Data – Symbols. o Information – Processed data that return answers to “who”, “what”, “where” and “when” questions. o Intelligence - Application of data and information; answers "how" questions o Knowledge - Appreciation of "why" o Wisdom – Evaluated understanding Figure 4 Information Hierarchy, Flood, P 2016
  • 21. 21 The first four categories deal with the past, with only the final category, wisdom, dealing with the future. IBM claimed that unstructured and multi-structured data make up 80% of the data within an organisation (Savvas, A., 2011). With more and more people now purchasing online, retailers need to be able to extract relevant information to stay competitive. They can do so by better predicting what customers want from their habits, preferences and expectations, giving them a better shopping experience. 2.4.2 Algorithms (Microsoft 2016) wrote that a data mining algorithm is a set of heuristics and calculations that creates a data mining model from data. The algorithm analyses the data provided and searches for specific types of patterns or trends in order to create a model. The results of this analysis are then used by the algorithm to define the optimal parameters for creating the mining model. The parameters are then applied across the entire data set to extract actionable patterns and detailed statistics. 2.5 Models 2.5.1 K - Nearest Neighbour Model The K – Nearest Neighbour Model (KNN) is a simple, versatile model. (Thirumuruganathan, S., 2014) said that KNN is a non-parametric lazy learning algorithm. Non-parametric means that KNN does not make assumptions about the underlying data distribution. 1. Nearest neighbour outcome is a plus 2. Nearest neighbour outcome is an unknown 4. Nearest neighbour outcome is a minus Figure 5 K-Nearest Neighbour Model
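The voting rule behind KNN can be illustrated with a minimal sketch. The coordinates and labels below are invented for illustration and are not read from Figure 5; they are chosen only so that the answer changes as k grows. The function name knn_classify and the choice to return None on a tied vote are assumptions of this sketch, not part of any cited method.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+

# Hypothetical labelled points (x, y, label); NOT taken from Figure 5.
POINTS = [
    (1.0, 0.0, "minus"),
    (0.0, 1.5, "plus"),
    (-2.0, 0.0, "plus"),
    (0.0, -2.5, "plus"),
    (4.0, 4.0, "minus"),
    (-5.0, 3.0, "minus"),
]


def knn_classify(query, points, k):
    """Return the majority label among the k nearest points, or None on a tie."""
    # "Lazy" learning: nothing is trained, the stored points are searched per query.
    nearest = sorted(points, key=lambda p: dist(query, p[:2]))[:k]
    votes = Counter(label for _, _, label in nearest).most_common()
    if len(votes) > 1 and votes[0][1] == votes[1][1]:
        return None  # the top two classes have equal votes, so no decision
    return votes[0][0]


if __name__ == "__main__":
    query = (0.0, 0.0)
    for k in (1, 2, 4):
        print(k, knn_classify(query, POINTS, k))
```

With these invented points the single nearest neighbour decides the class for k = 1, the vote is tied for k = 2, and the majority of the four nearest points decides for k = 4, which is why the choice of k matters.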
  • 22. 22 Figure 5 shows how K-nearest neighbour analysis classifies a new object among a number of known examples. Looking at figure 5, the aim is to decide whether the query point (the orange dot) should be classified as a plus or a minus sign. Consider first the outcome of KNN based on 1 nearest neighbour. It is clear from figure 5 that the result will be a minus, since the nearest point is a minus sign. If the number of nearest neighbours is increased to 2, KNN will not be able to classify the query point, because the second closest point is a plus and the two classes then have one vote each. If the number of nearest neighbours is increased to 4, this defines the nearest neighbour region, indicated in the figure with a circle. Since it contains 3 plus signs and 1 minus sign, the query point will be classified as a plus. 2.5.2 C4.5 The C4.5 model creates a classifier in the form of a decision tree. To do this, C4.5 is given a set of data representing items that are already classified. A decision tree is similar to a flow chart. Each tree has a root node, which is Age in figure 6, and branches, which are shown by Young, Middle Aged and Senior. Internal nodes are represented by Student? and Credit_Rating?, and the leaf nodes by the Yes and No outcomes. Figure 6 C4.5 Decision Tree (root node: Age; branches: Young, Middle Aged, Senior; nodes: Student?, Credit_Rating?; leaves: Yes, No) Source: P
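As an illustration of how a tree of this shape is used once it has been built, the following minimal sketch walks a record from the root to a leaf. It covers only classification with a finished tree, not the way C4.5 constructs one. The tree layout follows the node names in Figure 6, but the leaf outcomes, the attribute values "excellent" and "fair", and the names TREE and classify are assumptions made for this example.

```python
# Hypothetical tree shaped like Figure 6. The leaf outcomes and the
# credit rating values ("excellent", "fair") are assumptions.
TREE = {
    "attribute": "age",
    "branches": {
        "young": {
            "attribute": "student",
            "branches": {"no": "No", "yes": "Yes"},
        },
        "middle_aged": "Yes",
        "senior": {
            "attribute": "credit_rating",
            "branches": {"excellent": "No", "fair": "Yes"},
        },
    },
}


def classify(tree, record):
    """Follow the branch matching each tested attribute until a leaf is reached."""
    node = tree
    while isinstance(node, dict):  # dicts are decision nodes, strings are leaves
        value = record[node["attribute"]]
        node = node["branches"][value]
    return node


if __name__ == "__main__":
    print(classify(TREE, {"age": "young", "student": "yes"}))
    print(classify(TREE, {"age": "senior", "credit_rating": "excellent"}))
```

Each record is tested on one attribute per level, so classification takes at most as many steps as the tree is deep, which is why applying a decision tree is fast.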
  • 23. 23 Using a C4.5 decision tree has benefits: o Small decision trees are easy to understand o Does not require domain knowledge o Works well with redundant attributes o The classification steps of a decision tree are straightforward and fast The disadvantages of using C4.5 are: o Irrelevant attributes may affect the creation of a decision tree o Slight changes in the data can create very different looking decision trees o Too many classes can cause errors o Sub trees can be replicated many times o Inadequate for forecasting the value of a continuous class attribute 2.6 Data Security Risks With the increased use of business intelligence comes greater concern for the security of the company’s data. (McHugh, R., 2015) wrote that 44% of Irish companies are worried that big data will increase security risks. Companies have been working on computer security for the past 30 years. In that time there have been a huge number of successful advances, such as public key encryption, multi-level security and cryptographic protocols, but unfortunately attackers also advance, and at faster rates. With more and more customer information being saved by retailers, customers need to feel secure in the knowledge that their data is safely stored.
  • 24. 24 2.7 Methods of Collection 2.7.1 Loyalty Cards Superquinn introduced loyalty cards to Irish retail in 1993 with their SuperClub loyalty card scheme. Loyalty cards did not grow in popularity with other retailers until Tesco introduced their loyalty card scheme in 1997. Most large retailers that have Business Intelligence (BI) also have a loyalty program for their customers offering vouchers, coupons and discounts off items. This is a small incentive in return for the information that customers give the retailers. (Sharpe, 97) asked, do loyalty programs increase loyalty? Retailers use BI to keep existing customers by tracking what they purchase and targeting discount vouchers at those purchases. This helps retailers keep track of customers and increase loyalty to their brand, as seen in figure 7. Retailers achieve this through deep purchase history analysis and customer attribution (Sciencesoft, 2014). In a study by North Western University, one in five people studied used social media to communicate with the retailers they visit most frequently. Among that 20%, Facebook was the most dominant channel for discounts and brand news (Beck et al). Retailers are trying to find new and unique ways of attracting new customers and keeping existing ones. Figure 7 Analysing Customer Data Source: Slideshare.net In the UK, £1 in every £7 is spent on Tesco products or services (Smith, Woods, 2008). This commanding position has been supported by Tesco’s popular club card, which is in 1 out of every 2 households in the UK (Davey, 2009). (Armstrong, 2016) wrote that 9 out of 10 people (87%) in Ireland are signed up to loyalty schemes and the average person participates in up to 4 programs, as seen in figure 8.
  • 25. 25 Figure 8 Loyalty Cards Ireland Source: http://irishtechnews.net/ITN3/new-shopping-loyalty-app-is-launched-for-the-irish-market/ 2.7.2 Security Most shoppers are unaware that the loyalty cards they use every day are being used by the retailer to build a profile of their shopping patterns for later use. The data collected can be sent on to telemarketers and business partners. Retailers, however, do not say how often the discount card scheme is outsourced to other companies. Safeway's privacy policy, for instance, states the following: “The information we receive depends on what you do when you visit our stores and use your card. We collect and store your name, address, home telephone number, and birth date if provided by you. If you are in an area where we offer electronic checking and apply for this service, we also ask for information such as your driver's license number and bank and credit card account numbers. When you make purchases, we record data about the transaction; including the amount and content of your purchases and the time and place these purchases are made” (Flaherty 2011). 2.7.3 Dublin Airport In May 2013 Dublin Airport introduced facial recognition technology at its biometric gates to speed passengers through border control. This allowed passengers to be verified using facial recognition and processed in 7.5 seconds each (futuretravelexperience.com, 2013). Not only will this enhance the
  • 26. 26 travelling experience for passengers, but facial recognition will also strengthen border security in the airport. 2.7.4 On-Line Recommendations Some online companies try to predict what their customers want based on searches made on their site. Companies like Amazon use recommendation algorithms to tailor recommendations to their customers by following searches, purchases, length of time on site, how long an item is viewed, items hovered over, wish lists, shopping cart activity and rated items (Ulanoff, 2014). How Amazon’s algorithm works is described below. 2.7.5 Amazon’s Similarity Algorithm The algorithm works by finding groups of customers whose purchased and rated items overlap with the user’s purchased and rated items (Schafer, Konstan, Reidl, 2001). The algorithm then gathers the items from these similar customers and disregards the items that the user has already rated or purchased. The algorithm then recommends the remaining items to the customer. The algorithm represents a customer as an N-dimensional vector of items, where N is the number of distinct items in the catalogue. The components of the vector are positive for purchased or positively rated items and negative for negatively rated items. To compensate for best-selling items, the algorithm multiplies the vector components by the inverse frequency (the inverse of the number of customers who have purchased or rated the item). This makes less well-known products more important (Linden, Smith, York, 2003). Figure 9 Amazon Recommendation Algorithm Source http://www.cin.ufpe.br/~idal/rs/Amazon-Recommendations.pdf
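The vector representation described above can be sketched in a few lines. Customer vectors of this kind are commonly compared by the cosine of the angle between them, which is the measure given in Linden, Smith and York (2003). The catalogue, the three customer vectors and the function names below are invented for illustration, and the sketch leaves out the inverse-frequency weighting and negative ratings.

```python
from math import sqrt

# Hypothetical catalogue and purchase vectors (1 = purchased, 0 = not purchased).
CATALOGUE = ["book", "kettle", "lamp", "mug"]
customer_a = [1, 1, 0, 0]
customer_b = [1, 1, 1, 0]
customer_c = [0, 0, 0, 1]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(target, others):
    """Recommend items held by the most similar customer that the target lacks."""
    best = max(others, key=lambda v: cosine_similarity(target, v))
    return [item for item, t, o in zip(CATALOGUE, target, best) if o > 0 and t == 0]


if __name__ == "__main__":
    print(cosine_similarity(customer_a, customer_b))
    print(recommend(customer_a, [customer_b, customer_c]))
```

In this toy example customer B shares two purchases with customer A while customer C shares none, so B is the most similar customer and the one item B has that A lacks is recommended.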
  • 27. 27 The algorithm bases its recommendations on the few customers who are most similar to the user. The similarity of two customers, A and B in figure 9, can be measured as the cosine of the angle between their vectors. 2.7.6 In-store Tracking Retailers are now using more technology than ever to track customer behaviour in the shop. This is becoming popular with younger shoppers, who like interaction in their shopping experience. Carrefour in France use tracking technology to send coupons to smart devices as customers walk down aisles past certain products (Pope, 2015). According to software firm CSC, 30% of retailers are now using facial recognition to track customers in the store (McDonald, 2015). The report also stated that 74% of stores use technology to track customers in the store, while a quarter of customers say it gives a good shopping experience. Many people accept that younger people are happier to have their information used by retailers for this purpose, while not fully understanding how the information is being used and the security risks that are associated with it. Retailers need to explain to their customers that the data is collected for the benefit of the customer and not for the retailer. Retailers in America are using facial recognition not only to track customers in store but also to help with security and to stop theft, which costs the retail industry in America $32 billion (Wahba, 2015). Facial recognition used as a security measure will recognise known thieves. Facial recognition is not used in Ireland for shopping or security. Is Ireland slow to catch on to new technology in retail, or do Irish shoppers not want to change? Irish customers are known for their brand loyalty. An example of this came in 1996, when Superquinn rolled out the concept of a ‘super scanner’ that would end the dreaded queues in their stores, as seen in Figure 10. This was based on trust
  • 28. 28 as customers would scan the items’ bar codes as they were put in the basket or trolley. The ‘super scanner’ would total up the amount and the customer would pay from their account. A random selection of customers’ trolleys would be checked, and those customers would have to go through the checkout the old way. This was to help avoid overcharging. Mr Eamonn Quinn said the project was self-financing as it would attract new customers and increase business (Yeates, 1996). Figure 10 Super Scanner used in Superquinn Source: http://ecir2011.dcu.ie/ecommpracticums/main/2006/G.pdf While Irish retailers seem to be slow to advance, Superquinn seems to be an innovator in this area. They have piloted some of the world’s most advanced retail technology, such as self-scanning shopping as seen above, multi-function kiosks, digital shelf-labels and mobile checkout technology (Labarre, 2001).
  • 29. 29 3.0 Methodology This chapter outlines the strategy of the research to be undertaken in relation to the distinct research questions, restated below: 1. What size shops are using data analytics? 2. Is it cost based not to use data analytics? 3. Are Irish retailers using data analytics to its full potential? To answer the above questions, the researcher used a quantitative methodology, as it allowed retailers of different sizes within Ireland to be surveyed with the aim of answering the research questions. Quantitative methodology is described as follows: Quantitative research is a formal, objective, systematic process in which numerical data are used to obtain information about the world. (Burns, Groves,2005) Other methods were considered, such as qualitative and mixed methodology. Qualitative methodology as described by Pope and Mays: The goal of qualitative research is the development of concepts which help us to understand social phenomena in natural (rather than experimental) settings, giving due emphasis to the meanings, experiences, and views of all participants. (Pope C, Mays N, 1995) Mixed methodology was described by (Leech and Onwuegbuzie 2006): Because of its logical and intuitive appeal, providing a bridge between the qualitative and quantitative paradigms, an increasing number of researchers are utilizing mixed methods research to undertake their studies. (Leech and Onwuegbuzie 2006) Reasons for and against adopting these methodologies are explained below.
  • 30. 30 3.1 Research Research can at times be mistaken for gathering information, documenting facts, and rummaging for information (Leedy & Ormrod, 2001). Research is the process of gathering and measuring information on variables of interest, in a traditional, structured fashion that permits one to answer stated research questions, test hypotheses, and evaluate outcomes. This data collection component of research is common to all fields of study (Jacob, 2015). (Brannick & Roche, 1997) wrote that good research is purposeful, has clearly defined and significant goals, follows defensible methodological procedures, analyses evidence systematically, applies statistical techniques correctly and makes the objectivity of the researcher clearly evident. Qualitative and quantitative are two ways of conducting research studies. (Minchiello 1990) stated that qualitative research is where data is collected through participation and interviews. The data is analysed from the interviewees’ descriptions, and the interviewees’ language is the data used. Quantitative research is the measuring of data. The data is analysed using numerical comparisons and statistical inferences. 3.1.1 Research Philosophy For this research paper the researcher uses quantitative research. Due to time constraints and the cost of conducting individual interviews, the researcher decided to use quantitative research in the form of a questionnaire. SurveyMonkey was used to create and distribute the questionnaire. SurveyMonkey is an online survey software provider, with tools for creating, modifying, distributing and analysing questionnaires sent out to candidates. The survey was distributed by text, email, social media and web links.
  • 31. 31 The initial draft was created using the tools on SurveyMonkey and distributed to numerous retailers using SurveyMonkey's own website. The following drafts were created on SurveyMonkey's website and distributed individually and by group via social media, text and email. An issue occurred when sending the survey to respondents via SurveyMonkey's own website: the respondents were not receiving the emails. To counteract this problem, the questionnaires were sent out individually by personal email to the respondents.
    3.1.2 Analysing Research Data
    Once all the questionnaires have been completed on SurveyMonkey, the results will be entered into IBM SPSS, a software package for statistical analysis. Using SPSS, the researcher will look for variables in the results to answer the research questions. These answers will then be analysed and broken down by the 9 questions asked. The results will then be related to the three research questions and presented in the discussion section. The expected outcomes of the research are:
    1. To find out what size retailers in Ireland are using big data.
    2. To find out whether cost is the main reason not to use big data.
    3. To find out whether Irish retailers are using big data to its full potential.
    3.1.3 Qualitative versus Quantitative
    Primary data can be quantitative or qualitative, as shown in Figure 11. (Malhotra, 2000) wrote that the distinction between qualitative and quantitative research closely matches the distinction between exploratory and conclusive research.
    Conceptual, Qualitative:
    o Concerned with understanding human behaviour from the informant's perspective.
    o Assumes a dynamic and negotiated reality.
    Conceptual, Quantitative:
    o Concerned with discovering facts about social phenomena.
  • 32. 32 Conceptual, Quantitative (continued):
    o Assumes a fixed and measurable reality.
    Methodology, Qualitative:
    o Data are collected through participant observation and interviews.
    o Data are analysed by themes from descriptions by informants.
    o Data are reported in the language of the informant.
    Methodology, Quantitative:
    o Data are collected through measuring things.
    o Data are analysed through numerical comparison and statistical inferences.
    o Data are reported through statistical analysis.
    Source: Minchiello et al (1990, p5)
    Figure 11 Classification of Market Research Data (boxes shown: Marketing Research Data; Secondary Data; Primary Data; Qualitative Data; Quantitative Data; Exploration; Cause and Effect; Description). Source: Marketing Research, Malhotra, N
  • 33. 33 3.2 Research Objectives
    This dissertation aims to identify whether Irish retailers are using data analytics to its full potential. The following are the objectives that this research will address:
    o What size shops are using data analytics?
    o Is it cost based not to use data analytics?
    o Are Irish retailers using data analytics to its full potential?
    3.3 Qualitative Methodology
    The goal of qualitative research is the development of concepts which help us to understand social phenomena in natural (rather than experimental) settings, giving due emphasis to the meanings, experiences, and views of all participants. (Pope C, Mays N, 1995)
    Figure 12 Classification of Qualitative Research Procedures (Direct (Non-Disguised): Group Interviews, Depth Interviews; Indirect (Disguised): Observation Techniques, Projective Techniques). Source: Marketing Research, Malhotra, N
    The categorisation of qualitative research is shown in Figure 12. Procedures are categorised as either direct or indirect, based on whether the interviewees know the true purpose of the assignment. The direct method is not disguised: the purpose of the assignment is revealed to the interviewees or is obvious to the
  • 34. 34 respondents from the questions asked. The main direct interview techniques are in-depth interviews and focus groups. The indirect method conceals the exact purpose of the assignment. The main indirect interview techniques are projective and observation techniques.
    (Travers, M. 2002) wrote about different qualitative methods:
    1. Observation: numerous sociological studies on courtrooms have been built on observations.
    2. Interviewing: individually or in focus groups, e.g. to get the different viewpoints of police officers, lawyers and probation officers.
    3. Ethnographic fieldwork: frequently requires devoting a lot of time to a group, e.g. an anthropologist studying another culture.
    4. Discourse analysis: close study of communication, e.g. a recording of a doctor's advice to a patient.
    5. Textual analysis: analysis of textual and multimedia items, e.g. letters, diaries, files, web sites, message boards.
    3.4 Sampling Technique
    Figure 13 Classification of sampling techniques (Non-probability sampling techniques: Convenience sampling, Judgemental sampling, Quota sampling, Snowball sampling; Probability sampling techniques: Simple random sampling, Systematic sampling, Stratified sampling, Cluster sampling, Other sampling techniques). Source: Marketing Research p353
  • 35. 35 Sampling techniques, as seen in Figure 13, can be classified as probability and non-probability sampling. Probability sampling is where sampling units are selected by chance.
    o It is possible to pre-specify every potential sample of a given size.
    o Potential samples do not all need to have the same probability of being selected.
    o It is possible to specify the probability of selecting any particular sample of a given size.
    o It requires not only an accurate definition of the target population but also a general specification of the sampling frame.
    o It is possible to determine the precision of the sample estimates of the characteristics of interest.
    o Confidence intervals, which contain the true population value with a given level of certainty, can be calculated.
    o This allows the researcher to make inferences or projections about the target population from which the sample is drawn.
    The classification of probability sampling techniques is based on:
    1. Element versus cluster sampling
    2. Equal unit probability versus unequal probabilities
    3. Unstratified versus stratified selection
    4. Random versus systematic selection
    5. Single-stage versus multistage techniques
    Non-probability sampling relies on the individual judgement of the researcher rather than on chance.
    o The interviewer can deliberately and subjectively decide which elements to put in the sample.
    o Non-probability sampling can provide good approximations of population characteristics.
  • 36. 36 o Non-probability samples do not permit an objective assessment of the precision of the sample results.
    o Commonly used techniques include quota sampling, snowball sampling, convenience sampling and judgemental sampling.
    3.4.1 Qualitative Sampling
    Researchers can use different strategies for qualitative sampling, as seen in Figure 14. Each has its own advantages and disadvantages.
    Figure 14 Qualitative Research Design. Source: Maxwell J 2013, Qualitative Research Design: An Interactive Approach
    3.4.2 Snowball Sampling
    1. This is the most common method of sampling.
    2. An advantage of this method is that one interviewee refers the researcher to another interviewee.
    3. A disadvantage of snowball sampling is that the sample may be limited because it consists of interviewees who belong to the networks of the index cases (Hardon, A, Hodgkin, C, Fresle, D, 2004).
    4. After the interview, the respondents are asked to identify others who belong to the same target population.
    5. Subsequent respondents are selected based on these recommendations.
  • 37. 37 6. By obtaining recommendations from recommendations, this procedure is carried out in waves, thus leading to a snowball effect.
    7. Even if the initial respondents are selected by probability sampling, the final sample is a non-probability sample.
    8. The recommended respondents will have psychographic and demographic characteristics more similar to the person referring them than would occur by chance (Frankwick, G.L. et al 1994).
    3.4.3 Convenience Sampling
    1. The sample is obtained from whoever is available to take part.
    2. An advantage of this method is getting the view of an interviewee who might not ordinarily be interviewed.
    3. A disadvantage is that the researcher could interview a person who has no information relevant to the study.
    4. Convenience sampling is the least time-consuming and least expensive of all sampling techniques.
    5. The sampling units are accessible, co-operative and easy to measure.
    6. Convenience sampling has serious limitations.
    7. Many potential sources of selection bias are present, including respondent self-selection.
    8. Convenience samples are not representative of any definable population.
    9. They are not suitable for marketing research assignments involving population inferences.
    10. Convenience samples are not suitable for descriptive or causal research but can be used in exploratory research for generating ideas, hypotheses or insights.
  • 38. 38 3.4.4 Judgemental Sampling
    1. Judgemental sampling is a type of convenience sampling in which the researcher selects the population elements based on their own judgement.
    2. The researcher, exercising knowledge or judgement, selects the elements to be included in the sample because they believe that these are representative of the population of interest.
    3. Judgemental sampling is used when there is a limited number of individuals.
    4. It is the only viable sampling technique for obtaining information from a very specific group of individuals.
    3.4.5 Quota Sampling
    1. Quota sampling can be regarded as two-stage restricted judgemental sampling and is used extensively in street interviewing.
    2. Stage one consists of creating control characteristics, or quotas, of population elements such as age and gender.
    3. The researcher lists relevant control characteristics and determines the distribution of these characteristics in the target population.
    4. In the second stage, sample elements are selected based on convenience or judgement.
    5. Once the quotas have been allocated, there is significant freedom in choosing the elements to be included in the sample.
    6. The only requirement is that the elements selected fit the control characteristics.
    7. It may appear that the quota sampling technique is fully representative of the population. In some cases, it is not.
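    The two stages of quota sampling described above can be sketched in Python. This is only an illustration under stated assumptions: the population of 100 retailers, the size bands used as the control characteristic, the quota numbers and the function name quota_sample are all hypothetical, and selection by convenience is approximated by taking units from a shuffled list.

```python
import random
from collections import defaultdict


def quota_sample(population, key, quotas, seed=0):
    """Two-stage quota sample (illustrative sketch).

    Stage 1: `quotas` maps each control category to the number of units wanted.
    Stage 2: units inside each category are picked by convenience, which is
    approximated here by shuffling and taking the first units (an assumption).
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for unit in population:
        groups[key(unit)].append(unit)

    sample = []
    for category, wanted in quotas.items():
        candidates = list(groups.get(category, []))
        rng.shuffle(candidates)
        # Take at most `wanted` units that fit the control characteristic.
        sample.extend(candidates[:wanted])
    return sample


# Hypothetical population: 100 retailers labelled by employee size band.
bands = ["0-50"] * 60 + ["50-250"] * 25 + ["250 plus"] * 15
population = [{"id": i, "size": band} for i, band in enumerate(bands)]

# Hypothetical quotas chosen to mirror the population distribution.
quotas = {"0-50": 6, "50-250": 3, "250 plus": 1}

sample = quota_sample(population, key=lambda r: r["size"], quotas=quotas)
print(len(sample), "retailers selected")
```

    The only constraint enforced is that each selected unit fits its quota category, which is why the resulting sample is not guaranteed to be representative on any other characteristic.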
  • 39. 39 3.5 Semi Structured
    A semi structured methodology uses a list of open ended questions that can change from interview to interview. (Kvale, 1996) wrote that semi structured interviews are a form of human interaction in which knowledge evolves through dialogue. Face to face interviews are thought to give the interviewer the highest response rates (Nueman, 2007). The interviewer can also have a list of follow up probing questions to gather further information on any answer given (Curran, 2014). (Saunders, Lewis, Thornhill, 1997) wrote that in semi structured interviews:
    o Questions can vary from interview to interview
    o Questions can be left out and added
    o The order of questions may vary.
    Advantages and disadvantages of semi structured interviews
    Advantages:
    o Acquires relevant information
    o The interviewees are specifically targeted
    o Structured so as to allow comparison
    o Gives the freedom to explore general views or opinions in more detail
    o Can be used for sensitive topics
    o Can use an external organisation so as to retain independence
    Disadvantages:
    o Interviewing skills are essential
    o Need to meet sufficient people in order to make general comparisons
    o Preparation must be carefully planned so as not to make the questions prescriptive or leading
    o Need to have skills in analysing the data
    o Time consuming and resource intensive
    o You have to be able to ensure confidentiality
  • 40. 40 3.5.1 Unstructured
    Unstructured interviews are in-depth exploratory interviews. The interviewer has no predetermined list of questions, which allows the interviewee to talk freely about the questions asked. The questions are open ended, allowing the respondents to answer in their own words. Open ended questions are good questions with which to start an interview. They allow the respondents to express opinions that help the interviewer interpret the responses. These questions have a less biasing influence on the interviewee's answers than structured questions. The interviewees are free to express any views (Malhotra, Birks, 2000). The researcher gains substantial insight from these comments and explanations.
    (Malhotra, Birks 2000) argued that the main disadvantage of unstructured interviewing is that the potential for interviewer bias is high. Another disadvantage is that coding the responses is costly and time consuming (Jones, 1981) and (MacDonald, 1982). Unstructured or open ended questions give extra weight to interviewees who are more articulate. Unstructured questions are not suitable for self-administered questionnaires such as post and telephone interviews, as respondents tend to be briefer in writing than in speaking.
  • 41. 41 3.5.2 Structured
    Structured interviewing involves questionnaires with a predetermined set of questions.
    Weaknesses/Limitations of Structured Interviews:
    o This can be time consuming if the group is very large.
    o The quality and usefulness of the information received depends on the quality of the questions asked. The interviewer cannot add or subtract questions.
    o A considerable amount of preplanning is needed.
    o The design of the questionnaire makes it difficult for the interviewer to examine complex issues and views.
    o Answering the questions gives limited scope to expand in detail or depth.
    o The interviewer can influence the answers of a respondent by their presence, making the responses biased.
    o By designing the questionnaire, the interviewer has decided in advance which questions they consider important and unimportant.
    Strengths of Structured Interviews:
    o It allows the researcher to examine the knowledge a respondent has about the topic.
    o It is an important form of formative assessment. It can be used to assess a respondent's feelings on a particular topic before using a second method.
    o All the interviewees are asked the same questions.
    o It offers a reliable source of quantitative data.
    o The interviewer is able to contact large numbers of people quickly, easily and efficiently.
    o It is fast and straightforward to generate, code and interpret.
    o A formal relationship is created between the interviewer and the respondent.
    o The interviewer does not have to worry about incomplete questionnaires,
  • 42. 42 biased questionnaires and response rates with structured interviews. (Source: http://www.sociology.org.uk/methsi.pdf)
    3.6 Quantitative Methodology
    “Quantitative research is a formal, objective, systematic process in which numerical data are used to obtain information about the world”. (Burns, Groves, 2005) This research method is used:
    o to describe variables;
    o to examine relationships among variables;
    o to determine cause-and-effect interactions between variables.
    Quantitative observation involves studying the behavioural patterns of people, objects and events in a systematic manner to obtain information about the phenomenon of interest.
    3.6.1 Independent Variable
    Independent variables, also known as treatments, are alternatives that are manipulated or controlled by the researcher and whose effects are measured and compared. These could be price levels, marketing themes and package design.
    1. This is what the researcher expects will affect the dependent variable.
    2. The researcher sets out to control the effect on the dependent variable.
  • 43. 43 3.6.2 Dependent Variable
    Dependent variables are variables that measure the effect of the independent variables. These variables can be sales, profits and market share.
    1. The researcher expects that this will be affected by manipulating the independent variable.
    2. It can be measured.
    3.6.3 Moderating Variable (Olsen, W, K)
    1. A moderating variable represents a process or a factor that alters the impact of an independent variable.
    2. It has a strong contingent effect on the relationship between the independent and the dependent variable (Hussey, J. and Hussey, R. 1997).
    3.7 Mixed Methodology
    Mixed methodology involves both quantitative and qualitative research, as seen in Figure 15. Including only qualitative or only quantitative methods falls short of the major approaches used in the social and human sciences (Creswell, 2003). Mixed methodology includes philosophical assumptions from both the quantitative and the qualitative approaches. One of its key benefits is the capability to match the method to the requirements of the study.
    Figure 15 Methods of Interviewing. Source: Creswell 2003
    (Migiro, Magangi, 2011) wrote
  • 44. 44 that mixed methodology lets the researcher get a sense of the vital issues on the subject before embarking on further development in a study, which can be very valuable. (Creswell & Plano Clark, 2007) wrote that mixed methodology is more than simply accumulating and analysing both kinds of data; it also comprises the use of both methods in tandem so that the overall strength of a study is greater than that of either qualitative or quantitative research alone. Mixed methodologies are evolving into a dominant form of research.
    However, (Symonds, Gorard, 2008) wrote that the concept of mixed methodologies has logical foundations rooted more in philosophy than in pragmatic reality. (Morse 2005), while accepting the new field, admitted to a sense of hearsay because the “sudden faddishness of mixed methods” had brought to the fore awkward and unanswered questions about mixing qualitative and quantitative methods within a single set. Others, such as (Bazeley 2004), warned about the risks of this “new” methodology and the paradigmatic and methodological issues that could arise. (Giddings and Grant 2007) called mixed methods a bastardisation of positivism. (Giddings 2006) had issued this frank warning:
    Clothed in a semblance of inclusiveness, mixed methods could serve as a cover for the continuing hegemony of positivism, and maintain the marginalisation of non-positivist research methodologies. I argue here that mixed methods as it is currently promoted is not a methodological movement, but a pragmatic research approach that fits most comfortably within a post positivist epistemology. (Giddings, 2006)
  • 45. 45 The case for and against the use of mixed methodology is argued below.
    For:
    o All singular methods (interview, survey) and all data types (audio, visual) can be categorised under one of two paradigms (quantitative, qualitative).
    o Elements from both paradigms can cohabit in a single investigation.
    o A third classification is required to refer to investigations which use elements of both paradigms.
    Against:
    o Qualitative and quantitative research are separate, contrasting paradigms.
    o Integrated research approaches can lead to the threat of discounting the assumptions underlying research methods.
    o It can be time consuming and expensive.
    o Researchers need to have experience in both qualitative and quantitative methods.
    3.8 Ethics
    Organisations using big data reap the benefits of customers' data: detecting fraud, customising customer services and using the company's resources efficiently. Why then are people so concerned with the ethics that come with data analytics? Chessell (2014) wrote that the technology itself is inherently ethics-agnostic, but it does push the art of the possible to new limits in terms of:
    o The availability of a wide variety of data from several sources.
    o The ability to correlate the data to understand the bigger picture.
    o The ability to accurately identify and target individuals.
    o The ability to identify someone's location for surveillance.
  • 46. 46 Ethics in quantitative research is extremely important. It ranges from the anonymity of the participants to the non-disclosure of sensitive information. Ethical issues arise in market research for numerous reasons, because market research consists of communicating with respondents and the general public through data collection, the distribution of research conclusions and advertising campaigns based on those conclusions. This creates the potential to abuse the information by taking advantage of the participants involved.
    Data protection and big data are both very important, and organisations should think about them hand in hand. In Ireland, the Data Protection Acts (DPA) of 1988 and 2003 apply. These acts set regulatory requirements for handling private data. (AL Goodbody) wrote about some of the key issues that need to be considered:
    1. Who controls the data
    2. Appropriate security measures
    3. Consent of the data subjects
    4. De-identification of data
    3.8.1 Data Collection
    1. All information received on SurveyMonkey will be destroyed at the end of this research project.
    2. All records taken for this research paper will be kept confidential and will only be used by the researcher.
    3. There is a duty of care for the weak and vulnerable.
    4. The rights of the weak and vulnerable are to be protected and defended.
    5. All interviewees take part willingly in any research.
  • 47. 47 (Shamoo, Resnik 2015) gave a rough summary of ethical guidelines that should be followed.
    o Honesty
    o Objectivity
    o Integrity
    o Carefulness
    o Openness
    o Respect for intellectual property
    o Confidentiality
    o Human Subjects Protection
    o Responsible Publication
    o Responsible Mentoring
    o Respect for Colleagues
    o Social Responsibility
    o Non-discrimination
    o Competence
    o Legality
    o Animal Care
    3.8.2 Restrictions
    The restrictions on this research paper were determined by the use of quantitative methodology. Even though this methodology has many benefits, it also has restrictions, for example the categories used may not reflect the candidates' understanding; these were not as limiting as those of the other methods. Quantitative methodology is also less time consuming and less expensive than the other methodologies. Mixed methodology was first considered, as it covered both quantitative and qualitative methods, but time constraints and the cost of conducting individual interviews meant that it was not used. Qualitative methodology was also considered, but conducting the interviews and collecting the data would have been time consuming and expensive. The data would also have been more difficult to analyse.
    3.8.3 Value
    This research looked at whether Irish retailers are using big data to its best capabilities. While larger multinationals are using big data, the smaller independent Irish retailers were always going to be playing catch up in adopting it and the new technologies that come with it. Barriers to Irish retailers include cost and training.
  • 48. 48 4.0 Results
    In this chapter the results of the survey are presented, following the approach set out in the previous methodology chapter. 30 surveys were sent out to respondents and 26 responses were received, 9 via SurveyMonkey email invite and 17 via SurveyMonkey mobile link. All retailers replied in full, skipping no parts of the questionnaire. After issues with sending out the questionnaire via email invitation, the mobile link was used instead. This way of sending out the questionnaire brought back faster responses from clients. In the results below, (n=) refers to the number of respondents giving that answer out of 26. The total is (n=26); n is also used in K Nearest Neighbour (KNN), as discussed in the literature review in an earlier chapter. Figure 16 shows the two methods by which the survey was sent out and Figure 17 shows the responses by the day returned.
    Figure 16 Methods of sending out survey through SurveyMonkey
    Figure 17 Responses by Volume. Source: SurveyMonkey
  • 49. 49 4.1 Survey Results
    Question 1: What is the size of your company?
    All 26 respondents responded to this question, skipping no parts. This question in the survey was linked to Question 1 of the methodology section: What size shops are using data analytics? This research is being carried out on Irish retailers. As seen in Figure 18, most of the Irish retailers questioned in the survey were smaller retailers of 0-50 employees, which accounted for 61.54% (n=16). Medium sized retailers of 50-250 employees accounted for 15.38% (n=4) and larger retailers of 250 plus employees for 23.08% (n=6). Retail is Ireland's largest employer, employing 275,000 people; it is currently 90% Irish owned, with 77% of these businesses family owned (Gleeson & Lynam).
    Employees    Percentage   Respondents
    0-50         61.54%       16
    50-250       15.38%       4
    250 plus     23.08%       6
    Table 18
    Figure 18 Question 1: What is the size of your company? (bar chart by number of employees: 0-50, 50-250, 250 plus; vertical axis 0% to 70%)
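    The percentages reported for Question 1 follow directly from the respondent counts. The short Python sketch below recomputes them; the counts (16, 4 and 6 out of 26) come from the survey results above, while the function name percentages and the variable names are illustrative only.

```python
def percentages(counts):
    """Convert raw response counts into percentages of all responses (2 decimals)."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


# Respondent counts reported for Question 1 (26 respondents in total).
question_1 = {"0-50": 16, "50-250": 4, "250 plus": 6}

for label, pct in percentages(question_1).items():
    print(f"{label}: {pct}% (n={question_1[label]})")
```

    With 26 respondents, 16 answers correspond to 61.54%, 4 to 15.38% and 6 to 23.08%, so the three shares sum to 100%.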
  • 50. 50 Question 2: Does your shop use Data Analytics?
    This question was to determine which of the retailers questioned actually used data analytics. It was linked to Question 1 of the methodology: what size shops are using data analytics? All 26 respondents responded to this question, skipping no parts. Of the 26 respondents, 69% (n=18) used data analytics while 31% (n=8) did not use data analytics in their shop, as seen in Table 19.
    Answer   Percentage   Respondents
    Yes      69.23%       18
    No       30.77%       8
    Table 19
    Figure 19 Question 2: Does your shop use data analytics (pie chart: Yes 69%, No 31%)
    As seen in Figure 20, when a result comes back in minus figures it means there is a problem with the data entered: either the sample size was too small or there were too many open ended questions and no scale questions.
    Figure 20 What size shops are using data analytics?
  • 51. 51 Question 3: What form of data collection do you use?
    This question was to see what form of collection the retailers use when collecting their customers' data. It was linked to Question 3 of the methodology section: Are Irish retailers using data analytics to its full potential? All 26 respondents responded to this question, skipping no parts. As seen in Figure 21, nearly 8% use loyalty cards (n=2). By far the largest group, 38% (n=10), use in store tracking, such as recording customers' details and their purchasing history. Online purchasing tracking and social media are each used by 19% (n=5 each) to collect customer data, while 15% (n=4) were unsure what kind of collection they use. No respondents use facial recognition.
    Form of data collection        Percentage   Respondents
    Loyalty Cards                  7.69%        2
    In Store Tracking              38.46%       10
    On-line Purchasing Tracking    19.23%       5
    Facial Recognition             0%           0
    Social Media                   19.23%       5
    Unsure                         15.38%       4
    Table 21
    Figure 21 Question 3: What form of data collection do you use? (bar chart of the six categories above; vertical axis 0% to 45%)
  • 52. 52 Question 4: In your opinion, has Data Analytics changed how your organisation collects customer information?
    This question was to find out whether using data analytics has changed how the shop collects customer information. It is linked to Question 3 in the methodology section: Are Irish retailers using data analytics to its full potential? All 26 respondents responded to this question, skipping no parts. As seen in Figure 22, nearly 31% (n=8) said yes, while a majority of 54% (n=14) said no, data analytics has not changed how they collect customers' data. A further 15% (n=4) said yes and gave a reason, as listed below.
    1. More emphasis on different types of marketing according to the information.
    2. Better structured and organized.
    3. It allows us to see what are the most popular SKUs by region and by month to allow us to project forward manufacturing.
    4. Offers targeted at certain demographics.
    Answer                     Percentage   Respondents
    Yes                        30.77%       8
    No                         53.85%       14
    Yes (Please give reason)   15.38%       4
    Table 22
    Figure 22 Question 4: Has data analytics changed how your organisation collects customer information? (chart of Yes 30.77%, No 53.85%, Yes (please give reason) 15.38%)
  • 53. 53 Question 5: What, if any, do you think are the benefits of retailers sharing data (such as POS, inventory, and customer loyalty) with suppliers?
    This question was linked with Question 3 of the methodology section: Are Irish retailers using data analytics to its full potential? All 26 respondents responded to this question, skipping no parts. As seen in Figure 23, this is very close to what companies use to collect data. "Strengthens relationships with suppliers" and "Retailers' strategies can benefit from suppliers' product knowledge" each received 19% (n=5 each). "Helps increase sales" received 23% (n=6). "Suppliers can forecast and meet customer demand" received the highest share with 35% (n=9). Only 1 retailer answered that there are no benefits, at 4% (n=1).
    Figure 23 What, if any, do you think are the benefits of retailers sharing data (such as POS, inventory, and customer loyalty) with suppliers (pie chart: Strengthens relationships with suppliers 19%; Helps increase sales 23%; Suppliers can forecast and meet customer demand 35%; Retailers' strategies can benefit from suppliers' product knowledge 19%; There are no benefits 4%)
  • 54. 54 Benefit                                                                 Percentage   Respondents
    Strengthens relationships with suppliers                                19.23%       5
    Helps increase sales.                                                   23.08%       6
    Suppliers can forecast and meet customer demand.                        34.62%       9
    Retailers' strategies can benefit from suppliers' product knowledge.    19.23%       5
    There are no benefits.                                                  3.85%        1
    Table 23
    Question 6: How does Data Analytics help retailers manage product availability for customers?
    This question, as seen in Figure 24, asks about product availability in stores: whether data analytics helps by keeping stock to a minimum, by reducing out of stock situations or by predicting future demand. All 26 respondents responded to this question, skipping no parts. This question was linked with Question 3: are Irish retailers using data analytics to its full potential? "By predicting future demand" was by far the most common response with 65% of respondents (n=17), while "By reducing out of stock situations" had 23% of the respondents (n=6). "Ensuring the shop is not over stocked" was not as popular, with 12% (n=3). "Other" had no respondents.
  • 55. 55 Answer                                   Percentage   Respondents
    By predicting future demand              65.38%       17
    By reducing out of stock situations      23.08%       6
    Ensuring the shop is not over stocked    11.54%       3
    Other                                    0%           0
    Table 24
    Figure 24 How does Data Analytics help retailers manage product availability for customers? (chart of the four answers above: 65.38%, 23.08%, 11.54%, 0%)
  • 56. 56 Question 7: How important do you consider the use of Data Analytics for retailers to gain a competitive advantage over their competitors?
    This question was linked with Question 3, Are Irish retailers using data analytics to its full potential? All 26 respondents responded to this question, skipping no parts. This question looked into how important the retailers questioned considered the use of data analysis in gaining an advantage over competitors. Not surprisingly, 65% (n=17) found data analytics very important in gaining an advantage over competitors. 23% (n=6) considered it moderately important, while 8% (n=2) considered it important. Only 4% (n=1) considered the use of data analysis for retailers to gain a competitive advantage of little importance, while no respondent chose "Not important", as seen in Table 25.
    Answer                Percentage  Respondents
    Very Important        65.38%      17
    Moderately Important  23.08%      6
    Important             7.69%       2
    Of little importance  3.85%       1
    Not important         0%          0
    Table 25
    Figure 25 How important do you consider the use of Data Analytics for retailers to gain a competitive advantage over their competitors? (Pie chart: Very Important 65%; Moderately Important 23%; Important 8%; Of little importance 4%.)
  • 57. 57 As seen in Figure 2 Cronbach's Alpha is .607, which is within suitable parameters.
    Question 8: How important is cost in implementing Data Analytics?
    This question, as seen in Figure 27, asks about the importance of cost when implementing data analysis. This question is linked with Question 2 in the methodology section, Is it cost based not to use data analytics? The graph shows that 42% (n=11) think that cost is very important when it comes to implementing data analysis. This was about 12 percentage points higher than the share who thought it was important, at 31% (n=8), which in turn was just higher than moderately important at 27% (n=7). No respondent chose "Of little importance", "Not important" or "Other".
    Figure 27 How important is cost in implementing Data Analytics? (Bar chart, 0% to 45%: Very Important 42.31%; Moderately Important 26.92%; Important 30.77%; all other answers 0%.)
    Figure 26 How important do you consider the use of Data Analytics for retailers to gain a competitive advantage over their competitors?
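For readers unfamiliar with the reliability statistic quoted on this slide: Cronbach's Alpha for k questionnaire items is k/(k-1) multiplied by (1 minus the sum of the item variances divided by the variance of the summed score). The sketch below is only an illustration of that formula. The function name `cronbach_alpha` and the toy scores are hypothetical, they are not the survey data, and the thesis does not state which software produced the .607 figure.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item scores.

    rows: one list per respondent, each holding k numeric item scores.
    Sample variance (n - 1 denominator) is used throughout.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Variance of each item (column) across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Variance of each respondent's total score.
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical Likert-style answers (5 respondents, 3 items), for illustration only.
toy_scores = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 4],
    [2, 3, 3],
    [4, 4, 5],
]
alpha = cronbach_alpha(toy_scores)
print(round(alpha, 3))
```

Values closer to 1 indicate that the items move together; the thresholds applied to the result are a matter of convention and are not set by the formula itself.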
  • 58. 58
    Answer                Percentage  Respondents
    Very Important        42.31%      11
    Moderately Important  26.92%      7
    Important             30.77%      8
    Of little importance  0%          0
    Not important         0%          0
    Other Please specify  0%          0
    Table 27
    Question 9: What are the barriers to adopting Data Analytics?
    This question is linked to Question 2 in the methodology section, Is it cost based not to use data analytics? All 26 respondents responded to this question, skipping no parts. As seen in Figure 28, the answers to question 9, on barriers to adopting data analysis, reveal one of the issues facing Irish retailers. 12 respondents (46.15%) find a lack of understanding of big data and its uses to be a barrier. That is worrying, as Irish retailers risk being left behind by larger international retailers who have a better understanding of data analytics and for whom cost is no object. 19.23% of retailers questioned said that cost is a barrier in adopting data analytics.
    Figure 28 Barriers adopting data analysis. Source: Flood, P. (Pie chart: Cost 19%; Staff Training 16%; Lack of understanding of big data and its uses 46%; Problems with database software 15%; Other (please specify) 4%; Not usable for customer 0%.)
  • 59. 59 (Millman 2014) reported that in a 2014 EMC survey carried out in 300 UK businesses, a worrying 62% of retailers were without the skills required to comprehend the ethical, responsible and compliant use of customer data. While 62% of retailers lack understanding of data analytics, the same survey (Millman 2014) also claimed the key to new growth in retail is in the better use of customer insights, data trends and customer information.
    Answer                                         Percentage  Respondents
    Cost                                           19.23%      5
    Staff Training                                 15.38%      4
    Lack of understanding of big data and its uses 46.15%      12
    Problems with database software                15.38%      4
    Not usable for customer                        0%          0
    Other (please specify)                         3.85%       1
    Table 28
    Staff training was also high on the list at just over 15%. (Ridge et al 2015) wrote that executives who lacked an understanding of big data felt that the return did not warrant the investment in the first place. As seen in figure 29, Cronbach's Alpha reliability is .46.
    Figure 29 Cronbach's Reliability
    4.2 Themes That Emerged
    o Of the 26 retailers questioned, well over half (18 retailers) use data analytics.
    o While data analytics is popular, smaller retailers are more likely not to have data analytics.
    o Cost was a main factor when it came to implementing data analytics, as some retailers saw it as a barrier.
  • 60. 60 o 46% of retailers saw a lack of understanding of data analysis as a barrier.
    4.3 Results Conclusion
    The results for the 3 methodology questions are summarised below.
    1. What size shops are using data analytics? This question was answered in questions 1 and 2 of the questionnaire.
    o 69% of companies questioned used data analytics.
    o 10 retailers, or 62% of the retailers questioned that have a workforce of 0-50, use data analysis.
    o 8 retailers questioned do not use data analytics; 6 of these were small retailers with 0-50 workers.
    2. Is it cost based not to use data analytics? This question was answered in questions 8 and 9 of the questionnaire.
    o Retailers had data analytics in their stores but still had a lack of understanding of it.
    o 4 retailers had issues with database software that acted as a barrier.
    o 1 retailer cited adhering to the strict data protection laws as a barrier.
    o All 26 respondents said cost was an important factor in implementing data analysis.
    3. Are Irish retailers using data analytics to its full potential? This question was answered in questions 3, 4, 5, 6 and 7 of the questionnaire.
    o In-store tracking was the most popular form of tracking for retailers with 0-50 workers.
    o Out of the 6 stores with 250 workers and up, 2 were unsure what form of data analytics they used, and 1 store used loyalty cards, in-store tracking, online purchasing and social media.
    o Surprisingly, over half the retailers questioned did not think data analytics changed the way they collected customer data.
  • 61. 61 o The most common benefit retailers saw in sharing data was that suppliers can forecast and meet customer demand, while only 1 saw no benefit.
    o 17 out of 26 respondents said data analysis helps by predicting future demand.
    o Again, 17 respondents consider data analytics very important as a tool to gain a competitive advantage over their rivals.
    5.0 Discussion
    The objective of this chapter is to discuss the results of the survey in detail and relate them back to the findings in the literature review. The findings will be linked to past theory on the topic. The research questions from the methodology section will be answered to give a current picture of how big data is used by Irish retailers.
    Question 1: What size shops are using data analytics?
    This question was made up of 2 questions in the survey, question 1 and question 2.
    o What is the size of your company?
    o Does your organisation use Data Analytics?
    Figure 30 Question 1 What is the size of your company? (Bar chart: 0-50 Employees 61.54%; 50-250 Employees 15.38%; 250 plus Employees 23.08%.)
    With 69% of the retailers questioned using some form of big data, this research paper shows that data analytics is popular in Ireland. In the
  • 62. 62 literature review (Hawkins 2012) claimed that larger retailers who can afford to exploit big data by using sophisticated data collection software will leave smaller independent retailers behind. The results collected show that:
    o Retailers with 0-50 employees: 10 of the 16 small retailers surveyed use data analytics, with 6 retailers not using data analytics, as seen in figure 18.
    o Retailers with 50-250 employees: all 4 medium retailers surveyed use data analytics, as seen in figure 19.
    o Retailers with 250 plus employees: 4 out of 6 larger retailers surveyed use data analytics, with 2 not using data analytics, as seen in figure 20.
    These results disagree with earlier research in the literature review, where (Simon 2013) reported that of 541 small to medium UK companies, none were thinking of taking advantage of big data.
    Figure 31 Question 2 Does your shop use data analytics? (Pie chart: Yes 69%; No 31%.)
  • 63. 63 This shows that since Hawkins spoke in 2012, smaller independent retailers have kept up with larger retailers in using big data. Using big data does not require a retailer to spend large amounts of money to purchase sophisticated data collection software. This also agrees with (Schaeffer 2016), who claimed that retailers' big data pecking order is less about the size of the IT budget and more about the retailer's inclination towards innovation and agility.
    As seen in Figure 33, all medium sized retailers questioned used data analytics. (Davis 2015) wrote that 95% of small to medium enterprises (SME) use some form of big data. Medium retailers tend to have a POS system that gives the option of collecting customer data, or a loyalty card system, for example the system that Superquinn introduced in 1993, as described in the literature review.
    Figure 32 Retailers with 0-50 employees that use data analytics (Pie chart: Yes 62%; No 38%.)
    Figure 33 Retailer with 50 - 250 employees that use data analytics (Pie chart: Yes 100%; No 0%.)
  • 64. 64 What was very surprising among the retailers with 250 plus employees was that 2 out of 6, or 33% of respondents, do not use data analytics. This should put those retailers at a disadvantage, as competitors are able to use customer data, sometimes from the very same customers, against them for competitive advantage.
    While 69% of the retailers surveyed in the questionnaire used data analytics, 31%, or 8 out of the 26 surveyed, had no form of data analytics. Analytics are essential today to sustain a competitive advantage. (Conley 2016) explained that technology, data and analysis can now be merged to give a powerful understanding of employees, associates and customers. (Conley 2016) also explained in the literature review that 61% of senior executives use data analysis to make corporate decisions, which backs up the figures in this research paper. The reason for the high proportion not changing could be down to old habits and an unwillingness to change: 'if it's not broken don't fix it'. Thinking like this could have a detrimental effect on an organisation. Larger retailers could also have an old database that is unsuitable for data analytics without a considerable amount of investment.
    Figure 34 Retailer with 250 plus employees that use data analytics (Chart: Yes 4; No 2.)
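The size breakdown discussed under Question 1 can be re-derived from the counts quoted in the text. The short Python sketch below is an illustration only; the variable and function names are hypothetical, and the counts are the ones reported on slides 62 to 64.

```python
# (users of data analytics, retailers surveyed) per size band, as reported in the text.
adoption = {
    "0-50 employees": (10, 16),
    "50-250 employees": (4, 4),
    "250 plus employees": (4, 6),
}


def rate(users, total):
    """Share of retailers using data analytics, as a percentage."""
    return 100 * users / total


rates = {band: rate(users, total) for band, (users, total) in adoption.items()}
total_users = sum(users for users, _ in adoption.values())
total_surveyed = sum(total for _, total in adoption.values())
overall = rate(total_users, total_surveyed)

for band, pct in rates.items():
    print(f"{band}: {pct:.1f}%")
print(f"All retailers: {total_users} of {total_surveyed} = {overall:.1f}%")
```

The three bands add up to the 18 of 26 retailers behind the 69% headline figure, and the per-band rates line up with the values shown in Figures 32 to 34.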
  • 65. 65 Question 2: Is it cost based not to use data analytics?
    This question was made up of 2 questions in the survey, question 8 and question 9.
    Q.8 How important is cost in implementing Data Analytics?
    o Retailers with 0 – 50 employees, as seen in figure 35
    - 6 said cost was very important
    - 4 said cost was moderately important
    - 6 said cost was important
    o Retailers with 50 – 250 employees, as seen in figure 35
    - 1 said cost was very important
    - 2 said cost was moderately important
    - 1 said cost was important
    o Retailers with 250 plus employees, as seen in figure 35
    - 3 said cost was very important
    - 0 said cost was moderately important
    - 3 said cost was important.
    Figure 35 Retailer Size for Cost Importance (Bar chart, "Is it cost based not to use data analytics?": counts of Very Important, Moderately Important and Important for 0 - 50, 50 - 250 and 250 plus employees.)
  • 66. 66 It is clear from the responses to question 8 that all 26 retailers surveyed think cost is important when implementing data analytics in retail, as seen in figure. 42% of all respondents thought it was very important, as seen in figure 36. (Bantleman 2012) wrote about the price of having Hadoop as the main tool, noting that only large retailers could afford the millions that it would cost to run it. A petabyte Hadoop cluster at $1 million per year and another $1 to run is out of the budget for most Irish retailers. That is why cost is so important.
    As seen in question 3 in the survey, a lot of the retailers questioned use in-store tracking and social media. Furthermore, (Jain 2014) disclosed other cost effective ways of having big data in retail. Social media, though not a BI tool, can be an easy form of data collection for analytics. Data can be gathered when customers click 'like' on an item on social media, or through online tracking when a customer clicks on an item, purchases an item, or simply from a customer's browsing behaviour.
    The main barrier reported by the 26 retailers was the lack of understanding of data analytics and its uses. Retailers of all sizes had this as an issue. (Baldwin 2014) said 23% of UK retailers can instantly make sense of the data made accessible to them.
    Figure 36 How important is cost in implementing Data Analytics? (Bar chart, 0% to 45%: Very Important 42.31%; Moderately Important 26.92%; Important 30.77%; all other answers 0%.)
  • 67. 67 Q.9 What are the barriers to adopting Data Analytics?
    o Retailers with 0 – 50 employees, as seen in figure 37
    - 2 said cost was a barrier
    - 3 said staff training was a barrier
    - 7 said a lack of understanding of data analytics and its uses was a barrier
    - 3 had problems with database software
    - 1 had other, citing adherence to the strict data protection laws as a barrier
    o Retailers with 50 – 250 employees, as seen in figure 37
    - 2 said cost was a barrier
    - 2 said a lack of understanding of data analytics and its uses was a barrier
    - All the others were 0
    Figure 311 Question 9 Barriers in adopting data analysis? (Pie chart: Cost 19%; Staff Training 16%; Lack of understanding of big data and its uses 46%; Problems with database software 15%; Other (please specify) 4%; Not usable for customer 0%.)
  • 68. 68 o Retailers with 250 plus employees, as seen in figure
    - 1 said cost was an issue
    - 1 said staff training was a barrier
    - 3 said a lack of understanding of data analytics and its uses was a barrier
    - 1 had problems with database software
    As seen in figure 38, a lack of understanding is especially high in smaller independent retailers at 44%. Most of these retailers will have data analytics in their stores. Should the retailers spend more on staff training to alleviate the lack of understanding? Even among the larger retailers questioned, 50% cited a lack of understanding of big data as a barrier. One would expect these retailers to have better training, or to have the pick of better graduates who understand and have been trained in data analytics. Again, smaller retailers had database problems as a barrier. This is no surprise, as a retailer of that size may not have the IT database structure needed for data analytics.
    Figure 38 What are the barriers to adopting Data Analytics? (Bar chart: counts of Cost, Staff Training, Lack of Understanding, Database Issues and Other for 0 - 50, 50 - 250 and 250 plus employees.)
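The counts listed for Question 9 can be turned into the per-size percentages discussed on the following slides, and cross-checked against the overall totals in Table 28. The Python sketch below is a minimal illustration under that reading of the data; the names `BARRIERS`, `counts` and `row_percentages` are hypothetical.

```python
BARRIERS = ["Cost", "Staff training", "Lack of understanding", "Database issues", "Other"]

# Answers to Question 9 per size band, in the order of BARRIERS (slides 67 and 68).
counts = {
    "0-50 employees": [2, 3, 7, 3, 1],
    "50-250 employees": [2, 0, 2, 0, 0],
    "250 plus employees": [1, 1, 3, 1, 0],
}


def row_percentages(row):
    """Convert one size band's counts into percentages of that band."""
    total = sum(row)
    return [100 * n / total for n in row]


shares = {band: row_percentages(row) for band, row in counts.items()}
# Summing each barrier across the size bands should reproduce Table 28.
column_totals = [sum(col) for col in zip(*counts.values())]

for band, pcts in shares.items():
    cells = ", ".join(f"{name} {pct:.2f}%" for name, pct in zip(BARRIERS, pcts))
    print(f"{band}: {cells}")
print(dict(zip(BARRIERS, column_totals)))
```

Normalising within each size band, rather than over all 26 respondents, is what allows the small, medium and large retailers to be compared despite the very different group sizes (16, 4 and 6).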
  • 69. 69 To break figure 38 down further into percentages for each separate section, it is clearly visible that a lack of understanding is the biggest barrier, followed closely by staff training and database issues. As claimed by (Forrester Consulting 2014), using data analytics can increase profit margins by 60%. Is investing in cost, staff training and databases worth it for retailers to get a good end product? The lack of understanding is also noted by (Forrester Consulting 2014) in the same article: 68% of retail CIOs said they collect data, but agreed they were not maximising its full potential. Staff training would help reduce this figure.
    Figure 313 Retailers with 0 - 50 employees (Pie chart: Cost 12%; Staff Training 19%; Lack of understanding 44%; Database issues 19%; Other 6%.)
    The 'other' barrier was ensuring adherence to the strict data protection laws. Data protection and big data are very important, and organisations should think about them hand in hand. In Ireland the relevant legislation is the Data Protection Act (DPA) of 1988 and 2003, which sets regulatory requirements when it comes to handling private data. Data protection is vitally important for customers and retailers alike. It protects the customer's data and gives retailers guidelines to follow when collecting the data in a lawful way. (AL Goodbody) gave key guidelines for organisations to follow:
    o Who controls the data:
    - An organisation should have a data controller to regulate the rationale for which the data is processed.
    - A data processor is a third party who administers the data on behalf of the data controller.
  • 70. 70 - A written agreement needs to be in place between the data controller and the data processor in relation to obtaining, retaining, retrieving and deleting data.
    o Registration with the Office of the Data Protection Commissioner (ODPC):
    - Normally all data controllers and data processors have to register with the ODPC unless exempt.
    o Consent with data subjects:
    - Consent is the standard way to legitimise data processing.
    - The Article 29 Working Party, which includes several data protection regulators of the EU, said that consumers' "specific, explicit, consent" is needed for organisations to use customer data in big data projects.
    o Appropriate security measures:
    - Data controllers must ensure that suitable security measures are in place for the nature of the data stored.
    - This includes a level of security appropriate to the harm that might result from any unauthorised or unlawful processing, or accidental or unlawful destruction of data.
    o Transfer of data:
    - Under the DPA, the transfer of customers' data to a country outside the EU is generally prohibited unless that country is permitted and guarantees an acceptable level of protection for the confidentiality and fundamental rights and freedoms of the data subjects, or certain other conditions are met.
    o Appropriate data protection policy:
    - Organisations that collect and process customers' data must have in place an appropriate data protection policy.
    - Some data may have been collected when no data protection policies were in place.
    - For particular big data projects, a customised data protection policy may be put in place.
  • 71. 71 o De-identification of data:
    - Under Irish data protection law, it is conceivable for organisations to accomplish legally permissible de-identification.
    - Nevertheless, organisations looking at de-identification to circumvent privacy and data protection laws should proceed carefully.
    The retailer who found adhering to data protection laws a barrier should look at the laws themselves. The laws are there to protect the customers that use that shop and to protect the retailer itself. These laws are not there to hold back organisations but to help organisations understand the importance of privacy when it comes to customer data.
    For retailers with 50 – 250 employees, the barriers are cost and lack of understanding, as seen in Figure 40.
    Figure 16 Retailers with 50 - 250 Employees (Pie chart: Cost 50%; Staff Training 0%; Lack of Understanding 50%; Database Issues 0%; Other 0%.)
  • 72. 72 Figure 41 Retailers with employees of 250 plus (Pie chart: Cost 16%; Staff Training 17%; Lack of Understanding 50%; Database Issues 17%.)
    Question 3: Are Irish retailers using data analytics to its full potential?
    This question was made up of 5 questions in the survey: questions 3, 4, 5, 6 and 7.
    o What form of data collection do you use?
    o In your opinion, has Data Analytics changed how your organisation collects customer information?
    o What, if any, do you think are the benefits of retailers sharing data (such as POS, inventory, and customer loyalty) with suppliers?
    o How does Data Analytics help retailers manage product availability for customers?
    o How important do you consider the use of Data Analytics for retailers to gain a competitive advantage over their competitors?