The role of data quality in managerial decision making
Ana Isabel Canhoto
University of Sussex Business School
a.i.canhoto@sussex.ac.uk
www.anacanhoto.com;
@canhoto
Service Automation
[Diagram elements: Customer data; Machine Learning; Customer insight; Service automation (e.g., customer service, recommendations, customer screening, market research…); link label: "generates"]
© Ana Isabel Canhoto, 2023
The promise of Big Data + ML
FIG. 1. The encoding of platform participation by social media.

Viewed in this light, social media establish online a drastically simplified version of social interaction and communication. Essentially, on social media basic things or entities such as users, comments, photos, posts are all classified as data objects and every activity connecting two objects as action. For instance, Facebook defines status updates, pictures, videos, etc. as objects because in this way objects can be connected, or, as Facebook calls it, edged (Bucher 2012). Through this elementary syntax, every action undertaken on Facebook generates an edge, that is, a link connecting two objects. “Liking an object, tagging a photo, leaving a comment, these are all edge generators.”[3] Encoding activities such as sharing, tagging, liking, and so on provide connections between two objects that can be further computed (see Figure 2). By processing the data resulting from the encoding of user interaction, the system is able to extract potentially meaningful sets of information on user behavior. For instance, in the case of Facebook, connections or edges are ranked under different criteria, such as how recent they are (what Facebook calls time decay), or how close the two end-users connected are (what Facebook calls affinity) (see Bucher 2012).

FIG. 2. The exemplified script of social interaction as encoding of data.

[3] See for instance Taylor (2011) “Everything you need to know about Facebook’s EdgeRank” The
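
The excerpt above describes actions being encoded as edges between two objects and then ranked by recency (time decay) and closeness (affinity). A minimal Python sketch of that idea follows. It is an illustration only, not Facebook's actual algorithm: the `Edge` fields, the half-life decay function, the multiplication of the two criteria and all sample values are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    actor: str        # object that initiated the action, e.g. a user
    target: str       # object acted upon, e.g. a photo or a post
    action: str       # "like", "tag", "comment", ...
    affinity: float   # hypothetical closeness score between the end-users, 0..1
    age_hours: float  # time elapsed since the action


def time_decay(age_hours: float, half_life_hours: float = 24.0) -> float:
    """Hypothetical decay: the weight halves every `half_life_hours`."""
    return 0.5 ** (age_hours / half_life_hours)


def score(edge: Edge) -> float:
    """Combine the two criteria named in the excerpt (assumed: simple product)."""
    return edge.affinity * time_decay(edge.age_hours)


def rank_edges(edges):
    """Return edges ordered from highest to lowest score."""
    return sorted(edges, key=score, reverse=True)


# Made-up sample data: three actions of different age and affinity.
edges = [
    Edge("alice", "photo_1", "like", affinity=0.9, age_hours=48.0),
    Edge("bob", "post_7", "comment", affinity=0.4, age_hours=0.0),
    Edge("carol", "photo_1", "tag", affinity=0.6, age_hours=24.0),
]
ranked = rank_edges(edges)
print([e.actor for e in ranked])
```

In this made-up sample, the fresh comment from a weaker tie outranks the two-day-old like from a close tie, which shows how the choice of encoding and weighting shapes what the system treats as meaningful.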
The promise of Big Data + ML
Source: https://doi.org/10.1016/j.bushor.2019.11.003
Type | Description | Examples
Historical | Records of past events stored in internal or external databases | Customer’s past transaction data; external credit rating information
Real time | Activity data collected via sensors or by online trackers | Beacons in stores, or tracking of online activity
Knowledge | Records of outcomes of past problem-solving exercises | Past product recommendations which were accepted or rejected
The promise of Big Data + ML
• Novel insight
• Cost savings
• Branding
Source: https://doi.org/10.1016/j.indmarman.2021.11.001
The banking project
More: https://www.starlingbank.com/docs/reports-research/StarlingGenderRepresentationReport.pdf
The banking project
Generative AI
Source: https://www.jamaissanselles.fr/biais-intelligence-artificielle/
The handful of datasets that rule our lives
"Over 50% of dataset usages in PWC [Papers With Code] as of June 2021 can be attributed to just twelve institutions. Moreover, this concentration… has increased to over 0.80 in recent years (Figure 3 right red)"
Source: https://arxiv.org/abs/2112.01716
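
The quoted passage reports a concentration value of over 0.80, but the slide elides how concentration is measured. Assuming a Gini-style coefficient (0 for usage spread evenly across institutions, approaching 1 when a few dominate), the sketch below shows how such a figure can be computed. The function and the usage counts are hypothetical and are not taken from the cited source.

```python
def gini(counts):
    """Gini coefficient of non-negative counts.

    0 means perfectly even; values approaching 1 mean high concentration.
    """
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero count")
    # Standard formula on ascending values:
    # G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n
    weighted = sum(i * x for i, x in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Made-up dataset-usage counts per institution.
even = gini([10, 10, 10, 10])   # every institution used equally often
skewed = gini([1, 1, 1, 97])    # one institution dominates usage
print(even, skewed)
```

With these made-up counts, the even case gives 0 and the skewed case gives roughly 0.72, so a value above 0.80 would signal usage that is more concentrated still.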
The promise of Big Data + ML
It starts (and ends) with data
[Diagram elements: Customer data; Machine Learning; Customer insight; Service automation (e.g., customer service, recommendations, customer screening, market research…); link label: "generates"]
The data quality gap
• Quality of data needed to inform decision making
• Quality of data available to inform decision making
Data quality
Labels: Product, Production, Access, Use
Dimensions:
• Sound (What is shared)
• Dependable (How it is shared)
• Fit for use (What is gathered)
• Usable (How it is gathered)
Criteria:
• Achieve goal
• Accurate representation (identity, activities, state of mind…)
• Explains phenomenon
• Novel
• Timely
• Understandable
• Availability
• Ease of use
• Task fit (affordances)
• Netiquette (norms)
• Accessible
• Easy to integrate
• Cost effective
• High quality source
Adapted from: Kahn, B. K., Strong, D. M. & Wang, R. Y. (2002). Information quality benchmarks: Product and service performance. Communications of the ACM 45(4): 184-192.
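
One way to make the data quality gap concrete is to score, for each of the four dimensions on this slide, the quality needed for a decision against the quality of the data actually available. The sketch below is a hypothetical illustration: the 1 to 5 scale, the scores and the names `QualityAssessment` and `largest_gaps` are assumptions made for this example, not part of the framework.

```python
from dataclasses import dataclass


@dataclass
class QualityAssessment:
    dimension: str   # dimension name taken from the slide
    question: str    # what the dimension asks about, per the slide
    needed: int      # hypothetical 1-5 score: quality needed for the decision
    available: int   # hypothetical 1-5 score: quality of the data at hand

    @property
    def gap(self) -> int:
        """Shortfall of available against needed quality (never negative)."""
        return max(0, self.needed - self.available)


# Made-up scores for a single decision.
assessments = [
    QualityAssessment("Sound", "What is shared", needed=4, available=4),
    QualityAssessment("Dependable", "How it is shared", needed=5, available=3),
    QualityAssessment("Fit for use", "What is gathered", needed=5, available=2),
    QualityAssessment("Usable", "How it is gathered", needed=3, available=4),
]


def largest_gaps(items):
    """Dimensions with a shortfall, largest gap first."""
    with_gap = (a for a in items if a.gap > 0)
    return sorted(with_gap, key=lambda a: a.gap, reverse=True)


priorities = [(a.dimension, a.gap) for a in largest_gaps(assessments)]
print(priorities)
```

Clamping the gap at zero is a design choice: data that exceeds the needed quality on one dimension does not offset a shortfall on another.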
[Diagram "Data Qual": dimensions Fit for use, Usable, Sound, Dependable; labels Product, Production, Access, Use; criteria as listed below]
• Explains phenomenon
• High quality source
• Accurate representation
• Availability
• Accessible
• Easy to integrate
• Cost effective
• Netiquette (norms)
• Task fit (affordances)
• Ease of use
• Understandable
• Timely
• Novel
• Achieve goal
