“Generative AI in research” training series
Ana Isabel Canhoto
Professor of Digital Business
University of Sussex Business School
Deputy Director of the SeNSS Doctoral Training Partnership
a.i.canhoto@sussex.ac.uk
www.anacanhoto.com
@acanhoto.bsky.social
Perceptions of GenAI
AI Literacy
• Generative AI (GenAI) is revolutionising how we do, and think about, research
• Circa 60,000 papers published in 2023 may have been written using LLMs (1)
• Up to 16.9% of reviews at leading AI conferences “substantially generated” by LLMs (2)
(1) https://arxiv.org/pdf/2403.16887; (2) https://arxiv.org/pdf/2403.07183
GenAI in research work
Authors Problematisation Lit Review Data source Data analysis Writing Dissemination
Bing (2023) x
Bokolo & Liu (2024) x
Burger et al (2023) x
Cao & Cao (2023) x
Cherinka & Prezzama (2023) x x
Combrinck (2024) x
De Paoli (2024) x
Goetschalckx et al (2021) x
Guerberof-Arenas & Asimakoulas (2023) x
Hong & Wang (2024) x
Karinshak et al (2023) x
Kupferschmidt et al (2024) x
Lu et al (2024) x
M. Foisy et al (2024) x
Maathuis & Kerkhof (2024a) x
Maathuis & Kerkhof (2024b) x
Nguyen et al (2024) x
Pividori & Greene (2023) x
Rathje et al (2024) x
Sebastian et al (2025) x
Steinmacher et al (2024) x
Stokel-Walker & Van Noorden (2023) x x x
Telitsyna (2024) x x x x
Vani Lakshmi et al (2024) x
Wang et al (2024) x
Yegemberdiyeva & Amirgaliyev (2021) x
Zhao et al (2025) x
Totals: Problematisation 3; Lit Review 3; Data source 5; Data analysis 15; Writing 4; Dissemination 3
• Improved readability
• Detected an error in an equation
• Required an average of 5 minutes and $0.50 per manuscript to review
• Faster, more accurate and more consistent than humans at qualitative data analysis
• Achieves relatively high accuracy across many languages… may help facilitate more cross-linguistic research with understudied languages
• Could address common methodological challenges like recruitment difficulties, representational shortcomings, and respondent fatigue
GenAI in research work
Social Sciences lagging behind:
• Raman (2023): Predominantly Biomedical and Clinical Sciences, as well as Information and Computing Sciences, engaging with AI tools
Gender gap:
• Ofcom (2024): 50% of males aged 16+ vs 33% of females
• World Economic Forum (2024): 71% of 18-24 y.o. males vs 59% of females
Role of self-confidence:
• Lee et al (2025): Higher confidence in GenAI => LESS critical thinking vs higher self-confidence => MORE critical thinking
A socio-technical view of GenAI
[Diagram labels: GenAI weaknesses; GenAI strengths; Effective use; Ineffective use]
A (very) quick primer on AI
• AI as a system: Data + Algorithm + Output
• Effective use: Fit between characteristics of each system component and the task at hand
  • PhD / Research work
Data inputs → (are used in…) → Computation → (result in…) → Output
Adapted from https://doi.org/10.1016/j.jbusres.2020.10.012
A (very) quick primer on AI
• AI as a system: Data + Algorithm + Output
• An algorithm is a recipe
• When input is X => produce output Y
• With machine learning, the computer “discovers” its own recipe
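The difference between a hand-written recipe and a "discovered" one can be sketched in a few lines of Python. This is only an illustration: the temperature examples and the function names are assumptions chosen for clarity, and the learning step is ordinary least squares for a single input, which is far simpler than the models discussed later.

```python
# Hand-written recipe: a person decides the rule in advance.
def fahrenheit_by_hand(celsius: float) -> float:
    return celsius * 9 / 5 + 32


# Learned recipe: the rule (slope and intercept) is estimated from examples.
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least squares for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative training examples (input X, observed output Y).
celsius_examples = [0.0, 10.0, 20.0, 30.0, 40.0]
fahrenheit_examples = [32.0, 50.0, 68.0, 86.0, 104.0]

slope, intercept = fit_line(celsius_examples, fahrenheit_examples)


def fahrenheit_learned(celsius: float) -> float:
    # Applies the rule that was estimated from the examples above.
    return slope * celsius + intercept


print(fahrenheit_by_hand(100.0), fahrenheit_learned(100.0))
```

Both functions give matching answers here because the examples follow the rule exactly. With noisy or unrepresentative examples, the learned recipe would reflect those flaws.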
Type of ML algorithm: description and best application of this type of ML
• Supervised: Learn the patterns that connect the inputs with the outputs, and develop rules to be applied to future instances of the same problem
  • Regression, e.g., predict inflation
  • Classification, e.g., medical imaging
• Unsupervised: Learn how the data points are related to each other, and different from others
  • Clustering, e.g., segmentation
  • Association, e.g., market basket analysis
• Reinforcement: Find the best combination of data points/actions to achieve a particular goal
  • Classification, e.g., GPS navigation
  • Control, e.g., emergency systems
Adapted from: https://anacanhoto.com/2020/02/21/types-of-machine-learning-a-crib-sheet-for-marketers/
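As a rough illustration of unsupervised learning used for segmentation, the sketch below groups customers by spend without being told which group anyone belongs to. The spend figures and the `two_means` helper are hypothetical, and the method is a bare-bones k-means with two clusters, not a production technique.

```python
def two_means(values: list[float], iterations: int = 10) -> tuple[list[float], list[float]]:
    """Split numbers into two clusters with a basic k-means loop (k = 2)."""
    # Start with the smallest and largest values as the two cluster centres.
    low_centre, high_centre = min(values), max(values)
    low_group, high_group = list(values), []
    for _ in range(iterations):
        # Assign each value to whichever centre is closer.
        low_group = [v for v in values if abs(v - low_centre) <= abs(v - high_centre)]
        high_group = [v for v in values if abs(v - low_centre) > abs(v - high_centre)]
        if not high_group:  # all values identical: nothing to separate
            break
        # Move each centre to the mean of the values assigned to it.
        low_centre = sum(low_group) / len(low_group)
        high_centre = sum(high_group) / len(high_group)
    return low_group, high_group


# Hypothetical monthly spend per customer; no labels are provided.
monthly_spend = [12.0, 15.0, 14.0, 90.0, 95.0, 88.0]
low_spenders, high_spenders = two_means(monthly_spend)
print(low_spenders, high_spenders)
```

The algorithm only finds that the data points fall into two groups. Deciding what the groups mean (for example, "occasional" vs "frequent" buyers) remains the researcher's job.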
Deep learning: Generative AI - LLMs
• Text from Wikipedia, media articles, papers, etc., for training
• Calculate probability of tokens (words) being together
• Tokens (words) that usually follow the tokens in the prompt
Adapted from https://numind.ai/blog/what-are-large-language-models
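The idea of "calculating the probability of tokens being together" can be sketched with a toy word-pair (bigram) counter. This is a deliberate simplification: real LLMs use neural networks over sub-word tokens and much longer contexts, and the training text and function names below are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text: str) -> dict[str, Counter]:
    """Count, for every word, which words follow it in the training text."""
    tokens = text.lower().split()
    following: dict[str, Counter] = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        following[current][nxt] += 1
    return following


def next_token_probabilities(model: dict[str, Counter], token: str) -> dict[str, float]:
    """Turn the counts for one word into probabilities for the next word."""
    counts = model.get(token.lower(), Counter())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


# Illustrative training text (an assumption, not real training data).
training_text = "the doctor is happy the nurse is happy the doctor is busy"
model = train_bigram_model(training_text)

print(next_token_probabilities(model, "is"))
print(next_token_probabilities(model, "the"))
```

The model has no notion of whether the doctor really is happy. It only reports which words most often followed "is" in the text it was given, which is why the output reflects the training data rather than knowledge.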
Model fine-tuning
Adapted from https://www.youtube.com/watch?v=fwaDtRbfioU&list=PLuD_SqLtxSdVDcrCYIHayTL91DapuIHrO&index=11
In summary
• GenAI models are (very powerful) statistical models, not knowledge models
  • Do not use them if you cannot assess the veracity of the output
• The models reflect the data used in the training
  • We can get new combinations, but not new knowledge
  • Beware of gaps and embedded bias
• The models reflect the fine-tuning
  • Be aware of model fit, who provides feedback, commercial interests, political influence, etc…
• Closer fit with task domain and domain-specific data => better model performance
  • If your domain is poorly reflected in the training data, the result will be poor
“Please translate this into European Portuguese: ‘The doctor is happy; the nurse is happy’”
O médico (M) está feliz; a enfermeira (F) está feliz.
A socio-technical view of GenAI
• Discuss in groups (10 minutes):
  • For which typical PhD tasks is the use of GenAI effective vs ineffective?
  • Can the use of GenAI be compatible with the learning outcomes for a PhD?
    • e.g., Edinburgh Business School:
GenAI capabilities
GenAI strengths
• Summarisation
• Provision of (verifiable) common sense examples, metaphors…
• Brainstorming / Experimentation
• Specialised GenAIs
  • e.g., Deep Research, NotebookLM
• Machine learning => use as interlocutor / coach
Principles of experimentation
• Articulate your expectations
• Be as precise as possible with your prompts
• Be patient. There is a learning curve
• Learn from others
• Learn from yourself (reflective journal)
• Develop ethical awareness
GenAI capabilities
GenAI strengths
• Summarisation
• Provision of (verifiable) common sense examples, metaphors…
• Brainstorming / Experimentation
• Specialised GenAIs
  • e.g., Deep Research, NotebookLM
• Machine learning => use as interlocutor / coach
GenAI weaknesses
• Hallucinations => bad for use outside of your domain knowledge
• Tasks where absence of bias is important
• Algorithm censoring (e.g., Grok / DeepSeek)
• Machine learning => poor replicability
• Plagiarism machines => not original; can’t use outputs directly
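One reason for poor replicability is that text generators typically sample the next token from a probability distribution instead of always returning the same one. The sketch below shows the effect with Python's standard `random` module. The probabilities and function names are hypothetical, and hosted GenAI tools may not expose an equivalent seed control.

```python
import random

# Hypothetical next-token probabilities, for illustration only.
probabilities = {"happy": 0.5, "busy": 0.3, "tired": 0.2}


def sample_next_token(token_probabilities: dict[str, float], rng: random.Random) -> str:
    """Pick one token at random, weighted by its probability."""
    tokens = list(token_probabilities)
    weights = list(token_probabilities.values())
    return rng.choices(tokens, weights=weights, k=1)[0]


def generate(seed: int, length: int = 5) -> list[str]:
    """Sample a short sequence using a random generator fixed by `seed`."""
    rng = random.Random(seed)
    return [sample_next_token(probabilities, rng) for _ in range(length)]


# Same seed: the run can be repeated exactly.
print(generate(seed=1))
print(generate(seed=1))
# Different seed: the sequence may differ, although the "prompt" is unchanged.
print(generate(seed=2))
```

For research purposes, this suggests recording the model version, the prompt, and any sampling settings that are available, since the same prompt alone does not guarantee the same output.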
DOI: 10.25300/MISQ/2021/16553
A socio-technical view of GenAI
[Diagram labels: GenAI weaknesses; GenAI strengths; Effective use; Ineffective use]
E.g., Electricity
Century: XIX, XX, XXI
The TFI framework
• Anthropologist Edward T. Hall: All humans operate on, and understand the world in, three levels: technical, formal and informal
A socio-technical view of GenAI
[2x2 matrix: GenAI strengths vs GenAI weaknesses, crossed with Aligned with social norms vs Against social norms; quadrants: Effective use, Ineffective use, Unfair use, Futile use]
Formal norms for GenAI use
• Direct: Scarce (and vague) formal norms
  • e.g., ESRC
• Indirect:
  • Data protection policies
  • Academic misconduct
  • Etc…
Informal norms for GenAI use
Consider carefully:
• Viva examiners’ view of “original contribution”
• Journal editors, e.g., Davison et al (2024), ISJ editorial: “The researcher has to take epistemic responsibility which involves being accountable to the evidence where evidence is relationally constituted between the researcher and the researched and to assume responsibility for what the researcher claims to know”
• Society’s views of research work
• Etc…
Source: DOI:10.13140/RG.2.2.28031.05287
The informal layer of GenAI
• Personal values:
Source: https://uottawa.libguides.com/generative_ai/costs
The normative layer of GenAI
Aligned with social norms
• Summarization for personal use
  • But not for abstracts
• To improve text readability for non-academic audiences
  • But not for academic ones
• Exploration, e.g., impact 1 vs 5 years?
• Partners in dialogue
Against social norms
• Impersonation (academic misconduct)
• Should not use as search engines (environmental impact)
• Data analysis?
  • Quite granular outcomes
  • Data protection
  • Explainability
• NOT mechanisms to automate academic labour
A socio-technical view of GenAI
[2x2 matrix: GenAI strengths vs GenAI weaknesses, crossed with Aligned with social norms vs Against social norms; quadrants: Effective use, Ineffective use, Unfair use, Futile use]
• (W) Write abstracts for journal articles
• (W) Define concepts
• (W) Create references list
• (W) Produce public engagement materials
• (DA) Additional round of qualitative data analysis of anonymized data, if access to citations
• (DA) Additional round of qualitative data analysis of anonymized data, if NO access to citations
• (DA) Only round of data analysis
• (DA) Analysis of non-anonymized data
Resources
“Generative AI in research”
training series
26th March 2025
The ethics of using AI in research
Professor Fragkiskos Filippaios

Introduction to generative AI for PhD students
