Generative AI, Search Engines and GDPR

Professor David Erdos
Faculty of Law
University of Cambridge

Search Indexing & European DP: Timeline
• Mid-1980s – mid-1990s: Early concerns & some
regulation even of news archive searches
• Late 1990s-2000s: Search engines largely seen as “out of
reach”; some focus on limiting exposure e.g. no robots
• 2007/8 onwards: Spanish DPA (with some wider
support) sees engines as at least ex post controllers
• 2014-present: CJEU in Google Spain supports core
Spanish position; working out of reach & limitations

Generative AI & DP: Timeline
• Early 2023: Mass spread of generative AI, including in the
form of chatbots like OpenAI's ChatGPT & Google Bard
• March-July 2023: Investigations, including temporary
ban on ChatGPT in Italy; EDPB creates Taskforce
• June 2023: G7 DPA Statement on Generative AI
• October 2023: GPA Resolution on Generative AI Systems
“current law applies to generative AI products and users, even as
different jurisdictions continue to develop AI-specific laws and policies”

Generative AI & DP: Unclear Realities
• Legal Basis? - Other than “legitimate interest” (which is always
insufficient for special category data)
• Categories of Personal Data and Sources?
• Storage Periods?
• Accuracy/Data Quality? - “Bard .. sometimes gives inaccurate or
inappropriate results”
• Subject Rights? - “we may not be able to correct the accuracy … [i]n
that case, you may request that we remove your Personal Data from
ChatGPT’s output”

Can/are Search Engine Limits Relied Upon?
“Inasmuch as the activity of a search engine is … liable to affect
significantly and additionally compared with that of the publishers of
websites, the fundamental rights to privacy and to the protection of
personal data, the operator of the search engine … must ensure, within the
framework of its responsibilities, powers and capabilities, that the activity
meets the requirements of Directive 95/46 in order that the guarantees
laid down may have full effect and that effective and complete protection
of data subjects, in particular of the right to privacy, may actually be
achieved.” (Google Spain (2014) at [38])
• GC et al. (2019): Freedom of expression/information can be
invoked (although journalistic derogation not applicable)

Substantive Freedom of Expression Limits
• Sensitive Data – GC et al. (2019):
“[T]he operator must … ascertain, having regard to the reasons of
substantial public interest referred to in … Article 9(2)(g) of Regulation
2016/679 and in compliance with conditions laid down in those
provisions, whether the inclusion of that link in the list of results
displayed following a search on the basis of the data subject’s name is
strictly necessary for protecting the freedom of information of internet
users” (at [68])
• Accuracy – TU, RE v Google (2022):
“where, at the very least, a part – which is not minor in relation to the
content as a whole – of the information referred to in the request for
de-referencing proves to be inaccurate… the right to inform and the right to
be informed cannot be taken into account.” (at [64])

Responsibility Limitations: TU, RE (2022)
• Only Ex Post:
“the prohibitions and restrictions laid down by … the GDPR can apply to
that operator only by reason of that referencing and thus via a
verification, under the supervision of the competent national authorities,
on the basis of a request by the data subject.” (at [53])
• Without Active Investigatory Duties:
“operator cannot be required to play an active role in trying to find facts
which are not substantiated by the request for de-referencing.” (at [70])

Relevant Generative AI Experience to Date
• DP By Design & DP Impact Assessments - stressed by DPAs and
strongly ex ante, not ex post, in nature
• Proactive Transparency – also stressed, including by Italian DPA,
which required OpenAI to carry out an active information campaign
• Rectification – ChatGPT states that this can’t always be carried out
• Restriction – Not mentioned (and burden of proof re accuracy
remains unclear)
• All processing or just results? – Specifications of rights for
non-users generally focus only on the latter

Significant & Additional Rights Risk Limits
• CJEU has always expressed this limit conceptually
• EDPB (2020) only states that the right is “mainly based” on name search
• Italian DPA (2019) applied the right to a search on a job title
• But Google resolutely limits the right to name-search only
“the operator of a search engine is responsible … because of the
referencing of that page and in particular the display of the link to that
web page in the list of results presented to internet users following a
search on the basis of an individual’s name, since such a display of the
link in such a list is liable significantly to affect the data subject’s
fundamental rights”

Relevant Generative AI Experience to Date
• ChatGPT's removal form talks about “prompts” (although it is
unclear if it sees some prompts as too remote for any action)
• Also states that “our training information does incidentally
include personal information” & no clear route is given for access or
control rights regarding this (although this is under DPA examination)

Taking Stock
• Even within the EU, search engine indexing benefits from
far-reaching exemptions from data protection
• Exemptions enable a balance to be achieved with innovation,
freedom of information etc. but are in essence extra- (& often
contra-) legislative & grant operators great (& often
disproportionate) discretion
• Generative AI services act even less as an intermediary and
process personal data in an even more active manner
• Should seek a better way than this to ensure a balance
between Generative AI products and data protection
