These slides explore the interface between generative AI services such as ChatGPT and Google Bard and the GDPR, in light of the experience of search engine indexing under the EU framework. In contrast to their response to search engines, EU data protection authorities have responded promptly to the emergence of generative AI and have, in principle, stressed the need for full data protection compliance. In reality, however, a host of legal problems remain live, including the absence of a clear legal basis (at least for sensitive personal data), uncertainty about whether data quality standards and data subject rights are or even can be met (at least as regards background processing), and failures of transparency as regards the categories, sources and storage periods of the personal data being processed. There is a serious likelihood, and indeed even present indications, that generative AI services will seek to claim the extra- and even contra-legislative derogations crafted in case law for search engines, which limit duties to situations where processing is liable to affect fundamental rights “significantly and additionally” and to actions deemed to fall within the “responsibilities, powers and capabilities” of the service operators. Such derogations grant operators too much discretion and pay insufficient attention to the highly active manner in which generative AI services process personal data.
2. Search Indexing & European DP: Timeline
Mid-1980s – mid-1990s: Early concerns & some
regulation even of news archive searches
Late-1990s-2000s: Search engines largely seen as “out of
reach”; some focus on limiting exposure e.g. no robots
2007/8 onwards: Spanish DPA (with some wider
support) sees engines as at least ex post controllers
2014-present: CJEU in Google Spain supports core
Spanish position; its reach & limitations are still being worked out
3. Generative AI & DP: Timeline
Early 2023: Mass spread of generative AI, including in the
form of chatbots like OpenAI’s ChatGPT & Google Bard
March-July 2023: Investigations including temporary
ban on ChatGPT in Italy; EDPB creates Taskforce
June 2023: G7 DPA Statement on Generative AI
October 2023: GPA Resolution on Generative AI Systems
“current law applies to generative AI products and users, even as
different jurisdictions continue to develop AI-specific laws and policies”
4. Generative AI & DP: Unclear Realities
Legal Basis? - Other than “legitimate interest” (which is always
insufficient for special category data)
Categories of Personal Data and Sources?
Storage Periods?
Accuracy/Data Quality? - “Bard … sometimes gives inaccurate or
inappropriate results”
Subject Rights? - “we may not be able to correct the accuracy … [i]n
that case, you may request that we remove your Personal Data from
ChatGPT’s output”
5. Can/are Search Engine Limits Relied Upon?
“Inasmuch as the activity of a search engine is … liable to affect
significantly and additionally compared with that of the publishers of
websites, the fundamental rights to privacy and to the protection of
personal data, the operator of the search engine … must ensure, within the
framework of its responsibilities, powers and capabilities, that the activity
meets the requirements of Directive 95/46 in order that the guarantees
laid down may have full effect and that effective and complete protection
of data subjects, in particular of the right to privacy, may actually be
achieved.” (Google Spain (2014) at [38])
GC et al. (2019): Freedom of expression/information can be
invoked (although the journalistic derogation is not applicable)
6. Substantive Freedom of Expression Limits
Sensitive Data – GC et al. (2019):
“[T]he operator must … ascertain, having regard to the reasons of
substantial public interest referred to in … Article 9(2)(g) of Regulation
2016/679 and in compliance with conditions laid down in those
provisions, whether the inclusion of that link in the list of results
displayed following a search on the basis of the data subject’s name is
strictly necessary for protecting the freedom of information of internet
users” (at [68])
Accuracy – TU, RE v Google (2022):
“where, at the very least, a part – which is not minor in relation to the
content as a whole – of the information referred to in the request for
de-referencing proves to be inaccurate… the right to inform and the right to
be informed cannot be taken into account.” (at [64])
8. Responsibility Limitations: TU, RE (2022)
Only Ex Post:
“the prohibitions and restrictions laid down by … the GDPR can apply to
that operator only by reason of that referencing and thus via a
verification, under the supervision of the competent national authorities,
on the basis of a request by the data subject.” (at [53])
Without Active Investigatory Duties:
“operator cannot be required to play an active role in trying to find facts
which are not substantiated by the request for de-referencing.” (at [70])
9. Relevant Generative AI Experience to Date
DP By Design & DP Impact Assessments - stressed by DPAs and
strongly ex ante not ex post in nature
Proactive Transparency – also stressed, including by the Italian DPA,
which required OpenAI to carry out an active information campaign
Rectification – ChatGPT states that this can’t always be carried out
Restriction – Not mentioned (and burden of proof re accuracy
remains unclear)
All processing or just results? – Specifications of rights for
non-users generally focus only on the latter
10. Significant & Additional Rights Risk Limits
CJEU has always expressed this limit conceptually
EDPB (2020) states only that the right is “mainly based” on name search
Italian DPA (2019) applied the right to a search on a job title
But Google resolutely limits right to name-search only
“the operator of a search engine is responsible … because of the
referencing of that page and in particular the display of the link to that
web page in the list of results presented to internet users following a
search on the basis of an individual’s name, since such a display of the
link in such a list is liable significantly to affect the data subject’s
fundamental rights”
11. Relevant Generative AI Experience to Date
ChatGPT’s Removal Form refers to “prompts” (although it is
unclear whether it sees some prompts as too remote for any action)
It also states that “our training information does incidentally
include personal information”, & no clear route is given for access or
control rights regarding this (although this is under DPA examination)
12. Taking Stock
Even within the EU, search engine indexing benefits from
far-reaching exemptions from data protection
Exemptions enable a balance to be achieved with innovation,
freedom of information etc. but are in essence extra- (& often
contra-) legislative & grant operators great (& often
disproportionate) discretion
Generative AI services act even less as intermediaries and
process personal data in an even more active manner
Should seek a better way than this to ensure a balance
between Generative AI products and data protection