Trustworthy AI and Open Science

TRUSTWORTHY AI AND
OPEN SCIENCE
Beth Plale
Michael A. and Laurie Burns McRobbie Professor of Computer Engineering
Beilstein Open Science symposium
October 06, 2021
Luddy School of Informatics, Computing, and Engineering
Data To Insight Center

Observations influenced by my role (2017-2020) at the
National Science Foundation working on agency policies
and practice in open science. Views expressed are
entirely my own.
Funding agency perspective on open science: how do
we bring visibility to the products of research (that we
fund)?

NSF funds the collection and capture
of research data through projects
ranging from a few hundred thousand
dollars to tens of millions of dollars.
The data are maintained in a
landscape of solutions to meet the
needs of researchers.

RESEARCH DATA LANDSCAPE
Specialist repositories
- Organizational resources
Generalist repositories
Data Portals
- Low velocity data
- Employs cloud resources
- Employs data-compute proximity for analysis
Observation networks
- High velocity data
Exemplar systems: SAGE, NEON ARM, HPWREN, UWI, LTER, OOI, NEON, HydroShare, LTER, MGDS, IRIS, ICPSR, QDR, TAIR, MDF, IEDA, PDB, CCDC, DataVerse, Figshare, Dryad, Zenodo, IRs

Landscape axes (figure labels):
- Data timeliness need
- Researcher depth of expertise
- Expectation for level of curation
- Expectation of data longevity

Publisher’s view of landscape (general public view as well)
Optimization for timeliness of research could suggest lower value over time

Generalist-Aided
- Deposit: engages generalist curators
- Metadata: generalist schema
- Reuse potential: moderate-low as metadata is curated but general
- Scope: discipline agnostic scope
- Discovery: broad name recognition
Specialist-DBMS
- Deposit: difficult so DB often read-only
- Metadata: data dictionary + DB schema
- Reuse potential: high potential as self contained
- Scope: subdiscipline scope
- Discovery: known within subdiscipline
Specialist-Aided
- Deposit: engages specialist curators
- Metadata: specialized schema
- Reuse potential: high due to specialists
- Scope: discipline scope
- Discovery: known within discipline
Specialist-Unaided
- Deposit: unaided deposit
- Metadata: specialized schema
- Reuse potential: moderate-high from discipline focus of metadata schema
- Scope: discipline scope
- Discovery: known within discipline
Generalist-Unaided
- Deposit: unaided deposit
- Metadata: generalist schema
- Reuse potential: low as metadata is minimal
- Scope: discipline agnostic scope
- Discovery: broad name recognition
i.e., institutional repositories
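
The five repository types above share the same five attributes, so they can be held in one small data structure and queried. The following is a minimal sketch, not something shown in the talk: the class name `RepositoryType`, the list `REPOSITORY_TYPES`, and the helper `with_reuse` are hypothetical names chosen for illustration, and the attribute values are shortened from the slide text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepositoryType:
    """One column of the repository comparison (hypothetical structure)."""
    name: str
    deposit: str
    metadata: str
    reuse_potential: str
    scope: str
    discovery: str


# Values abbreviated from the slide; the ordering follows the slide.
REPOSITORY_TYPES = [
    RepositoryType("Generalist-Aided", "generalist curators", "generalist schema",
                   "moderate-low", "discipline agnostic", "broad name recognition"),
    RepositoryType("Specialist-DBMS", "difficult, often read-only", "data dictionary + DB schema",
                   "high", "subdiscipline", "known within subdiscipline"),
    RepositoryType("Specialist-Aided", "specialist curators", "specialized schema",
                   "high", "discipline", "known within discipline"),
    RepositoryType("Specialist-Unaided", "unaided", "specialized schema",
                   "moderate-high", "discipline", "known within discipline"),
    RepositoryType("Generalist-Unaided", "unaided", "generalist schema",
                   "low", "discipline agnostic", "broad name recognition"),
]


def with_reuse(level):
    """Return the names of repository types whose reuse potential equals `level`."""
    return [r.name for r in REPOSITORY_TYPES if r.reuse_potential == level]


print(with_reuse("high"))
```

Keeping the reuse levels as plain strings mirrors the qualitative wording of the slide; an ordered enumeration would be a reasonable alternative if the levels needed to be compared.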

FEDERAL RESEARCH DATA SUMMARY
• Observation networks and data portals are a fixed part of the
landscape. They have a different role in open science than do
repositories
• Generalist repositories are easier to use than specialist
repositories
• Specialist repositories have higher reusability
• Generalist repositories have economies of scale
• If specialist repositories can leverage generalist repositories as
back ends, overall cost would be reduced

OPEN SCIENCE ROLE IN AI
TRUSTWORTHINESS

“ON ARTIFICIAL
INTELLIGENCE, TRUST
IS A MUST, NOT A
NICE-TO-HAVE”
Margrethe Vestager, the European
Commission executive vice president
who oversees digital policy for the 27-
nation bloc

TRUST ↔ TRUSTWORTHINESS
TRUST
• An individual’s confidence in an
entity
• “I trust this web site”
TRUSTWORTHINESS
• An entity’s state of being
trustworthy or reliable
• An estimate of an object’s
worthiness to receive someone’s
trust
• Trustworthiness is difficult to
accurately quantify

Broad Categories of AI
AI: Human-Machine Interaction
• Fitness smartwatch, smart hearing aids
• Co-bots, cyber-crews, digital twins
• Integration of smart machines into the human body in the form of computer-brain interfaces or cyborgs
AI: Autonomous and Semi-Autonomous Actors
• Weapon systems
• Robots in deep sea and space exploration
• Self-driving cars
• Bots in financial trade
AI: Big Data / Big Compute
• Deep learning / Machine Learning / Natural Language Processing
• Medical diagnosis, image recognition

AI: Autonomous and Semi-Autonomous Actors
• Category with most urgency in issues of artificial moral agency
• Research needed in policy and technical extensions that lead to greater and more measurable forms of accountability

INTERVENTION POINTS: ENHANCED TRUSTWORTHINESS
• Developer ethics, development process norms
• Societal influence: public pressure, legislation, regulatory oversight
• Technological manifestation: verifiable claims, explainability, accountability
→ AI algorithmic knowledge exhibiting higher levels of trustworthiness

Trustworthy AI is AI that is designed, developed, and used in a
manner that is lawful, fair, unbiased, accurate, reliable,
effective, safe, secure, resilient, understandable, and with
processes in place to regularly monitor and evaluate the AI
system’s performance and outcomes
Lynne Parker, Deputy US Chief Technology Officer and Director of the National Artificial Intelligence Initiative Office

ML PROCESS
M. Veale et al., CHI 2018
Diagram stages: data (training data, test data), feature extraction, learning algorithm, trained model, predict (on new data), explainability inquiries, dev ops
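
The stages in the ML process diagram can be walked through in a few lines of code. This is only an illustrative sketch under stated assumptions: the data are synthetic, the learning algorithm is a simple nearest-centroid classifier chosen for brevity (the talk does not name one), and the function names `extract_features`, `learn`, `explain`, and `predict` are hypothetical.

```python
from statistics import mean

# Synthetic, well-separated records (illustrative only, not from the talk).
data = (
    [{"x": i, "y": i + 1, "label": "low"} for i in range(10)]
    + [{"x": 100 + i, "y": 100 + i, "label": "high"} for i in range(10)]
)

# Data -> training data / test data: hold out every fifth record for testing.
test_data = data[::5]
training_data = [r for i, r in enumerate(data) if i % 5 != 0]


def extract_features(record):
    """Feature extraction: turn a raw record into a numeric vector."""
    return [record["x"], record["y"]]


def learn(records):
    """Learning algorithm (nearest centroid): average the features per label."""
    rows_by_label = {}
    for r in records:
        rows_by_label.setdefault(r["label"], []).append(extract_features(r))
    return {label: [mean(col) for col in zip(*rows)]
            for label, rows in rows_by_label.items()}


def explain(model, features):
    """Explainability inquiry: squared distance from the input to each centroid."""
    return {label: sum((a - b) ** 2 for a, b in zip(features, centroid))
            for label, centroid in model.items()}


def predict(model, features):
    """Predict: pick the label whose centroid is closest to the input."""
    distances = explain(model, features)
    return min(distances, key=distances.get)


trained_model = learn(training_data)
correct = sum(predict(trained_model, extract_features(r)) == r["label"] for r in test_data)
accuracy = correct / len(test_data)

new_record = {"x": 3, "y": 4}  # new data, unseen during training
prediction = predict(trained_model, extract_features(new_record))
print(accuracy, prediction, explain(trained_model, extract_features(new_record)))
```

Here `explain` stands in for an explainability inquiry by exposing the distances that drive each prediction; real explainability services would be considerably richer.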

RESEARCH PRODUCTS
M. Veale et al., CHI 2018
The same ML process diagram, viewed as research products: data, training data, test data, feature extraction, learning algorithm, trained model, predictions on new data, explainability inquiries, dev ops

Open science contributes to trustworthy
AI (trusted products)
The research products of AI need to
include intermediate results and
explainability services
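
One way to make intermediate results shareable as research products is to publish a manifest that fingerprints each artifact of the ML process. The sketch below is an assumption about how that could look, not a method described in the talk: the function names `fingerprint` and `build_manifest`, the stage labels, and the toy artifacts are all hypothetical.

```python
import hashlib
import json


def fingerprint(artifact):
    """Return a SHA-256 digest of a JSON-serializable artifact."""
    payload = json.dumps(artifact, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_manifest(artifacts):
    """Map each artifact name to its pipeline stage and content digest."""
    return {
        name: {"stage": stage, "sha256": fingerprint(value)}
        for name, (stage, value) in artifacts.items()
    }


# Toy stand-ins for the products of an ML process (illustrative values only).
artifacts = {
    "training_data": ("data", [[1, 2], [3, 4]]),
    "test_data": ("data", [[5, 6]]),
    "extracted_features": ("intermediate", [[0.1, 0.2], [0.3, 0.4]]),
    "trained_model": ("model", {"weights": [0.5, -0.25]}),
}

manifest = build_manifest(artifacts)
print(json.dumps(manifest, indent=2))
```

Serializing with `sort_keys=True` keeps the digest stable regardless of dictionary key order, so anyone holding the same artifact can recompute and verify its fingerprint.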

BETH PLALE
INDIANA UNIVERSITY
PLALE@INDIANA.EDU
