Human-Centered Artificial Intelligence: Reliable, Safe & Trustworthy

Guidelines for Human-AI Interaction
Authors:
Di Pierro Davide
Impellizzeri Federico
Jalna Afridi
Mosca Sara

RST systems overcome autonomy problems.
RELIABLE
Reliability derives from the use of technical practices that support
responsibility, fairness and explainability, such as benchmark tests,
continuous review of data quality and different design strategies.
SAFE
Safety derives from open management strategies like leadership, open
reporting of failures and root-cause failure analyses.
TRUSTWORTHY
Trustworthiness derives from adopting independent oversight structures,
such as organizations that develop standards (like IEEE) or government
agencies.
Public expectations go beyond trusted systems: people want trustworthy
systems based on respected independent oversight structures.

Dimensional HCAI Frameworks
In a 1-dimensional framework, designers had to choose a point on a single
line running from human control to computer automation. The implicit
message was that more automation meant less user control.
Treating computer automation and human control as two separate concepts
yields a 2-dimensional framework, which suggests that high levels of
human control and high levels of computer automation can be achieved
together.
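The difference between the two framings can be sketched in a few lines of Python. This is only an illustration: the class names, the 0-to-1 scale and the 0.5 threshold are assumptions made for the example, not part of the framework itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OneDimensionalDesign:
    """A design is a single point on the line from human control to automation."""
    automation: float  # assumed scale: 0.0 (none) to 1.0 (full)

    @property
    def human_control(self) -> float:
        # On a single line, control is whatever automation leaves over.
        return 1.0 - self.automation


@dataclass(frozen=True)
class TwoDimensionalDesign:
    """Human control and automation are independent axes."""
    human_control: float  # assumed scale: 0.0 to 1.0
    automation: float     # assumed scale: 0.0 to 1.0


def is_high_high(design, threshold: float = 0.5) -> bool:
    """True when both axes exceed the (assumed) threshold."""
    return design.human_control > threshold and design.automation > threshold
```

With the 1-dimensional model no value of `automation` can make `is_high_high` true, whereas `TwoDimensionalDesign(0.9, 0.9)` does, which is the point the slide makes.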

Guidelines for Human-AI Interaction
AI components can be integrated into user-facing systems, but they
tend to exhibit
NON-DETERMINISTIC BEHAVIOR
This can lead to:
● degraded UX;
● eroded user confidence;
● abandonment of AI technologies.
This study condenses over 20 years of learning in AI design into a
small set of generally applicable design guidelines.
18 design guidelines for human-AI interaction were proposed and
validated through multiple rounds of evaluation, including a user
study. The work proceeded in four phases:
1 Consolidating Guidelines
2 Modified Heuristic Evaluation
3 User Study
4 Expert Evaluation of Revisions

Guidelines (1/2)
Moment AI Design Guidelines
INITIALLY
G1 Make clear what the system can do
G2 Make clear how well the system can do what it can do
DURING INTERACTION
G3 Time services based on context
G4 Show contextually relevant information
G5 Match relevant social norms
G6 Mitigate social biases
WHEN WRONG
G7 Support efficient invocation
G8 Support efficient dismissal
G9 Support efficient correction
G10 Scope services when in doubt
G11 Make clear why the system did what it did

Guidelines (2/2)
Moment AI Design Guidelines
OVER TIME
G12 Remember recent interactions
G13 Learn from user behavior
G14 Update and adapt cautiously
G15 Encourage granular feedback
G16 Convey the consequences of user actions
G17 Provide global controls
G18 Notify users about changes
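For use in a design review, the 18 guidelines listed on the two slides above could be kept as a small lookup table. The sketch below is one possible encoding; the names `GUIDELINES`, `moment_of` and `unreviewed` are hypothetical helpers introduced for illustration, while the guideline texts and their grouping by moment are taken from the slides.

```python
# Guidelines grouped by the moment of interaction they apply to.
GUIDELINES = {
    "Initially": {
        "G1": "Make clear what the system can do",
        "G2": "Make clear how well the system can do what it can do",
    },
    "During interaction": {
        "G3": "Time services based on context",
        "G4": "Show contextually relevant information",
        "G5": "Match relevant social norms",
        "G6": "Mitigate social biases",
    },
    "When wrong": {
        "G7": "Support efficient invocation",
        "G8": "Support efficient dismissal",
        "G9": "Support efficient correction",
        "G10": "Scope services when in doubt",
        "G11": "Make clear why the system did what it did",
    },
    "Over time": {
        "G12": "Remember recent interactions",
        "G13": "Learn from user behavior",
        "G14": "Update and adapt cautiously",
        "G15": "Encourage granular feedback",
        "G16": "Convey the consequences of user actions",
        "G17": "Provide global controls",
        "G18": "Notify users about changes",
    },
}


def moment_of(guideline_id: str) -> str:
    """Return the moment a guideline belongs to, e.g. 'G9' -> 'When wrong'."""
    for moment, items in GUIDELINES.items():
        if guideline_id in items:
            return moment
    raise KeyError(guideline_id)


def unreviewed(reviewed_ids) -> list:
    """Guideline ids not yet covered by a review, in G1..G18 order."""
    done = set(reviewed_ids)
    return [gid for items in GUIDELINES.values() for gid in items if gid not in done]
```

A reviewer could call `unreviewed({"G1", "G2"})` to see which guidelines still need to be checked against a product.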

Phase 1 - Consolidating Guidelines
GATHERING PROCESS: 168 candidate guidelines
Sources:
● Review of AI guidelines of industry;
● Public articles;
● Relevant papers about AI.
CLUSTERING GUIDELINES: 35 clusters
Clustering according to the meaning.
FILTER: 20 guidelines
Causes of removal:
● too vague;
● highly related to a particular context;
● not AI-specific.
Each of the 20 guidelines was then rewritten as a sentence.
Moments: Initially, During interaction, When wrong, Over time

Phase 2 - Modified Heuristic Evaluation
Microsoft conducted an evaluation to test and iterate
on the initial set of 20 AI design guidelines, using a
modified heuristic evaluation with 11 evaluators.
Each guideline must:
● be written as a rule of action;
● be accompanied by a description (to clarify
ambiguities);
● not contain conjunctions.
Goals
● Identify Applications and Violations;
● Reflect on the meaning of guidelines.
Result
Guidelines that were applied rarely or not at all were removed.

Phase 3 - User Study (1/2)
A user study with 49 HCI practitioners was conducted to understand the guidelines’ applicability across a variety of
products and to get feedback about the guidelines’ clarity.
PARTICIPANTS
Age:
● 18-24;
● 25-34;
● 35-44;
● 45-54.
Job:
● researchers;
● designers;
● design interns;
● engineers.
Experience:
● 1-4 years;
● 5-9 years;
● 10-14 years;
● 15-19 years;
● 20+ years.
PRODUCT
Each participant was assigned a familiar product.
PROCEDURE
For each guideline:
● if it does not apply, explain why;
● otherwise:
○ say whether it is an application or a violation;
○ give some examples;
○ rate it on a 5-point semantic scale;
○ explain the rating.
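The per-guideline procedure above amounts to a small branching form. The sketch below shows one way such a response could be recorded and checked for completeness. The `GuidelineResponse` class, its field names and the 1-to-5 integer coding of the rating are assumptions made for illustration; the slides do not describe how the study actually stored responses.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GuidelineResponse:
    """One participant's answer for one guideline (hypothetical structure)."""
    guideline_id: str
    applies: bool
    why_not: Optional[str] = None             # expected when applies is False
    kind: Optional[str] = None                # "application" or "violation"
    examples: list = field(default_factory=list)
    rating: Optional[int] = None              # assumed coding: 1..5
    rating_explanation: Optional[str] = None

    def problems(self) -> list:
        """Return a list of missing or invalid parts; empty means complete."""
        issues = []
        if not self.applies:
            # Branch 1: the guideline does not apply, so only a reason is needed.
            if not self.why_not:
                issues.append("missing reason why the guideline does not apply")
            return issues
        # Branch 2: the guideline applies, so all four parts are needed.
        if self.kind not in ("application", "violation"):
            issues.append("kind must be 'application' or 'violation'")
        if not self.examples:
            issues.append("at least one example is needed")
        if self.rating is None or not 1 <= self.rating <= 5:
            issues.append("rating must be between 1 and 5")
        if not self.rating_explanation:
            issues.append("missing explanation of the rating")
        return issues
```

For instance, `GuidelineResponse("G5", applies=False, why_not="no social context").problems()` returns an empty list, while a response that applies but lacks examples is flagged.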

Phase 3 - User Study (2/2)
Results
The evaluation in this phase focused on two key questions:
➢ Are the guidelines relevant? That is, can we identify
examples of each guideline across a variety of products
and features?
➢ Are the guidelines clear? That is, can participants
understand and differentiate among them?

Phase 4 - Expert Evaluation of Revisions
An expert review was used to verify whether the revisions improved the guidelines.
It involved 11 experts with experience in applying various guidelines to design solutions, who could assess
whether the guidelines would be easy to understand and therefore to work with.
● Each expert independently reviewed the 9 revised
guidelines and chose, for each guideline, the
version they thought was easier to understand.
● The experts also reviewed the pairs of
guidelines that emerged in Phase 3 as
confusing or overlapping and, for each
pair, rated whether the two guidelines
meant the same thing and how difficult it
was to distinguish between them.

Discussions and Conclusion
These guidelines only begin to touch on topics of fairness and broader ethical considerations. Ethical
concerns extend beyond matching social norms and mitigating social biases.
As the technology landscape shifts towards the increasing inclusion of AI in computing
applications, we see significant value in working to further develop and refine design guidelines for
human-AI interaction.
There is a strong trade-off between generality and specialization. In particular, additional guidelines may
be required in certain high-risk or highly regulated areas.
The decisions to optimize for generality and to focus on observable properties serve as a reminder
that interaction designers routinely encounter these types of trade-offs.
