Human-Centered Artificial Intelligence: Reliable,
Safe & Trustworthy
Guidelines for Human-AI Interaction
Authors:
Di Pierro Davide
Impellizzeri Federico
Jalna Afridi
Mosca Sara
RST systems overcome autonomy problems.
Human-Centered Artificial Intelligence:
Reliable, Safe & Trustworthy
RELIABLE
Reliability derives from the use of technical practices that support responsibility, fairness and explainability, such as benchmark tests, continuous review of data quality and different design strategies.
SAFETY
Safety derives from open management strategies such as leadership, open reporting of failures and root-cause failure analyses.
TRUSTWORTHY
Trustworthiness derives from adopting independent oversight structures, such as organizations that develop standards (like IEEE) or government agencies.
Public expectations go beyond trusted systems: people want trustworthy systems based on respected independent oversight structures.
2/12
Dimensional HCAI Frameworks
In a 1-dimensional framework, designers had to choose a point on a single line from human control to computer automation. The implicit message was that more automation meant less user control.
By treating human control and computer automation as two separate dimensions, a 2-dimensional framework is obtained, which suggests that achieving high levels of human control together with high levels of computer automation is also possible.
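The contrast between the two frameworks can be pictured with a small Python sketch. It is only an illustration: the class names, the 0-to-1 scale and the example values are assumptions introduced here, not part of the framework itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OneDimensionalDesign:
    """A single point on the line from human control (0.0) to automation (1.0)."""
    automation: float  # hypothetical 0..1 scale

    @property
    def human_control(self) -> float:
        # On a single line, more automation necessarily means less control.
        return 1.0 - self.automation


@dataclass(frozen=True)
class TwoDimensionalDesign:
    """Human control and automation are independent axes (hypothetical 0..1 scale)."""
    human_control: float
    automation: float


# In one dimension, high automation forces low human control.
one_d = OneDimensionalDesign(automation=0.9)

# In two dimensions, both values can be high at the same time.
two_d = TwoDimensionalDesign(human_control=0.9, automation=0.9)

print(one_d.human_control)
print(two_d.human_control, two_d.automation)
```

The only point of the sketch is that `human_control` is derived from `automation` in the first class and chosen freely in the second.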
3/12
4/12
AI components can be integrated into user-facing systems, but they exhibit
PROBABILISTIC, UNPREDICTABLE BEHAVIOR
This can lead to:
● degradation of UX;
● erosion of user confidence;
● abandonment of AI technologies.
This study distills over 20 years of learning in AI design into a small set of generally applicable design guidelines.
The 18 guidelines for human-AI interaction were proposed and validated through multiple rounds of evaluation, including a user study.
Guidelines for Human-AI Interaction
1. Consolidating Guidelines
2. Modified Heuristic Evaluation
3. User Study
4. Expert Evaluation of Revisions
5/12
Guidelines (1/2)
Moment AI Design Guidelines
INITIALLY
G1 Make clear what the system can do
G2 Make clear how well the system can do what it can do
DURING INTERACTION
G3 Time services based on context
G4 Show contextually relevant information
G5 Match relevant social norms
G6 Mitigate social biases
WHEN WRONG
G7 Support efficient invocation
G8 Support efficient dismissal
G9 Support efficient correction
G10 Scope services when in doubt
G11 Make clear why the system did what it did
6/12
Moment AI Design Guidelines
OVER TIME
G12 Remember recent interactions
G13 Learn from user behavior
G14 Update and adapt cautiously
G15 Encourage granular feedback
G16 Convey the consequences of user actions
G17 Provide global controls
G18 Notify users about changes
Guidelines (2/2)
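For teams that want to use the guidelines as a review checklist, the table can be kept as plain data. The sketch below shows one possible representation; the names `GUIDELINES` and `moment_of` are hypothetical and chosen here for illustration, while the guideline texts and groupings are taken from the slides.

```python
# Guidelines grouped by the moment of interaction, as listed on the slides.
GUIDELINES = {
    "Initially": {
        "G1": "Make clear what the system can do",
        "G2": "Make clear how well the system can do what it can do",
    },
    "During interaction": {
        "G3": "Time services based on context",
        "G4": "Show contextually relevant information",
        "G5": "Match relevant social norms",
        "G6": "Mitigate social biases",
    },
    "When wrong": {
        "G7": "Support efficient invocation",
        "G8": "Support efficient dismissal",
        "G9": "Support efficient correction",
        "G10": "Scope services when in doubt",
        "G11": "Make clear why the system did what it did",
    },
    "Over time": {
        "G12": "Remember recent interactions",
        "G13": "Learn from user behavior",
        "G14": "Update and adapt cautiously",
        "G15": "Encourage granular feedback",
        "G16": "Convey the consequences of user actions",
        "G17": "Provide global controls",
        "G18": "Notify users about changes",
    },
}


def moment_of(guideline_id: str) -> str:
    """Return the interaction moment a guideline ID (e.g. "G9") belongs to."""
    for moment, guidelines in GUIDELINES.items():
        if guideline_id in guidelines:
            return moment
    raise KeyError(f"Unknown guideline: {guideline_id}")


total = sum(len(group) for group in GUIDELINES.values())
print(total, moment_of("G9"))  # prints: 18 When wrong
```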
Phase 1 - Consolidating Guidelines
GATHERING PROCESS (168 candidate guidelines)
Sources:
● review of industry AI guidelines;
● public articles;
● relevant papers about AI.
CLUSTERING GUIDELINES (35 concepts)
Clustering according to meaning.
FILTER (20 guidelines)
Reasons for filtering out:
● too vague;
● closely tied to a particular context;
● not AI-specific.
Each of the 20 guidelines was rephrased as a sentence.
Moments: Initially, During Interaction, When wrong, Over time
7/12
8/12
Microsoft conducted an evaluation to test and iterate
on the initial set of 20 AI design guidelines using a
heuristic evaluation with 11 evaluators.
Each guideline must:
● be written as a rule of action;
● be accompanied by a description (to clarify
ambiguities);
● not contain conjunctions.
Goals
● Identify Applications and Violations;
● Reflect on the meaning of guidelines.
Result
Guidelines that were applied only a few times or not at all were removed.
Phase 2 - Modified Heuristic Evaluation
PARTICIPANTS
Age:
● 18-24;
● 25-34;
● 35-44;
● 45-54.
Job:
● researchers;
● designers;
● design interns;
● engineers.
A user study was conducted with 49 HCI practitioners to understand the guidelines’ applicability across a variety of products and to get feedback about the guidelines’ clarity.
Phase 3 - User Study (1/2)
Experience:
● 1-4 years;
● 5-9 years;
● 10-14 years;
● 15-19 years;
● 20+ years.
PRODUCT
Each participant was assigned a product they were familiar with.
PROCEDURE
For each guideline:
● if it does not apply, explain why;
● otherwise:
○ note whether it is applied or violated;
○ give some examples;
○ rate it on a 5-point semantic scale;
○ explain the rating.
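The per-guideline procedure can be sketched as a small record type with validation. This is a hedged illustration only: `GuidelineAssessment`, `Verdict` and the field names are hypothetical, and since the slides do not state the labels of the scale endpoints, the rating is modeled simply as an integer from 1 to 5.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Verdict(Enum):
    APPLICATION = "application"
    VIOLATION = "violation"


@dataclass
class GuidelineAssessment:
    """One participant's answer for one guideline (hypothetical structure)."""
    guideline_id: str
    applies: bool
    explanation: str  # why it does not apply, or why the rating was given
    verdict: Optional[Verdict] = None
    examples: List[str] = field(default_factory=list)
    rating: Optional[int] = None  # 1..5; endpoint labels are not given in the slides

    def __post_init__(self) -> None:
        if not self.explanation:
            raise ValueError("an explanation is always required")
        if not self.applies:
            # "Does not apply" answers carry only the explanation.
            if self.verdict is not None or self.examples or self.rating is not None:
                raise ValueError("no verdict, examples or rating when not applicable")
            return
        if self.verdict is None:
            raise ValueError("state whether this is an application or a violation")
        if self.rating is None or not 1 <= self.rating <= 5:
            raise ValueError("rating must be an integer from 1 to 5")


answer = GuidelineAssessment(
    guideline_id="G8",
    applies=True,
    explanation="Suggestions can be closed with a single click.",
    verdict=Verdict.APPLICATION,
    examples=["Dismiss button on each suggestion"],  # invented example text
    rating=4,
)
print(answer.guideline_id, answer.verdict.value, answer.rating)
```

Validating in `__post_init__` keeps incomplete answers from being stored, which mirrors the two branches of the procedure.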
9/12
Results
The evaluation in this phase was focused on two key
questions, each addressed in one of the subsections below:
➢ Are the guidelines relevant? That is, can we identify
examples of each guideline across a variety of products
and features?
➢ Are the guidelines clear? That is, can participants
understand and differentiate among them?
Phase 3 - User Study (2/2)
10/12
11/12
To verify whether the revisions improved the guidelines, an expert review was conducted.
11 experts with experience in applying various guidelines to design solutions took part, as they would be able to assess whether the guidelines would be easy to understand and therefore to work with.
Phase 4 - Expert Evaluation of Revisions
● Each expert reviewed the 9 revised
guidelines independently and they chose,
for each guideline, the version they
thought was easier to understand.
● The experts reviewed the pairs of
guidelines that emerged in Phase 3 as
confusing or overlapping and, for each
pair, experts rated whether the two
guidelines mean the same thing and the
difficulty of distinguishing between them.
These guidelines only begin to touch on topics of fairness and broader ethical considerations. Ethical
concerns extend beyond the matching of social norms and mitigating social biases.
As the current technology landscape is shifting towards the increasing inclusion of AI in computing
applications, we see significant value in working to further develop and refine design guidelines for
human-AI interaction.
There is a significant trade-off between generality and specialization. In particular, additional guidelines may
be required in certain high-risk or highly regulated areas.
The decisions to optimize for generality, and to focus on observable properties, serve as a reminder
that interaction designers routinely encounter these types of trade-offs.
Discussions and Conclusion
12/12
