Computer-assisted cognitive training can help patients affected by several illnesses alleviate their cognitive deficits, or healthy people improve their mental performance. Adapting the difficulty of the exercises to how individuals perform in their execution is crucial to improve the effectiveness of cognitive training activities. We propose the use of Reinforcement Learning to learn how to automatically adapt the difficulty of computerized exercises for cognitive training. We illustrate a method to be initially used to learn difficulty-variation policies tailored for specific categories of trainees, and then to refine these policies for single individuals. We present the results of two user studies that provide evidence for the effectiveness of our method.
SFScon21 - Floriano Zini - Tailored computer-based cognitive training with reinforcement learning
1. Tailored computer-based cognitive
training with reinforcement learning
Floriano Zini
Free University of Bozen-Bolzano
Faculty of Computer Science
Smart Data Factory
12 Nov 2021
Joint work with
Fabio Le Piane
Mauro Gaspari
University of Bologna
2. › Use of computerized CT is increasing
› for therapeutic purposes
› to enhance mental performance
› Promising results have been achieved in
research studies
Computerized cognitive training
3. MS-rehab: computerized CT dedicated to Multiple Sclerosis
[Diagram labels: Set up of rehabilitation sessions; Cognitive profile building; Neuropsychological test input; CR exercise execution; Monitoring]
5. › Exercises were designed by a team of
computer scientists and neuropsychologists
› The exercises cover the three main cognitive
domains:
› attention (12 exercises), memory (8), and
executive functions (4)
› each exercise has multiple versions in which the
stimuli vary, for a total of 52 exercises
› Patients can practice with simplified exercises
before starting the rehabilitation
› Patients can stop an exercise, and restart it
later
› the system restarts the rehabilitation with an
exercise at the same difficulty level as the last one
completed by the patient.
› Feedback is shown at the end of each exercise
› It includes the obtained performance, and the
number of correct, wrong, and missed answers
Cognitive training exercises in MS-rehab
6. MS-rehab efficacy assessment (1)
Recruitment of 8 MS patients → T0: Baseline evaluation → Treatment with MS-rehab → T1: Post-treatment evaluation
Experiment @ Lab of Cognitive Psychology Univ. of Parma
Evaluation
› Psychometric
› Cognitive/neuropsychological
(WAIS-IV scale)
Treatment
› Three weekly cycles, each including
three individual sessions of 40
minutes each
› Same session exercises for all the
patients
› Difficulty of the exercises automatically increased
[Figure: Rehabilitation session]
7. MS-rehab efficacy assessment (2)
Results
Results of 2-tailed T and Wilcoxon signed-rank tests.
In bold the p-values showing a significant difference between T0 and T1 (α = .05)
WAIS-IV
VCI: Verbal Comprehension
PRI: Perceptual Reasoning
WMI: Working Memory
PSI: Processing speed
Psychometric params
BDI-II: depression
STAI-Y1, STAI-Y2: anxiety
FSS: fatigue
MSQOL-54: quality of life
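The paired comparison between T0 and T1 can be illustrated with a small worked example. The sketch below computes the paired t statistic from matched before/after scores using only the Python standard library. The function name `paired_t_statistic` and the sample scores are made up for illustration; they are not data from the study.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for a paired t-test on matched samples."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two matched samples with at least 2 pairs")
    # Per-subject change between the two evaluations (T1 minus T0).
    diffs = [a - b for a, b in zip(after, before)]
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_d == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    # t = mean difference divided by its standard error.
    return mean_d / (sd_d / math.sqrt(n)), n - 1


# Illustrative scores only (hypothetical, 4 subjects).
t0_scores = [10, 12, 9, 11]
t1_scores = [12, 13, 11, 14]
t, df = paired_t_statistic(t0_scores, t1_scores)
print(f"t = {t:.3f}, df = {df}")
```

Turning the statistic into a two-tailed p-value requires the t distribution with `df` degrees of freedom; in practice a statistics library would do this, for instance `scipy.stats.ttest_rel` for the paired t-test and `scipy.stats.wilcoxon` for the signed-rank test.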
8. From automatic exercise difficulty variation …
› State-of-the-art systems for computerized cognitive training
have automatic mechanisms to fit the difficulty of
exercises to how the subjects perform in their execution
› Limit: the exercise parameters are changed using
predefined rules which are the same for all the subjects
› MS-rehab has a mechanism defined by expert
rehabilitators
› difficulty is increased when the patient's performance on an exercise
exceeds 80% of the maximum two consecutive times
CogniPlus
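The fixed rule described above (raise the difficulty after two consecutive results above 80% of the maximum) can be sketched as follows. This is a minimal illustration, not the MS-rehab code: the class name `ThresholdRule`, the integer difficulty levels, and the choice to reset the streak after each increase are assumptions.

```python
from dataclasses import dataclass


@dataclass
class ThresholdRule:
    """Fixed rule: raise difficulty after enough consecutive good results."""
    max_level: int               # highest difficulty level (assumed integer scale)
    threshold: float = 0.8       # fraction of the maximum performance
    streak_needed: int = 2       # consecutive results required
    level: int = 0               # current difficulty level
    streak: int = 0              # current run of results above the threshold

    def update(self, performance: float, max_performance: float) -> int:
        """Record one exercise result and return the level for the next one."""
        if performance > self.threshold * max_performance:
            self.streak += 1
        else:
            self.streak = 0      # the run must be consecutive
        if self.streak >= self.streak_needed and self.level < self.max_level:
            self.level += 1
            self.streak = 0      # assumption: start counting again at the new level
        return self.level


rule = ThresholdRule(max_level=3)
for score in (85, 90, 70, 95, 95):
    print(score, "->", rule.update(score, max_performance=100))
```

The rule is identical for every trainee, which is the limitation noted on this slide.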
9. › We have designed and embedded in MS-rehab a truly adaptive
mechanism based on Reinforcement Learning (RL)
› The policy used to change the exercise difficulty is learned while the
individuals execute the training
› A policy, given a state, assigns a probability to each action that varies the
parameters determining the difficulty of an exercise
› A state is a set of exercise parameter values
… to adaptive exercise difficulty variation
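The definitions of state and policy given on this slide can be made concrete with a short sketch. Here a state is a tuple of exercise parameter values, and the policy maps a state to a probability for each action. The three-action set, the softmax over stored preferences, and all names are assumptions made for illustration; the slides do not specify how the probabilities are represented.

```python
import math
import random
from typing import Dict, Tuple

State = Tuple[int, ...]                      # one value per exercise parameter
ACTIONS = ("decrease", "keep", "increase")   # hypothetical action set


def action_probabilities(prefs: Dict[State, Dict[str, float]],
                         state: State) -> Dict[str, float]:
    """Softmax over per-state action preferences.

    States with no stored preferences get a uniform distribution,
    i.e. a fully random choice among the actions.
    """
    stored = prefs.get(state, {})
    scores = [stored.get(a, 0.0) for a in ACTIONS]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]   # shift for numerical stability
    total = sum(exps)
    return {a: e / total for a, e in zip(ACTIONS, exps)}


def sample_action(prefs: Dict[State, Dict[str, float]],
                  state: State, rng: random.Random) -> str:
    """Draw one action according to the policy's probabilities."""
    probs = action_probabilities(prefs, state)
    return rng.choices(ACTIONS, weights=[probs[a] for a in ACTIONS], k=1)[0]


# Example: a state with two parameters, and a learned preference for "increase".
prefs = {(1, 2): {"increase": 2.0}}
print(action_probabilities(prefs, (1, 2)))
print(sample_action(prefs, (1, 2), random.Random(0)))
```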
10. Two phases
1. Build a policy for an exercise that is specific for a
category of trainees
2. Refine the policy elicited and tailor it to a single
individual of the category
Phase 1 steps
1. A few subjects in one category are selected
2. The first subject executes several instances of the
exercise, starting from the lowest difficulty level;
the RL agent starts with a fully random policy
3. The RL agent interacts in the same way with the
other subjects and progressively refines the policy
4. The policy obtained after all the subjects have
executed their exercises is selected as the
category policy
Method to learn adaptive policies
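The two phases above can be sketched with a toy example. The slides do not name the RL algorithm, the reward, or the state encoding, so everything below is an assumption chosen for brevity: tabular Q-learning, a single integer difficulty level as the state, a simulated trainee instead of a real one, and a made-up reward. The point is only to show a policy that starts random, is refined subject after subject, and is then copied and refined for one individual.

```python
import random
from collections import defaultdict

ACTIONS = (-1, 0, 1)   # lower, keep, raise the difficulty level (hypothetical)
MAX_LEVEL = 5          # assumed number of difficulty levels minus one


def simulated_performance(skill, level, rng):
    """Toy stand-in for a trainee: performance in [0, 1] drops above their skill."""
    base = 1.0 - 0.2 * max(0, level - skill)
    return min(1.0, max(0.0, base + rng.uniform(-0.05, 0.05)))


def reward(level, performance):
    """Made-up reward: favour hard levels that the trainee still handles well."""
    return level * performance if performance >= 0.6 else -1.0


def greedy(q, level, rng):
    """Best-valued action; ties are broken at random, so an all-zero table is random."""
    best = max(q[(level, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if q[(level, a)] == best])


def train_on_subject(q, skill, rng, exercises=200, alpha=0.2, gamma=0.9, epsilon=0.2):
    """Refine q in place while one (simulated) subject executes exercises."""
    level = 0                                    # start from the lowest difficulty
    for _ in range(exercises):
        explore = rng.random() < epsilon
        action = rng.choice(ACTIONS) if explore else greedy(q, level, rng)
        nxt = min(MAX_LEVEL, max(0, level + action))
        r = reward(nxt, simulated_performance(skill, nxt, rng))
        best_next = max(q[(nxt, a)] for a in ACTIONS)
        # Standard Q-learning update.
        q[(level, action)] += alpha * (r + gamma * best_next - q[(level, action)])
        level = nxt
    return q


def learn_category_policy(skills, seed=0):
    """Phase 1: subjects train one after another on a shared table."""
    rng = random.Random(seed)
    q = defaultdict(float)                       # all zeros: fully random at first
    for skill in skills:
        train_on_subject(q, skill, rng)
    return q


def personalize(category_q, skill, seed=1):
    """Phase 2: continue learning from a copy of the category policy."""
    q = defaultdict(float, category_q)           # category policy stays untouched
    return train_on_subject(q, skill, random.Random(seed))


category_q = learn_category_policy(skills=[2, 3, 4])
personal_q = personalize(category_q, skill=4)
print(len(category_q), len(personal_q))
```

Starting phase 2 from a copy of the category table, rather than from an empty one, is the warm start that the method relies on when tailoring a policy to an individual.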
11. Our method is based on four hypotheses
1. A category policy learned using RL can drive the cognitive training
of subjects belonging to that category better than the mechanisms
currently available in computerized cognitive training systems
2. An individually personalized policy learned for a subject belonging
to a category would better drive the cognitive training for that
subject than the category policy from which the individual policy
derives
3. A policy elicited for a category would not improve the training process of
subjects belonging to another category
4. An individually personalized policy for a subject belonging
to a category can be learned more efficiently by refining a
policy previously learned for their category instead
of starting from a fully random policy
12. › Training of 2 exercise difficulty-variation policies with group T (method phase 1)
› Baseline evaluation of groups A and B with the Paced Auditory Serial Addition Test (PASAT)
› A and B had sessions of the two selected exercises
› A used MS-rehab embedding the policies learned with RL
› B used the baseline version of MS-rehab
› Post-treatment evaluation of A and B (PASAT + other tests)
Hypothesis 1 (1)
Category: university students
A category policy learned using RL can drive the
cognitive training of subjects belonging to that
category better than the mechanisms currently
available in computerized cognitive training systems
13. Hypothesis 1 (2)
Category: university students
Exercises shown: Memory N-back, Alternating attention
14. Hypothesis 1 (3)
Category: university students
15. Hypothesis 1 (4)
Category: university students
The average score of group A was 47.8, while that
of group B was 49.2.
No statistically significant difference between the
two average scores (2-tailed T-test, p = 0.42)
16. Hypothesis 1 (5)
Category: university students
Both A and B performed significantly better than before the intervention (2-tailed
paired T-test, p = 1.7 × 10⁻⁵ for group A, and p = 0.02 for group B)
17. Hypothesis 1 (6)
Category: university students
18. › Baseline evaluation of group C (PASAT)
› C had sessions of the two selected exercises
› Refinement of 2 personalized policies for students in group C
› Post-treatment evaluation of C (PASAT + other tests)
Hypothesis 2 (1)
Category: university students
An individually personalized policy learned for a
subject belonging to a category would better drive
the cognitive training for that subject than the
category policy from which the individual policy
derives
19. Hypothesis 2 (2)
Category: university students
• The average cognitive functioning of
group C after the treatment
was significantly better than that of
group B
• This result gives additional evidence
for the validity of Hypothesis 1.
20. Hypothesis 2 (3)
Category: university students
21. Hypothesis 2 (4)
Category: university students
• The number of memory exercises required for group C to
achieve an improvement equivalent to that of group A
was significantly lower (2-tailed T-test, p = 4.87 × 10⁻⁵).
• 46 exercises for group C, 4.6 on average
• 243 exercises for group A, 24.3 on average
• The customised policy may be more efficient in the
number of exercises and, consequently, in the time
needed for training
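The totals and averages reported above can be cross-checked with a few lines of arithmetic. The group size is not stated on the slide; the sketch infers it from total divided by average, which is an assumption about how the averages were computed (per subject).

```python
# Figures taken from the slide.
total_c, mean_c = 46, 4.6      # group C: personalized policies
total_a, mean_a = 243, 24.3    # group A: category policy

# Inferred, not stated on the slide: subjects per group = total / average.
subjects_c = round(total_c / mean_c)
subjects_a = round(total_a / mean_a)

# How many times more memory exercises group A needed per subject.
ratio = mean_a / mean_c

print(subjects_c, subjects_a, round(ratio, 1))
```

Under that reading, both groups would have the same number of subjects, and group A needed roughly five times as many memory exercises per subject as group C.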
22. Conclusions
› In an experiment with 8 patients,
MS-rehab demonstrated its efficacy in MS
rehabilitation
› For university students it is possible to
learn, using RL, an effective category
policy for exercise difficulty variation
› This policy is better than the baseline policy
designed by cognitive training experts
› The category policy learned for students
can be further personalized with RL
› While we were not able to improve the cognitive
training with the personalized policy, we were
able to help students obtain equivalent training
results in a much shorter time and with far
fewer exercises
23. Future work
› Set up other experiments to fully assess our
method to learn adaptive policies
› verify Hypotheses 3 and 4
› Strengthen the results on Hypothesis 1 by
replicating the experiment with other more critical
categories of subjects
› Investigate the influence of the degree of
specificity of a category policy
24. Publications
› Mauro Gaspari, Floriano Zini, Debora Castellano, Federica Pinardi, Sergio Stecchi (2017). An
advanced system to support cognitive rehabilitation in multiple sclerosis. In: IEEE 3rd
International Forum on Research and Technologies for Society and Industry (RTSI) Conference
Proceedings. p. 326-331, IEEE, ISBN: 978-1-5386-3906-1, Modena, September 11-13 2017,
doi: 10.1109/RTSI.2017.8065970
› Floriano Zini, Elena Maria Bressan, Mauro Gaspari (2017). A formative user-based usability
study for an advanced cognitive rehabilitation system. In: CHItaly '17 Proceedings of the 12th
Biannual Conference on Italian SIGCHI Chapter. p. 1-10, ISBN: 9781450352376, Cagliari,
Italy, September 18 - 20, 2017, doi: 10.1145/3125571.3125599
› Daniele Baschieri, Mauro Gaspari, Floriano Zini (2018). Planning-Based Serious Game for
Cognitive Rehabilitation in Multiple Sclerosis. In: GOODTECHS 2018 - 4th EAI International
Conference on Smart Objects and Technologies for Social Good. November 28-30 - Bologna,
Italy
› Mauro Gaspari, Margherita Donnici (2019). Weekend in Rome: a cognitive training exercise
based on planning. In: Proceedings of the Workshop Socio-Affective Technologies: an
interdisciplinary approach (co-located with IEEE SMC 2019)
› Mauro Gaspari, Floriano Zini, Sergio Stecchi (2020). Enhancing cognitive
rehabilitation in multiple sclerosis with a disease-specific tool. In: Disability and
Rehabilitation: Assistive Technology, DOI: 10.1080/17483107.2020.1849432
› Floriano Zini, Fabio Le Piane, Mauro Gaspari (2021). Adaptive Cognitive Training with
Reinforcement Learning. Accepted for publication in ACM Transactions on Interactive
Intelligent Systems.
25. Acknowledgements
› Dr. Sergio Stecchi for contributing to the design of MS-rehab with
his medical expertise
› Dr. Debora Castellano, Dr. Federica Pinardi, Dr. Francesca Rizzi, Dr.
Fabio Bellomi, Dr. Beatrice Goretti, Dr. Federica Lato, Dr. Enrico
Montanari and Dr. Livia Ludovico for their help in the analysis of the
MS cognitive rehabilitation process
› Dr. Elena Maria Bressan, Dr. Daniele Baschieri, Dr. Margherita
Donnici, and Dr. Bartolomeo Lombardi for the contribution given to
the development of MS-rehab with their master's theses.
› Laboratory of Cognitive Psychology of the Department of Medicine
and Surgery of the University of Parma, and Prof. Olimpia Pino and
Dr. Ciro Urselli for their contribution to the pilot study of MS-rehab
› Prof. Franca Stablum and Dr. Agnieszka Kolasińska of the
Department of General Psychology, University of Padua for
extensively testing MS-rehab
26. Thank you for your attention!
MS-rehab website:
https://rehab.cs.unibo.it/MS-rehab-website
Contact:
Floriano Zini
Free University of Bozen-Bolzano – Faculty of Computer
Science – Smart Data Factory
floriano.zini@unibz.it
smart@unibz.it
https://smart.inf.unibz.it/