Case based reasoning founded on randomization final

Case-Based Reasoning Founded On
Randomization
Presented by: Miled Basma BENTAIBA, LCSI, ESI, Algiers, Algeria
Supervisor: Pr. Stuart H. RUBIN, NIWC Pacific, San Diego, CA, USA
Co-supervisor: Dr. Lydia BOUZAR, LCSI, ESI, Algiers, Algeria
Thesis defense
Prepared with a view to obtaining an
LMD Doctorate in Computer Science
July 2021

This presentation is dedicated
to Pr. Thouraya Bouabana-Tebibel

Outline
Introduction
State of the Art
Contributions
Conclusion & Perspectives

Introduction – State of the Art – Contributions – Conclusion & Perspectives
Context & Motivation
World
Problems tend to recur
Similar problems have similar solutions

Context & Motivation
The concept of problem solving for humans
Case-based reasoning imitates human thinking
An evolutionary case base versus a static case base

Case-Based Reasoning
The CBR cycle [Aamodt and Plaza, 1994]:
New problem → Retrieve → Reuse → Revise → Retain → Learned case
Case base
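The four steps of the cycle can be sketched as follows. This is a minimal illustration, assuming a toy case base of feature tuples; the `similarity` measure, the copy-only reuse step, and all class and function names are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    problem: tuple
    solution: str


def similarity(a, b):
    """Share of positions where two problems hold equal values (assumed measure)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


@dataclass
class CaseBase:
    cases: list = field(default_factory=list)

    def retrieve(self, problem):
        """Retrieve: return the most similar stored case, or None if the base is empty."""
        return max(self.cases, key=lambda c: similarity(c.problem, problem), default=None)

    def reuse(self, retrieved, problem):
        """Reuse: copy the retrieved solution onto the new problem."""
        return Case(problem, retrieved.solution)

    def revise(self, proposed, is_valid):
        """Revise: keep the proposed case only if the validator accepts it."""
        return proposed if is_valid(proposed) else None

    def retain(self, learned):
        """Retain: store the learned case unless it is already present."""
        if learned not in self.cases:
            self.cases.append(learned)


def solve(case_base, problem, is_valid=lambda case: True):
    """Run one pass of the cycle and return the learned case, or None."""
    retrieved = case_base.retrieve(problem)
    if retrieved is None:
        return None
    proposed = case_base.reuse(retrieved, problem)
    learned = case_base.revise(proposed, is_valid)
    if learned is not None:
        case_base.retain(learned)
    return learned
```

Passing a stricter `is_valid` callable is where a domain-specific revise step would plug in.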

Problem Statement & Proposed Solution
How to ensure the accuracy and efficiency of CBR’s problem resolution?
Current Solutions
• Feed the case base using inference methods
Problem
• Generates a massive case base
• Time consuming
• Applied on rules
Proposed Solution
• Use randomization to generalize and to generate new data
Problem
• The generated cases may not be valid
Solution
• Validate the generated cases before their use

Randomization
Random transformations
Non-inferential transmutation
e.g. the mutation operation in GA
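As a hedged illustration of a non-inferential random transformation, the sketch below applies a GA-style mutation to one feature of a case's problem part. The feature domains and the `mutate` helper are hypothetical; they only show the idea named on the slide, not the randomization used in the thesis.

```python
import random


def mutate(problem, domains, rng):
    """Return a copy of `problem` with one randomly chosen feature replaced
    by a different value drawn from that feature's domain."""
    index = rng.randrange(len(problem))
    alternatives = [v for v in domains[index] if v != problem[index]]
    mutated = list(problem)
    mutated[index] = rng.choice(alternatives)
    return tuple(mutated)


# Hypothetical categorical domains, one per feature (each needs at least two values).
DOMAINS = [("round", "oval", "lobular"), ("circumscribed", "ill-defined")]
rng = random.Random(0)
print(mutate(("round", "circumscribed"), DOMAINS, rng))
```

A case produced this way has no guarantee of being valid, which is why the generated cases need a validation step before use.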

Randomization in Case-Based Reasoning
There are three main reasons for using randomization in CBR:
• Enlarge the reasoning space
• Extract explicit knowledge from implicit knowledge
• Keep the case base compressed & optimized

Case-Based Reasoning Tasks
Analytic Tasks
• Classification
• Diagnosis
• Prediction
Synthetic Tasks
• Configuration
• Planning
• Design

Case-Based Reasoning for Analytic Tasks
Case base
• Divide the case base into small case bases [Fan et al., 2011; Smiti and Elouedi, 2013]
• Shrink the case base to a smaller size [Smiti and Elouedi, 2018; Bentaiba-Lagrid et al., 2018; Smiti and Elouedi, 2010; Smiti and Elouedi, 2014; Yan et al., 2016]

Retrieve
• Rank features according to their importance [Ahn and Kim, 2009; Elter et al., 2007; Huang et al., 2012; Yan et al., 2016]
• Use intelligent systems for the retrieve step [Quellec et al., 2010; Rezvan et al., 2013]

Reuse
• Use adaptation for the reuse step [Mazurowski et al., 2008; Yan et al., 2016]
Retain
• Decide whether or not to retain a case [Sharaf-El-Deen et al., 2014; Yan et al., 2016]

Revise
• The revise step is highly dependent on the application domain

Case-Based Reasoning for Synthetic Tasks
• Structure the cases into a hierarchy of the sub-components that constitute them [Burke et al., 2001; Burke et al., 2006]
• Decompose the case into small solvable problems [Burke et al., 2006]
• Model the case base in the Petri net language [Lim et al., 2015]

Research Directions & Contributions
Task                          Contribution          Application Domain
Analytic Tasks: Diagnosis     Cases Randomization   Mammography Mass
Analytic Tasks: Diagnosis     CBR modules           Medical Diagnosis
Synthetic Tasks: Planning     Cases Randomization   Route Planning
Synthetic Tasks: Design       Cases Randomization   Scheduling Systems

Contributions on Randomization

Overview
Components of the proposed architecture:
• User Interface
• Case Base
• Cases Randomization
• Retrieve, Reuse, Revise, Retain
• Validation of Generated Cases

Application Domains
• Contribution 1: Route planning
• Contribution 2: Scheduling Systems
• Contribution 3: Mammography Mass

Approaches Details: Cases Randomization
Route planning
• Problem part substitution
• Solution part substitution
Scheduling System
• Problem part substitution
Mammography Mass
• Problem part substitution

Approaches Details: Cases Validation
Route planning
• Coherence verification
Scheduling System
• Adding constraints (pre-randomization)
• Coherence verification (post-randomization)
Mammography Mass
• Coherence verification
• Stochastic validation
• Absolute validation

Approaches Details: Datasets
Route planning
• OpenStreetMap across Algiers city
Scheduling System
• Project Scheduling Problem Library
Mammography Mass
• UCI Machine Learning Repository

Approaches Details: Research Findings
Route planning
• 40% of problems are resolved using new cases
• Case base is augmented by 140%
Scheduling System
• 30% of problems are resolved using new cases
• 91% of results are better than the benchmarks
• Improvement of 2 to 15%
Mammography Mass
• Problem resolution increased by 8%
• 90% of generated cases are already valid

Discussion on Contributions on Randomization
Limitations
• The involvement of the expert for validation is necessary
• The case base is controlled by the randomization module
• No contributions inside the case-based reasoning modules
• The validation module is separated from the revise module

Case-Based Reasoning For Medical Diagnosis

Case-Based Reasoning
Retrieve
• Similarity functions
Reuse
• Copy cases' solutions
• Copy delegates
• Test with rules
Revise
• Coherence verification
• Stochastic validation
• Absolute validation
Retain
• Store the case in segments or increment its frequency
Case Base
• Segmented with delegates
External Modules
Feature Selection & Weighting Module
• Offline process
Rules Generation Module
• Periodic process
Amplification using Randomization Module
• At each new incoming case
User Interface

1. Feature Selection & Weighting
Pair counts for a feature ti:
                                           Highly similar values of ti   Slightly similar values of ti
Pairs of cases with the same solution      V                             Y
Pairs of cases with different solutions    X                             Z
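The four pair counts can be computed as in the sketch below, reading the slide as V and Y for pairs with the same solution and X and Z for pairs with different solutions. It assumes that "highly similar" means equal values (a stand-in for categorical features) and that every other pair is "slightly similar"; the slides give neither the similarity bands nor the formula that turns the counts into a weight, so no weight is computed here. The function name and sample cases are hypothetical.

```python
from itertools import combinations


def pair_counts(cases, feature_index, highly_similar=lambda a, b: a == b):
    """Count case pairs for one feature.

    V: same solution, highly similar values
    Y: same solution, slightly similar values
    X: different solutions, highly similar values
    Z: different solutions, slightly similar values
    """
    counts = {"V": 0, "Y": 0, "X": 0, "Z": 0}
    for (p1, s1), (p2, s2) in combinations(cases, 2):
        close = highly_similar(p1[feature_index], p2[feature_index])
        if s1 == s2:
            counts["V" if close else "Y"] += 1
        else:
            counts["X" if close else "Z"] += 1
    return counts


# Hypothetical cases: (problem features, solution).
cases = [
    (("round", 4), "benign"),
    (("round", 5), "benign"),
    (("lobular", 5), "malignant"),
]
print(pair_counts(cases, 0))  # {'V': 1, 'Y': 0, 'X': 0, 'Z': 2}
```

Intuitively, a feature with large V and Z separates the solutions well, while large X and Y suggest it carries little information; how these counts are combined is left to the thesis.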

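2. Rules Generation
Example rule tree, from the start point (Benign) through the levels BI-RADS, Mass Margin, Mass Shape, Age:
• BI-RADS: 3 → Mass Margin: Ill-defined → Mass Shape: Lobular → Age: 52
• BI-RADS: 4 → Mass Margin: Circumscribed → Mass Shape: Round → Age: 54
• BI-RADS: 4 → Mass Margin: Circumscribed → Mass Shape: Oval → Age: 60
• BI-RADS: 4 → Mass Margin: Circumscribed → Mass Shape: Lobular → Age: 36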

Case Base Segmentation
Case form: Cj: problem (Pj) → solution (Sj)
Sectors: sector represented by S = S1, …, sector represented by S = Sm
Segments: the representative of a segment is composed of a solution and a delegate (e.g. S = S1, Delegate = Dk); delegates D1 … Dk … Dj
Levels within a segment:
• level 1: problems Pj highly similar to the delegate
• level 2: problems Pj less highly similar to the delegate
• …
• level n: problems Pj slightly similar to the delegate
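One way to picture the sector, segment, and level layout, together with the "store the case or increment its frequency" retain rule, is the sketch below. It rests on several assumptions that the slides do not state: the similarity measure, the two level cut-offs, and the rule that the first case of a sector becomes its only delegate. All names are hypothetical.

```python
from collections import Counter, defaultdict


def similarity(a, b):
    """Share of positions where two problems hold equal values (assumed measure)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


class SegmentedCaseBase:
    def __init__(self, level_bounds=(0.8, 0.5)):
        # Assumed cut-offs: level 1 if similarity >= 0.8, level 2 if >= 0.5, else level 3.
        self.level_bounds = level_bounds
        # sector (solution) -> delegate problem -> level -> Counter of problems
        self.sectors = defaultdict(dict)

    def level_of(self, score):
        """Map a similarity score to a level number (1 is the closest level)."""
        for level, bound in enumerate(self.level_bounds, start=1):
            if score >= bound:
                return level
        return len(self.level_bounds) + 1

    def retain(self, problem, solution):
        """Store the case in its segment, or increment its frequency if already stored."""
        segments = self.sectors[solution]
        if not segments:
            # Assumption: the first case seen in a sector becomes its delegate.
            segments[problem] = defaultdict(Counter)
        delegate = max(segments, key=lambda d: similarity(d, problem))
        level = self.level_of(similarity(delegate, problem))
        segments[delegate][level][problem] += 1
        return delegate, level
```

Keeping a frequency counter instead of duplicate entries is what lets the case base grow in evidence without growing in size.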

Retrieve & Reuse
• Cases with similarity > threshold: return the solution(s) of the selected case(s)
• Segments with similarity > threshold: return the solution(s) of the segment(s)
• Test the problem with different solutions using the generated rules: return the solution(s) that succeed
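The three retrieval options above can be sketched as a fallback chain. Treating them as successive fallbacks is an assumption, as are the weighted-match similarity function, the representation of rules as one predicate per solution, and every name in the code.

```python
def weighted_similarity(a, b, weights):
    """Weighted share of matching features (assumed similarity function)."""
    matched = sum(w for x, y, w in zip(a, b, weights) if x == y)
    return matched / sum(weights)


def retrieve_and_reuse(problem, cases, delegates, rules, weights, threshold):
    """Return candidate solutions, trying cases, then segments, then rules."""
    # 1. Cases with similarity > threshold: return their solutions.
    found = {s for p, s in cases if weighted_similarity(problem, p, weights) > threshold}
    if found:
        return found
    # 2. Segments whose delegate has similarity > threshold: return the segment solutions.
    found = {s for d, s in delegates if weighted_similarity(problem, d, weights) > threshold}
    if found:
        return found
    # 3. Test the problem against each solution's rule and keep the ones that succeed.
    return {solution for solution, rule in rules.items() if rule(problem)}
```

For example, with `weights = [2, 1, 1]` and a threshold of 0.7, a problem that matches a stored case on the first two features scores 0.75 and reuses that case's solution directly, without consulting delegates or rules.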

1. Revise & Retain Process
Revise Module: New case → coherence verification → coherent cases → stochastic validation → absolute validation
Decision branches: validity < threshold, validity >= threshold
• Coherence verification using our stochastic grammar
• Stochastic validation:
  ◦ Probability of the case’s validity,
  ◦ a dynamic value
• Absolute validation:
  ◦ Validation using the generated rules
  ◦ Expert validation (non-essential)
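A minimal sketch of the three validation stages follows. The slide names the stages and the threshold test but the extracted text does not say which branch leads where, so the flow below is an assumption: incoherent cases are rejected, cases whose estimated validity reaches the threshold are accepted, and the remaining cases go to absolute validation. The callables stand in for the stochastic grammar, the validity estimate, and the generated rules; all names are hypothetical.

```python
def revise(case, is_coherent, validity, threshold, passes_rules):
    """Return True when the new case may be retained.

    Assumed flow: reject incoherent cases; accept cases whose estimated
    validity reaches the threshold; send the rest to absolute validation.
    """
    # Stage 1: coherence verification (stand-in for the stochastic grammar check).
    if not is_coherent(case):
        return False
    # Stage 2: stochastic validation (probability that the case is valid).
    if validity(case) >= threshold:
        return True
    # Stage 3: absolute validation (stand-in for the generated rules).
    return passes_rules(case)


# Hypothetical usage with toy checks on a (problem, solution) pair.
case = (("round", "circumscribed"), "benign")
accepted = revise(
    case,
    is_coherent=lambda c: len(c[0]) == 2,
    validity=lambda c: 0.4,
    threshold=0.6,
    passes_rules=lambda c: c[1] == "benign",
)
print(accepted)  # True
```

Because the validity estimate is described as a dynamic value, `validity` is passed as a callable rather than a constant so it can change as more cases are validated.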

Experiments: Datasets
Mammography Mass
• 5 features (4 categorical, 1 numerical)
• 2 classes
• 961 cases
Thyroid Disease
• 5 features (5 numerical)
• 3 classes
• 215 cases

Experiments: Case-Based Reasoning Prototypes
Configuration   Case Base Segmentation   Randomization   Retrieve
NR_NS           Flat                     No              Cases
NR_S            Segmented                No              Cases & Delegates
R_S             Segmented                Yes             Cases & Delegates
R_NS            Flat                     Yes             Cases

Experiments Map
Experiments Benchmarks
Black Box
White Box
Case Base Amplification
Feature Weighting
CBR Modules
Machine Learning
Other Related Work
Confusion Matrix Metrics
ROC Curve & AUC
Resolution Time

Black Box Experiments

Black Box Exp: Metrics Related to Confusion Matrix
[Figure: two bar charts, Mammography Mass and Thyroid Disease, showing Resolution Capacity, Accuracy, and F1 Score on a 0 to 100 scale for the NR_NS, NR_S, R_NS, and R_S prototypes]
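For reference, accuracy and F1 score follow directly from the confusion matrix, as in the sketch below. These are the standard binary-classification definitions, not code from the thesis; Resolution Capacity is not defined on the slides and is therefore not computed. The counts in the example are made up.

```python
def accuracy(tp, fp, fn, tn):
    """Share of correctly classified cases among all classified cases."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Made-up counts: 40 true positives, 10 false positives, 5 false negatives, 45 true negatives.
print(accuracy(40, 10, 5, 45))   # 0.85
print(round(f1_score(40, 10, 5), 4))  # 0.8421
```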

Black Box Exp: Resolution Time
[Figure: two resolution-time plots. First panel: axis ticks 0 to 6000 and 0 to 39000 (step 1300), series NR_NS, NR_S, R_S, R_NS. Second panel: axis ticks 0 to 60 and 0 to 39000 (step 1300), series NR_NS, NR_S, R_S.]

Benchmarks

Benchmarks: Comparison to Machine Learning (Using Accuracy)
Method       Mammography Mass   Thyroid Disease
LR           82.29              98.57
SVM          82.29              96
KNN          82.29              100
NB           82.29              96
Perceptron   76.04              92
DT           80.21              96
RF           85.42              96
NR_NS        73.48              35.18
NR_S         67.6               83.33
R_NS         99.7               97.52
R_S          88.15              94.81

Benchmarks: Comparison to Related Works (Using Accuracy)
Mammography Mass
Method       Accuracy
GDM-GA-CBR   75.45
GA-CBR       78.13
WE-CBR       79.47
DSL-CBR      79.79
NN           80
WEH-CBR      80.62
DUL-CBR      82.29
NR_NS        77.08
NR_S         58.33
R_NS         98.96
R_S          96.87
Thyroid Disease
Method       Accuracy
GDA-WSVM     91.86
DeFalco      94.14
GA           95.16
CRCR_SVM     95.3
Somrani      95.7
Hayashi      97.02
FS-PSO-SVM   97.49
NR_NS        35.18
NR_S         83.33
R_NS         97.52
R_S          94.81

Discussion
Categorical Dataset vs. Numerical Dataset
Randomization vs. No Randomization
Segmentation vs. No Segmentation
Which Prototype Is Recommended?

Discussion
NR_NS
• Bad accuracy
NR_S
• Bad accuracy
R_NS
• Excellent accuracy
• Very bad resolution time
R_S
• Very good accuracy
• Good resolution time

Objectives
Propose a Randomization Approach for Knowledge Amplification
Build a CBR with the Proposed Randomization
Apply the Full Approach on Real Datasets
Ensure Accuracy
Ensure Good Resolution Time

Challenges
Research on CBR has declined in favor of ML
“Accuracy Takes All” Paradigm
Randomization in CBR is a Recent Research Direction

Our Achievements
Randomization for Planning Tasks
• Integrate it in CBR
• Test it on route planning problems
Randomization for Design Tasks
• Test it on a scheduling problem
Randomization for Diagnosis Tasks
• Test it on the mammographic mass classification problem
CBR for Diagnosis Tasks
• Extend the previous approach
• Test it on the medical field

Perspectives
Improve
• Add a layer for conversational CBR
Generalize
• Create a global framework for analytic tasks
• Create a global framework for synthetic tasks
New Research Directions
• Create a hybrid system by combining ML with CBR [Burca et al., 2018]

List of Publications
ESWA 2020: Bentaiba-Lagrid, M. B., Bouzar-Benlabiod, L., Rubin, S. H., & Hanini, M. R. (2020). A Case-Based Reasoning System for Supervised Classification Problems in the Medical Field. Expert Systems with Applications, 113335.
IRI 2017: Bouabana-Tebibel, T., Rubin, S. H., Bentaiba, M. B., Allaoua, A., & Boumhand, A. (2017, August). Knowledge Amplification through Randomization for Scheduling Systems. In 2017 IEEE International Conference on Information Reuse and Integration (IRI) (pp. 589-598). IEEE.
IRI 2018a: Bentaiba-Lagrid, M. B., Bouzar-Benlabiod, L., Rubin, S. H., Bouabana-Tebibel, T., & Hanini, M. R. (2018, July). Knowledge Amplification Using Randomization in Case-Based Reasoning--Case Study: Severity of Mammography Mass. In 2018 IEEE International Conference on Information Reuse and Integration (IRI) (pp. 155-162). IEEE.
IRI 2018b: Bouabana-Tebibel, T., Rubin, S. H., Bouzar-Benlabiod, L., Bentaiba-Lagrid, M. B., & Hanini, M. R. (2018, July). Knowledge-Based Randomization for Amplification. In 2018 IEEE International Conference on Information Reuse and Integration (IRI) (pp. 147-154). IEEE.
CIIA 2018: Bentaiba-Lagrid, M. B., Bouzar-Benlabiod, L., & Rubin, S. H. (2018, May). Randomization Approach in Case-Based Reasoning System to Amplify the Knowledge Base. In 2018 IFIP International Conference on Computational Intelligence and Its Applications (CIIA).

Thank You For Listening!
Questions?

References
Aamodt, A., & Plaza, E. (1994). Case-based reasoning: Foundational issues, methodological variations, and system approaches. AI Communications, 7(1), 39-59.
Ahn, H., & Kim, K. J. (2009). Global optimization of case-based reasoning for breast cytology diagnosis. Expert Systems with Applications, 36(1), 724-734.
Bentaiba-Lagrid, M. B., Bouzar-Benlabiod, L., Rubin, S. H., Bouabana-Tebibel, T., & Hanini, M. R. (2018, July). Knowledge Amplification Using Randomization in Case-Based Reasoning--Case Study: Severity of Mammography Mass. In 2018 IEEE International Conference on Information Reuse and Integration (IRI) (pp. 155-162). IEEE.
Burca, D., Schüller, M., & Zlabinger, J. (2018). Case-based Reasoning and Machine Learning.
Burke, E. K., MacCarthy, B., Petrovic, S., & Qu, R. (2001, July). Case-based reasoning in course timetabling: an attribute graph approach. In International Conference on Case-Based Reasoning (pp. 90-104). Springer, Berlin, Heidelberg.
Burke, E. K., MacCarthy, B. L., Petrovic, S., & Qu, R. (2006). Multiple-retrieval case-based reasoning for course timetabling problems. Journal of the Operational Research Society, 57(2), 148-162.
Elter, M., Schulz‐Wendtland, R., & Wittenberg, T. (2007). The prediction of breast cancer biopsy outcomes using two CAD approaches that both emphasize an intelligible decision process. Medical Physics, 34(11), 4164-4172.
Fan, C. Y., Chang, P. C., Lin, J. J., & Hsieh, J. C. (2011). A hybrid model combining case-based reasoning and fuzzy decision tree for medical data classification. Applied Soft Computing, 11(1), 632-644.
Huang, M. L., Hung, Y. H., Lee, W. M., Li, R. K., & Wang, T. H. (2012). Usage of case-based reasoning, neural network and adaptive neuro-fuzzy inference system classification techniques in breast cancer dataset classification diagnosis. Journal of Medical Systems, 36(2), 407-414.
Lim, J., Chae, M. J., Yang, Y., Park, I. B., Lee, J., & Park, J. (2015). Fast scheduling of semiconductor manufacturing facilities using case-based reasoning. IEEE Transactions on Semiconductor Manufacturing, 29(1), 22-32.
Mazurowski, M. A., Zurada, J. M., & Tourassi, G. D. (2008). Selection of examples in case-based computer-aided decision systems. Physics in Medicine & Biology, 53(21), 6079.
Pereira, I., & Madureira, A. (2013). Self-optimization module for scheduling using case-based reasoning. Applied Soft Computing, 13(3), 1419-1432.
Sharaf-El-Deen, D. A., Moawad, I. F., & Khalifa, M. E. (2014). A new hybrid case-based reasoning approach for medical diagnosis systems. Journal of Medical Systems, 38(2), 9.
Smiti, A., & Elouedi, Z. (2010). Coid: Maintaining case method based on clustering, outliers and internal detection. In Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing 2010 (pp. 39-52). Springer, Berlin, Heidelberg.
Smiti, A., & Elouedi, Z. (2013, April). Using clustering for maintaining case based reasoning systems. In 2013 5th International Conference on Modeling, Simulation and Applied Optimization (ICMSAO) (pp. 1-6). IEEE.
Smiti, A., & Elouedi, Z. (2014, June). Maintaining case based reasoning systems based on soft competence model. In International Conference on Hybrid Artificial Intelligence Systems (pp. 666-677). Springer, Cham.
Smiti, A., & Elouedi, Z. (2018). SCBM: soft case base maintenance method based on competence model. Journal of Computational Science, 25, 221-227.
Yan, A., Song, H., & Wang, P. (2016). Case-based reasoning model with genetic algorithms, group decision-making and template reduction. International Journal on Artificial Intelligence Tools, 25(02), 1550032.
