Comparing Performance of Decision Diagrams vs. Case Retrieval Nets

Comparing Performance of
Decision Diagrams and
Case Retrieval Nets

Alexandre Hanft & Matthias Ringe
Intelligent Information Systems Lab,
University of Hildesheim
Alexandre.hanft|matthias.ringe@uni-hildesheim.de

FG-WM @ LWA 2008 | 2008-10-07 2 of 15

Outline
• Application Domain
• Decision diagrams
• Comparison
– General Comparison
– Build-up
– Retrieval
• Future Work

Application Domain: insurance claims
• Originally 9500 cases: free text, ≥1 technical device
• Consolidated: 18086 structured cases, 1 device
• 13 case attributes [number of different values]:
– plausibilitaet (plausibility) [2]
– schadensursache (cause_of_loss) [7]
– eingang_datum (date_of_receipt) [7], only year
– zustand (condition) [9]
– zeitwert_bis (present_value_to) [13]
– typ (type) [18]
– schadensobjekt (claim_object) [34]
– reparaturkosten_bis (cost_of_repair_to) [35]
– geraetealter_in_jahren (object_age) [41]
– anschaffungswert_bis (acquisition_value_to) [102]
– zeitwert_von (present_value_from) [140]
– reparaturkosten_von (cost_of_repair_from) [317]
– anschaffungswert_von (acquisition_value_from) [4730]
• Similarity modelling: functions for numerical values, taxonomies

Example case: 1 device of an insurance claim (consolidated)
• 5 of 13 case attributes:
– plausibilitaet (plausibility) [2]
– schadensursache (cause_of_loss) [7]
…
– schadensobjekt (claim_object) [34]
– geraetealter_in_jahren (object_age) [41]
– anschaffungswert_von (acquisition_value_from) [4730]

No  plausibility  cause of loss  claim object     object age  acquisition value from
1   true          water damage   computer         1           800
2   true          water damage   computer         3           800
3   false         water damage   computer         4           650
4   true          overvoltage    washing machine  5           999
5   false         overvoltage    dryer            6           550
6   true          lightning      laptop           5           999
7   false         water damage   laptop           4           1300
8   false         water damage   laptop           4           700

Application Domain: insurance claims
• 13 Case Attributes:
– plausibilitaet (engl.: plausibility) [2]
– schadensursache (cause_of_loss) [7]
– eingang_datum (date_of_receipt) [7], only year
– zustand (condition) [9]
– zeitwert_bis (present_value_to) [13]
– typ (type) [18]
– schadensobjekt (claim_object) [34]
– reparaturkosten_bis (cost_of_repair_to) [35]
– geraetealter_in_jahren (object_age) [41]
– anschaffungswert_bis (acquisition_value_to) [102]
– zeitwert_von (present_value_from) [140]
– reparaturkosten_von (cost_of_repair_from) [317]
– anschaffungswert_von (acquisition_value_from) [4730]

Decision Diagrams (DD)
• Assumption: each case is a fixed list of attribute-value pairs (AVPs)
• Directed graph with a source and a sink
• Node labeled with Attribute (except sink)
• Edge labeled with value
• Case = path source … sink
• [Nicholson et al., 2006]

[Nicholson et al., 2006] R. Nicholson, D. Bridge, and N. Wilson. Decision diagrams: Fast and flexible support for case
retrieval and recommendation. In Mehmet H. Göker, Thomas Roth-Berghofer, and H. Altay Güvenir (eds.), Proceedings of
the 8th ECCBR'06, Ölüdeniz/Fethiye, Turkey, volume 4106 of LNCS, pages 136–150, Heidelberg, 2006. Springer Verlag.
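
The structure described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' .NET implementation: the names `Node`, `build_diagram`, `count_edges` and `EXAMPLE_CASES` are hypothetical, only three attributes are used, and the sketch shares common path prefixes and a single sink but does not merge identical sub-diagrams.

```python
from dataclasses import dataclass, field
from typing import Optional

# Fixed attribute order (assumption: three attributes taken from the example case slide).
ATTRIBUTES = ["plausibility", "cause_of_loss", "claim_object"]


@dataclass
class Node:
    attribute: Optional[str]                    # None marks the sink
    edges: dict = field(default_factory=dict)   # edge value -> successor Node


def build_diagram(cases):
    """Insert every case as a path source ... sink; equal prefixes share edges."""
    sink = Node(attribute=None)
    source = Node(attribute=ATTRIBUTES[0])
    for case in cases:
        node = source
        for depth, attr in enumerate(ATTRIBUTES):
            value = case[attr]
            if value not in node.edges:
                is_last = depth == len(ATTRIBUTES) - 1
                node.edges[value] = sink if is_last else Node(ATTRIBUTES[depth + 1])
            node = node.edges[value]
    return source, sink


def count_edges(source):
    """Count the edges reachable from the source, visiting each node once."""
    seen, stack, total = set(), [source], 0
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        total += len(node.edges)
        stack.extend(node.edges.values())
    return total


# Four rows of the example table, reduced to the three attributes above.
EXAMPLE_CASES = [
    {"plausibility": "true", "cause_of_loss": "water damage", "claim_object": "computer"},
    {"plausibility": "true", "cause_of_loss": "water damage", "claim_object": "computer"},
    {"plausibility": "false", "cause_of_loss": "water damage", "claim_object": "computer"},
    {"plausibility": "true", "cause_of_loss": "overvoltage", "claim_object": "washing machine"},
]
```

Inserting a case whose path already exists adds no edge in this sketch, which is the behaviour the build-up slide summarises as "only add edges for different paths".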

Decision Diagrams: Example
[Figure: decision diagram with one node level per attribute: plausibility, cause_of_loss, claim_object, object_age, acquisition_value]

Retrieval in Decision Diagrams
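• Look for the path (case) with the smallest distance (the value of f at the sink)
• Edge cost, where v is the query value and v' the value on the edge:

\[ \alpha(n \to n') = w_\alpha \cdot \mathrm{dist}_\alpha(v, v') \]

\[ f(n) \overset{\mathrm{def}}{=} \begin{cases} 0, & \text{if } n = \text{source} \\ \min_{n' \to n} \bigl( f(n') + \alpha(n' \to n) \bigr), & \text{else} \end{cases} \]

\[ g(n) \overset{\mathrm{def}}{=} \begin{cases} 0, & \text{if } n = \text{sink} \\ \min_{n \to n'} \bigl( \alpha(n \to n') + g(n') \bigr), & \text{else} \end{cases} \]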
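
As a rough illustration of the recursion for g, the following Python sketch computes g(source), which equals f(sink), on a toy diagram and returns the best path with it. Everything specific here is an assumption made for the example: the diagram `DIAGRAM`, the node names, the weights in `WEIGHTS`, and the 0/1 local distance used inside `alpha`. The slides do not specify these.

```python
from functools import lru_cache

# Hypothetical toy diagram: node -> list of (edge value, successor node).
DIAGRAM = {
    "source": [("true", "n1"), ("false", "n2")],             # plausibility
    "n1": [("water damage", "n3"), ("overvoltage", "n4")],   # cause_of_loss
    "n2": [("water damage", "n3")],                          # cause_of_loss
    "n3": [("computer", "sink"), ("laptop", "sink")],        # claim_object
    "n4": [("dryer", "sink")],                               # claim_object
    "sink": [],
}
# Attribute that labels each non-sink node.
LEVEL = {
    "source": "plausibility",
    "n1": "cause_of_loss",
    "n2": "cause_of_loss",
    "n3": "claim_object",
    "n4": "claim_object",
}
# Assumed attribute weights w_alpha.
WEIGHTS = {"plausibility": 1.0, "cause_of_loss": 2.0, "claim_object": 1.0}


def alpha(node, edge_value, query):
    """Edge cost w_alpha * dist_alpha(v, v') with an assumed 0/1 local distance."""
    attr = LEVEL[node]
    local_distance = 0.0 if query[attr] == edge_value else 1.0
    return WEIGHTS[attr] * local_distance


def retrieve(query):
    """Return (g(source), best path) for the query; g(source) equals f(sink)."""

    @lru_cache(maxsize=None)
    def g(node):
        if node == "sink":
            return 0.0, ()
        options = []
        for edge_value, child in DIAGRAM[node]:
            cost, path = g(child)
            options.append((alpha(node, edge_value, query) + cost, (edge_value,) + path))
        return min(options)

    return g("source")
```

For the query `{"plausibility": "false", "cause_of_loss": "overvoltage", "claim_object": "dryer"}`, the cheapest path is true, overvoltage, dryer, which mismatches only on plausibility. Memoising g per node means a node shared by several paths is evaluated once, and local distances are evaluated while the query is processed, in line with "during retrieval" in the general comparison.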

Retrieval in Decision Diagrams: Example

Case Retrieval Nets (CRN) [optional]
• An attribute-value pair is an Information Entity (IE)
• Case descriptor node
• A case is a sub-graph of the CRN
• [Lenz 1999]
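
A minimal Python sketch of these ideas follows, assuming the usual CRN retrieval by spreading activation: query IEs are activated, activation is propagated over similarity arcs to other IEs, and then over relevance arcs to case descriptor nodes. The class `CaseRetrievalNet`, its method names, the relevance weight of 1.0 and the similarity value 0.5 are hypothetical choices for illustration, not taken from the slides.

```python
from collections import defaultdict


class CaseRetrievalNet:
    def __init__(self):
        self.relevance = defaultdict(dict)    # IE -> {case id: relevance weight}
        self.similarity = defaultdict(dict)   # IE -> {other IE: similarity}

    def add_case(self, case_id, case, weight=1.0):
        """Link every IE (attribute, value) of the case to its case descriptor node."""
        for ie in case.items():
            self.relevance[ie][case_id] = weight
            self.similarity[ie][ie] = 1.0      # an IE is fully similar to itself

    def add_similarity(self, ie_a, ie_b, sim):
        """Store a symmetric similarity arc at build-up time."""
        self.similarity[ie_a][ie_b] = sim
        self.similarity[ie_b][ie_a] = sim

    def retrieve(self, query):
        """Spread activation from the query IEs to case nodes; best case first."""
        ie_activation = defaultdict(float)
        for ie in query.items():
            for other, sim in self.similarity.get(ie, {}).items():
                ie_activation[other] = max(ie_activation[other], sim)
        case_activation = defaultdict(float)
        for ie, activation in ie_activation.items():
            for case_id, weight in self.relevance.get(ie, {}).items():
                case_activation[case_id] += weight * activation
        return sorted(case_activation.items(), key=lambda item: item[1], reverse=True)


net = CaseRetrievalNet()
net.add_case(1, {"cause_of_loss": "water damage", "claim_object": "laptop"})
net.add_case(2, {"cause_of_loss": "overvoltage", "claim_object": "dryer"})
net.add_case(3, {"cause_of_loss": "water damage", "claim_object": "computer"})
net.add_similarity(("claim_object", "laptop"), ("claim_object", "computer"), 0.5)
```

A query that leaves out an attribute simply activates fewer IEs, so incomplete queries and cases need no special handling in this sketch.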

Decision Diagrams (DDs) vs. Case Retrieval Nets (CRN)
General Comparison
Feature | Decision Diagram | Case Retrieval Net
retrieval approach | index-oriented | index-oriented
calculation of local similarities/distances | during retrieval | during build-up
dealing with NULL values | as normal values (otherwise incomplete paths) | can be omitted
determination of similar cases | similarities | distances
adding new cases during lifetime | yes | yes
direct assignment of cases from index structure to case base | not directly | directly through ID in case descriptor
suitability for incomplete cases | no | yes
suitability for domains with a high number of different attribute values | no | no

DD vs. CRN: Comparison procedure
• Measurements processed
– For build-up
– For retrieval
– For insertion of 1 new case
– 5, 10 and all (13) attributes
• 1st-10th attribute: 725 different AVPs
• 1st-13th attribute: 5,455 different AVPs
– 1,000, 2,000, ..., 18,000 cases
– Average of the last 5 of 6 runs
• Tests run on an ordinary PC
– 1.83 GHz Core2 CPU, 2 GB RAM, Win XP
– Implemented in .NET

Selected attributes:
• 5: plausibility, cause of loss, condition, type, claim object
• 10: plausibility, cause of loss, condition, type, claim object, object age, present value to, present value from, cost of repair to, acquisition value to
• 13: plausibility, cause of loss, condition, type, claim object, object age, present value to, present value from, cost of repair to, acquisition value to, cost of repair from, date of receipt, acquisition value from
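
The averaging rule on this slide can be sketched as a small timing harness. This is a Python illustration only (the original tests were implemented in .NET). The helper names `mean_of_last` and `average_runtime` are hypothetical, and treating the discarded first run as a warm-up is an assumption, since the slides do not say why it is dropped.

```python
import time
from statistics import mean


def mean_of_last(durations, keep=5):
    """Average the last `keep` measurements, ignoring the earlier ones."""
    return mean(durations[-keep:])


def average_runtime(task, runs=6, keep=5):
    """Run `task` `runs` times and average the last `keep` wall-clock durations."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        task()
        durations.append(time.perf_counter() - start)
    return mean_of_last(durations, keep)
```

With six measurements `[9.0, 1.0, 2.0, 3.0, 4.0, 5.0]`, `mean_of_last` ignores the first value and returns 3.0.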

Compare Build-Up
• With 13 attributes
• DD is always faster (65% time on average)
– 22.35 vs. 41.34 sec
• DD: sigmoidal
– only add edges for different paths
• CRN: parabola
– add at least 13 relevance arcs + similarities
• Doesn't rise exponentially

Compare Retrieval in CRNs
• Case: 5, 10, 13 Attributes
• 5: 0.34 to 4.1 msec
• 10: 0.61 to 5.2 msec
• 13: 2 to 17.5 msec
• Each curve: nearly linear
• Outliers: due to internal .NET data structures

Compare Retrieval in DDs with Taxonomies
• Case: 5, 10, 13 attributes
• Increase is linear up to 9000 cases, afterwards constant
• 0.33 to 2.55 sec: slow!
– parser calls to calculate similarity in taxonomies
– omitted in the following test series

Retrieval CRNs vs. DDs w/o taxonomies
• CRN: 13: 2 to 17.5 msec
• DD: 5.8 to 69 msec
• CRNs are always faster

Compare Insertion of a case
[Figure: two charts, "Insertion into DD" (y-axis 0.0000 to 0.1600) and "Insertion into CRN" (y-axis 0.0000 to 0.0009); x-axis: number of cases, y-axis: runtime in seconds]

• Insert one case with 13 attributes
• DD: 24.6 to 142.8 msec
• CRN: around 0.7 msec: 30 to 183 times faster!

Conclusion & Future Work
• Comparison of Decision Diagrams with Case Retrieval Nets
– Build-up: DDs are faster (local similarities inserted in CRN)
– Retrieval: CRNs are faster (local similarities exist in CRN)

• In-depth investigation with the same dataset as [Nicholson et
al., 2006]
• Investigate, with artificial datasets, how the duration of build-up
and retrieval depends on the number and distribution of the
attribute values.

Thank you for your attention!

Questions | Suggestions | Comments
