Flor incomp26 anim
1. Incomparable, what now?
IV: An (unexpected) modelling challenge
R. Bruggemann(1) and L. Carlsen(2)
(1): Leibniz-Institute of Freshwater Ecology and Inland
Fisheries, Berlin, Germany
(2): Awareness Center, Roskilde, Denmark
Flor_incomp25_anim.ppt 27.2.2015 – 4.4.2015
2. Ranking I
• Most often there is no measure for the
ranking aim
• Hence a multi-indicator system (MIS) is needed
as a proxy for the ranking aim. Examples:
poverty, sustainability, child well-being
3. Ranking I (cont‘d)
• MCDA methods, insofar as a ranking is intended,
construct a ranking index CI.
• The order due to CI is by construction a weak
(or, better, a linear) order.
• Hence, there are no incomparabilities.
Everything seems to be going well!
Really??
4. Consequences
• By construction a linear order is intended:
– No ambiguity
– Most often no ties
• A metric is available
• Modelling of the knowledge of
stakeholders/decision makers
5. However….
• Conflicts are hidden from the very beginning
• There may be robustness problems
• Difficulty in obtaining the required parameters (for
example, the weights) of the MCDA method
6. Ranking II
• HDT equation: x ≤ y :⇔ qi(x) ≤ qi(y) for all qi of the MIS
• The HDT equation is very strict. Incomparabilities arise:
– Even from minute numerical differences abs(qi(x) – qi(y))
– Regardless of how many indicator pairs taken from the MIS
contribute to x || y
– Regardless of whether an incomparability is induced by
contextually similar qi of the MIS
• How can knowledge that eliminates some incomparabilities
be modelled within the framework of HDT?
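The strictness of the HDT equation can be illustrated with a minimal sketch. The object names and indicator values below are hypothetical toy data (not the pesticide data), and the helper names `leq` and `incomparable_pairs` are chosen here for illustration only.

```python
from itertools import combinations

def leq(x, y):
    """x <= y in the product order: every indicator of x is <= that of y."""
    return all(a <= b for a, b in zip(x, y))

def incomparable_pairs(data):
    """Return the unordered pairs of objects that are incomparable (x || y)."""
    return [
        (a, b)
        for a, b in combinations(sorted(data), 2)
        if not leq(data[a], data[b]) and not leq(data[b], data[a])
    ]

# Hypothetical toy MIS with three indicators per object.
toy = {
    "A": (1.0, 1.0, 1.0),
    "B": (2.0, 3.0, 2.0),
    "C": (3.0, 2.0, 2.0),
    "D": (3.0, 3.0, 1.999),  # third indicator only minutely below B and C
}

pairs = incomparable_pairs(toy)
print(pairs, len(pairs))
```

In this toy set, D would dominate B and C if its third indicator were 2.0. The difference of 0.001 is enough to make D incomparable with both.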
7. Our talk here in Florence:
1. An empirical data set (taken from
environmental chemistry)
2. Weight intervals
3. Towards a controlling law
4. Results
5. Discussion, Future tasks
8. Pesticides in the environment
DDT, Aldrin (ALD), Hexachlorobenzene (HCB)
… and nine other pesticides.
How do they affect the environment? There is no single measure.
Hence a multi-indicator system:
Persistence (Pers), Bioaccumulation (BioA) and Toxicity (Tox)
9. Hasse diagram (HD) of 12 Pesticides
There are many incomparabilities (such as DDD || HCL),
so there is a need for modelling.
Pesticides: DDT, ALD, CHL, DDE, DDD, HCL, HCB, MEC, DIE, PCN, PCP, LIN
Characterized by the MIS {Pers, BioA, Tox}
U = |{(xi, xj) ∈ X × X : xi || xj, with i < j}| = 31
10. Modelling by weight intervals
1. Basic paper: Match – Commun. Math.
Comput. Chem. 2013, 69, 413-432
2. Idea: Let
CI(x) = Σ wi qi(x)
be the value of a composite indicator CI, one of
the simplest constructions to get a linear or
weak order.
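A minimal sketch of the weighted-sum composite indicator follows. The toy data and the two weight tuples are hypothetical; the function name `composite_indicator` is an illustration, not taken from the cited paper.

```python
def composite_indicator(q, w):
    """Weighted sum of the indicator values q with the weights w."""
    return sum(wi * qi for wi, qi in zip(w, q))

# Hypothetical toy data: three objects, three indicators.
toy = {
    "A": (1.0, 1.0, 1.0),
    "B": (2.0, 3.0, 2.0),
    "C": (3.0, 2.0, 2.0),
}

def ranking(data, w):
    """Object names sorted by increasing composite indicator."""
    return sorted(data, key=lambda name: composite_indicator(data[name], w))

rank_1 = ranking(toy, (0.5, 0.3, 0.2))  # B scores 2.3, C scores 2.5
rank_2 = ranking(toy, (0.2, 0.6, 0.2))  # B scores 2.6, C scores 2.2
print(rank_1, rank_2)
```

B and C are incomparable in the toy data, and the two weight tuples rank them in opposite order. Any single choice of weights resolves the conflict one way and hides it.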
11. Modelling by weight intervals (cont‘d)
1. The problem is the selection of weights, which
– introduces subjectivity and
– hides incomparabilities that express major conflicts
in the data
2. Hence: a relaxation by weight intervals
12. U = 5
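[Figure: weight intervals for Persistence, Bioaccumulation and Toxicity, each shown on the range of weights from 0 to 1; the sampled weights yield composite indicators CI1, CI2, …]
In our example: many weak orders.
Concatenation of these orders; algebraically: intersection of these orders.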
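The intersection of many weak orders can be sketched as a small Monte Carlo procedure. This is a sketch under stated assumptions, not the authors' implementation: weights are drawn uniformly from each interval and then normalized to sum to 1, and the toy data, the number of runs and the function names are hypothetical.

```python
import random
from itertools import combinations

def sample_weights(intervals, rng):
    """Draw one weight per interval, then normalize (assumed convention)."""
    raw = [rng.uniform(lo, hi) for lo, hi in intervals]
    total = sum(raw)
    return [r / total for r in raw]

def count_incomparable(data, intervals, runs=1000, seed=0):
    """U of the intersection of the weak orders induced by all sampled CIs."""
    rng = random.Random(seed)
    names = sorted(data)
    # below[(a, b)] stays True only while CI(a) <= CI(b) in every run so far.
    below = {(a, b): True for a in names for b in names if a != b}
    for _ in range(runs):
        w = sample_weights(intervals, rng)
        ci = {n: sum(wi * qi for wi, qi in zip(w, data[n])) for n in names}
        for a, b in list(below):
            if ci[a] > ci[b]:
                below[(a, b)] = False
    # A pair is incomparable if neither direction survived all runs.
    return sum(
        1 for a, b in combinations(names, 2)
        if not below[(a, b)] and not below[(b, a)]
    )

# Hypothetical toy data: B and C are incomparable in the original poset.
toy = {"A": (1.0, 1.0, 1.0), "B": (2.0, 3.0, 2.0), "C": (3.0, 2.0, 2.0)}

u_full = count_incomparable(toy, [(0.0, 1.0)] * 3)
u_point = count_incomparable(toy, [(0.5, 0.5), (0.3, 0.3), (0.2, 0.2)])
print(u_full, u_point)
```

With point intervals every run yields the same weak order, so no incomparability survives. With the full span the conflict between B and C reappears.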
13. Questions
1. What can be said about stability? Could
another Monte Carlo (MC) run change the
result?
2. How can we judge the role of the weight
intervals?
– Effect of the lengths of the intervals
– Effect of the location of the intervals
14. Idea
• Let Wi be the ith weight interval (located within the
span [0,1])
• U = f(Wi), i = 1,…,m; m: number of indicators (1)
• Equation (1) is by far too detailed
• Hence: introduction of an artificial parameter:
Vr = Π (realized wi,max – realized wi,min)
for all (realized wi,max – realized wi,min) ≠ 0
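A minimal sketch of the parameter Vr, assuming it is the product of the non-zero realized interval lengths and that Vr = 0 when every weight is fixed. The function name `realized_volume` and the sample weight tuples are hypothetical.

```python
from math import prod

def realized_volume(weight_samples):
    """Return (Vr, m_star) for a list of realized weight tuples.

    Vr is taken as the product of the non-zero realized interval lengths
    (an assumption); m_star is the number of indicators with non-zero length.
    """
    lengths = [max(col) - min(col) for col in zip(*weight_samples)]
    nonzero = [d for d in lengths if d != 0]
    v_r = prod(nonzero) if nonzero else 0.0
    return v_r, len(nonzero)

# Hypothetical realized weights from three MC runs; the second weight is fixed.
samples = [(0.2, 0.5, 0.3), (0.4, 0.5, 0.1), (0.3, 0.5, 0.2)]
v_r, m_star = realized_volume(samples)
print(v_r, m_star)
```

Here the first and third weights each span a length of about 0.2, so Vr is about 0.04 with m* = 2.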
15. Expectation
Expected incomparability U as a function of Vr:
• Vr = 0: certainty about the weights, i.e. exactly one weight for each indicator: U = 0
• 0 < Vr < 1: some uncertainty, hence intervals with lengths < 1: 0 < U < Umax
• Vr = 1: complete weight span, no knowledge, original poset: U = Umax
16. U = f(Vr)
• Given m indicators, let m* be the number
of indicators whose realized interval length is ≠ 0. Then:
• U = f(Vr, m*)
• Focus on the length of the intervals
• The position of the intervals is disregarded
17. Hypothesis
• U = Umax * Vr^s, with s = 1/m*
• Umax = U(Vr = 1)
• Vr = 1: all intervals have length 1, maximal
uncertainty, original poset (without any
weights)
• The values U(Vr) may be considered as
describing a kind of normal behaviour.
• Realistic assumption?
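The hypothesis can be written as a one-line function. This is a minimal sketch; the function name `power_law_u` is hypothetical, Umax = 31 and three indicators are taken from the pesticide example, and the Vr values are chosen only for illustration.

```python
def power_law_u(u_max, v_r, m_star):
    """Hypothesized U = Umax * Vr**s with s = 1/m_star; U = 0 at Vr = 0."""
    if v_r == 0:
        return 0.0
    return u_max * v_r ** (1.0 / m_star)

# Pesticide example: Umax = 31, three indicators with non-zero intervals.
for v_r in (0.0, 0.125, 1.0):
    print(v_r, power_law_u(31, v_r, 3))
```

With s = 1/3, a volume of Vr = 0.125 (for instance three intervals of length 0.5) gives about half of Umax.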
19. Results/Summary
• Modelling stakeholders' knowledge within the
framework of partial order theory: weight
intervals
• A function is needed to get an overview:
U = f(Vr)
• A power law seems to be a suitable approach
• Other modelling concepts (not shown here):
a power law seems to describe U = f(p) pretty well,
i.e. U = Umax * p^s, with p the method's parameter
20. Check of the ideas with another
dataset
• Polluted sites in south-west Germany
• Four indicators, 59 regions
• Power law seems to fail!
21. Results: Pollution in a South-Western
region of Germany
[Figure: Ucalc and Ureal; horizontal axis 0 to 1.2, vertical axis 0 to 1600]
The values obtained by the MC simulations seem to be
larger than the values obtained by the power law U = 1386 * V^(1/4)
The problem: in contrast to the pesticide data, the single indicators
induce orders with large equivalence classes,
i.e. the degree of degeneracy is high.
Umax = 1386
22. Pollution data
Playing with ideas…..
K(qi) describes the degeneracy with respect to indicator qi:
K(qi) = Σ Nj*(Nj – 1), summed over the equivalence classes of qi, with Nj = |jth equivalence class|
The degree of degeneracy k(qi) related to each indicator:
$$k(q_i) := \frac{\sum_j N_j (N_j - 1)}{n (n - 1)}$$
n: number of objects
1/m is valid only if no degeneracy appears, because
then each indicator contributes its own linear order to the poset.
Idea: (1/m)eff (effective) = f(m, k(qi))
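A minimal sketch of the degeneracy measures, assuming K(qi) sums N*(N – 1) over the equivalence classes (objects with equal values of qi) and k(qi) = K(qi) / (n*(n – 1)). The indicator column below is hypothetical; it is chosen so that 7 objects give K = 8, as for q1 of the constructed test data set.

```python
from collections import Counter

def degeneracy(values):
    """K: sum of N*(N-1) over the classes of objects with equal values."""
    return sum(size * (size - 1) for size in Counter(values).values())

def degree_of_degeneracy(values):
    """k = K / (n*(n-1)), with n the number of objects."""
    n = len(values)
    return degeneracy(values) / (n * (n - 1))

# Hypothetical indicator column: classes of sizes 3, 2, 1 and 1.
q = [1, 1, 1, 2, 2, 3, 4]
print(degeneracy(q), round(degree_of_degeneracy(q), 3))
```

An indicator without ties has K = 0 and k = 0; an indicator that is constant over all objects has k = 1.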
23. Approach:
$$\left(\frac{1}{m}\right)_{\mathrm{eff}} := \frac{1}{m} \prod_{i=1,\dots,m} \bigl(1 - k(q_i)\bigr)$$
Pesticide data: k(qi) = 0 for all i. No correction needed.
Pollution data: k(q1) = 0.074, k(q2) = 0.036, k(q3) = 0.037, k(q4) = 0.016
(1/m)eff = 0.21. UcalcP is calculated following (1/m)eff:
[Figure: Ureal and UcalcP (Eq. 2) plotted as U against Vr]
[Figure: delta1 and delta2 plotted against Vr]
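The effective exponent can be checked numerically. This sketch assumes (1/m)eff = (1/m) * Π (1 – k(qi)), which reproduces the value 0.21 quoted for the pollution data; the function names are hypothetical, and the Vr value in the last line is chosen only for illustration.

```python
from math import prod

def effective_exponent(k_values):
    """(1/m)_eff = (1/m) * product of (1 - k(q_i)) over all m indicators."""
    m = len(k_values)
    return prod(1.0 - k for k in k_values) / m

def u_calc(u_max, v_r, s):
    """Power law U = Umax * Vr**s, with U = 0 at Vr = 0."""
    return 0.0 if v_r == 0 else u_max * v_r ** s

k_pollution = [0.074, 0.036, 0.037, 0.016]  # values from the slide
s_eff = effective_exponent(k_pollution)
print(round(s_eff, 2))

# Compare the uncorrected exponent 1/4 with the effective one at Vr = 0.5.
print(u_calc(1386, 0.5, 0.25), u_calc(1386, 0.5, s_eff))
```

Because s_eff is smaller than 1/4, the corrected curve lies above the uncorrected one for 0 < Vr < 1.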
24. Future Work
• Explain why a power law is the correct law!
• Find out whether Vr is a good choice
• Find methods for a correct interpretation
• Are there other, more reasonable controlling
parameters?
• Is s generally well described as a function of
1/m and k(qi)? Could we do better?
27. Remarks
1) “Differential” view of the effect of V on the CIs
2) Linear orders w.r.t. the indicators or any CI imply equal distance to
the original poset
3) In linear orders the number of 1s in the ζ-matrix equals n*(n-1)/2
and is independent of the particular CI.
All 1s in ζ for the original poset are realized in every ζ(CI).
Therefore the distances of all CIs to the original poset are the same,
namely (in the case of the pesticides) 31
4) If weak orders with nontrivial equivalence classes appear instead of
linear orders, then irregularities appear, as in the case of the second data set
$$\zeta_{i,j} = \begin{cases} 1 & \text{if } x_i \le x_j \\ 0 & \text{otherwise} \end{cases} \qquad i, j = 1, \dots, n$$
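Remarks 3 and 4 can be illustrated with a small sketch. Two assumptions are made here and are not stated on the slide: the diagonal of the matrix is excluded, and the distance between two orders is the number of matrix entries in which they differ. The toy data and function names are hypothetical.

```python
def zeta_matrix(rows):
    """Entry [i][j] is 1 if object i <= object j in every component (i != j)."""
    n = len(rows)
    return [
        [
            1 if i != j and all(a <= b for a, b in zip(rows[i], rows[j])) else 0
            for j in range(n)
        ]
        for i in range(n)
    ]

def distance(z1, z2):
    """Number of entries in which two matrices differ (assumed measure)."""
    return sum(a != b for r1, r2 in zip(z1, z2) for a, b in zip(r1, r2))

# Hypothetical toy data: objects A, B, C with three indicators; U = 1 (B || C).
rows = [(1.0, 1.0, 1.0), (2.0, 3.0, 2.0), (3.0, 2.0, 2.0)]
orig = zeta_matrix(rows)

# Orders induced by the single indicators q1, q2 (linear) and q3 (B and C tied).
single = [zeta_matrix([(r[i],) for r in rows]) for i in range(3)]
distances = [distance(orig, z) for z in single]
print(distances)
```

The two linear orders are both at distance U = 1 from the original poset. The weak order due to q3 is further away, because its tie between B and C adds an entry in both directions.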
28. Example wi (i = 1, 2, 3 and Σ wi = 1)
[Figure: diagram with axes w1, w2 and w3 showing three selected weight tuples w*(1), w*(2), w*(3)]
Three tuples w* selected
30. Pollution data (59 objects, 4 indicators)
Distances between the orders:
          orig    Pb    Cd    Zn     S  CIrelat
orig         0  1513  1448  1449  1414     1386
Pb        1513     0  1851  1884  1455     1359
Cd        1448  1851     0  1183  1922      742
Zn        1449  1884  1183     0  1703     1185
S         1414  1455  1922  1703     0     1534
CIrelat   1386  1359   742  1185  1534        0
Orig (poset): max distance 1513
• The attributes Pb, Cd, Zn and S induce weak orders with
pretty large nontrivial equivalence classes.
Degeneracy (K): CI 0; Pb 254; Cd 124; Zn 126; S 56
• The assumption s = ¼ may not be a good one, because some
indicators are insufficient in differentiating the order.
31. Check of the exponent s in U = Umax * V^s
Pollution data
[Figure: Ureal, Ucalc3 (s = 0.1) and Ucalc4 (s = 0.2) plotted as U against V]
Hypothesis: s = f(m*, K)
32. Discussion
• Selection of interval length:
– Seems to describe the behaviour of U crudely
– The position of the intervals for the individual indicators is
considered “fine tuning”
– There is no need to find a law which exactly
meets the points (U, Vr).
• Main effect at the very beginning of the Vr-scale:
Vr = 0 → Vr > 0
– Sensitivity?
– Distances (orders due to single indicators)
33. V 0
• V = 0 means we start from a weak order,
• V > 0 means there is an influence from the other indicators
[Figure: distance graph between Orig (partial order), Pers, BioA, Tox and a composite indicator due to equal weights; edge labels: 58, 16, 50, 31, 31, 31, 46, 16, 12, 31]
34. The behaviour U = f(Vr) could be as
follows:
[Figure: sketch of U against Vr, between U = 0 at Vr = 0 and Umax at Vr = 1, with a range around the curve]
The range is due to
a) the MC runs and
b) how different weight intervals are associated with the indicators
35. Constructed test data set
7 objects, three indicators. By construction a high degree of
degeneracy: K(q1) = 8, K(q2) = 8, K(q3) = 14.
i.e. k(q1) = 0.19, k(q2) = 0.19, k(q3) = 0.333
[Figure: three series; horizontal axis 0 to 1.2, vertical axis 0 to 14]
U: calculated individually
Ucalc: calculated by Eq. 2
Ucalcwithoutdeg: calculated without regarding degeneracy
36. Balance equation
[Diagram: poset with the equivalence classes {a, b} and {d, e, f} and the object c]
V = |{(c,a),(c,b),(c,d),(c,e),(c,f),(a,b),(b,a),(d,e),(e,d),(d,f),(f,d),(e,f),(f,e)}| = 13
U = |{(a,d),(a,e),(a,f),(b,d),(b,e),(b,f)}| = 6
K = |{(a,b),(b,a)} ∪ {(d,e),(e,d),(d,f),(f,d),(e,f),(f,e)}| = 8
2*U + 2*V – K = n*(n-1)
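The balance equation can be verified by counting. In this sketch the two-indicator values are hypothetical; they are chosen only so that a and b are tied, d, e and f are tied, c lies below all other objects, and {a, b} is incomparable with {d, e, f}.

```python
from itertools import permutations

# Hypothetical indicator values reproducing the poset of this slide.
rows = {
    "a": (2, 1), "b": (2, 1),
    "c": (0, 0),
    "d": (1, 2), "e": (1, 2), "f": (1, 2),
}

def leq(x, y):
    """x <= y in every indicator."""
    return all(p <= q for p, q in zip(x, y))

pairs = list(permutations(rows, 2))  # all ordered pairs (x, y) with x != y

# V: ordered pairs with x <= y (tied objects count in both directions).
V = sum(1 for x, y in pairs if leq(rows[x], rows[y]))
# K: ordered pairs of distinct objects with identical values.
K = sum(1 for x, y in pairs if rows[x] == rows[y])
# U: unordered incomparable pairs, hence the division by 2.
U = sum(
    1 for x, y in pairs
    if not leq(rows[x], rows[y]) and not leq(rows[y], rows[x])
) // 2

n = len(rows)
print(V, U, K, 2 * U + 2 * V - K, n * (n - 1))
```

Counting 2*V covers each strictly ordered pair in both directions but counts each tied pair twice too often per direction, which is why K is subtracted once.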
37. Discussion
• Assume U = f(p) can be established; then
• the control is concerned with stability, not…
• …with the values of p at which the “best” partial order
will be found (for fuzzy
modelling see Annoni et al., 2008; De Baets et al.,
2011)
• The distance graph will help us to assess the
effect of mixing the indicators through their weight
intervals
• To what extent Vinput can be seen as a better
guiding quantity is a task for the future