The document discusses whether fractional norms and quasinorms can help to overcome the curse of dimensionality. It compares Minkowski (quasi)norms l_p for different values of p using relative contrast, coefficient of variation, and three measures of KNN classification accuracy. The results show that fractional quasinorms with small p have higher relative contrast and coefficient of variation, but this does not translate into better KNN classification performance. In fact, values of p around 0.5, 1, and 2 generally perform best, while extremely small or large values of p perform worse. The conclusion is therefore that fractional quasinorms do not overcome the curse of dimensionality in classification problems.
Do Fractional Norms Help Overcome the Curse of Dimensionality
1. Do Fractional Norms and Quasinorms
Help to Overcome
the Curse of Dimensionality?
Alexander N. Gorban
with Jeza Allohibi and Evgeny M. Mirkes
University of Leicester, UK
and Lobachevsky State University, Russia
2. Curse of dimensionality (Bellman, 1957)
Blessing of dimensionality (Kainen, 1997)
For a random sample in high-dimensional space
• Concentration of distances: Distances between almost all
pairs of points are almost equal;
• Quasiorthogonality: Vectors of the sample are almost
orthogonal (after centralization);
• Stochastic separation: Almost every point is linearly
separable from the set of all other points
With high probability,
for a wide class of distributions
and even for exponentially large samples
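A minimal numerical sketch of the first two effects (my own illustration, not from the talk): draw i.i.d. points from the cube [0,1]^n and watch the relative spread of pairwise distances shrink and the cosines between centred vectors concentrate around zero as n grows.

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)

for n in (2, 10, 100, 1000):
    X = rng.random((500, n))                    # 500 i.i.d. points in the cube [0,1]^n
    d = pdist(X)                                # all pairwise Euclidean distances
    spread = (d.max() - d.min()) / d.min()      # relative spread of distances
    Xc = X - X.mean(axis=0)                     # centre the sample
    cos = 1.0 - pdist(Xc, metric="cosine")      # pairwise cosines of centred vectors
    print(f"n={n:5d}  relative spread={spread:10.3f}  mean |cos|={np.abs(cos).mean():.3f}")
```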
3. Essentially high-dimensional
distributions
• Stochastic separation theorems and other
concentration results do not need hypotheses about
independence and uniform distribution of data.
• They do not need any other hypothesis about special
distributions like the Gaussian one.
• The main condition used instead of these
simplifications is that sets of small volume should
not have high probability (precise specifications of
what ‘small’ and ‘high’ mean here can be found in
the publications).
• In particular, general log-concave distributions can
be used instead of uniform or Gaussian ones; this is
just one example.
7. Fractional norms can compensate for
the curse of dimensionality???
(C.C. Aggarwal et al., 2001)
We select three measures to compare l_p for different p
(D_p is the set of l_p distances between points in a sample):
1. Relative contrast: RC_p = (max D_p − min D_p) / min D_p
2. Coefficient of variation: CV_p = √(var D_p) / mean D_p
3. Accuracy of KNN classification
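The sketch below (an assumed setup, not the authors' code) computes RC_p and CV_p for the l_p distances between points drawn uniformly from a cube, for the same values of p that appear later in the talk.

```python
import numpy as np

def lp_distances(X, p):
    """All pairwise l_p 'distances' between rows of X (a quasimetric for p < 1)."""
    diff = np.abs(X[:, None, :] - X[None, :, :])
    d = diff.max(axis=-1) if np.isinf(p) else (diff ** p).sum(axis=-1) ** (1.0 / p)
    return d[np.triu_indices(len(X), k=1)]

rng = np.random.default_rng(1)
X = rng.random((100, 30))                       # 100 points in the cube [0,1]^30
for p in (0.01, 0.1, 0.5, 1, 2, 4, 10, np.inf):
    D = lp_distances(X, p)
    rc = (D.max() - D.min()) / D.min()          # relative contrast RC_p
    cv = D.std() / D.mean()                     # coefficient of variation CV_p
    print(f"p={p:>5}  RC_p={rc:8.3f}  CV_p={cv:6.3f}")
```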
8. Relative contrast
Comparison of relative contrast for the Euclidean and
Manhattan metrics: for datasets of reasonable size,
P[RC_2 < RC_1] is close to 1 (equidistribution in the cube [0,1]^n)
Dim     P[RC_2 < RC_1] for number of points
        [Aggarwal] 10     10        20        100
1       0                 0         0         0
2       0.850             0.850     0.960     >0.999
3       0.887             0.930     0.996     >0.999
4       0.913             0.973     0.996     >0.999
10      0.956             0.994     >0.999    >0.999
15      0.961             >0.999    >0.999    >0.999
20      0.971             >0.999    >0.999    >0.999
100     0.982             >0.999    >0.999    >0.999
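A rough Monte Carlo check of this table (my own sketch; the sample size and number of trials are arbitrary): estimate P[RC_2 < RC_1] for samples drawn uniformly from [0,1]^n.

```python
import numpy as np
from scipy.spatial.distance import pdist

def relative_contrast(X, p):
    d = pdist(X, metric="minkowski", p=p)       # pairwise l_p distances (p >= 1 here)
    return (d.max() - d.min()) / d.min()

rng = np.random.default_rng(2)
n_points, trials = 20, 1000
for dim in (1, 2, 3, 4, 10, 15, 20, 100):
    wins = 0
    for _ in range(trials):
        X = rng.random((n_points, dim))         # equidistribution in the cube [0,1]^dim
        if relative_contrast(X, 2) < relative_contrast(X, 1):
            wins += 1
    print(f"dim={dim:4d}  P[RC_2 < RC_1] ≈ {wins / trials:.3f}")
```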
9. Relative contrast and
coefficient of variation
For almost all sufficiently rich datasets, the following
inequalities hold:
RC_p < RC_q, CV_p < CV_q, ∀ p > q
(equidistribution in the cube [0,1]^n)
10. Main questions:
A) What does “data dimension” mean?
B) Does a greater value of relative
contrast or coefficient of variation
mean a better quality of the
classifier?
11. Dimension definitions in use
• Number of attributes (#Attr)
• Number of informative principal components
according to the Kaiser rule (PCA-K)
• Number of informative principal components
according to the Broken stick rule (PCA-B)
• Number of informative principal components
according to the condition number rule (PCA-CN)
• Dimension according to the separability property
• Fractal dimension
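As an illustration, here is a sketch of two of the PCA-based estimates listed above, the Kaiser rule and the broken-stick rule, for a data matrix X with samples in rows. The function names are mine, and the thresholding details may differ from the implementation used in the study.

```python
import numpy as np

def pca_eigenvalues(X):
    """Eigenvalues of the covariance matrix of X (rows = samples), in descending order."""
    return np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]

def kaiser_dimension(X):
    """Number of principal components whose eigenvalue exceeds the mean eigenvalue (Kaiser rule)."""
    lam = pca_eigenvalues(X)
    return int(np.sum(lam > lam.mean()))

def broken_stick_dimension(X):
    """Number of leading components whose explained-variance fraction exceeds the
    broken-stick expectation for the corresponding piece."""
    lam = pca_eigenvalues(X)
    frac = lam / lam.sum()
    n = len(lam)
    stick = np.array([sum(1.0 / j for j in range(k + 1, n + 1)) / n for k in range(n)])
    dim = 0
    while dim < n and frac[dim] > stick[dim]:
        dim += 1
    return dim

# Hypothetical data: 20 attributes generated from roughly 5 latent dimensions
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 20)) + 0.1 * rng.normal(size=(200, 20))
print(kaiser_dimension(X), broken_stick_dimension(X))
```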
13. Comparison of accuracies for l_p
We select three measures of classification
accuracy:
1. Total Number of Neighbours of the Same Class
(TNNSC)
2. Accuracy (fraction of correctly recognised cases
among all cases)
3. Sensitivity plus specificity (true positive rate +
true negative rate)
For TNNSC and accuracy, proportion estimation
was used to assess the significance of differences
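A minimal sketch of the kind of experiment being compared (not the study's pipeline): leave-one-out KNN accuracy for different exponents p of the l_p (quasi)norm, using scikit-learn's wine dataset only as a stand-in for the benchmark databases used in the study.

```python
import numpy as np
from sklearn.datasets import load_wine

def lp_dist_matrix(X, p):
    """Matrix of pairwise l_p 'distances' between rows of X."""
    diff = np.abs(X[:, None, :] - X[None, :, :])
    return diff.max(-1) if np.isinf(p) else (diff ** p).sum(-1) ** (1.0 / p)

def loo_knn_accuracy(X, y, p, k=5):
    """Leave-one-out accuracy of k-nearest-neighbour voting with the l_p distance."""
    D = lp_dist_matrix(X, p)
    np.fill_diagonal(D, np.inf)                 # a point is not its own neighbour
    correct = 0
    for i in range(len(X)):
        nn = np.argsort(D[i])[:k]               # indices of the k nearest neighbours
        correct += np.bincount(y[nn]).argmax() == y[i]
    return correct / len(X)

X, y = load_wine(return_X_y=True)
X = (X - X.mean(axis=0)) / X.std(axis=0)        # standardise attributes
for p in (0.01, 0.1, 0.5, 1, 2, 4, 10, np.inf):
    print(f"p={p:>5}  LOO accuracy = {loo_knn_accuracy(X, y, p):.3f}")
```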
14. Comparison of several algorithms
To compare the performance of several algorithms
simultaneously, we applied the Friedman test (null
hypothesis: “all algorithms have the same performance”).
If the Friedman test identifies a performance
inequality among the tested algorithms, then the post
hoc Nemenyi test identifies pairs of algorithms
with statistically significantly different performance.
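A sketch of this testing scheme under an assumed data layout (one score per dataset and per value of p): SciPy provides the Friedman test, and a Nemenyi post hoc test is available, for example, in the scikit-posthocs package.

```python
import numpy as np
from scipy.stats import friedmanchisquare
import scikit_posthocs as sp

# Placeholder scores: rows = benchmark datasets, columns = tested values of p.
# In the real study each entry would be, e.g., the KNN accuracy for that dataset and p.
rng = np.random.default_rng(3)
scores = rng.random((30, 8))

stat, p_value = friedmanchisquare(*scores.T)    # null hypothesis: all columns perform the same
print(f"Friedman test: p-value = {p_value:.4g}")

if p_value < 0.05:
    # Pairwise post hoc comparisons; entry (i, j) is the Nemenyi p-value for columns i and j
    pairwise = sp.posthoc_nemenyi_friedman(scores)
    print(pairwise.round(3))
```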
15. Results
Green is the best, Yellow is the second best, Red is the worst
p for l_p:                                    0.01   0.1   0.5     1     2     4    10     ∞
TNNSC
  The best                                       1     5    10    13     4     6     1     3
  The worst                                     23     4     2     2     3     3     4     7
  Insignificantly different from the best       19    26    32    31    30    29    26    26
  Insignificantly different from the worst      36    24    22    21    22    22    26    26
Accuracy
  The best                                       2     7    15     8     8     3     3     6
  The worst                                     18     6     3     4     5     9     8     8
  Insignificantly different from the best       30    31    34    33    33    32    31    32
  Insignificantly different from the worst      36    33    31    31    31    32    33    32
Sensitivity plus specificity
  The best                                       5     8    13     6     9     3     4     5
  The worst                                     15     4     2     3     3     7     8    13
16. Results
Friedman test shows p-values of less than 0.0001 for
all tests.
Preprocessing         Indicator   Set of insignificantly different l_p
                                  (X per value of p among 0.01, 0.1, 0.5, 1, 2, 4, 10, ∞)
No preprocessing      TNNSC       X X X X X
                      Accuracy    X X X X
                      Se+Sp       X X X X
Standardisation       TNNSC       X X X
                      Accuracy    X X X
                      Se+Sp       X X X X
Standard dispersion   TNNSC       X X X X
                      Accuracy    X X X X
                      Se+Sp       X X X X
17. Conclusion
• For almost all sufficiently rich datasets, relative contrast
and coefficient of variation are smaller for greater degrees
p of the Minkowski metrics or quasimetrics l_p (fractional
quasimetrics with small p have greater relative contrast
and coefficient of variation).
• Greater values of relative contrast and coefficient of
variation do not mean better quality of KNN
classification.
• Differences in KNN performance for p = 0.5, 1, 2 are
statistically insignificant in all tests. Extremely small or
extremely large values of p give worse performance.
• Fractional quasinorms do not help to overcome the
curse of dimensionality in classification problems.
19. Some references 1
• C. C. Aggarwal, A. Hinneburg, and D. A. Keim, On the surprising
behavior of distance metrics in high dimensional space, in
International conference on database theory. Springer, 2001, pp.
420–434.
• P. C. Kainen, Utilizing geometric anomalies of high dimension:
When complexity makes computation easier, in Computer
Intensive Methods in Control and Signal Processing. Springer,
1997, pp. 283–294.
• P. Lévy, Problèmes concrets d’analyse fonctionnelle. Paris, France:
Gauthier-Villars, 1951.
• P. Kainen, V. Kůrková. Quasiorthogonal dimension of Euclidean
spaces. Appl. Math. Lett. 6 (1993), 7–10.
• A.N. Gorban, I.Y. Tyukin, D.V. Prokhorov, K.I. Sofeikov,
Approximation with random bases: Pro et Contra, Information
Sciences 364-365, (2016), 129-145.
20. Some references 2
• A.N. Gorban, I.Y. Tyukin. Stochastic Separation Theorems, Neural
Networks, 94, October 2017, 255-259.
• D. Donoho, J. Tanner. Observed universality of phase transitions in
high-dimensional geometry, with implications for modern data
analysis and signal processing, Philosophical Transactions of The
Royal Society A 367(1906), 20090152 (2009).
• A.N. Gorban, I.Y. Tyukin. Blessing of dimensionality: mathematical
foundations of the statistical physics of data. Philosophical
Transactions of The Royal Society A 376(2118), 20170237 (2018).
• A.N. Gorban, A. Golubkov, B. Grechuk, E.M. Mirkes, I.Y. Tyukin,
Correction of AI systems by linear discriminants: Probabilistic
foundations, Information Sciences 466 (2018), 303-322.
• A.N. Gorban, V.A. Makarov, I.Y. Tyukin, The unreasonable
effectiveness of small neural ensembles in high-dimensional
brain, Physics of Life Reviews, 2019,
https://doi.org/10.1016/j.plrev.2018.09.005