1/16
18.650 – Fundamentals of Statistics
8. Principal Component Analysis (PCA)
2/16
Multivariate statistics
▶ Let $X$ be a $d$-dimensional random vector and $X_1, \ldots, X_n$ be $n$ independent copies of $X$.
▶ Write $X_i = \big(X_i^{(1)}, \ldots, X_i^{(d)}\big)^\top$, $i = 1, \ldots, n$.
▶ Denote by $\mathbb{X}$ the random $n \times d$ matrix
$$
\mathbb{X} = \begin{pmatrix} \cdots & X_1^\top & \cdots \\ & \vdots & \\ \cdots & X_n^\top & \cdots \end{pmatrix}.
$$
3/16
Multivariate statistics
▶ Assume that $\mathbb{E}\big[\|X\|_2^2\big] < \infty$.
▶ Mean of $X$:
$$
\mathbb{E}[X] = \big(\mathbb{E}[X^{(1)}], \ldots, \mathbb{E}[X^{(d)}]\big)^\top.
$$
▶ Covariance matrix of $X$: the matrix $\Sigma = (\sigma_{j,k})_{j,k=1,\ldots,d}$, where $\sigma_{j,k} = \mathrm{cov}\big(X^{(j)}, X^{(k)}\big)$.
▶ It is easy to see that
$$
\Sigma = \mathbb{E}[XX^\top] - \mathbb{E}[X]\,\mathbb{E}[X]^\top = \mathbb{E}\big[(X - \mathbb{E}[X])(X - \mathbb{E}[X])^\top\big].
$$
4/16
Multivariate statistics
▶ Empirical mean of $X_1, \ldots, X_n$:
$$
\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i = \big(\bar{X}^{(1)}, \ldots, \bar{X}^{(d)}\big)^\top.
$$
▶ Empirical covariance of $X_1, \ldots, X_n$: the matrix $S = (s_{j,k})_{j,k=1,\ldots,d}$, where $s_{j,k}$ is the empirical covariance of the $X_i^{(j)}, X_i^{(k)}$, $i = 1, \ldots, n$.
▶ It is easy to see that
$$
S = \frac{1}{n}\sum_{i=1}^n X_i X_i^\top - \bar{X}\bar{X}^\top = \frac{1}{n}\sum_{i=1}^n \big(X_i - \bar{X}\big)\big(X_i - \bar{X}\big)^\top.
$$
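As a quick numerical check of the identity above, here is a minimal NumPy sketch (the synthetic data and variable names are ours, not from the slides) comparing the two expressions for $S$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 3
X = rng.standard_normal((n, d))           # rows are the observations X_i
Xbar = X.mean(axis=0)                     # empirical mean

S1 = X.T @ X / n - np.outer(Xbar, Xbar)   # (1/n) sum X_i X_i^T - Xbar Xbar^T
S2 = (X - Xbar).T @ (X - Xbar) / n        # (1/n) sum (X_i - Xbar)(X_i - Xbar)^T
assert np.allclose(S1, S2)                # the two expressions for S agree
```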
5/16
Multivariate statistics
▶ Note that $\bar{X} = \frac{1}{n}\mathbb{X}^\top \mathbb{1}$, where $\mathbb{1} = (1, \ldots, 1)^\top \in \mathbb{R}^n$.
▶ Note also that
$$
S = \frac{1}{n}\mathbb{X}^\top\mathbb{X} - \frac{1}{n^2}\mathbb{X}^\top \mathbb{1}\mathbb{1}^\top \mathbb{X} = \frac{1}{n}\mathbb{X}^\top H \mathbb{X}, \qquad \text{where } H = I_n - \frac{1}{n}\mathbb{1}\mathbb{1}^\top.
$$
▶ $H$ is an orthogonal projector: $H^2 = H$, $H^\top = H$. (Onto what subspace?)
▶ If $u \in \mathbb{R}^d$:
  ▶ $u^\top \Sigma u = \mathrm{var}\big(u^\top X\big)$;
  ▶ $u^\top S u$ is the sample variance of $u^\top X_1, \ldots, u^\top X_n$.
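A small sketch of these facts in NumPy (synthetic data, names ours), checking that $H$ is an orthogonal projector and that $u^\top S u$ is the sample variance of the projections:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 3
X = rng.standard_normal((n, d))

H = np.eye(n) - np.ones((n, n)) / n       # H = I_n - (1/n) 1 1^T
assert np.allclose(H @ H, H)              # H^2 = H
assert np.allclose(H, H.T)                # H^T = H

S = X.T @ H @ X / n                       # S = (1/n) X^T H X
u = rng.standard_normal(d)
u /= np.linalg.norm(u)                    # unit vector in R^d
assert np.isclose(u @ S @ u, (X @ u).var())   # sample variance of the u^T X_i (1/n convention)
```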
6/16
Multivariate statistics
▶ In particular, $u^\top S u$ measures how spread out (i.e., diverse) the points are in the direction $u$.
▶ If $u^\top S u = 0$, then all the $X_i$'s lie in an affine subspace orthogonal to $u$.
▶ If $u^\top \Sigma u = 0$, then $X$ lies almost surely in an affine subspace orthogonal to $u$.
▶ If $u^\top S u$ is large with $\|u\|_2 = 1$, then the direction $u$ explains well the spread (i.e., diversity) of the sample.
7/16
Review of linear algebra
▶ In particular, $\Sigma$ and $S$ are symmetric and positive semi-definite.
▶ Any real symmetric matrix $A \in \mathbb{R}^{d \times d}$ has the spectral decomposition
$$
A = PDP^\top,
$$
where:
  ▶ $P$ is a $d \times d$ orthogonal matrix, i.e., $PP^\top = P^\top P = I_d$;
  ▶ $D$ is diagonal.
▶ The diagonal elements of $D$ are the eigenvalues of $A$, and the columns of $P$ are the corresponding eigenvectors of $A$.
▶ $A$ is positive semi-definite iff all its eigenvalues are nonnegative.
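For concreteness, a minimal sketch of the spectral decomposition via NumPy's `eigh` (the example matrix is ours):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                     # a symmetric example matrix
eigvals, P = np.linalg.eigh(A)                 # A = P diag(eigvals) P^T
assert np.allclose(P @ np.diag(eigvals) @ P.T, A)
assert np.allclose(P.T @ P, np.eye(2))         # P is orthogonal
assert np.all(eigvals >= 0)                    # A is PSD: eigenvalues are 1 and 3
```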
8/16
Principal Component Analysis
▶ The sample $X_1, \ldots, X_n$ forms a cloud of points in $\mathbb{R}^d$.
▶ In practice, $d$ is large. If $d > 3$, it becomes impossible to represent the cloud in a picture.
▶ Question: Is it possible to project the cloud onto a linear subspace of dimension $d' \ll d$ while keeping as much information as possible?
▶ Answer: PCA does this by keeping as much of the covariance structure as possible, i.e., by retaining orthogonal directions that discriminate well between the points of the cloud.
9/16
Variances
▶ Idea: Write $S = PDP^\top$, where:
  ▶ $P = (v_1, \ldots, v_d)$ is an orthogonal matrix, i.e., $\|v_j\|_2 = 1$, $v_j^\top v_k = 0$, $\forall j \neq k$;
  ▶ $D = \mathrm{diag}(\lambda_1, \ldots, \lambda_d) = \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_d \end{pmatrix}$, with $\lambda_1 \ge \ldots \ge \lambda_d \ge 0$.
▶ Note that $D$ is the empirical covariance matrix of the $P^\top X_i$'s, $i = 1, \ldots, n$.
▶ In particular, $\lambda_1$ is the empirical variance of the $v_1^\top X_i$'s, $\lambda_2$ is the empirical variance of the $v_2^\top X_i$'s, etc.
10/16
Projection
▶ So each $\lambda_j$ measures the spread of the cloud in the direction $v_j$.
▶ In particular, $v_1$ is the direction of maximal spread.
▶ Indeed, $v_1$ maximizes the empirical variance of $a^\top X_1, \ldots, a^\top X_n$ over all $a \in \mathbb{R}^d$ such that $\|a\|_2 = 1$.
▶ Proof: For any unit vector $a$, show that
$$
a^\top S a = \big(P^\top a\big)^\top D \big(P^\top a\big) \le \lambda_1,
$$
with equality if $a = v_1$.
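Filling in the one missing step: write $b = P^\top a$; since $P$ is orthogonal, $\|b\|_2 = \|a\|_2 = 1$, so
$$
a^\top S a = b^\top D b = \sum_{j=1}^d \lambda_j b_j^2 \le \lambda_1 \sum_{j=1}^d b_j^2 = \lambda_1,
$$
with equality when $b = e_1$, i.e., $a = P e_1 = v_1$.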
11/16
Principal Component Analysis: Main principle
▶ Idea of PCA: Find the collection of orthogonal directions in which the cloud is most spread out.

Theorem
$$
v_1 \in \mathop{\mathrm{argmax}}_{\|u\|=1} u^\top S u, \qquad
v_2 \in \mathop{\mathrm{argmax}}_{\|u\|=1,\; u \perp v_1} u^\top S u, \qquad \cdots \qquad
v_d \in \mathop{\mathrm{argmax}}_{\|u\|=1,\; u \perp v_j,\, j=1,\ldots,d-1} u^\top S u.
$$
Hence, the $k$ orthogonal directions in which the cloud is most spread out correspond exactly to the eigenvectors associated with the $k$ largest eigenvalues of $S$. They are called the principal directions.
12/16
Principal Component Analysis: Algorithm
1. Input: $X_1, \ldots, X_n$: cloud of $n$ points in dimension $d$.
2. Step 1: Compute the empirical covariance matrix $S$.
3. Step 2: Compute the spectral decomposition $S = PDP^\top$, where $D = \mathrm{diag}(\lambda_1, \ldots, \lambda_d)$ with $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_d$, and $P = (v_1, \ldots, v_d)$ is an orthogonal matrix.
4. Step 3: Choose $k < d$ and set $P_k = (v_1, \ldots, v_k) \in \mathbb{R}^{d \times k}$.
5. Output: $Y_1, \ldots, Y_n$, where $Y_i = P_k^\top X_i \in \mathbb{R}^k$, $i = 1, \ldots, n$.
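A direct NumPy sketch of these five steps (the function name `pca` and the demo data are ours, not from the slides); note that `np.linalg.eigh` returns eigenvalues in increasing order, so they must be reversed to get $\lambda_1 \ge \ldots \ge \lambda_d$:

```python
import numpy as np

def pca(X, k):
    """Project the rows of X (an n x d data matrix) onto the top k principal directions."""
    n = X.shape[0]
    # Step 1: empirical covariance matrix S
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n
    # Step 2: spectral decomposition S = P D P^T, eigenvalues sorted decreasingly
    lam, P = np.linalg.eigh(S)
    lam, P = lam[::-1], P[:, ::-1]
    # Step 3: keep the first k eigenvectors
    Pk = P[:, :k]
    # Output: Y_i = Pk^T X_i for each row of X
    return X @ Pk, lam

# usage: reduce a cloud of 100 points in R^5 to R^2
rng = np.random.default_rng(0)
Y, lam = pca(rng.standard_normal((100, 5)), k=2)
```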
Question: How to choose $k$?
13/16
How to choose the number of principal components k?
▶ Experimental rule: Take $k$ at an inflection point of the sequence $\lambda_1, \ldots, \lambda_d$ (scree plot).
▶ Define a criterion: Take $k$ such that
$$
\text{proportion of explained variance} = \frac{\lambda_1 + \ldots + \lambda_k}{\lambda_1 + \ldots + \lambda_d} \ge 1 - \alpha,
$$
for some $\alpha \in (0, 1)$ that determines the approximation error that the practitioner wants to achieve.
▶ Remark: $\lambda_1 + \ldots + \lambda_k$ is called the variance explained by the PCA, and $\lambda_1 + \ldots + \lambda_d = \mathrm{tr}(S)$ is the total variance.
▶ Data visualization: Take $k = 2$ or $3$.
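The explained-variance criterion is easy to apply given the $\lambda_j$'s; a sketch (the function name `choose_k` is ours), reusing the decreasing eigenvalues returned by the `pca` sketch above:

```python
import numpy as np

def choose_k(lam, alpha=0.1):
    """Smallest k with (lam_1 + ... + lam_k) / (lam_1 + ... + lam_d) >= 1 - alpha.

    lam must be sorted in decreasing order; alpha in (0, 1) is the
    approximation error the practitioner is willing to tolerate.
    """
    ratio = np.cumsum(lam) / np.sum(lam)           # proportions of explained variance
    return int(np.searchsorted(ratio, 1 - alpha) + 1)
```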
14/16
Example: Expression of 500,000 genes among 1400 Europeans
15/16
Principal Component Analysis - Beyond practice
▶ PCA is an algorithm that reduces the dimension of a cloud of points while keeping as much of its covariance structure as possible.
▶ In practice, this algorithm is used for clouds of points that are not necessarily random.
▶ In statistics, PCA can be used for estimation.
▶ If $X_1, \ldots, X_n$ are i.i.d. random vectors in $\mathbb{R}^d$, how can we estimate their population covariance matrix $\Sigma$?
▶ If $n \gg d$, then the empirical covariance matrix $S$ is a consistent estimator.
▶ In many applications, $n \ll d$ (e.g., gene expression). Solution: sparse PCA.
16/16
Principal Component Analysis - Beyond practice
▶ It may be known beforehand that $\Sigma$ has (almost) low rank.
▶ Then, run PCA on $S$: write $S \approx S'$, where
$$
S' = P \,\mathrm{diag}(\lambda_1, \ldots, \lambda_k, 0, \ldots, 0)\, P^\top.
$$
▶ $S'$ will be a better estimator of $\Sigma$ under the low-rank assumption.
▶ A theoretical analysis would lead to an optimal choice of the tuning parameter $k$.
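A sketch of this truncation in NumPy (the function name `truncate` is ours); it zeroes out all but the top $k$ eigenvalues of $S$:

```python
import numpy as np

def truncate(S, k):
    """Rank-k truncation S' = P diag(lam_1, ..., lam_k, 0, ..., 0) P^T of a symmetric PSD matrix."""
    lam, P = np.linalg.eigh(S)            # eigenvalues in ascending order
    lam, P = lam[::-1], P[:, ::-1]        # sort decreasingly
    lam[k:] = 0.0                         # keep only the k largest eigenvalues
    return (P * lam) @ P.T                # equals P @ np.diag(lam) @ P.T
```

Since `truncate(S, k)` has rank at most $k$, it matches the low-rank prior on $\Sigma$.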