Avinash Bapu Sreenivas
4359023
Numerical Investigation for Information
Tracking of Noisy and Non-smooth data in
Large-scale Statistics
Overview
» Introduction
» Regularization and its importance
» Total Variation (TV) regularization
  • Principle
  • Formula
  • Importance
» Determination of regularization parameter
  • Eye-balling (trial and error) method
  • L-curve method
  • Normalized Cumulative Periodogram
  • Generalized Cross Validation
» Summary
23 April 2021 | Speaker | Information tracking of noisy data | Page 2
Introduction
1. Understanding the term “large-scale data”
Characteristics of large-scale data:
• Large volume
• High sample size
• High growth rate
2. Complexities surrounding large-scale data
Statistical analysis of large-scale data is:
• Tedious
• Complex
• Computationally challenging
Reasons:
• Measurement error
• Sample error
• Human error
These types of errors result in additional meaningless information, termed “noise”.
3. Major consequence of noise in the field of engineering:
• Derivatives are crucial in engineering
• Applying the finite difference method to noisy data leads to amplification of noise
Therefore, regularization methods are employed to overcome the amplification.
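The amplification of noise by finite differences can be illustrated with a minimal sketch. The signal (a sine), the grid size and the noise level below are assumptions chosen only for illustration; they are not the test functions used in the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Smooth signal sampled on a uniform grid (hypothetical example data).
x = np.linspace(0.0, 2.0 * np.pi, 200)
dx = x[1] - x[0]
sigma = 0.01  # standard deviation of the additive white Gaussian noise
f_noisy = np.sin(x) + sigma * rng.standard_normal(x.size)

# Forward differences approximate the derivative at the interval midpoints.
d_noisy = np.diff(f_noisy) / dx
d_exact = np.cos(x[:-1] + dx / 2.0)

noise_in = sigma
noise_out = np.std(d_noisy - d_exact)  # roughly sigma * sqrt(2) / dx
print(f"noise in data: {noise_in:.3f}, noise in derivative: {noise_out:.3f}")
```

Because the difference of two independent noise samples is divided by the small step `dx`, the error in the derivative grows as the grid is refined, which is the motivation for regularization.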
Regularization and its importance
1. Aim of regularization
• To overcome the problem of overfitting of data arising from a highly unstable RSS.
• Achieved by the introduction of an additional term, known as the penalty term.
2. General representation (labelled parts: derivative of “f”, penalty term, data fidelity)
3. Detailed understanding of the aforementioned terms:
• Data fidelity term: Residual Sum of Squares (RSS)
Contd.
• Penalty term: made up of a regularization parameter and a regularization term
Based on the penalty term, some of the well-known regularization methods are:
• Ridge (L2)
• LASSO (L1) [Least Absolute Shrinkage and Selection Operator]
• TV regularization
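The three penalty terms can be compared with a short sketch. The function names and the discrete forms below are assumptions for illustration (sum of squares for ridge, sum of absolute values for LASSO, sum of absolute differences of neighbouring entries for TV); `alpha` plays the role of the regularization parameter.

```python
import numpy as np

def ridge_penalty(u, alpha):
    """L2 (ridge) penalty: alpha times the sum of squared entries."""
    u = np.asarray(u, dtype=float)
    return alpha * np.sum(u ** 2)

def lasso_penalty(u, alpha):
    """L1 (LASSO) penalty: alpha times the sum of absolute entries."""
    u = np.asarray(u, dtype=float)
    return alpha * np.sum(np.abs(u))

def tv_penalty(u, alpha):
    """Discrete total variation: alpha times the sum of absolute jumps."""
    u = np.asarray(u, dtype=float)
    return alpha * np.sum(np.abs(np.diff(u)))

step = [0.0, 0.0, 1.0, 1.0]              # one sharp jump from 0 to 1
ramp = [0.0, 1.0 / 3.0, 2.0 / 3.0, 1.0]  # gradual rise from 0 to 1
print(tv_penalty(step, 1.0), tv_penalty(ramp, 1.0))
```

The TV penalty depends only on the total rise, so a sharp jump costs the same as a gradual ramp between the same values. This is why TV regularization does not smear out jump discontinuities.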
Comparison between L1 & L2 regularization
1. Geometric representation
[Figure: geometric representation of L2 and L1 regularization]
Total Variation (TV) regularization
1. General form of TV regularization (labelled parts: RSS, penalty term, derivative of solution ‘u’)
2. Basic principle of TV regularization
• General form of the Euler-Lagrange equation
• Solution methods:
a) Gradient descent (using the Euler-Lagrange equation)
b) Lagged diffusivity fixed point method
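For orientation, a formulation of TV-regularized differentiation that is common in the literature is sketched below. It is an assumption used for illustration and is not taken from the slide: u approximates the derivative of the data f (with f(0) = 0), A is the antiderivative operator, and alpha is the regularization parameter.

```latex
% Assumed functional: RSS (data fidelity) plus TV penalty term.
F(u) = \underbrace{\tfrac{1}{2}\int_0^L \lvert Au - f \rvert^2 \, dx}_{\text{RSS (data fidelity)}}
     + \underbrace{\alpha \int_0^L \lvert u' \rvert \, dx}_{\text{penalty term}},
\qquad (Au)(x) = \int_0^x u(s)\, ds

% Euler-Lagrange equation of F (formally, where u' \neq 0):
\alpha \, \frac{d}{dx}\!\left( \frac{u'}{\lvert u' \rvert} \right) - A^{T}(Au - f) = 0

% Gradient descent evolves u towards a stationary point of F:
\frac{\partial u}{\partial t}
  = \alpha \, \frac{d}{dx}\!\left( \frac{u'}{\lvert u' \rvert} \right) - A^{T}(Au - f)
```

In practice the term |u'| is usually replaced by sqrt((u')^2 + epsilon) with a small epsilon > 0 so that the division is well defined where u' = 0.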
Contd.
Applying the Euler-Lagrange equation coupled with the gradient descent method to TV regularization:
• Final equation, solved with the lagged diffusivity method
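A lagged diffusivity fixed point iteration can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation behind the slides: the derivative `u` lives on the interval midpoints, the antiderivative operator `A` uses a simple rectangle rule, and the names `tv_derivative`, `n_iter` and `eps` are hypothetical.

```python
import numpy as np

def tv_derivative(f, dx, alpha, n_iter=50, eps=1e-6):
    """Estimate f' from noisy samples f by TV-regularized differentiation.

    Minimizes 0.5*||A u - (f[1:] - f[0])||^2 + alpha*dx*sum(sqrt((D u)^2 + eps))
    with the lagged diffusivity fixed point iteration.
    """
    f = np.asarray(f, dtype=float)
    m = f.size - 1                                  # unknowns on interval midpoints
    A = dx * np.tril(np.ones((m, m)))               # cumulative integration (rectangle rule)
    D = (np.eye(m, k=1) - np.eye(m))[:-1] / dx      # forward differences of u
    AtA = A.T @ A
    rhs = A.T @ (f[1:] - f[0])
    u = np.diff(f) / dx                             # initial guess: naive finite differences
    for _ in range(n_iter):
        # Diffusivity weights are "lagged": computed from the previous iterate.
        w = 1.0 / np.sqrt((D @ u) ** 2 + eps)
        L = dx * D.T @ (w[:, None] * D)
        u = np.linalg.solve(AtA + alpha * L, rhs)
    return u
```

Each pass freezes the nonlinear weight 1/|u'| at the previous iterate, so only a linear system has to be solved per step. Larger values of `alpha` give flatter, more strongly regularized derivatives.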
Contd.
3. Importance of TV regularization
• Inclusion of jump discontinuities
• Determination of discontinuous derivatives
4. As the standard deviation increases, stronger parameter values are needed for satisfactory results
Eye-balling technique
1. Trial and error method
2. For large-scale data, the method is
a) Inefficient
b) Infeasible
Contd.
1. Numerical and approximated function values of TF1
[Figure: Comparison between numerical function and approximated function. x-axis: data points; y-axis: function values. Curves: known function, alpha=0.0005, alpha=0.21, alpha=5. Annotations: over-fit, under-fit, good-fit.]
Test function 1
Table: data points | given function | standard deviation of AWGN | noisy function
[Figure: Test function 1, graphical representation of tabular values. x-axis: data points; y-axis: function values. Curves: given data, noisy data.]
L-curve method
1. Estimation of L-curve parameter. For each regularization parameter (RP) value:
• Determine the solution norm (a)
• Calculate the residual norm (b)
• Determine log10(a) and log10(b)
Then plot the graph of (log10 b, log10 a).
Contd.
2. Estimation of curvature of L-curve graph:
• Determine the curvature using 3 consecutive points of the curve (distance formula)
• Plot the curvature
• Select the corner (point of maximum curvature)
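The two L-curve steps can be sketched as follows. The curvature through three consecutive points is computed here as the Menger curvature (curvature of the circle through the points), which needs only the distances between them. That choice, and the names `menger_curvature` and `lcurve_corner`, are assumptions for illustration. The curvature is unsigned, so the sketch presumes the curve has the usual L shape.

```python
import numpy as np

def menger_curvature(p, q, r):
    """Curvature of the circle through three 2-D points, from side lengths only."""
    a = np.linalg.norm(q - p)
    b = np.linalg.norm(r - q)
    c = np.linalg.norm(r - p)
    s = 0.5 * (a + b + c)
    # Heron's formula for the squared triangle area (clipped against round-off).
    area_sq = max(s * (s - a) * (s - b) * (s - c), 0.0)
    denom = a * b * c
    return 0.0 if denom == 0.0 else 4.0 * np.sqrt(area_sq) / denom

def lcurve_corner(residual_norms, solution_norms):
    """Return the index of maximum curvature on the log-log L-curve and all curvatures."""
    pts = np.column_stack((np.log10(residual_norms), np.log10(solution_norms)))
    curv = np.array([
        menger_curvature(pts[i - 1], pts[i], pts[i + 1])
        for i in range(1, len(pts) - 1)
    ])
    # Curvature is defined for interior points only, hence the offset of 1.
    return int(np.argmax(curv)) + 1, curv
```

The residual and solution norms would be computed once per RP value from the regularized solution; the returned index then identifies the RP value at the corner.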
Results regarding TF1
[Figures: function values; curvature plot]
Results for Test function 2
Table: data points | given function | standard deviation of AWGN | noisy function
[Figure: graphical representation of tabular values. x-axis: data points; y-axis: function values. Curves: given function, noisy function.]
Results for Test function 3
Table: data points | given function | standard deviation of AWGN | noisy function
[Figure: graphical representation of tabular values. x-axis: data points; y-axis: function values. Curves: given function, noisy function.]
Contd.
Advantages and disadvantages of L-curve:
Advantages:
1. Highly robust method
2. Clear indication of over-fit and under-fit conditions
3. The “corner” balances the two conditions
Disadvantages:
1. Special cases: 2 “corners”
• Global corner
• New corner
Manually check for the better-yielding solution
2. Increase in problem size: over-regularization or large parameter value
Normalized Cumulative Periodogram (NCP)
1. Aim and objective of NCP
» NCP mainly deals with the residual vector r
» Extraction of relevant information from the data such that only noise remains in r
» The optimal regularization parameter marks the transition of r from signal dominance to white-noise characteristics
For each RP value:
• Discrete Fourier transform of r
• Periodogram
• Length of residual vector
• NCP of residual vector
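The NCP steps can be sketched as follows. The sketch assumes the usual convention of dropping the zero-frequency component and keeping the first half of the spectrum. The function names and the distance measure (Euclidean distance to the straight line expected for white noise) are assumptions for illustration.

```python
import numpy as np

def ncp(residual):
    """Normalized cumulative periodogram of a residual vector r."""
    r = np.asarray(residual, dtype=float)
    q = r.size // 2
    power = np.abs(np.fft.fft(r)) ** 2   # periodogram from the discrete Fourier transform
    power = power[1:q + 1]               # drop the DC term, keep positive frequencies
    return np.cumsum(power) / np.sum(power)

def ncp_distance(residual):
    """Distance between the NCP and the straight line of ideal white noise."""
    c = ncp(residual)
    line = np.arange(1, c.size + 1) / c.size
    return np.linalg.norm(c - line)
```

For white noise the power is spread evenly over all frequencies, so the NCP is close to a straight line. Among the candidate RP values, the one whose residual has the smallest `ncp_distance` would be selected.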
Results for TF1
[Figure: NCP graph for various regularization parameters (RP). x-axis: length of residual vector; y-axis: NCP values. Curves: RP(1)=0.01, RP(2)=0.06, RP(8)=0.36, RP(12)=0.56, RP(18)=0.86, RP(19)=0.91, RP(20)=0.96, RP(25)=1.21, RP(33)=1.61, RP(40)=1.96. Annotation: Gaussian white noise characteristics.]
Results for TF2 and TF3
[Figure: NCP graph for various regularization parameters, with panels for test function 2 and test function 3. x-axis: length of residual vector; y-axis: NCP values. Each line shows the NCP values for one of 48 different RP values.]
Contd.
Advantages and disadvantages of NCP:
Advantages:
1. Computationally inexpensive
2. Deals only with the residual vector, so it is more stable for high regularization parameters
Disadvantages:
1. Deals mainly with white noise
• Constant power spectral density over frequency
• Random errors independent of each other
2. Dealing with different types of noise requires the covariance matrix to be known
NOTE: For a noise vector, the covariance matrix distinguishes the white noise model from the colored noise model.
Generalized Cross Validation (GCV)
1. Objective of GCV:
» Helps determine the optimal regularization parameter even when
a) the exact data are unknown
b) there is no knowledge of the noise variance
Hence it proves to be an effective statistical tool.
2. Mathematics behind GCV:
» System definition (symbols: dataset, identity matrix, differentiation operator, noisy data)
» GCV function (symbols: GCV, trace of a matrix)
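The GCV function can be sketched for a linear smoother. The sketch swaps in a quadratic (Tikhonov-type) penalty on the differentiation operator `D`, for which the influence matrix is (I + alpha D^T D)^{-1}; this is an assumption made so that the trace is easy to write down, and the slide's own system definition may differ. The names `gcv_score` and `gcv_select` are hypothetical.

```python
import numpy as np

def gcv_score(y, alpha, D):
    """GCV(alpha) = n * ||(I - H) y||^2 / trace(I - H)^2 with H = (I + alpha D^T D)^-1."""
    n = y.size
    identity = np.eye(n)
    influence = np.linalg.solve(identity + alpha * (D.T @ D), identity)
    residual = y - influence @ y
    return n * (residual @ residual) / np.trace(identity - influence) ** 2

def gcv_select(y, alphas, D):
    """Return the candidate alpha with the smallest GCV score, plus all scores."""
    scores = np.array([gcv_score(y, a, D) for a in alphas])
    return alphas[int(np.argmin(scores))], scores
```

Only the noisy data `y` enter the score; neither the exact data nor the noise variance is needed. A first-difference operator can be built with `np.diff(np.eye(n), axis=0)`.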
Results of GCV
[Figures: GCV results for test function 1 and test function 2]
NOTE: The optimal regularization parameter is the minimum of the GCV function.
Advantages and disadvantages of GCV:
Advantages:
Unknown noise level
• GCV can be employed
• Provides satisfactory parameter value
Disadvantages:
In small samples
• Noise with smaller standard deviation is hard to capture
• Characterized by under-regularization (small parameter value)
Application of TV regularization in data-driven process
Problem description
[Figure: polynomial graph of the form -0.3x + 0.2x^2 + 0.1x^3 + 0.5x^4. x-axis: tspan; y-axis: polynomial values.]
Contd.
Convergence and optimal parameter value
Contd.
[Figure: analytical values]
The graph shows that the optimal parameter provides satisfactory results.
Summary
1. Regularization is one of the most effective approaches for tracking information in noisy data.
2. Total variation regularization was investigated thoroughly, and the following conclusions are drawn:
» The optimal regularization parameter plays a crucial role in the analysis of noisy data.
» Determining the optimal parameter is of paramount importance but complex.
» Different methods were tested to determine the optimal parameter value. They differ in philosophy but provide satisfactory values:
  • L-curve
  • Normalized Cumulative Periodogram (NCP)
  • Generalized Cross Validation (GCV)
3. A data-driven system was employed to verify the results.
Discussion
Thank you all for your patience
Appendix
Results for TF2
[Figures: derivative values; function values]
Results for TF3
[Figures: derivative values; function values]
2. Based on the penalty term, the following two properties are discussed:
» Robustness
• L1: more robust
• L2: less robust
» Computational effort
• L1: more computational effort
• L2: less computational effort
3. Sparsity
» Shrinkage of coefficients to zero
Description of test functions
Test function 1
Table: data points | given function | standard deviation of AWGN | noisy function
[Figure: Test function 1, graphical representation of tabular values. x-axis: data points; y-axis: function values. Curves: given data, noisy data.]
Test function 2
Table: data points | given function | standard deviation of AWGN | noisy function
[Figure: Test function 2, graphical representation of tabular values. x-axis: data points; y-axis: function values. Curves: given function, noisy function.]
Test function 3
Table: data points | given function | standard deviation of AWGN | noisy function
[Figure: Test function 3, graphical representation of tabular values. x-axis: data points; y-axis: function values. Curves: given function, noisy function.]
Generalized Cross Validation (GCV)
1. Objective of GCV:
» Helps determine the optimal regularization parameter even when
a) the exact data are unknown
b) there is no knowledge of the noise variance
Hence it proves to be an effective statistical tool.
2. Fundamental formula surrounding GCV (symbols: GCV, dataset, identity matrix, noisy data, trace of a matrix)
Comparison of known and determined function using GCV for TF1
[Figure: known and determined function for TF1] Satisfactory results
Concept behind Data-driven (sparse regression) method
1. Aim of the data-driven method
» Obtaining the governing equations (PDEs) of a “spatiotemporal” system.
2. Principle of the data-driven method
» Efficient determination of coefficients at a fixed spatial location.
3. General formulation of the data-driven (sparse regression) method
» Non-linear form of a PDE (labelled parts: unknowns; partial derivative w.r.t. time or space; parameters (coefficients))
» Discretization of the nonlinear PDE (labelled parts: right-hand side of the nonlinear PDE; matrix consisting of derivative parameters; sparse vector of the coefficients of the PDE, which is the parameter to be determined)
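The discretized problem, a right-hand side equal to a library matrix times a sparse coefficient vector, can be solved with sequentially thresholded least squares. That solver is a common choice for sparse regression and is assumed here for illustration; the slides do not state which solver was used. The names `stlsq`, `theta`, `rhs` and `threshold` are hypothetical.

```python
import numpy as np

def stlsq(theta, rhs, threshold=0.1, n_iter=10):
    """Sparse solution xi of theta @ xi ~ rhs by thresholded least squares."""
    xi = np.linalg.lstsq(theta, rhs, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0                  # discard candidate terms with tiny coefficients
        big = ~small
        if not big.any():
            break
        # Refit only the terms that survived the thresholding.
        xi[big] = np.linalg.lstsq(theta[:, big], rhs, rcond=None)[0]
    return xi
```

Each column of `theta` holds one candidate term (for example a derivative of the unknown) evaluated at the sample points, so the nonzero entries of `xi` indicate which terms appear in the governing equation. When `theta` is built from noisy data, its derivative columns are where a regularized derivative would be used.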
More Related Content

Similar to Master thesis

23 an investigation on image 233 241
23 an investigation on image 233 24123 an investigation on image 233 241
23 an investigation on image 233 241
Alexander Decker
 
CaoTupinThursday20110722.ppt
CaoTupinThursday20110722.pptCaoTupinThursday20110722.ppt
CaoTupinThursday20110722.pptgrssieee
 
An efficient hardware logarithm generator with modified quasi-symmetrical app...
An efficient hardware logarithm generator with modified quasi-symmetrical app...An efficient hardware logarithm generator with modified quasi-symmetrical app...
An efficient hardware logarithm generator with modified quasi-symmetrical app...
IJECEIAES
 
Towards better performance: phase congruency based face recognition
Towards better performance: phase congruency based face recognitionTowards better performance: phase congruency based face recognition
Towards better performance: phase congruency based face recognition
TELKOMNIKA JOURNAL
 
Analysis of Phase Noise and Gaussian Noise in terms of Average BER for DP 16-...
Analysis of Phase Noise and Gaussian Noise in terms of Average BER for DP 16-...Analysis of Phase Noise and Gaussian Noise in terms of Average BER for DP 16-...
Analysis of Phase Noise and Gaussian Noise in terms of Average BER for DP 16-...
IRJET Journal
 
An efficient technique for Image Resolution Enhancement using Discrete and St...
An efficient technique for Image Resolution Enhancement using Discrete and St...An efficient technique for Image Resolution Enhancement using Discrete and St...
An efficient technique for Image Resolution Enhancement using Discrete and St...
IRJET Journal
 
Backscatter Working Group Software Inter-comparison Project Requesting and Co...
Backscatter Working Group Software Inter-comparison ProjectRequesting and Co...Backscatter Working Group Software Inter-comparison ProjectRequesting and Co...
Backscatter Working Group Software Inter-comparison Project Requesting and Co...
Giuseppe Masetti
 
IRJET-Denoising of Images using Wavelet Transform,Weiner Filter and Soft Thre...
IRJET-Denoising of Images using Wavelet Transform,Weiner Filter and Soft Thre...IRJET-Denoising of Images using Wavelet Transform,Weiner Filter and Soft Thre...
IRJET-Denoising of Images using Wavelet Transform,Weiner Filter and Soft Thre...
IRJET Journal
 
Geometric Approach to Spectral Substraction
Geometric Approach to Spectral SubstractionGeometric Approach to Spectral Substraction
Geometric Approach to Spectral Substractionkeerthi thallam
 
Estimation of Separation and Location of Wave Emitting Sources : A Comparison...
Estimation of Separation and Location of Wave Emitting Sources : A Comparison...Estimation of Separation and Location of Wave Emitting Sources : A Comparison...
Estimation of Separation and Location of Wave Emitting Sources : A Comparison...
sipij
 
Developing digital signal clustering method using local binary pattern histog...
Developing digital signal clustering method using local binary pattern histog...Developing digital signal clustering method using local binary pattern histog...
Developing digital signal clustering method using local binary pattern histog...
IJECEIAES
 
Dcp project
Dcp projectDcp project
Dcp project
Chetan Soni
 
Amplification, ROADM and Optical Networking activities at CPqD
Amplification, ROADM and Optical Networking activities at CPqDAmplification, ROADM and Optical Networking activities at CPqD
Amplification, ROADM and Optical Networking activities at CPqDCPqD
 
Mm3422322236
Mm3422322236Mm3422322236
Mm3422322236
IJERA Editor
 
IRJET- Wavelet Transform along with SPIHT Algorithm used for Image Compre...
IRJET-  	  Wavelet Transform along with SPIHT Algorithm used for Image Compre...IRJET-  	  Wavelet Transform along with SPIHT Algorithm used for Image Compre...
IRJET- Wavelet Transform along with SPIHT Algorithm used for Image Compre...
IRJET Journal
 
Noise resistance territorial intensity-based optical flow using inverse confi...
Noise resistance territorial intensity-based optical flow using inverse confi...Noise resistance territorial intensity-based optical flow using inverse confi...
Noise resistance territorial intensity-based optical flow using inverse confi...
journalBEEI
 
The application wavelet transform algorithm in testing adc effective number o...
The application wavelet transform algorithm in testing adc effective number o...The application wavelet transform algorithm in testing adc effective number o...
The application wavelet transform algorithm in testing adc effective number o...
ijcsit
 
W4101139143
W4101139143W4101139143
W4101139143
IJERA Editor
 
Sinan Sami.docx
Sinan Sami.docxSinan Sami.docx
Sinan Sami.docx
raaed5
 

Similar to Master thesis (20)

23 an investigation on image 233 241
23 an investigation on image 233 24123 an investigation on image 233 241
23 an investigation on image 233 241
 
CaoTupinThursday20110722.ppt
CaoTupinThursday20110722.pptCaoTupinThursday20110722.ppt
CaoTupinThursday20110722.ppt
 
An efficient hardware logarithm generator with modified quasi-symmetrical app...
An efficient hardware logarithm generator with modified quasi-symmetrical app...An efficient hardware logarithm generator with modified quasi-symmetrical app...
An efficient hardware logarithm generator with modified quasi-symmetrical app...
 
Towards better performance: phase congruency based face recognition
Towards better performance: phase congruency based face recognitionTowards better performance: phase congruency based face recognition
Towards better performance: phase congruency based face recognition
 
paper573
paper573paper573
paper573
 
Analysis of Phase Noise and Gaussian Noise in terms of Average BER for DP 16-...
Analysis of Phase Noise and Gaussian Noise in terms of Average BER for DP 16-...Analysis of Phase Noise and Gaussian Noise in terms of Average BER for DP 16-...
Analysis of Phase Noise and Gaussian Noise in terms of Average BER for DP 16-...
 
An efficient technique for Image Resolution Enhancement using Discrete and St...
An efficient technique for Image Resolution Enhancement using Discrete and St...An efficient technique for Image Resolution Enhancement using Discrete and St...
An efficient technique for Image Resolution Enhancement using Discrete and St...
 
Backscatter Working Group Software Inter-comparison Project Requesting and Co...
Backscatter Working Group Software Inter-comparison ProjectRequesting and Co...Backscatter Working Group Software Inter-comparison ProjectRequesting and Co...
Backscatter Working Group Software Inter-comparison Project Requesting and Co...
 
IRJET-Denoising of Images using Wavelet Transform,Weiner Filter and Soft Thre...
IRJET-Denoising of Images using Wavelet Transform,Weiner Filter and Soft Thre...IRJET-Denoising of Images using Wavelet Transform,Weiner Filter and Soft Thre...
IRJET-Denoising of Images using Wavelet Transform,Weiner Filter and Soft Thre...
 
Geometric Approach to Spectral Substraction
Geometric Approach to Spectral SubstractionGeometric Approach to Spectral Substraction
Geometric Approach to Spectral Substraction
 
Estimation of Separation and Location of Wave Emitting Sources : A Comparison...
Estimation of Separation and Location of Wave Emitting Sources : A Comparison...Estimation of Separation and Location of Wave Emitting Sources : A Comparison...
Estimation of Separation and Location of Wave Emitting Sources : A Comparison...
 
Developing digital signal clustering method using local binary pattern histog...
Developing digital signal clustering method using local binary pattern histog...Developing digital signal clustering method using local binary pattern histog...
Developing digital signal clustering method using local binary pattern histog...
 
Dcp project
Dcp projectDcp project
Dcp project
 
Amplification, ROADM and Optical Networking activities at CPqD
Amplification, ROADM and Optical Networking activities at CPqDAmplification, ROADM and Optical Networking activities at CPqD
Amplification, ROADM and Optical Networking activities at CPqD
 
Mm3422322236
Mm3422322236Mm3422322236
Mm3422322236
 
IRJET- Wavelet Transform along with SPIHT Algorithm used for Image Compre...
IRJET-  	  Wavelet Transform along with SPIHT Algorithm used for Image Compre...IRJET-  	  Wavelet Transform along with SPIHT Algorithm used for Image Compre...
IRJET- Wavelet Transform along with SPIHT Algorithm used for Image Compre...
 
Noise resistance territorial intensity-based optical flow using inverse confi...
Noise resistance territorial intensity-based optical flow using inverse confi...Noise resistance territorial intensity-based optical flow using inverse confi...
Noise resistance territorial intensity-based optical flow using inverse confi...
 
The application wavelet transform algorithm in testing adc effective number o...
The application wavelet transform algorithm in testing adc effective number o...The application wavelet transform algorithm in testing adc effective number o...
The application wavelet transform algorithm in testing adc effective number o...
 
W4101139143
W4101139143W4101139143
W4101139143
 
Sinan Sami.docx
Sinan Sami.docxSinan Sami.docx
Sinan Sami.docx
 

Recently uploaded

Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
manasideore6
 
A review on techniques and modelling methodologies used for checking electrom...
A review on techniques and modelling methodologies used for checking electrom...A review on techniques and modelling methodologies used for checking electrom...
A review on techniques and modelling methodologies used for checking electrom...
nooriasukmaningtyas
 
6th International Conference on Machine Learning & Applications (CMLA 2024)
6th International Conference on Machine Learning & Applications (CMLA 2024)6th International Conference on Machine Learning & Applications (CMLA 2024)
6th International Conference on Machine Learning & Applications (CMLA 2024)
ClaraZara1
 
PROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.ppt
PROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.pptPROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.ppt
PROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.ppt
bhadouriyakaku
 
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsKuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
Victor Morales
 
TOP 10 B TECH COLLEGES IN JAIPUR 2024.pptx
TOP 10 B TECH COLLEGES IN JAIPUR 2024.pptxTOP 10 B TECH COLLEGES IN JAIPUR 2024.pptx
TOP 10 B TECH COLLEGES IN JAIPUR 2024.pptx
nikitacareer3
 
Modelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdfModelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdf
camseq
 
digital fundamental by Thomas L.floydl.pdf
digital fundamental by Thomas L.floydl.pdfdigital fundamental by Thomas L.floydl.pdf
digital fundamental by Thomas L.floydl.pdf
drwaing
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
obonagu
 
spirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptxspirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptx
Madan Karki
 
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
insn4465
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
Amil Baba Dawood bangali
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Sreedhar Chowdam
 
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdf
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdfTutorial for 16S rRNA Gene Analysis with QIIME2.pdf
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdf
aqil azizi
 
ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024
Rahul
 
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
bakpo1
 
Unbalanced Three Phase Systems and circuits.pptx
Unbalanced Three Phase Systems and circuits.pptxUnbalanced Three Phase Systems and circuits.pptx
Unbalanced Three Phase Systems and circuits.pptx
ChristineTorrepenida1
 
sieving analysis and results interpretation
sieving analysis and results interpretationsieving analysis and results interpretation
sieving analysis and results interpretation
ssuser36d3051
 
Swimming pool mechanical components design.pptx
Swimming pool  mechanical components design.pptxSwimming pool  mechanical components design.pptx
Swimming pool mechanical components design.pptx
yokeleetan1
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
zwunae
 

Recently uploaded (20)

Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
 
A review on techniques and modelling methodologies used for checking electrom...
A review on techniques and modelling methodologies used for checking electrom...A review on techniques and modelling methodologies used for checking electrom...
A review on techniques and modelling methodologies used for checking electrom...
 
6th International Conference on Machine Learning & Applications (CMLA 2024)
6th International Conference on Machine Learning & Applications (CMLA 2024)6th International Conference on Machine Learning & Applications (CMLA 2024)
6th International Conference on Machine Learning & Applications (CMLA 2024)
 
PROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.ppt
PROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.pptPROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.ppt
PROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.ppt
 
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsKuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
 
TOP 10 B TECH COLLEGES IN JAIPUR 2024.pptx
TOP 10 B TECH COLLEGES IN JAIPUR 2024.pptxTOP 10 B TECH COLLEGES IN JAIPUR 2024.pptx
TOP 10 B TECH COLLEGES IN JAIPUR 2024.pptx
 
Modelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdfModelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdf
 
digital fundamental by Thomas L.floydl.pdf
digital fundamental by Thomas L.floydl.pdfdigital fundamental by Thomas L.floydl.pdf
digital fundamental by Thomas L.floydl.pdf
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
 
spirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptxspirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptx
 
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
 
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdf
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdfTutorial for 16S rRNA Gene Analysis with QIIME2.pdf
Tutorial for 16S rRNA Gene Analysis with QIIME2.pdf
 
ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024
 
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
 
Unbalanced Three Phase Systems and circuits.pptx
Unbalanced Three Phase Systems and circuits.pptxUnbalanced Three Phase Systems and circuits.pptx
Unbalanced Three Phase Systems and circuits.pptx
 
sieving analysis and results interpretation
sieving analysis and results interpretationsieving analysis and results interpretation
sieving analysis and results interpretation
 
Swimming pool mechanical components design.pptx
Swimming pool  mechanical components design.pptxSwimming pool  mechanical components design.pptx
Swimming pool mechanical components design.pptx
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单专业办理
 

Master thesis

  • 1. Platzhalter für Bild, Bild auf Titelfolie hinter das Logo einsetzen Avinash Bapu Sreenivas 4359023 Numerical Investigation for Information Tracking of Noisy and Non-smooth data in Large-scale Statistics
  • 2. Overview  Introduction  Regularization and its importance  Total Variation (TV) regularization • Principle • Formula • Importance  Determination of regularization parameter • Eye-balling (Trial and error) method • L-curve method • Normalized Cumulative Periodogram • Generalized Cross Validation  Summary 23. April 2021 | Referent | Information tracking of noisy data | Seite 2
  • 3. Introduction 1. Understanding the term “large-scale data” 2. Complexities surrounding large-scale data • Measurement error • Sample error • Human error 23. April 2021 | Referent | Information tracking of noisy data | Seite 3 Large scale data Large volume High sample size High growth rate Characteristics Complexities Statistical analysis of large scale data Tedious Complex Computationally challenging Reasons These types of errors results in additional meaningless information termed as “noise”.
  • 4. 3. Major consequence of noise in field of engineering, 23. April 2021 | Referent | Information tracking of noisy data | Seite 4 Apply Derivatives are crucial in engineering Finite difference method Noisy data Amplification of noise Therefore , regularization methods are employed to overcome the amplification
  • 5. Regularization and its importance. 1. Aim of regularization: to overcome the problem of overfitting arising from a highly unstable RSS; this is achieved by introducing an additional term, known as the penalty term. 2. General representation (labelled terms: derivative of “f”, penalty term, data fidelity). 3. Detailed understanding of the aforementioned terms: the data fidelity term is the residual sum of squares (RSS).
  • 6. Contd. Penalty (labelled terms: regularization parameter, regularization term). Based on the penalty term, some well-known regularization methods are: Ridge (L2), LASSO (L1) [Least Absolute Shrinkage and Selection Operator], and TV regularization.
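As a rough illustration of how the three penalty terms named on this slide differ, the sketch below evaluates each on a short discrete vector. The function names and sample vectors are assumptions for illustration, and the discrete TV form used here (sum of absolute first differences) is the common textbook discretization, not necessarily the one used in the thesis.

```python
import numpy as np

def ridge_penalty(u):
    """Squared L2 norm of the vector (Ridge-type penalty)."""
    return float(np.sum(u ** 2))

def lasso_penalty(u):
    """L1 norm of the vector (LASSO-type penalty)."""
    return float(np.sum(np.abs(u)))

def tv_penalty(u):
    """Discrete total variation: sum of absolute first differences."""
    return float(np.sum(np.abs(np.diff(u))))

step = np.array([0.0, 0.0, 1.0, 1.0])     # one clean jump
wiggle = np.array([0.0, 1.0, 0.0, 1.0])   # same values, oscillating

tv_step = tv_penalty(step)       # a single jump of height 1
tv_wiggle = tv_penalty(wiggle)   # three jumps of height 1
```

The L1 and L2 penalties cannot tell the two vectors apart, while the TV penalty charges the oscillating one three times as much. This is the property that lets TV suppress noise while still admitting a jump.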
  • 7. Comparison between L1 and L2 regularization. 1. Geometric representation (figure panels: L2, L1).
  • 8. Total Variation (TV) regularization. 1. General form of TV regularization (labelled terms: RSS, penalty term, derivative of solution ‘u’). 2. Basic principle of TV regularization: a) gradient descent (using the Euler-Lagrange equation), b) lagged diffusivity fixed point method. General form of the Euler-Lagrange equation.
  • 9. Contd. Applying the Euler-Lagrange equation coupled with the gradient descent method to TV regularization gives the final equation, which is solved with the lagged diffusivity method.
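The lagged diffusivity idea can be sketched for the simplest TV problem, denoising a 1D signal. This is an assumption-laden illustration rather than the thesis's derivative-estimation setup: the forward operator is taken to be the identity, the TV term is smoothed with a small `eps`, and the function name, `alpha`, and the test signal are all hypothetical.

```python
import numpy as np

def tv_denoise_lagged(f, alpha, n_iter=50, eps=1e-6):
    """Minimise 0.5*||u - f||^2 + alpha*sum(sqrt((D u)^2 + eps)) by lagged diffusivity."""
    n = f.size
    D = np.diff(np.eye(n), axis=0)           # (n-1) x n forward-difference matrix
    u = f.copy()
    for _ in range(n_iter):
        # Freeze the nonlinear weight at the previous iterate ("lagged")
        w = 1.0 / np.sqrt((D @ u) ** 2 + eps)
        L = D.T @ (w[:, None] * D)           # weighted diffusion operator
        # Each step is then a linear solve
        u = np.linalg.solve(np.eye(n) + alpha * L, f)
    return u

rng = np.random.default_rng(1)
clean = np.where(np.arange(100) < 50, 0.0, 1.0)    # step with one jump
noisy = clean + 0.1 * rng.standard_normal(clean.size)
smooth = tv_denoise_lagged(noisy, alpha=0.5)

tv_noisy = np.sum(np.abs(np.diff(noisy)))
tv_smooth = np.sum(np.abs(np.diff(smooth)))
```

Freezing the weight turns the nonlinear Euler-Lagrange equation into a sequence of linear systems, which is why the method is often preferred over plain gradient descent with a small step size.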
  • 10. Contd. 3. Importance: inclusion of jump discontinuities; determination of discontinuous derivatives. 4. As the standard deviation increases, stronger parameter values are needed for satisfactory results.
  • 11. Eye-balling technique. 1. Trial and error method. 2. For large-scale data it is a) inefficient and b) infeasible.
  • 12. Contd. 1. Numerical and approximated function values of TF1. Figure: “Comparison between numerical function vs approximated function” (x-axis: data points; y-axis: function values), showing the known function and the approximations for alpha=0.0005 (over-fit), alpha=0.21 (good-fit), and alpha=5 (under-fit).
  • 13. Test function 1. Table columns: data points, given function, standard deviation of AWGN, noisy function. Figure: “Graphical representation of tabular values” (x-axis: data points; y-axis: function values), showing given data and noisy data.
  • 14. L-curve method. 1. Estimation of the L-curve parameter. For each regularization parameter (RP) value: determine the solution norm (a); calculate the residual norm (b); determine log10(a) and log10(b). Then plot the graph (log10 b, log10 a).
  • 15. Contd. 2. Estimation of the curvature of the L-curve graph: determine the curvature using 3 consecutive points of the curve and the distance formula; plot the curvature; select the parameter at the corner.
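The two L-curve steps above can be sketched as follows. The curvature through three consecutive points is computed here as the Menger curvature (four times the triangle area divided by the product of the three side lengths), which is one standard way to use the distance formula. The slides do not state which formula the thesis uses, so this choice and the function names are assumptions.

```python
import numpy as np

def menger_curvature(p1, p2, p3):
    """Curvature of the circle through three 2D points."""
    a = np.linalg.norm(p2 - p1)
    b = np.linalg.norm(p3 - p2)
    c = np.linalg.norm(p3 - p1)
    # Twice the triangle area, from the 2D cross product
    area2 = abs((p2[0] - p1[0]) * (p3[1] - p1[1])
                - (p2[1] - p1[1]) * (p3[0] - p1[0]))
    denom = a * b * c
    return 0.0 if denom == 0.0 else 2.0 * area2 / denom

def lcurve_corner(residual_norms, solution_norms):
    """Index of the RP value whose L-curve point has the largest curvature."""
    pts = np.column_stack([np.log10(residual_norms), np.log10(solution_norms)])
    kappa = np.array([menger_curvature(pts[i - 1], pts[i], pts[i + 1])
                      for i in range(1, len(pts) - 1)])
    return 1 + int(np.argmax(kappa)), kappa

# Synthetic L-shaped curve: the corner sits at index 2
res = np.array([1.0, 1.0, 1.0, 10.0, 100.0])
sol = np.array([100.0, 10.0, 1.0, 1.0, 1.0])
corner, kappa = lcurve_corner(res, sol)
```

In practice `residual_norms` and `solution_norms` would hold one entry per RP value, computed from the regularized solutions.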
  • 16. Results regarding TF1. Figures: function values; curvature plot.
  • 17. Results for Test function 2. Table columns: data points, given function, standard deviation of AWGN, noisy function. Figure: “Graphical representation of tabular values” (y-axis: function values), showing the given function and the noisy function.
  • 18. Results for Test function 3. Table columns: data points, given function, standard deviation of AWGN, noisy function. Figure: “Graphical representation of tabular values” (x-axis: data points; y-axis: function points), showing the given function and the noisy function.
  • 19. Contd. Advantages and disadvantages of the L-curve. Advantages: 1. Highly robust method. 2. Clear indication of over-fit and under-fit conditions. 3. The “corner” balances the two conditions. Disadvantages: 1. Special cases: the curve can show 2 “corners” (a global corner and a new corner), so one must manually check which yields the better solution. 2. Increase in problem size: leads to over-regularization, i.e. a large parameter value.
  • 20. Normalized Cumulative Periodogram (NCP). 1. Aim and objective of NCP: NCP mainly deals with the residual vector r. The relevant information is extracted from the data such that only noise remains in r. The optimal regularization parameter marks the transition of r from signal dominance to white-noise characteristics. Procedure, for each RP value: discrete Fourier transform of r; periodogram; NCP of the residual vector (normalized using the length of the residual vector).
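The NCP procedure for one residual vector can be sketched as below. This is a minimal sketch under assumptions: the DC component is dropped and only the positive-frequency half of the periodogram is kept, which is a common convention that the slides do not confirm for the thesis. The helper `ncp_distance` (distance to the straight line expected for white noise) and the two test vectors are hypothetical.

```python
import numpy as np

def ncp(r):
    """Normalized cumulative periodogram of a residual vector r."""
    q = r.size // 2
    power = np.abs(np.fft.fft(r)) ** 2     # periodogram from the DFT of r
    power = power[1:q + 1]                 # drop DC, keep positive frequencies
    return np.cumsum(power) / np.sum(power)

def ncp_distance(r):
    """Distance between the NCP and the straight line expected for white noise."""
    c = ncp(r)
    line = np.arange(1, c.size + 1) / c.size
    return float(np.linalg.norm(c - line))

rng = np.random.default_rng(3)
white = rng.standard_normal(256)                        # noise-only residual
signal = np.sin(2.0 * np.pi * np.arange(256) / 256)     # residual still holding signal

d_white = ncp_distance(white)
d_signal = ncp_distance(signal)
```

Scanning the RP values and picking the one whose residual gives the smallest distance is one way to locate the transition to white-noise characteristics.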
  • 21. Results for TF1. Figure: “NCP graph for various regularization parameter (RP)” (x-axis: length of residual vector; y-axis: NCP values), with curves for RP(1)=0.01, RP(2)=0.06, RP(8)=0.36, RP(12)=0.56, RP(18)=0.86, RP(19)=0.91, RP(20)=0.96, RP(25)=1.21, RP(33)=1.61, RP(40)=1.96, and a marked region of Gaussian white noise characteristics.
  • 22. Contd.
  • 23. Results for TF2 and TF3. Figures for Test function 2 and Test function 3: “NCP graph for various regularization parameter” (x-axis: length of residual vector; y-axis: NCP values). The lines indicate NCP values for 48 different RP values.
  • 24. Contd. Advantages and disadvantages of NCP. Advantages: 1. Computationally inexpensive. 2. Deals with only the residual vector; more stable for high regularization parameters. Disadvantages: 1. Deals mainly with white noise. 2. Dealing with different types of noise: for a colored noise model the covariance matrix must be known. NOTE: for a noise vector under the white noise model, the power spectral density is constant over frequency and the random errors are independent of each other.
  • 25. Generalized Cross Validation (GCV). 1. Objective of GCV: helps determine the optimal regularization parameter even when a) the exact data are unknown and b) there is no knowledge of the noise variance. Hence it proves to be an effective statistical tool. 2. Mathematics behind GCV: system definition; GCV function (labelled terms: GCV, dataset, identity matrix, differentiation operator, noisy data, trace of a matrix).
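The ingredients named on this slide (identity matrix, differentiation operator, noisy data, trace) match the usual GCV function for a linear smoother, G(alpha) = n * ||(I - H) y||^2 / trace(I - H)^2. The sketch below evaluates it for a quadratic penalty with a first-difference operator. Treat this as an assumption: TV regularization is nonlinear, and the slides do not show how the thesis defines the influence matrix `H`. The test signal, noise level, and parameter grid are also illustrative.

```python
import numpy as np

def gcv_score(y, alpha, D):
    """GCV value n*||(I - H) y||^2 / trace(I - H)^2 for H = (I + alpha*D^T D)^-1."""
    n = y.size
    eye = np.eye(n)
    H = np.linalg.solve(eye + alpha * (D.T @ D), eye)   # influence matrix
    resid = y - H @ y
    return n * float(resid @ resid) / np.trace(eye - H) ** 2

rng = np.random.default_rng(2)
x = np.linspace(0.0, 2.0 * np.pi, 200)
clean = np.sin(x)
noisy = clean + 0.1 * rng.standard_normal(x.size)
D = np.diff(np.eye(x.size), axis=0)        # first-difference operator

alphas = np.logspace(-2, 3, 30)            # candidate RP values (assumed grid)
scores = np.array([gcv_score(noisy, a, D) for a in alphas])
best_alpha = alphas[np.argmin(scores)]     # minimiser of the GCV function

smooth = np.linalg.solve(np.eye(x.size) + best_alpha * (D.T @ D), noisy)
```

Note that `gcv_score` uses only the noisy data and the operator, never the clean signal or the noise variance, which is the property the slide highlights.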
  • 26. Results of GCV. Figures: Test function 1 and Test function 2. NOTE: the optimal regularization parameter is the minimum of the GCV function.
  • 27. Advantages and disadvantages of GCV. Advantages: with an unknown noise level, GCV can be employed and provides a satisfactory parameter value. Disadvantages: in small samples, noise with a smaller standard deviation is hard to capture, and the result is characterized by under-regularization (small parameter value).
  • 28. Application of TV regularization in a data-driven process. Problem description. Figure: polynomial graph of the form -0.3x + 0.2x^2 + 0.1x^3 + 0.5x^4 (x-axis: tspan; y-axis: polynomial values).
  • 29. Contd. Convergence and optimal parameter value.
  • 30. Contd. Analytical values. The graph shows that the optimal parameter provides satisfactory results.
  • 31. Summary. 1. Regularization is one of the best options for tracking information from noisy data. 2. Total variation regularization is investigated thoroughly and the following conclusions are drawn: the optimal regularization parameter plays a crucial role in the analysis of noisy data; determining the optimal parameter is of paramount importance but complex; different methods are tested to determine the optimal parameter value, namely the L-curve, the Normalized Cumulative Periodogram (NCP), and Generalized Cross Validation (GCV). These differ in philosophy but provide satisfactory values. 3. A data-driven system was employed to verify the results.
  • 32. Discussion. Thank you all for your patience.
  • 33. Appendix
  • 34. Results for TF2. Figures: derivative values; function values.
  • 35. Results for TF3. Figures: derivative values; function values.
  • 36. 2. Based on the penalty term, the following 2 properties are discussed. Robustness: L1 is more robust; L2 is less robust. Computational effort: the L1-norm needs more computational effort; the L2-norm needs less. 3. Sparsity: shrinkage of coefficients to zero.
  • 37. Description of test functions: Test function 1. Table columns: data points, given function, standard deviation of AWGN, noisy function. Figure: “Graphical representation of tabular values” (x-axis: data points; y-axis: function values), showing given data and noisy data.
  • 38. Description of test functions: Test function 2. Table columns: data points, given function, standard deviation of AWGN, noisy function. Figure: “Graphical representation of tabular values” (x-axis: data points; y-axis: function values), showing the given function and the noisy function.
  • 39. Description of test functions: Test function 3. Table columns: data points, given function, standard deviation of AWGN, noisy function. Figure: “Graphical representation of tabular values” (x-axis: data points; y-axis: function values), showing the given function and the noisy function.
  • 40. Generalized Cross Validation (GCV). 1. Objective of GCV: helps determine the optimal regularization parameter even when a) the exact data are unknown and b) there is no knowledge of the noise variance. Hence it proves to be an effective statistical tool. 2. Fundamental formula surrounding GCV (labelled terms: GCV, dataset, identity matrix, noisy data, trace of a matrix).
  • 41. Comparison of known and determined function using GCV for TF1. Satisfactory results.
  • 42. Concept behind the data-driven (sparse regression) method. 1. Aim: obtaining governing equations (PDEs) in a “spatiotemporal” setting. 2. Principle: efficient determination of coefficients at a fixed spatial location. 3. General formulation of the data-driven (sparse regression) method: nonlinear form of a PDE (labelled terms: unknowns, partial derivative w.r.t. time or space, parameters (coefficients)).
  • 43. Discretization of the nonlinear PDE (labelled terms: right-hand side of the nonlinear PDE; matrix consisting of derivative parameters; sparse vector of PDE coefficients, the parameter to be determined).
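The discretized problem on this slide has the form "right-hand side = library matrix times sparse coefficient vector". One common way to solve such a problem is sequentially thresholded least squares. The slides do not say which solver the thesis uses, so the algorithm choice, the function name `stlsq`, the threshold, and the synthetic library matrix `Theta` below are all assumptions for illustration.

```python
import numpy as np

def stlsq(Theta, rhs, threshold, n_iter=10):
    """Sequentially thresholded least squares for rhs ~ Theta @ xi with sparse xi."""
    xi = np.linalg.lstsq(Theta, rhs, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0                    # prune negligible coefficients
        big = ~small
        if not big.any():
            break
        # Refit using only the surviving library columns
        xi[big] = np.linalg.lstsq(Theta[:, big], rhs, rcond=None)[0]
    return xi

rng = np.random.default_rng(4)
Theta = rng.standard_normal((200, 5))                # synthetic candidate terms
xi_true = np.array([0.0, -0.3, 0.0, 0.5, 0.0])       # only two active terms
rhs = Theta @ xi_true + 1e-3 * rng.standard_normal(200)

xi = stlsq(Theta, rhs, threshold=0.1)
```

In a PDE setting the columns of `Theta` would hold candidate terms built from the solution and its derivatives, which is where a regularized derivative estimate from noisy data becomes important.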