Developing a Low-Pass Filter for Spectral Data Using Supernova 2011fe
Dustin Clouse
Advisor: Dr. Edward Baron
The University of Oklahoma, 440 W. Brooks St., Norman, OK 73019
Abstract
There are a variety of methods used to reduce noise in spectral data. In this project I
explored several of them, ranging from simple (a running average) to complex (Wiener
filtering), and compared their effectiveness. I did this primarily by analyzing their
effects on the late-time (360 days past peak light) spectrum of supernova (SN) 2011fe
provided by Dr. Baron. I also used spectra of the supernova taken at other times as a
baseline for comparison. I determined that Wiener filtering is the best of the methods I
used. I created a method in the program Mathematica that takes a spectrum as input,
outputs a filtered spectrum, and plots the original and filtered data for visual
comparison.
1. Introduction
Spectral data usually contain a lot of noise, which makes it difficult to
extract useful information. Several methods exist for reducing the noise.
However, sometimes those are not enough. Some groups have created methods
that can remove noise from spectra without affecting the signal [1]. Using
such a method, Marion et al. were able to identify spectral features and
measure Doppler velocities in supernovae.
My goal is to create a general
method for making a low-pass filter to
remove noise from spectra. The filters
made will be unique to each spectrum the
method is used on. I have explored
several methods using the program
Mathematica: a running average,
Savitzky–Golay filtering, several
Fourier-type transforms (discrete
Fourier, sine, and cosine), and Wiener
filtering.
In this project, I used several spectra of supernova 2011fe provided to me
by Dr. Baron. The supernova was discovered on August 24, 2011, 17 days
before it reached peak light on September 10 [4]. The data I was given were
taken at several times in the supernova’s evolution: in days past peak
light, –13.86, –10.73, –7.55, –3.56, –0.55, 2.73, 8.63, 20.28, 26.33,
40.19, and 360. I primarily focused on the spectrum from 360 days past peak
light because it was the noisiest of them all. These data are in the
ultraviolet and optical range, from 1604.65 Å to 5699.35 Å, and were
obtained by the Hubble Space Telescope.
Figure 1. The spectrum of supernova 2011fe. Note the large amounts of noise in the middle between 3000 Å and 4000 Å.
2. Theory
For this project, I have used
several different methods to filter the
data.
One method I used was a running average, also called a moving average or a
boxcar average. A running average replaces each value in a set of data with
the average of the values around it. An n-point running average uses n
values for each average. I applied this to the flux values according to the
formula
$X_k = \frac{1}{n} \sum_{i=k-\lfloor n/2 \rfloor}^{k+\lfloor n/2 \rfloor} x_i$,
where the $x_i$ are the flux values, the $X_k$ are the averages, and
$\lfloor \cdot \rfloor$ is the floor function. (The sum contains exactly n
terms when n is odd.)
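The running-average formula can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Mathematica commands used in the project: the function names `running_average` and `trim_to_match` and the sample values are made up, and n is assumed to be odd so that the window is centered.

```python
def running_average(flux, n):
    """Centered n-point running average for odd n.

    Only positions with a full window are returned, so the result has
    len(flux) - n + 1 values.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    half = n // 2
    return [
        sum(flux[k - half:k + half + 1]) / n
        for k in range(half, len(flux) - half)
    ]


def trim_to_match(wavelength, n):
    """Drop floor(n/2) wavelengths from each end to match the averaged flux."""
    half = n // 2
    return wavelength[half:len(wavelength) - half]


# Made-up sample values, not SN 2011fe data.
wavelength = [1600.0, 1601.0, 1602.0, 1603.0, 1604.0, 1605.0, 1606.0]
flux = [1.0, 3.0, 2.0, 6.0, 4.0, 8.0, 6.0]

smoothed = running_average(flux, 3)
trimmed = trim_to_match(wavelength, 3)
```

Because points without a full window are dropped, the wavelength list has to be shortened by the same amount at each end before plotting.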
A Savitzky–Golay filter works by fitting a polynomial (usually of low
degree) to a subset of the data and taking the new value from this
polynomial. The subset is the set of points within a given radius
(distance) of a given point. This is equivalent to convolving the data with
a list of coefficients derived from the polynomial fit [3]. Mathematica has
a built-in function, SavitzkyGolayMatrix[{r}, d], that produces a list of
coefficients for a radius r and polynomial degree d; convolving these
coefficients with the spectrum data reduces its noise.
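To make the equivalence between the polynomial fit and a convolution concrete, the following Python sketch derives smoothing coefficients from the least-squares normal equations. It is an illustration, not Mathematica's implementation, and its output has not been compared against SavitzkyGolayMatrix; the names `solve`, `savgol_coefficients`, and `apply_filter` are hypothetical.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Solve matrix * y = rhs exactly by Gauss-Jordan elimination."""
    size = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(size):
        # Find a row with a non-zero pivot and move it into place.
        pivot = next(r for r in range(col, size) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(size):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def savgol_coefficients(r, d):
    """Smoothing weights for a window of radius r and polynomial degree d."""
    offsets = range(-r, r + 1)
    # Normal-equation matrix A^T A, where A[j][p] = j**p.
    normal = [
        [Fraction(sum(j ** (p + q) for j in offsets)) for q in range(d + 1)]
        for p in range(d + 1)
    ]
    unit = [Fraction(1)] + [Fraction(0)] * d
    y = solve(normal, unit)
    # The fitted value at the window center is sum_j c_j * x_j.
    return [sum(y[p] * j ** p for p in range(d + 1)) for j in offsets]


def apply_filter(flux, coeffs):
    """Slide the weights over the data, keeping only full windows."""
    width = len(coeffs)
    return [
        sum(c * x for c, x in zip(coeffs, flux[k:k + width]))
        for k in range(len(flux) - width + 1)
    ]


coeffs = savgol_coefficients(2, 2)
quadratic = [k * k for k in range(7)]
smoothed = apply_filter(quadratic, coeffs)
```

For r = 2 and d = 2 this gives the familiar 5-point weights (-3, 12, 17, 12, -3)/35, and a degree-2 filter leaves purely quadratic data unchanged, which is a quick sanity check.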
A discrete Fourier transform (DFT) converts a list of N data points
$\{x_n\}$ into a list of coefficients of complex sinusoids $\{X_k\}$
ordered by frequency. The formula used in Mathematica is
$X_k = \frac{1}{\sqrt{N}} \sum_{n=1}^{N} x_n e^{2\pi i (n-1)(k-1)/N}$.
For real-valued data, the coefficients satisfy the relation
$X_{N-k} = X_k^*$ when they are indexed from zero (the first coefficient
being $X_0$). By symmetrically reducing the coefficients in the middle of
the DFT, which correspond to the highest frequencies, I can remove much of
the noise in the spectrum. I use the inverse transform on the modified
coefficients to get a filtered spectrum.
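A small Python sketch of this symmetric low-pass step follows. It uses a naive O(N²) transform with the same 1/√N factor and sign convention as the formula above; the names `dft` and `lowpass_dft`, the `keep` parameter, and the test signal are illustrative assumptions, not the project's code.

```python
import cmath
import math


def dft(values, sign=1):
    """Naive DFT with a 1/sqrt(N) factor; sign=+1 forward, sign=-1 inverse."""
    n = len(values)
    return [
        sum(v * cmath.exp(sign * 2j * cmath.pi * r * k / n)
            for r, v in enumerate(values)) / math.sqrt(n)
        for k in range(n)
    ]


def lowpass_dft(flux, keep):
    """Keep zero-based coefficients 0..keep and N-keep..N-1; zero the middle.

    Zeroing the middle symmetrically preserves X[N-k] = conj(X[k]), so the
    inverse transform is real up to rounding error.
    """
    n = len(flux)
    coeffs = dft(flux, sign=1)
    filtered = [
        c if (k <= keep or k >= n - keep) else 0j
        for k, c in enumerate(coeffs)
    ]
    return dft(filtered, sign=-1)


# Made-up test signal: one slow cosine plus a point-to-point alternation.
n_points = 16
clean = [math.cos(2 * math.pi * k / n_points) for k in range(n_points)]
noisy = [c + 0.3 * (-1) ** k for k, c in enumerate(clean)]

coeffs = dft(noisy)
result = lowpass_dft(noisy, keep=2)
largest_imaginary = max(abs(z.imag) for z in result)
largest_error = max(abs(z.real - c) for z, c in zip(result, clean))
```

In this toy case the alternating term sits entirely in the middle coefficient, so the filtered values recover the slow cosine and the leftover imaginary parts are at the level of rounding error.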
The discrete sine transform (DST) and discrete cosine transform (DCT) are
related to the Fourier transform. They both produce a list of coefficients,
but the coefficients are real numbers only. Each one has several different
types depending on the assumed symmetry of the extensions to the data
(DCT-1, DCT-2, DCT-3, etc.). In Mathematica, the formula for DCT-2 is
$X_n = \frac{1}{\sqrt{N}} \sum_{r=1}^{N} x_r \cos\left[\frac{\pi}{N}\left(r - \frac{1}{2}\right)(n - 1)\right]$.
The formula for DST-2 is
$X_n = \frac{1}{\sqrt{N}} \sum_{r=1}^{N} x_r \sin\left[\frac{\pi}{N}\left(r - \frac{1}{2}\right) n\right]$.
By reducing coefficients on the high-frequency end of the transforms, I can
remove much of the noise in the spectrum.
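The DCT-2 formula and its inverse can be written directly in Python as a check on the normalization. This is a sketch with hypothetical names (`dct2`, `dct3`) and made-up data; it assumes that, with the 1/√N factor above, the inverse of DCT-2 is a DCT-3 of the form shown in the code.

```python
import math


def dct2(values):
    """DCT-2 with a 1/sqrt(N) factor, using zero-based indices."""
    n = len(values)
    return [
        sum(x * math.cos(math.pi / n * (r + 0.5) * k)
            for r, x in enumerate(values)) / math.sqrt(n)
        for k in range(n)
    ]


def dct3(coeffs):
    """DCT-3 with a 1/sqrt(N) factor; undoes dct2 above."""
    n = len(coeffs)
    return [
        (coeffs[0] + 2 * sum(coeffs[k] * math.cos(math.pi / n * k * (s + 0.5))
                             for k in range(1, n))) / math.sqrt(n)
        for s in range(n)
    ]


# Made-up data: a ramp plus a point-to-point alternation.
flux = [0.5 * k + (-1) ** k for k in range(8)]
coeffs = dct2(flux)
roundtrip = dct3(coeffs)

# Crude low-pass: zero the high-frequency half of the coefficients.
filtered = dct3(coeffs[:4] + [0.0] * 4)

# A constant signal puts all of its weight in the first coefficient.
constant_coeffs = dct2([2.0, 2.0, 2.0, 2.0])
```

All coefficients here are real, which is the practical advantage over the DFT noted in the text.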
Wiener filtering is a method for filtering noise from data by creating an
optimal filter from estimates of the signal and noise contributions to the
power spectrum of a Fourier-type transform (DFT, DST, or DCT) of the
data [2]. The power spectrum is the square of the magnitude of each
coefficient. The optimal filter, $\Phi_n$, is given by the formula
$\Phi_n = \frac{|S_n|^2}{|S_n|^2 + |N_n|^2}$,
where $S_n$ is the estimated true signal contribution and $N_n$ is the
estimated noise contribution. This is considered the optimal filter in the
least-squares sense: it minimizes the expected squared difference between
the filtered data and the true signal. The filter is close to 1 where the
signal dominates and close to 0 where the noise dominates, with a smooth
transition between the regions. When multiplied by the transformed values,
the filter reduces the noise contribution in the transform. Using the
inverse transform, I get a filtered spectrum.
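The behavior of the filter formula is easy to see numerically. The sketch below uses made-up power values (a steeply falling signal and flat noise) and a hypothetical function name `wiener_gain`; it illustrates the formula only and does not use the SN 2011fe estimates.

```python
def wiener_gain(signal_power, noise_power):
    """Phi_n = |S_n|^2 / (|S_n|^2 + |N_n|^2) for each coefficient."""
    return [s / (s + n) for s, n in zip(signal_power, noise_power)]


# Made-up estimates of |S_n|^2 and |N_n|^2 for five coefficients.
signal_power = [100.0, 10.0, 1.0, 0.1, 0.01]
noise_power = [1.0, 1.0, 1.0, 1.0, 1.0]
gain = wiener_gain(signal_power, noise_power)

# The gain is applied element-wise to the transformed values.
coefficients = [8.0, -3.0, 1.5, 0.9, -1.1]
filtered = [g * c for g, c in zip(gain, coefficients)]
```

The gain equals exactly 0.5 where the two estimates are equal, which is why the crossing point of the signal and noise estimates effectively sets the cut-off.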
3. Methods
I used Mathematica, version
10.4.1.0, to develop all of my
methods for creating low-pass filters.
At first, I experimented with the Python program that Dr. Baron had given
me. I knew nothing of the Python language, but I was familiar with other
programming languages such as Java and C++. Armed with this experience, I
gathered the gist of what the program did. It divided the flux values by a
normalization factor, did a discrete cosine transform on the normalized
supernova data, kept a given percent of the Fourier coefficients, zeroed
out the others, did an inverse transform on the result, and multiplied the
resulting flux values by the normalization factor. The normalization factor
was computed from a quadratic best-fit model of the data. Given my lack of
familiarity with Python, I decided that it was best to try to replicate the
program using Mathematica.
For my first attempt, I did my best to replicate the Python program with a
series of commands in Mathematica. The full method imports the data into
Mathematica as a 2-dimensional array with Import[…] and splits the data
into wavelength and flux lists for easier manipulation. It calculates a
best-fit (in the least-squares sense) polynomial for normalization using
the Fit[…] function. It then transforms the data by dividing the flux
values element-wise by values calculated from the normalization polynomial
and applying FourierDCT[…] (type 1) to get a list of coefficients. It
filters the coefficient list by replacing a given percent of the values,
those corresponding to the higher-frequency cosines, with 0. It transforms
the filtered values back to flux values with FourierDCT[…] (type 1, as it
is its own inverse) and multiplies those by the normalization values to get
the filtered spectrum data. It then plots the original spectrum and the
filtered spectrum together for visual comparison.
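The transform, cut-off, and inverse steps of this pipeline can be sketched in Python as follows. This is a rough illustration that leaves out the import, normalization, and plotting steps; the names `dct1` and `lowpass_dct1` and the `keep_percent` parameter are hypothetical, and the DCT-1 normalization is an assumption chosen so that the transform is its own inverse, as the text describes.

```python
import math


def dct1(values):
    """DCT-1 scaled by sqrt(2/(N-1)) so that applying it twice gives the input."""
    n = len(values)
    scale = math.sqrt(2.0 / (n - 1))
    out = []
    for s in range(n):
        # End points enter with half weight.
        total = values[0] / 2 + (-1) ** s * values[-1] / 2
        total += sum(values[r] * math.cos(math.pi * r * s / (n - 1))
                     for r in range(1, n - 1))
        out.append(scale * total)
    return out


def lowpass_dct1(flux, keep_percent):
    """Keep the lowest keep_percent of coefficients and zero the rest."""
    coeffs = dct1(flux)
    keep = max(1, round(len(coeffs) * keep_percent / 100))
    filtered = coeffs[:keep] + [0.0] * (len(coeffs) - keep)
    return dct1(filtered)


# Made-up data: a slow cosine plus a point-to-point alternation.
n_points = 21
clean = [math.cos(math.pi * r / 10) for r in range(n_points)]
noisy = [c + 0.2 * (-1) ** r for r, c in enumerate(clean)]

roundtrip = dct1(dct1(noisy))
filtered = lowpass_dct1(noisy, 25)
```

In this toy example the slow cosine and the alternation each fall on a single DCT-1 coefficient, so keeping the lowest 25% removes the alternation and returns the slow cosine.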
For the normalization, I used polynomials of different degrees, from degree
0 to degree 3. I found that when the data had been normalized, the
resulting filtered data had a lot of ringing, particularly at the
low-wavelength end. When I ran the data without the normalization, the
ringing was not present at the same cut-off value. I therefore wrote a new
series of commands that omitted the normalization and continued without it
from then on.
At Dr. Baron’s suggestion, I created a method for filtering the data using
a moving average for comparison. I did this in a similar manner to the
previous method, except that instead of a discrete cosine transform I used
the MovingAverage[…] function on the flux data. The method then trims the
wavelength list evenly at both ends to match the length of the averaged
flux data and plots the original and the modified spectra together.
I next created methods based on
other Fourier transforms. I made a series
of commands almost exactly like the
DCT-based one, but replacing
FourierDCT[…] with FourierDST[…] and
not using normalization.
I also made a method using a full discrete Fourier transform by replacing
FourierDST[…] with Fourier[…] and reversing the transform with
InverseFourier[…]. No normalization was used. As a full Fourier transform
produces complex numbers, the method of removing coefficients was changed.
For a list of real numbers $\{x_n\}$ of length N, its discrete Fourier
transform, $\{X_k\}$, has the property $X_k = X_{N-k}^*$ (with the
coefficients indexed from zero). Knowing this, I selected the coefficients
to be zeroed out from the middle of the list so that it would retain this
property. After using InverseFourier[…] on the filtered coefficients, a
small imaginary component is present in the filtered flux values. This
component is due to the finite numerical precision Mathematica uses by
default. I used Re[…] on the flux values to discard the imaginary
component, as I found it had no apparent effect on the final result. The
method then plots the original and the modified spectra together.
Another method I used was Savitzky–Golay filtering. This is quite different
from the Fourier transform–based methods. After importing the spectrum
data, a radius, r, and a degree, d, are chosen. A filter is made using
SavitzkyGolayMatrix[{r}, d]. This filter is convolved with the flux data
using ListConvolve[…] to get the filtered flux data. Since this list is
shorter than the original, the wavelength list is trimmed at the ends to
fit. The method then plots the original and the modified spectra together.
Finally, I created a method that uses Wiener filtering (also called optimal
filtering). This method is similar to the discrete cosine method, except
that instead of using a single cut-off point when filtering, an optimal
filter is created. The optimal filter is made by estimating the
contributions of the signal and the noise to the power spectrum of the
cosine coefficients. I chose the high-frequency half of the coefficients
for my estimation of the noise contribution and the first tenth on the
low-frequency end for the estimation of the signal contribution. For a
frequency bin, x, and a power spectrum value, y, the estimated functions
are made with Fit[…] to be linear best-fit functions of ln(y) as a function
of x over their ranges. The estimated power spectrum signal and noise
contributions are extrapolated from these functions by exp(f(x)), where f
is either function. The optimal filter is then, in Mathematica, “filter =
Chop[signal/(signal + noise)]”. “signal” and “noise” correspond to the
values of $|S|^2$ and $|N|^2$, respectively, mentioned in section 2.
Chop[…] is used to replace negligible components (< $10^{-10}$) with 0.
This filter is multiplied element-wise with the transformed flux data (the
cosine coefficients), which are then transformed back. The method then
plots the original spectrum and the filtered spectrum together for visual
comparison. It also plots, on a log plot, the power spectrum with the
estimated signal and noise contributions to show how well they fit, and
gives the point where the filter drops below 0.5.
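The estimation step can be sketched in Python as follows. This is an illustration under stated assumptions, not the Mathematica commands used: the names `fit_log_linear` and `wiener_from_power` are hypothetical, the fit is an ordinary least-squares line through (x, ln y), and the power spectrum is synthetic (an exponentially falling signal on top of flat noise) rather than that of SN 2011fe.

```python
import math


def fit_log_linear(xs, ys):
    """Least-squares line through (x, ln y); returns x -> exp(a + b*x)."""
    logs = [math.log(y) for y in ys]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_log = sum(logs) / count
    slope = (sum((x - mean_x) * (v - mean_log) for x, v in zip(xs, logs))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_log - slope * mean_x
    return lambda x: math.exp(intercept + slope * x)


def wiener_from_power(power):
    """Build the filter from the first tenth (signal) and second half (noise)."""
    n = len(power)
    bins = list(range(n))
    signal = fit_log_linear(bins[:n // 10], power[:n // 10])
    noise = fit_log_linear(bins[n // 2:], power[n // 2:])
    gain = [signal(x) / (signal(x) + noise(x)) for x in bins]
    # First bin at which the filter drops below 0.5, if any.
    crossover = next((x for x in bins if gain[x] < 0.5), None)
    return gain, crossover


# Synthetic power spectrum: falling signal plus flat noise.
power = [1000.0 * math.exp(-0.2 * x) + 1.0 for x in range(200)]
gain, crossover = wiener_from_power(power)
```

For this synthetic input the two fitted lines cross near the bin where the falling signal reaches the noise floor, so the reported crossover lands close to that point.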
4. Analysis and Discussion
During my first attempt at making a filter, I noted that there was ringing
present in the filtered spectrum data when I used a normalization factor,
especially in the noisy region of the original spectrum. I didn’t see this
effect when using a constant value for the normalization, which was to be
expected: a Fourier transform is a linear operation, so a constant factor
passes through the transform and the filter unchanged and cancels when the
normalization is undone. The original Python program also did not seem to
have this problem even though it did use a normalization function.
Figure 2. The original spectrum of SN 2011fe at 360 days (green) with filtered spectra using DCT-1 and 5% of points on
the low-Fourier-frequency end with normalization (red) and without normalization (black). Note the ringing in the low
wavelength end.
I suspect that this ringing problem was due to the normalization function
increasing the magnitude of the noise that the Fourier transforms were
seeing. I do not know how the original program was able to avoid the
ringing problem. I think the problem might be explainable by analogy.
Suppose there is a function that represents the data, d(x), one that is the
normalization function, n(x), and one that is the filter, f(r). The
transformed data would then be the Fourier transform (FT) of their product,
$\mathrm{FT}[d(x)\,n(x)](r)$. Multiplied by f and then transformed back, it
would give $\mathrm{FT}^{-1}[f(r)](x) * [d(x)\,n(x)]$, where in this case
“*” represents convolution. Say $g(x) = \mathrm{FT}^{-1}[f(r)](x)$; then
the filtered flux data would be $[g * (d\,n)]/n$. Because the product d n
is convolved with g before the division, dividing by n does not completely
remove the effects of the normalization.
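Writing the convolution out explicitly makes the argument easier to follow. The lines below are a sketch that assumes the continuous form of the convolution and uses the same symbols d, n, and g as the analogy; the constant c is introduced only for illustration.

```latex
\[
\frac{[g * (d\,n)](x)}{n(x)}
  = \frac{1}{n(x)} \int g(x - x')\, d(x')\, n(x')\, dx'
  = \int g(x - x')\, d(x')\, \frac{n(x')}{n(x)}\, dx'
\]
% Constant normalization, n(x) = c: the ratio n(x')/n(x) equals 1, so
\[
\frac{[g * (d\,c)](x)}{c} = \int g(x - x')\, d(x')\, dx' = (g * d)(x)
\]
```

For a constant normalization the result is the same as filtering the unnormalized data, which matches the observation for degree 0. For any other n(x), the ratio n(x')/n(x) stays inside the integral and reweights the data within the width of g.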
A running average removed some effects of the noise but began to distort
the data when I used a high number of points, n, for the average, beginning
around n = 40. It also started to smooth out features in the middle of the
spectrum around this point. In addition, I could not determine a method to
judge the effectiveness of the filter in revealing hidden features.
Figure 3. The original spectrum of SN 2011fe at 360 days (green) with filtered spectra using a 50-point running average
(black). Note the smoothing out of the possible features in the 3000–4000 Å range and how the filtered spectrum doesn’t
quite reach the highs and lows on the high-wavelength end.
The Savitzky–Golay filter fit better
than the running average. It didn’t have
the problem of becoming too low or high
at peaks and troughs. However, it did
start to smooth out features in the middle
of the spectrum beginning at radius r =
40. Again, I also could not determine a
method to judge the effectiveness of the
filter in revealing hidden features.
Figure 4. The original spectrum of SN 2011fe at 360 days (green) with filtered spectra using a Savitzky–Golay filter
(black) with d = 2 and r = 50. Note the smoothing out of the possible features in the 3000–4000 Å range and how the
filtered spectrum doesn’t quite reach the highs and lows on the high-wavelength end, although it is not as bad as the
running average.
The Fourier transforms I used (DFT, DST, and DCT) gave very similar results
when a simple cut-off filter keeping the same percent of coefficients was
applied. The only noticeable difference between them was at the edges of
the spectrum. Because fringing effects occur at the edges anyway, I decided
that this difference was not significant. I chose the DCT over the DFT
because the DFT produces complex numbers. Even when filtering in a way that
preserves the symmetry that a DFT of real-valued data has, I still got a
small imaginary contribution, most likely due to the limits of the
precision Mathematica was using. I preferred the DCT to the DST because
fewer cosines than sines are needed to get the same precision. I did
explore using type 1, 2, 3, and 4 DSTs and DCTs; however, I found no
significant difference among the different types.
Figure 5. A comparison of the Fourier transforms DFT (black), DST-1 (red), and DCT-1 (blue). Each one uses only 5% of
coefficients in the transform. Note the slight differences on the edges.
Figure 6. The original spectrum of SN 2011fe at 360 days (green) with filtered spectra using DCT-1 with 5% of
coefficients (black). Note how it doesn’t smooth out the 3000–4000 Å range like the running average and the
Savitzky–Golay filter.
The Wiener filter I produced looked promising, as it gave me a way of
determining which coefficients I should reduce. I determined the regions to
use in the estimations of the signal and noise contributions (the first
tenth and the second half of the coefficients, respectively) by looking at
the power spectra of the SN at different times. Because of the quick
transition of the optimal filter (see Figure 8), I figured that the slopes
of the lines in log space weren’t as important as getting the intersection
of the two near where the signal started to rise above the noise.
Figure 7. The original spectrum of SN 2011fe at 360 days (green) with filtered spectra using DCT-2 and Wiener filtering
(black). Note how it follows the middle region more closely than the other FTs.
Figure 8. Left: the power spectrum of 2011fe with estimates for the signal (orange) and noise (green) contributions. The
intersection of the two is near where the power spectrum starts to rise. Right: the filter created for 2011fe. The filter
drops below 0.5 at frequency bin 118.
5. Conclusions
I have developed a method for extracting signal data from noisy spectra
that I believe to be close to what would be considered optimal. My methods
converged towards those of Marion et al. [1] We both used Wiener filtering
in constructing a filter. There are, however, a few differences between
their method and mine. They used a full DFT whereas I used a DCT. They
estimate a constant noise contribution from the average of the power
spectrum values of the 100 points past the range of frequency bins they
consider (50 bins). For their signal estimation, they draw a line from the
first point in the power spectrum to the point where it first intersects
the noise line. I use an exponential fit on the first tenth of the points
for the signal and on the second half for the noise.
With further work, I think I could
develop a general method for
determining, for each individual
spectrum, which regions of the power
spectrum to use for the estimates.
References
1. Marion, G. H., P. Höflich, C. L. Gerardy,
W. D. Vacca, J. C. Wheeler, and E. L.
Robinson. “A Catalog of Near-
Infrared Spectra from Type Ia
Supernovae”. The Astronomical
Journal 138.3 (2009): 727–57.
2. Press, William H., Saul A. Teukolsky,
William T. Vetterling, and Brian P.
Flannery. Numerical Recipes in
Fortran 77 and Fortran 90. New
York: Cambridge UP. Numerical
Recipes. Web. 1 May 2016.
http://numerical.recipes/.
3. Savitzky, Abraham, and M. J. E. Golay.
“Smoothing and Differentiation of
Data by Simplified Least Squares
Procedures”. Analytical Chemistry
36.8 (1964): 1627–1639.
4. Zhang, Kaicheng, Xiaofeng Wang, Jujia
Zhang, Tianmeng Zhang, Mohan
Ganeshalingam, Weidong Li, Alexei
V. Filippenko, Xulin Zhao, Weikang
Zheng, Jinming Bai, Jia Chen,
Juncheng Chen, Fang Huang, Jun
Mo, Liming Rui, Hao Song, Hanna
Sai, Wenxiong Li, Lifan Wang, and
Chao Wu. “Optical Observations of
the Type Ia Supernova 2011fe in
M101 for Nearly 500 Days”. The
Astrophysical Journal 820.1
(2016): 67.