SlideShare a Scribd company logo
1 of 10
Download to read offline
Kernel-based Visualization of Point
Patterns in Python with Application to
Tornado Landing Data
Weifeng Hu
Computer Science, University of Michigan ∗
March 25, 2015
∗Mentored by Dr. Stilian Stoev
1
Abstract
Many data sets come as point patterns of the form (longitude, latitude,
time, magnitude). The examples of data sets in this format includes tor-
nado events, origins/destination of internet flows, earthquakes, terrorist
attacks and etc. It is difficult to visualize the data with simple plot-
ting. This research project studies and implements non-parametric kernel
smoothing in Python as a way of visualizing the intensity of point patterns
in space and time. A two-dimensional grid M with size mx ×my is used to
store the calculation result for the kernel smoothing of each grid points.
The heat-map in Python then uses the grid to plot the resulting images on
a map where the resolution is determined by mx and my. The resulting
images also depend on a spatial and a temporal smoothing parameters,
which control the resolution (smoothness) of the figure. The Python code
is applied to visualize over 56,000 tornado landings in the continental U.S.
from the period 1950 - 2014. The magnitudes of the tornado are based on
Fujita scale.
1 Introduction
Internet traffic flow can be visualized as large point patterns in a plane. The
Internet data comes in the format of (longitude, latitude, payload, time). The
same data format is also applicable to tornadoes touchdown data, where the
payload is the Fujita scale of tornado incidents, earthquake and terrorist at-
tacks. Since the data varies both spatially and temporally, it is difficult to
visualize them using simple plot. As a result, this research project implements
in Python kernel smoothing, which is a non-parametric statistical methodology
that can easily and computationally efficiently visualize large point patterns in
space. Kernel smoothing allows visualization and further analysis of spatial-
temporal features in point-patterns such as locations, timing and magnitude of
the internet traffic.
This research report describes the experimental methods of implementing
kernel smoothing in Python programming language. We use 56,000 tornado
data from the past five decades and visualize them using Python heat map. In
particular we will be focusing on the formulas that are applied to the input
data. The output of the Python program will be a matrix containing the results
of kernel smoothing and the matrix is plotted on a heat map of the United
States based on the values of its entries. Further analysis includes taking the
differences of two output images with different temporal bandwidth to observe
the change in tornado intensity over different periods of time.
This project also studies how to improve the computational efficiency in
Python when kernel smoothing is applied to relatively large data sets and/or
1
high resolution of the resulting images is required. We implemented an algo-
rithm that creates a smaller two-dimensional grid S with size n×n. For a point
on the bigger grid M, we only apply kernel smoothing to grid points that fall
within the grid S. Intuitively, the size of grid M plays no role in computation in
this algorithm, which allows us to handle larger resolution in relatively efficient
way.
2 Kernel Smoothing Methodology
Kernel smoothing is a technique to estimate the real valued function by using its
noisy observations. Consider the data set (xi, yi, ti, pi), where x is the longitude,
y is the latitude, t is the time and p is the payload of the network data. In order
to visualize the data points in the map, we need to use couple formulas for kernel
smoothing. The kernel of the data can be calculated using the formula
K(x, y) =
1
2 · π
· e
−(x2+y2)
2 (1)
K is the probability density function of the covariate normal distribution
with zero mean and variance to be the covariance matrix.
Σ =
1 0
0 1
After we obtain the formula for kernel, the spatial kernel smoothing can also
be obtained using
Kh(x, y) =
1
h2
· K(
x
h
,
y
h
) (2)
In formula 2, h is the smoothing parameter called bandwidth. It has strong
influences in the result. However, choosing a appropriate bandwidth is relatively
difficult.
We also consider temporal kernel smoothing because the data varies with
time. Temporal kernel smoothing uses a different bandwidth, denoted as ht. The
formula for temporal kernel smoothing, when given a bandwidth ht is shown
below. Note that temporal kernel smoothing only affects the point after a
specific time. If the input time is negative, meaning the data happens before
the time, we return the temporal kernel smoothing as zero.
Tλ(t) =
λ · e−tλ
t > 0
0 t ≤ 0
(3)
We visualize the data points on a map using Python heat map by first
creating a two-by-two grid M. We divide the grid, according to user’s input of
2
number of pixels in x direction mx and number of pixels in y direction my, such
that
xk = x0 + ∆x · (k +
1
2
), k = 1, 2, 3 · · · , mx − 1 (4)
yl = y0 + ∆y · (l +
1
2
), l = 1, 2, 3 · · · , my − 1 (5)
Therefore, for every x, y point in the grid M, when given a bandwidth h and
data of size N, can be calculated using
Ph(x, y, t) =
1
N
·
N
i=1
pi · Kh(x − xi, y − yi) · Tλ(t − ti) (6)
The following figure shows the implementation in Python of the kernel
smoothing formulas.
3 Algorithm
The algorithm, however, is in order of mx · my · N when implemented using
nested loops in Python. The scaling efficiency becomes problematic when the
user requires high resolution, i.e. bigger values of mx and my. As a result, a
new algorithm is implemented to improve the efficiency of the program. The
pseudo-code is described below.
3
1: procedure Calc index(min, max, position, pixel)
2: index = int(pixel · (position−min)
max−min )
3: return index
4: end procedure
5:
6: procedure S Matrix Entries(k, h, row, col)
7: entry = 1
2·π·h2 · e− 1
h2 ·((row−(k+1))2
+(col−(k+1))2
)
8: return entry
9: end procedure
10:
11: for i = 0 to 2 · k do
12: for j = 0 to 2 · k do
13: S(i, j) = S Matrix Entries(k, bandwidth, i, j)
14: end for
15: end for
16:
17: S = normalize(S)
18:
19: for i = 0 to number of data points do
20: X index = Calc index(min(xi), max(xi), xi, mx)
21: Y index =Calc index(min(yi), max(yi), yi, my)
22: m idx left = max(0, X index − k)
23: m idx right = min(X index + k + 1, mx)
24: m idy left = max(0, Y index − k)
25: m idy right = min(Y index + k + 1, my)
26: s idx left = max(0, k − X index)
27: s idx right = min(k + mx − X index, 2 ∗ k + 1)
28: s idy left = max(0, k − Y index)
29: s idy right = min(k + my − Y index, 2 ∗ k + 1)
30: xm = [m idx left : m idx right]
31: xs = [s idx left : s idx right]
32: ym = [m idy left : m idy right]
33: ys = [s idy left : s idy right]
34: M[xm, ym]+ = S[xs, ys] · pi · Tλ(t − ti)
35: end for
This algorithm uses a smaller two dimensional grid S of size s × s,to apply
kernel smoothing only to the grid points that are close to the data of interest,
i.e. those falling within the grid S. The value of k determines the size of grid S.
Intuitively the algorithm, when given a data size N, is in order of k2
· N. We
can see that it is easy to handle higher resolution map now because the pixels
mx, my play no role in computation except memory storage in computers.
Notice that the last part of the pseudo-code presented involves the calcula-
4
tions to update the specific regions of the matrix. The math involves using max
or min to find out which regions of matrix M needs to be updated by the matrix
S. We take into consideration of the edge cases where the smaller S matrix may
go out of bound of the bigger M matrix when applying kernel smoothing to a
region. After we manage to determine which regions need to be updated, we
used vectorization in Python that helps improve the speed of the program.
4 Result & Applications
We first apply the algorithm for tornado data, which is similar to internet data
because it has touchdown longitude and latitude, time and Fujita scale. We
extracted all 56,000 tornado data for the past 50 years. The data is loaded in
Python and the heat map is drawn on the map of the United States.
Figure 1: Tornado Touchdown Data Visualized on Heatmap of the United States
We can see the “tornado alley” in the US map from Figure 1. The “tornado
alley” is area where tornado is most frequent. The core of “tornado alley”
extends from northern Texas, Oklahoma, Kansas and Nebraska. Figure 1 clearly
shows that the states with high frequency of tornado are Oklahoma, Texas,
Colorado and Kansas, which is very consistent with the areas that are recognized
as in the “tornado alley”. The values on the colorbar is calculated based on
the temporal bandwidth, spatial bandwidth and Fujita scale of the touchdown
tornado using the calculation described in Algorithm section.
We can also investigate the change in intensity over years. The plots below
shows how the intensity of tornado touchdown changes in the United States in
a ten-year interval for the past 50 years.
5
(a) Intensity of Tornado From 1950 to 1960 (b) Tornado Intensity Change From 1960 to 1970
(c) Tornado Intensity Change From 1970 to 1980 (d) Tornado Intensity Change From 1980 to 1990
(e) Tornado Intensity Change From 1990 to 2000 (f) Tornado Intensity Change From 2000 to 2010
It is clear in the plots that for the first ten years(Figure (a)), there are more
incidents of tornado touchdowns in states such as Oklahoma and Kansas. In
Figure (b), we can see that the intensity of tornado shows a dramatic increase in
6
the northern part of the country like Wisconsin, Iowa and Illinois. In Figure (c)
the intensity of the tornado in the northern part of the United States decreased
while the intensity of tornado in southern states such as Mississippi, Texas and
Alabama, which is on the east part of the core areas of the “tornado alley”,
increased significantly than first two decades. The Figures (d), (e), (f) show
the most recent intensity changes of tornado touchdowns. We can see from the
plots that there is a dramatic increase in both the frequency and the intensity
of tornadoes on the north part of the “tornado alley” from year 1990 to 2010.
It suggests that the frequency of tornado is increasing towards the northern
part of “tornado valley” in the state of South Dakota, the boarder of North
Dakota and Wisconsin and extends to the Canadian broader. Moreover, the
frequency of tornado also increases towards the southeast part of the “tornado
alley”. In Figures (e) and (f), which is the most recent change in intensity of
tornado, southern states such as Florida, Georgia, and Alabama have suffered
more incidents of tornado than in 1950s. However, the intensity of tornado show
little changes in northern states like New York and Massachusetts, and the east
coast.
We can also investigate the total intensity of tornado in the past six decades.
The figure below shows the total intensity calculated using kernel smoothing.
Figure 2: Total Intensity of Tornado Touchdown in the Past Decades
Note that the intensity of tornado starting from 2010 is calculated only based
on year 2010-2014. However, we can see that the total intensity is still almost
the same as the past years. It is because during 2011 there was a Tornado
Super Outbreak, which is the largest tornado outbreak ever recorded. As a
result, the total intensity in 2010-2014 still matches the intensity level of the
previous decades thanks to the Super Outbreak in 2011.
7
In conclusion, the result of this project shows that the intensity of tornado
shows greater increase in the states that are on the north of the “tornado alley”
such as South Dakota and Nebraska. The southern states that are on the
east of the “tornado alley” such as Mississippi and Alabama also suffer from
an increase in tornado intensity recently. The intensity of tornado shows little
change in northern states and the east coast. However, the states in the “tornado
alley” like Oklahoma and Texas still suffer high intensity of tornado in the past
50 years, which can be seen in Figure 1. In short, the intensity of tornado
touchdowns has a trend to increase towards the north of the “tornado alley”
to the Canadian boarder, and towards the east of the “tornado alley” to the
southern states of the US.
8
5 Appendix - Python Code
Please click on the following link for more information
https://github.com/weifhu0124/Undergraduate Research
References
[1] Tornado History Project
http://www.tornadohistoryproject.com/
[2] Tornado Alley
http://www.si.edu/Content/SE/Educator%20Guides/Tornado EdGuide R5.pdf/
9

More Related Content

What's hot

PSF_Introduction_to_R_Package_for_Pattern_Sequence (1)
PSF_Introduction_to_R_Package_for_Pattern_Sequence (1)PSF_Introduction_to_R_Package_for_Pattern_Sequence (1)
PSF_Introduction_to_R_Package_for_Pattern_Sequence (1)neeraj7svp
 
safe and efficient off policy reinforcement learning
safe and efficient off policy reinforcement learningsafe and efficient off policy reinforcement learning
safe and efficient off policy reinforcement learningRyo Iwaki
 
Self-adaptive container monitoring with performance-aware Load-Shedding policies
Self-adaptive container monitoring with performance-aware Load-Shedding policiesSelf-adaptive container monitoring with performance-aware Load-Shedding policies
Self-adaptive container monitoring with performance-aware Load-Shedding policiesNECST Lab @ Politecnico di Milano
 
The International Journal of Engineering and Science (The IJES)
The International Journal of Engineering and Science (The IJES)The International Journal of Engineering and Science (The IJES)
The International Journal of Engineering and Science (The IJES)theijes
 
Atomic algorithm and the servers' s use to find the Hamiltonian cycles
Atomic algorithm and the servers' s use to find the Hamiltonian cyclesAtomic algorithm and the servers' s use to find the Hamiltonian cycles
Atomic algorithm and the servers' s use to find the Hamiltonian cyclesIJERA Editor
 
Lecture 3 image sampling and quantization
Lecture 3 image sampling and quantizationLecture 3 image sampling and quantization
Lecture 3 image sampling and quantizationVARUN KUMAR
 
Choosing the right physics for WRF (5/2009)
Choosing the right physics for WRF (5/2009)Choosing the right physics for WRF (5/2009)
Choosing the right physics for WRF (5/2009)Skylar Hernandez
 
Human activity spatio-temporal indicators using mobile phone data
Human activity spatio-temporal indicators using mobile phone dataHuman activity spatio-temporal indicators using mobile phone data
Human activity spatio-temporal indicators using mobile phone dataUniversity of Salerno
 
Click-Trough Rate (CTR) prediction
Click-Trough Rate (CTR) predictionClick-Trough Rate (CTR) prediction
Click-Trough Rate (CTR) predictionAndrey Lange
 
Tutorial of topological data analysis part 3(Mapper algorithm)
Tutorial of topological data analysis part 3(Mapper algorithm)Tutorial of topological data analysis part 3(Mapper algorithm)
Tutorial of topological data analysis part 3(Mapper algorithm)Ha Phuong
 
increasing the action gap - new operators for reinforcement learning
increasing the action gap - new operators for reinforcement learningincreasing the action gap - new operators for reinforcement learning
increasing the action gap - new operators for reinforcement learningRyo Iwaki
 
Pre-computation for ABC in image analysis
Pre-computation for ABC in image analysisPre-computation for ABC in image analysis
Pre-computation for ABC in image analysisMatt Moores
 
010_20160216_Variational Gaussian Process
010_20160216_Variational Gaussian Process010_20160216_Variational Gaussian Process
010_20160216_Variational Gaussian ProcessHa Phuong
 
Histogram Operation in Image Processing
Histogram Operation in Image ProcessingHistogram Operation in Image Processing
Histogram Operation in Image ProcessingVARUN KUMAR
 

What's hot (20)

PSF_Introduction_to_R_Package_for_Pattern_Sequence (1)
PSF_Introduction_to_R_Package_for_Pattern_Sequence (1)PSF_Introduction_to_R_Package_for_Pattern_Sequence (1)
PSF_Introduction_to_R_Package_for_Pattern_Sequence (1)
 
safe and efficient off policy reinforcement learning
safe and efficient off policy reinforcement learningsafe and efficient off policy reinforcement learning
safe and efficient off policy reinforcement learning
 
Self-adaptive container monitoring with performance-aware Load-Shedding policies
Self-adaptive container monitoring with performance-aware Load-Shedding policiesSelf-adaptive container monitoring with performance-aware Load-Shedding policies
Self-adaptive container monitoring with performance-aware Load-Shedding policies
 
The International Journal of Engineering and Science (The IJES)
The International Journal of Engineering and Science (The IJES)The International Journal of Engineering and Science (The IJES)
The International Journal of Engineering and Science (The IJES)
 
Domain Driven Design In C#3.0
Domain Driven Design In C#3.0Domain Driven Design In C#3.0
Domain Driven Design In C#3.0
 
Atomic algorithm and the servers' s use to find the Hamiltonian cycles
Atomic algorithm and the servers' s use to find the Hamiltonian cyclesAtomic algorithm and the servers' s use to find the Hamiltonian cycles
Atomic algorithm and the servers' s use to find the Hamiltonian cycles
 
Lecture 3 image sampling and quantization
Lecture 3 image sampling and quantizationLecture 3 image sampling and quantization
Lecture 3 image sampling and quantization
 
Choosing the right physics for WRF (5/2009)
Choosing the right physics for WRF (5/2009)Choosing the right physics for WRF (5/2009)
Choosing the right physics for WRF (5/2009)
 
Human activity spatio-temporal indicators using mobile phone data
Human activity spatio-temporal indicators using mobile phone dataHuman activity spatio-temporal indicators using mobile phone data
Human activity spatio-temporal indicators using mobile phone data
 
Click-Trough Rate (CTR) prediction
Click-Trough Rate (CTR) predictionClick-Trough Rate (CTR) prediction
Click-Trough Rate (CTR) prediction
 
Tutorial of topological data analysis part 3(Mapper algorithm)
Tutorial of topological data analysis part 3(Mapper algorithm)Tutorial of topological data analysis part 3(Mapper algorithm)
Tutorial of topological data analysis part 3(Mapper algorithm)
 
increasing the action gap - new operators for reinforcement learning
increasing the action gap - new operators for reinforcement learningincreasing the action gap - new operators for reinforcement learning
increasing the action gap - new operators for reinforcement learning
 
2019 GDRR: Blockchain Data Analytics - Dissecting Blockchain Price Analytics...
2019 GDRR: Blockchain Data Analytics  - Dissecting Blockchain Price Analytics...2019 GDRR: Blockchain Data Analytics  - Dissecting Blockchain Price Analytics...
2019 GDRR: Blockchain Data Analytics - Dissecting Blockchain Price Analytics...
 
W33123127
W33123127W33123127
W33123127
 
Python time
Python timePython time
Python time
 
Program on Mathematical and Statistical Methods for Climate and the Earth Sys...
Program on Mathematical and Statistical Methods for Climate and the Earth Sys...Program on Mathematical and Statistical Methods for Climate and the Earth Sys...
Program on Mathematical and Statistical Methods for Climate and the Earth Sys...
 
reportVPLProject
reportVPLProjectreportVPLProject
reportVPLProject
 
Pre-computation for ABC in image analysis
Pre-computation for ABC in image analysisPre-computation for ABC in image analysis
Pre-computation for ABC in image analysis
 
010_20160216_Variational Gaussian Process
010_20160216_Variational Gaussian Process010_20160216_Variational Gaussian Process
010_20160216_Variational Gaussian Process
 
Histogram Operation in Image Processing
Histogram Operation in Image ProcessingHistogram Operation in Image Processing
Histogram Operation in Image Processing
 

Viewers also liked

Guía para el manejo del infarto agudo de miocardio con elevación del segmento ST
Guía para el manejo del infarto agudo de miocardio con elevación del segmento STGuía para el manejo del infarto agudo de miocardio con elevación del segmento ST
Guía para el manejo del infarto agudo de miocardio con elevación del segmento STAndrés Fernando Fuentes Romero
 
Ml0017 mall management
Ml0017 mall managementMl0017 mall management
Ml0017 mall managementStudy Stuff
 
Aula de Publishe - para o Projeto Abolição da escravidão – rupturas e continu...
Aula de Publishe - para o Projeto Abolição da escravidão – rupturas e continu...Aula de Publishe - para o Projeto Abolição da escravidão – rupturas e continu...
Aula de Publishe - para o Projeto Abolição da escravidão – rupturas e continu...Beatriz Lorena Ramos Santos Protomartire
 
Incontinencia de orina de esfuerzo, tratamiento quirurgico con cintas vaginal...
Incontinencia de orina de esfuerzo, tratamiento quirurgico con cintas vaginal...Incontinencia de orina de esfuerzo, tratamiento quirurgico con cintas vaginal...
Incontinencia de orina de esfuerzo, tratamiento quirurgico con cintas vaginal...Pedro Vozza
 
Erwin. biología molecular de insulinorresistencia
Erwin. biología molecular de insulinorresistenciaErwin. biología molecular de insulinorresistencia
Erwin. biología molecular de insulinorresistenciaErwin Chiquete, MD, PhD
 
Sistemas politicos y sistemas electorales
Sistemas politicos y sistemas electoralesSistemas politicos y sistemas electorales
Sistemas politicos y sistemas electoralesNatasha Suarez
 
Ib0018 – export import finance
Ib0018 – export import financeIb0018 – export import finance
Ib0018 – export import financeStudy Stuff
 
INSUFICIENCIA VENOSA CRONICA Y TROMBOSIS VENOSA PROFUNDA
INSUFICIENCIA VENOSA CRONICA Y TROMBOSIS VENOSA PROFUNDAINSUFICIENCIA VENOSA CRONICA Y TROMBOSIS VENOSA PROFUNDA
INSUFICIENCIA VENOSA CRONICA Y TROMBOSIS VENOSA PROFUNDAVictor Jesus Lara Rosado
 
Mba projects(fin ,hr, mrktng)
Mba projects(fin ,hr, mrktng)Mba projects(fin ,hr, mrktng)
Mba projects(fin ,hr, mrktng)Study Stuff
 
Dashboards: Sabemos Comunicar o nosso Desempenho?
Dashboards: Sabemos Comunicar o nosso Desempenho?Dashboards: Sabemos Comunicar o nosso Desempenho?
Dashboards: Sabemos Comunicar o nosso Desempenho?comunidades@ina
 
Simposio ECNT: La Estrategía Nacional de Promoción y Prevención para una mejo...
Simposio ECNT: La Estrategía Nacional de Promoción y Prevención para una mejo...Simposio ECNT: La Estrategía Nacional de Promoción y Prevención para una mejo...
Simposio ECNT: La Estrategía Nacional de Promoción y Prevención para una mejo...Conferencia Sindrome Metabolico
 
Introdução à Infografia
Introdução à InfografiaIntrodução à Infografia
Introdução à InfografiaLeonardo Pereira
 
Anestesia En Enfermedades Tiroideas
Anestesia En Enfermedades TiroideasAnestesia En Enfermedades Tiroideas
Anestesia En Enfermedades Tiroideasguest8decbd
 
Action, enaction, inter(en)ation schiavio
Action, enaction, inter(en)ation   schiavioAction, enaction, inter(en)ation   schiavio
Action, enaction, inter(en)ation schiavioMaría Marchiano
 

Viewers also liked (18)

Guía para el manejo del infarto agudo de miocardio con elevación del segmento ST
Guía para el manejo del infarto agudo de miocardio con elevación del segmento STGuía para el manejo del infarto agudo de miocardio con elevación del segmento ST
Guía para el manejo del infarto agudo de miocardio con elevación del segmento ST
 
Ml0017 mall management
Ml0017 mall managementMl0017 mall management
Ml0017 mall management
 
Aula de Publishe - para o Projeto Abolição da escravidão – rupturas e continu...
Aula de Publishe - para o Projeto Abolição da escravidão – rupturas e continu...Aula de Publishe - para o Projeto Abolição da escravidão – rupturas e continu...
Aula de Publishe - para o Projeto Abolição da escravidão – rupturas e continu...
 
Incontinencia de orina de esfuerzo, tratamiento quirurgico con cintas vaginal...
Incontinencia de orina de esfuerzo, tratamiento quirurgico con cintas vaginal...Incontinencia de orina de esfuerzo, tratamiento quirurgico con cintas vaginal...
Incontinencia de orina de esfuerzo, tratamiento quirurgico con cintas vaginal...
 
Erwin. biología molecular de insulinorresistencia
Erwin. biología molecular de insulinorresistenciaErwin. biología molecular de insulinorresistencia
Erwin. biología molecular de insulinorresistencia
 
Sistemas politicos y sistemas electorales
Sistemas politicos y sistemas electoralesSistemas politicos y sistemas electorales
Sistemas politicos y sistemas electorales
 
Ib0018 – export import finance
Ib0018 – export import financeIb0018 – export import finance
Ib0018 – export import finance
 
INSUFICIENCIA VENOSA CRONICA Y TROMBOSIS VENOSA PROFUNDA
INSUFICIENCIA VENOSA CRONICA Y TROMBOSIS VENOSA PROFUNDAINSUFICIENCIA VENOSA CRONICA Y TROMBOSIS VENOSA PROFUNDA
INSUFICIENCIA VENOSA CRONICA Y TROMBOSIS VENOSA PROFUNDA
 
Mba projects(fin ,hr, mrktng)
Mba projects(fin ,hr, mrktng)Mba projects(fin ,hr, mrktng)
Mba projects(fin ,hr, mrktng)
 
Dashboards: Sabemos Comunicar o nosso Desempenho?
Dashboards: Sabemos Comunicar o nosso Desempenho?Dashboards: Sabemos Comunicar o nosso Desempenho?
Dashboards: Sabemos Comunicar o nosso Desempenho?
 
Simposio ECNT: La Estrategía Nacional de Promoción y Prevención para una mejo...
Simposio ECNT: La Estrategía Nacional de Promoción y Prevención para una mejo...Simposio ECNT: La Estrategía Nacional de Promoción y Prevención para una mejo...
Simposio ECNT: La Estrategía Nacional de Promoción y Prevención para una mejo...
 
Introdução à Infografia
Introdução à InfografiaIntrodução à Infografia
Introdução à Infografia
 
OSTEOMIELITIS. CASO CLINICO TERAPÉUTICO
OSTEOMIELITIS. CASO CLINICO TERAPÉUTICOOSTEOMIELITIS. CASO CLINICO TERAPÉUTICO
OSTEOMIELITIS. CASO CLINICO TERAPÉUTICO
 
Anestesia En Enfermedades Tiroideas
Anestesia En Enfermedades TiroideasAnestesia En Enfermedades Tiroideas
Anestesia En Enfermedades Tiroideas
 
Action, enaction, inter(en)ation schiavio
Action, enaction, inter(en)ation   schiavioAction, enaction, inter(en)ation   schiavio
Action, enaction, inter(en)ation schiavio
 
Tormenta tiroidea
Tormenta tiroideaTormenta tiroidea
Tormenta tiroidea
 
Cristalino anatomia y fisiologia
Cristalino anatomia y fisiologiaCristalino anatomia y fisiologia
Cristalino anatomia y fisiologia
 
La radio
La radioLa radio
La radio
 

Similar to Report

Enhancement and Analysis of Chaotic Image Encryption Algorithms
Enhancement and Analysis of Chaotic Image Encryption Algorithms Enhancement and Analysis of Chaotic Image Encryption Algorithms
Enhancement and Analysis of Chaotic Image Encryption Algorithms cscpconf
 
Frequency magnitude distribution
Frequency magnitude distributionFrequency magnitude distribution
Frequency magnitude distributioneinsteinxxx
 
Chaos Image Encryption using Pixel shuffling
Chaos Image Encryption using Pixel shuffling Chaos Image Encryption using Pixel shuffling
Chaos Image Encryption using Pixel shuffling cscpconf
 
Road Segmentation from satellites images
Road Segmentation from satellites imagesRoad Segmentation from satellites images
Road Segmentation from satellites imagesYoussefKitane
 
Application of matrices in Daily life
Application of matrices in Daily lifeApplication of matrices in Daily life
Application of matrices in Daily lifeshubham mishra
 
A novel technique for speech encryption based on k-means clustering and quant...
A novel technique for speech encryption based on k-means clustering and quant...A novel technique for speech encryption based on k-means clustering and quant...
A novel technique for speech encryption based on k-means clustering and quant...journalBEEI
 
A novel elementary spatial expanding scheme form on SISR method with modifyin...
A novel elementary spatial expanding scheme form on SISR method with modifyin...A novel elementary spatial expanding scheme form on SISR method with modifyin...
A novel elementary spatial expanding scheme form on SISR method with modifyin...TELKOMNIKA JOURNAL
 
Method for a Simple Encryption of Images Based on the Chaotic Map of Bernoulli
Method for a Simple Encryption of Images Based on the Chaotic Map of BernoulliMethod for a Simple Encryption of Images Based on the Chaotic Map of Bernoulli
Method for a Simple Encryption of Images Based on the Chaotic Map of BernoulliAIRCC Publishing Corporation
 
METHOD FOR A SIMPLE ENCRYPTION OF IMAGES BASED ON THE CHAOTIC MAP OF BERNOULLI
METHOD FOR A SIMPLE ENCRYPTION OF IMAGES BASED ON THE CHAOTIC MAP OF BERNOULLIMETHOD FOR A SIMPLE ENCRYPTION OF IMAGES BASED ON THE CHAOTIC MAP OF BERNOULLI
METHOD FOR A SIMPLE ENCRYPTION OF IMAGES BASED ON THE CHAOTIC MAP OF BERNOULLIijcsit
 
METHOD FOR A SIMPLE ENCRYPTION OF IMAGES BASED ON THE CHAOTIC MAP OF BERNOULLI
METHOD FOR A SIMPLE ENCRYPTION OF IMAGES BASED ON THE CHAOTIC MAP OF BERNOULLIMETHOD FOR A SIMPLE ENCRYPTION OF IMAGES BASED ON THE CHAOTIC MAP OF BERNOULLI
METHOD FOR A SIMPLE ENCRYPTION OF IMAGES BASED ON THE CHAOTIC MAP OF BERNOULLIAIRCC Publishing Corporation
 
COLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMS
COLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMSCOLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMS
COLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMSIJNSA Journal
 
COLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMS
COLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMSCOLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMS
COLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMSIJNSA Journal
 
COLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMS
COLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMSCOLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMS
COLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMSIJNSA Journal
 
IMAGE ENCRYPTION BASED ON DIFFUSION AND MULTIPLE CHAOTIC MAPS
IMAGE ENCRYPTION BASED ON DIFFUSION AND MULTIPLE CHAOTIC MAPSIMAGE ENCRYPTION BASED ON DIFFUSION AND MULTIPLE CHAOTIC MAPS
IMAGE ENCRYPTION BASED ON DIFFUSION AND MULTIPLE CHAOTIC MAPSIJNSA Journal
 

Similar to Report (20)

Enhancement and Analysis of Chaotic Image Encryption Algorithms
Enhancement and Analysis of Chaotic Image Encryption Algorithms Enhancement and Analysis of Chaotic Image Encryption Algorithms
Enhancement and Analysis of Chaotic Image Encryption Algorithms
 
www.ijerd.com
www.ijerd.comwww.ijerd.com
www.ijerd.com
 
Journal_ICACT
Journal_ICACTJournal_ICACT
Journal_ICACT
 
Frequency magnitude distribution
Frequency magnitude distributionFrequency magnitude distribution
Frequency magnitude distribution
 
Chaos Image Encryption using Pixel shuffling
Chaos Image Encryption using Pixel shuffling Chaos Image Encryption using Pixel shuffling
Chaos Image Encryption using Pixel shuffling
 
Road Segmentation from satellites images
Road Segmentation from satellites imagesRoad Segmentation from satellites images
Road Segmentation from satellites images
 
Application of matrices in Daily life
Application of matrices in Daily lifeApplication of matrices in Daily life
Application of matrices in Daily life
 
T01022103108
T01022103108T01022103108
T01022103108
 
A novel technique for speech encryption based on k-means clustering and quant...
A novel technique for speech encryption based on k-means clustering and quant...A novel technique for speech encryption based on k-means clustering and quant...
A novel technique for speech encryption based on k-means clustering and quant...
 
Laboratory 7
Laboratory 7Laboratory 7
Laboratory 7
 
A novel elementary spatial expanding scheme form on SISR method with modifyin...
A novel elementary spatial expanding scheme form on SISR method with modifyin...A novel elementary spatial expanding scheme form on SISR method with modifyin...
A novel elementary spatial expanding scheme form on SISR method with modifyin...
 
Method for a Simple Encryption of Images Based on the Chaotic Map of Bernoulli
Method for a Simple Encryption of Images Based on the Chaotic Map of BernoulliMethod for a Simple Encryption of Images Based on the Chaotic Map of Bernoulli
Method for a Simple Encryption of Images Based on the Chaotic Map of Bernoulli
 
METHOD FOR A SIMPLE ENCRYPTION OF IMAGES BASED ON THE CHAOTIC MAP OF BERNOULLI
METHOD FOR A SIMPLE ENCRYPTION OF IMAGES BASED ON THE CHAOTIC MAP OF BERNOULLIMETHOD FOR A SIMPLE ENCRYPTION OF IMAGES BASED ON THE CHAOTIC MAP OF BERNOULLI
METHOD FOR A SIMPLE ENCRYPTION OF IMAGES BASED ON THE CHAOTIC MAP OF BERNOULLI
 
METHOD FOR A SIMPLE ENCRYPTION OF IMAGES BASED ON THE CHAOTIC MAP OF BERNOULLI
METHOD FOR A SIMPLE ENCRYPTION OF IMAGES BASED ON THE CHAOTIC MAP OF BERNOULLIMETHOD FOR A SIMPLE ENCRYPTION OF IMAGES BASED ON THE CHAOTIC MAP OF BERNOULLI
METHOD FOR A SIMPLE ENCRYPTION OF IMAGES BASED ON THE CHAOTIC MAP OF BERNOULLI
 
How to Decide the Best Fuzzy Model in ANFIS
How to Decide the Best Fuzzy Model in ANFIS How to Decide the Best Fuzzy Model in ANFIS
How to Decide the Best Fuzzy Model in ANFIS
 
COLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMS
COLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMSCOLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMS
COLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMS
 
COLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMS
COLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMSCOLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMS
COLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMS
 
COLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMS
COLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMSCOLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMS
COLOR IMAGE ENCRYPTION BASED ON MULTIPLE CHAOTIC SYSTEMS
 
Pakdd
PakddPakdd
Pakdd
 
IMAGE ENCRYPTION BASED ON DIFFUSION AND MULTIPLE CHAOTIC MAPS
IMAGE ENCRYPTION BASED ON DIFFUSION AND MULTIPLE CHAOTIC MAPSIMAGE ENCRYPTION BASED ON DIFFUSION AND MULTIPLE CHAOTIC MAPS
IMAGE ENCRYPTION BASED ON DIFFUSION AND MULTIPLE CHAOTIC MAPS
 

Report

  • 1. Kernel-based Visualization of Point Patterns in Python with Application to Tornado Landing Data Weifeng Hu Computer Science, University of Michigan ∗ March 25, 2015 ∗Mentored by Dr. Stilian Stoev 1
  • 2. Abstract Many data sets come as point patterns of the form (longitude, latitude, time, magnitude). The examples of data sets in this format includes tor- nado events, origins/destination of internet flows, earthquakes, terrorist attacks and etc. It is difficult to visualize the data with simple plot- ting. This research project studies and implements non-parametric kernel smoothing in Python as a way of visualizing the intensity of point patterns in space and time. A two-dimensional grid M with size mx ×my is used to store the calculation result for the kernel smoothing of each grid points. The heat-map in Python then uses the grid to plot the resulting images on a map where the resolution is determined by mx and my. The resulting images also depend on a spatial and a temporal smoothing parameters, which control the resolution (smoothness) of the figure. The Python code is applied to visualize over 56,000 tornado landings in the continental U.S. from the period 1950 - 2014. The magnitudes of the tornado are based on Fujita scale. 1 Introduction Internet traffic flow can be visualized as large point patterns in a plane. The Internet data comes in the format of (longitude, latitude, payload, time). The same data format is also applicable to tornadoes touchdown data, where the payload is the Fujita scale of tornado incidents, earthquake and terrorist at- tacks. Since the data varies both spatially and temporally, it is difficult to visualize them using simple plot. As a result, this research project implements in Python kernel smoothing, which is a non-parametric statistical methodology that can easily and computationally efficiently visualize large point patterns in space. Kernel smoothing allows visualization and further analysis of spatial- temporal features in point-patterns such as locations, timing and magnitude of the internet traffic. This research report describes the experimental methods of implementing kernel smoothing in Python programming language. We use 56,000 tornado data from the past five decades and visualize them using Python heat map. In particular we will be focusing on the formulas that are applied to the input data. The output of the Python program will be a matrix containing the results of kernel smoothing and the matrix is plotted on a heat map of the United States based on the values of its entries. Further analysis includes taking the differences of two output images with different temporal bandwidth to observe the change in tornado intensity over different periods of time. This project also studies how to improve the computational efficiency in Python when kernel smoothing is applied to relatively large data sets and/or 1
  • 3. high resolution of the resulting images is required. We implemented an algo- rithm that creates a smaller two-dimensional grid S with size n×n. For a point on the bigger grid M, we only apply kernel smoothing to grid points that fall within the grid S. Intuitively, the size of grid M plays no role in computation in this algorithm, which allows us to handle larger resolution in relatively efficient way. 2 Kernel Smoothing Methodology Kernel smoothing is a technique to estimate the real valued function by using its noisy observations. Consider the data set (xi, yi, ti, pi), where x is the longitude, y is the latitude, t is the time and p is the payload of the network data. In order to visualize the data points in the map, we need to use couple formulas for kernel smoothing. The kernel of the data can be calculated using the formula K(x, y) = 1 2 · π · e −(x2+y2) 2 (1) K is the probability density function of the covariate normal distribution with zero mean and variance to be the covariance matrix. Σ = 1 0 0 1 After we obtain the formula for kernel, the spatial kernel smoothing can also be obtained using Kh(x, y) = 1 h2 · K( x h , y h ) (2) In formula 2, h is the smoothing parameter called bandwidth. It has strong influences in the result. However, choosing a appropriate bandwidth is relatively difficult. We also consider temporal kernel smoothing because the data varies with time. Temporal kernel smoothing uses a different bandwidth, denoted as ht. The formula for temporal kernel smoothing, when given a bandwidth ht is shown below. Note that temporal kernel smoothing only affects the point after a specific time. If the input time is negative, meaning the data happens before the time, we return the temporal kernel smoothing as zero. Tλ(t) = λ · e−tλ t > 0 0 t ≤ 0 (3) We visualize the data points on a map using Python heat map by first creating a two-by-two grid M. We divide the grid, according to user’s input of 2
  • 4. number of pixels in x direction mx and number of pixels in y direction my, such that xk = x0 + ∆x · (k + 1 2 ), k = 1, 2, 3 · · · , mx − 1 (4) yl = y0 + ∆y · (l + 1 2 ), l = 1, 2, 3 · · · , my − 1 (5) Therefore, for every x, y point in the grid M, when given a bandwidth h and data of size N, can be calculated using Ph(x, y, t) = 1 N · N i=1 pi · Kh(x − xi, y − yi) · Tλ(t − ti) (6) The following figure shows the implementation in Python of the kernel smoothing formulas. 3 Algorithm The algorithm, however, is in order of mx · my · N when implemented using nested loops in Python. The scaling efficiency becomes problematic when the user requires high resolution, i.e. bigger values of mx and my. As a result, a new algorithm is implemented to improve the efficiency of the program. The pseudo-code is described below. 3
  • 5. 1: procedure Calc index(min, max, position, pixel) 2: index = int(pixel · (position−min) max−min ) 3: return index 4: end procedure 5: 6: procedure S Matrix Entries(k, h, row, col) 7: entry = 1 2·π·h2 · e− 1 h2 ·((row−(k+1))2 +(col−(k+1))2 ) 8: return entry 9: end procedure 10: 11: for i = 0 to 2 · k do 12: for j = 0 to 2 · k do 13: S(i, j) = S Matrix Entries(k, bandwidth, i, j) 14: end for 15: end for 16: 17: S = normalize(S) 18: 19: for i = 0 to number of data points do 20: X index = Calc index(min(xi), max(xi), xi, mx) 21: Y index =Calc index(min(yi), max(yi), yi, my) 22: m idx left = max(0, X index − k) 23: m idx right = min(X index + k + 1, mx) 24: m idy left = max(0, Y index − k) 25: m idy right = min(Y index + k + 1, my) 26: s idx left = max(0, k − X index) 27: s idx right = min(k + mx − X index, 2 ∗ k + 1) 28: s idy left = max(0, k − Y index) 29: s idy right = min(k + my − Y index, 2 ∗ k + 1) 30: xm = [m idx left : m idx right] 31: xs = [s idx left : s idx right] 32: ym = [m idy left : m idy right] 33: ys = [s idy left : s idy right] 34: M[xm, ym]+ = S[xs, ys] · pi · Tλ(t − ti) 35: end for This algorithm uses a smaller two dimensional grid S of size s × s,to apply kernel smoothing only to the grid points that are close to the data of interest, i.e. those falling within the grid S. The value of k determines the size of grid S. Intuitively the algorithm, when given a data size N, is in order of k2 · N. We can see that it is easy to handle higher resolution map now because the pixels mx, my play no role in computation except memory storage in computers. Notice that the last part of the pseudo-code presented involves the calcula- 4
  • 6. tions to update the specific regions of the matrix. The math involves using max or min to find out which regions of matrix M needs to be updated by the matrix S. We take into consideration of the edge cases where the smaller S matrix may go out of bound of the bigger M matrix when applying kernel smoothing to a region. After we manage to determine which regions need to be updated, we used vectorization in Python that helps improve the speed of the program. 4 Result & Applications We first apply the algorithm for tornado data, which is similar to internet data because it has touchdown longitude and latitude, time and Fujita scale. We extracted all 56,000 tornado data for the past 50 years. The data is loaded in Python and the heat map is drawn on the map of the United States. Figure 1: Tornado Touchdown Data Visualized on Heatmap of the United States We can see the “tornado alley” in the US map from Figure 1. The “tornado alley” is area where tornado is most frequent. The core of “tornado alley” extends from northern Texas, Oklahoma, Kansas and Nebraska. Figure 1 clearly shows that the states with high frequency of tornado are Oklahoma, Texas, Colorado and Kansas, which is very consistent with the areas that are recognized as in the “tornado alley”. The values on the colorbar is calculated based on the temporal bandwidth, spatial bandwidth and Fujita scale of the touchdown tornado using the calculation described in Algorithm section. We can also investigate the change in intensity over years. The plots below shows how the intensity of tornado touchdown changes in the United States in a ten-year interval for the past 50 years. 5
  • 7. (a) Intensity of Tornado From 1950 to 1960 (b) Tornado Intensity Change From 1960 to 1970 (c) Tornado Intensity Change From 1970 to 1980 (d) Tornado Intensity Change From 1980 to 1990 (e) Tornado Intensity Change From 1990 to 2000 (f) Tornado Intensity Change From 2000 to 2010 It is clear in the plots that for the first ten years(Figure (a)), there are more incidents of tornado touchdowns in states such as Oklahoma and Kansas. In Figure (b), we can see that the intensity of tornado shows a dramatic increase in 6
  • 8. the northern part of the country like Wisconsin, Iowa and Illinois. In Figure (c) the intensity of the tornado in the northern part of the United States decreased while the intensity of tornado in southern states such as Mississippi, Texas and Alabama, which is on the east part of the core areas of the “tornado alley”, increased significantly than first two decades. The Figures (d), (e), (f) show the most recent intensity changes of tornado touchdowns. We can see from the plots that there is a dramatic increase in both the frequency and the intensity of tornadoes on the north part of the “tornado alley” from year 1990 to 2010. It suggests that the frequency of tornado is increasing towards the northern part of “tornado valley” in the state of South Dakota, the boarder of North Dakota and Wisconsin and extends to the Canadian broader. Moreover, the frequency of tornado also increases towards the southeast part of the “tornado alley”. In Figures (e) and (f), which is the most recent change in intensity of tornado, southern states such as Florida, Georgia, and Alabama have suffered more incidents of tornado than in 1950s. However, the intensity of tornado show little changes in northern states like New York and Massachusetts, and the east coast. We can also investigate the total intensity of tornado in the past six decades. The figure below shows the total intensity calculated using kernel smoothing. Figure 2: Total Intensity of Tornado Touchdown in the Past Decades Note that the intensity of tornado starting from 2010 is calculated only based on year 2010-2014. However, we can see that the total intensity is still almost the same as the past years. It is because during 2011 there was a Tornado Super Outbreak, which is the largest tornado outbreak ever recorded. As a result, the total intensity in 2010-2014 still matches the intensity level of the previous decades thanks to the Super Outbreak in 2011. 7
  • 9. In conclusion, the result of this project shows that the intensity of tornado shows greater increase in the states that are on the north of the “tornado alley” such as South Dakota and Nebraska. The southern states that are on the east of the “tornado alley” such as Mississippi and Alabama also suffer from an increase in tornado intensity recently. The intensity of tornado shows little change in northern states and the east coast. However, the states in the “tornado alley” like Oklahoma and Texas still suffer high intensity of tornado in the past 50 years, which can be seen in Figure 1. In short, the intensity of tornado touchdowns has a trend to increase towards the north of the “tornado alley” to the Canadian boarder, and towards the east of the “tornado alley” to the southern states of the US. 8
  • 10. 5 Appendix - Python Code Please click on the following link for more information https://github.com/weifhu0124/Undergraduate Research References [1] Tornado History Project http://www.tornadohistoryproject.com/ [2] Tornado Alley http://www.si.edu/Content/SE/Educator%20Guides/Tornado EdGuide R5.pdf/ 9