Storm Prediction
Analysis
Ankit Dargad
Gautam Sawant
Janhavi Kandalgaonkar
Multivariate Data Analysis
Prof. David Belanger
Overview:
• Tropical cyclones, storms, and tornadoes cause heavy loss of life and property every year
• An estimated 1.9 million people have perished due to cyclones during the last two centuries
• The United States is one of the worst-affected countries in terms of property loss due to cyclones
• In addition to the human loss, the United States has suffered property loss in excess of $10 billion in each of the last 8 years
• Can we predict the loss caused by cyclones from past data, and thereby provide insights that help disaster management efforts reduce that loss?
Project Summary
• Data Source: The dataset contains information on tornadoes from 1950 to 2015.
• The dataset was created by the National Weather Service and is available at http://www.spc.noaa.gov/gis/svrgis/
• Project Objective: We plan to analyze the storm data and provide insights that can help disaster management teams channel their resources more effectively for future cyclones
• The analysis will include a state-wise analysis of the worst-affected states
• We also try to predict the monetary loss, which is a good indicator of the intensity of a cyclone, and use this information to deploy rescue efforts as soon as a new cyclone is predicted
Data understanding
• The data contains 60,114 rows, each an instance of a cyclone, and 21 columns (attributes) for each cyclone
• Following is a list of variables in the dataset:
Variable Nos. | Variable Type/Description | Variable names
1-7 | Information regarding day, date and time of tornadoes | om, yr (year), mo (month), day, date, time, tz (timezone)
8-10 | State-related information | state, stf (state FIPS no.), stn (state no.)
11-15 | Information related to magnitude and loss in terms of human life and money | mag, inj (injuries), fatalities, loss, closs (crop loss)
16-21 | Attributes for measuring storm/hurricane | slat (starting latitude), slon (starting longitude), elat, elon, len, wid
Data Quality Check & Cleaning
• Correlation Matrix
Predictors that correlate highly with the target variable are:
1. Magnitude
2. Fatalities
3. Length of Tornado
4. Width of Tornado
• Missing Values
There were no missing values in the dataset.
• Outlier Detection
No significant outliers were found in the dataset.
• Data Split:
The 60,114 storm instances were split at random.
The training dataset contains 20,000 instances.
The testing dataset contains 40,114 instances.
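The random split described above can be sketched in a few lines. This is only an illustration, assuming the records sit in a Python sequence; the function name `split_storms` and the fixed seed are hypothetical and not taken from the original analysis.

```python
import random

def split_storms(records, n_train=20_000, seed=42):
    """Randomly split records into a training list and a testing list."""
    rng = random.Random(seed)      # fixed seed (an assumption) for repeatability
    shuffled = list(records)       # copy so the caller's data is left untouched
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

# Row indices stand in for the 60,114 storm records.
train, test = split_storms(range(60_114))
print(len(train), len(test))       # sizes of the two parts
```

Shuffling once and slicing guarantees that the two parts are disjoint and together cover every record.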
Correlation Matrix
[Figure: correlation matrix]
State-wise loss Prediction
• This analysis looks at total property loss and tornado frequency by state, using the slice of the data from 1996 through 2015.
• The data is then indexed and aggregated by state, giving the tornado frequency and the sum of total property damage for each state.
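The slice-and-aggregate step could look roughly like the sketch below. It assumes each record is a dictionary with the keys `yr`, `state` and `loss` (names borrowed from the variable list); the helper `summarize_by_state` and the sample rows are made up for illustration and are not values from the dataset.

```python
from collections import defaultdict

def summarize_by_state(rows, start=1996, end=2015):
    """Return {state: (tornado_count, total_loss)} for start <= yr <= end."""
    counts = defaultdict(int)
    losses = defaultdict(float)
    for row in rows:
        if start <= row["yr"] <= end:          # keep only the 1996-2015 slice
            counts[row["state"]] += 1
            losses[row["state"]] += row["loss"]
    return {state: (counts[state], losses[state]) for state in counts}

# Hypothetical rows, for illustration only.
rows = [
    {"yr": 1995, "state": "TX", "loss": 1.0},  # outside the slice, ignored
    {"yr": 1996, "state": "TX", "loss": 2.0},
    {"yr": 2010, "state": "TX", "loss": 0.5},
    {"yr": 2015, "state": "FL", "loss": 3.0},
]
summary = summarize_by_state(rows)
most_frequent = max(summary, key=lambda s: summary[s][0])  # most tornadoes
highest_loss = max(summary, key=lambda s: summary[s][1])   # largest total loss
```

Ranking the same summary by count and by loss gives the two views used for relief and fund allocation.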
Funds Allocation
FL (Florida), with a total property loss of $923.86 million, requires the maximum fund allocation.
Relief Measures Allocation
TX (Texas), with 2,767 tornado occurrences, needs to be allocated the maximum relief measures.
Multiple Linear Regression
• Multiple linear regression attempts to model the relationship between two or more explanatory variables and a response variable by fitting a linear equation to observed data. Every value of the independent variable x is associated with a value of the dependent variable y.
• We built a multiple regression model on the training data and applied it to the testing data
• The analysis shows that a total of 16 variables are significant when loss is taken as the dependent variable and all the remaining variables as independent variables
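To make the fit-on-training, apply-to-testing idea concrete, here is a minimal ordinary least squares sketch that solves the normal equations directly. It is not the model from the slides: the helpers `fit_linear` and `predict` and the toy numbers are assumptions chosen so that the exact answer is known.

```python
def fit_linear(X, y):
    """Ordinary least squares via the normal equations (A'A) b = A'y.

    X is a list of predictor rows without an intercept column.
    Returns [intercept, b1, b2, ...].
    """
    A = [[1.0] + list(row) for row in X]       # prepend the intercept column
    p = len(A[0])
    # Augmented matrix [A'A | A'y].
    M = [[sum(a[i] * a[j] for a in A) for j in range(p)]
         + [sum(a[i] * yi for a, yi in zip(A, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(p):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]
    return [M[i][p] / M[i][i] for i in range(p)]

def predict(coefs, row):
    """Apply a fitted model to one row of predictors."""
    return coefs[0] + sum(b * x for b, x in zip(coefs[1:], row))

# Toy training data generated from y = 1 + 2*x1 + 3*x2 (not storm data).
X_train = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y_train = [1, 3, 4, 6, 8]
coefs = fit_linear(X_train, y_train)
y_test_pred = predict(coefs, (3, 2))           # apply the model to an unseen row
```

In practice a statistics package would also report the significance of each coefficient, which this sketch does not attempt.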
Step-wise Multiple Regression
• Stepwise regression helps us confirm the best variables for performing multiple regression.
• We will use the result of the stepwise regression in further analysis
• Instead of using all the independent variables, we will use only the significant variables identified by this analysis
• Again, we applied the model generated from the training data to the testing data
Principal Component Analysis
• We next calculate the principal components using PCA.
• The resulting principal components were shown in a screenshot on the original slide:
[Screenshot: principal components output]
Proportion of Variance explained
• The first 8 components explain 75% of the variance
• We now run the algorithms using the first 8 principal components and check whether the principal components improve the performance of our model
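Choosing how many components to keep comes down to a cumulative sum of the component variances. The sketch below shows that calculation; the function `components_needed` and the variances passed to it are hypothetical, not the values behind the 75% figure above.

```python
def components_needed(variances, threshold=0.75):
    """Return (k, share): the smallest number of leading components whose
    cumulative share of total variance reaches the threshold."""
    total = sum(variances)
    cumulative = 0.0
    for k, v in enumerate(sorted(variances, reverse=True), start=1):
        cumulative += v
        if cumulative / total >= threshold:
            return k, cumulative / total
    return len(variances), 1.0

# Made-up component variances (eigenvalues), for illustration only.
k, share = components_needed([4.0, 2.0, 1.0, 1.0])
print(k, share)
```

With these made-up variances the first two components carry 6 of the 8 units of variance, which is exactly the 75% threshold.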
Random Forest
• Random forests, or random decision forests, are an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes (classification) or the mean prediction (regression) of the individual trees.
• We obtained confusion matrices for the random forest prediction of loss on the testing data, with and without PCA
• In our case, accuracy decreases when the principal components are used
Accuracy without PCA | Accuracy with PCA
86.98% | 85.38%
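The "mode of the classes" and "mean prediction" in the definition above refer only to how the outputs of the individual trees are combined. A minimal sketch of that combining step follows; it does not build any trees, and the function names and sample votes are hypothetical.

```python
from collections import Counter
from statistics import mean

def forest_classify(tree_votes):
    """Classification: return the most common class among the tree votes."""
    return Counter(tree_votes).most_common(1)[0][0]

def forest_regress(tree_predictions):
    """Regression: return the mean of the individual tree predictions."""
    return mean(tree_predictions)

# Hypothetical outputs of three trees for one storm.
loss_class = forest_classify(["High", "Low", "High"])
loss_value = forest_regress([2.0, 4.0, 6.0])
```

When two classes tie, `Counter.most_common` returns the one that was counted first, so a real implementation may want an explicit tie-breaking rule.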
Linear Discriminant Analysis
• Discriminant analysis is used to classify individuals into one of two or more groups on the basis of measurements
• We try to classify the loss of future cyclones as Low, Medium or High (1, 2, 3) using the past data
• We obtained LDA confusion matrices without principal components and with principal components
• The accuracy of the model is better without principal components
K-Means to predict Emergency level
• The k-means clustering algorithm partitions n observations into k clusters, in which each observation belongs to the cluster with the nearest mean, which serves as the prototype of the cluster.
• K-means clustering is applied to the storm dataset to define the levels (clusters) of emergency under which a particular storm can be classified.
• The length (in miles) and width (in yards) of the storm are used to build the clusters.
• A total of 60,114 observations are partitioned into 6 clusters, defining 6 levels of emergency, with level 1 the lowest emergency level and level 6 the highest.
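The clustering idea can be illustrated with a small, plain implementation of Lloyd's k-means on (length, width) pairs. This is a sketch under stated assumptions: the toy points, the seed, the use of k = 2 instead of 6, and the rule that ranks clusters by their centers to obtain emergency levels are all hypothetical and not taken from the original analysis.

```python
import random

def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm: returns (centers, labels) for a list of tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)            # initial centers drawn from the data
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest center.
        labels = [min(range(k),
                      key=lambda c: sum((a - b) ** 2
                                        for a, b in zip(p, centers[c])))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(d) / len(members)
                                         for d in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:              # converged
            break
        centers = new_centers
    return centers, labels

def emergency_levels(centers):
    """Map cluster index to level; level 1 is the smallest center (assumed rule)."""
    order = sorted(range(len(centers)), key=lambda c: centers[c])
    return {c: level for level, c in enumerate(order, start=1)}

# Made-up (length in miles, width in yards) pairs, for illustration only.
storms = [(0.1, 10), (0.2, 12), (0.15, 11), (20.0, 800), (22.0, 900), (21.0, 850)]
centers, labels = kmeans(storms, k=2)
levels = emergency_levels(centers)
```

Because width in yards spans a much larger numeric range than length in miles, the distances here are dominated by width; standardizing both columns first is a common choice, though the slides do not say whether that was done.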
Random Forest to predict frequency of storms in different seasons
• The random forest algorithm is used to predict the frequency of storms in different seasons, so as to analyze the effect of climatic conditions on storms.
• Season data was created from the month in which the tornado occurred:
Months | Season
1-2 (January-February) | Winter
3-6 (March-June) | Spring
7-9 (July-September) | Summer
10-12 (October-December) | Fall
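The month-to-season mapping in the table above is simple enough to write as a small function. The name `season_of` is hypothetical; the month ranges are exactly those listed in the table.

```python
def season_of(month):
    """Map a month number (1-12) to the season used in this analysis."""
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    if month <= 2:
        return "Winter"    # January-February
    if month <= 6:
        return "Spring"    # March-June
    if month <= 9:
        return "Summer"    # July-September
    return "Fall"          # October-December

seasons = [season_of(m) for m in range(1, 13)]
```

Applying this to the `mo` column would yield the season label that the random forest is then asked to predict.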
• Confusion Matrix:
• Calculating % Accuracy:
(Fall + Spring + Summer + Winter) / (Number of Observations),
where each season term is the number of storms correctly classified for that season
= (1357 + 36739 + 2228 + 460) / 60114
= 67.84%
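The accuracy arithmetic can be checked directly from the counts quoted above. The dictionary below simply restates those four counts; the variable names are illustrative.

```python
# Correctly classified storms per season, as quoted in the calculation above.
correct = {"Fall": 1357, "Spring": 36739, "Summer": 2228, "Winter": 460}
total_observations = 60114

accuracy = sum(correct.values()) / total_observations
print(f"{accuracy:.2%}")   # prints 67.84%
```

The four counts sum to 40,784, and 40,784 / 60,114 rounds to 67.84%.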
• Accuracy = 67.84%
• Hence, our model not only predicted the season with 67.84% accuracy but also showed the difference in the occurrence of storms across seasons in the U.S.
• The analysis found that storms were most common in spring and least common in winter.
• The model can be used by government entities such as disaster management and rescue operations teams to take the required precautions in each season and reduce loss.
Conclusion
• We performed several different analyses: analysis of state-wise loss, prediction of loss through classification models, prediction of the seasons of cyclones, and clustering.
• We conclude that the loss from cyclones can be successfully predicted beforehand and that rescue efforts can be directed accordingly to increase their effectiveness.
• We also saw that, for our data, the prediction results are better without PCA. Hence we recommend developing models without dimension reduction for this dataset
• We found random forest to be the most accurate in predicting loss, with 86.98% accuracy. Hence we will go ahead with this model for prediction
• We were able to assign the level of emergency using clustering.
• We were also able to predict the seasons when storms are most likely to occur, and can accordingly keep a tab on the readiness of the rescue efforts.
Thank You


Storm Prediction data analysis using R/SAS

  • 1. Storm Prediction Analysis Ankit Dargad Gautam Sawant Janhavi Kandalgaonkar Multivariate Data Analysis Prof. David Belanger
  • 2. 2 Overview: • Tropical Cyclones, Storms and Tornados cause huge amount of human and property loss each year • 1.9 million people have perished due to cyclones during the last two years • United States is one of the worst affected countries in terms of property loss due to cyclones • In addition to the human loss United States has suffered a property loss in excess of $10 Billion for each of the last 8 years • Can we predict the loss caused from cyclones from past data and thereby provide relevant insights to the disaster management efforts to actually reduce the loss?
  • 3. 3 Project Summary • Data Source: Dataset contains information on tornadoes from 1950 - 2015. • Dataset created by National weather service and available at http://www.spc.noaa.gov/gis/svrgis/ • Project Objective: We plan to analyze the storm data and provide insights that can help the disaster management teams to better channelize their resources for future cyclones • Analysis will include state wise analysis of worst affected states • We can also try to predict the the revenue loss which is a good indicator of intensity of the cyclone and use this information to deploy rescue efforts as soon as a new cyclone is predicted
  • 4. Data understanding
    • The data contains 60,114 rows, each an instance of a cyclone, and 21 columns (attributes) for each cyclone.
    • Variables in the dataset, listed as variable numbers, description, and variable names:
      • 1-7: day, date, and time of the tornado: om, yr (year), mo (month), day, date, time, tz (time zone)
      • 8-10: state-related information: state, stf (state FIPS number), stn (state number)
      • 11-15: magnitude and loss in terms of human life and money: mag, inj (injuries), fatalities, loss, closs (crop loss)
      • 16-21: attributes for measuring the storm/hurricane: slat (starting latitude), slon (starting longitude), elat, elon, len, wid
  • 5. Data Quality Check & Cleaning
    • Correlation matrix: the predictors that correlate highly with the target variable are (1) magnitude, (2) fatalities, (3) length of tornado, and (4) width of tornado.
    • Missing values: there were no missing values in the dataset.
    • Outlier detection: no significant outliers were found in the dataset.
    • Data split: the 60,114 storm instances were split at random into a training dataset of 20,000 rows and a testing dataset of 40,114 rows.
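The random 20,000 / 40,114 split described on this slide could be reproduced along the following lines. This is a minimal sketch, not the authors' R/SAS code: the function name `split_rows`, the fixed seed, and the use of row indices in place of the real records are all assumptions made for illustration.

```python
import random


def split_rows(rows, n_train, seed=42):
    """Shuffle a copy of the rows and cut it into training and testing parts.

    The seed is an arbitrary choice that makes the split repeatable.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Stand-in for the 60,114 tornado records: row indices instead of real data.
rows = range(60114)
train, test = split_rows(rows, n_train=20000)
print(len(train), len(test))
```

Fixing the seed means the same rows land in the training set on every run, so that accuracy figures from different models stay comparable.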
  • 7. State-wise loss Prediction
    • This analysis looks at total property loss and tornado frequency by state, using the slice of the data from 1996 through 2015.
    • The data is then indexed and aggregated by state, giving the tornado frequency and the sum of total property damage for each state.
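The slice-and-aggregate step can be sketched as follows. The field names `yr`, `state`, and `loss` follow the variable list on the data-understanding slide; the records themselves are made-up placeholders, and `summarize_by_state` is a hypothetical helper, not the authors' code.

```python
from collections import defaultdict


def summarize_by_state(records, start_year=1996, end_year=2015):
    """Count tornadoes and sum property loss per state within the year range."""
    totals = defaultdict(lambda: {"count": 0, "loss": 0.0})
    for rec in records:
        if start_year <= rec["yr"] <= end_year:
            totals[rec["state"]]["count"] += 1
            totals[rec["state"]]["loss"] += rec["loss"]
    return dict(totals)


# Placeholder records; the loss values are illustrative only.
records = [
    {"yr": 1995, "state": "TX", "loss": 1.0},   # outside the slice, ignored
    {"yr": 1996, "state": "TX", "loss": 0.5},
    {"yr": 2003, "state": "TX", "loss": 0.25},
    {"yr": 2010, "state": "FL", "loss": 2.0},
]

summary = summarize_by_state(records)
most_frequent = max(summary, key=lambda s: summary[s]["count"])
highest_loss = max(summary, key=lambda s: summary[s]["loss"])
print(most_frequent, highest_loss)
```

The two `max` calls correspond to the two rankings used on the following slides: one by total loss (fund allocation) and one by frequency (relief measures).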
  • 8. Funds Allocation
    • Florida (FL), with a total property loss of $923.86 million, requires the largest fund allocation.
  • 9. Relief Measures Allocation
    • Texas (TX), with 2,767 tornado occurrences, needs the largest allocation of relief measures.
  • 10. Multiple Linear Regression
    • Multiple linear regression models the relationship between two or more explanatory variables and a response variable by fitting a linear equation to the observed data. Every value of the independent variable x is associated with a value of the dependent variable y.
    • We built a multiple regression model on the training data and applied it to the testing data.
    • The analysis shows that a total of 16 variables are significant when loss is the dependent variable and all the remaining variables are independent variables.
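As a notational sketch of the linear equation this slide refers to (the symbols are assumptions, not taken from the deck): let y_i be the loss for tornado i, x_{i1}, ..., x_{ip} its p predictor values, and ε_i an error term.

```latex
y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i,
\qquad i = 1, \dots, n
```

Under this reading, the coefficients β_0, ..., β_p are estimated on the training rows, and the fitted equation is then evaluated on the testing rows.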
  • 11. Step-wise Multiple Regression
    • Stepwise regression only helps us confirm the best variables for multiple regression.
    • We use the result of the stepwise regression in further analysis.
    • Instead of all the independent variables, we use only the significant variables identified by this analysis.
    • Again, the model generated from the training data was applied to the testing data.
  • 12. Principal Component Analysis
    • We next calculate the principal components using PCA.
    • The resulting principal components are shown in the screenshot on the slide.
  • 13. Proportion of Variance explained
    • The first 8 components explain 75% of the variance.
    • We now run the algorithms on the first 8 principal components and check whether the principal components improve the performance of our model.
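The "first k components explain a given share of variance" reading can be illustrated with a short cumulative-sum sketch. The component variances below are hypothetical placeholders (the deck's actual values are only in its screenshot), and `components_needed` is a name introduced here for illustration.

```python
def components_needed(variances, threshold):
    """Return how many leading components reach the given share of total variance.

    `variances` holds one variance (eigenvalue) per principal component,
    sorted from largest to smallest.
    """
    total = sum(variances)
    cumulative = 0.0
    for k, var in enumerate(variances, start=1):
        cumulative += var
        if cumulative / total >= threshold:
            return k
    return len(variances)


# Hypothetical component variances, not the values from the storm dataset.
variances = [4.0, 3.0, 2.0, 1.0]
print(components_needed(variances, 0.75))
```

With the real component variances and a threshold of 0.75, the same function would return the component count quoted on the slide.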
  • 14. Random Forest
    • Random forests (random decision forests) are an ensemble learning method for classification, regression, and other tasks. They construct a multitude of decision trees at training time and output the mode of the individual trees' classes (classification) or the mean of their predictions (regression).
    • We have confusion matrices for the random forest prediction of loss on the testing data, with and without PCA.
    • In our case, accuracy drops when the principal components are used: 86.98% without PCA versus 85.38% with PCA.
  • 15. Linear Discriminant Analysis
    • Discriminant analysis classifies individuals into one of two or more groups on the basis of measurements.
    • We try to classify the loss of future cyclones as low, medium, or high (coded 1, 2, 3) using the past data.
  • 16. Linear Discriminant Analysis
    • We have the LDA confusion matrices without and with principal components.
    • As we can see, the accuracy of the model is better without principal components.
  • 17. K-Means to predict Emergency level
    • The k-means clustering algorithm partitions n observations into k clusters, where each observation belongs to the cluster with the nearest mean, which serves as the prototype of the cluster.
    • K-means clustering is applied to the storm dataset to define the levels (clusters) of emergency into which a particular storm can fall.
    • Length (in miles) and width (in yards) of the storm are used to build the clusters.
    • All 60,114 observations are partitioned into 6 clusters, defining 6 levels of emergency, from level 1 (lowest emergency) to level 6 (highest emergency).
  • 18. K-Means to predict Emergency level
  • 19. K-Means to predict Emergency level
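The assign-then-update loop that k-means performs can be sketched in plain Python. This is an illustrative implementation of the standard Lloyd iteration, not the authors' code: the function name, the seed, the iteration cap, and the six sample (length, width) points are assumptions, and the example uses k = 2 rather than the deck's 6 clusters to stay small.

```python
import random


def kmeans(points, k, iterations=50, seed=0):
    """Cluster tuples of numbers with Lloyd's algorithm; return labels and centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct observations
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest mean.
        labels = [
            min(
                range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centroids[c])),
            )
            for pt in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical (length in miles, width in yards) pairs, not rows from the dataset.
points = [(1.0, 10.0), (2.0, 12.0), (1.5, 11.0),
          (30.0, 800.0), (35.0, 900.0), (32.0, 850.0)]
labels, centroids = kmeans(points, k=2)
print(labels)
```

Because width in yards spans a much larger numeric range than length in miles, unscaled distances are dominated by width; the deck does not say whether the two features were standardized before clustering.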
  • 20. Random Forest to predict frequency of storms in different seasons
    • A random forest is used to predict the frequency of storms in different seasons, so as to analyze the effect of climatic conditions on storms.
    • Season data was created from the month in which the tornado occurred:
      • Months 1-2 (January-February): Winter
      • Months 3-6 (March-June): Spring
      • Months 7-9 (July-September): Summer
      • Months 10-12 (October-December): Fall
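The month-to-season mapping from this slide translates directly into a small lookup. The function name `season_for_month` is introduced here for illustration; the month ranges are the ones given on the slide.

```python
def season_for_month(month):
    """Map a month number (1-12) to a season using the deck's month ranges."""
    if month in (1, 2):
        return "Winter"
    if 3 <= month <= 6:
        return "Spring"
    if 7 <= month <= 9:
        return "Summer"
    if 10 <= month <= 12:
        return "Fall"
    raise ValueError(f"month must be in 1..12, got {month!r}")


# Derive a season label for a few example values of the `mo` column.
print([season_for_month(m) for m in (1, 4, 8, 11)])
```

Under this mapping spring covers four months and winter only two, which is worth keeping in mind when comparing raw storm counts across seasons.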
  • 21. Random Forest to predict frequency of storms in different seasons
    • Confusion matrix: shown on the slide.
    • Calculating % accuracy: (Fall + Spring + Summer + Winter) / (Number of Observations) = (1357 + 36739 + 2228 + 460) / 60114 = 67.84%
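The accuracy arithmetic on this slide can be checked in a few lines. The four counts and the total are the figures quoted on the slide; reading them as the correctly classified storms per season is an assumption, since the confusion matrix itself is not in the transcript.

```python
# Per-season counts quoted on the slide, read here as correct predictions.
correct = {"Fall": 1357, "Spring": 36739, "Summer": 2228, "Winter": 460}
total_observations = 60114

total_correct = sum(correct.values())
accuracy = total_correct / total_observations
print(total_correct, f"{accuracy:.2%}")  # prints: 40784 67.84%
```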
  • 22. Random Forest to predict frequency of storms in different seasons
    • Accuracy = 67.84%
    • Hence, the model not only reached this accuracy but also showed how the occurrence of storms differs across seasons in the U.S.
    • The analysis found that storms were most common in spring and least common in winter.
    • Government entities such as disaster management and rescue operations teams can use the model to take the required precautions in each season and limit the loss.
  • 23. Conclusion
    • We performed several analyses: state-wise loss analysis, loss prediction with classification models, prediction of the seasons of cyclones, and clustering.
    • We conclude that the loss from cyclones can be successfully predicted beforehand, so rescue efforts can be directed accordingly and made more effective.
    • For our data, the prediction results are better without PCA, so we recommend developing models without dimension reduction on this dataset.
    • Random forest was the most accurate in predicting loss, at 86.98% accuracy, so we will go ahead with this model for prediction.
    • We were able to predict the level of emergency using clustering.
    • We were also able to predict the seasons in which storms are most likely to occur, and accordingly keep a tab on the readiness of the rescue efforts.
  • 24. References
    • https://www.kaggle.com/jtennis/spctornado
    • https://www.analyticsvidhya.com/blog/2016/03/practical-guide-principal-component-analysis-python/
    • http://www.statmethods.net/stats/regression.html
    • http://www.spc.noaa.gov/wcm/data/SPC_severe_database_description.pdf
    • https://weather.com/safety/hurricane/news/hurricanes-tropical-storms-us-deaths-surge-flooding
    • https://www.r-bloggers.com/predicting-wine-quality-using-random-forests/
    • http://trevorstephens.com/kaggle-titanic-tutorial/r-part-5-random-forests/