Extracting Dense Regions From
Hurricane Trajectory Data[1]
Praveen Kumar Tripathi, Madhuri Debnath, Ramez Elmasri
Dept. of Computer Science, University of Texas at Arlington
Presented By:
Vivek Kumar Sharma
[1] “GeoRich’14 Proceedings of Workshop on
Managing and Mining Enriched Geo-Spatial Data”
Outline
• Introduction
• Trajectory Clustering on Point Data
• DBSCAN
• Dense Regions Extraction
• Experiments and Analysis
• Related Work
• Conclusion
• References
Introduction
• Weather Data: A good example of spatial-temporal data (Time & Space)
• Clustering: Key technique to analyze storm trajectories
• Trajectories, Sub-Trajectories and Point Data
• (long, lat) as spatial attributes and (wind-speed, time) as non-spatial attributes in the similarity measure
• Monitoring & Predicting Weather conditions
• Identifying the origin and destination of hurricanes and the most affected regions
• Hurricane dataset for the Atlantic region from 1950-1999 with attributes longitude, latitude, time, wind-speed, pressure and status
Trajectory Clustering on Point Data
• Treat each trajectory as a collection of individual points
• Points are not constrained to stay with their respective parent trajectory or sub-trajectory
[1] http://weather.unisys.com/hurricane/atlantic/2012H/index.php
DBSCAN
• Two global parameters:
• Eps: Maximum radius of the neighborhood
• MinPts: Minimum number of points in the Eps-neighborhood of a point
• Density = number of points within a specified radius r (Eps)
• Core Object: object with at least MinPts objects within its Eps-neighborhood
• Border Object: object that is not a core object but lies in the Eps-neighborhood of a core object, i.e. on the border of a cluster
• A noise point is any point that is not a core point or a border point.
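The three roles above can be made concrete with a small sketch. It assumes Euclidean distance on 2-D tuples and counts a point as a member of its own Eps-neighborhood; the function name `classify_points` and the sample coordinates are illustrative, not taken from the paper.

```python
from math import dist

def classify_points(points, eps, min_pts):
    """Label every point as 'core', 'border', or 'noise'."""
    # Eps-neighborhood of each point (the point itself is included).
    neighborhoods = [
        [j for j, q in enumerate(points) if dist(p, q) <= eps]
        for p in points
    ]
    # Core points have at least min_pts points in their neighborhood.
    core = {i for i, nbrs in enumerate(neighborhoods) if len(nbrs) >= min_pts}
    labels = []
    for i, nbrs in enumerate(neighborhoods):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nbrs):
            labels.append("border")  # not core, but next to a core point
        else:
            labels.append("noise")   # neither core nor border
    return labels

# Hypothetical sample: a tight square, one nearby point, one far point.
points = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5), (1.2, 0.5), (5.0, 5.0)]
print(classify_points(points, eps=0.8, min_pts=4))
```

The nearby point has too few neighbors to be core but sits inside the neighborhood of a core point, so it is a border point; the far point is noise.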
Background (I)
• N_Eps(p) = {q ∈ D | dist(p, q) ≤ Eps}
• Directly density-reachable:
A point p is directly density-reachable from a point q wrt. Eps, MinPts if
• 1) p ∈ N_Eps(q)
• 2) |N_Eps(q)| ≥ MinPts (core point condition)
[Figure: points p and q; MinPts = 5, Eps = 1 cm]
Background (II)
• Density-reachable:
• A point p is density-reachable from a point q wrt. Eps, MinPts if there is a chain of points p1, …, pn, with p1 = q and pn = p, such that pi+1 is directly density-reachable from pi
• Density-connected:
• A point p is density-connected to a point q wrt. Eps, MinPts if there is a point o such that both p and q are density-reachable from o wrt. Eps and MinPts.
[Figure: chain q, p1, p illustrating density-reachability; points p and q density-connected through o]
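The two definitions can be checked on a toy data set with the following sketch. It assumes Euclidean distance and index-based points; the helper names (`eps_neighborhood`, `directly_reachable`, `reachable_set`, `density_connected`) are illustrative only.

```python
from collections import deque
from math import dist

def eps_neighborhood(points, i, eps):
    """Indices of all points within eps of point i (including i)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def directly_reachable(points, p, q, eps, min_pts):
    """True if p is directly density-reachable from q."""
    nbrs = eps_neighborhood(points, q, eps)
    return p in nbrs and len(nbrs) >= min_pts

def reachable_set(points, q, eps, min_pts):
    """All indices density-reachable from q, found by following core points."""
    seen, queue = set(), deque([q])
    while queue:
        cur = queue.popleft()
        nbrs = eps_neighborhood(points, cur, eps)
        if len(nbrs) < min_pts:
            continue  # cur is not core, so no chain continues through it
        for j in nbrs:
            if j not in seen:
                seen.add(j)
                queue.append(j)
    return seen

def density_connected(points, p, q, eps, min_pts):
    """True if some point o density-reaches both p and q."""
    return any(
        {p, q} <= reachable_set(points, o, eps, min_pts)
        for o in range(len(points))
    )

# Five points on a line, one unit apart; the two end points are border points.
line = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
print(directly_reachable(line, 0, 1, eps=1.0, min_pts=3))  # end point from a core point
print(directly_reachable(line, 1, 0, eps=1.0, min_pts=3))  # core point from an end point
print(density_connected(line, 0, 4, eps=1.0, min_pts=3))
```

Direct reachability is not symmetric: a border point is reachable from a core point but not the other way round, which is why density-connectedness is defined through a shared point o.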
DBSCAN Algorithm
• Arbitrarily select a point p
• Retrieve all points density-reachable from p wrt Eps and MinPts.
• If p is a core point, a cluster is formed.
• If p is a border point, no points are density-reachable from p and DBSCAN visits the next point of the database.
• Continue the process until all of the points have been processed.
[Figure: core, border, and outlier points; Eps = 1 cm, MinPts = 5]
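The steps above can be sketched as a compact, brute-force DBSCAN. This is an illustrative version assuming Euclidean distance and an O(n²) neighborhood query, not the MATLAB implementation used in the paper; `dbscan`, `region_query`, and `NOISE` are names chosen here.

```python
from math import dist

NOISE = -1

def dbscan(points, eps, min_pts):
    """Return one cluster id per point; NOISE (-1) marks outliers."""
    labels = [None] * len(points)  # None means "not yet processed"
    cluster_id = 0

    def region_query(i):
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(i)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        # i is a core point: start a new cluster and expand it.
        labels[i] = cluster_id
        seeds = list(neighbors)
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(j)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is core, keep expanding
        cluster_id += 1
    return labels

# Hypothetical data: two small squares and one isolated point.
data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11),
        (50, 50)]
print(dbscan(data, eps=1.5, min_pts=3))
```

For real hurricane data a spatial index would replace the brute-force query; the control flow stays the same.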
Dense Region Extraction
• How to find regions of high storm activity?
• The DBSCAN parameter Eps defines the neighborhood
• First, build the neighborhood from spatial attributes only
• Then add the non-spatial measure wind-speed as well
• Spatial clustering
• Spatial and non-spatial clustering
• Spatial-temporal clustering
Dense Region Extraction
(Spatial Clustering)
• Latitude and Longitude as data points
• Haversine formula as similarity measure between two points
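• $a = \sin^2(\Delta\varphi/2) + \cos(\varphi_1) \cos(\varphi_2) \sin^2(\Delta\lambda/2)$
• $c = 2 \cdot \operatorname{atan2}(\sqrt{a}, \sqrt{1 - a})$
• $d = R \cdot c$
• where $\Delta\varphi = \varphi_2 - \varphi_1$ (latitude difference) and $\Delta\lambda = \lambda_2 - \lambda_1$ (longitude difference), $R = 6371$ km is the radius of the Earth, and $d$ is the Haversine distance.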
• The formula gives the spherical (great-circle) distance between two points, which is appropriate because the spatial coordinates lie on a spherical object (the Earth)
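A minimal sketch of the Haversine distance follows, using R = 6371 km as on the slide. The function name `haversine_km` and the degree-based argument order (lat, lon) are choices made here for illustration.

```python
from math import radians, sin, cos, atan2, sqrt

EARTH_RADIUS_KM = 6371.0  # value of R given on the slide

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1                # latitude difference
    dlam = radians(lon2 - lon1)       # longitude difference
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    # max() guards against tiny negative values caused by rounding.
    c = 2 * atan2(sqrt(a), sqrt(max(0.0, 1 - a)))
    return EARTH_RADIUS_KM * c

# One degree of latitude along a meridian is roughly 111 km.
print(round(haversine_km(0.0, 0.0, 1.0, 0.0), 2))
```

Used as the DBSCAN distance function, this makes Eps a radius in kilometres rather than in degrees.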
Dense Region Extraction
(Spatial and non Spatial Clustering)
• Extending DBSCAN to handle non-spatial attributes
• Given a data set D = {d1, d2, . . . , dn}, where di = (xi, yi, ai, bi), (xi, yi) are the spatial attributes and (ai, bi) are the non-spatial attributes
• The similarity measures can be calculated as:
• $\mathrm{dist\_spatial}(d_1, d_2) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$
• $\mathrm{dist\_nonspatial}(d_1, d_2) = \sqrt{(a_1 - a_2)^2 + (b_1 - b_2)^2}$
• The remaining DBSCAN steps are unchanged when computing dense regions for spatial and non-spatial attributes
• Finally, the composite neighborhood contains the data points common to both neighborhoods
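A sketch of the composite neighborhood: a point qualifies only if it is within one threshold spatially and within a second threshold on the non-spatial attributes. The two threshold names (`eps_spatial`, `eps_nonspatial`) and the sample records are assumptions made for illustration.

```python
from math import dist

def composite_neighborhood(data, i, eps_spatial, eps_nonspatial):
    """Indices of records that are neighbors of record i in BOTH senses.

    Each record is (x, y, a, b): (x, y) spatial, (a, b) non-spatial.
    """
    xi, yi, ai, bi = data[i]
    result = []
    for j, (x, y, a, b) in enumerate(data):
        spatial_ok = dist((xi, yi), (x, y)) <= eps_spatial
        nonspatial_ok = dist((ai, bi), (a, b)) <= eps_nonspatial
        if spatial_ok and nonspatial_ok:
            result.append(j)  # common to both neighborhoods
    return result

# Hypothetical records: (x, y, wind_speed, normalized_time).
records = [
    (0.0, 0.0, 50.0, 0.1),
    (1.0, 0.0, 55.0, 0.1),   # close in space, similar wind-speed
    (0.0, 1.0, 120.0, 0.1),  # close in space, very different wind-speed
    (9.0, 9.0, 50.0, 0.1),   # similar wind-speed, far away
]
print(composite_neighborhood(records, 0, eps_spatial=2.0, eps_nonspatial=10.0))
```

Plugging this function in place of the plain Eps-neighborhood query leaves the rest of DBSCAN untouched.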
Dense Region Extraction
(Spatial-Temporal Clustering)
• Consider time as the non-spatial attribute for analysis
• Hurricane data is sampled at 6-hour intervals
• The Haversine formula is used as the similarity measure between the spatial attributes of two data points
• Manhattan distance is used for the non-spatial attribute time
• A relative time framework is used instead of absolute time, i.e. time is measured from the start of each trajectory
• Important for analyzing the movement patterns of hurricanes
• The normalized time component (normalized by the length of the trajectory) varies between 0 and 1.0
• 0.0 signifies the initial stage
• 0.5 signifies the middle stage
• Values close to 1.0 signify the end of the hurricane
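One way to compute the normalized relative time is sketched below. The slide does not give the exact normalization, so dividing elapsed time by the trajectory's total duration is an assumption, as are the function names.

```python
def normalized_times(timestamps_hours):
    """Map a trajectory's timestamps to [0, 1] relative to its own start and end.

    Assumption: normalization divides elapsed time by total duration.
    """
    start, end = timestamps_hours[0], timestamps_hours[-1]
    duration = end - start
    if duration == 0:
        return [0.0 for _ in timestamps_hours]  # single-sample trajectory
    return [(t - start) / duration for t in timestamps_hours]

def temporal_distance(t1, t2):
    """Manhattan distance for the one-dimensional time attribute."""
    return abs(t1 - t2)

# A hypothetical storm sampled every 6 hours for one day.
times = normalized_times([0, 6, 12, 18, 24])
print(times)
print(temporal_distance(times[0], times[2]))
```

With this scaling, points from different storms are temporally close when they occur at the same stage of their storms, regardless of calendar date.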
Experiments and Analysis
• Implementation and analysis were done in MATLAB
• Different combinations of Eps and MinPts were tried to get well-separated and compact clusters
• Four different types of analysis were done:
• Analysing spatial attributes
• Qualitative analysis of clusters
• Analysing the combination of spatial and non-spatial attributes
• Storm starting and landing information
Experiment and Analysis
(Analysing Spatial Attributes)
• Eps = 35, MinPts = 10, k = 15
Experiment and Analysis
(Qualitative Analysis of Clusters)
• Clustering based on spatial and non-spatial attributes
• Wind-speed and Time
• Wind-speed threshold varied from 20 to 100 in steps of 10
• Temporal threshold varied from 0.2 to 1.0 in steps of 0.1
• A quality measure is needed to check the clustering results
• SSE (Sum of Squared Errors)
• numclus is the number of clusters, N is the set of noise points and Ci is the ith cluster
• The smaller the value of SSE, the better the clustering result
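The SSE formula itself did not survive on the slide, so the sketch below is only one plausible reading: the squared Euclidean distance of every point to the centroid of its group, where each cluster Ci is a group and the noise set N is treated as one more group. Both choices are assumptions, as is the function name `sse`.

```python
def sse(points, labels):
    """Sum of squared distances from each point to its group's centroid.

    Assumption: noise points (e.g. label -1) form one additional group.
    """
    groups = {}
    for p, label in zip(points, labels):
        groups.setdefault(label, []).append(p)

    total = 0.0
    for members in groups.values():
        dim = len(members[0])
        centroid = [sum(p[k] for p in members) / len(members) for k in range(dim)]
        total += sum(
            sum((p[k] - centroid[k]) ** 2 for k in range(dim))
            for p in members
        )
    return total

# Hypothetical labels: two clusters (0 and 1) and one noise point (-1).
pts = [(0, 0), (2, 0), (10, 10), (10, 12), (50, 50)]
print(sse(pts, [0, 0, 1, 1, -1]))
```

Under this reading, tighter clusters produce a smaller value, which matches the slide's statement that a smaller SSE indicates a better clustering.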
Experiment and Analysis
(Combination of Spatial and non-spatial attributes)
• Studying the nature of storms
• Neighboring points are now close to the core data point spatially and have similar wind-speed values
• Cluster size is reduced, since clusters represent regions with the same nature of storms
• As the wind-speed threshold is reduced, cluster size also decreases
Experiment and Analysis
(Combination of Spatial and non-spatial attributes)
• SSE results:
• Reducing the wind-speed threshold yields compact and homogeneous clusters.
• Spatial-temporal clustering:
• Temporal threshold reduced from 1.0 to 0.2.
Experiment and Analysis
(Storm Starting and Landing Information)
• Storm starting clusters
• Considered the first three data points (the first three timestamps) of each trajectory
• Storm landing clusters
• Considered the last three data points of each trajectory
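Selecting the starting and landing samples before clustering could look like the sketch below. The record layout (time, lat, lon), the dictionary keyed by storm id, and the name `endpoint_samples` are assumptions for illustration.

```python
def endpoint_samples(trajectories, k=3):
    """Collect the first k and last k records of every trajectory.

    trajectories: dict mapping a storm id to a list of (time, lat, lon) records.
    """
    starts, ends = [], []
    for track in trajectories.values():
        ordered = sorted(track, key=lambda rec: rec[0])  # order by timestamp
        starts.extend(ordered[:k])   # storm starting points
        ends.extend(ordered[-k:])    # storm landing points
    return starts, ends

# Hypothetical track sampled every 6 hours.
tracks = {
    "A": [(0, 10.0, -40.0), (6, 11.0, -42.0), (12, 12.0, -44.0),
          (18, 13.0, -46.0), (24, 14.0, -48.0)],
}
starts, ends = endpoint_samples(tracks)
print(starts)
print(ends)
```

The two lists can then be clustered separately to obtain starting and landing regions. Trajectories with fewer than 2k records contribute overlapping samples to both lists.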
Related Work
• DENTRAC (DENsity based TRAjectory Clustering)
• Post-processing of clusters to get more domain-specific knowledge
• Partition-and-group based clustering
• Uses the concept of MDL (Minimum Description Length) to identify characteristic points
• Represents a trajectory by connecting consecutive characteristic points
• Clustering algorithm using computational geometry and string processing
• Removes noise by preprocessing trajectory data and dividing it into sub-trajectories
• Slicing-STS-Miner using sequential patterns
• Spatial-temporal pattern "convoy" proposed
• Region-based and trajectory-based clustering
• Classifying trajectories using trajectory duration as an important feature
Conclusion
• Analyzed hurricane trajectory data as point data
• Initial clustering based on spatial attributes
• Examined the influence of the non-spatial attribute wind-speed
• Proposed a spatial-temporal DBSCAN algorithm in which normalized time is also treated as a non-spatial attribute
• Post-processing of clusters to get origin, landing and tracking information
References
• http://weather.unisys.com/hurrricane/atlantic/index.html.
• Data Mining: Concepts and Techniques, 2nd ed. Morgan Kaufmann, 2006.
• Derya Birant and Alp Kut. ST-DBSCAN: An algorithm for clustering spatial-temporal data. Data Knowl. Eng., 60(1):208–221, January 2007.
• Chun-Sheng Chen, Christoph F. Eick, and Nouhad J. Rizk. Mining spatial trajectories using non-parametric density functions. In Petra Perner, editor, Machine Learning and Data Mining in Pattern Recognition, volume 6871 of Lecture Notes in Computer Science, pages 496–510. Springer Berlin Heidelberg, 2011.
• Martin Ester, Hans-Peter Kriegel, Jorg S, and Xiaowei Xu. A density-based algorithm for discovering clusters in large spatial databases with noise. Pages 226–231. AAAI Press, 1996.
• Jae-Gil Lee and Jiawei Han. Trajectory clustering: A partition-and-group framework. In SIGMOD, pages 593–604, 2007.
• Joachim Gudmundsson, Andreas Thom, and Jan Vahrenhold. Of motifs and goals: mining trajectory data. In Proceedings of the 20th International Conference on Advances in Geographic Information Systems, SIGSPATIAL ’12, pages 129–138, New York, NY, USA, 2012. ACM.
• Yan Huang, Liqin Zhang, and Pusheng Zhang. A framework for mining sequential patterns from spatio-temporal event data sets. IEEE Transactions on Knowledge and Data Engineering, 20(4):433–448, 2008.
• Hoyoung Jeung, Man Lung Yiu, Xiaofang Zhou, Christian S. Jensen, and Heng Tao Shen. Discovery of convoys in trajectory databases. Proc. VLDB Endow., 1(1):1068–1080, August 2008.
• Jae-Gil Lee, Jiawei Han, Xiaolei Li, and Hector Gonzalez. Traclass: Trajectory classification using hierarchical region-based and trajectory-based clustering. Proc. VLDB Endow., 1(1):1081–1094, August 2008.
• Dhaval Patel, Chang Sheng, Wynne Hsu, and Mong Li Lee. Incorporating duration information for trajectory classification. In Proceedings of the 2012 IEEE 28th International Conference on Data Engineering, ICDE ’12, pages 1132–1143, 2012.
• Marcos Vieira, Petko Bakalov, and Vassilis Tsotras. On-line discovery of flock patterns in spatio-temporal data, 2009.
