Predicting growth of urban agglomerations through fractal analysis of geo spatial data



14th Esri India User Conference 2013

PREDICTING GROWTH OF URBAN AGGLOMERATIONS THROUGH FRACTAL ANALYSIS OF GEO-SPATIAL DATA

Ashish Vanmali1, Saket Porwal2, Vikram Gadre3, Anshuman Gupta4, Laveesh Bhandari5

1,2,3 Department of Electrical Engineering, IIT Bombay, Mumbai – 400076, India
4,5 Indicus Analytics, Nehru House, 4 Bahadur Shah Zafar Marg, New Delhi – 110002, India

Abstract:

Location Analytics is one of the fastest emerging fields in the broad area of Business Intelligence/Data Science. By some industry estimates, almost 80% of all data has a location dimension to it. Consequently, identification of trends and patterns in spatially distributed information has far-reaching applications ranging from urban planning to logistics and supply chain management, location-based marketing, sales territory planning and retail store location. In view of this, we present an approach based on Fractal Analysis (FA) of highly granular geo-spatial data. Specifically, we use proprietary data available at approximately 1 square km level for New Delhi, India, provided by Indicus Analytics (India’s leading economic data analytics firm, based in New Delhi). We compare and contrast the patterns and insights generated using the FA approach with those of more traditional approaches such as spatial correlation and structural similarity indices. Preliminary results indicate that there are indeed “self-similar” local patterns that are completely missed by spatial correlation but are accurately captured by the more sophisticated FA approach. These patterns provide deep insights into the underlying socio-economic and demographic processes and can be used to predict the spatial distribution of these variables in the future. For example, questions such as where the pockets of population growth in a city are, and how businesses and government will respond to that growth, can be answered using the proposed approach.

About the Author:

Mr. Ashish Vanmali, Ph.D. (Pursuing)
Ashish V. Vanmali received his B.E. (EXTC) in 2001 from University of Mumbai, India and M.Tech. (Electrical Engg.) in 2008 from IIT Bombay, India. He is a Ph.D. candidate with the Department of Electrical Engineering, IIT Bombay, India. He is an Assistant Professor with Vidyavardhini's C.O.E. & Tech., University of Mumbai, India. His research interests include image and video processing, biometrics, and data fusion.
Email ID: ashish@ee.iitb.ac.in
Contact No.: +91 9890120301

Mr. Saket Porwal, M.Tech. (Pursuing)
Saket Porwal is a final year M.Tech. student at IIT Bombay. His research interests include time-frequency aspects of signal processing and their application in data analysis. For additional details please refer to in.linkedin.com/pub/saket-porwal/41/959/250/
Email ID: saketporwal@ee.iitb.ac.in

Dr. Vikram Gadre
Vikram Gadre is Professor with the Department of Electrical Engineering, IIT Bombay, India. For details refer to http://www.ee.iitb.ac.in/web/faculty/homepage/vmgadre

Dr. Anshuman Gupta
Anshuman Gupta is Vice President with Indicus Analytics, New Delhi. He has over 15 years of experience in business analytics in the retail and CPG domains. For additional information please refer to in.linkedin.com/pub/anshuman-gupta-phd/9/b56/414/

Dr. Laveesh Bhandari
Laveesh Bhandari is Director with Indicus Analytics. He completed his Ph.D. in Economics from Boston University. He has worked at NCAER conducting studies on Indian industry and infrastructure, taught at IIT Delhi, and now heads Indicus Analytics, India’s premier economics research firm. He has authored and co-authored numerous publications on socio-economic development, health, education, poverty, inequality, etc. He writes frequently for newspapers such as Business Standard, Economic Times, India Today, etc.
Email ID: laveesh@indicus.net
Introduction:

All business operations occur within a context defined by their location [1]. Broadly, decision making requires triangulation between (1) internal operational and usage data, (2) data on competing or synergistic options and (3) a precise idea of the scale and character of the economic activity and demography of the area. However, information for many locations across India (and the world) is missing, flawed or simply not comparable, for any number of reasons. Consequently, most decision-making processes end up being based on imperfect or inadequate information and rely heavily on norms, rules of thumb and/or gut instinct. Thus, most decisions that depend on location information are constrained and suboptimal. To address this challenge, this paper describes a novel Fractal Analysis (FA) based approach for analyzing highly granular geo-spatial data and generating deep location-based insights by leveraging a number of key functionalities of ESRI’s technology platform.

Previously, Lee De Cola [2] presented the use of fractal analysis for classification of remotely sensed images. Pierre Frankhauser [3] presented fractals as a tool for urban data analysis and also noted the need for supplementary measures for a complete analysis. De Keersmaecker et al. [4] compared fractal-based parameters calculated by different fractal methods for characterizing intra-urban diversity. Myint [5] provided a comparative study of various approaches for texture analysis and classification of remotely sensed data. A similar study of various spatial methods was carried out by Dale et al. [6], who concluded that no single method can reveal all the important characteristics of spatial data and that the results of different analyses are not expected to be completely independent of each other. In this paper, we build on this and other research as described next.
Data and Study Area:

Six quantities, each measured over grid cells of 1 sq. km area in Delhi city, form the data under consideration. Each “grid cell” is identified by the latitude and longitude of its centroid. Overall, data was taken at 1602 different locations of Delhi. Specifically, we use proprietary data provided by Indicus Analytics for the following 6 variables/indices: (i) Population, (ii) Night-time light intensity, (iii) Points-of-Interest (POI), (iv) Road length, (v) Index of telecom call intensity and (vi) Index of property tax collections. Table I lists these variables and the acronyms used in this paper. These are chosen as a sample from the set of 5000+ socio-economic and demographic variables available with Indicus.

Table I  Quantities Under Consideration

Quantity                                Acronym
Population                              POP
Night-time light intensity              RAD
Index of telecom call intensity         CALLS
Index of number of Points-of-Interest   POI
Index of road length                    ROADS
Index of property tax                   TAX

The inverse distance weighted interpolation method [7,8] is employed for the interpolation, as it is a widely used choice for geographical data. In this method, nearby points contribute more to the interpolation than distant ones. Each interpolated value is a weighted sum of the known values, and each weight decreases with the distance between the interpolation point and the known point. The simplest form of inverse distance weighting, known as the Shepard method [7,8], is used in the present work, with the weight function

$$w_i = \frac{h_i^{-p}}{\sum_{j=0}^{n} h_j^{-p}} \tag{1}$$

where $p$ is an arbitrary positive real number called the power parameter ($p = 2$ in the present work) and $h_i$ is the Euclidean distance given by

$$h_i = \sqrt{(x - x_i)^2 + (y - y_i)^2} \tag{2}$$

where $(x, y)$ are the coordinates of the interpolation point and $(x_i, y_i)$ are the coordinates of each dispersion point. The weight function decays from unity to zero as the distance to the dispersion point increases. The weight functions are normalized to ensure that the weights sum to one. The interpolated value $P(x, y)$ is then calculated as the weighted sum

$$P(x, y) = \sum_{i=0}^{n} w_i \, P(x_i, y_i) \tag{3}$$

No extrapolation was used for the points outside the boundary of Delhi; these points were taken to be zero.

All the data extraction and processing is carried out using ESRI ArcGIS technology. For example, the night-time lights raster image is processed using the ‘Zonal Statistics’ and ‘Zonal Statistics as Table’ tools under the ‘Spatial Analyst’ toolset of ArcGIS. The ‘Georeferencing’ tools provided in ArcGIS were used for all georeferencing issues. The ‘Extract Values to Points’ tool under the ‘Spatial Analyst’ toolset was used for extracting values of individual cells in a raster.

Analysis Methodology:

(a) Spatial Correlation Coefficient

The spatial correlation coefficient is a common parameter for assessing the global similarity between two signals. It is given by

$$\rho = \frac{\sum_{m}\sum_{n} (X_{mn} - \bar{X})(Y_{mn} - \bar{Y})}{\sigma_x \sigma_y} \tag{4}$$

where $X$ and $Y$ are the 2-D signals with means $\bar{X}$ and $\bar{Y}$ respectively, and $\sigma_x$ and $\sigma_y$ are the standard deviations of $X$ and $Y$ respectively, given by

$$\sigma_x = \sqrt{\sum_{m}\sum_{n} (X_{mn} - \bar{X})^2} \tag{5}$$

and

$$\sigma_y = \sqrt{\sum_{m}\sum_{n} (Y_{mn} - \bar{Y})^2} \tag{6}$$

(b) Fractal Characteristic & Fractal Dimension

A quantity is called fractal if its fractal dimension is non-integer [9,10]. A fractal is a mathematical object that is both self-similar and chaotic: self-similar means that the object looks the same at different scales, and chaotic means that the object is also complex [9,10]. More formally, a continuous function $f$ is said to be self-similar [9,10] if there exist disjoint subsets $S_1, S_2, \ldots, S_k$ such that $f$ restricted to each $S_i$ is an affine transformation of $f$, i.e. there exist a scale $l_i > 1$, a translation $r_i$, a constant $c_i$ and a weight $w_i$ such that

$$f(t) = c_i + w_i \, f\big(l_i (t - r_i)\big), \quad \forall\, t \in S_i \tag{7}$$

The fractal dimension of a point is zero, of a line segment is one, of a square is two, and of a cube is three. In general, the fractal dimension need not be an integer and can take fractional values. To determine the fractal characteristics of a function, its fractal dimension can be calculated. There are several definitions of fractal dimension in the literature; the best-known measure is the box counting dimension [10]. The expression for the fractal dimension is

$$F.D. = -\lim_{\epsilon \to 0} \frac{\log_e N(\epsilon)}{\log_e \epsilon} \tag{8}$$

where $N(\epsilon)$ is the minimum number of boxes of size $\epsilon$ needed to entirely enclose the object. In practice the box size cannot go to zero; the smallest size that can be used is the pixel of the image itself. In the present work the data sets are images of size 64×64. Hence, if the size of the 64×64 box is taken as unity, the size of a pixel is 1/64. The given data is converted to a binary image by thresholding [11], and then the box counting algorithm [10] is applied.

(c) The Hausdorff Metric

Fractals are defined over the Hausdorff metric space [12], which provides a notion of the Hausdorff distance between two fractal sets. A metric is a function which measures distance on a space. The standard Euclidean distance between $x$ and $y$ in $\mathbb{R}^n$ is denoted as $d_E(x, y)$. The Hausdorff metric is defined below (see Fig. 1).
Fig. 1 – Distance between a point x and element (set) B

Fig. 2 – Distance between two sets A and B

If $x \in \mathbb{R}^n$, the “distance” from $x$ to $B$ is

$$d(x, B) = \min_{b \in B} d_E(x, b) \tag{9}$$

The “distance” from $A$ to $B$ is

$$d(A, B) = \max_{x \in A} d(x, B) \tag{10}$$

Note that $d$ is not a metric, since it is not symmetric, i.e. in general $d(A, B) \neq d(B, A)$ (see Fig. 2). The Hausdorff distance $h(A, B)$ between $A$ and $B$ is then given by

$$h(A, B) = \max\{\, d(A, B),\; d(B, A) \,\} \tag{11}$$

(d) Structural Similarity Index

Wang et al. [13] proposed an algorithm for image quality assessment. They developed a parameter called the structural similarity index (SSIM), which measures the structural similarity between two images locally. The similarity measure between two images has the form

$$SSIM(x, y) = [l(x, y)]^{\alpha} \cdot [c(x, y)]^{\beta} \cdot [s(x, y)]^{\gamma} \tag{12}$$

where $l(x, y)$, $c(x, y)$ and $s(x, y)$ are the luminance, contrast and structure comparisons of the images, and $\alpha > 0$, $\beta > 0$, $\gamma > 0$ are parameters used to adjust the relative importance of the three components. Given two local image patches $x$ and $y$ of the two images, respectively, the luminance, contrast and structural similarities between them are evaluated as

$$l(x, y) = \frac{2 \mu_x \mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1} \tag{13}$$

$$c(x, y) = \frac{2 \sigma_x \sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2} \tag{14}$$

$$s(x, y) = \frac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3} \tag{15}$$

where $C_1$, $C_2$ and $C_3$ are small stabilizing constants and $\mu_x$, $\sigma_x$ and $\sigma_{xy}$ represent the mean, standard deviation and cross-correlation evaluated over a local window, respectively. The simplified expression with $\alpha = \beta = \gamma = 1$ and $C_3 = C_2 / 2$ has the form

$$SSIM(x, y) = \frac{(2 \mu_x \mu_y + C_1)(2 \sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)} \tag{16}$$

The quality map so produced exhibits the local isotropic properties of the images under comparison, with values ranging between 0 (indicating dissimilarity) and 1 (indicating similarity). The overall quality measure is obtained by calculating the mean SSIM (MSSIM) index. In the present study we have used an 8×8 window with $C_1 = C_2 = 0.01$.
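As a minimal sketch of Eq. (16) and of the mean SSIM, the plain-Python functions below compute an SSIM map and its mean for two equally sized grids. Several things here are assumptions made for illustration and are not taken from the paper: the function names (`ssim_window`, `ssim_map`, `mean_ssim`), the list-of-lists image layout, the made-up 16×16 inputs, and in particular the use of non-overlapping 8×8 tiles, since the text does not say how the window is moved across the image.

```python
def _mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def ssim_window(x, y, c1=0.01, c2=0.01):
    """Simplified SSIM of Eq. (16) for two equally sized lists of pixel values."""
    mu_x, mu_y = _mean(x), _mean(y)
    var_x = _mean([(a - mu_x) ** 2 for a in x])
    var_y = _mean([(b - mu_y) ** 2 for b in y])
    cov_xy = _mean([(a - mu_x) * (b - mu_y) for a, b in zip(x, y)])
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def ssim_map(img_x, img_y, win=8, c1=0.01, c2=0.01):
    """SSIM value for each non-overlapping win x win tile (the tiling is an assumption)."""
    rows, cols = len(img_x), len(img_x[0])
    scores = []
    for r in range(0, rows - win + 1, win):
        row_scores = []
        for c in range(0, cols - win + 1, win):
            tile_x = [img_x[i][j] for i in range(r, r + win) for j in range(c, c + win)]
            tile_y = [img_y[i][j] for i in range(r, r + win) for j in range(c, c + win)]
            row_scores.append(ssim_window(tile_x, tile_y, c1, c2))
        scores.append(row_scores)
    return scores


def mean_ssim(img_x, img_y, win=8, c1=0.01, c2=0.01):
    """Mean SSIM (MSSIM): the average of all entries of the SSIM map."""
    tile_scores = [s for row in ssim_map(img_x, img_y, win, c1, c2) for s in row]
    return _mean(tile_scores)


# Illustrative 16x16 inputs (made up, not the paper's data).
ramp = [[(i + j) / 30 for j in range(16)] for i in range(16)]
flat_img = [[0.5] * 16 for _ in range(16)]
print(mean_ssim(ramp, ramp))      # identical images: 1 up to rounding
print(mean_ssim(ramp, flat_img))  # different structure: below 1
```

The map returned by `ssim_map` corresponds to the local view and `mean_ssim` to the global one; a sliding window with a stride of one pixel would give a denser map at a higher cost.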
Fig. 3 – SSIM maps when paired with RAD: (a) RAD-POP, (b) RAD-CALLS, (c) RAD-POI, (d) RAD-ROADS, (e) RAD-TAX

Fig. 4 – SSIM maps when paired with POI: (a) POI-POP, (b) POI-RAD, (c) POI-CALLS, (d) POI-ROADS, (e) POI-TAX
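The two global measures used in this study, the spatial correlation coefficient of Eq. (4) and the Hausdorff distance of Eqs. (9) to (11), can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names are hypothetical, grids are lists of lists, and each set is given as a list of (x, y) coordinate pairs (for example, the "on" pixels of a thresholded image, which is a guess at how the sets would be formed rather than something the text specifies).

```python
import math


def spatial_correlation(x, y):
    """Correlation coefficient of Eq. (4) for two equally sized 2-D grids."""
    xs = [v for row in x for v in row]
    ys = [v for row in y for v in row]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sigma_x = math.sqrt(sum((a - mean_x) ** 2 for a in xs))  # Eq. (5)
    sigma_y = math.sqrt(sum((b - mean_y) ** 2 for b in ys))  # Eq. (6)
    return cross / (sigma_x * sigma_y)


def directed_distance(a, b):
    """d(A, B) of Eq. (10): largest distance from a point of A to its nearest point of B."""
    return max(
        min(math.hypot(p[0] - q[0], p[1] - q[1]) for q in b)  # d(x, B), Eq. (9)
        for p in a
    )


def hausdorff_distance(a, b):
    """h(A, B) of Eq. (11): the larger of the two directed distances."""
    return max(directed_distance(a, b), directed_distance(b, a))


# Made-up inputs for illustration only.
grid = [[1.0, 2.0], [3.0, 4.0]]
doubled = [[2.0, 4.0], [6.0, 8.0]]
print(spatial_correlation(grid, doubled))  # perfectly linearly related grids: 1 up to rounding

set_a = [(0, 0), (1, 0)]
set_b = [(0, 0), (4, 0)]
print(directed_distance(set_a, set_b), directed_distance(set_b, set_a))  # 1.0 3.0
print(hausdorff_distance(set_a, set_b))                                  # 3.0
```

The two directed distances differ in this example, which is the asymmetry that Eq. (11) removes by taking the maximum. A constant grid has zero standard deviation, so `spatial_correlation` is undefined for it and would raise a division error in this sketch.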
Results and Discussion:

The spatial correlation coefficient, Hausdorff distance and mean SSIM values for different pairs of data are calculated and presented in Tables II, III and IV respectively. All these results are normalized to the range 0 to 1. For the spatial correlation coefficient and mean SSIM, a higher value indicates similarity, whereas for the Hausdorff distance a lower value indicates similarity. In each table, the 3 most similar scores are marked in red and the 3 most dissimilar scores are marked in blue. Table V gives the fractal dimension of the different quantities.

The spatial correlation coefficient and Hausdorff distance help in a global evaluation of the quantities. Tables II and III indicate that the two parameters produce very similar results. In both cases, quantities are most correlated when paired with RAD and least correlated when paired with POI. Quantities like POP, CALLS and ROADS also exhibit a good amount of correlation when paired with other quantities.

Structural similarity helps in a global as well as a local evaluation of the quantities. The mean SSIM gives the global trend, whereas the SSIM map gives the local trend of the quantities under consideration. The mean SSIM values in Table IV are found to be very close to unity. This indicates that all these quantities are highly structurally similar in a global sense. To analyze the local trend, one needs to compare the SSIM maps of different pairs. Fig. 3 and Fig. 4 show SSIM maps for different pairs when paired with RAD and POI respectively. Fig. 3 shows that the SSIM maps obtained when pairing with RAD have a similar image pattern. This does not hold for the SSIM maps obtained when pairing with POI, as in Fig. 4, where the patterns are quite different for different pairs. This indicates that RAD has good correlation in a local sense, and POI has less correlation in a local sense, with the other quantities under consideration.
A similar trend is also observed for quantities paired with POP, CALLS and ROADS. These results are again similar to the results for the spatial correlation coefficient and Hausdorff distance.

Fractal dimension is an indication of self-similarity from a global to a local sense and vice versa. According to Table V, TAX exhibits the highest self-similarity, followed by RAD and ROADS, whereas POI exhibits the least self-similarity. The mean and the standard deviation of the fractal dimension of the six quantities are 1.55 and 0.22 respectively. As a result, the quantities with fractal dimension in the range 1.33 to 1.77 (one standard deviation on either side of the mean) will be correlated, as opposed to those outside this range. Accordingly, POP, RAD, CALLS and ROADS form a set of quantities which are correlated, and POI and TAX form the set of outliers. This is again similar to the earlier results.

There is another interpretation of fractal dimension: it tells us about space occupancy. If the fractal dimension of a 2-D function is 2, the function occupies the whole 2-D space. If the fractal dimension is less than 2, it occupies somewhat less space.
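The space-occupancy reading can be illustrated with a small box-counting sketch in the spirit of Eq. (8). It is not the authors' implementation: the function names are hypothetical, the input is assumed to be an already thresholded square binary image whose side is a power of two (at least 2), and the dimension is estimated as the least-squares slope of log N(ε) against log(1/ε) over dyadic box sizes, a common practical stand-in for the limit in Eq. (8).

```python
import math


def box_count(binary, size):
    """Count the size x size boxes that contain at least one set pixel."""
    n = len(binary)
    count = 0
    for r in range(0, n, size):
        for c in range(0, n, size):
            if any(binary[i][j] for i in range(r, r + size) for j in range(c, c + size)):
                count += 1
    return count


def box_counting_dimension(binary):
    """Least-squares slope of log N(eps) versus log(1/eps), with the image side taken as 1."""
    n = len(binary)
    xs, ys = [], []
    size = 1
    while size <= n:                      # box sizes 1, 2, 4, ..., n pixels
        boxes = box_count(binary, size)
        if boxes == 0:
            raise ValueError("image has no set pixels")
        xs.append(math.log(n / size))     # log(1/eps), where eps = size / n
        ys.append(math.log(boxes))        # log N(eps)
        size *= 2
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    slope_num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope_den = sum((x - mean_x) ** 2 for x in xs)
    return slope_num / slope_den


# Made-up 64x64 test images: a filled square, a single filled row, a single pixel.
full = [[1] * 64 for _ in range(64)]
line = [[1 if r == 0 else 0 for _ in range(64)] for r in range(64)]
point = [[1 if (r, c) == (0, 0) else 0 for c in range(64)] for r in range(64)]
for name, img in (("full", full), ("line", line), ("point", point)):
    print(name, round(box_counting_dimension(img), 3))
```

The filled square, the line and the single pixel come out at approximately 2, 1 and 0, matching the integer cases listed in the methodology; a thresholded data layer that covers the grid only partially would be expected to fall between 1 and 2.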
For comparison, the fractal dimension of the Delhi map generated from the latitude-longitude combinations of the measurements is also calculated; it is found to be 1.84 and is given in the last row of Table V. The fractal dimension of TAX is 1.82, i.e. we can say that tax is collected from almost the whole of the Delhi space, which has fractal dimension 1.84. For POI, the fractal dimension is 1.18, indicating that POIs do not follow a uniform distribution over the space of Delhi and are distributed unevenly in it. This is the extra information that fractal dimension brings to the analysis and which is not captured by the other parameters.

Table II  Spatial Correlation Coefficient

        POP     RAD     CALLS   POI     ROADS   TAX
POP     1       0.8     0.75    0.58    0.74    0.58
RAD     0.8     1       0.77    0.56    0.85    0.77
CALLS   0.75    0.77    1       0.66    0.75    0.6
POI     0.58    0.56    0.66    1       0.62    0.39
ROADS   0.74    0.85    0.75    0.62    1       0.7
TAX     0.58    0.77    0.6     0.39    0.7     1

Table III  Hausdorff Distance

        POP     RAD     CALLS   POI     ROADS   TAX
POP     0       0.44    0.52    0.92    0.53    0.55
RAD     0.44    0       0.49    1       0.35    0.39
CALLS   0.52    0.49    0       0.87    0.51    0.56
POI     0.92    1       0.87    0       0.96    1
ROADS   0.53    0.35    0.51    0.96    0       0.44
TAX     0.55    0.39    0.56    1       0.44    0

Table IV  Mean SSIM

        POP     RAD     CALLS   POI     ROADS   TAX
POP     1       0.9925  0.9987  0.9976  0.9981  0.9911
RAD     0.9925  1       0.9909  0.9863  0.9948  0.9953
CALLS   0.9987  0.9909  1       0.9987  0.9980  0.9905
POI     0.9976  0.9863  0.9987  1       0.9964  0.9869
ROADS   0.9981  0.9948  0.9980  0.9964  1       0.9939
TAX     0.9911  0.9953  0.9905  0.9869  0.9939  1

Table V  Fractal Dimension

Quantity    F.D.
POP         1.50
RAD         1.67
CALLS       1.50
POI         1.18
ROADS       1.63
TAX         1.82
Delhi Map   1.84

Conclusion:

In this work, we took the first steps in analyzing highly granular geo-spatial data using fractal analysis techniques. Since the data is available at the 1 sq. km level, we are able to generate much deeper insights than is possible with coarser data at the state, district or even sub-district level. Another key feature of the work is the integration of data from multiple sources, which helps to create a more complete and dynamic picture of the evolving socio-economic processes. The geo-spatial analytical capabilities of ESRI’s ArcGIS platform were critical in this work, from both a data extraction/processing and a visualization perspective. Currently we are working on extending the analysis to predict how the given variables will evolve in the future over various timeframes. This information would be critical for making key strategic and tactical decisions for both businesses and government entities.

Acknowledgment:

The authors would like to thank Indicus Analytics, New Delhi, India (www.indicus.net) for their very generous assistance in providing the datasets for this work.

References:

1. P. A. Longley, M. F. Goodchild, D. J. Maguire, and D. W. Rhind, Geographic Information Systems and Science, Wiley, 2005.
2. L. De Cola, “Fractal analysis of a classified Landsat scene,” Photogrammetric Engineering and Remote Sensing, vol. 55, no. 5, pp. 601–612, 1989.
3. P. Frankhauser, “The fractal approach. A new tool for the spatial analysis of urban agglomerations,” Population, vol. 10, no. 1, pp. 205–240, 1998.
4. M. L. De Keersmaecker, P. Frankhauser, and I. Thomas, “Using fractal dimensions for characterizing intra-urban diversity: The example of Brussels,” Geographical Analysis, vol. 35, no. 4, pp. 310–328, 2003.
5. S. W. Myint, “Fractal approaches in texture analysis and classification of remotely sensed data: Comparisons with spatial autocorrelation techniques and simple descriptive statistics,” International Journal of Remote Sensing, vol. 24, no. 9, pp. 1925–1947, 2003.
6. M. R. T. Dale, P. Dixon, M.-J. Fortin, P. Legendre, D. E. Myers, and M. S. Rosenberg, “Conceptual and mathematical relationships among methods for spatial analysis,” Ecography, vol. 25, no. 5, pp. 558–577, Oct 2002.
7. D. Shepard, “A two-dimensional interpolation function for irregularly spaced data,” in Proceedings of the 1968 23rd ACM National Conference, ser. ACM ’68, New York, USA, pp. 517–524, 1968.
8. M. A. Azpurua and K. D. Ramos, “A comparison of spatial interpolation methods for estimation of average electromagnetic field magnitude,” Progress In Electromagnetics Research M, vol. 14, pp. 135–145, 2010.
9. S. Mallat, A Wavelet Tour of Signal Processing: The Sparse Way, 3rd ed. Burlington, MA, USA: Academic Press, 2009, ch. Wavelet Zoom, pp. 242–259.
10. R. Lopes and N. Betrouni, “Fractal and multifractal analysis: A review,” Medical Image Analysis, vol. 13, no. 4, pp. 634–649, 2009.
11. N. Otsu, “A threshold selection method from gray-level histograms,” IEEE Transactions on Systems, Man and Cybernetics, vol. 9, no. 1, pp. 62–66, 1979.
12. M. Barnsley, Fractals Everywhere, 2nd ed. Boston, MA, USA: Academic Press, 1993, ch. Metric Spaces, Equivalent Spaces: Classification of Subsets, and the Space of Fractals, pp. 5–41.
13. Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.