Mapping and classification of spatial data using machine learning: algorithms and software tools Vadim Timonin – Institute of Geomatics and Risk Analysis (IGAR), University of Lausanne (Switzerland)


    1. Mapping and classification of spatial data using Machine Learning Office software tools. Vadim Timonin, Institute of Geomatics and Analysis of Risk, University of Lausanne, Switzerland. Vadim.Timonin@UNIL.ch
    2. Contents • Short description of the Machine Learning Office • SIC 2004: Application to the automatic cartography of radioactivity • Case study: Wind field mapping with a neural network and a regularization technique
    3. Machine Learning Office Part of the book: EPFL press June 2009
    4. June 20 09:00 – 12:00 Room T120 Practical work session using Machine Learning software
    5. Machine Learning Office: Supervised. Regression: • Multilayer Perceptron (MLP) • General Regression Neural Networks (GRNN) • Radial Basis Function Neural Networks (RBFNN) • K-Nearest Neighbour (KNN) • Support Vector Regression (SVR). Classification: • Multilayer Perceptron (MLP) • Probabilistic Neural Networks (PNN) • K-Nearest Neighbour (KNN) • Support Vector Machines (SVM)
    6. Machine Learning Office: Unsupervised. Clustering & density estimation: • K-Means & EM algorithms • Gaussian Mixture Model (GMM) • Self-Organizing (Kohonen) Maps (SOM)
    7. Machine Learning Office: Mixture of supervised and unsupervised. Joint density estimation: • Mixture Density Networks (MDN)
    8. Automatic Mapping of Pollution Data. The procedure should be: 1. Simple, without difficult tuning of the models (it can be used by a “non-expert” in machine learning) 3. The result should be unique (it does not depend on training algorithms, initial values, etc.)
    9. Automatic Mapping of Pollution Data Good candidates: 1. KNN 2. GRNN / PNN Not so good candidates (?): 1. MLP 2. RBFNN 3. SVM / SVR
    10. Automatic Mapping with Prior Knowledge in Situations of Routine and Emergency: Spatial Interpolation Comparison 2004. http://www.ai-geostats.org/ Official report: Automatic mapping algorithms for routine and emergency monitoring data. EUR 21595 EN EC. Dubois G. (Ed.), Office for Official Publications of the European Communities, Luxembourg, 150 p., November 2005.
    11. Spatial Interpolation Comparison 2004: Introduction. Description of the concept of SIC 2004: participants are invited to use 200 observations (left, circles) to estimate (predict) values at 1008 locations (right, crosses).
    12. Spatial Interpolation Comparison 2004: Introduction. Prior data sets. From these 1008 monitoring locations, a single sampling scheme of 200 monitoring stations was selected randomly and extracted for each of the 10 datasets, in order to allow participants to train and design their algorithms. These 200 sampling locations have a spatial distribution that can be considered nearly random. From the summary statistics, one can see that the subsets of 200 points are representative of the whole set of 1008 points. Note that it is the participant's choice whether or not to use this prior information for modeling.

                Training sets (n = 200)              Full sets (n = 1008)
        Set No  Min   Mean   Median  Max    Std.Dev  Min   Mean   Median  Max    Std.Dev
        1       55.8  97.6   98.0    150.0  19.1     55.0  98.9   99.5    193.0  21.1
        2       55.9  97.4   97.9    155.0  19.3     54.9  98.8   99.5    188.0  21.2
        3       59.9  98.8   100.0   157.0  18.5     59.9  100.3  101.0   192.0  20.4
        4       56.1  93.8   94.8    152.0  16.8     56.1  95.1   95.4    180.0  18.8
        5       56.4  92.4   92.0    143.0  16.6     56.1  93.7   94.0    168.0  18.1
        6       54.4  89.8   90.4    133.0  15.9     54.4  90.9   91.6    168.0  17.2
        7       56.1  91.7   91.7    140.0  16.2     56.1  92.5   92.9    166.0  16.9
        8       54.9  92.4   92.5    148.0  16.6     54.9  93.5   94.1    176.0  18.1
        9       56.5  96.6   97.0    149.0  18.2     56.5  97.8   98.7    183.0  19.9
        10      54.9  95.4   95.7    152.0  17.2     54.9  96.6   97.1    183.0  19.0
    13. Results of the GRNN models with cross-validation tuning. [Figure: maps for the emergency (joker) scenario and the routine scenario; the epicentre of the accident (hot spot) is marked.]
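As a rough illustration of what "GRNN with cross-validation tuning" can look like, the sketch below treats GRNN in its common textbook form: a Gaussian-kernel weighted average of the training values, with a single kernel width chosen by cross-validation. The use of leave-one-out cross-validation, the isotropic kernel, the candidate widths, and the toy data are all assumptions made for this example; it is not the Machine Learning Office implementation.

```python
import math

def grnn_predict(train_xy, train_z, query, sigma):
    """Gaussian-kernel weighted average of training values at one query point."""
    num = 0.0
    den = 0.0
    for (x, y), z in zip(train_xy, train_z):
        d2 = (x - query[0]) ** 2 + (y - query[1]) ** 2
        w = math.exp(-d2 / (2.0 * sigma ** 2))
        num += w * z
        den += w
    # If every weight underflows to zero, fall back to the training mean.
    return num / den if den > 0.0 else sum(train_z) / len(train_z)

def loo_mse(train_xy, train_z, sigma):
    """Leave-one-out mean squared error for a given kernel width."""
    total = 0.0
    for i in range(len(train_z)):
        rest_xy = train_xy[:i] + train_xy[i + 1:]
        rest_z = train_z[:i] + train_z[i + 1:]
        pred = grnn_predict(rest_xy, rest_z, train_xy[i], sigma)
        total += (pred - train_z[i]) ** 2
    return total / len(train_z)

def tune_sigma(train_xy, train_z, candidates):
    """Pick the candidate width with the smallest leave-one-out error."""
    return min(candidates, key=lambda s: loo_mse(train_xy, train_z, s))

# Toy data, made up for illustration (not SIC 2004 measurements).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
values = [90.0, 100.0, 100.0, 110.0, 100.0]
best_sigma = tune_sigma(points, values, [0.1, 0.3, 1.0, 3.0])
estimate = grnn_predict(points, values, (0.25, 0.75), best_sigma)
```

With the data and the candidate grid fixed, the procedure is deterministic: there is no random initialization, and the kernel width is the only quantity being tuned.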
    14. Results. The following table presents the participants' results for each of the two scenarios (routine and emergency). The results are sorted by the Mean Absolute Error (MAE) obtained in the emergency scenario. The other statistics shown are the Mean Error (ME), which allows the bias of the results to be assessed, the Root Mean Squared Error (RMSE), and Pearson's Correlation Coefficient (Ro) between true and estimated values. Method labels: • GEOSTATS: geostatistical techniques • NN: neural networks • SVM: support vector machine. In each column, the best results have been bolded.
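The four statistics named above can be computed as in the following sketch. The function name, the sign convention for ME (estimated minus true), and the toy numbers are assumptions for illustration; the official SIC 2004 report may define details differently.

```python
import math

def error_stats(true_vals, predicted):
    """Return MAE, ME, RMSE and Pearson's correlation for paired values."""
    n = len(true_vals)
    # Assumed sign convention: error = estimated value minus true value.
    errors = [p - t for t, p in zip(true_vals, predicted)]
    me = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_t = sum(true_vals) / n
    mean_p = sum(predicted) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(true_vals, predicted))
    var_t = sum((t - mean_t) ** 2 for t in true_vals)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    ro = cov / math.sqrt(var_t * var_p)
    return {"MAE": mae, "ME": me, "RMSE": rmse, "Ro": ro}

# Toy values, made up for illustration.
truth = [100.0, 110.0, 90.0, 120.0]
estimates = [102.0, 108.0, 93.0, 117.0]
stats = error_stats(truth, estimates)
```

In this toy case the errors cancel, so ME is zero while MAE is 2.5: a small ME indicates low bias, not small errors.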
    15. Results of the SIC 2004 exercise (authors are highlighted)

        Participant  Method    MAE               ME                RMSE              Ro
                               routine  joker    routine  joker    routine  joker    routine  joker
        Timonin      NN        9.40     14.85    -1.25    -0.51    12.59    45.46    0.78     0.84
        Fournier     GEOSTATS  9.06     16.22    -1.32    -8.58    12.43    81.44    0.79     0.27
        Pozdnoukhov  SVM       9.22     16.25    -0.04    -6.70    12.47    81.00    0.79     0.28
        Saveliev     SPLINES   9.60     17.00    3.00     10.40    13.00    82.20    0.77     0.23
        Dutta        NN        9.92     17.50    0.20     5.10     13.10    80.60    0.76     0.29
        Ingram       GEOSTATS  9.10     18.55    -1.27    -4.64    12.46    54.22    0.79     0.86
        Hofierka     SPLINES   9.10     18.62    -1.30    0.41     12.51    73.68    0.79     0.50
        Hofierka     SPLINES   9.10     18.62    -1.30    0.41     12.51    73.68    0.79     0.50
        Fournier     GEOSTATS  9.22     19.43    -0.89    -0.22    12.51    73.50    0.78     0.48
        Fournier     OTHERS    9.29     19.44    -1.12    -0.12    12.56    71.87    0.78     0.53
        Savelieva    GEOSTATS  9.11     19.68    -1.39    -2.18    12.49    69.08    0.78     0.56
        Palaseanu    GEOSTATS  9.05     19.76    1.40     2.33     12.46    74.54    0.79     0.50
        Rigol S.     NN        12.10    20.30    -1.20    -9.40    15.80    84.10    0.67     0.12
        Pebesma      GEOSTATS  9.11     20.83    -1.22    0.92     12.44    73.73    0.79     0.50
        Pebesma      OTHERS    9.94     21.03    -1.35    4.50     13.32    72.12    0.78     0.51
        Ingram       GEOSTATS  9.08     21.77    -1.44    0.72     12.47    79.57    0.79     0.35
        Lophaven     GEOSTATS  9.70     22.20    1.20     -4.10    13.10    71.20    0.76     0.54
        Saveliev     SPLINES   9.30     22.20    1.60     0.60     12.60    76.40    0.78     0.41
        Ingram       GEOSTATS  9.47     22.53    -1.15    3.09     12.75    79.16    0.78     0.33
        Pebesma      GEOSTATS  9.11     23.26    -1.22    4.00     12.44    76.19    0.79     0.42
        Rigol S.     NN        16.00    25.30    -1.70    -11.10   20.80    87.50    0.55     0.02
        Hofierka     SPLINES   9.38     26.52    -1.27    4.29     12.68    77.98    0.78     0.38
        Dutta        NN        9.62     28.20    0.90     -0.22    12.70    80.10    0.78     0.31
        Pebesma      GEOSTATS  9.11     28.45    -1.22    12.01    12.44    81.41    0.79     0.38
        Dutta        NN        12.20    28.90    1.50     -1.29    15.90    79.90    0.64     0.33
        Rigol S.     NN        21.40    30.50    5.30     3.80     45.80    96.60    0.24     0.20
        Ingram       NN        9.72     38.29    -1.54    8.38     13.00    84.24    0.76     0.30
        Dutta        NN        9.93     38.50    2.18     17.98    13.30    87.30    0.76     0.27
        Ingram       NN        9.48     48.41    -1.22    -3.01    12.73    90.89    0.78     0.38
        Pebesma      GEOSTATS  9.11     146.36   -1.22    19.71    12.44    212.10   0.79     -0.27
    16. Results of the SIC 2004 exercise

        Participant  Method    MAE               ME                RMSE              Ro
                               routine  joker    routine  joker    routine  joker    routine  joker
        Timonin      NN        9.40     14.85    -1.25    -0.51    12.59    45.46    0.78     0.84
        Fournier     GEOSTATS  9.06     16.22    -1.32    -8.58    12.43    81.44    0.79     0.27
        Pozdnoukhov  SVM       9.22     16.25    -0.04    -6.70    12.47    81.00    0.79     0.28
        Saveliev     SPLINES   9.60     17.00    3.00     10.40    13.00    82.20    0.77     0.23
        Dutta        NN        9.92     17.50    0.20     5.10     13.10    80.60    0.76     0.29
        Ingram       GEOSTATS  9.10     18.55    -1.27    -4.64    12.46    54.22    0.79     0.86
        Hofierka     SPLINES   9.10     18.62    -1.30    0.41     12.51    73.68    0.79     0.50
        Hofierka     SPLINES   9.10     18.62    -1.30    0.41     12.51    73.68    0.79     0.50
        Fournier     GEOSTATS  9.22     19.43    -0.89    -0.22    12.51    73.50    0.78     0.48
        Fournier     OTHERS    9.29     19.44    -1.12    -0.12    12.56    71.87    0.78     0.53
        Savelieva    GEOSTATS  9.11     19.68    -1.39    -2.18    12.49    69.08    0.78     0.56
        Palaseanu    GEOSTATS  9.05     19.76    1.40     2.33     12.46    74.54    0.79     0.50
        Rigol S.     NN        12.10    20.30    -1.20    -9.40    15.80    84.10    0.67     0.12
        Pebesma      GEOSTATS  9.11     20.83    -1.22    0.92     12.44    73.73    0.79     0.50
        Pebesma      OTHERS    9.94     21.03    -1.35    4.50     13.32    72.12    0.78     0.51
        Ingram       GEOSTATS  9.08     21.77    -1.44    0.72     12.47    79.57    0.79     0.35
        Lophaven     GEOSTATS  9.70     22.20    1.20     -4.10    13.10    71.20    0.76     0.54
        Saveliev     SPLINES   9.30     22.20    1.60     0.60     12.60    76.40    0.78     0.41
        Ingram       GEOSTATS  9.47     22.53    -1.15    3.09     12.75    79.16    0.78     0.33
        Pebesma      GEOSTATS  9.11     23.26    -1.22    4.00     12.44    76.19    0.79     0.42
        Rigol S.     NN        16.00    25.30    -1.70    -11.10   20.80    87.50    0.55     0.02
        Hofierka     SPLINES   9.38     26.52    -1.27    4.29     12.68    77.98    0.78     0.38
        Dutta        NN        9.62     28.20    0.90     -0.22    12.70    80.10    0.78     0.31
        Pebesma      GEOSTATS  9.11     28.45    -1.22    12.01    12.44    81.41    0.79     0.38
        Dutta        NN        12.20    28.90    1.50     -1.29    15.90    79.90    0.64     0.33
        Rigol S.     NN        21.40    30.50    5.30     3.80     45.80    96.60    0.24     0.20
        Ingram       NN        9.72     38.29    -1.54    8.38     13.00    84.24    0.76     0.30
        Dutta        NN        9.93     38.50    2.18     17.98    13.30    87.30    0.76     0.27
        Ingram       NN        9.48     48.41    -1.22    -3.01    12.73    90.89    0.78     0.38
        Pebesma      GEOSTATS  9.11     146.36   -1.22    19.71    12.44    212.10   0.79     -0.27
    17. Modeling of wind fields with MLP and regularization technique (pp. 168-172 of the book). Monitoring network: 111 stations in Switzerland (80 training + 31 for validation). Mapping of daily: • Mean speed • Maximum gust • Average direction
    18. Modeling of wind fields with MLP and regularization technique. Monitoring network: 111 stations in Switzerland (80 training + 31 for validation). Mapping of daily: • Mean speed • Maximum gust • Average direction. Input information: • X, Y geographical coordinates • DEM (resolution 500 m) • 23 DEM-based “geo-features” • Total: 26 features. Model: MLP 26-20-20-3
    19. Training of the MLP. Model: MLP 26-20-20-3. Training: • Random initialization • 500 iterations of the RPROP algorithm
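To put the 26-20-20-3 architecture next to the size of the training set, the sketch below counts the adjustable parameters of such a network. It assumes a plain fully connected feed-forward layout with one bias per neuron, which the slides do not state explicitly; the helper name is hypothetical.

```python
def mlp_parameter_count(layer_sizes):
    """Weights plus biases of a fully connected feed-forward network."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

layers = [26, 20, 20, 3]          # inputs, two hidden layers, outputs
n_params = mlp_parameter_count(layers)   # 27*20 + 21*20 + 21*3 = 1023
n_targets = 80 * 3                # 80 training stations, 3 outputs each
```

Under these assumptions the network has roughly four times more free parameters (1023) than training target values (240), which is the kind of setting where overfitting is expected and some form of regularization is useful.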
    20. Results: naïve approach
    21. Results: Noise injection regularization
    22. Results: summary. Noise injection regularization compared with no regularization (overfitting)
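The regularization named on these slides is read here as noise injection: the training inputs are perturbed with small random noise each time they are presented, so the network cannot fit individual stations exactly. The sketch below shows only that data-perturbation step. The Gaussian noise model, the noise level `sigma`, resampling once per epoch, and the helper names are assumptions for illustration, not details taken from the presentation.

```python
import random

def jitter_inputs(features, sigma, rng):
    """Return a copy of the feature vectors with zero-mean Gaussian noise added."""
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in features]

def noisy_epochs(features, targets, n_epochs, sigma, seed=0):
    """Yield a freshly perturbed (features, targets) pair per epoch.

    Targets are passed through unchanged; only the inputs are perturbed.
    """
    rng = random.Random(seed)
    for _ in range(n_epochs):
        yield jitter_inputs(features, sigma, rng), targets

# Toy standardized inputs and targets, made up for illustration.
X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
y = [[0.1], [0.2]]
epochs = list(noisy_epochs(X, y, n_epochs=3, sigma=0.1, seed=42))
```

Each element of `epochs` would be fed to one pass of whatever training algorithm is in use. The noise level plays the role of the regularization strength and would normally be chosen on validation data.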
    23. Thank you for your attention! Next stop is: June 20 09:00 – 12:00 Room T120 Practical work session using Machine Learning software

