
Machine Learning Project




  1. 1. An Introduction to Torch, a Machine Learning Library in C++. Analysis and Implementation of the CLARANS K-medoid Clustering Algorithm in the Java Programming Language, with an Application of K-medoid to Image Processing. By Adeyemi Fowe. CPSC 7375 (Machine Learning), Spring 2008. Instructor: Dr. Mariofanna (Fani) Milanova, Computer Science Department, University of Arkansas at Little Rock. Final Project Presentation
  2. 2. Torch. Usage: powerful and fast; steep learning curve; made for a Linux environment; C++ OOP structure, but C-style code; open source, plain-text .cc source files. Features: gradient machines; support vector machines; ensemble models; K-nearest neighbors; distributions and classifiers; speech recognition tools.
  3. 3. Torch Structure: Multiple Inheritance
  4. 4. Torch3Vision. Built on Torch; solid image processing; more user friendly; more sample code examples. Supports: pgm, ppm, gif, tif, jpeg. Camera control, e.g. Sony pan/tilt/zoom.
  5. 5. Application: Face Detection
  6. 6. Sample Use: A Demo on a Linux Console
  7. 7. Clustering (Unsupervised Learning)
  8. 8. Clustering (Unsupervised Learning). Different types of clustering: Partitioning algorithms: K-means, K-medoid. Hierarchical clustering: a tree of clusters rather than disjoint sets. Density-based clustering: clusters based on regions of concentration. Statistical clustering: statistical techniques such as probability and hypothesis testing.
  9. 9. K-Means & K-medoid
  10. 10. K-Means & K-medoid. K-means clustering uses the exact center of a cluster (the mean, or center of gravity), while K-medoid uses the most centrally located object in the cluster (the medoid). K-medoid is less sensitive to outliers than K-means. The value of K (the number of clusters) has to be determined a priori.
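The contrast between the mean and the medoid can be shown in a few lines of Java (the project's implementation language). This is an illustrative sketch, not the project's code; the class and method names are invented for the example:

```java
public class MedoidVsMean {
    // Mean: the exact center of gravity of the points.
    static double mean(double[] xs) {
        double s = 0;
        for (double x : xs) s += x;
        return s / xs.length;
    }

    // Medoid: the actual data point with the smallest total distance
    // to all other points (always a member of the data set).
    static double medoid(double[] xs) {
        double best = xs[0], bestCost = Double.MAX_VALUE;
        for (double c : xs) {
            double cost = 0;
            for (double x : xs) cost += Math.abs(x - c);
            if (cost < bestCost) { bestCost = cost; best = c; }
        }
        return best;
    }

    public static void main(String[] args) {
        // A cluster near 2.0 plus one outlier at 100.
        double[] xs = {1.0, 2.0, 3.0, 100.0};
        System.out.println("mean   = " + mean(xs));   // dragged toward the outlier: 26.5
        System.out.println("medoid = " + medoid(xs)); // stays inside the cluster: 2.0
    }
}
```

The outlier pulls the mean far outside the cluster, while the medoid, being constrained to an actual data point, is unaffected; this is the outlier insensitivity claimed on the slide.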
  11. 11. K-medoid Algorithms. PAM (Partitioning Around Medoids) was developed by Kaufman and Rousseeuw (1990). CLARA (Clustering LARge Applications) was designed by Kaufman and Rousseeuw to handle large data sets. CLARANS (Clustering Large Applications based on RANdomized Search) was developed by Raymond T. Ng and Jiawei Han (2002).
  12. 13. CLARANS Minimum-Cost Search. The diagram illustrates the CLARANS algorithm, which performs a randomized search for the minimum cost over the entire data set by swapping one medoid at a time.
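The randomized search can be sketched in Java as follows. The parameter names numLocal and maxNeighbor follow the CLARANS paper; the point representation, distance function, and class name are illustrative assumptions, not the project's actual code:

```java
import java.util.Random;

// A minimal sketch of CLARANS (Ng & Han): a node is a set of k medoids,
// and neighboring nodes differ by a single medoid swap.
public class Clarans {
    static double dist(double[] a, double[] b) {
        double dx = a[0] - b[0], dy = a[1] - b[1];
        return Math.sqrt(dx * dx + dy * dy);
    }

    // Cost of a node: every point is charged the distance to its nearest medoid.
    static double cost(double[][] pts, int[] medoids) {
        double total = 0;
        for (double[] p : pts) {
            double best = Double.MAX_VALUE;
            for (int m : medoids) best = Math.min(best, dist(p, pts[m]));
            total += best;
        }
        return total;
    }

    static boolean contains(int[] arr, int v) {
        for (int x : arr) if (x == v) return true;
        return false;
    }

    static int[] clarans(double[][] pts, int k, int numLocal, int maxNeighbor, Random rnd) {
        int[] best = null;
        double bestCost = Double.MAX_VALUE;
        for (int i = 0; i < numLocal; i++) {
            // Start at a random node: k distinct medoid indices.
            int[] current = rnd.ints(0, pts.length).distinct().limit(k).toArray();
            double currentCost = cost(pts, current);
            int tried = 0;
            while (tried < maxNeighbor) {
                // Random neighbor: swap one medoid for a random non-medoid point.
                int cand;
                do { cand = rnd.nextInt(pts.length); } while (contains(current, cand));
                int[] neighbor = current.clone();
                neighbor[rnd.nextInt(k)] = cand;
                double c = cost(pts, neighbor);
                if (c < currentCost) {   // downhill move: accept, restart the neighbor count
                    current = neighbor;
                    currentCost = c;
                    tried = 0;
                } else {
                    tried++;
                }
            }
            if (currentCost < bestCost) { // keep the cheapest local minimum found
                bestCost = currentCost;
                best = current;
            }
        }
        return best;
    }
}
```

Unlike PAM, which examines every possible swap, CLARANS only samples maxNeighbor random swaps before declaring a local minimum, and repeats the whole descent numLocal times; this is what makes it practical for large data sets.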
  13. 14. Java Implementation of CLARANS K-medoid Algorithm
  14. 15. To form a cluster (image classification), a medoid has to navigate within this 3-D space to find the closest set of pixels. This makes K-medoid take the pixel gray values into consideration while clustering.
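The 3-D space above can be built by encoding each pixel as a point (x, y, gray value), which is what lets the clustering weigh position against intensity. A minimal sketch, with an illustrative class name assumed for the example:

```java
// Encodes a gray image as a set of 3-D points (x, y, gray) so that a
// K-medoid clusterer can treat position and intensity uniformly.
public class PixelPoints {
    static double[][] toPoints(int[][] gray) {
        int h = gray.length, w = gray[0].length;
        double[][] pts = new double[h * w][];
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                pts[y * w + x] = new double[]{x, y, gray[y][x]};
        return pts;
    }
}
```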
  15. 16. Sample Image
  16. 17. Extracted Gray Values Using Torch3Vision
  17. 18. 3D Plot of Pixel Gray-Values
  18. 19. Gray Image Pixel Map
  19. 20. Spectral and Spatial Pattern Recognition. Spectral pattern recognition refers to the set of spectral radiance measurements obtained in the various wavelength bands for each pixel. Spatial pattern recognition involves the categorization of image pixels on the basis of their spatial relationship with the pixels surrounding them. The aim of this experiment is to delineate the behavior of the K-medoid clustering algorithm while varying these two criteria. We want to show that changing the weight w is a compromise between the spectral and spatial patterns of an image.
  20. 21. Spatial and Spectral Differences. The cost of assigning pixel i to representative (medoid) pixel j is given by a weighted combination of their spatial and spectral distances. The weight w serves as a measure of our preference for spatial or spectral pattern recognition; it is a weight metric for the preference structure in MCDA. When w = 0: spatial pattern only. When w = 1: spectral pattern only. When 0 < w < 1: both spatial and spectral patterns are considered; a typical MADA setting.
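The formula itself appeared as an image on the original slide, so the sketch below is a hedged reconstruction from the stated limits: assuming cost(i, j) = w * spectralDist + (1 - w) * spatialDist, which gives spatial-only at w = 0 and spectral-only at w = 1 as the slide describes. The class name and pixel encoding are illustrative:

```java
// Weighted assignment cost between two pixels, each encoded as (x, y, gray).
// Assumed form: w * spectral distance + (1 - w) * spatial distance.
public class WeightedCost {
    static double cost(double[] p, double[] q, double w) {
        double spatial = Math.hypot(p[0] - q[0], p[1] - q[1]); // distance in the image plane
        double spectral = Math.abs(p[2] - q[2]);               // gray-value difference
        return w * spectral + (1 - w) * spatial;
    }
}
```

Sweeping w from 0 to 1 then moves the clustering smoothly from purely spatial groupings to purely intensity-based ones, which is the compromise the experiment explores.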
  21. 22. CLARANS Clusters (K = 3)
  22. 23. Results for Spatial & Spectral Patterns
  23. 25. Pixel Distance Functions
  24. 26. Chebyshev Distance (Chessboard Distance)
  25. 27. The Lp Space
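The distance functions of the last few slides are all instances of the Lp (Minkowski) metric: p = 1 gives the Manhattan distance, p = 2 the Euclidean distance, and the p -> infinity limit is the Chebyshev (chessboard) distance. A small Java sketch with an illustrative class name:

```java
// The Lp (Minkowski) family of distances between two vectors.
public class LpDistance {
    static double lp(double[] a, double[] b, double p) {
        double s = 0;
        for (int i = 0; i < a.length; i++)
            s += Math.pow(Math.abs(a[i] - b[i]), p);
        return Math.pow(s, 1.0 / p);
    }

    // Chebyshev distance: the p -> infinity limit, i.e. the largest
    // single coordinate difference (the moves of a chess king).
    static double chebyshev(double[] a, double[] b) {
        double m = 0;
        for (int i = 0; i < a.length; i++)
            m = Math.max(m, Math.abs(a[i] - b[i]));
        return m;
    }
}
```

Swapping the value of p changes the shape of a cluster's boundary (diamond for L1, circle for L2, square for L-infinity), which is why the choice of distance function shows up as an edge-orientation effect in the results.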
  26. 29. Lp Space and Decision Making
  27. 30. This clearly displays a Manhattan cluster for w = 0, i.e. only spatial properties. The decision maker needs to consider how the edges of the clusters should be formed. This decision would most likely be informed by the type of information to be extracted.
  28. 31. Conclusion. We implemented the more efficient CLARANS algorithm for K-medoid clustering in the Java programming language. We took advantage of our code to explore the differences between distance functions, which can be part of a user's choice. We showed that the choice of distance function should depend on the expected edge orientation of the clusters.
  29. 32. Thank You. Questions?
  30. 33. References
  [1] Chan, Y. (2001). Location Theory and Decision Analysis. ITP/South-Western.
  [2] Chan, Y. Location, Transport and Land-Use: Modeling Spatial-Temporal Information. Heidelberg, Germany: Springer-Verlag.
  [3] Craig M. Wittenbrink, Glen Langdon, Jr., and Gabriel Fernandez (1999). Feature Extraction of Clouds from GOES Satellite Data for Integrated Model Measurement Visualization. Working paper.
  [4] Raymond T. Ng and Jiawei Han. Efficient and Effective Clustering Methods for Spatial Data Mining. Proceedings of the 20th VLDB Conference, Santiago, Chile, 1994.
  [5] Osmar R. Zaiane, Andrew Foss, Chi-Hoon Lee, and Weinan Wang. On Data Clustering Analysis: Scalability, Constraints and Validation. Working paper.
  [6] Gerald J. Dittberner (2001). NOAA's GOES Satellite System: Status and Plans.
  [7] Weather Satellites Teacher's Guide. Published by Environment Canada. ISBN 0-662-31474-3, Cat. No. En56-172/2001E-IN.
  [8] ArcView user's manual.
  [9] Websites:
  [10] Images: http://
  [11] Torch3vision, Sebastien Marcel and Yann Rodriguez.
  [12] R. Collobert, S. Bengio, and J. Mariéthoz. Torch: A Modular Machine Learning Software Library. Technical Report IDIAP-RR 02-46, IDIAP, 2002.
  [13] L. Kaufman and P.J. Rousseeuw. Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley & Sons, 1990.