  1. 1. Visualization and Data Mining techniques By- Group number- 14 Chidroop Madhavarapu(105644921) Deepanshu Sandhuria(105595184) Data Mining CSE 634 Prof. Anita Wasilewska
  2. 2. References <ul><li>http://coblitz.codeen.org:3125/citeseer.ist.psu.edu/cache/papers/cs/10335/ftp:zSzzSzftp.cs.umn.eduzSzdeptzSzuserszSzkumarzSzdatavis.pdf/ganesh96visual.pdf </li></ul><ul><li>http://www.ailab.si/blaz/predavanja/ozp/gradivo/2002-Keim-Visualization%20in%20DM-IEEE%20Trans%20Vis.pdf </li></ul><ul><li>http://www.geocities.com/anand_palm/ </li></ul><ul><li>http://citeseer.ist.psu.edu/cache/papers/cs/27216/http:zSzzSzwww-users.cs.umn.eduzSzzCz7EctluzSzPaperTalkFilezSzits02.pdf/shekhar02cubeview.pdf </li></ul><ul><li>http://www.cs.umn.edu/Research/shashi-group/ </li></ul><ul><li>http://www.cs.umn.edu/Research/shashi-group/Book/sdb-chap1.pdf </li></ul><ul><li>http://www.cs.umn.edu/research/shashi-group/alan_planb.pdf </li></ul><ul><li>http://coblitz.codeen.org:3125/citeseer.ist.psu.edu/cache/papers/cs/27637/http:zSzzSzwww-users.cs.umn.eduzSzzCz7EpushengzSzpubzSzkdd2001zSzkdd.pdf/shekhar01detecting.pdf </li></ul>
  3. 3. Motivation <ul><li>Visualization for Data Mining </li></ul><ul><li>Huge amounts of information </li></ul><ul><li>Limited display capacity of output devices </li></ul><ul><li>Visual Data Mining (VDM) is a new approach for exploring very large data sets, combining traditional mining methods and information visualization techniques. </li></ul>
  4. 4. Why Visual Data Mining
  5. 5. Why Visual Data Mining
  6. 6. VDM Approach <ul><li>VDM takes advantage of both, </li></ul><ul><li>The power of automatic calculations, and </li></ul><ul><li>The capabilities of human processing. </li></ul><ul><ul><li>Human perception offers phenomenal abilities to extract structures from pictures. </li></ul></ul>
  7. 7. Levels of VDM <ul><li>No or very limited integration </li></ul><ul><ul><li>Corresponds to the application of either traditional information visualization or automated data mining methods. </li></ul></ul><ul><li>Loose integration </li></ul><ul><ul><li>Visualization and automated mining methods are applied sequentially. </li></ul></ul><ul><ul><li>The result of one step can be used as input for another step. </li></ul></ul><ul><li>Full integration </li></ul><ul><ul><li>Automated mining and visualization methods are applied in parallel. </li></ul></ul><ul><ul><li>The results are combined. </li></ul></ul>
  8. 8. Methods of Data Visualization <ul><li>Different methods are available for visualization of data, based on the type of data </li></ul><ul><li>Data can be </li></ul><ul><li>Univariate </li></ul><ul><li>Bivariate </li></ul><ul><li>Multivariate </li></ul>
  9. 9. Univariate data <ul><li>Measurement of single quantitative variable </li></ul><ul><li>Characterize distribution </li></ul><ul><li>Represented using following methods </li></ul><ul><ul><li>Histogram </li></ul></ul><ul><ul><li>Pie Chart </li></ul></ul>
  10. 10. Histogram
  11. 11. Pie Chart
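A histogram characterizes the distribution of a single quantitative variable by counting how many values fall into each bin. The sketch below is only an illustration: the function names, the sample values, and the bin width are all assumptions, not taken from the slides.

```python
from collections import Counter


def histogram(values, bin_width):
    """Count values per half-open bin [k*w, (k+1)*w); keys are bin starts."""
    counts = Counter(int(v // bin_width) for v in values)
    return {k * bin_width: counts[k] for k in sorted(counts)}


def render(hist):
    """Return one text bar per bin, e.g. '     0: ####'."""
    return [f"{start:>6}: {'#' * n}" for start, n in hist.items()]


if __name__ == "__main__":
    sample = [1, 2, 2, 3, 7, 8, 12]  # hypothetical measurements
    for line in render(histogram(sample, 5)):
        print(line)
```

A pie chart would use the same per-bin counts, drawn as shares of the total instead of bar lengths.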
  12. 12. Bivariate Data <ul><li>Consists of paired samples of two quantitative variables </li></ul><ul><li>Variables are related </li></ul><ul><li>Represented using the following methods </li></ul><ul><ul><li>Scatter plots </li></ul></ul><ul><ul><li>Line graphs </li></ul></ul>
  13. 13. Scatter plots
  14. 14. Line graphs
  15. 15. Multivariate Data <ul><li>Multi dimensional representation of multivariate data </li></ul><ul><li>Represented using following methods </li></ul><ul><ul><li>Icon based methods </li></ul></ul><ul><ul><li>Pixel based methods </li></ul></ul><ul><ul><li>Dynamic parallel coordinate system </li></ul></ul>
  16. 16. Icon based Methods
  17. 17. Pixel Based Methods <ul><li>Approach: </li></ul><ul><ul><li>Each attribute value is represented by one colored pixel (the value ranges of the attributes are mapped to a fixed color map). </li></ul></ul><ul><ul><li>The values of each attribute are presented in separate sub windows. </li></ul></ul><ul><li>Examples: </li></ul><ul><ul><li>Dense Pixel Displays </li></ul></ul>
  18. 18. Dense Pixel Display <ul><li>Approach: </li></ul><ul><ul><li>Each attribute value is represented by one colored pixel (the value ranges of the attributes are mapped to a fixed color map). </li></ul></ul><ul><ul><li>Different attributes are presented in separate sub windows. </li></ul></ul>
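The pixel mapping described above can be sketched as follows. This is a minimal illustration, assuming a linear mapping of each attribute's value range onto a fixed color map of `n_colors` entries; the function names and the record layout are hypothetical.

```python
def to_color_index(value, lo, hi, n_colors):
    """Map a value in [lo, hi] linearly onto a color index 0..n_colors-1."""
    if hi == lo:
        return 0  # constant attribute: every pixel gets the same color
    frac = (value - lo) / (hi - lo)
    frac = min(max(frac, 0.0), 1.0)
    return min(int(frac * n_colors), n_colors - 1)


def pixel_windows(records, n_colors=256):
    """Build one sub-window (list of color indices) per attribute.

    records: list of dicts mapping attribute name -> numeric value.
    """
    windows = {}
    for attr in records[0]:
        column = [r[attr] for r in records]
        lo, hi = min(column), max(column)
        windows[attr] = [to_color_index(v, lo, hi, n_colors) for v in column]
    return windows
```

Each list in the result holds one pixel per data record; a real display would additionally arrange those pixels within the sub-window, which this sketch leaves out.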
  19. 19. Visual Data Mining: Framework and Algorithm Development Ganesh, M., Han, E.H., Kumar, V., Shekhar, S., & Srivastava, J. (1996). Working Paper. Twin Cities, MN: University of Minnesota, Twin Cities Campus.
  20. 20. References <ul><li>http://coblitz.codeen.org:3125/citeseer.ist.psu.edu/cache/papers/cs/10335/ftp:zSzzSzftp.cs.umn.eduzSzdeptzSzuserszSzkumarzSzdatavis.pdf/ganesh96visual.pdf </li></ul><ul><li>http://www.ailab.si/blaz/predavanja/ozp/gradivo/2002-Keim-Visualization%20in%20DM-IEEE%20Trans%20Vis.pdf </li></ul><ul><li>http://www.geocities.com/anand_palm/ </li></ul>
  21. 21. Abstract <ul><li>VDM refers to the use of visualization techniques in the Data Mining process to </li></ul><ul><ul><li>Evaluate </li></ul></ul><ul><ul><li>Monitor </li></ul></ul><ul><ul><li>Guide </li></ul></ul><ul><li>This paper provides a framework for VDM via the loose coupling of databases and visualization systems. </li></ul><ul><li>The paper applies VDM towards designing new algorithms that can learn decision trees by manually refining some of the decisions made by well known algorithms such as C4.5. </li></ul>
  22. 22. Components of VQLBCI <ul><li>The three major components of VQLBCI are Visual Representations, Computations and Events. </li></ul>
  23. 23. Visual Development of Algorithms <ul><li>Most interesting use of visual data mining is the development of new insights and algorithms. </li></ul><ul><li>The figure below shows the ER diagram for learning classification decision trees. </li></ul><ul><li>This model allows the user to monitor the quality and impact of decisions made by the learning procedure. </li></ul><ul><li>Learning procedure can be refined interactively via a visual interface. </li></ul>
  24. 24. ER diagram for the search space of decision tree learning algorithm
  25. 25. General Framework <ul><li>Learning a classification decision tree from a training data set can be regarded as a process of searching for the best decision tree that meets user-provided goal constraints. </li></ul><ul><li>The problem space of this search process consists of Model Candidates, Model Candidate Generator and Model Constraints. </li></ul><ul><li>Many existing classification-learning algorithms like C4.5 and CDP fit nicely within this search framework. New learning algorithms that fit user’s requirements can be developed by defining the components of the problem space. </li></ul>
  26. 26. General Framework <ul><li>Model Candidate corresponds to the partial classification decision tree. Each node of the decision tree is a Model Atom </li></ul><ul><li>Search process is the process of finding a final model candidate such that it meets user goal specifications. </li></ul><ul><li>Model Candidate Generator transforms the current model candidate into a new model candidate by selecting one model atom to expand from the expandable leaf model atoms. </li></ul><ul><li>Model Constraints (used by Model Candidate Generator) provide controls and boundaries to the search space. </li></ul>
  27. 27. Search Process
  28. 28. Acceptability Constraint <ul><li>Model Constraints consist of Acceptability constraints, Expandability constraints and a Data-Entropy calculation function. </li></ul><ul><li>An Acceptability constraint predicate specifies when a model candidate is acceptable and thus allows the search process to stop. Examples: </li></ul><ul><ul><li>A1) Total number of expandable leaf model atoms = 0. </li></ul></ul><ul><ul><li>A2) Overall error rate of the model candidate <= acceptable error rate. </li></ul></ul><ul><ul><li>A3) Total number of model atoms in the model candidate >= maximal allowable tree size. </li></ul></ul><ul><ul><li>A1 is used in C4.5 and CDP </li></ul></ul>
  29. 29. Expandability Constraint <ul><li>An Expandability constraint predicate specifies whether a leaf model atom is expandable or not. EX: </li></ul><ul><ul><li>C4.5 uses E1 and E2 </li></ul></ul><ul><ul><li>CDP uses E2 and E3 </li></ul></ul>
  30. 30. Traversal Strategy <ul><li>Traversal strategy ranks expandable leaf model atoms based on the model atom attributes. EX: </li></ul><ul><ul><li>Increasing order of depth </li></ul></ul><ul><ul><li>Decreasing order of depth </li></ul></ul><ul><ul><li>Orders based on other model atom attributes. </li></ul></ul>
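The search framework of the preceding slides (model candidate, acceptability and expandability predicates, traversal strategy) can be sketched as a generic loop. This is an illustration under stated assumptions, not the authors' implementation: `ModelAtom`, `search`, the toy `expand` function, and the concrete predicates are all hypothetical, and the slides' E1 to E3 definitions are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class ModelAtom:
    depth: int
    n_cases: int    # training cases reaching this node
    n_errors: int   # cases misclassified if the node stays a leaf
    expanded: bool = False


def search(root, expand, is_acceptable, is_expandable, rank):
    """Grow a model candidate (a list of atoms) until it is acceptable."""
    candidate = [root]
    while True:
        leaves = [a for a in candidate
                  if not a.expanded and is_expandable(a)]
        # Stop when the acceptability predicate holds or nothing can expand.
        if not leaves or is_acceptable(candidate, leaves):
            return candidate
        atom = min(leaves, key=rank)       # traversal strategy picks one atom
        atom.expanded = True
        candidate.extend(expand(atom))     # model candidate generator step


def toy_expand(atom):
    """Hypothetical split: two children, each with half the cases/errors."""
    return [ModelAtom(atom.depth + 1, atom.n_cases // 2, atom.n_errors // 2)
            for _ in range(2)]


if __name__ == "__main__":
    tree = search(
        ModelAtom(depth=0, n_cases=8, n_errors=4),
        expand=toy_expand,
        is_acceptable=lambda cand, leaves: len(leaves) == 0,    # like A1
        is_expandable=lambda a: a.n_errors > 0 and a.depth < 3, # assumed rule
        rank=lambda a: a.depth,            # increasing order of depth
    )
    print(len(tree), "model atoms")
```

Swapping the three callables is what turns the same loop into a different learning algorithm, which is the point the framework makes.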
  31. 31. Steps in Visual Algorithm Development <ul><li>No single algorithm is the best all the time; performance is highly data dependent. </li></ul><ul><li>By changing different predicates of model constraints, users can construct new classification-learning algorithms. </li></ul><ul><li>This enables users to find an algorithm that works the best on a given data set. </li></ul><ul><li>Two algorithms are developed: BF, based on the Best-First search idea, and CDP+, which is a modification of CDP </li></ul>
  32. 32. BF <ul><li>This algorithm is based on the Best-First search idea. </li></ul><ul><li>For Acceptability criteria, it includes A1 and A2 with a user specified acceptable error rate. </li></ul><ul><li>The Traversal strategy chosen is T3 </li></ul><ul><li>In Best-First, expandable leaf model atoms are ranked according to the decreasing order of the number of misclassified training cases. (local error rate * size of subset training data set) </li></ul><ul><li>The traversal strategy will expand a model atom that has the most misclassified training cases, thus reducing the overall error rate the most. </li></ul>
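The Best-First ranking can be illustrated with a few lines of code. This is a sketch only: the tuple layout and function names are assumptions, and the leaf values are made-up numbers chosen to show the ordering.

```python
def misclassified(local_error_rate, subset_size):
    """Number of misclassified training cases at a leaf model atom."""
    return local_error_rate * subset_size


def best_first_order(leaves):
    """Sort leaves by decreasing number of misclassified training cases.

    leaves: list of (name, local_error_rate, subset_size) tuples.
    """
    return sorted(leaves,
                  key=lambda leaf: misclassified(leaf[1], leaf[2]),
                  reverse=True)


if __name__ == "__main__":
    leaves = [("a", 0.5, 10), ("b", 0.1, 200), ("c", 0.4, 20)]
    print([name for name, _, _ in best_first_order(leaves)])
```

Leaf "b" has the lowest local error rate but the largest subset, so it accounts for the most misclassified cases and is expanded first.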
  33. 33. CDP+ <ul><li>CDP+ is a modification of CDP </li></ul><ul><li>CDP has dynamic pruning using expandability constraint E3. </li></ul><ul><li>Here, the depth is modified according to the size of the training data set of the model atom. </li></ul><ul><li>We set </li></ul><ul><li>B is the branching factor of the decision tree, t is the size of training data set belonging to model atom, T is the whole training data set. </li></ul>
  34. 34. Comparison of different classification learning algorithms
  35. 35. Experiment <ul><li>The new BF and CDP+ algorithms are compared with the C4.5 and CDP algorithms. </li></ul><ul><li>Various metrics are selected to compare the efficiency, accuracy and size of final decision trees of the classification algorithms. </li></ul><ul><li>The generation efficiency of the nodes is measured in terms of the total number of nodes generated. </li></ul><ul><li>To compare the accuracy of the various algorithms, the mean classification error on the test data sets has been computed. </li></ul>
  36. 36. Classification error for 10 data sets
  37. 37. Nodes generated for 10 data sets
  38. 38. Final decision tree size
  39. 39. Results/Conclusion <ul><li>CDP has accuracy comparable to C4.5 while generating considerably fewer nodes. </li></ul><ul><li>CDP+ has accuracy comparable to C4.5 while generating considerably fewer nodes. </li></ul><ul><li>CDP+ outperformed CDP in error rate and number of nodes generated. </li></ul><ul><li>Considering all performance metrics together, CDP+ is the best overall algorithm. </li></ul><ul><li>Considering classification accuracy alone, C4.5P is the winner. </li></ul>
  40. 40. Conclusion <ul><li>Different datasets require different algorithms for best results. </li></ul><ul><li>Diverse user requirements put different constraints on the final decision tree. </li></ul><ul><li>The experiment shows that Interactive Visual Data Mining Framework can help find the most suitable algorithm for a given data set and group of user requirements. </li></ul>
  41. 41. Data Mining for Selective Visualization of Large Spatial Datasets <ul><li>Proceedings of the 14th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'02), </li></ul><ul><li>Washington, DC, USA, November 2002 </li></ul><ul><li>Shashi Shekhar, Chang-Tien Lu, Pusheng Zhang, Rulin Liu </li></ul><ul><li>Computer Science & Engineering Department </li></ul><ul><li>University of Minnesota </li></ul>
  42. 42. References <ul><li>http://citeseer.ist.psu.edu/cache/papers/cs/27216/http:zSzzSzwww-users.cs.umn.eduzSzzCz7EctluzSzPaperTalkFilezSzits02.pdf/shekhar02cubeview.pdf </li></ul><ul><li>http://www.cs.umn.edu/Research/shashi-group/ </li></ul><ul><li>http://www.cs.umn.edu/Research/shashi-group/Book/sdb-chap1.pdf </li></ul><ul><li>http://www.cs.umn.edu/research/shashi-group/alan_planb.pdf </li></ul><ul><li>http://coblitz.codeen.org:3125/citeseer.ist.psu.edu/cache/papers/cs/27637/http:zSzzSzwww-users.cs.umn.eduzSzzCz7EpushengzSzpubzSzkdd2001zSzkdd.pdf/shekhar01detecting.pdf </li></ul>
  43. 43. Basic Terminology <ul><li>Spatial databases </li></ul><ul><ul><li>Alphanumeric data + geographical coordinates </li></ul></ul><ul><li>Spatial mining </li></ul><ul><ul><li>Mining of spatial databases </li></ul></ul><ul><li>Spatial data warehouse </li></ul><ul><ul><li>Contains geographical data </li></ul></ul><ul><li>Spatial outliers </li></ul><ul><ul><li>Observations that appear to be inconsistent with the remainder of that set of data </li></ul></ul>
  44. 44. Spatial Cluster
  45. 45. Contribution <ul><li>Propose and implement the CubeView visualization system </li></ul><ul><li>General data cube operations </li></ul><ul><li>Built on the concept of spatial data warehouse to support data mining and data visualization </li></ul><ul><li>Efficient and scalable spatial outlier detection algorithms </li></ul>
  46. 46. Challenges in spatial data mining <ul><li>First, classical data mining deals with numbers and categories. </li></ul><ul><li>Spatial data is </li></ul><ul><ul><li>more complex and </li></ul></ul><ul><ul><li>includes extended objects such as points, lines and polygons. </li></ul></ul><ul><li>Second, classical data mining works with explicit inputs, whereas spatial predicates and attributes are often implicit. </li></ul><ul><li>Third, classical data mining treats each input independently of other inputs. </li></ul>
  47. 47. Application Domain <ul><li>The Traffic Management Center of the Minnesota Department of Transportation (MNDOT) maintains a database that archives measurements from its sensor network. </li></ul><ul><li>The sensor network includes </li></ul><ul><ul><li>about nine hundred stations, </li></ul></ul><ul><ul><li>each of which contains one to four loop detectors </li></ul></ul><ul><li>Measurements: volume and occupancy. </li></ul><ul><ul><li>Volume is the number of vehicles passing through a station in a 5-minute interval </li></ul></ul><ul><ul><li>Occupancy is the percentage of time a station is occupied by vehicles </li></ul></ul>
  48. 48. Basic Concepts <ul><li>Spatial Data Warehouse </li></ul><ul><li>Spatial Data Mining </li></ul><ul><li>Spatial Outliers Detection </li></ul>
  49. 49. Spatial Data Warehouse <ul><li>Employs data cube structure </li></ul><ul><li>Outputs - albums of maps. </li></ul><ul><li>Traffic data warehouse </li></ul><ul><ul><li>Measures - volume and occupancy </li></ul></ul><ul><ul><li>Dimensions - time and space. </li></ul></ul>
  50. 50. Spatial Data Mining <ul><li>Process of discovering interesting and useful but implicit spatial patterns. </li></ul><ul><li>The key goal is to partially ‘automate’ knowledge discovery </li></ul><ul><li>Search for “nuggets” of information embedded in very large quantities of spatial data. </li></ul>
  51. 51. Spatial Outliers Detection <ul><li>Suspiciously deviating observations </li></ul><ul><li>Local instability </li></ul><ul><li>Each Station </li></ul><ul><ul><li>Spatial attributes – time, space </li></ul></ul><ul><ul><li>Non spatial attributes – volume, occupancy </li></ul></ul>
  52. 52. Basic Structure – CubeView
  53. 53. CubeView Visualization System <ul><li>Each node in the cube corresponds to a visualization style </li></ul><ul><ul><li>S - Traffic volume of each station at all times. </li></ul></ul><ul><ul><li>T_TD – Time of the day </li></ul></ul><ul><ul><li>T_DW – Day of the week </li></ul></ul><ul><ul><li>S T_TD – Daily traffic volume of each station </li></ul></ul><ul><ul><li>T_TD T_DW S – Traffic volume at each station at different times on different days </li></ul></ul>
  54. 54. Dimension Lattice
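A dimension lattice over three dimensions has one node per subset of those dimensions, and each node aggregates the measure over the dimensions it leaves out. The sketch below illustrates this for traffic volume; the key names `T_TD`, `T_DW`, `S`, the record layout, and the choice of summation as the aggregate are assumptions for illustration, not CubeView's actual data model.

```python
from collections import defaultdict
from itertools import combinations

DIMS = ("T_TD", "T_DW", "S")  # time of day, day of week, station


def cuboid(records, dims):
    """Sum 'volume' grouped by the given subset of dimensions."""
    totals = defaultdict(int)
    for r in records:
        totals[tuple(r[d] for d in dims)] += r["volume"]
    return dict(totals)


def lattice(records):
    """Compute every node of the dimension lattice (2**3 = 8 cuboids)."""
    return {dims: cuboid(records, dims)
            for k in range(len(DIMS) + 1)
            for dims in combinations(DIMS, k)}


if __name__ == "__main__":
    records = [  # hypothetical 5-minute readings
        {"T_TD": "08:00", "T_DW": "Mon", "S": "s1", "volume": 30},
        {"T_TD": "08:00", "T_DW": "Tue", "S": "s1", "volume": 40},
        {"T_TD": "17:00", "T_DW": "Mon", "S": "s2", "volume": 25},
    ]
    cube = lattice(records)
    print(cube[("S",)])   # volume per station, all times collapsed
    print(cube[()])       # grand total
```

Recomputing every cuboid from the raw records keeps the sketch short; a data warehouse would typically derive coarser nodes from finer ones instead.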
  55. 55. CubeView Visualization System
  56. 56. CubeView Visualization System
  57. 57. CubeView Visualization System
  58. 58. Data Mining Algorithms for Visualization <ul><li>Problem Definition </li></ul><ul><ul><li>Given a spatial graph G = { S , E } </li></ul></ul><ul><ul><li>S - set of stations s1, s2, s3, s4, ... </li></ul></ul><ul><ul><li>E – edges (neighborhood of stations) </li></ul></ul><ul><ul><li>f ( x ) - attribute value for a data record x </li></ul></ul><ul><ul><li>N ( x ) - fixed cardinality set of neighbors of x </li></ul></ul><ul><ul><li>E[ f ( y ) ] over y in N ( x ) - average attribute value of the neighbors of x </li></ul></ul><ul><ul><li>S( x ) - difference between the attribute value of x and the average attribute value of its neighbors. </li></ul></ul>
  59. 59. Data Mining Algorithms for Visualization <ul><li>Problem Definition cont… </li></ul><ul><ul><li>S( x ) - difference of the attribute value of each data object and the average attribute value of neighbors. </li></ul></ul><ul><ul><li>Test for detecting an outlier </li></ul></ul><ul><ul><li>confidence level threshold θ </li></ul></ul>
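The outlier test itself is not reproduced in this transcript, so the sketch below rests on an assumption: S(x) is standardized by the mean and standard deviation of S over all stations, and x is flagged when the absolute standardized value exceeds the threshold θ. The function names, the ring-shaped neighborhood, and the readings are hypothetical.

```python
from statistics import mean, pstdev


def neighbor_differences(f, neighbors):
    """S(x) = f(x) minus the average of f over the neighbors N(x)."""
    return {x: f[x] - mean(f[y] for y in neighbors[x]) for x in f}


def spatial_outliers(f, neighbors, theta):
    """Flag x when |(S(x) - mu_S) / sigma_S| > theta (assumed test form)."""
    s = neighbor_differences(f, neighbors)
    mu, sigma = mean(s.values()), pstdev(s.values())
    if sigma == 0:
        return []  # all differences identical: nothing stands out
    return [x for x in s if abs((s[x] - mu) / sigma) > theta]


if __name__ == "__main__":
    # Six stations on a ring; each has its two adjacent stations as N(x).
    names = [f"s{i}" for i in range(6)]
    f = dict(zip(names, [10, 10, 10, 40, 10, 10]))
    neighbors = {n: [names[i - 1], names[(i + 1) % 6]]
                 for i, n in enumerate(names)}
    print(spatial_outliers(f, neighbors, theta=1.5))
```

Only the station whose reading differs sharply from both neighbors is flagged; its neighbors get a nonzero S(x) as well, but stay below the threshold.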
  60. 60. Data Mining Algorithms for Visualization <ul><li>Few points </li></ul><ul><ul><li>First, the neighborhood can be selected based on a fixed cardinality or a fixed graph distance or a fixed Euclidean distance. </li></ul></ul><ul><ul><li>Second, the choice of neighborhood aggregate function can be mean, variance, or auto-correlation. </li></ul></ul><ul><ul><li>Third, the choice for comparing a location with its neighbors can be either just a number or a vector of attribute values. </li></ul></ul><ul><ul><li>Finally, the statistic for the base distribution can be selected as normal distribution. </li></ul></ul>
  61. 61. Data Mining Algorithms for Visualization <ul><li>Algorithms </li></ul><ul><ul><li>Test Parameters Computation(TPC) Algorithm </li></ul></ul><ul><ul><li>Route Outlier Detection(ROD) Algorithm </li></ul></ul>
  62. 62. Data Mining Algorithms for Visualization
  63. 63. Data Mining Algorithms for Visualization
  64. 64. Data Mining Algorithms for Visualization
  65. 65. Software <ul><li>http://www.cs.umn.edu/research/shashi-group/vis/traffic_volumemap2.htm </li></ul><ul><li>http://www.cs.umn.edu/research/shashi-group/vis/DataCube.htm </li></ul>
  66. 66. Visualization and Data Mining techniques Thank you!!!!
