  1. 1. Data mining exercise with SPSS Clementine, Lab 6
     Winnie Lam, Email: [email_address], Website:
     The Hong Kong Polytechnic University, Department of Computing
     Last update: 22/09/2005
  2. 2. Introduction - Neural Networks
     • Also known as Artificial Neural Networks (ANNs).
     • They can be considered simplified mathematical models of brain-like systems, and they function as parallel distributed computing networks.
     • Their functionality is loosely based on the neuron (the functional unit of the nervous system).
     This image is copyright Dennis Kunkel at
  3. 3. Neuron (figure: a neuron, labelled with INPUT and OUTPUT)
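  4. 4. Neural Networks
     A single neuron has 5 components:
     1. Input x
     2. Weight w
     3. Bias b
     4. Activation function f
     5. Output y
     (Figure: the inputs x_1, x_2, ..., x_n are multiplied by the weights w_1, w_2, ..., w_n; a constant input x_0 = 1 carries the weight w_0 (the bias, b); the weighted inputs are summed (Σ) and passed through f to give the output y, i.e. y = f(w_0 x_0 + w_1 x_1 + ... + w_n x_n).)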
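The five components on slide 4 can be tried out in a few lines of Python. This is only a minimal sketch: the function names (`neuron_output`, `sign`, `sigmoid`) and the example numbers are assumptions made for illustration, and the slides do not say which activation function f is used.

```python
import math


def neuron_output(inputs, weights, bias, activation):
    """Return y = f(b + w_1*x_1 + ... + w_n*x_n) for a single neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = bias + sum(w * x for w, x in zip(weights, inputs))
    return activation(weighted_sum)


def sign(z):
    """Threshold activation returning +1 or -1 (an assumed choice of f)."""
    return 1 if z >= 0 else -1


def sigmoid(z):
    """Smooth activation squashing z into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical example: two inputs, two weights and a small bias.
y = neuron_output([1.0, 2.0], [0.5, -0.25], 0.1, sign)
print(y)
```

Passing the bias separately is equivalent to the slide's picture, where the bias is the weight w_0 on a constant input x_0 = 1.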
  5. 5. An illustration
     • Example: X, a bunch of faces (X is the set of objects we intend to separate)
     • x, a single face
     • f(x) = 1 or -1 for x in X
  6. 6. An illustration (figure: f applied to two example faces; the images are not recoverable)
  7. 7. Neural Networks
     • In Clementine, the neural networks used are feedforward neural networks, also known as multilayer perceptrons.
     • The neurons (or units) in such networks are arranged in layers.
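To see what "arranged in layers" means in a feedforward network, here is a small sketch of a forward pass. It is not Clementine's implementation: the sigmoid activation, the function names and all weight values are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """Compute the outputs of one layer; each unit has one weight row and one bias."""
    return [
        sigmoid(b + sum(w * x for w, x in zip(row, inputs)))
        for row, b in zip(weights, biases)
    ]


def mlp_forward(inputs, layers):
    """Feed the inputs through the layers in order (first hidden layer to output layer)."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical network: 2 inputs -> 3 hidden units -> 1 output unit.
hidden = ([[0.5, -0.5], [0.1, 0.2], [-0.3, 0.8]], [0.0, 0.1, -0.1])
output = ([[1.0, -1.0, 0.5]], [0.0])
result = mlp_forward([1.0, 0.0], [hidden, output])
print(result)
```

Each layer only receives the outputs of the layer before it, which is what makes the network "feedforward".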
  8. 8. Stage 1: Data Understanding
     Data file is located in: http://
  9. 9. Data Understanding
     • Given: Data file (DRUG1n)
     Answer by yourself:
     1. How many attributes are there?
     2. How many records are there?
     3. Are there any problems in the data?
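Outside Clementine, the three questions on slide 9 can also be checked with a short script. The sketch below assumes the file is comma-separated with a header row; the three sample rows are made up for illustration and are not taken from DRUG1n, and `summarize` is a hypothetical helper name.

```python
import csv
import io

# Made-up sample in the assumed layout (header row, comma-separated).
SAMPLE = """Age,Sex,BP,Cholesterol,Na,K,Drug
23,F,HIGH,HIGH,0.79,0.03,drugY
47,M,LOW,HIGH,0.74,0.06,drugC
61,F,LOW,,0.56,0.07,drugY
"""


def summarize(csv_text):
    """Count attributes and records, and count empty values per attribute."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    fields = reader.fieldnames or []
    missing = {
        name: sum(1 for row in rows if not (row[name] or "").strip())
        for name in fields
    }
    return {"attributes": len(fields), "records": len(rows), "missing": missing}


summary = summarize(SAMPLE)
print(summary)
```

To run it on the real data, replace `SAMPLE` with the text of the data file. Empty values are only one kind of problem; inconsistent category spellings or out-of-range numbers would need further checks.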
  10. 10. Data Preparation
      Task 1: Import data into Clementine
      Add node: Var. File (in Source Palette)
      Result (screenshot)
  11. 11. Stage 2: Data Preparation
  12. 12. Data Preparation
      Task 2: Derive a new field “Na_to_K” (ratio of Na to K)
      Add node: Derive (in Field Ops Palette)
      Result (screenshot)
  13. 13. Data Preparation
      Task 3: Discard the fields “Na” and “K”
      Add node: Filter (in Field Ops Palette)
      Result (screenshot)
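Tasks 2 and 3 amount to adding one computed field and then dropping the two fields it was computed from. A minimal sketch of the same two steps on a single record, assuming each record is a dictionary of field names to values (the helper names and the sample values are hypothetical):

```python
def derive_na_to_k(record):
    """Return a copy of the record with the new field Na_to_K = Na / K."""
    derived = dict(record)
    derived["Na_to_K"] = float(record["Na"]) / float(record["K"])
    return derived


def drop_fields(record, names=("Na", "K")):
    """Return a copy of the record without the named fields."""
    return {key: value for key, value in record.items() if key not in names}


# Made-up record for illustration only.
record = {"Age": "23", "Na": "0.5", "K": "0.25", "Drug": "drugY"}
prepared = drop_fields(derive_na_to_k(record))
print(prepared)
```

The order matters: the ratio has to be derived before Na and K are discarded.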
  14. 14. Data Preparation
      Task 4: Partition the dataset into Training and Testing set (50/50)
      Add node: Partition (in Field Ops Palette)
  15. 15. Data Preparation
      Task 5: Select Training and Testing set
      Add node: Select (in Record Ops Palette)
      Result: the two sets contain 95 records and 105 records
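A 50/50 partition does not have to give two sets of exactly equal size. One plausible reason for the 95/105 split is that each record is assigned to a set at random; the sketch below illustrates that idea. It is an assumption about how such a partition can work, not a description of Clementine's internals, and the function name and seed are hypothetical.

```python
import random


def partition(records, train_fraction=0.5, seed=0):
    """Assign each record independently to the training or the testing set."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    training, testing = [], []
    for record in records:
        if rng.random() < train_fraction:
            training.append(record)
        else:
            testing.append(record)
    return training, testing


records = list(range(200))  # stand-ins for 200 data records
training, testing = partition(records)
print(len(training), len(testing))
```

Every record ends up in exactly one of the two sets, but the sizes are only approximately equal and change with the seed.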
  16. 16. Data Preparation
      Task 6a: Define and update the fields’ values and types
      Task 6b: Set the input fields (Age, Sex, BP, Cholesterol, Na_to_K) and the output field (Drug)
      Add node: Type (in Field Ops Palette)
  17. 17. Stage 3: Data Mining – Neural Networks
  18. 18. Data Mining – Neural Networks
      Goal: Classification for the “drug” attribute
      Add node: Neural Net (in Modeling Palette)
      Result (screenshot)
  19. 19. Data Mining – Neural Networks
      Goal: Validate the model with the test set
      Result (screenshot)
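Validating with the test set means comparing the drug predicted by the model with the actual drug for each test record. A minimal sketch of that comparison, with made-up labels (the helper names and the example lists are hypothetical):

```python
def accuracy(actual, predicted):
    """Return the fraction of records whose predicted label equals the actual label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one record is required")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


def confusion_counts(actual, predicted):
    """Count how often each (actual, predicted) pair of labels occurs."""
    counts = {}
    for pair in zip(actual, predicted):
        counts[pair] = counts.get(pair, 0) + 1
    return counts


# Made-up labels for four test records.
actual = ["drugA", "drugB", "drugA", "drugC"]
predicted = ["drugA", "drugB", "drugC", "drugC"]
print(accuracy(actual, predicted))
print(confusion_counts(actual, predicted))
```

The pair counts show which drugs are confused with which, something a single accuracy figure hides.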
  20. 20. Data Mining – Neural Networks
      • If the fields were already selected in the Type node, choose “Use type node settings”.
      • Otherwise, choose “Use custom settings” and select the targets and inputs.
  21. 21. Data Mining – Neural Networks
      • Training method: there are 6 training methods for building neural network models.
      • An option randomly splits the data into separate training and test sets for the purposes of model building.
      • Stopping criteria: by default, the network stops training when it appears to have reached its optimally trained state.
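The slides do not say how Clementine decides that a network "appears to have reached its optimally trained state". A common approach, shown here purely as an assumption, is early stopping: measure the error on held-out data after each training cycle and stop once it has not improved for a few cycles. The function name, the `patience` parameter and the error values below are all hypothetical.

```python
def train_with_early_stopping(train_one_cycle, validation_error,
                              max_cycles=100, patience=3):
    """Train until the held-out error has not improved for `patience` cycles.

    Returns (best_cycle, best_error).
    """
    best_error = float("inf")
    best_cycle = 0
    cycles_without_improvement = 0
    for cycle in range(1, max_cycles + 1):
        train_one_cycle()
        error = validation_error()
        if error < best_error:
            best_error, best_cycle = error, cycle
            cycles_without_improvement = 0
        else:
            cycles_without_improvement += 1
            if cycles_without_improvement >= patience:
                break
    return best_cycle, best_error


# Simulated run: the held-out error falls, then rises as the network overtrains.
simulated_errors = iter([0.9, 0.6, 0.4, 0.45, 0.5, 0.55, 0.3])
best = train_with_early_stopping(lambda: None, lambda: next(simulated_errors))
print(best)
```

In this simulated run training stops after the sixth cycle, so the later value 0.3 is never seen; a larger `patience` trades longer training for a lower chance of stopping too early.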
  22. 22. Data Mining – Neural Networks
      More advanced settings:
      • specifying the no. of hidden layers and the no. of nodes in each layer
      • learning rates
  23. 23. Data Mining
      New Task:
      1. Discover the rules for classification of drugs with C5.0
      2. Determine its accuracy with the test set
  24. 24. SUMMARY
      Today, you have:
      • Revised how to derive new attributes
      • Discarded useless fields
      • Performed data partition (training and test)
      • Built a neural network model
      • Validated the model with the test set