Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Effects of change propagation resulting from adaptive preprocessing in multicomponent predictive systems

189 views

Published on

Predictive modelling is a complex process that requires a number of steps to transform raw data into predictions. Preprocessing of the input data is a key step in such process, and the selection of proper preprocessing methods is often a labour intensive task. Such methods are usually trained offline and their parameters remain fixed during the whole model deployment lifetime. However, preprocessing of non-stationary data streams is more challenging since the lack of adaptation of such preprocessing methods may degrade system performance. In addition, dependencies between different predictive system components make the adaptation process more challenging. In this paper we discuss the effects of change propagation resulting from using adaptive preprocessing in a Multicomponent Predictive System (MCPS). To highlight various issues we present four scenarios with different levels of adaptation. A number of experiments have been performed with a range of datasets to compare the prediction error in all four scenarios. Results show that well managed adaptation considerably improves the prediction performance. However, the model can become inconsistent if adaptation in one component is not correctly propagated throughout the rest of system components. Sometimes, such inconsistency may not cause an obvious deterioration in the system performance, therefore being difficult to detect. In some other cases it may even lead to a system failure as was observed in our experiments.

Published in: Data & Analytics
  • Be the first to comment

  • Be the first to like this

Effects of change propagation resulting from adaptive preprocessing in multicomponent predictive systems

  1. 1. Effects of change propagation resulting from adaptive preprocessing in multicomponent predictive systems Manuel Martín Salvador, Marcin Budka, Bogdan Gabrys {msalvador,mbudka,bgabrys}@bournemouth.ac.uk Data Science Institute. Bournemouth University KES-2016, York, UK September 7th, 2016
  2. 2. Outline 1. Prologue 2. Introduction to MCPS 3. Motivation 4. Reactive adaptation of MCPS 5. Experiments 6. Conclusion
  3. 3. PROLOGUE
  4. 4. Butterfly effect Small causes can have large effects — Edward Lorenz (1917 - 2008) Source: GloWings
  5. 5. Change propagation Controlled change management in a system CC by TheGiantVermin
  6. 6. Data streams “Infinite” number of records Continuously arriving to the system at different or same rates Can be stationary or evolving
  7. 7. Data streams Examples: ● Sensors in manufacturing industry ● Traffic monitoring sensors ● Event logs in websites ● Transactions in the financial sector “Infinite” number of records Continuously arriving to the system at different or same rates Can be stationary or evolving A single engine of Airbus A320 has more than 1000 sensors generating 10GB/s!!
  8. 8. INTRODUCTION TO MCPS
  9. 9. Data Stream Data stream learning for online prediction Predictive Model Online Supervised Learning Algorithm Predictions True labels t+k t
  10. 10. Data Stream Data stream learning for online prediction Predictive Model PredictionsPreprocessing Postprocessing Multicomponent Predictive System (MCPS)
  11. 11. MCPS composition Manual ● WEKA ● RapidMiner ● Knime ● IBM SPSS Automatic ● Auto-WEKA (Bayesian optimisation) ● Auto-sklearn (Bayesian optimisation + Meta-learning) ● TPOT (Genetic programming) ● e-Lico IDA (Ontologies + Planning) Example of WEKA workflow
  12. 12. Formalising MCPS o token (data) i place transition Well-handled and Acyclic Workflow Petri net (WA-WF-net) MCPS = (P, T, F)
  13. 13. Formalising MCPS o prediction i place transition Well-handled and Acyclic Workflow Petri net (WA-WF-net) MCPS = (P, T, F) “Automatic composition and optimisation of multicomponent predictive systems” @ IEEE TNNLS (under review) http://bit.ly/automatic-mcps-tnnls
  14. 14. Formalising MCPS Classifier o Replace missing values Dimensionality reduction Outlier handling token (data) i place transition Well-handled and Acyclic Workflow Petri net (WA-WF-net) MCPS = (P, T, F) “Automatic composition and optimisation of multicomponent predictive systems” @ IEEE TNNLS (under review) http://bit.ly/automatic-mcps-tnnls
  15. 15. MOTIVATION
  16. 16. Data changes over time Snapshot of SYN dataset at different times
  17. 17. Need of model adaptation Streaming error (mean over last 10 samples) SYN dataset with GFMM classifier GFMMZ-Score PCA Min-Max Wrongly classified
  18. 18. Need of preprocessing adaptation Streaming error (mean over last 10 samples) SYN dataset with GFMM classifier GFMMZ-Score PCA Min-Max Wrongly classified (out of [0,1]) New hyperboxes
  19. 19. Main strategies for MCPS adaptation Adaptation strategies GLOBAL LOCAL Re-composition Full Partial Hyperparameter optimisation (keep components) Full Partial Parameterisation (keep components and hyperparameters) Full Partial
  20. 20. Main strategies for MCPS adaptation Adaptation strategies GLOBAL LOCAL Re-composition Full Partial Hyperparameter optimisation (keep components) Full Partial Parameterisation (keep components and hyperparameters) Full Partial “Adapting Multicomponent Predictive Systems using Hybrid Adaptation Strategies with Auto-WEKA in Process Industry” @ AutoML / ICML 2016 http://bit.ly/adapting-mcps-paper This work!
  21. 21. Need of change propagation Streaming error (mean over last 10 samples) SYN dataset with GFMM classifier GFMMZ-Score PCA Min-Max Inconsistent hyperboxes due to a different input space
  22. 22. REACTIVE ADAPTATION OF MCPS
  23. 23. Reactive adaptation of MCPS GFMMZ-Score PCA Min-Max Time i p1 p2 p3 o [-3.1, 2.7] x1 = 3.6
  24. 24. Reactive adaptation of MCPS GFMMZ-Score PCA Min-Max Time i p1 p2 p3 o data meta-data [-3.1, 2.7] x1 = 3.6 [-3.1, 3.6]
  25. 25. Reactive adaptation of MCPS GFMMZ-Score PCA Min-Max Time i p1 p2 p3 o data meta-data prediction [-3.1, 2.7] x1 = 3.6 [-3.1, 3.6]
  26. 26. Updating a component: GFMM 0 1 1 0 (-3.1) (2.7) x1 x2 0 1 1 0 (-3.1) (3.6) x1 x2 Hyperboxes are mapped to the new input space
  27. 27. EXPERIMENTS
  28. 28. Experiments Name # Attr # Class Type SYN 2 2 Synthetic ELEC 7 2 Real COVERTYPE 54 7 Real GAS 128 6 Real Datasets Scenarios Id Adap. Model Adap. Prepro. Change Propagation #1 No No No #2 Yes No No #3 Yes Yes No #4 Yes Yes Yes First 200 samples for initial training, rest 400 for testing and online learning GFMMZ-Score PCA Min-Max
  29. 29. Results #3 crashes due to lack of change propagation when changing PCA components
  30. 30. CONCLUSION
  31. 31. Conclusion Only model adaptation may not be enough to cope with evolving data streams, adaptive preprocessing should be considered. However, “blind” adaptation of components can result in inconsistent models or even in a system crash. Local adaptation of a component may require adapting further components. Therefore, a system must be reactive and propagate changes. The definition of MCPS has been extended to support change propagation using a new token for meta-data in a coloured Petri net (cMCPS).
  32. 32. Future work Large study to measure the actual cost of adaptation. Open questions: ● How to handle propagation requiring changes of the Petri net structure? ● How to handle transformations in systems with nonlinear components? ● How to order components to reduce the cost of adaptation? ● Can a meta-data token be removed at an early stage instead of being fully propagated?
  33. 33. Thanks! Paper: http://bit.ly/change-propagation-mcps Slides: http://www.slideshare.net/draxus Manuel <msalvador@bournemouth.ac.uk> @draxus

×