RapidMiner5 2.6 - Performance Validation and Visualization
Validation
Validation is the process of assessing how well your mining models perform against real data. It is important to validate your mining models by understanding their quality and characteristics before you deploy them into a production environment.
Performance Validation
When applying a model to a real-world problem, one usually wants a statistically reliable estimate of its performance. There are several ways to measure this performance, all of which compare the predicted label with the true label. This is of course only possible if the true label is known.
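To make the comparison of predicted and true labels concrete, here is a minimal sketch in plain Python. It is not RapidMiner code: the function names `accuracy` and `confusion_counts` and the example labels are assumptions chosen for illustration, and accuracy is only one of several possible performance measures.

```python
from collections import Counter


def accuracy(true_labels, predicted_labels):
    """Fraction of examples whose predicted label equals the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one labeled example is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def confusion_counts(true_labels, predicted_labels):
    """Count how often each (true label, predicted label) pair occurs."""
    return Counter(zip(true_labels, predicted_labels))


# Hypothetical example data: five examples with known true labels.
true_labels = ["yes", "no", "yes", "yes", "no"]
predicted_labels = ["yes", "no", "no", "yes", "yes"]

print(accuracy(true_labels, predicted_labels))  # 3 of 5 correct -> 0.6
print(confusion_counts(true_labels, predicted_labels))
```

The confusion counts show which kinds of errors occur, which a single accuracy value hides.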
Performance Validation in RapidMiner
The usual way to estimate performance is therefore to split the labeled dataset into a training set, which is used to build the model, and a test set, which is used to estimate its performance. The operators in this section implement different ways of splitting the dataset into training and test sets and of evaluating the performance of a model.
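The split into training and test set can be sketched outside RapidMiner as follows. This is a plain-Python illustration under stated assumptions, not the implementation of any RapidMiner operator: `split_holdout`, `train_majority`, `evaluate`, the 30% test ratio, and the example data are all hypothetical, and the "model" is a trivial baseline that always predicts the most frequent training label.

```python
import random
from collections import Counter


def split_holdout(examples, test_ratio=0.3, seed=42):
    """Shuffle the labeled examples and split them into (training, test)."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


def train_majority(training_set):
    """Baseline 'model': always predict the most frequent training label."""
    majority = Counter(label for _, label in training_set).most_common(1)[0][0]
    return lambda features: majority


def evaluate(model, test_set):
    """Accuracy of the model on examples it has not seen during training."""
    correct = sum(model(features) == label for features, label in test_set)
    return correct / len(test_set)


# Hypothetical labeled dataset: (features, label) pairs.
examples = [
    ((1.0, 0.2), "no"), ((1.5, 0.1), "no"), ((2.0, 0.4), "no"),
    ((2.5, 0.3), "no"), ((3.0, 0.9), "no"), ((3.5, 0.8), "no"),
    ((4.0, 0.7), "yes"), ((4.5, 0.6), "yes"), ((5.0, 0.5), "yes"),
    ((5.5, 0.0), "yes"),
]

training_set, test_set = split_holdout(examples)
model = train_majority(training_set)
print(len(training_set), len(test_set))
print(evaluate(model, test_set))
```

Because the test examples are never shown to the training step, the resulting accuracy estimates how the model behaves on unseen data rather than how well it memorized the training set.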
Performance Validation in RapidMiner: The Validation Operator
The two parts of the validation operator: Training and Testing
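The training/testing pair can also be run repeatedly on different splits. The sketch below uses k-fold cross-validation as one common scheme; the source does not state which scheme the pictured operator uses, so this choice is an assumption. All names (`k_fold_indices`, `cross_validate`, `train_majority`, `accuracy_on`) and the example data are hypothetical, and the code is plain Python rather than a RapidMiner process.

```python
import random
from collections import Counter


def k_fold_indices(n_examples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    if not 2 <= k <= n_examples:
        raise ValueError("k must be between 2 and the number of examples")
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    for fold in range(k):
        test_idx = indices[fold::k]  # every k-th shuffled index
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        yield train_idx, test_idx


def cross_validate(examples, k, train, test):
    """Run the training part and the testing part once per fold.

    train(training_set) returns a model; test(model, test_set) returns a score.
    The result is the mean score over all folds.
    """
    scores = []
    for train_idx, test_idx in k_fold_indices(len(examples), k):
        model = train([examples[i] for i in train_idx])         # training part
        scores.append(test(model, [examples[i] for i in test_idx]))  # testing part
    return sum(scores) / len(scores)


def train_majority(training_set):
    """Baseline 'model': always predict the most frequent training label."""
    majority = Counter(label for _, label in training_set).most_common(1)[0][0]
    return lambda features: majority


def accuracy_on(model, test_set):
    """Fraction of test examples the model labels correctly."""
    return sum(model(f) == label for f, label in test_set) / len(test_set)


# Hypothetical labeled dataset: (features, label) pairs.
examples = [((float(i),), "yes" if i >= 6 else "no") for i in range(9)]
print(cross_validate(examples, 3, train_majority, accuracy_on))
```

Passing `train` and `test` as separate functions mirrors the two-part structure named in the caption: one part builds a model, the other measures it on held-out data.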
Visualization
Visualization operators provide visualization techniques for data and other RapidMiner objects. Visualization is probably the most important tool for gaining insight into your data and the nature of its underlying patterns.
SOM plots
Visualize model by SOM: Generates a SOM (self-organizing map) plot of the given data set, reducing an arbitrary number of dimensions to two, and colors the landscape with the predictions of the given model. The operator thus visualizes arbitrary models by applying SOM-based dimensionality reduction to both the data set and the given model.
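The idea behind such a plot can be sketched in plain Python. This is a simplified illustration, not RapidMiner's implementation: the function names, the 3x3 grid, the learning-rate and radius schedules, the example data, and `hypothetical_model` are all assumptions. Each grid cell holds a weight vector in the original data space, and the "color" of a cell is the model's prediction for that weight vector.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(data, rows, cols, epochs=30, seed=0):
    """Train a small SOM and return {(row, col): weight_vector}."""
    rng = random.Random(seed)
    # Start every grid node at a randomly chosen data point.
    weights = {
        (r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # one shuffled pass over the data
            progress = step / total_steps
            learning_rate = 0.5 * (1.0 - progress)                     # decays to 0
            radius = 0.5 + (max(rows, cols) / 2.0) * (1.0 - progress)  # shrinks
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_dist_sq = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                influence = math.exp(-grid_dist_sq / (2.0 * radius ** 2))
                for i, value in enumerate(x):
                    # Pull the node toward x, more strongly near the BMU.
                    w[i] += learning_rate * influence * (value - w[i])
            step += 1
    return weights


def colorize(weights, model):
    """Label every grid cell with the model's prediction for its weights."""
    return {node: model(w) for node, w in weights.items()}


def hypothetical_model(features):
    """Stand-in for a trained model: thresholds the mean feature value."""
    return "high" if sum(features) / len(features) > 5.0 else "low"


# Hypothetical four-dimensional data set with two loose groups.
data = [
    (1.0, 2.0, 1.5, 0.5), (0.5, 1.0, 2.0, 1.5), (2.0, 1.5, 0.5, 1.0),
    (9.0, 8.5, 9.5, 8.0), (8.0, 9.0, 8.5, 9.5), (9.5, 8.0, 9.0, 8.5),
]

som = train_som(data, rows=3, cols=3)
landscape = colorize(som, hypothetical_model)
for r in range(3):
    print(" ".join(landscape[(r, c)] for c in range(3)))
print(best_matching_unit(som, data[0]))  # 2-D grid position of a 4-D point
```

A real plot would draw the grid as an image and use colors instead of text labels; the mapping from data point to grid cell and from grid cell to prediction is the same.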