Models of Genetic Based Machine Learning (GBML) systems are commonly used to understand how a system works and, as a consequence, to tune it better. In this paper we propose models for the probability of obtaining a good initial population when using the Attribute List Knowledge Representation (ALKR) for discrete inputs with the GABIL encoding. We base our work on the schema and covering bound models previously proposed for XCS. The models are extended to (a) deal with the combined ALKR+GABIL representation, (b) explicitly handle datasets with niche overlap and (c) model the impact of using covering and a default rule in the representation. The models are designed within the framework of the BioHEL GBML system and are evaluated empirically, first on Boolean datasets and then on nominal datasets of higher cardinality. The models allow us to evaluate the challenges presented by problems with high cardinality (in terms of both the number of attributes and the number of values per attribute), as well as the benefits contributed by each component of BioHEL's representation and initialisation operators.
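To make the idea of an initial-population probability concrete, the following is a minimal sketch of a covering-style estimate. It is not the model proposed in the paper. It assumes a simplified ALKR+GABIL rule in which a fixed number of attributes is expressed, each expressed attribute has one bit per value, and each bit is set to 1 independently with probability `p_one`. Under those assumptions a random rule matches a given instance with probability `p_one ** num_expressed`, and at least one of `population_size` independent rules matches it with probability `1 - (1 - p_one ** num_expressed) ** population_size`. All function and parameter names are hypothetical and are not taken from BioHEL.

```python
import random


def random_rule(cardinalities, num_expressed, p_one, rng):
    """Build a hypothetical ALKR+GABIL rule.

    The rule is a dict mapping an expressed attribute index to a list of
    bits, one bit per value of that attribute (GABIL-style encoding).
    Assumption: each bit is 1 independently with probability p_one.
    """
    expressed = rng.sample(range(len(cardinalities)), num_expressed)
    return {
        attr: [1 if rng.random() < p_one else 0
               for _ in range(cardinalities[attr])]
        for attr in expressed
    }


def matches(rule, instance):
    """A rule matches if, for every expressed attribute, the bit for the
    instance's value is set. Unexpressed attributes are ignored."""
    return all(bits[instance[attr]] == 1 for attr, bits in rule.items())


def analytic_cover_probability(num_expressed, p_one, population_size):
    """Probability that at least one of population_size independent random
    rules matches a fixed instance, under the assumptions stated above."""
    p_match = p_one ** num_expressed
    return 1.0 - (1.0 - p_match) ** population_size


def simulated_cover_probability(cardinalities, instance, num_expressed,
                                p_one, population_size, trials, seed=0):
    """Monte Carlo estimate of the same probability, for comparison."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        population = (
            random_rule(cardinalities, num_expressed, p_one, rng)
            for _ in range(population_size)
        )
        if any(matches(rule, instance) for rule in population):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    cards = [3, 3, 3, 3]      # four nominal attributes, three values each
    example = [0, 1, 2, 0]    # one instance, given as value indices
    print(analytic_cover_probability(2, 0.75, 3))
    print(simulated_cover_probability(cards, example, 2, 0.75, 3, 4000))
```

In this simplified setting the match probability falls geometrically with the number of expressed attributes, which is one way to see why problems with many attributes make a good initial population harder to obtain by random initialisation alone.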