In this paper we perform an exploratory analysis of a finan- cial data set from a Spanish bank. Our goal is to do risk prediction in credit operations, and as data is collected continuously and reported on a monthly basis, this gives rise to a streaming data classification problem. Our analysis reveals some practical problems that have not previously been thoroughly analyzed in the context of streaming data analysis: the class labels are not immediately available and the rele- vant predictive features and entities under study (in this case the set of customers) may vary over time. In order to address these problems, we propose to use a dynamic classifier with a wrapper feature subset selection to find relevant features at di↵erent time steps. The proposed model is a special case of a more general framework that can also ac- commodate more expressive models containing latent variables as well as more sophisticated feature selection schemes.
Full text link: http://www.idi.ntnu.no/~helgel/papers/BorchaniMartinezMasegosaLangsethNielsenSalmeronFernandezMadsenSaezSCAI15.pdf