Title: Fingerprinting Latent Structure in Data
Abstract: As data-hungry algorithms find widespread application, there is increasing interest in how these algorithms behave in small-data settings. In many niche industrial applications, data is not only kept secret for reasons of privacy and competitive advantage, but is also limited in volume and variety. Under these constraints, applications built on data-driven algorithms are still expected to misbehave rarely. If one can discard data that does not match the capabilities of the underlying algorithms, one gains better control over how unexpected data influences application behavior. This talk presents data fingerprinting techniques in the context of small data and application behavior. Capturing and representing the latent structure in data as a fingerprint helps evolve algorithm complexity, thereby improving application reliability. As illustrations, problems involving question answering, cell structure detection, and recognition of classes of short textual messages will be discussed.