Traditional market research generally relies on questionnaires or other forms of explicit feedback, put directly to an ad hoc panel of individuals who, in aggregate, are representative of a larger group of people. Unfortunately, these traditional approaches are often invasive, hard to scale, and biased. Indirect approaches based on sparse, implicit consumer feedback (e.g., social network interactions, web browsing, or online purchases) are more scalable, more authentic, and better suited to real-time consumer insights.
Although those sources of implicit consumer feedback provide relevant and detailed pictures of the population, each one on its own exposes only a limited set of observable behaviors.
The Holy Grail of market research is the ability to merge different sources of consumer interests into an augmented view that connects all the dots across multiple domains.
Unfortunately, user-centric "fusion" algorithms have many limitations when the datasets are heterogeneous and differ strongly in size and density, and when the number of sources to merge increases.
We propose a novel approach, Audience Projection, that defines a target audience as a subset of the population in a source domain and projects this target onto a set of users in a destination dataset.
We will show how libraries such as spaCy provide deep learning implementations of Named Entity Recognition (NER) that can be used to match related brands, and we will use Bayesian inference to transfer knowledge from the source domain. This way, we can estimate the probability that a user belongs to the target, using the source distribution of the volume of interests in common entities as model evidence and the source target size as prior probability.
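To make the Bayesian step more concrete, here is a minimal sketch of one way such a projection could work. It is not the implementation presented in the talk: it assumes a naive Bayes model in which entities are conditionally independent, assumes that entity names have already been matched across the two datasets (for example by an NER step), and uses only the entities a user has shown interest in, since missing implicit feedback is weak evidence. The function names `fit_source_model` and `project_user`, the smoothing parameter `alpha`, and the toy data are all hypothetical.

```python
import math


def fit_source_model(source_interests, target_users, alpha=1.0):
    """Estimate the prior and per-entity interest rates from the source domain.

    source_interests: dict mapping a source user id to a set of entity names.
    target_users: set of source user ids that form the target audience.
    alpha: additive smoothing (an assumption) to avoid zero probabilities.
    Assumes the target is a non-empty, proper subset of the source users.
    """
    n_total = len(source_interests)
    n_target = len(target_users)
    n_rest = n_total - n_target
    # Prior probability: the relative size of the target in the source domain.
    prior = n_target / n_total

    entities = set().union(*source_interests.values())
    rates = {}
    for entity in entities:
        in_target = sum(
            1 for user, ents in source_interests.items()
            if user in target_users and entity in ents
        )
        in_rest = sum(
            1 for user, ents in source_interests.items()
            if user not in target_users and entity in ents
        )
        # Smoothed P(entity | target) and P(entity | not target).
        rates[entity] = (
            (in_target + alpha) / (n_target + 2 * alpha),
            (in_rest + alpha) / (n_rest + 2 * alpha),
        )
    return prior, rates


def project_user(user_entities, prior, rates):
    """Posterior probability that a destination user belongs to the target."""
    log_odds = math.log(prior / (1 - prior))
    for entity in user_entities:
        if entity in rates:  # only entities common to both domains count
            p_target, p_rest = rates[entity]
            log_odds += math.log(p_target / p_rest)
    return 1 / (1 + math.exp(-log_odds))


# Toy source domain (hypothetical data): users s1 and s2 form the target.
source = {
    "s1": {"nike", "adidas"},
    "s2": {"nike"},
    "s3": {"ikea"},
    "s4": {"ikea", "adidas"},
}
prior, rates = fit_source_model(source, target_users={"s1", "s2"})

# A destination user interested only in "nike" scores above the prior,
# one interested only in "ikea" scores below it.
print(project_user({"nike"}, prior, rates))
print(project_user({"ikea"}, prior, rates))
```

In this sketch, an entity that is equally popular inside and outside the target leaves the prior unchanged, and a destination user with no entities in common with the source simply falls back to the prior.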
Gianmario Spacagna is the chief scientist and head of AI at Helixa. His team’s mission is building the next generation of behavior algorithms and models of human decision making with careful attention to their potential and effects on society. His experience covers a diverse portfolio of machine learning algorithms and data products across different industries. Previously, he worked as a data scientist in IoT automotive (Pirelli Cyber Technology), retail and business banking (Barclays Analytics Centre of Excellence), threat intelligence (Cisco Talos), predictive marketing (AgilOne), plus some occasional freelancing. He’s a co-author of the book Python Deep Learning, contributor to the “Professional Manifesto for Data Science,” and founder of the Data Science Milan community. Gianmario holds a master’s degree in telematics (Polytechnic of Turin) and software engineering of distributed systems (KTH of Stockholm). After having spent half of his career abroad, he now lives in Milan. His favorite hobbies include home cooking, hiking, and exploring the surrounding nature on his motorcycle.