Be the first to like this
Fighting fake registrations, phishing, spam and other types of abuse on the consumer web appears at first glance to be an application tailor-made for machine learning: you have lots of data and lots of features, and you are looking for a binary response (is it an attack or not) on each request. However, building machine learning systems to address these problems in practice turns out to be anything but a textbook process. In particular, you must answer such questions as:
- How do we obtain quality labeled data?
- How do we keep models from "forgetting the past"?
- How do we test new models in adversarial environments?
- How do we stop adversaries from learning our classifiers?
In this talk I will explain how machine learning is typically used to solve abuse problems, discuss these and other challenges that arise, and describe some approaches that can be implemented to produce robust, scalable systems.