“Malware detection has had to change drastically over the past few years, moving away from traditional signature- or list-based techniques. Couple this with the rise of attacks on mobile devices, where mobile data is predicted to account for 60% of the internet in 2018*, and our online lives will need Machine Learning (ML) and Data Science to keep them secure. At Wandera we have successfully implemented a malware detection (and classification) ML model at scale using Apache Spark (MLlib) and the PMML-via-OpenScoring paradigm. In this talk we will touch on the training data and why we use Spark at all, the features we extract from mobile phone applications, and how we then obtain our high accuracy scores in the cloud. *https://blog.cloudflare.com/our-predictions-for-2018/”