In this talk we will present some techniques that we use on a day-to-day basis in our research, where we combine our internet-wide data scanning and acquisition platform with machine learning and data science techniques. This combination allows us to find things faster and to extract results in a more automated way. We will focus on practical cases and examples that even our audience at home will be able to use if they want.

We will start with a very brief introduction to the data science world and talk about:

- Technologies
- Techniques
- How these relate to infosec
- Algorithms and how they can be used
- How people can come into the world of data and machine learning
- Data visualization techniques and what are the best choices for different types of data

A couple of the examples we will look at are how to classify images such as VNC or X11 screenshots, OCR, how to classify network scans using machine learning, and the use of natural language processing to analyze CVEs. We will look at scoring and classification algorithms and how they can be used on IP addresses, and we will talk about the use of learning and how we are applying it in real life.

We will also talk a bit about a data analysis and classification pipeline architecture: we will look at the different technologies, what they do, and how they can be used.
Some specific examples of our research that should give you an idea of some things we will talk about can be seen here:

- https://blog.binaryedge.io/2015/11/10/ssh/
- https://blog.binaryedge.io/2015/09/30/vnc-image-analysis-and-data-science/
- https://blog.binaryedge.io/2015/08/10/data-technologies-and-security-part-1/