"How do you make over 250TB of data useful to data scientists? Bring a lot of CPUs? If only it were so simple!
Understanding customer behavior requires high-quality, reliable data. Data quality challenges are magnified at high volumes, limiting data scientists’ ability to understand, clean, and use the data. If you have to clean up 600 million events per day, it’s like cleaning Moscone West’s floors: as soon as you’re done, you have to start all over again.
Come learn how McAfee built a data-driven pipeline using Azure Databricks that maintains high data quality and comprehensive lineage, so that data scientists can be more productive and make sound statistical inferences."