This document discusses using Spark and R for petabyte-scale data science at Comcast. It describes Comcast's top data initiatives, how Spark and SparkR enable machine learning algorithms at that scale, and an example that uses a Hidden Markov Model with SparkR to analyze streaming data and detect hidden states. In performance testing, the model processed one day of data, 1.7 billion observations, in 30 minutes with 92% accuracy.
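To make "detecting hidden states" concrete, here is a minimal sketch of Viterbi decoding, a standard way to recover the most likely hidden-state sequence from a Hidden Markov Model. It is plain single-machine Python, not the SparkR implementation the summary refers to, and the summary does not say which decoding method was used. The state names, observation symbols, and probabilities below are hypothetical values chosen only for illustration.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for a sequence of observations."""

    # Work in log space so long sequences do not underflow to zero.
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at time t.
    best = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    # back[t][s]: the predecessor of s on that best path.
    back = [{}]

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log(emit_p[s][observations[t]])
            back[t][s] = prev

    # Start from the best final state and follow the back-pointers.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model (not from the source): two hidden states, two symbols.
STATES = ["healthy", "degraded"]
START_P = {"healthy": 0.8, "degraded": 0.2}
TRANS_P = {
    "healthy": {"healthy": 0.9, "degraded": 0.1},
    "degraded": {"healthy": 0.2, "degraded": 0.8},
}
EMIT_P = {
    "healthy": {"ok": 0.9, "error": 0.1},
    "degraded": {"ok": 0.3, "error": 0.7},
}

decoded = viterbi(["ok", "ok", "error", "error", "error"],
                  STATES, START_P, TRANS_P, EMIT_P)
```

With these toy parameters, `decoded` is `["healthy", "healthy", "degraded", "degraded", "degraded"]`: the run of `error` observations shifts the most likely path into the `degraded` state. A distributed version would presumably decode many independent sequences in parallel across the cluster, but that partitioning is an assumption and is not described in the summary.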