Over 109 million subscribers enjoy more than 125 million hours of TV shows and movies per day on Netflix. This generates a massive amount of data flowing through our data ingestion pipeline, which we use to improve the service and the user experience. The data powers a variety of analytics use cases, such as personalization, operational insight, and fraud detection. At the heart of this data ingestion pipeline is a self-serve stream processing platform that processes 3 trillion events and 12 PB of data every day. We recently migrated this stream processing platform from Samza to Flink. In this talk, we will share the challenges and issues that we ran into when running Flink at scale in the cloud. We will dive deep into the troubleshooting techniques and lessons learned.