How Data is Transforming the Dutch Media Industry

RTL Netherlands turns 30 in 2019. Video has been our core business. AI gives us the opportunity to deeply understand what our consumers love. On our Spark platform in AWS we apply several AI and ML methods to extract and analyze features. A selection of our content intelligence pipelines:

* Object and person detection in videos.
* Multi-modal emotion detection.
* Speaker identification.
* Script and subtitle keyword extraction.

These features feed different data science products, among others new show and episode creation, talkshow subject selection, and the interpretation of viewing ratings. Our future goal is to personalize TV on our video-on-demand platform: not only to recommend other series that you like, but also to create personalized talkshows and soap operas with the subjects, storylines, guests and characters that you like. Video is thus our basis, but digitally the opportunities are much more diverse.

With this talk I want to inspire and share knowledge. I learned a lot at the Spark Summit 2018, and some talks even helped to further build this content intelligence project. It would be amazing to give back to the Spark community, especially when it visits my hometown of Amsterdam. I want to surprise the attendees with the story of this unknown Dutch TV channel that is taking a leading role in content intelligence in the Netherlands and Europe. It will be an open, inspiring talk with technical details on the pipelines and technology that we used, accompanied by the end use cases and the drawbacks and challenges we faced. Not a talk about ambitions, but concrete results of the next level of TV innovation. RTL NL was the first broadcaster of Big Brother and The Voice, and I'm confident that the next break-out hit will be Spark-driven.

  2. 2. How data is transforming the Dutch media industry Maurits van der Goes | RTL Netherlands #UnifiedDataAnalytics #SparkAISummit
  4. 4. RTL NL. Linear television: 8.9 million daily TV viewers. Online video: 779 million online views per month. Digital publishing: 2.3 million unique visitors daily.
  5. 5. Consumers, Content, What, When, Where
  8. 8. Personalisation, Emotion Detection, Talkshow Analysis, Automated Trailers, Ratings Forecasting
  9. 9. Personalisation
  10. 10. Domain: News. Content: Articles. Model: Content.
  11. 11. Similarity: Taxonomy, TF-IDF, Embeddings. Uplift: 23.3% TF.IDF 22.2% Taxonomy 17.6% Editor’s pick (baseline) Random -19.3%
  12. 12. Domain: Films & Series. Content: Long Video. Model: Behavior.
  13. 13. Explainability, Neural networks, Last watched, A/B testing
  14. 14. 30 minutes more viewing time per user per month
  15. 15. Emotion Detection
  16. 16. We tell stories that touch the mind & heart. MEDIA = EMOTION
  17. 17. Emotion detection: Face & Emotion Detection, Musical Genre & Mood, Speaker Emotion
  18. 18. oarriaga/face_classification
  19. 19. BERT: Bidirectional Encoder Representations from Transformers (Devlin et al., 2018), google-research/bert (Kaggle). BERT-Base, Multilingual Cased: 104 languages, 12-layer, 768-hidden, 12-heads, 110M parameters
  21. 21. Talkshow Analysis
  22. 22. Scenario & subtitles matching: FuzzyWuzzy, Levenshtein Distance. Subtitles, Scenario, Items + TS
  23. 23. Item classification (MLlib). Item classes: Crime, Entertainment, Lifestyle, Royalty. Scores per model: NaiveBayes 0.89946; Logistic Regression (Count Vector Features) 0.84533; Logistic Regression (TF-IDF Features) 0.81268; RandomForest 0.56564
  25. 25. Ratings Forecasting
  26. 26. Model components (chart annotations: –0.12 pp/yr, –2.65 pp/yr, –1.95 pp/yr). Trend: piecewise-linear function. Seasonal: truncated Fourier series $f(t) = \sum_{n=1}^{N} \left[ a_n \cos\left(\frac{2\pi n t}{P}\right) + b_n \sin\left(\frac{2\pi n t}{P}\right) \right]$, $P = 1$ year. Weekday: dummy variables for weekdays, interaction with n = 1, 2 Fourier terms. Weather: four weather variables (temperature, wind, precipitation, sunshine), measured with respect to average conditions for the time of year; interactions: n = 1, 2 Fourier terms, saturday/sunday, internal. Event: dummy variables for various “events” (e.g. holidays, various sports events). Components are added step by step: trend; trend + seasonal; trend + seasonal + weekday; trend + seasonal + weekday + weather; trend + seasonal + weekday + weather + event.
  27. 27. Overall statistics (Manual vs. Data, with reduction): Mean Error 0.93 vs. 0.29 (-68%); Mean Percentage Error 5.4% vs. 1.3% (-77%); Mean Absolute Error 1.63 vs. 1.27 (-22%); Mean Absolute Percentage Error 9.5% vs. 7.3% (-23%)
  28. 28. Automated Trailers
  29. 29. Why promo generation? Promos are ideal for building awareness, branding and teasing the audience. Decline in viewership per channel -> increase in channels & promos. Promo generation is costly, a highly labor-intensive process. Artificial Intelligence can aid in analyzing, selecting and combining potential candidate shots and scenes.
  30. 30. Feature extraction (Pachyderm). Features: Video Aesthetics; Optical Flow; Shot Segmentation; Key Frame Extraction; Object Detection, Recognition & Segmentation; Face Detection, Recognition & Emotion; Visual Similarity
  31. 31. MAN OR MACHINE?
  32. 32. MAN OR MACHINE?
  33. 33. MAN OR MACHINE?
  34. 34. What, When, Where: Personalisation, Emotion Detection, Talkshow Analysis, Automated Trailers, Ratings Forecasting & more
  35. 35. DON’T FORGET TO RATE AND REVIEW THE SESSIONS SEARCH SPARK + AI SUMMIT
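
Slide 11 lists TF-IDF as one of the similarity measures behind the content-based article recommendations. A minimal sketch of that idea, assuming plain-Python code with whitespace tokenization and a smoothed IDF; the function names and the sample articles are hypothetical and not taken from the talk:

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            # Smoothed IDF (an assumption) keeps every weight positive.
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(index, docs):
    """Index of the document most similar to docs[index] (needs >= 2 docs)."""
    vectors = tfidf_vectors(docs)
    scores = [
        (cosine(vectors[index], vector), i)
        for i, vector in enumerate(vectors)
        if i != index
    ]
    return max(scores)[1]


# Made-up article texts, for illustration only.
articles = [
    "king opens new museum amsterdam",
    "police arrest robbery suspect rotterdam",
    "royal family visits museum opening",
    "robbery suspect appears before court",
]
print(most_similar(0, articles))
```

A recommender built this way would suggest the highest-scoring article that the reader has not seen yet; the taxonomy and embedding variants named on the slide would replace only the vector construction.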
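
Slide 22 matches scenario items to subtitles with FuzzyWuzzy and Levenshtein distance. The sketch below implements the edit distance directly in plain Python instead of calling FuzzyWuzzy, whose own scoring may differ in detail. The normalisation to a 0 to 1 score, the helper names and the sample lines are illustrative assumptions:

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute or match
            ))
        previous = current
    return previous[-1]


def similarity(a, b):
    """Edit distance rescaled to [0, 1], where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def best_match(scenario_line, subtitle_lines):
    """Return (index, score) of the subtitle line closest to the scenario line."""
    scores = [
        similarity(scenario_line.lower(), line.lower())
        for line in subtitle_lines
    ]
    index = max(range(len(scores)), key=scores.__getitem__)
    return index, scores[index]


# Made-up scenario and subtitle lines, for illustration only.
subtitles = [
    "welcome to the show",
    "tonight the royal wedding",
    "and now the weather",
]
print(best_match("Tonight: the royal wedding", subtitles))
```

If each subtitle line carries a timestamp, the index returned by `best_match` is enough to attach a time to the matched scenario item.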
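
Slide 26 models seasonality with a truncated Fourier series of period P = 1 year, next to dummy variables for weekdays. A minimal sketch of how such regressors could be built, assuming time is counted in days and P = 365.25 days; the function names, the origin date and the choice of two harmonics are illustrative and not the speaker's code:

```python
import math
from datetime import date

PERIOD_DAYS = 365.25  # assumption: P = 1 year, expressed in days


def fourier_terms(t, n_terms=2, period=PERIOD_DAYS):
    """Return [cos_1, sin_1, ..., cos_N, sin_N] evaluated at time t."""
    terms = []
    for n in range(1, n_terms + 1):
        angle = 2.0 * math.pi * n * t / period
        terms.extend([math.cos(angle), math.sin(angle)])
    return terms


def weekday_dummies(day):
    """One-hot vector for the weekday, Monday first."""
    return [1.0 if day.weekday() == k else 0.0 for k in range(7)]


def seasonal_value(t, a, b, period=PERIOD_DAYS):
    """Evaluate the sum over n of a_n cos(2 pi n t / P) + b_n sin(2 pi n t / P)."""
    return sum(
        a_n * math.cos(2.0 * math.pi * n * t / period)
        + b_n * math.sin(2.0 * math.pi * n * t / period)
        for n, (a_n, b_n) in enumerate(zip(a, b), start=1)
    )


def design_row(day, origin=date(2019, 1, 1)):
    """Regressors for one day: Fourier terms followed by weekday dummies."""
    t = (day - origin).days
    return fourier_terms(t) + weekday_dummies(day)


print(design_row(date(2019, 6, 1)))
```

The coefficients a_n and b_n would come out of a regression on rows like these; the trend, weather and event components from the slide would add further columns.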
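
Slide 27 compares the manual and the data-driven forecasts on four statistics. The sketch below shows one common way to compute them, assuming error = forecast minus actual and percentages taken relative to a non-zero actual value. The slide does not state its conventions, and the sample numbers are made up for illustration:

```python
def forecast_errors(actual, forecast):
    """Return ME, MPE, MAE and MAPE for paired actual and forecast values."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("actual and forecast must be non-empty and equally long")
    # Assumption: positive error means the forecast was too high.
    errors = [f - a for a, f in zip(actual, forecast)]
    percentages = [100.0 * e / a for e, a in zip(errors, actual)]
    n = len(errors)
    return {
        "ME": sum(errors) / n,
        "MPE": sum(percentages) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": sum(abs(p) for p in percentages) / n,
    }


def reduction(before, after):
    """Relative change from before to after, in percent."""
    return 100.0 * (after - before) / before


# Made-up ratings and forecasts, for illustration only.
print(forecast_errors([10.0, 20.0], [12.0, 18.0]))
print(round(reduction(1.63, 1.27)))
```

The signed statistics (ME, MPE) reveal systematic over- or under-forecasting because positive and negative errors cancel, while the absolute ones (MAE, MAPE) measure the typical size of a miss. Applied to the rounded Mean Absolute Error values on the slide, `reduction(1.63, 1.27)` comes out at roughly -22%.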
