How a media data platform drives
real-time insights & analytics using Spark
Bart Van Der Vurst
Partner at element61
The media landscape is undergoing its biggest transformation
Ongoing market changes
▪ rise of the “Tech Giants”
▪ death of anonymous tracking and
third-party cookies
▪ GDPR / privacy
▪ digital transformation
▪ growing paid content, digital subscriptions
▪ search for personalisation
Data is flooding the media landscape
An enormous amount of data is tracked continuously
▪ Online behavioural or clickstream data
▪ Reading behaviour & attention time
▪ Interests & cross-brand tracking
▪ Digital subscription data
▪ Digital newspaper
▪ Weekly magazines
▪ Programmatic advertising & marketing
▪ Impressions
▪ Clicks
VALUE?
Media companies need a data & digital turnaround
Challenge to tackle
▪ From tackling small to huge data volumes
▪ Subscription data vs. clickstream data
▪ From reporting to data-driven (media) services
▪ From classical BI to a modern data platform
2 years ago,
Roularta & element61 embarked on this challenge
Who is Roularta Media Group?
▪ Roularta Media Group is a Belgian multimedia
group, market leader in the field of magazines,
local media and business television
since 1954
▪ 1300+ employees, EUR 295+ million revenue
Who is element61?
▪ Analytics & AI consultancy team (70 people)
& Databricks partner in Belgium
Roularta built a vision focused on its own data
Roularta built a data strategy with 4 pillars
A data analytics platform serving as foundation
of reporting, real-time personalization & insight services
The available BI stack couldn’t deliver on requirements
BI Stack vs. Modern Requirements
▪ One Analytical Data Hub
▪ Manage big volumes of data – 2.5 TB/month
▪ High-performance platform – 20 million transactions/day
▪ True real-time dependencies
(content recommendation, advertising audiences, etc.)
▪ Process structured & unstructured data
▪ GDPR compliant
▪ Advanced scoring, modeling and AI/ML capabilities
▪ Data democratization/self-service
▪ Dashboards in different departments
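To put the two volume figures from this slide in perspective, a back-of-the-envelope conversion may help. The sketch below assumes a 30-day month, decimal units (1 TB = 1000 GB) and load spread evenly over the day; none of these assumptions come from the slides, and real traffic will peak well above the average.

```python
# Back-of-the-envelope conversion of the slide's volume figures.
# Assumptions (not from the slides): 30-day month, decimal units,
# and load spread evenly over 24 hours.

TB_PER_MONTH = 2.5                  # from the slide
TRANSACTIONS_PER_DAY = 20_000_000   # from the slide (20 million per day)

DAYS_PER_MONTH = 30                 # assumption
SECONDS_PER_DAY = 24 * 60 * 60


def average_gb_per_day(tb_per_month: float, days: int = DAYS_PER_MONTH) -> float:
    """Average daily data volume in GB for a given monthly volume in TB."""
    return tb_per_month * 1000 / days


def average_transactions_per_second(per_day: int) -> float:
    """Average transaction rate per second for a given daily count."""
    return per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"{average_gb_per_day(TB_PER_MONTH):.1f} GB/day on average")
    print(f"{average_transactions_per_second(TRANSACTIONS_PER_DAY):.0f} transactions/s on average")
```

Under these assumptions the platform needs to absorb roughly 83 GB of new data per day and a little over 230 transactions per second on average, before accounting for peaks.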
We’ve built this modern data analytics platform
with Azure & Databricks
Approach taken
▪ Leverage latest best practices (e.g. Delta Lake)
▪ First use case live in <6 months
▪ Built iteratively & use-case driven
▪ First focused on reporting, now added AI services
We use Delta Lake end-to-end
What it means
▪ Generally available 2019Q2
Introduced at Roularta 2019Q3
▪ Used across all data lake layers
and for both real-time & batch
data processing
▪ Cornerstone for the GDPR solution
& compliance put into place
Tangible results quickly: We started with editorial
dashboards serving real-time insights
Newsletter dashboards for optimizing marketing
We use predictive article quality scoring
for editorial tuning & paywall decisions
Traffic score, Engagement score, Conversion score
(Predicted) article quality score
Data of every article
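The slide shows three sub-scores feeding one article quality score but does not say how they are combined. As a purely illustrative sketch, the code below uses a weighted average; the 0-100 scale, the weights and all names (`ArticleScores`, `WEIGHTS`, `quality_score`) are hypothetical and are not Roularta's actual model, which the referenced session describes as a predictive ML tool.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ArticleScores:
    """Hypothetical sub-scores for one article, each on a 0-100 scale."""
    traffic: float
    engagement: float
    conversion: float


# Hypothetical weights (an assumption, not from the slides); they must sum to 1.
WEIGHTS: Mapping[str, float] = {"traffic": 0.3, "engagement": 0.4, "conversion": 0.3}


def quality_score(scores: ArticleScores, weights: Mapping[str, float] = WEIGHTS) -> float:
    """Combine the sub-scores into one quality score via a weighted average."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    total = 0.0
    for name, weight in weights.items():
        value = getattr(scores, name)
        if not 0.0 <= value <= 100.0:
            raise ValueError(f"{name} score must be between 0 and 100")
        total += weight * value
    return total


if __name__ == "__main__":
    article = ArticleScores(traffic=80.0, engagement=60.0, conversion=40.0)
    print(f"quality score: {quality_score(article):.1f}")
```

A fixed weighting like this is easy to explain to an editorial team; a trained model would replace `quality_score` while keeping the same three inputs.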
Want to know more?
(Re-)watch the Data & AI session “Building an ML Tool to predict Article Quality Scores using Delta & MLFlow”
Quality scores as self-service insight cockpits
The best is yet to come!
• Enhanced content classification algorithms
for traffic, engagement & conversion
• Dynamic content-tagging for advertising
• Consumer segmentation & profiling
• Publisher dashboards
• Marketeer dashboards
• Further use of (Artificial) Intelligence for Dynamic Paywall
• Newsfeed curation based on behaviour & content potential
Roularta has a media data platform capable of scaling
… and the best is yet to come
• Enhanced content classification algorithms
for traffic, engagement & conversion
• Publisher dashboards
• Dynamic content-tagging for advertising
• Consumer segmentation
• Marketeer dashboards
• Further AI for Dynamic Paywall
• Newsfeed curation based on behaviour &
content potential
Questions?

More Related Content

What's hot

Denodo DataFest 2017: Outpace Your Competition with Real-Time Responses
Denodo DataFest 2017: Outpace Your Competition with Real-Time ResponsesDenodo DataFest 2017: Outpace Your Competition with Real-Time Responses
Denodo DataFest 2017: Outpace Your Competition with Real-Time Responses
Denodo
 
How to Build Fast Data Applications: Evaluating the Top Contenders
How to Build Fast Data Applications: Evaluating the Top ContendersHow to Build Fast Data Applications: Evaluating the Top Contenders
How to Build Fast Data Applications: Evaluating the Top Contenders
VoltDB
 
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
Dataconomy Media
 
Denodo DataFest 2017: Edge Computing: Collecting vs. Connecting to Streaming ...
Denodo DataFest 2017: Edge Computing: Collecting vs. Connecting to Streaming ...Denodo DataFest 2017: Edge Computing: Collecting vs. Connecting to Streaming ...
Denodo DataFest 2017: Edge Computing: Collecting vs. Connecting to Streaming ...
Denodo
 
Real-time Microservices and In-Memory Data Grids
Real-time Microservices and In-Memory Data GridsReal-time Microservices and In-Memory Data Grids
Real-time Microservices and In-Memory Data Grids
Ali Hodroj
 
Data in Motion vs Data at Rest
Data in Motion vs Data at RestData in Motion vs Data at Rest
Data in Motion vs Data at Rest
Internap
 
Accelerate and modernize your data pipelines
Accelerate and modernize your data pipelinesAccelerate and modernize your data pipelines
Accelerate and modernize your data pipelines
Paul Van Siclen
 
apidays LIVE Singapore - Democratising data access with APIs by Tarush Aggarw...
apidays LIVE Singapore - Democratising data access with APIs by Tarush Aggarw...apidays LIVE Singapore - Democratising data access with APIs by Tarush Aggarw...
apidays LIVE Singapore - Democratising data access with APIs by Tarush Aggarw...
apidays
 
Denodo DataFest 2017: Integrating Big Data and Streaming Data with Enterprise...
Denodo DataFest 2017: Integrating Big Data and Streaming Data with Enterprise...Denodo DataFest 2017: Integrating Big Data and Streaming Data with Enterprise...
Denodo DataFest 2017: Integrating Big Data and Streaming Data with Enterprise...
Denodo
 
From ingest to insights with AWS
From ingest to insights with AWSFrom ingest to insights with AWS
From ingest to insights with AWS
Paul Van Siclen
 
Hybrid Transactional/Analytics Processing: Beyond the Big Database Hype
Hybrid Transactional/Analytics Processing: Beyond the Big Database HypeHybrid Transactional/Analytics Processing: Beyond the Big Database Hype
Hybrid Transactional/Analytics Processing: Beyond the Big Database Hype
Ali Hodroj
 
Why Finance Should Consider Agile Modern Data Delivery Platform
Why Finance Should Consider Agile Modern Data Delivery PlatformWhy Finance Should Consider Agile Modern Data Delivery Platform
Why Finance Should Consider Agile Modern Data Delivery Platform
syed_javed
 
Building a Modern FinTech Big Data Infrastructure
Building a Modern FinTech Big Data InfrastructureBuilding a Modern FinTech Big Data Infrastructure
Building a Modern FinTech Big Data Infrastructure
Databricks
 
Altis Webinar: Use Cases For The Modern Data Platform
Altis Webinar: Use Cases For The Modern Data PlatformAltis Webinar: Use Cases For The Modern Data Platform
Altis Webinar: Use Cases For The Modern Data Platform
Altis Consulting
 
Advanced analytics integration with python
Advanced analytics integration with pythonAdvanced analytics integration with python
Advanced analytics integration with python
Paul Van Siclen
 
Using Kafka in Your Organization with Real-Time User Insights for a Customer ...
Using Kafka in Your Organization with Real-Time User Insights for a Customer ...Using Kafka in Your Organization with Real-Time User Insights for a Customer ...
Using Kafka in Your Organization with Real-Time User Insights for a Customer ...
confluent
 
Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...
Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...
Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...
confluent
 
In-Memory Computing Webcast. Market Predictions 2017
In-Memory Computing Webcast. Market Predictions 2017In-Memory Computing Webcast. Market Predictions 2017
In-Memory Computing Webcast. Market Predictions 2017
SingleStore
 
Getting Started with Big Data Analytics
Getting Started with Big Data AnalyticsGetting Started with Big Data Analytics
Getting Started with Big Data Analytics
Rob Winters
 
HP Discover: Real Time Insights from Big Data
HP Discover: Real Time Insights from Big DataHP Discover: Real Time Insights from Big Data
HP Discover: Real Time Insights from Big Data
Rob Winters
 

What's hot (20)

Denodo DataFest 2017: Outpace Your Competition with Real-Time Responses
Denodo DataFest 2017: Outpace Your Competition with Real-Time ResponsesDenodo DataFest 2017: Outpace Your Competition with Real-Time Responses
Denodo DataFest 2017: Outpace Your Competition with Real-Time Responses
 
How to Build Fast Data Applications: Evaluating the Top Contenders
How to Build Fast Data Applications: Evaluating the Top ContendersHow to Build Fast Data Applications: Evaluating the Top Contenders
How to Build Fast Data Applications: Evaluating the Top Contenders
 
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
Sudhir Rawat, Sr Techonology Evangelist at Microsoft SQL Business Intelligenc...
 
Denodo DataFest 2017: Edge Computing: Collecting vs. Connecting to Streaming ...
Denodo DataFest 2017: Edge Computing: Collecting vs. Connecting to Streaming ...Denodo DataFest 2017: Edge Computing: Collecting vs. Connecting to Streaming ...
Denodo DataFest 2017: Edge Computing: Collecting vs. Connecting to Streaming ...
 
Real-time Microservices and In-Memory Data Grids
Real-time Microservices and In-Memory Data GridsReal-time Microservices and In-Memory Data Grids
Real-time Microservices and In-Memory Data Grids
 
Data in Motion vs Data at Rest
Data in Motion vs Data at RestData in Motion vs Data at Rest
Data in Motion vs Data at Rest
 
Accelerate and modernize your data pipelines
Accelerate and modernize your data pipelinesAccelerate and modernize your data pipelines
Accelerate and modernize your data pipelines
 
apidays LIVE Singapore - Democratising data access with APIs by Tarush Aggarw...
apidays LIVE Singapore - Democratising data access with APIs by Tarush Aggarw...apidays LIVE Singapore - Democratising data access with APIs by Tarush Aggarw...
apidays LIVE Singapore - Democratising data access with APIs by Tarush Aggarw...
 
Denodo DataFest 2017: Integrating Big Data and Streaming Data with Enterprise...
Denodo DataFest 2017: Integrating Big Data and Streaming Data with Enterprise...Denodo DataFest 2017: Integrating Big Data and Streaming Data with Enterprise...
Denodo DataFest 2017: Integrating Big Data and Streaming Data with Enterprise...
 
From ingest to insights with AWS
From ingest to insights with AWSFrom ingest to insights with AWS
From ingest to insights with AWS
 
Hybrid Transactional/Analytics Processing: Beyond the Big Database Hype
Hybrid Transactional/Analytics Processing: Beyond the Big Database HypeHybrid Transactional/Analytics Processing: Beyond the Big Database Hype
Hybrid Transactional/Analytics Processing: Beyond the Big Database Hype
 
Why Finance Should Consider Agile Modern Data Delivery Platform
Why Finance Should Consider Agile Modern Data Delivery PlatformWhy Finance Should Consider Agile Modern Data Delivery Platform
Why Finance Should Consider Agile Modern Data Delivery Platform
 
Building a Modern FinTech Big Data Infrastructure
Building a Modern FinTech Big Data InfrastructureBuilding a Modern FinTech Big Data Infrastructure
Building a Modern FinTech Big Data Infrastructure
 
Altis Webinar: Use Cases For The Modern Data Platform
Altis Webinar: Use Cases For The Modern Data PlatformAltis Webinar: Use Cases For The Modern Data Platform
Altis Webinar: Use Cases For The Modern Data Platform
 
Advanced analytics integration with python
Advanced analytics integration with pythonAdvanced analytics integration with python
Advanced analytics integration with python
 
Using Kafka in Your Organization with Real-Time User Insights for a Customer ...
Using Kafka in Your Organization with Real-Time User Insights for a Customer ...Using Kafka in Your Organization with Real-Time User Insights for a Customer ...
Using Kafka in Your Organization with Real-Time User Insights for a Customer ...
 
Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...
Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...
Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...
 
In-Memory Computing Webcast. Market Predictions 2017
In-Memory Computing Webcast. Market Predictions 2017In-Memory Computing Webcast. Market Predictions 2017
In-Memory Computing Webcast. Market Predictions 2017
 
Getting Started with Big Data Analytics
Getting Started with Big Data AnalyticsGetting Started with Big Data Analytics
Getting Started with Big Data Analytics
 
HP Discover: Real Time Insights from Big Data
HP Discover: Real Time Insights from Big DataHP Discover: Real Time Insights from Big Data
HP Discover: Real Time Insights from Big Data
 

Similar to How a Media Data Platform Drives Real-time Insights & Analytics using Apache Spark

Data Mining Services in various types
Data Mining Services in various typesData Mining Services in various types
Data Mining Services in various types
loginworks software
 
Taming Big Data With Modern Software Architecture
Taming Big Data  With Modern Software ArchitectureTaming Big Data  With Modern Software Architecture
Taming Big Data With Modern Software Architecture
Big Data User Group Karlsruhe/Stuttgart
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics
DATAVERSITY
 
Concurrency - Modern BI
Concurrency - Modern BIConcurrency - Modern BI
Concurrency - Modern BI
Jake Borzym
 
Business Intelligence Meets Big Data Variety
Business Intelligence Meets Big Data VarietyBusiness Intelligence Meets Big Data Variety
Business Intelligence Meets Big Data Variety
www.panorama.com
 
Building a Business Case for Innovation: Project Considerations for Cloud, Mo...
Building a Business Case for Innovation: Project Considerations for Cloud, Mo...Building a Business Case for Innovation: Project Considerations for Cloud, Mo...
Building a Business Case for Innovation: Project Considerations for Cloud, Mo...
Fred Isbell
 
Five Trends in Real Time Applications
Five Trends in Real Time ApplicationsFive Trends in Real Time Applications
Five Trends in Real Time Applications
confluent
 
Data Mining Services in various types
Data Mining Services in various typesData Mining Services in various types
Data Mining Services in various types
loginworks software
 
Analytics trends report 2017
Analytics trends report 2017Analytics trends report 2017
Analytics trends report 2017
Robert Sibo
 
Data Visualization Trends - Next Steps for Tableau
Data Visualization Trends - Next Steps for TableauData Visualization Trends - Next Steps for Tableau
Data Visualization Trends - Next Steps for Tableau
Arunima Gupta
 
Strategyzing big data in telco industry
Strategyzing big data in telco industryStrategyzing big data in telco industry
Strategyzing big data in telco industry
Parviz Iskhakov
 
Company Profile - NPC with TIBCO Spotfire solution
Company Profile - NPC with TIBCO Spotfire solution  Company Profile - NPC with TIBCO Spotfire solution
Company Profile - NPC with TIBCO Spotfire solution
Sirinporn Setworaya
 
Cloud computing: Stan Freck
Cloud computing: Stan FreckCloud computing: Stan Freck
Cloud computing: Stan Freck
Lisa Malone
 
Gartner IT Infrastructure & Operations Management Summit 2014 - Trip Report
Gartner IT Infrastructure & Operations Management Summit 2014 - Trip ReportGartner IT Infrastructure & Operations Management Summit 2014 - Trip Report
Gartner IT Infrastructure & Operations Management Summit 2014 - Trip Report
Paul Woudstra
 
Data Integration Trends Businesses Should Watch for in 2021
Data Integration Trends Businesses Should Watch for in 2021Data Integration Trends Businesses Should Watch for in 2021
Data Integration Trends Businesses Should Watch for in 2021
Safe Software
 
Hybrid Cloud Streaming and Modernising Payments at Lloyds Banking Group
Hybrid Cloud Streaming and Modernising Payments at Lloyds Banking GroupHybrid Cloud Streaming and Modernising Payments at Lloyds Banking Group
Hybrid Cloud Streaming and Modernising Payments at Lloyds Banking Group
HostedbyConfluent
 
Di in the age of digital disruptions v1.0
Di in the age of digital disruptions v1.0Di in the age of digital disruptions v1.0
Di in the age of digital disruptions v1.0
Amar Roy
 
Transformacion del Negocio Financiero por medio de Tecnologias Cloud
Transformacion del Negocio Financiero por medio de Tecnologias CloudTransformacion del Negocio Financiero por medio de Tecnologias Cloud
Transformacion del Negocio Financiero por medio de Tecnologias Cloud
Raul Goycoolea Seoane
 
Big Data LDN 2018: DATA MANAGEMENT AUTOMATION AND THE INFORMATION SUPPLY CHAI...
Big Data LDN 2018: DATA MANAGEMENT AUTOMATION AND THE INFORMATION SUPPLY CHAI...Big Data LDN 2018: DATA MANAGEMENT AUTOMATION AND THE INFORMATION SUPPLY CHAI...
Big Data LDN 2018: DATA MANAGEMENT AUTOMATION AND THE INFORMATION SUPPLY CHAI...
Matt Stubbs
 
The Cloud for SMEs
The Cloud for SMEsThe Cloud for SMEs
The Cloud for SMEs
Chris Knowles
 

Similar to How a Media Data Platform Drives Real-time Insights & Analytics using Apache Spark (20)

Data Mining Services in various types
Data Mining Services in various typesData Mining Services in various types
Data Mining Services in various types
 
Taming Big Data With Modern Software Architecture
Taming Big Data  With Modern Software ArchitectureTaming Big Data  With Modern Software Architecture
Taming Big Data With Modern Software Architecture
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics
 
Concurrency - Modern BI
Concurrency - Modern BIConcurrency - Modern BI
Concurrency - Modern BI
 
Business Intelligence Meets Big Data Variety
Business Intelligence Meets Big Data VarietyBusiness Intelligence Meets Big Data Variety
Business Intelligence Meets Big Data Variety
 
Building a Business Case for Innovation: Project Considerations for Cloud, Mo...
Building a Business Case for Innovation: Project Considerations for Cloud, Mo...Building a Business Case for Innovation: Project Considerations for Cloud, Mo...
Building a Business Case for Innovation: Project Considerations for Cloud, Mo...
 
Five Trends in Real Time Applications
Five Trends in Real Time ApplicationsFive Trends in Real Time Applications
Five Trends in Real Time Applications
 
Data Mining Services in various types
Data Mining Services in various typesData Mining Services in various types
Data Mining Services in various types
 
Analytics trends report 2017
Analytics trends report 2017Analytics trends report 2017
Analytics trends report 2017
 
Data Visualization Trends - Next Steps for Tableau
Data Visualization Trends - Next Steps for TableauData Visualization Trends - Next Steps for Tableau
Data Visualization Trends - Next Steps for Tableau
 
Strategyzing big data in telco industry
Strategyzing big data in telco industryStrategyzing big data in telco industry
Strategyzing big data in telco industry
 
Company Profile - NPC with TIBCO Spotfire solution
Company Profile - NPC with TIBCO Spotfire solution  Company Profile - NPC with TIBCO Spotfire solution
Company Profile - NPC with TIBCO Spotfire solution
 
Cloud computing: Stan Freck
Cloud computing: Stan FreckCloud computing: Stan Freck
Cloud computing: Stan Freck
 
Gartner IT Infrastructure & Operations Management Summit 2014 - Trip Report
Gartner IT Infrastructure & Operations Management Summit 2014 - Trip ReportGartner IT Infrastructure & Operations Management Summit 2014 - Trip Report
Gartner IT Infrastructure & Operations Management Summit 2014 - Trip Report
 
Data Integration Trends Businesses Should Watch for in 2021
Data Integration Trends Businesses Should Watch for in 2021Data Integration Trends Businesses Should Watch for in 2021
Data Integration Trends Businesses Should Watch for in 2021
 
Hybrid Cloud Streaming and Modernising Payments at Lloyds Banking Group
Hybrid Cloud Streaming and Modernising Payments at Lloyds Banking GroupHybrid Cloud Streaming and Modernising Payments at Lloyds Banking Group
Hybrid Cloud Streaming and Modernising Payments at Lloyds Banking Group
 
Di in the age of digital disruptions v1.0
Di in the age of digital disruptions v1.0Di in the age of digital disruptions v1.0
Di in the age of digital disruptions v1.0
 
Transformacion del Negocio Financiero por medio de Tecnologias Cloud
Transformacion del Negocio Financiero por medio de Tecnologias CloudTransformacion del Negocio Financiero por medio de Tecnologias Cloud
Transformacion del Negocio Financiero por medio de Tecnologias Cloud
 
Big Data LDN 2018: DATA MANAGEMENT AUTOMATION AND THE INFORMATION SUPPLY CHAI...
Big Data LDN 2018: DATA MANAGEMENT AUTOMATION AND THE INFORMATION SUPPLY CHAI...Big Data LDN 2018: DATA MANAGEMENT AUTOMATION AND THE INFORMATION SUPPLY CHAI...
Big Data LDN 2018: DATA MANAGEMENT AUTOMATION AND THE INFORMATION SUPPLY CHAI...
 
The Cloud for SMEs
The Cloud for SMEsThe Cloud for SMEs
The Cloud for SMEs
 

More from Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
Databricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
Databricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
Databricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
Databricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
Databricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
Databricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Databricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
Databricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Databricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
Databricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Databricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
Databricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
Databricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
Databricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
Databricks
 

More from Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

How a Media Data Platform Drives Real-time Insights & Analytics using Apache Spark

  • 1. How a media data platform drives real-time insights & analytics using Spark Bart Van Der Vurst Partner at element61
  • 2. The media landscape is in its biggest transformation Ongoing market changes ▪ rise of the “Tech Giants” ▪ death of anonymous tracking and third-party cookies ▪ GDPR / privacy ▪ digital transformation ▪ growing paid content, digital subscriptions ▪ search for personalisation
  • 3. Data is flooding the media landscape Enormous amount of data is tracked continuously ▪ Online behavioural or clickstream data ▪ Reading behaviour & attention time ▪ Interests & cross-brand tracking ▪ Digital subscription data ▪ Digital newspaper ▪ Weekly magazines ▪ Programmatic advertising & marketing ▪ Impressions ▪ Clicks VALUE?
  • 4. Media companies need a data & digital turnaround Challenge to tackle ▪ From tackling small to huge data volumes ▪ Subscription data vs. clickstream data ▪ From reporting to data-driven (media) services ▪ From classical BI to a modern data platform
  • 5. 2 years ago, Roularta & element61 embarked on this challenge Who is Roularta Media Group? ▪ Roularta Media Group is a Belgian multimedia group, market leader in the field of magazines, local media and business television since 1954 ▪ 1300+ employees, EUR 295+ mln revenue Who is element61? ▪ Analytics & AI consultancy team (70 people) & Databricks partner in Belgium
  • 6. Roularta built a vision focused on its own data
  • 7. Roularta built a data strategy with 4 pillars
  • 8. A data analytics platform serving as foundation of reporting, real-time personalization & insight services
  • 9. The available BI stack couldn’t deliver on requirements BI Stack vs. Modern Requirements ▪ One Analytical Data Hub ▪ Manage big volumes of data – 2,5 TB/month ▪ High performance platform – 20 mio trans/day ▪ True real-time dependencies (content recommendation, advertising audiences, etc.) ▪ Process structured & unstructured data ▪ GDPR compliant ▪ Advanced scoring, modeling and AI/ML capabilities ▪ Data democratization/self-service ▪ Dashboards in different departments
  • 10. We’ve built this modern data analytics platform with Azure & Databricks Approach taken ▪ Leverage the latest best practices (e.g. Delta Lake) ▪ First use-case live in <6 months ▪ Built iteratively & use-case driven ▪ First focused on reporting, now added AI services
  • 11. We use Delta Lake end-to-end What it means ▪ Generally available 2019Q2, introduced at Roularta 2019Q3 ▪ Used across all data lake layers and for both real-time & batch data processing ▪ Cornerstone for the GDPR solution & compliance put into place
  • 12. Tangible results quickly: We started with editorial dashboards serving real-time insights
  • 13. Newsletter dashboards for optimizing marketing
  • 14. We use predictive article quality scoring for editorial tuning & paywall decisions ▪ Traffic score ▪ Engagement score ▪ Conversion score ▪ (Predicted) article quality score ▪ Data of every article ▪ Want to know more? (Re-)watch the Data & AI session “Building an ML Tool to predict Article Quality Scores using Delta & MLFlow”
  • 15. Quality scores as self-service insight cockpits
  • 16. The best is yet to come! • Enhanced content classification algorithms for traffic, engagement & conversion • Dynamic content-tagging for advertising • Consumer segmentation & profiling • Publisher dashboards • Marketeer dashboards • Further use of (Artificial) Intelligence for Dynamic Paywall • Newsfeed curation based on behaviour & content potential
  • 17. Roularta has a media data platform capable of scaling … and the best is yet to come • Enhanced content classification algorithms for traffic, engagement & conversion • Publisher dashboards • Dynamic content-tagging for advertising • Consumer segmentation • Marketeer dashboards • Further AI for Dynamic Paywall • Newsfeed curation based on behaviour & content potential
  • 19. Feedback Your feedback is important to us. Don’t forget to rate and review the sessions.