AWS Big Data in Everyday Use at Yle
Saku Vaittinen, Yle / Jukka Dahlbom, Webscale
9.2.2017
Yle Data Cloud
• Helps to create better user experience
• Content development
• Service optimization
• Recommendation
• Marketing automation
• Purpose: to create broad, real-time data for strategic decisions and actions
• Reachability, demographics
Yle Data Cloud
Data => Information => Knowledge
● Daily predictions, how they match with reality,
accuracy over time
● Identify control set of users that can be used as a
reference point
● Combine Internet metrics with analog TV
measurements
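One way to track how daily predictions match reality over time is a per-day absolute percentage error. The sketch below is illustrative only: the slides do not say which metric Yle uses, and the function names and sample numbers are hypothetical.

```python
from datetime import date


def absolute_percentage_error(predicted: float, actual: float) -> float:
    """Return |predicted - actual| / |actual| for a single day."""
    if actual == 0:
        raise ValueError("actual must be non-zero")
    return abs(predicted - actual) / abs(actual)


def accuracy_over_time(predictions: dict, actuals: dict) -> list:
    """Return (day, error) pairs for days present in both inputs, oldest first."""
    days = sorted(set(predictions) & set(actuals))
    return [(d, absolute_percentage_error(predictions[d], actuals[d])) for d in days]


# Hypothetical numbers, for illustration only.
predicted = {date(2017, 2, 1): 110.0, date(2017, 2, 2): 95.0}
observed = {date(2017, 2, 1): 100.0, date(2017, 2, 2): 100.0}
errors = accuracy_over_time(predicted, observed)
```

Plotting the error column per day would give the "accuracy over time" view mentioned above.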
Yle Data Cloud
● What do we measure?
○ Page hits
○ Heartbeats
○ Social media
○ Articles read, time spent with media, date / time /
location
■ AMR (Average Minute Rating)
○ Genre
○ Age groups
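As a rough illustration of AMR, the sketch below uses the common broadcast-measurement definition: total minutes viewed divided by the programme's duration in minutes, which gives the average audience per minute. The slides do not spell out Yle's exact calculation, so the function name and inputs are assumptions.

```python
def average_minute_rating(viewing_minutes, duration_minutes: float) -> float:
    """Average audience per minute: total minutes viewed / programme length.

    viewing_minutes: minutes watched by each viewer (hypothetical input shape).
    duration_minutes: length of the programme in minutes.
    """
    if duration_minutes <= 0:
        raise ValueError("duration_minutes must be positive")
    return sum(viewing_minutes) / duration_minutes


# Three viewers watch 30, 15 and 60 minutes of a 60-minute programme.
amr = average_minute_rating([30, 15, 60], 60)
```

The same ratio can be computed from heartbeat counts if each heartbeat represents a fixed viewing interval.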
Data Cloud Relational Data Hub
[Diagram. Data categories: content, user information, behavior. Sources: web events, panel. Data hub: raw data, structured (data vault), publishing. Usage: dashboards, recommendations, strategy, marketing automation. Storage label: S3.]
Yle Analytics Pipeline - Current situation
[Diagram. Components: web events (~100 million/day), Analytics Collector, Elastic Beanstalk, Kinesis Streams, Kinesis Firehose, Redshift, Lambda, S3, EMR, CloudWatch Dashboards, Areena Recommendation. Annotations: "Delay: 10-15 minutes", "Small batch sizes", "Small sized files", "Delay: ~minute", "Daily archiving", "Compression and larger file sizes", "~90 Gb per day".]
Managed and/or serverless processing?
● Firehose used for data ingestion to Redshift
● Inflexible for faster analytics needs (< 2 minutes)
● Kinesis + Lambda consumers for both S3 archiving
and fast lane analysis (TBD).
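A Kinesis + Lambda consumer for S3 archiving could look roughly like the sketch below. It relies on the standard shape of a Kinesis event delivered to Lambda (`Records[].kinesis.data`, base64-encoded). The bucket name, key layout, and the injectable `put_object` parameter are assumptions made for illustration, not details from the slides.

```python
import base64
import gzip
from datetime import datetime, timezone

ARCHIVE_BUCKET = "example-analytics-archive"  # hypothetical bucket name


def decode_records(event: dict) -> list:
    """Base64-decode the payload of every Kinesis record in a Lambda event."""
    return [base64.b64decode(r["kinesis"]["data"]) for r in event["Records"]]


def build_archive_object(event: dict, now: datetime) -> tuple:
    """Return (key, body): newline-delimited payloads, gzip-compressed."""
    payloads = decode_records(event)
    first_seq = event["Records"][0]["kinesis"]["sequenceNumber"]
    # Hypothetical key layout: one prefix per hour, named by first sequence number.
    key = f"raw/{now:%Y/%m/%d/%H}/{first_seq}.gz"
    body = gzip.compress(b"\n".join(payloads) + b"\n")
    return key, body


def handler(event, context, put_object=None):
    """Lambda entry point; put_object is injectable so the logic runs locally."""
    if put_object is None:
        import boto3  # provided by the Lambda Python runtime

        s3 = boto3.client("s3")

        def put_object(key, body):
            s3.put_object(Bucket=ARCHIVE_BUCKET, Key=key, Body=body)

    key, body = build_archive_object(event, datetime.now(timezone.utc))
    put_object(key, body)
    return {"archived": len(event["Records"]), "key": key}
```

Each invocation writes one small object, which is consistent with a later compaction step that produces larger compressed files.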
DevOps in analytics pipeline
● Baseline support from dedicated operations team
● Terraform for infrastructure management
● Most other Yle services run as Dockerized APIs in an
ECS cluster.
Redshift consumers
● Data mart (PostgreSQL) for web-accessed precomputed
results
● Scheduled Lambdas for very light queries
● Lambda-driven task containers for long-running but
memory-light queries
● Lambda-started EC2 instances for memory-intensive
computing (recommendations, user classifications)
● Data scientists running exploratory queries
Redshift performance
● The default WLM queue is limited to 5 concurrent queries
● Set up WLM queues for different workloads, split by
usage.
● Isolate data scientists into a separate WLM queue that
doesn’t block scheduled activity.
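A split like this is typically expressed through the cluster parameter `wlm_json_configuration`. The sketch below builds such a JSON value using the documented queue properties `user_group`, `query_concurrency`, and `memory_percent_to_use`. The group names, slot counts, and memory shares are hypothetical, and the 50-slot check reflects the documented manual WLM limit as understood here.

```python
import json

MAX_TOTAL_SLOTS = 50  # assumed Redshift limit across manual WLM queues


def build_wlm_config(queues: list) -> str:
    """Validate a list of queue definitions and serialise it as WLM JSON."""
    total_slots = sum(q["query_concurrency"] for q in queues)
    total_memory = sum(q.get("memory_percent_to_use", 0) for q in queues)
    if total_slots > MAX_TOTAL_SLOTS:
        raise ValueError(f"total concurrency {total_slots} exceeds {MAX_TOTAL_SLOTS}")
    if total_memory > 100:
        raise ValueError(f"memory shares add up to {total_memory}%")
    return json.dumps(queues)


# Hypothetical queues: scheduled jobs, data scientists, then the default queue.
EXAMPLE_QUEUES = [
    {"user_group": ["scheduled"], "query_concurrency": 5, "memory_percent_to_use": 50},
    {"user_group": ["data_scientists"], "query_concurrency": 3, "memory_percent_to_use": 30},
    {"query_concurrency": 5},  # default queue goes last
]

wlm_json = build_wlm_config(EXAMPLE_QUEUES)
```

Users are routed by group membership, so exploratory queries from the `data_scientists` group wait in their own queue instead of taking slots from scheduled work.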
Lambda for batch queries?
● Fast, serverless, stateless, cheap, reactive.
● Limited by 300s max timeout
● Unreliable in high load situations.
Lambda-driven task containers for batch
● Lambda allows reactive and scheduled running of
tasks.
● Task containers not limited by execution timeouts.
● Logging and monitoring support for ECS containers
is already there (ELK).
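The Lambda side of this pattern can be very small: it only asks ECS to start a task and returns. The sketch below uses the boto3 ECS `run_task` call. The cluster, task definition, container name, command, and the `query_name` event field are all hypothetical, and the client is injectable so the request-building logic can be exercised without AWS access.

```python
def build_run_task_request(cluster, task_definition, container_name, command):
    """Build keyword arguments for the ECS RunTask API call."""
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "count": 1,
        "overrides": {
            "containerOverrides": [
                {"name": container_name, "command": list(command)}
            ]
        },
    }


def handler(event, context, ecs_client=None):
    """Lambda entry point: start one batch container and return its task ARN(s)."""
    if ecs_client is None:
        import boto3  # provided by the Lambda Python runtime

        ecs_client = boto3.client("ecs")

    request = build_run_task_request(
        cluster="analytics-batch",            # hypothetical cluster name
        task_definition="redshift-query-job",  # hypothetical task definition
        container_name="query-runner",        # hypothetical container name
        command=["python", "run_query.py", event["query_name"]],
    )
    response = ecs_client.run_task(**request)
    return [task["taskArn"] for task in response.get("tasks", [])]
```

Because the function returns as soon as the task is submitted, the Lambda timeout no longer bounds the query's run time.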
Why not use AWS Batch?
● AWS Batch wasn’t available when batch containers
were first needed; it was announced at re:Invent in December 2016.
● Not yet supported in Terraform v0.8.5.
● Once support is there, switch from the homebrew
resources to AWS Batch.
Monitoring?
● CloudWatch Alarms for component red/green health
● CloudWatch Dashboards for overall graph view
● Kibana and CloudWatch Logs (with custom scripting)
for log management.
AWS Big Data at Yle Drives Strategic Decisions

More Related Content

What's hot

Model-driven and low-code development for event-based systems | Bobby Calderw...
Model-driven and low-code development for event-based systems | Bobby Calderw...Model-driven and low-code development for event-based systems | Bobby Calderw...
Model-driven and low-code development for event-based systems | Bobby Calderw...HostedbyConfluent
 
OPEN'17_2_Customer Experience_Essent
OPEN'17_2_Customer Experience_EssentOPEN'17_2_Customer Experience_Essent
OPEN'17_2_Customer Experience_EssentKangaroot
 
Nordstrom's Event-Sourced Architecture and Kafka-as-a-Service | Adam Weyant a...
Nordstrom's Event-Sourced Architecture and Kafka-as-a-Service | Adam Weyant a...Nordstrom's Event-Sourced Architecture and Kafka-as-a-Service | Adam Weyant a...
Nordstrom's Event-Sourced Architecture and Kafka-as-a-Service | Adam Weyant a...HostedbyConfluent
 
#Re-Imagine Autoscaling Stream Consumers in Cloud Environments (Sunil Kaitha,...
#Re-Imagine Autoscaling Stream Consumers in Cloud Environments (Sunil Kaitha,...#Re-Imagine Autoscaling Stream Consumers in Cloud Environments (Sunil Kaitha,...
#Re-Imagine Autoscaling Stream Consumers in Cloud Environments (Sunil Kaitha,...confluent
 
Elastic APM: amplificação dos seus logs e métricas para proporcionar um panor...
Elastic APM: amplificação dos seus logs e métricas para proporcionar um panor...Elastic APM: amplificação dos seus logs e métricas para proporcionar um panor...
Elastic APM: amplificação dos seus logs e métricas para proporcionar um panor...Elasticsearch
 
Creación de una plataforma de observabilidad centralizada
Creación de una plataforma de observabilidad centralizadaCreación de una plataforma de observabilidad centralizada
Creación de una plataforma de observabilidad centralizadaElasticsearch
 
Transformation During a Global Pandemic | Ashish Pandit and Scott Lee, Univer...
Transformation During a Global Pandemic | Ashish Pandit and Scott Lee, Univer...Transformation During a Global Pandemic | Ashish Pandit and Scott Lee, Univer...
Transformation During a Global Pandemic | Ashish Pandit and Scott Lee, Univer...HostedbyConfluent
 
Cassandra summit 2015 - Simplifying Streaming Analytics
Cassandra summit 2015 - Simplifying Streaming AnalyticsCassandra summit 2015 - Simplifying Streaming Analytics
Cassandra summit 2015 - Simplifying Streaming AnalyticsBrenden Matthews
 
O monitoramento da infraestrutura facilitado, da ingestão ao insight
O monitoramento da infraestrutura facilitado, da ingestão ao insightO monitoramento da infraestrutura facilitado, da ingestão ao insight
O monitoramento da infraestrutura facilitado, da ingestão ao insightElasticsearch
 
ODCA infrastructure as-a-service Framework & Usage Scenarios
ODCA infrastructure as-a-service Framework & Usage ScenariosODCA infrastructure as-a-service Framework & Usage Scenarios
ODCA infrastructure as-a-service Framework & Usage ScenariosOpen Data Center Alliance
 
Building Information Systems using Event Modeling (Bobby Calderwood, Evident ...
Building Information Systems using Event Modeling (Bobby Calderwood, Evident ...Building Information Systems using Event Modeling (Bobby Calderwood, Evident ...
Building Information Systems using Event Modeling (Bobby Calderwood, Evident ...confluent
 
Streaming Data Analytics with ksqlDB and Superset | Robert Stolz, Preset
Streaming Data Analytics with ksqlDB and Superset | Robert Stolz, PresetStreaming Data Analytics with ksqlDB and Superset | Robert Stolz, Preset
Streaming Data Analytics with ksqlDB and Superset | Robert Stolz, PresetHostedbyConfluent
 
Our journey to aws - Maylin Leal
Our journey to aws - Maylin LealOur journey to aws - Maylin Leal
Our journey to aws - Maylin LealUNICORNS IN TECH
 
Simplifying Event Streaming: Tools for Location Transparency and Data Evoluti...
Simplifying Event Streaming: Tools for Location Transparency and Data Evoluti...Simplifying Event Streaming: Tools for Location Transparency and Data Evoluti...
Simplifying Event Streaming: Tools for Location Transparency and Data Evoluti...confluent
 
J-Spring 2017 - Microservices in action at the Dutch National Police
J-Spring 2017 - Microservices in action at the Dutch National PoliceJ-Spring 2017 - Microservices in action at the Dutch National Police
J-Spring 2017 - Microservices in action at the Dutch National PoliceBert Jan Schrijver
 
Driving a Digital Thread Program in Manufacturing with Apache Kafka | Anu Mis...
Driving a Digital Thread Program in Manufacturing with Apache Kafka | Anu Mis...Driving a Digital Thread Program in Manufacturing with Apache Kafka | Anu Mis...
Driving a Digital Thread Program in Manufacturing with Apache Kafka | Anu Mis...HostedbyConfluent
 
The Serverless Native Mindset: Ben Kehoe, iRobot, Serverless NYC 2018
The Serverless Native Mindset: Ben Kehoe, iRobot, Serverless NYC 2018The Serverless Native Mindset: Ben Kehoe, iRobot, Serverless NYC 2018
The Serverless Native Mindset: Ben Kehoe, iRobot, Serverless NYC 2018iguazio
 
Elastic APM: Amping up your logs and metrics for the full picture
Elastic APM: Amping up your logs and metrics for the full pictureElastic APM: Amping up your logs and metrics for the full picture
Elastic APM: Amping up your logs and metrics for the full pictureElasticsearch
 
Continuous Delivery Amsterdam - Microservices in action at the Dutch National...
Continuous Delivery Amsterdam - Microservices in action at the Dutch National...Continuous Delivery Amsterdam - Microservices in action at the Dutch National...
Continuous Delivery Amsterdam - Microservices in action at the Dutch National...Bert Jan Schrijver
 

What's hot (20)

Model-driven and low-code development for event-based systems | Bobby Calderw...
Model-driven and low-code development for event-based systems | Bobby Calderw...Model-driven and low-code development for event-based systems | Bobby Calderw...
Model-driven and low-code development for event-based systems | Bobby Calderw...
 
OPEN'17_2_Customer Experience_Essent
OPEN'17_2_Customer Experience_EssentOPEN'17_2_Customer Experience_Essent
OPEN'17_2_Customer Experience_Essent
 
Nordstrom's Event-Sourced Architecture and Kafka-as-a-Service | Adam Weyant a...
Nordstrom's Event-Sourced Architecture and Kafka-as-a-Service | Adam Weyant a...Nordstrom's Event-Sourced Architecture and Kafka-as-a-Service | Adam Weyant a...
Nordstrom's Event-Sourced Architecture and Kafka-as-a-Service | Adam Weyant a...
 
#Re-Imagine Autoscaling Stream Consumers in Cloud Environments (Sunil Kaitha,...
#Re-Imagine Autoscaling Stream Consumers in Cloud Environments (Sunil Kaitha,...#Re-Imagine Autoscaling Stream Consumers in Cloud Environments (Sunil Kaitha,...
#Re-Imagine Autoscaling Stream Consumers in Cloud Environments (Sunil Kaitha,...
 
Iaas Pricing Models
Iaas Pricing ModelsIaas Pricing Models
Iaas Pricing Models
 
Elastic APM: amplificação dos seus logs e métricas para proporcionar um panor...
Elastic APM: amplificação dos seus logs e métricas para proporcionar um panor...Elastic APM: amplificação dos seus logs e métricas para proporcionar um panor...
Elastic APM: amplificação dos seus logs e métricas para proporcionar um panor...
 
Creación de una plataforma de observabilidad centralizada
Creación de una plataforma de observabilidad centralizadaCreación de una plataforma de observabilidad centralizada
Creación de una plataforma de observabilidad centralizada
 
Transformation During a Global Pandemic | Ashish Pandit and Scott Lee, Univer...
Transformation During a Global Pandemic | Ashish Pandit and Scott Lee, Univer...Transformation During a Global Pandemic | Ashish Pandit and Scott Lee, Univer...
Transformation During a Global Pandemic | Ashish Pandit and Scott Lee, Univer...
 
Cassandra summit 2015 - Simplifying Streaming Analytics
Cassandra summit 2015 - Simplifying Streaming AnalyticsCassandra summit 2015 - Simplifying Streaming Analytics
Cassandra summit 2015 - Simplifying Streaming Analytics
 
O monitoramento da infraestrutura facilitado, da ingestão ao insight
O monitoramento da infraestrutura facilitado, da ingestão ao insightO monitoramento da infraestrutura facilitado, da ingestão ao insight
O monitoramento da infraestrutura facilitado, da ingestão ao insight
 
ODCA infrastructure as-a-service Framework & Usage Scenarios
ODCA infrastructure as-a-service Framework & Usage ScenariosODCA infrastructure as-a-service Framework & Usage Scenarios
ODCA infrastructure as-a-service Framework & Usage Scenarios
 
Building Information Systems using Event Modeling (Bobby Calderwood, Evident ...
Building Information Systems using Event Modeling (Bobby Calderwood, Evident ...Building Information Systems using Event Modeling (Bobby Calderwood, Evident ...
Building Information Systems using Event Modeling (Bobby Calderwood, Evident ...
 
Streaming Data Analytics with ksqlDB and Superset | Robert Stolz, Preset
Streaming Data Analytics with ksqlDB and Superset | Robert Stolz, PresetStreaming Data Analytics with ksqlDB and Superset | Robert Stolz, Preset
Streaming Data Analytics with ksqlDB and Superset | Robert Stolz, Preset
 
Our journey to aws - Maylin Leal
Our journey to aws - Maylin LealOur journey to aws - Maylin Leal
Our journey to aws - Maylin Leal
 
Simplifying Event Streaming: Tools for Location Transparency and Data Evoluti...
Simplifying Event Streaming: Tools for Location Transparency and Data Evoluti...Simplifying Event Streaming: Tools for Location Transparency and Data Evoluti...
Simplifying Event Streaming: Tools for Location Transparency and Data Evoluti...
 
J-Spring 2017 - Microservices in action at the Dutch National Police
J-Spring 2017 - Microservices in action at the Dutch National PoliceJ-Spring 2017 - Microservices in action at the Dutch National Police
J-Spring 2017 - Microservices in action at the Dutch National Police
 
Driving a Digital Thread Program in Manufacturing with Apache Kafka | Anu Mis...
Driving a Digital Thread Program in Manufacturing with Apache Kafka | Anu Mis...Driving a Digital Thread Program in Manufacturing with Apache Kafka | Anu Mis...
Driving a Digital Thread Program in Manufacturing with Apache Kafka | Anu Mis...
 
The Serverless Native Mindset: Ben Kehoe, iRobot, Serverless NYC 2018
The Serverless Native Mindset: Ben Kehoe, iRobot, Serverless NYC 2018The Serverless Native Mindset: Ben Kehoe, iRobot, Serverless NYC 2018
The Serverless Native Mindset: Ben Kehoe, iRobot, Serverless NYC 2018
 
Elastic APM: Amping up your logs and metrics for the full picture
Elastic APM: Amping up your logs and metrics for the full pictureElastic APM: Amping up your logs and metrics for the full picture
Elastic APM: Amping up your logs and metrics for the full picture
 
Continuous Delivery Amsterdam - Microservices in action at the Dutch National...
Continuous Delivery Amsterdam - Microservices in action at the Dutch National...Continuous Delivery Amsterdam - Microservices in action at the Dutch National...
Continuous Delivery Amsterdam - Microservices in action at the Dutch National...
 

Similar to AWS Big Data at Yle Drives Strategic Decisions

AWS Cloud Kata 2013 | Singapore - Getting to Scale on AWS
AWS Cloud Kata 2013 | Singapore - Getting to Scale on AWSAWS Cloud Kata 2013 | Singapore - Getting to Scale on AWS
AWS Cloud Kata 2013 | Singapore - Getting to Scale on AWSAmazon Web Services
 
Public Cloud Workshop
Public Cloud WorkshopPublic Cloud Workshop
Public Cloud WorkshopAmer Ather
 
Designing your SaaS Database for Scale with Postgres
Designing your SaaS Database for Scale with PostgresDesigning your SaaS Database for Scale with Postgres
Designing your SaaS Database for Scale with PostgresOzgun Erdogan
 
Co 4, session 2, aws analytics services
Co 4, session 2, aws analytics servicesCo 4, session 2, aws analytics services
Co 4, session 2, aws analytics servicesm vaishnavi
 
Database and Analytics on the AWS Cloud
Database and Analytics on the AWS CloudDatabase and Analytics on the AWS Cloud
Database and Analytics on the AWS CloudAmazon Web Services
 
Clash of Technologies Google Cloud vs Microsoft Azure
Clash of Technologies Google Cloud vs Microsoft AzureClash of Technologies Google Cloud vs Microsoft Azure
Clash of Technologies Google Cloud vs Microsoft AzureMihail Mateev
 
Solving Office 365 Big Challenges using Cassandra + Spark
Solving Office 365 Big Challenges using Cassandra + Spark Solving Office 365 Big Challenges using Cassandra + Spark
Solving Office 365 Big Challenges using Cassandra + Spark Anubhav Kale
 
Using Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFUsing Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFAmazon Web Services
 
AWS re:Invent 2016: How to Build a Big Data Analytics Data Lake (LFS303)
AWS re:Invent 2016: How to Build a Big Data Analytics Data Lake (LFS303)AWS re:Invent 2016: How to Build a Big Data Analytics Data Lake (LFS303)
AWS re:Invent 2016: How to Build a Big Data Analytics Data Lake (LFS303)Amazon Web Services
 
Satrtup Bootcamp - Scale on AWS
Satrtup Bootcamp - Scale on AWSSatrtup Bootcamp - Scale on AWS
Satrtup Bootcamp - Scale on AWSIdan Tohami
 
Big data and Analytics on AWS
Big data and Analytics on AWSBig data and Analytics on AWS
Big data and Analytics on AWS2nd Watch
 
Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind
Deliver Best-in-Class HPC Cloud Solutions Without Losing Your MindDeliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind
Deliver Best-in-Class HPC Cloud Solutions Without Losing Your MindAvere Systems
 
Data warehousing in the era of Big Data: Deep Dive into Amazon Redshift
Data warehousing in the era of Big Data: Deep Dive into Amazon RedshiftData warehousing in the era of Big Data: Deep Dive into Amazon Redshift
Data warehousing in the era of Big Data: Deep Dive into Amazon RedshiftAmazon Web Services
 
Nairobi OpenStack Meetup - July 2013
Nairobi OpenStack Meetup - July 2013Nairobi OpenStack Meetup - July 2013
Nairobi OpenStack Meetup - July 2013adamnelson
 
Welcome & AWS Big Data Solution Overview
Welcome & AWS Big Data Solution OverviewWelcome & AWS Big Data Solution Overview
Welcome & AWS Big Data Solution OverviewAmazon Web Services
 
Estimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics PlatformEstimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics PlatformDATAVERSITY
 
AWS Serverless Community Day Keynote and Vendia Launch 6-26-2020
AWS Serverless Community Day Keynote and Vendia Launch 6-26-2020AWS Serverless Community Day Keynote and Vendia Launch 6-26-2020
AWS Serverless Community Day Keynote and Vendia Launch 6-26-2020Tim Wagner
 

Similar to AWS Big Data at Yle Drives Strategic Decisions (20)

AWS Cloud Kata 2013 | Singapore - Getting to Scale on AWS
AWS Cloud Kata 2013 | Singapore - Getting to Scale on AWSAWS Cloud Kata 2013 | Singapore - Getting to Scale on AWS
AWS Cloud Kata 2013 | Singapore - Getting to Scale on AWS
 
Using Data Lakes
Using Data LakesUsing Data Lakes
Using Data Lakes
 
Public Cloud Workshop
Public Cloud WorkshopPublic Cloud Workshop
Public Cloud Workshop
 
Designing your SaaS Database for Scale with Postgres
Designing your SaaS Database for Scale with PostgresDesigning your SaaS Database for Scale with Postgres
Designing your SaaS Database for Scale with Postgres
 
Co 4, session 2, aws analytics services
Co 4, session 2, aws analytics servicesCo 4, session 2, aws analytics services
Co 4, session 2, aws analytics services
 
Database and Analytics on the AWS Cloud
Database and Analytics on the AWS CloudDatabase and Analytics on the AWS Cloud
Database and Analytics on the AWS Cloud
 
Clash of Technologies Google Cloud vs Microsoft Azure
Clash of Technologies Google Cloud vs Microsoft AzureClash of Technologies Google Cloud vs Microsoft Azure
Clash of Technologies Google Cloud vs Microsoft Azure
 
Using Data Lakes
Using Data Lakes Using Data Lakes
Using Data Lakes
 
Create cloud service on AWS
Create cloud service on AWSCreate cloud service on AWS
Create cloud service on AWS
 
Solving Office 365 Big Challenges using Cassandra + Spark
Solving Office 365 Big Challenges using Cassandra + Spark Solving Office 365 Big Challenges using Cassandra + Spark
Solving Office 365 Big Challenges using Cassandra + Spark
 
Using Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFUsing Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SF
 
AWS re:Invent 2016: How to Build a Big Data Analytics Data Lake (LFS303)
AWS re:Invent 2016: How to Build a Big Data Analytics Data Lake (LFS303)AWS re:Invent 2016: How to Build a Big Data Analytics Data Lake (LFS303)
AWS re:Invent 2016: How to Build a Big Data Analytics Data Lake (LFS303)
 
Satrtup Bootcamp - Scale on AWS
Satrtup Bootcamp - Scale on AWSSatrtup Bootcamp - Scale on AWS
Satrtup Bootcamp - Scale on AWS
 
Big data and Analytics on AWS
Big data and Analytics on AWSBig data and Analytics on AWS
Big data and Analytics on AWS
 
Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind
Deliver Best-in-Class HPC Cloud Solutions Without Losing Your MindDeliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind
Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind
 
Data warehousing in the era of Big Data: Deep Dive into Amazon Redshift
Data warehousing in the era of Big Data: Deep Dive into Amazon RedshiftData warehousing in the era of Big Data: Deep Dive into Amazon Redshift
Data warehousing in the era of Big Data: Deep Dive into Amazon Redshift
 
Nairobi OpenStack Meetup - July 2013
Nairobi OpenStack Meetup - July 2013Nairobi OpenStack Meetup - July 2013
Nairobi OpenStack Meetup - July 2013
 
Welcome & AWS Big Data Solution Overview
Welcome & AWS Big Data Solution OverviewWelcome & AWS Big Data Solution Overview
Welcome & AWS Big Data Solution Overview
 
Estimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics PlatformEstimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics Platform
 
AWS Serverless Community Day Keynote and Vendia Launch 6-26-2020
AWS Serverless Community Day Keynote and Vendia Launch 6-26-2020AWS Serverless Community Day Keynote and Vendia Launch 6-26-2020
AWS Serverless Community Day Keynote and Vendia Launch 6-26-2020
 

More from Rolf Koski

AWS Tampere Meetup February 2019 - Real World Well-Architected
AWS Tampere Meetup February 2019 - Real World Well-ArchitectedAWS Tampere Meetup February 2019 - Real World Well-Architected
AWS Tampere Meetup February 2019 - Real World Well-ArchitectedRolf Koski
 
AWS Finland Meetup 2020 January
AWS Finland Meetup 2020 JanuaryAWS Finland Meetup 2020 January
AWS Finland Meetup 2020 JanuaryRolf Koski
 
AWS Finland Meetup 2019 November
AWS Finland Meetup 2019 NovemberAWS Finland Meetup 2019 November
AWS Finland Meetup 2019 NovemberRolf Koski
 
AWS Finland Meetup 2019 October
AWS Finland Meetup 2019 OctoberAWS Finland Meetup 2019 October
AWS Finland Meetup 2019 OctoberRolf Koski
 
AWS Finland Meetup 2019 September - sponsored by Digia
AWS Finland Meetup 2019 September - sponsored by DigiaAWS Finland Meetup 2019 September - sponsored by Digia
AWS Finland Meetup 2019 September - sponsored by DigiaRolf Koski
 
AWS Finland meetup 2019 september - sponsored by Zalando
AWS Finland meetup 2019 september - sponsored by ZalandoAWS Finland meetup 2019 september - sponsored by Zalando
AWS Finland meetup 2019 september - sponsored by ZalandoRolf Koski
 
AWS Stockholm Meetup June 2019 - Cybercom DeepRacer story
AWS Stockholm Meetup June 2019 - Cybercom DeepRacer storyAWS Stockholm Meetup June 2019 - Cybercom DeepRacer story
AWS Stockholm Meetup June 2019 - Cybercom DeepRacer storyRolf Koski
 
Serverless Days Helsinki 2019 Rolf Koski - Business Driven Availability
Serverless Days Helsinki 2019 Rolf Koski - Business Driven AvailabilityServerless Days Helsinki 2019 Rolf Koski - Business Driven Availability
Serverless Days Helsinki 2019 Rolf Koski - Business Driven AvailabilityRolf Koski
 
AWS Finland Meetup 2019 April
AWS Finland Meetup 2019 AprilAWS Finland Meetup 2019 April
AWS Finland Meetup 2019 AprilRolf Koski
 
AWS Community Day 2019 - Business Driven Availability
AWS Community Day 2019 - Business Driven AvailabilityAWS Community Day 2019 - Business Driven Availability
AWS Community Day 2019 - Business Driven AvailabilityRolf Koski
 
Match AWS Pori - Rolf Koski - Cybercom
Match AWS Pori - Rolf Koski - CybercomMatch AWS Pori - Rolf Koski - Cybercom
Match AWS Pori - Rolf Koski - CybercomRolf Koski
 
AWS Finland meetup 2018 August
AWS Finland meetup 2018 AugustAWS Finland meetup 2018 August
AWS Finland meetup 2018 AugustRolf Koski
 
AWS Community Day Nordics 2018 - Aino Health: Transition to serverless and le...
AWS Community Day Nordics 2018 - Aino Health: Transition to serverless and le...AWS Community Day Nordics 2018 - Aino Health: Transition to serverless and le...
AWS Community Day Nordics 2018 - Aino Health: Transition to serverless and le...Rolf Koski
 
AWS Community Day Nordics 2018 - Vivek Balakrishnan (Rovio): Learnings from g...
AWS Community Day Nordics 2018 - Vivek Balakrishnan (Rovio): Learnings from g...AWS Community Day Nordics 2018 - Vivek Balakrishnan (Rovio): Learnings from g...
AWS Community Day Nordics 2018 - Vivek Balakrishnan (Rovio): Learnings from g...Rolf Koski
 
AWS Community Day Nordics 2018 - Alexander Schachtschabel (Dazzle Rocks): Big...
AWS Community Day Nordics 2018 - Alexander Schachtschabel (Dazzle Rocks): Big...AWS Community Day Nordics 2018 - Alexander Schachtschabel (Dazzle Rocks): Big...
AWS Community Day Nordics 2018 - Alexander Schachtschabel (Dazzle Rocks): Big...Rolf Koski
 
AWS Community Day Nordics 2018 - Saku Vaittinen (VR): Data driven public tran...
AWS Community Day Nordics 2018 - Saku Vaittinen (VR): Data driven public tran...AWS Community Day Nordics 2018 - Saku Vaittinen (VR): Data driven public tran...
AWS Community Day Nordics 2018 - Saku Vaittinen (VR): Data driven public tran...Rolf Koski
 
AWS Community Day Nordics 2018: Rolf Koski - Building Successful Enterprise C...
AWS Community Day Nordics 2018: Rolf Koski - Building Successful Enterprise C...AWS Community Day Nordics 2018: Rolf Koski - Building Successful Enterprise C...
AWS Community Day Nordics 2018: Rolf Koski - Building Successful Enterprise C...Rolf Koski
 
AWS Finland meetup 2017 October
AWS Finland meetup 2017 OctoberAWS Finland meetup 2017 October
AWS Finland meetup 2017 OctoberRolf Koski
 
AWS Finland meetup 2017 August
AWS Finland meetup 2017 AugustAWS Finland meetup 2017 August
AWS Finland meetup 2017 AugustRolf Koski
 
AWS Finland User Group Meetup 2017-05-23
AWS Finland User Group Meetup 2017-05-23AWS Finland User Group Meetup 2017-05-23
AWS Finland User Group Meetup 2017-05-23Rolf Koski
 

More from Rolf Koski (20)

AWS Tampere Meetup February 2019 - Real World Well-Architected
AWS Tampere Meetup February 2019 - Real World Well-ArchitectedAWS Tampere Meetup February 2019 - Real World Well-Architected
AWS Tampere Meetup February 2019 - Real World Well-Architected
 
AWS Finland Meetup 2020 January
AWS Finland Meetup 2020 JanuaryAWS Finland Meetup 2020 January
AWS Finland Meetup 2020 January
 
AWS Finland Meetup 2019 November
AWS Finland Meetup 2019 NovemberAWS Finland Meetup 2019 November
AWS Finland Meetup 2019 November
 
AWS Finland Meetup 2019 October
AWS Finland Meetup 2019 OctoberAWS Finland Meetup 2019 October
AWS Finland Meetup 2019 October
 
AWS Finland Meetup 2019 September - sponsored by Digia
AWS Finland Meetup 2019 September - sponsored by DigiaAWS Finland Meetup 2019 September - sponsored by Digia
AWS Finland Meetup 2019 September - sponsored by Digia
 
AWS Finland meetup 2019 september - sponsored by Zalando
AWS Finland meetup 2019 september - sponsored by ZalandoAWS Finland meetup 2019 september - sponsored by Zalando
AWS Finland meetup 2019 september - sponsored by Zalando
 
AWS Stockholm Meetup June 2019 - Cybercom DeepRacer story
AWS Stockholm Meetup June 2019 - Cybercom DeepRacer storyAWS Stockholm Meetup June 2019 - Cybercom DeepRacer story
AWS Stockholm Meetup June 2019 - Cybercom DeepRacer story
 
Serverless Days Helsinki 2019 Rolf Koski - Business Driven Availability
Serverless Days Helsinki 2019 Rolf Koski - Business Driven AvailabilityServerless Days Helsinki 2019 Rolf Koski - Business Driven Availability
Serverless Days Helsinki 2019 Rolf Koski - Business Driven Availability
 
AWS Finland Meetup 2019 April
AWS Finland Meetup 2019 AprilAWS Finland Meetup 2019 April
AWS Finland Meetup 2019 April
 
AWS Community Day 2019 - Business Driven Availability
AWS Community Day 2019 - Business Driven AvailabilityAWS Community Day 2019 - Business Driven Availability
AWS Community Day 2019 - Business Driven Availability
 
Match AWS Pori - Rolf Koski - Cybercom
Match AWS Pori - Rolf Koski - CybercomMatch AWS Pori - Rolf Koski - Cybercom
Match AWS Pori - Rolf Koski - Cybercom
 
AWS Finland meetup 2018 August
AWS Finland meetup 2018 AugustAWS Finland meetup 2018 August
AWS Finland meetup 2018 August
 
AWS Community Day Nordics 2018 - Aino Health: Transition to serverless and le...
AWS Community Day Nordics 2018 - Aino Health: Transition to serverless and le...AWS Community Day Nordics 2018 - Aino Health: Transition to serverless and le...
AWS Community Day Nordics 2018 - Aino Health: Transition to serverless and le...
 
AWS Community Day Nordics 2018 - Vivek Balakrishnan (Rovio): Learnings from g...
AWS Community Day Nordics 2018 - Vivek Balakrishnan (Rovio): Learnings from g...AWS Community Day Nordics 2018 - Vivek Balakrishnan (Rovio): Learnings from g...
AWS Community Day Nordics 2018 - Vivek Balakrishnan (Rovio): Learnings from g...
 
AWS Community Day Nordics 2018 - Alexander Schachtschabel (Dazzle Rocks): Big...
AWS Community Day Nordics 2018 - Alexander Schachtschabel (Dazzle Rocks): Big...AWS Community Day Nordics 2018 - Alexander Schachtschabel (Dazzle Rocks): Big...
AWS Community Day Nordics 2018 - Alexander Schachtschabel (Dazzle Rocks): Big...
 
AWS Community Day Nordics 2018 - Saku Vaittinen (VR): Data driven public tran...
AWS Community Day Nordics 2018 - Saku Vaittinen (VR): Data driven public tran...AWS Community Day Nordics 2018 - Saku Vaittinen (VR): Data driven public tran...
AWS Community Day Nordics 2018 - Saku Vaittinen (VR): Data driven public tran...
 
AWS Community Day Nordics 2018: Rolf Koski - Building Successful Enterprise C...
AWS Community Day Nordics 2018: Rolf Koski - Building Successful Enterprise C...AWS Community Day Nordics 2018: Rolf Koski - Building Successful Enterprise C...
AWS Community Day Nordics 2018: Rolf Koski - Building Successful Enterprise C...
 
AWS Finland meetup 2017 October
AWS Finland meetup 2017 OctoberAWS Finland meetup 2017 October
AWS Finland meetup 2017 October
 
AWS Finland meetup 2017 August
AWS Finland meetup 2017 AugustAWS Finland meetup 2017 August
AWS Finland meetup 2017 August
 
AWS Finland User Group Meetup 2017-05-23
AWS Finland User Group Meetup 2017-05-23AWS Finland User Group Meetup 2017-05-23
AWS Finland User Group Meetup 2017-05-23
 


AWS Big Data at Yle Drives Strategic Decisions

  • 1. AWS Big Data in Everyday Use at Yle Saku Vaittinen, Yle / Jukka Dahlbom, Webscale 9.2.2017
  • 2. Yle Data Cloud • Helps to create a better user experience • Content development • Service optimization • Recommendation • Marketing automation • Purpose: to create broad, real-time data for strategic decisions and actions • Reachability, demographics
  • 3.
  • 4.
  • 5. Yle Data Cloud Data => Information => Knowledge ● Daily predictions, how they match with reality, accuracy over time ● Identify control set of users that can be used as a reference point ● Combine Internet metrics with analog TV measurements
  • 6. Yle Data Cloud ● What do we measure? ○ Page hits ○ Heartbeats ○ Social media ○ Articles read, time spent with media, date / time / location ■ AMR (Average Minute Rating) ○ Genre ○ Age groups
  • 7.
  • 8. Data Cloud Relational Data Hub (diagram; labels only) ● Content, User information, Behavior ● Sources: Web events, Panel ● Data hub: Raw data, Structured (Data vault), Publishing ● Usage: Dashboards, Recommendations, Strategy, Marketing automation
  • 9. Yle Analytics Pipeline - Current situation (diagram; labels only) ● Components: Web events (~100 million/day), Analytics Collector, Elastic Beanstalk, Kinesis Streams, Kinesis Firehose, Redshift, Lambda, S3, EMR, CloudWatch Dashboards, Areena Recommendation ● Annotations: Daily archiving; Delay: 10-15 minutes; Small batch sizes; Compression and larger file sizes; ~90 GB per day; Small sized files; Delay: ~minute
  • 10. Managed and/or serverless processing? ● Firehose used for data ingestion to Redshift ● Inflexible for faster analytics needs (< 2 minutes) ● Kinesis + Lambda consumers for both S3 archiving and fast lane analysis (TBD).
  • 11. DevOps in analytics pipeline ● Baseline support from dedicated operations team ● Terraform for infrastructure management ● Most other Yle services run as Dockerized APIs in ECS cluster.
  • 12. Redshift consumers ● Data mart (PostgreSQL) for web-accessed precomputed results ● Scheduled Lambdas for very light queries ● Lambda-driven task containers for long but memory-light queries ● Lambda-started EC2 instances for memory-intensive computing (recommendations, user classifications) ● Data scientists running exploratory queries
  • 13. Redshift performance ● Default user group is limited to 5 concurrent queries ● Set up WLM queues for different workloads, split by usage. ● Isolate data scientists into separate WLM queue that doesn’t block scheduled activity.
  • 14. Lambda for batch queries? ● Fast, serverless, stateless, cheap, reactive. ● Limited by 300s max timeout ● Unreliable in high load situations.
  • 15. Lambda-driven task containers for batch ● Lambda allows reactive and scheduled running of tasks. ● Task containers not limited by execution timeouts. ● Logging and monitoring support for ECS containers is already there (ELK).
  • 16. Why not use AWS Batch? ● AWS Batch wasn’t available when batch containers were first needed; it was announced at re:Invent in Dec 2016. ● Not yet supported in Terraform v0.8.5. ● Once support is there, switch from homebrew resources to AWS Batch.
  • 17. Monitoring? ● CloudWatch Alerts for component red/green health ● CloudWatch Dashboards for overall graph view ● Kibana and CloudWatch Logs (with custom scripting) for log management.
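
Slide 13 recommends splitting Redshift workloads into separate WLM queues so that exploratory queries cannot block scheduled activity. A minimal sketch of what such a configuration might look like follows. It builds the JSON value for the Redshift `wlm_json_configuration` parameter; the group names, concurrency levels, and memory percentages are hypothetical and not taken from the talk.

```python
import json


def build_wlm_config(queues):
    """Serialize manual WLM queue definitions for wlm_json_configuration.

    Raises ValueError if the explicitly assigned memory exceeds 100 percent.
    """
    total_memory = sum(q.get("memory_percent_to_use", 0) for q in queues)
    if total_memory > 100:
        raise ValueError(f"memory_percent_to_use adds up to {total_memory}")
    return json.dumps(queues)


# Hypothetical queue layout; all names and numbers are illustrative assumptions.
QUEUES = [
    # Exploratory queries from data scientists, isolated by user group.
    {"user_group": ["data_scientists"], "query_concurrency": 3,
     "memory_percent_to_use": 30},
    # Scheduled jobs, which select this queue by setting a query group.
    {"query_group": ["scheduled"], "query_concurrency": 5,
     "memory_percent_to_use": 50},
    # The last entry acts as the default queue for everything else.
    {"query_concurrency": 5},
]

if __name__ == "__main__":
    print(build_wlm_config(QUEUES))
```

The resulting string would be set as the `wlm_json_configuration` parameter of the cluster's parameter group, for example through Terraform, which the deck names as the infrastructure tool.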
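
Slide 14 notes that Lambda is limited by a maximum execution timeout. One common way to live within such a limit is to check the remaining time and stop cleanly before the platform kills the function. The sketch below assumes the work can be split into independent items; `run_until_deadline`, `FakeContext`, and the safety margin are hypothetical names and values, while `get_remaining_time_in_millis()` is the method the Lambda Python runtime provides on its context object.

```python
def run_until_deadline(items, process, context, safety_margin_ms=10_000):
    """Process items in order until the Lambda timeout is close.

    Returns the index of the first unprocessed item, so that a later
    invocation can resume from there.
    """
    for index, item in enumerate(items):
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            return index
        process(item)
    return len(items)


class FakeContext:
    """Local stand-in for the Lambda context object (for experiments only)."""

    def __init__(self, remaining_ms):
        self._remaining_ms = list(remaining_ms)

    def get_remaining_time_in_millis(self):
        # Return the next scripted value instead of a real clock reading.
        return self._remaining_ms.pop(0)


if __name__ == "__main__":
    done = []
    context = FakeContext([30_000, 20_000, 5_000])
    resume_at = run_until_deadline(["a", "b", "c"], done.append, context)
    print(done, resume_at)
```

This only softens the limit. For queries that run longer than a single invocation allows, the deck moves the work into task containers instead.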
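
Slide 15 describes Lambda-driven task containers: a Lambda reacts to an event or a schedule and starts a container that is not bound by the Lambda timeout. The sketch below shows one possible shape for such a Lambda, using the ECS `run_task` call from boto3. The cluster, task definition, container name, environment variable, and event field are hypothetical; the talk does not show how Yle implemented this.

```python
def build_run_task_request(cluster, task_definition, container_name, query_name):
    """Build keyword arguments for the ECS run_task API call."""
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "count": 1,
        "overrides": {
            "containerOverrides": [
                {
                    "name": container_name,
                    # Tell the container which query to run (assumed convention).
                    "environment": [{"name": "QUERY_NAME", "value": query_name}],
                }
            ]
        },
    }


def handler(event, context, ecs_client=None):
    """Lambda entry point: start one batch container and return its task ARNs."""
    if ecs_client is None:
        import boto3  # imported lazily so the module loads without boto3 locally

        ecs_client = boto3.client("ecs")
    request = build_run_task_request(
        cluster="analytics-batch",                # hypothetical cluster name
        task_definition="redshift-query-runner",  # hypothetical task definition
        container_name="runner",                  # hypothetical container name
        query_name=event["query_name"],           # hypothetical event field
    )
    response = ecs_client.run_task(**request)
    return [task["taskArn"] for task in response.get("tasks", [])]
```

The Lambda returns as soon as the task is started, so the long-running query itself executes in the container, where the existing ECS logging and monitoring apply.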