3. @tati_alchueyrMulti
NIX Conf
tati.__doc__
● Brazilian living in London since 2014
● Senior Data Engineer at the BBC Datalab team
● Graduated in Computer Engineering at Unicamp, Brazil
● Passionate software developer for 16 years
● Experience in the private and public sectors
● Developed software for Medicine, Media and Education
4. @tati_alchueyrMulti
I ❤ Ukraine
In 2019, Amanda and I spent 3 days in Kharkiv, where:
● We were Keynote Speakers at OctopusCon
● We lectured at the Kharkiv National University of Radio Electronics
● I was really impressed with the Ukrainian Tech Community
● We had a Dolphin therapy session at the Nemo Dolphinarium
Credit: @obestwalter
Credit: OctopusCon
5. @tati_alchueyrMulti
BBC: British Broadcasting Corporation
● Founded in 1922
● In the UK…
○ The BBC has no advertisements
○ If a resident wants to watch the BBC, they pay for a TV Licence
● Values
○ Independent, impartial and honest
○ Audiences are at the heart of everything we do
● Purpose
Inform + Educate + Entertain
6. @tati_alchueyrMulti
bbc.stats()
● BBC TV reaches 91% of the UK adult population
● BBC News reaches a global audience of 426 million every week
Reference 1: BBC
Reference 2: BBC
Image Credit: BBC
8. @tati_alchueyrMulti
BBC.
Vision
For the BBC to be a leader in Machine Learning that delights audiences and prioritises the needs of individuals and society over corporations and states.
Mission
To develop and deploy Machine Learning at BBC scale so that teams can tailor services to individuals whilst upholding our editorial values.
22. @tati_alchueyrMulti
The prototype
1-2 months of work:
● Collected data (quick-and-dirty™ scripts)
● Compared existing Python Factorisation Machines libraries (winner: LightFM)
● Trained and predicted recommendations (quick-and-dirty™ scripts)
● Implemented a qualitative experiment tool
● Recruited volunteers to join the qualitative experiment
● Ran qualitative experiment, comparing:
○ External provider recommendations
○ Our own Factorisation Machines-powered recommendations
23. @tati_alchueyrMulti
Qualitative experiment: how
Who
● ~30 test users recruited
○ Internal BBC employees
○ Under 35
How
● Two sets with 9 recommendations each:
○ External provider
○ Internal factorisation machines
● Users, without knowing the origin of the recs, had to:
○ choose “the best”, “both”, or “neither”
○ explain why
26. @tati_alchueyrMulti
Productionising machine learning
[Diagram components: ML Code, Configuration, Data Collection and Transformation, Feature Extraction, Data Verification, Machine Resource Management, Serving Infrastructure, Monitoring, Process Management Tools, Analysis Tools]
Image copied from presentation by Googler @mpyeager
27. @tati_alchueyrMulti
Machine learning workflow
Input: User activity data, Content metadata
Processing: Machine Learning model training, Predict recommendations
Output: Recommendations
28. @tati_alchueyrMulti
Machine learning workflow
Input: User activity data, Content metadata
Processing: Machine Learning model training, Predict recommendations
Output: Recommendations
Business Rules, part I - Non-personalised
- Recency
- Availability
- Excluded Masterbrands
- Excluded genres
Business Rules, part II - Personalised
- Already seen items
- Local radio (if not consumed previously)
- Specific language (if not consumed previously)
- Episode picking from a series
- Diversification (1 episode per brand/series)
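A minimal sketch of how a few of the business rules on this slide could be applied to a ranked list of candidates. The `Episode` fields, the function names and the sample data are hypothetical and only illustrate the idea; the slides do not show the actual implementation.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Episode:
    """Hypothetical candidate item; the field names are illustrative."""
    episode_id: str
    brand: str
    genre: str
    available: bool


def apply_non_personalised_rules(
    candidates: Iterable[Episode], excluded_genres: Set[str]
) -> List[Episode]:
    """Part I: drop unavailable items and excluded genres for every user."""
    return [
        e for e in candidates
        if e.available and e.genre not in excluded_genres
    ]


def apply_personalised_rules(
    ranked: Iterable[Episode], seen_ids: Set[str]
) -> List[Episode]:
    """Part II: drop already seen items and keep 1 episode per brand."""
    picked: List[Episode] = []
    brands_used: Set[str] = set()
    for episode in ranked:  # assumed to be sorted by model score, best first
        if episode.episode_id in seen_ids or episode.brand in brands_used:
            continue
        brands_used.add(episode.brand)
        picked.append(episode)
    return picked


# Made-up sample data, best-scored item first.
ranked = [
    Episode("e1", "brand-a", "comedy", True),
    Episode("e2", "brand-a", "comedy", True),
    Episode("e3", "brand-b", "news", True),
    Episode("e4", "brand-c", "drama", False),
    Episode("e5", "brand-d", "drama", True),
]
eligible = apply_non_personalised_rules(ranked, excluded_genres={"news"})
final = apply_personalised_rules(eligible, seen_ids={"e1"})
print([e.episode_id for e in final])  # ['e2', 'e5']
```

Part I does not depend on the user, so in this sketch it runs once over the candidates, while part II needs the user's history.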
29. @tati_alchueyrMulti
Steps to be done in the workflows, before the API
Input: User activity data, Content metadata
Processing: Machine Learning model training, Predict recommendations
Output: Recommendations
Business Rules, part I - Non-personalised
- Recency
- Availability
- Excluded Masterbrands
- Excluded genres
Business Rules, part II - Personalised
- Already seen items
- Local radio (if not consumed previously)
- Specific language (if not consumed previously)
- Episode picking from a series
- Diversification (1 episode per brand/series)
31. @tati_alchueyrMulti
Recommendation API strategies
A. On the fly: the API uses the model, user activity and content metadata; it predicts & applies rules
B. Precompute: the API uses cached recs; it retrieves pre-computed recommendations
Goal: 1500 requests/s with P95 responses < 60 ms
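The difference between the two strategies can be sketched as follows. This is an illustration only: the function names, the `fallback` argument and the in-memory dictionary standing in for the cache are assumptions, not the API described in the talk.

```python
from typing import Callable, Dict, List

Recs = List[str]


def recommend_on_the_fly(
    user_id: str,
    predict: Callable[[str], Recs],
    apply_rules: Callable[[str, Recs], Recs],
) -> Recs:
    """Strategy A: predict and apply the business rules at request time."""
    return apply_rules(user_id, predict(user_id))


def recommend_precomputed(
    user_id: str, cached_recs: Dict[str, Recs], fallback: Recs
) -> Recs:
    """Strategy B: look up recommendations computed ahead of the request.

    The fallback for unknown users is an assumption of this sketch.
    """
    return cached_recs.get(user_id, fallback)


# Hypothetical stand-ins for the model and the rules.
def fake_predict(user_id: str) -> Recs:
    return ["e1", "e2", "e3"]


def fake_rules(user_id: str, recs: Recs) -> Recs:
    return [r for r in recs if r != "e2"]


# In strategy B the same work happens offline and only the result is stored.
cache = {"user-1": recommend_on_the_fly("user-1", fake_predict, fake_rules)}

print(recommend_precomputed("user-1", cache, fallback=[]))
print(recommend_precomputed("user-2", cache, fallback=["popular-1"]))
```

With strategy B the request path is reduced to a key lookup, and the prediction cost moves to the workflow that fills the cache.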
32. @tati_alchueyrMulti
Recommendation API: load performance
Metric                                   On the fly   Precomputed   Precomputed
Concurrent load tests (requests/s)       50           50            1500
Success percentage                       63.88%       100%          100%
Latency of p50 (success)                 323.78 ms    1.68 ms       4.75 ms
Latency of p95 (success)                 939.28 ms    3.21 ms       57.53 ms
Latency of p99 (success)                 979.24 ms    4.51 ms       97.49 ms
Maximum successful requests per second   23           50            1500
Goal: 1500 requests/s with P95 responses < 60 ms
Machine type: c2-standard-8, Python 3.7, Sanic workers: 7, Prediction threads: 1, vCPU cores: 7, Memory: 15 Gi, Deployment Replicas: 1
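A small sketch of how the goal can be checked against the reported figures. The `percentile` helper uses the nearest-rank method, which is an assumption: the slides do not say how the load-testing tool computes percentiles. The three result tuples are the throughput and p95 values reported on this slide.

```python
from typing import Sequence


def percentile(values: Sequence[float], p: int) -> float:
    """Nearest-rank percentile: smallest value with at least p% at or below it."""
    ordered = sorted(values)
    rank = max(1, (p * len(ordered) + 99) // 100)  # integer ceil(p * n / 100)
    return ordered[rank - 1]


def meets_goal(
    achieved_rps: float,
    p95_ms: float,
    goal_rps: float = 1500,
    goal_p95_ms: float = 60.0,
) -> bool:
    """True when both the throughput and the P95 latency targets are met."""
    return achieved_rps >= goal_rps and p95_ms < goal_p95_ms


# (maximum successful requests/s, p95 latency in ms) as reported on the slide.
reported = {
    "On the fly, 50 req/s load": (23, 939.28),
    "Precomputed, 50 req/s load": (50, 3.21),
    "Precomputed, 1500 req/s load": (1500, 57.53),
}

for name, (rps, p95) in reported.items():
    print(f"{name}: meets goal = {meets_goal(rps, p95)}")
```

Only the precomputed run at 1500 requests/s satisfies both conditions; the 50 requests/s runs were never offered enough load to demonstrate the throughput target.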
33. @tati_alchueyrMulti
Strategies to serve recommendations
A. On the fly: the API uses the model, user activity and content metadata; it predicts & applies rules
B. Precompute: the API uses cached recs; it retrieves pre-computed recommendations
34. @tati_alchueyrMulti
Steps to be done in the workflows, before the API
Input: User activity data, Content metadata
Processing: Machine Learning model training, Predict recommendations
Output: Precomputed recommendations
Business Rules, part I - Non-personalised
- Recency
- Availability
- Excluded Masterbrands
- Excluded genres
Business Rules, part II - Personalised
- Already seen items
- Local radio (if not consumed previously)
- Specific language (if not consumed previously)
- Episode picking from a series
- Diversification (1 episode per brand/series)
43. @tati_alchueyrMulti
Limitation of Apache Airflow
Issue:
Depending on the volume of data, a single PythonOperator task which usually takes 10 min could take almost 3h!
Consequences:
● Overall delay
● Blocked worker
44. @tati_alchueyrMulti
Limitation of Apache Airflow
Time estimations (in seconds) to predict recommendations using a c2-standard-30 instance (30 vCPU and 120 GB RAM)
45. @tati_alchueyrMulti
Limitation of Apache Airflow
Time estimations (in seconds) to predict recommendations using a c2-standard-30 instance (30 vCPU and 120 GB RAM)
2h to predict recommendations for 10k users
What about 5 million users, or more?
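A back-of-the-envelope extrapolation of the question above, assuming prediction time grows linearly with the number of users on a single machine. The linear-scaling assumption is mine; the slide only gives the 2h and 10k figures.

```python
def extrapolate_hours(
    target_users: int, sample_users: int = 10_000, sample_hours: float = 2.0
) -> float:
    """Scale the measured time linearly to a larger number of users."""
    return sample_hours * target_users / sample_users


hours = extrapolate_hours(5_000_000)
print(f"{hours:.0f} hours, roughly {hours / 24:.1f} days")  # 1000 hours, roughly 41.7 days
```

Under that assumption a single task would run for over a month, which is why the work has to be moved off the Airflow worker and distributed.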
46. @tati_alchueyrMulti
Limitation of Apache Airflow: solutions
Delegating processing to other services
● Tasks which scale vertically (better hardware)
○ Airflow Compute Engine (Virtual Machine) Operator (GceInstanceStartOperator)
○ Airflow Kubernetes Pod Operator (GKEPodOperator)
● Tasks which scale horizontally (can be split and distributed across multiple nodes)
○ Airflow Dataflow Operator (Google Dataflow, Apache Beam)
○ Airflow Dataproc Operator (Google Dataproc, Apache Spark & Hadoop)
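To illustrate what "scales horizontally" means here, the sketch below splits a list of users into chunks and processes the chunks independently. It uses only the Python standard library: the thread pool stands in for the separate nodes that Dataflow or Dataproc would provide, and `predict_for_chunk` is a hypothetical placeholder for the real prediction step.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def split_into_chunks(items: List[str], chunk_size: int) -> List[List[str]]:
    """Split the work into independent pieces of at most chunk_size items."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def predict_for_chunk(user_ids: List[str]) -> Dict[str, List[str]]:
    """Hypothetical placeholder: returns one fake recommendation per user."""
    return {user_id: [f"rec-for-{user_id}"] for user_id in user_ids}


def predict_in_parallel(
    user_ids: List[str], chunk_size: int, max_workers: int = 4
) -> Dict[str, List[str]]:
    """Process each chunk independently and merge the partial results."""
    results: Dict[str, List[str]] = {}
    chunks = split_into_chunks(user_ids, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for partial in pool.map(predict_for_chunk, chunks):
            results.update(partial)
    return results


users = [f"user-{n}" for n in range(10)]
recs = predict_in_parallel(users, chunk_size=3)
print(len(recs))  # 10
```

The task qualifies for this treatment because no chunk needs the result of another; tasks without that property are the ones that only scale vertically.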
51. @tati_alchueyrMulti
Apache Beam: overview of Dataflow job
Parallel processing “effortlessly”
Image from the book “Google Cloud Platform In Action” by JJ Geewax, Chapter 20
54. @tati_alchueyrMulti
Adoption of Apache Beam & Dataflow
“Serverless” parallel processing of 41,258,135 items (27.32 GB) with Python in 1 min 24 s using 10 default workers
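For a sense of scale, the figures on this slide translate into the throughput below. The per-worker number assumes the load was spread evenly across the 10 workers, which the slide does not state.

```python
ITEMS = 41_258_135
SIZE_GB = 27.32
ELAPSED_SECONDS = 1 * 60 + 24  # 1 min 24 s
WORKERS = 10

items_per_second = ITEMS / ELAPSED_SECONDS
gb_per_second = SIZE_GB / ELAPSED_SECONDS
items_per_worker_per_second = items_per_second / WORKERS  # assumes an even split

print(f"{items_per_second:,.0f} items/s overall")
print(f"{gb_per_second:.3f} GB/s overall")
print(f"{items_per_worker_per_second:,.0f} items/s per worker")
```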
55. @tati_alchueyrMulti
s/PythonOperator/DataflowOperator
Pure Airflow: PythonOperator in Cloud Composer
DataflowOperator running a Beam pipeline within Dataflow
Computation time reduced almost by one order of magnitude
Document type          PythonOperator   DataflowOperator   Performance gain
episode                60 min           6 min              90%
availability episode   12 min           5 min              58%
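The "Performance gain" column is consistent with the relative reduction in run time, (old - new) / old. The sketch below recomputes it from the two rows; the formula is inferred from the numbers, not stated on the slide.

```python
def performance_gain(old_minutes: float, new_minutes: float) -> float:
    """Relative reduction in run time, as a percentage."""
    return (old_minutes - new_minutes) / old_minutes * 100


def speedup(old_minutes: float, new_minutes: float) -> float:
    """How many times faster the new run is."""
    return old_minutes / new_minutes


# (PythonOperator minutes, DataflowOperator minutes) from the table.
rows = {"episode": (60, 6), "availability episode": (12, 5)}

for name, (old, new) in rows.items():
    print(f"{name}: {performance_gain(old, new):.0f}% gain, "
          f"{speedup(old, new):.1f}x faster")
```

The "almost one order of magnitude" claim matches the episode row (10x); the availability episode row improves by a factor of 2.4.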
59. @tati_alchueyrMulti
To Beam or not to Beam?
● 8.4 GiB distributed in 130 parquet files
● Task: read only one of the columns and export that in new files
● Three implementations:
○ Single-threaded PyArrow on my computer (Quad-Core, 16 GB RAM)
○ Dataflow autoscaling, up to 10 default workers
○ Dataflow with a fixed number of 10 workers
● Which is the most efficient in terms of vCPU, memory and time?
60. @tati_alchueyrMulti
To Beam or not to Beam?
               PyArrow        Dataflow (autoscaling)   Dataflow (fixed workers)
Time           3m56.355s      12m27.314s               7m44.518s
Total vCPU     0.05 vCPU hr   0.997 vCPU hr            0.979 vCPU hr
Total memory   0.016 GB hr    3.739 GB hr              3.673 GB hr
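To make the comparison easier to read, the sketch below expresses each Dataflow run relative to the single-threaded PyArrow run. The values are copied from the table; the parsing helper is illustrative.

```python
def to_seconds(duration: str) -> float:
    """Parse a duration written like '3m56.355s' into seconds."""
    minutes, seconds = duration.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# (time, vCPU hours, GB hours) from the table.
runs = {
    "PyArrow": ("3m56.355s", 0.05, 0.016),
    "Dataflow (autoscaling)": ("12m27.314s", 0.997, 3.739),
    "Dataflow (fixed workers)": ("7m44.518s", 0.979, 3.673),
}

base_time, base_vcpu, base_mem = runs["PyArrow"]
for name, (time_text, vcpu_hr, gb_hr) in runs.items():
    print(
        f"{name}: "
        f"{to_seconds(time_text) / to_seconds(base_time):.1f}x time, "
        f"{vcpu_hr / base_vcpu:.1f}x vCPU, "
        f"{gb_hr / base_mem:.1f}x memory"
    )
```

For this task, both Dataflow runs took longer and consumed far more vCPU and memory than the local PyArrow run.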
61. @tati_alchueyrMulti
Does a better machine mean faster?
n1-standard-1
● 1 vCPU
● 3.75 GB RAM
n1-standard-4
● 4 vCPU
● 15 GB RAM
67. @tati_alchueyrMulti
Error message from worker: ConnectionReset
Solutions for memory-intensive Beam transformations
● Use custom machine type with extended memory
● Use shared memory feature from Beam 2.24
Founded in 1922
“Our organisation exists in order to serve individuals and society as a whole rather than a small set of stakeholders.”
UK population: 66.44 million
Ukraine population: ~42.22 million
World population: 7.7 billion people as of April 2019
Image from Seven worlds, one planet
~12 million penguins live in Antarctica
https://oceanites.org/wp-content/uploads/2019/06/SOAP-2019-Online.pdf
Multi-disciplinary team
Architecture
Data science
Editorial
Engineering
Product Management
Project Management