A journey from HTTP
to gRPC
Tati Al-Chueyr
@tati_alchueyr
PyCon UK 2018
15 September 2018, Cardiff
tati_alchueyr.__doc__
● computer engineer from Unicamp
● senior data engineer at BBC
(previously engineer at EF, globo.com &
Ministry of Science and Technology of Brazil)
● open source enthusiast
● pythonist since 2003
● Amanda’s mummy
bbc.datalab.mission
“Bring the BBC’s data together accessible
through a common platform, along with flexible
and scalable tools to support machine learning
to enable content enrichment and deeper
personalization”
bbc.datalab.team
recommendation: the magic
user
identifier
personalised
content
recommendation: monolithic approach
user
identifier
personalised
content
the monolith
database
recommendation: microservices approach
user
identifier
personalised
content
view orchestrate recommend
content database
once upon a time,
in a not far away land...
there was a plan to build microservices
even before birth, each microservice was
given a personality
CPU intensive microservice
characteristics
● consumes lots of
CPU
examples
● for and while loops
● serialisation
● recommendation
models
● mathematical
computation
I/O intensive microservice
characteristics
● consumes lots of I/O
examples
● communicates to
○ databases
○ other APIs
○ files
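For illustration, a minimal Python sketch of the two personalities (a simplification for this write-up; the same two workloads reappear in flask_api.py later):

import time

def cpu_bound():
    # CPU intensive: pure computation, the process never waits
    total = 0
    for i in range(19840511):
        total += i * i
    return total

def io_bound():
    # I/O intensive: mostly waiting on the outside world
    # (simulated here with a sleep, as in the talk's examples)
    time.sleep(2)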
there was a dream...
all the
microservices
will be built with
the same
technology
and with this dream...
the recommender platform started being built
view orchestrate recommend
content database
the first
generation of
microservices
was developed
using
version
1
the microservices started to grow up
view orchestrate recommend
content database
they happily
learned
how to talk to
each other using
JSON over
HTTP
http http http
http http
json json json
json
json
version
1
and their contracts were defined using swagger
view orchestrate recommend
content database
version
1
however...
view orchestrate recommend
content database
although for some tasks they performed well,
for others they showed huge latencies
when there were concurrent users
version
1
flask_api.py (Flask==1.0.2)

import time
from flask import Flask

app = Flask(__name__)

@app.route("/io")
def io():
    time.sleep(2)
    return "IO bound task completed"

@app.route("/cpu")
def cpu():
    total = 0
    for i in range(19840511):
        total += i*i
    return "CPU bound task completed"

$ FLASK_APP=flask_api.py flask run
$ boom http://127.0.0.1:5000/io -c 10 -n 10
Running 10 queries - concurrency 10
-------- Results --------
Successful calls 10
Total time 2.0495 s
Average 2.0302 s
Fastest 2.0227 s
Slowest 2.0377 s
Amplitude 0.0150 s
Standard deviation 0.004716
RPS 4
BSI :(
-------- Status codes --------
Code 200 10 times.
-------- Legend --------
RPS: Request Per Second
BSI: Boom Speed Index
$ boom http://127.0.0.1:5000/cpu -c 10 -n 10
Running 10 queries - concurrency 10
-------- Results --------
Successful calls 10
Total time 19.2738 s
Average 18.8230 s
Fastest 18.1451 s
Slowest 19.2623 s
Amplitude 1.1172 s
Standard deviation 0.379198
RPS 0
BSI :(
-------- Status codes --------
Code 200 10 times.
-------- Legend --------
RPS: Request Per Second
BSI: Boom Speed Index
why?
Flask uses Werkzeug as its WSGI server
● By default Werkzeug uses threads
Python has the GIL (Global Interpreter Lock)
● Only one thread can execute Python bytecode at a time
● the threads fight for the CPU, blocking one another all the way and
returning their responses to the users at almost the same time
● This does not affect I/O-bound work, but it does affect computation
(see the sketch below)
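A minimal sketch of the GIL's effect, using only the standard library (this script is an assumption for illustration, not the talk's code): two threads doing CPU work take about as long as running the work twice sequentially, while two threads doing I/O overlap almost perfectly:

import time
from concurrent.futures import ThreadPoolExecutor

def cpu():
    total = 0
    for i in range(19840511):
        total += i * i

def io():
    time.sleep(2)

for task in (cpu, io):
    start = time.time()
    with ThreadPoolExecutor(max_workers=2) as pool:
        pool.map(lambda f: f(), [task, task])
    # cpu: roughly 2x a single run (threads contend for the GIL)
    # io: ~2 s total (sleeping threads release the GIL)
    print(task.__name__, round(time.time() - start, 2), "s")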
the flask microservices were in crisis...
along came
recommend
content database
spawning and managing multiple processes
of each microservice - bringing overall relief
version
2
view orchestrate
a gunicorn microservice
version
2
microservice master
worker
worker
worker
n
and with
synchronous
gunicorn
workers, each
microservice
could now
respond to n
concurrent
requests
s
s
s
flask_api.py (Flask==1.0.2, gunicorn==19.9.0)

unchanged:
import time
from flask import Flask

app = Flask(__name__)

@app.route("/io")
def io():
    time.sleep(2)
    return "IO bound task completed"

@app.route("/cpu")
def cpu():
    total = 0
    for i in range(19840511):
        total += i*i
    return "CPU bound task completed"

changed:
$ gunicorn --bind 0.0.0.0:5000 --workers 1 flask_api:app
$ boom http://127.0.0.1:5000/io -c 10 -n 10
Running 10 queries - concurrency 10
-------- Results --------
Successful calls 10
Total time 20.0589 s
Average 11.0316 s
Fastest 2.0171 s
Slowest 20.0579 s
Amplitude 18.0407 s
Standard deviation 5.754321
RPS 0
BSI :(
-------- Status codes --------
Code 200 10 times.
-------- Legend --------
RPS: Request Per Second
BSI: Boom Speed Index
1 x sync worker
$ boom http://127.0.0.1:5000/io -c 10 -n 10
Running 10 queries - concurrency 10
-------- Results --------
Successful calls 10
Total time 10.0679 s
Average 6.0398 s
Fastest 2.0494 s
Slowest 10.0412 s
Amplitude 7.9918 s
Standard deviation 2.823781
RPS 0
BSI :(
-------- Status codes --------
Code 200 10 times.
-------- Legend --------
RPS: Request Per Second
BSI: Boom Speed Index
2 x sync workers
$ boom http://127.0.0.1:5000/io -c 10 -n 10
Running 10 queries - concurrency 10
-------- Results --------
Successful calls 10
Total time 2.0541 s
Average 2.0300 s
Fastest 2.0233 s
Slowest 2.0432 s
Amplitude 0.0199 s
Standard deviation 0.007658
RPS 4
BSI :(
-------- Status codes --------
Code 200 10 times.
-------- Legend --------
RPS: Request Per Second
BSI: Boom Speed Index
10 x sync workers
$ boom http://127.0.0.1:5000/cpu -c 10 -n 10
Running 10 queries - concurrency 10
-------- Results --------
Successful calls 10
Total time 18.3119 s
Average 10.0470 s
Fastest 1.8415 s
Slowest 18.3008 s
Amplitude 16.4593 s
Standard deviation 5.244141
RPS 0
BSI :(
-------- Status codes --------
Code 200 10 times.
-------- Legend --------
RPS: Request Per Second
BSI: Boom Speed Index
1 x sync worker
$ boom http://127.0.0.1:5000/cpu -c 10 -n 10
Running 10 queries - concurrency 10
-------- Results --------
Successful calls 10
Total time 10.5875 s
Average 6.2530 s
Fastest 2.8555 s
Slowest 10.5788 s
Amplitude 7.7232 s
Standard deviation 2.670508
RPS 0
BSI :(
-------- Status codes --------
Code 200 10 times.
-------- Legend --------
RPS: Request Per Second
BSI: Boom Speed Index
3 x sync workers
$ boom http://127.0.0.1:5000/cpu -c 10 -n 10
Running 10 queries - concurrency 10
-------- Results --------
Successful calls 10
Total time 10.1825 s
Average 7.6599 s
Fastest 5.9517 s
Slowest 10.1776 s
Amplitude 4.2260 s
Standard deviation 2.038286
RPS 0
BSI :(
-------- Status codes --------
Code 200 10 times.
-------- Legend --------
RPS: Request Per Second
BSI: Boom Speed Index
5 x sync workers
why?
Gunicorn, by default, spawns a synchronous process for each worker
● the number of workers limits the number of concurrent requests; for
this reason I/O was affected negatively (compared to the previous
“no limit” threaded setup)
● for CPU there is a significant improvement over pure Flask/Werkzeug,
since the service can now handle requests concurrently without
having to wait for all of them to finish
● there is a limit to the number of workers (suggested: (2 x $num_cores) + 1),
as shown in the sketch below; beyond that point they start thrashing
system resources, decreasing the throughput of the entire system
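A minimal sketch of that heuristic as a gunicorn config file (gunicorn.conf.py and the use of multiprocessing.cpu_count() are assumptions for illustration, not from the talk):

# gunicorn.conf.py - sketch of the suggested (2 x $num_cores) + 1 cap
import multiprocessing

bind = "0.0.0.0:5000"
workers = (2 * multiprocessing.cpu_count()) + 1

$ gunicorn -c gunicorn.conf.py flask_api:app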
the microservices were doing their
work, until...
stronger forces decided to replace them with gRPC,
recommend
content database
version
1
view orchestrate
with the
promise of
higher
performance
and
free type
checking
along came the new generation...
view orchestrate recommend
content database
there was a
learning curve
towards gRPC
version
3
the microservices started to grow up
view orchestrate recommend
content database
and in two months’ time
most of the microservices
were talking Protocol Buffers
over TCP
http tcp tcp
tcp http
json pb3 pb3
pb3
json
version
3
pb3
json
syntax = "proto3";
message Empty {
}
service Sample {
rpc IntenseProcess(Empty) returns (Empty) {}
rpc IntenseIO(Empty) returns (Empty) {}
}
grpc.proto
$ pip install grpcio==1.15.0 grpcio-tools==1.15.0
$ python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. grpc.proto
$ ls
grpc_pb2.py
grpc_pb2_grpc.py

The 2 in pb2 indicates that the generated code follows version 2 of the
Protocol Buffers Python API. It has no relation to the Protocol Buffers
language version, which is the one indicated by the syntax statement in
the .proto file.
grpc_server.py (grpcio==1.15.0, grpcio-tools==1.15.0)

import time
from concurrent import futures

import grpc

import grpc_pb2
import grpc_pb2_grpc

class SampleServicer(grpc_pb2_grpc.SampleServicer):
    def IntenseProcess(self, request, context):
        total = 0
        for i in range(19840511):
            total += i*i
        return grpc_pb2.Empty()

    def IntenseIO(self, request, context):
        time.sleep(2)
        return grpc_pb2.Empty()

server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
grpc_pb2_grpc.add_SampleServicer_to_server(SampleServicer(), server)
print('Starting server. Listening on port 50051.')
server.add_insecure_port('[::]:50051')
server.start()
try:
    while True:
        time.sleep(86400)
except KeyboardInterrupt:
    server.stop(0)

$ python grpc_server.py
<very messy code, to be cleaned and added here>
<very messy code, to be cleaned and added here>
not the end
our path
flask
gunicorn
grpc
start “end”
next steps
flask
gunicorn
grpc
start “end”
?
lots of possibilities
asynchronous workers
version
2
microservice master
worker
worker
worker
n
and the sync
workers were
replaced by
async ones
a
a
a
flask_api.py (Flask==1.0.2, gunicorn==19.9.0)

unchanged:
import time
from flask import Flask

app = Flask(__name__)

@app.route("/io")
def io():
    time.sleep(2)
    return "IO bound task completed"

@app.route("/cpu")
def cpu():
    total = 0
    for i in range(19840511):
        total += i*i
    return "CPU bound task completed"

changed:
$ gunicorn --bind 0.0.0.0:5000 --workers 1 flask_api:app -k gevent
$ boom http://127.0.0.1:5000/io -c 10 -n 10
Running 10 queries - concurrency 10
-------- Results --------
Successful calls 10
Total time 2.0366 s
Average 2.0219 s
Fastest 2.0185 s
Slowest 2.0252 s
Amplitude 0.0066 s
Standard deviation 0.001698
RPS 4
BSI :(
-------- Status codes --------
Code 200 10 times.
-------- Legend --------
RPS: Request Per Second
BSI: Boom Speed Index
with processes and not threads
I/O loop
future
future
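A minimal sketch of why the gevent worker helps I/O-bound endpoints (a standalone script assumed for illustration; gunicorn's gevent worker applies the monkey-patching for you): blocking calls such as time.sleep become cooperative, so a single process can interleave many waiting requests on one event loop:

from gevent import monkey
monkey.patch_all()  # make time.sleep and friends yield to the event loop

import time
import gevent

def io_task(n):
    time.sleep(2)  # patched: yields instead of blocking the whole process
    return n

# ten "concurrent requests" finish in ~2 s total, not ~20 s
jobs = [gevent.spawn(io_task, i) for i in range(10)]
gevent.joinall(jobs)
print([job.value for job in jobs])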
flask
gunicorn
grpc
start “end”
?
diversity
lessons learned
credits
images attribution
Creative Commons Attribution:
● Mobile phone: https://commons.wikimedia.org/wiki/File:Mobile_phone_font_awesome.svg
● Database: https://commons.wikimedia.org/wiki/File:Noun_project_database_1276526_cc.svg
● Cute monsters inspiration: https://all-free-download.com/free-vector/download/cute-colorful-monsters_6816778.html
● Hats: http://www.clker.com/clipart-white-hard-hat-9.html
● Zoom in: https://icons8.com/icon/714/zoom-in
https://findouthow.datalab.rocks
join us:
thank you! diolch!
questions?
Tati Al-Chueyr
@tati_alchueyr
PyCon UK 2018
15 September 2018, Cardiff
