Gnocchi v4 performance improvements and benchmark results

past and present...
gord[at]live.ca
@gord_chung

v4 features
□ simplified scheduling
□ less pandas, more numpy
□ Redis incoming driver
□ In-memory incoming Ceph driver
□ Other general features:
■ http://gnocchi.xyz/releasenotes/4.0.html
■ http://gnocchi.xyz/releasenotes/unreleased.html

scheduling
incoming data sharded
into sacks to allow simple
division of work across
metricd workers
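
A minimal sketch of the sharding idea, not Gnocchi's actual code: each metric maps to a fixed sack, and each metricd worker owns a disjoint set of sacks. The sack count, the modulo mapping and the round-robin assignment are all assumptions made for illustration.

```python
import uuid
from collections import defaultdict

NUM_SACKS = 128    # hypothetical sack count
NUM_WORKERS = 18   # metricd workers, as in the v4 benchmark setup


def sack_for_metric(metric_id, num_sacks=NUM_SACKS):
    """Map a metric UUID to a sack; a fixed modulo keeps the mapping stable."""
    return metric_id.int % num_sacks


def sacks_for_worker(worker, num_workers=NUM_WORKERS, num_sacks=NUM_SACKS):
    """Give each worker a disjoint set of sacks (round-robin stand-in)."""
    return [s for s in range(num_sacks) if s % num_workers == worker]


# Shard 20K hypothetical metrics into sacks.
metrics = [uuid.uuid4() for _ in range(20_000)]
by_sack = defaultdict(list)
for m in metrics:
    by_sack[sack_for_metric(m)].append(m)

# Every sack is owned by exactly one worker.
assigned = [s for w in range(NUM_WORKERS) for s in sacks_for_worker(w)]
```

Because a worker only looks at its own sacks, no two workers compete for the same metric's incoming measures.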

numpy
old
Pandas - a monolithic, all-in-one, data
analysis toolkit
new
Numpy - a lightweight, high-performance,
N-dimensional array (and a bit more)
library
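
To give a feel for array-based aggregation, here is a rough sketch of a one-minute rollup done with plain NumPy calls. The input data and the 60-second granularity are made up, and this is not the code Gnocchi ships; it only shows that bucketing and reducing a sorted series needs nothing beyond arrays.

```python
import numpy as np

# Hypothetical input: 60 points, one every 10 seconds, starting at t = 0.
timestamps = np.arange(60) * 10       # seconds, already sorted
values = np.arange(60, dtype=float)   # made-up measures

granularity = 60                      # one-minute rollup (assumption)

# Round each timestamp down to the start of its bucket.
buckets = timestamps - (timestamps % granularity)

# Timestamps are sorted, so each bucket is one contiguous run of points.
starts, first_idx, counts = np.unique(
    buckets, return_index=True, return_counts=True)

# Reduce each contiguous run in one vectorised call per aggregate.
sums = np.add.reduceat(values, first_idx)
means = sums / counts
maxes = np.maximum.reduceat(values, first_idx)
```

The same pattern extends to min, count and other aggregates by swapping the ufunc passed to `reduceat`.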

in-memory
the memory is mightier.
leverage Redis driver or
LevelDB/RocksDB
internals for Ceph

benchmarks
back with another one of those block rockin’ beats

environment
v2 & v3
node1
- OpenStack controller node
- Ceph Monitor Service
- Redis (coordination)
node2
- OpenStack Compute Node
- Ceph OSD node (10 OSDs + SSD Journal)
- 18 metricd (24 in v2)
node3
- Gnocchi API (32 workers)
Journal)
node4
Journal)
- PostgreSQL (
v4.x
node1
- OpenStack controller node
- Ceph Monitor Service
- Redis
- MySQL
node2
Journal)
node3
Journal)
- Gnocchi API (32 workers)
- 18 metricd
all nodes are physical servers:
- 24CPU (48 hyperthreaded)
- 256GB memory
- 10K disks
- 1GB network
- CentOS 7.1
fewer services and less hardware when running v4: all Gnocchi services run on a single node.
all tests use Ceph as the storage driver for aggregates.

data generated using benchmark
tool in client (modified to use
threads). 4 clients w/ 12 threads
running simultaneously.
write throughput
total datapoints written per second (higher is better)

write throughput
number of requests made per second (higher is better)

test case 1
1K resources, 20 metrics
each. flood Gnocchi with
60 individual points per
metric. 1.2M calls/run.
run it a few times.

post time
time to POST 1.2M individual measures for 20K metrics to Gnocchi.
v3.1 had an anomaly that caused degradation over time.

processing time
time to aggregate all measures according to policy (lower is better)
v4 tests use 18 metricd; v3 tests use 54 metricd.

v4 only comparison
processing time

processing time
number of recorded, unprocessed measures over a single run
poor scheduling logic resulted in inefficient handling of many tiny objects in v3.

processing time
number of recorded, unprocessed measures over a single run
backlog size depends on both the API's ability to write data and metricd's ability to process it.
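
The relationship between write rate, processing rate and backlog can be pictured with a toy simulation. All rates and durations below are hypothetical and are not taken from the benchmark; the model simply adds what the API writes each second and subtracts what metricd can process.

```python
def simulate_backlog(write_rate, process_rate, write_seconds, total_seconds):
    """Return the number of unprocessed measures at the end of each second."""
    backlog = 0
    history = []
    for t in range(total_seconds):
        if t < write_seconds:
            backlog += write_rate                 # API records new measures
        backlog = max(0, backlog - process_rate)  # metricd drains the backlog
        history.append(backlog)
    return history


# Hypothetical rates: the API writes faster than metricd processes.
history = simulate_backlog(write_rate=1000, process_rate=600,
                           write_seconds=10, total_seconds=30)
peak = max(history)               # largest backlog during the run
drained_at = history.index(0)     # first second with an empty backlog
```

In this toy model the backlog grows while writes outpace processing and only drains once the writes stop, which is the shape to look for in the chart.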

test case 2
1K resources, 20 metrics each. 60 batched points per metric. 20K calls/run. run it a few times.

processing time
time to aggregate all measures according to policy (lower is better)
v4 tests use 18 metricd for 3x8 aggregates/metric; v2 and v3 tests use 72 and 54 metricd respectively.

aggregation time
time to aggregate 60 measures of a metric into 3x8 aggregates (lower is better)
average time reflects a combination of scheduling efficiency, computation efficiency and IO performance.

test case 3
500 resources, 20 metrics each. 720 batched points per metric. 10K calls/run. run it a few times.
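
The call counts quoted for the three test cases follow from simple arithmetic, sketched below. The helper name is made up, and it assumes that a batched run sends one POST per metric while an unbatched run sends one POST per point.

```python
def workload(resources, metrics_per_resource, points_per_metric, batched):
    """Return metric, point and API call counts for one benchmark run."""
    metrics = resources * metrics_per_resource
    points = metrics * points_per_metric
    # Assumption: batched = one call per metric, unbatched = one call per point.
    calls = metrics if batched else points
    return {"metrics": metrics, "points": points, "calls": calls}


case1 = workload(1000, 20, 60, batched=False)   # 1.2M calls
case2 = workload(1000, 20, 60, batched=True)    # 20K calls
case3 = workload(500, 20, 720, batched=True)    # 10K calls
```

Test cases 1 and 2 store the same 1.2M points and differ only in call count, while test case 3 halves the metric count but stores 7.2M points per run.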

processing time
time to aggregate all measures according to policy (lower is better)
v4 tests use 18 metricd for 3x8 aggregates/metric; v2 and v3 tests use 72 metricd.

aggregation time
time to aggregate 720 measures of a metric into 3x8 aggregates (lower is better)
computation efficiency improved for larger series: ~3x improvement for 60 points and ~6x improvement for 720 points.

some more numbers
peep this...

processing time
time to aggregate metric with varying unbatched measure sizes (lower is better)
numbers represent optimal performance. benchmark was taken under zero load.

query time
time to retrieve a single time series using curl and client (lower is better)
client overhead is attributed to, but not limited to, formatting.
no significant performance difference vs v3.

default configurations
time to aggregate all measures according to default ‘medium’ policy (lower is better)
v3 tests use 54 metricd.
v4 tests use 18 metricd.
- v3 medium policy:
  - minute/hourly/daily rollups
  - 8 aggregates each
- v4 medium policy:
  - minute/hourly rollups
  - 6 aggregates each
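
The difference in work per metric between the two default policies can be counted directly. The dictionary layout below is only an illustration and is not Gnocchi's archive policy format.

```python
# Rollups and aggregate counts as listed for each default 'medium' policy.
policies = {
    "v3 medium": {"rollups": ["minute", "hourly", "daily"], "aggregates_each": 8},
    "v4 medium": {"rollups": ["minute", "hourly"], "aggregates_each": 6},
}


def series_per_metric(policy):
    """Number of aggregated series kept per metric under a policy."""
    return len(policy["rollups"]) * policy["aggregates_each"]


counts = {name: series_per_metric(p) for name, p in policies.items()}
```

Under these figures v3 maintains 24 aggregated series per metric and v4 maintains 12, so the default-configuration comparison reflects a lighter policy as well as code improvements.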

thanks!
Any questions?
You can find me at
@gord_chung
gord[at]live.ca

Credits
Special thanks to all the people who
made and released these awesome
resources for free:
□ Presentation template by
SlidesCarnival
