1. Ceilometer
CERN use case:
● CERN delivers resources in the form of virtual machines and via traditional
batch and Grid computing
● Individual batch nodes execute payload from different users and
communities
● Accounting should cover both use cases
● Questions of interest include
– What is the resource usage of experiment A during December?
– What is the resource usage of user B last year?
● Accounting information has to be reported to Grid bodies (WLCG) by
experiment
Facts:
● Details of users' jobs are already present in the batch accounting database
● It is a huge DB, with around 400,000 records added every day
Solution:
● Use ceilometer as the single source of truth for accounting data
● Batch data is put in the ceilometer database for accounting purposes
3. Ceilometer: Current Implementation
[Architecture diagram. Batch-specific instances: Ceilometer Agent Central with batch plugin, RabbitMQ-LSF, Ceilometer Collector for batch data, batch accounting database. IaaS-specific instances: Ceilometer Agent Central, Ceilometer Agent Compute, RabbitMQ, Ceilometer Collector. Shared: Ceilometer Database (mongodb), Ceilometer API.]
4. Ceilometer: Current Implementation
● A ceilometer-agent-central plugin was written, which polls
the batch accounting database for unpublished records
● The unpublished records are then pushed to the metering
queue (RabbitMQ)
● The ceilometer-collector instance consumes the
messages from the metering queue and inserts them in
the ceilometer database (mongodb)
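The poll, publish, and collect steps above can be sketched roughly as follows. This is only an illustration and not the actual plugin: it uses an in-memory SQLite table as a stand-in for the batch accounting database, a standard-library queue as a stand-in for RabbitMQ, and a plain list as a stand-in for mongodb. The table layout, the column names, and the function names are all assumptions made for the example.

```python
import json
import queue
import sqlite3


def make_demo_db():
    """Create a hypothetical accounting table with three unpublished job records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs ("
        "id INTEGER PRIMARY KEY, owner TEXT, cpu_seconds REAL, "
        "published INTEGER DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO jobs (owner, cpu_seconds) VALUES (?, ?)",
        [("alice", 120.0), ("bob", 30.5), ("alice", 300.0)],
    )
    conn.commit()
    return conn


def publish_unpublished(conn, metering_queue):
    """Read unpublished rows, push each to the queue, then mark it as published."""
    rows = conn.execute(
        "SELECT id, owner, cpu_seconds FROM jobs WHERE published = 0 ORDER BY id"
    ).fetchall()
    for job_id, owner, cpu_seconds in rows:
        message = json.dumps(
            {"job_id": job_id, "owner": owner, "cpu_seconds": cpu_seconds}
        )
        metering_queue.put(message)
        conn.execute("UPDATE jobs SET published = 1 WHERE id = ?", (job_id,))
    conn.commit()
    return len(rows)


def collect(metering_queue, store):
    """Consume every queued message and append the decoded record to the store."""
    count = 0
    while True:
        try:
            message = metering_queue.get_nowait()
        except queue.Empty:
            return count
        store.append(json.loads(message))
        count += 1


if __name__ == "__main__":
    conn = make_demo_db()
    metering_queue = queue.Queue()
    store = []
    print("published:", publish_unpublished(conn, metering_queue))
    print("collected:", collect(metering_queue, store))
```

A second call to publish_unpublished finds nothing to send, which mirrors the runs that see no unpublished data.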
5. Ceilometer: Current Implementation
● To decrease the load on the OpenStack messaging server,
the batch data is pushed to a different messaging server
than the one that receives other OpenStack messages
(e.g. those from agent-compute)
● This means that there are dedicated instances of
agent-central and collector for VM metering and for
batch metering
● The collectors write the data into a single database
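The split described above (two messaging servers, dedicated collectors, one shared database) can be pictured with a small toy model. It is a sketch under assumptions rather than Ceilometer code: the Collector class, the broker variables, and the sample fields are hypothetical names, and standard-library queues stand in for the two RabbitMQ servers.

```python
import queue


class Collector:
    """Toy collector: drains one broker and writes into the shared database."""

    def __init__(self, name, broker, database):
        self.name = name
        self.broker = broker
        self.database = database

    def drain(self):
        """Move every pending sample from this collector's broker to the database."""
        count = 0
        while not self.broker.empty():
            sample = self.broker.get()
            self.database.append({"collector": self.name, **sample})
            count += 1
        return count


def run_demo():
    """Publish VM and batch samples to separate brokers, then collect both."""
    openstack_broker = queue.Queue()  # stand-in for the main messaging server
    batch_broker = queue.Queue()      # stand-in for the dedicated batch server
    database = []                     # stand-in for the single shared database

    # Batch traffic never touches the broker used for VM metering.
    openstack_broker.put({"source": "agent-compute", "meter": "cpu"})
    for job_id in range(3):
        batch_broker.put({"source": "batch-plugin", "job_id": job_id})

    vm_collector = Collector("collector-vm", openstack_broker, database)
    batch_collector = Collector("collector-batch", batch_broker, database)
    return vm_collector.drain(), batch_collector.drain(), database


if __name__ == "__main__":
    vm_count, batch_count, database = run_demo()
    print(vm_count, batch_count, len(database))
```

The point of the layout is that a burst of batch records only fills the batch broker, while both collectors still end up writing to the same store.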
6. Ceilometer: LSF Data Statistics
● The batch plugin is run once per hour if the previous
run has finished
● Most runs do not find any unpublished data, because data in
the batch accounting database arrives in bursts
● Most of a day's data is published to the messaging
server in 2 runs of around 200,000 job records
each
● It takes around 5 hours to complete one such run
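As a quick sanity check, the run size and run duration quoted above imply a publishing rate, which can be worked out directly. The constant names are chosen for this example only, and the inputs are the rounded figures from the slide.

```python
RECORDS_PER_RUN = 200_000  # approximate job records in one large run
RUN_HOURS = 5              # approximate duration of such a run

# Implied average publishing rate in records per second.
implied_rate = RECORDS_PER_RUN / (RUN_HOURS * 3600)

if __name__ == "__main__":
    print(f"{implied_rate:.1f} records/s")
```

This gives roughly 11 records per second. Since the plugin is scheduled hourly and only starts when the previous run has finished, a 5-hour run also means the following scheduled starts are skipped.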
7. Ceilometer: Batch Data Statistics
● The average rate of record publishing to the batch
rabbitmq server is 11 records per second (11 Hz). This
rate accounts for the time to
– read unpublished records,
– push them to the rabbit-server and
– mark the records in the batch accounting database as
published
● Most of this time is spent on publishing the records
● The time for activities other than publishing is
minuscule
● The growth rate of the mongodb database is about
2 GB/day
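Two back-of-the-envelope figures follow from these numbers together with the roughly 400,000 records added per day: the total publishing time per day and the approximate storage per record. The calculation below is a rough estimate only. It assumes decimal gigabytes and assumes the whole database growth comes from batch records, which the slides do not state.

```python
RECORDS_PER_DAY = 400_000          # records added to the batch accounting DB daily
PUBLISH_RATE_HZ = 11               # average publishing rate, records per second
GROWTH_BYTES_PER_DAY = 2 * 10**9   # assumption: 2 GB/day read as decimal gigabytes

# Hours per day spent publishing at the average rate.
publish_hours_per_day = RECORDS_PER_DAY / PUBLISH_RATE_HZ / 3600

# Storage per record, assuming all growth is due to batch records.
bytes_per_record = GROWTH_BYTES_PER_DAY / RECORDS_PER_DAY

if __name__ == "__main__":
    print(f"{publish_hours_per_day:.1f} hours of publishing per day")
    print(f"{bytes_per_record:.0f} bytes per record")
```

Under these assumptions, publishing takes about 10 hours per day and each record occupies about 5 kB in mongodb.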