Ceilometer
CERN use case:
● CERN delivers resources in the form of virtual machines and via traditional
batch and Grid computing
● Individual batch nodes execute payload from different users and
communities
● Accounting should cover both use cases
● Interesting metrics include
– What is the resource usage of experiment A during December?
– What is the resource usage of user B last year?
● Accounting information has to be reported to Grid bodies (WLCG) by
experiment
Facts:
● Details of users' jobs are already present in the batch accounting database
● It is a huge DB, with around 400,000 records added every day
Solution:
● Use ceilometer as the single source of truth for accounting data
● Batch data is put into the ceilometer database for accounting purposes
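Once batch and VM records sit in one store, the two example questions above reduce to a filtered sum. The following is a minimal sketch of that idea, not the ceilometer API: the record fields (`experiment`, `user`, `cpu_seconds`, `end`), the dates and the values are all made-up sample data.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical accounting records; field names and values are illustrative only.
RECORDS = [
    {"experiment": "A", "user": "B", "cpu_seconds": 3600.0, "end": datetime(2013, 12, 3)},
    {"experiment": "A", "user": "C", "cpu_seconds": 1800.0, "end": datetime(2013, 12, 20)},
    {"experiment": "D", "user": "B", "cpu_seconds": 7200.0, "end": datetime(2013, 6, 1)},
    {"experiment": "A", "user": "B", "cpu_seconds": 900.0, "end": datetime(2013, 11, 30)},
]


def usage_by(records, key, start, end):
    """Sum cpu_seconds per value of `key` for records ending in [start, end)."""
    totals = defaultdict(float)
    for rec in records:
        if start <= rec["end"] < end:
            totals[rec[key]] += rec["cpu_seconds"]
    return dict(totals)


# "Resource usage of experiment A during December"
december = usage_by(RECORDS, "experiment", datetime(2013, 12, 1), datetime(2014, 1, 1))
# "Resource usage of user B last year"
last_year = usage_by(RECORDS, "user", datetime(2013, 1, 1), datetime(2014, 1, 1))
```

The same function answers both questions because only the grouping key and the time window change.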
CERN's idea to use ceilometer
Ceilometer: Current Implementation
[Architecture diagram]
Batch specific instances: Ceilometer Agent Central with batch plugin, RabbitMQ-LSF, Ceilometer Collector for batch data
IaaS specific instances: Ceilometer Agent Compute, Ceilometer Agent Central, RabbitMQ, Ceilometer Collector
Other components: Batch accounting database, Ceilometer Database (mongodb), Ceilometer API
Ceilometer: Current Implementation
● A ceilometer-agent-central plugin was written, which polls
the batch accounting database for unpublished records
● The unpublished records are then pushed to the metering
queue (RabbitMQ)
● The ceilometer-collector instance consumes the
messages from the metering queue and inserts them into
the ceilometer database (mongodb)
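The poll, publish and collect steps can be sketched as follows. This is an illustration under stated assumptions, not the actual plugin: an in-memory SQLite table stands in for the batch accounting database, a `queue.Queue` stands in for the RabbitMQ metering queue, a Python list stands in for mongodb, and the table layout (`jobs`, `owner`, `published`) is hypothetical.

```python
import json
import queue
import sqlite3


def publish_unpublished(db, metering_queue):
    """Read unpublished job records, push each to the queue, mark them published."""
    rows = db.execute(
        "SELECT job_id, owner, cpu_seconds FROM jobs "
        "WHERE published = 0 ORDER BY job_id"
    ).fetchall()
    for job_id, owner, cpu_seconds in rows:
        message = json.dumps(
            {"job_id": job_id, "owner": owner, "cpu_seconds": cpu_seconds}
        )
        metering_queue.put(message)
        db.execute("UPDATE jobs SET published = 1 WHERE job_id = ?", (job_id,))
    db.commit()
    return len(rows)


def collect(metering_queue, store):
    """Drain the queue and append the decoded messages to the store."""
    while True:
        try:
            message = metering_queue.get_nowait()
        except queue.Empty:
            return
        store.append(json.loads(message))


# Hypothetical stand-in for the batch accounting database.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE jobs (job_id INTEGER PRIMARY KEY, owner TEXT, "
    "cpu_seconds REAL, published INTEGER DEFAULT 0)"
)
db.executemany(
    "INSERT INTO jobs (job_id, owner, cpu_seconds) VALUES (?, ?, ?)",
    [(1, "alice", 120.0), (2, "bob", 30.5)],
)

metering_queue = queue.Queue()  # stand-in for the RabbitMQ metering queue
store = []                      # stand-in for the ceilometer database

first = publish_unpublished(db, metering_queue)
collect(metering_queue, store)
second = publish_unpublished(db, metering_queue)  # nothing left to publish
```

Marking records as published is what lets a later poll find nothing to do, which matches the observation that most runs have no unpublished data.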
Ceilometer: Current Implementation
● In order to decrease the load on the OpenStack
messaging server, the batch data is pushed to a
different messaging server than the one that receives other
OpenStack messages (e.g. those from agent-compute)
● This means that there are dedicated instances of
agent-central and collector for VM and batch metering
● The collectors write the data into a single database
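The separation described above can be pictured as two independent pipelines that share only the final store. The sketch below assumes nothing about the real deployment: `Pipeline` is a hypothetical class, each `queue.Queue` stands in for one messaging server, the list stands in for the single database, and the meter names are invented.

```python
import queue
from dataclasses import dataclass, field


@dataclass
class Pipeline:
    """One agent/collector pair with its own queue (stand-in for a broker)."""
    name: str
    broker: queue.Queue = field(default_factory=queue.Queue)

    def publish(self, sample):
        # Tag each sample with the pipeline it came from.
        self.broker.put({"source": self.name, **sample})

    def collect_into(self, database):
        # Move everything queued on this broker into the shared database.
        moved = 0
        while not self.broker.empty():
            database.append(self.broker.get())
            moved += 1
        return moved


database = []             # single shared store
iaas = Pipeline("iaas")   # VM metering, on its own broker
batch = Pipeline("batch") # batch metering, on a separate broker

iaas.publish({"meter": "cpu", "volume": 1.0})
batch.publish({"meter": "job.cpu", "volume": 42.0})
batch.publish({"meter": "job.cpu", "volume": 7.0})

# Batch traffic never appears on the IaaS broker.
iaas_depth_before = iaas.broker.qsize()
batch_depth_before = batch.broker.qsize()

iaas.collect_into(database)
batch.collect_into(database)
```

A burst of batch records lengthens only the batch queue, so VM metering messages are not delayed behind it.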
Ceilometer: LSF Data Statistics
● The batch plugin runs once per hour, provided the previous
run has finished
● Most runs do not have any unpublished data, as data in
the batch accounting database arrives in bursts
● Most of a day's data is published to the messaging
server in 2 runs of around 200,000 job records
each
● One such run takes around 5 hours to complete
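Since a single run can last several hours while the schedule fires hourly, each tick has to be skipped while the previous run is still in progress. One common way to get that behaviour is a non-blocking lock, sketched below; `HourlyRunner` is a hypothetical name and this is not how ceilometer itself schedules pollsters.

```python
import threading


class HourlyRunner:
    """Runs a task on each tick unless the previous run is still in progress."""

    def __init__(self, task):
        self._task = task
        self._lock = threading.Lock()
        self.skipped = 0

    def tick(self):
        """Called once per scheduling interval; returns True if the task ran."""
        if not self._lock.acquire(blocking=False):
            self.skipped += 1
            return False
        try:
            self._task()
            return True
        finally:
            self._lock.release()


# Demonstration: a long run that overlaps the next tick.
started = threading.Event()
release = threading.Event()


def long_run():
    started.set()
    release.wait(timeout=5)  # simulate a run that outlasts the interval


runner = HourlyRunner(long_run)
results = []
worker = threading.Thread(target=lambda: results.append(runner.tick()))
worker.start()
started.wait(timeout=5)

overlapping = runner.tick()  # previous run still in progress, so it is skipped
release.set()
worker.join()
after = runner.tick()        # previous run has finished, so this one runs
```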
Ceilometer: Batch Data Statistics
● The average rate of record publishing to the batch
rabbitmq server is 11 Hz (records per second). This includes
– the time to read unpublished records,
– the time to push them to the rabbit server, and
– the time to mark the records as published in the batch
accounting database
● Most of this time is spent publishing the records
● The time for activities other than publishing is
minuscule
● The growth rate of the mongodb database is about
2 GB/day
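The figures on these slides are consistent with each other, as a quick back-of-the-envelope check shows. The sketch assumes that 2 GB means 2 × 10^9 bytes and attributes all database growth to the 400,000 daily batch records, so the per-record size is only an upper bound (VM samples are written to the same database).

```python
RECORDS_PER_RUN = 200_000
PUBLISH_RATE_HZ = 11                  # records per second
RECORDS_PER_DAY = 400_000
DB_GROWTH_BYTES_PER_DAY = 2 * 10**9   # assumption: 2 GB = 2e9 bytes

# Duration of one large run at the average publishing rate, in hours.
run_hours = RECORDS_PER_RUN / PUBLISH_RATE_HZ / 3600

# Average time spent per record, in milliseconds.
per_record_ms = 1000 / PUBLISH_RATE_HZ

# Upper bound on stored bytes per batch record.
bytes_per_record_upper = DB_GROWTH_BYTES_PER_DAY / RECORDS_PER_DAY

print(f"{run_hours:.2f} h per run, {per_record_ms:.1f} ms per record, "
      f"<= {bytes_per_record_upper:.0f} bytes per record")
```

At 11 Hz a run of 200,000 records takes about 5.05 hours, which agrees with the reported figure of around 5 hours.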

Ceilometer lsf-intergration-openstack-summit
