1. The document proposes a low-code solution for billing in a private cloud using open-source tools like KillBill and Prometheus.
2. It outlines an initial architecture that would ingest usage metrics from products, aggregate the data, and publish billing events to KillBill for invoicing and payments.
3. Exporters would collect metrics from products like S3 and ingress and expose them in a format readable by Prometheus for long-term storage and analysis by the billing system.
6. What is FinOps – management point of view
“FinOps” is a movement that advocates:
1. A collaborative working relationship between DevOps and Finance for data-driven management of infrastructure spending
2. Transparency between IT and finance
3. Cost efficiency, profitability and product delivery pace
7. What is FinOps – IT point of view
Near real-time reporting + just-in-time processes + IT and finance teams working together + shared cloud dictionary = FinOps
FinOps + trust = balance between speed of change, availability of services and cloud costs
8. Opensource FinOps for public cloud
Public cloud
$$$
Mock API
https://cutt.us/tKkrA
$$$
9. Opensource FinOps for private cloud
Private cloud
Doesn’t exist!!!
Neither IAAS nor PAAS!
$$$
10. IAAS billing
• The best projects are dead
• Ceilometer is not the best and is deprecated
• The projects still alive aren’t open source
ISP solutions!
11. Private cloud FinOps tools
1 relevant link on the 5th page!
2 relevant links on the 2nd page!
16. Exivity features
• Extractor templates for cloud/virtualization/backup/storage software – AWS, GCP, OpenStack, VMware, Veeam, NetApp…
• API for custom extractors
• Near real-time extraction/billing tools
• Lookup management – organization hierarchy
• Rate management
• Budget management, chargeback/showback
• Account management with CMDB integration
• Reporting with drilldown
• RBAC + SSO
From organization to VM
Very small dev team, but a perfect in-house FinOps platform!
Not a good option for Sber + not open source!
17. Full billing cycle reference architecture
Diagram elements: All IAAS/PAAS products; Billing event ingestion; User/product onboarding; Ingestion database; Event aggregation; Publish to billing API; Billing
18. Billing features
• Extensible product catalog
• Extensible product pricing models
• Extensible product cancelation policies
• Invoicing
• API
• Proven performance
• …
Too hard to implement from scratch!
20. Opensource BSS
Language  License     Community  Feature set  API     Documentation  Huge customers
Java      JBL         dead       rich         Decent  Bad            No
Java      Apache 2.0  alive      huge         Decent  Good           Yes
PHP       AGPL-3      small      rich         Good    Not bad        Yes
22. KillBill deployment architecture
Diagram elements: K8s cluster; KillBill (three instances); KaUI; Analytic plugin; OLTP schema (PG or MySQL); UI and analytic schema (PG or MySQL); 4 core, 16 GB (shown twice)
23. KillBill features
• Flat and hierarchical accounts
• Tiered plans, multi-phase plans
• Plan versioning + deferred execution time
• Prepaid/postpaid billing
• Usage/subscription billing
• Invoicing
• Payment/taxation engine
• Decoupled entitlement and overdue engines
• Each service can be extended by plugins + a set of plugins out of the box
Very cool for complicated enterprises
24. KillBill used features
• Flat and hierarchical accounts
• Tiered plans, multi-phase plans
• Plan versioning + deferred execution time
• Prepaid/postpaid billing
• Usage/subscription billing
• Invoicing
• Payment/taxation engine
• Decoupled entitlement and overdue engines
• Each service can be extended by plugins + a set of plugins out of the box
• …
42. In-house billing implementation steps
• Plan catalog population
• Usage metrics delivery
• Warnings about unpaid products
• Usage metrics population
43. IAAS full billing cycle initial architecture
Diagram elements: IAAS/PAAS product; Billing event ingestion; User/product onboarding; Ingestion database; Event aggregation; Publish to billing API; Billing; XML with plans; Metrics
44. IAAS full billing cycle initial architecture
Diagram elements: IAAS/PAAS product; Billing event ingestion; User/product onboarding; Ingestion timeseries database; Event aggregation; Publish to billing API; Billing; XML with plans; Metrics
45. Metrics
1. Timeseries data
2. Value = usage value
3. Mandatory labels:
• Unique private cloud resource name – RN
• Metric type: counter or gauge
• Unit type from KillBill plan
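A minimal sketch of one such sample rendered as a Prometheus exposition format line. The label names `rn`, `metric_type` and `unit`, the metric name and the RN value are illustrative assumptions; the slides only state which pieces of information the labels must carry.

```python
def escape(value: str) -> str:
    """Escape a label value as the exposition format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_sample(name: str, value: float, rn: str, metric_type: str, unit: str) -> str:
    """Render one usage sample with the three mandatory billing labels."""
    if metric_type not in ("counter", "gauge"):
        raise ValueError(f"unsupported metric type: {metric_type}")
    # Label names are hypothetical; any consistent naming would do.
    labels = {"rn": rn, "metric_type": metric_type, "unit": unit}
    rendered = ",".join(f'{key}="{escape(val)}"' for key, val in labels.items())
    return f"{name}{{{rendered}}} {value}"


line = render_sample("download_bytes_total", 1024, "rn:cloud:s3:tenant-a",
                     "counter", "download-bytes")
print(line)
```

Keeping the unit as a label lets the billing side match a series to a KillBill plan unit without knowing anything else about the product.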
48. IAAS full billing cycle initial architecture
Diagram elements: IAAS/PAAS product; Billing event consumption; User/product onboarding; Ingestion timeseries database; Event aggregation; Publish to billing API; Billing; XML with plans; Prometheus Exposition Format; Metrics; VMAgent; YAML with scrape endpoint
49. IAAS full billing cycle initial architecture
Diagram elements: IAAS/PAAS product; Billing event consumption; User/product onboarding; Ingestion timeseries database; Event aggregation; Publish to billing API; Billing; XML with plans; Prometheus Exposition Format; Metrics; Retention – 1 month; Timeseries database; Retention – 2 years; History and disputes investigation; VMAgent; YAML with scrape endpoint
50. Billing data population concept
• Pull billing data: metrics exposition endpoint. Easier for IAAS/PAAS products
• Push metrics: Billing API interaction. Harder for IAAS/PAAS products
52. Billing data population
• Pull billing data: single metrics exposition endpoint. Easier for IAAS/PAAS products
• Push metrics: Billing API interaction. Harder for IAAS/PAAS products
Developers are not happy with either approach!
Diagram labels: State DB; Survive reboot; Buffer; flow consumer
53. Generic connector
What do we want?
Billing!
What will we do to get it?
Nothing!
Who will do the work for us?
Billing!
• Inbound traffic
• Outbound traffic
• API calls count
57. Ingress exporter duties
1. Scan all ingresses labeled for billing
2. Extract RN from ingress path
3. Extract nginx_ingress_controller_bytes_sent_sum + nginx_ingress_controller_request_size_sum + nginx_ingress_controller_request into download-bytes_total, upload-bytes_total, api_call_count_total counter metrics
4. Label metrics with RN, unit and metric type
5. Write logs
6. Expose metrics in PEF via endpoint
Lines: 105
(Screenshot of the nginx configuration)
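Steps 2 to 4 of the exporter duties could be sketched roughly as follows. This is not the exporter from the talk: the path layout with the RN as the first segment, the label names and the unit strings are all assumptions made for illustration; only the source and target metric names come from the slide.

```python
# Source series named on the slide -> (billing metric name, unit).
# The unit strings are hypothetical; real ones come from the KillBill plan catalog.
SOURCE_TO_BILLING = {
    "nginx_ingress_controller_bytes_sent_sum": ("download-bytes_total", "download-bytes"),
    "nginx_ingress_controller_request_size_sum": ("upload-bytes_total", "upload-bytes"),
    "nginx_ingress_controller_request": ("api_call_count_total", "api-calls"),
}


def extract_rn(ingress_path: str) -> str:
    """Take the RN from the first path segment (assumed path layout)."""
    segments = [s for s in ingress_path.split("/") if s]
    if not segments:
        raise ValueError(f"no RN in ingress path: {ingress_path!r}")
    return segments[0]


def to_billing_metrics(ingress_path: str, raw: dict[str, float]) -> list[dict]:
    """Relabel raw controller values as billing counters for one ingress."""
    rn = extract_rn(ingress_path)
    result = []
    for source, (name, unit) in SOURCE_TO_BILLING.items():
        if source in raw:  # skip series the controller did not report
            result.append({
                "name": name,
                "labels": {"rn": rn, "unit": unit, "metric_type": "counter"},
                "value": raw[source],
            })
    return result


metrics = to_billing_metrics(
    "/rn-tenant-a/api/v1",
    {"nginx_ingress_controller_bytes_sent_sum": 2048,
     "nginx_ingress_controller_request": 7},
)
```

The exporter only relabels values; computing deltas is left to the time series database and the billing importer.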
59. Billing importer duties
1. Extract data from VictoriaMetrics starting right after the previous successful KillBill interaction
2. Aggregate events by RN, unit type and metric type over a 24-hour interval
3. Write deltas for counter and gauge usage to the KillBill API in bulk mode
4. Write self-healing data
5. Write logs
6. Repeat every 60 seconds
Lines: 112
Thank you, VM, for the INCREASE function
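To illustrate steps 2 and 3 for counters, here is a hedged sketch. `counter_increase` is a simplified stand-in for what an increase-style query returns (it is not VictoriaMetrics' implementation), and the payload field names follow my understanding of the KillBill usage API; both should be checked against the respective documentation. The subscription ID and tracking ID are made-up values.

```python
from datetime import date


def counter_increase(samples: list[float]) -> float:
    """Sum increases over time-ordered counter samples; a drop counts as a reset."""
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        # After a reset the counter restarts from zero, so the new value is the delta.
        total += cur - prev if cur >= prev else cur
    return total


def build_usage_payload(subscription_id: str, unit_type: str, amount: float,
                        record_date: date, tracking_id: str) -> dict:
    """Build one usage record body (assumed KillBill JSON shape)."""
    return {
        "subscriptionId": subscription_id,
        "trackingId": tracking_id,  # lets a retry be recognised as a duplicate
        "unitUsageRecords": [{
            "unitType": unit_type,
            "usageRecords": [{"recordDate": record_date.isoformat(), "amount": amount}],
        }],
    }


delta = counter_increase([10, 15, 3, 8])  # 5 before the reset, then 3, then 5
payload = build_usage_payload("sub-123", "download-bytes", delta,
                              date(2024, 1, 31), "rn-tenant-a-2024-01-31")
```

Sending only deltas keeps the importer stateless between runs apart from the timestamp of the last successful write.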
63. Common product billing initial architecture
Diagram elements: IAAS/PAAS product; Billing event consumption; User/product onboarding; Ingestion timeseries database; Event aggregation; Publish to billing API; Billing; XML with plans; Prometheus Exposition Format; Metrics; Retention – 1 month; Timeseries database; Retention – 2 years; History and disputes investigation; VMAgent; YAML with scrape endpoint; Self-service portal; Event log; New/Remove user/product events; New/Remove product events; Portal connector; accounts and subscriptions; Ingress; API calls; Metrics: API call count, inbound traffic, outbound traffic; Ingress exporter; Billing importer; Usage metrics; Aggregated usage metrics
64. S3 exporter duties
1. Scan all ingresses labeled for billing
2. Extract RN from the S3 service tenant namespace name
3. Extract get, head, post and put request information from the S3 /metrics endpoint
4. Merge get+head and put+post into get-head-requests_total and put-post-requests_total counter metrics
5. Count delete requests just in case
6. Extract nginx_ingress_controller_bytes_sent_sum into the download-bytes_total counter metric
7. Extract the tenant's current used space from the S3 CLI into the bytes_sum gauge metric
8. Label metrics with RN, unit and metric type
9. Write logs
10. Expose metrics in PEF via endpoint
Lines: 112
(Screenshot of the nginx configuration)
No changes in S3 service source code!
Mediation microservice!
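Steps 3 to 5 of the S3 exporter duties might look like the sketch below. The input series name `s3_requests_total` with a `method` label is purely hypothetical, since the slides do not show what the S3 /metrics endpoint actually exposes; `delete-requests_total` is likewise an invented name. Only the two merged metric names come from the slide.

```python
import re

# Hypothetical line shape of the S3 /metrics endpoint.
REQUEST_LINE = re.compile(r'^s3_requests_total\{method="(\w+)"\}\s+(\d+)$')


def parse_counts(metrics_text: str) -> dict[str, int]:
    """Collect per-method request counts from exposition-format text."""
    counts: dict[str, int] = {}
    for line in metrics_text.splitlines():
        match = REQUEST_LINE.match(line.strip())
        if match:
            method = match.group(1).lower()
            counts[method] = counts.get(method, 0) + int(match.group(2))
    return counts


def merge_requests(counts: dict[str, int]) -> dict[str, int]:
    """Merge get+head and put+post; keep delete as its own counter."""
    return {
        "get-head-requests_total": counts.get("get", 0) + counts.get("head", 0),
        "put-post-requests_total": counts.get("put", 0) + counts.get("post", 0),
        "delete-requests_total": counts.get("delete", 0),  # hypothetical name
    }


sample = "\n".join([
    's3_requests_total{method="get"} 10',
    's3_requests_total{method="head"} 2',
    's3_requests_total{method="put"} 4',
    "# unrelated line",
])
merged = merge_requests(parse_counts(sample))
```

Because the exporter reads an existing endpoint, the S3 service itself stays untouched, which matches the "no changes in S3 service source code" point on the slide.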
65. Common product billing initial architecture
Diagram elements: IAAS/PAAS product; Timeseries database; Billing; XML with plans; Prometheus Exposition Format; Metrics: API call count, inbound traffic, outbound traffic; Retention – 1 month; Timeseries database; Retention – 2 years; History and disputes investigation; Self-service portal; Event log; New/Remove user/product events; Portal connector; Billing importer; accounts and subscriptions; Aggregated usage metrics; Usage metrics; Ingress exporter; Ingress; API calls; New/Remove product events; VMAgent; YAML with scrape endpoint
66. Advanced product billing initial architecture
Diagram elements: IAAS/PAAS product; Timeseries database; Billing; XML with plans; Prometheus Exposition Format; Metrics: API call count, inbound traffic, outbound traffic; Retention – 1 month; Timeseries database; Retention – 2 years; History and disputes investigation; Self-service portal; Event log; New/Remove user/product events; Portal connector; Billing importer; accounts and subscriptions; Aggregated usage metrics; Usage metrics; Ingress exporter; Ingress; API calls; New/Remove product events; VMAgent; YAML with scrape endpoint
67. Advanced product billing initial architecture
Diagram elements: Timeseries database; Billing; XML with plans; Prometheus Exposition Format; Metrics: API call count, inbound traffic, outbound traffic; Retention – 1 month; Timeseries database; Retention – 2 years; History and disputes investigation; Self-service portal; Event log; New/Remove user/product events; Portal connector; Billing importer; accounts and subscriptions; Aggregated usage metrics; Usage metrics; Ingress exporter; Ingress; New/Remove product events; VMAgent; YAML with scrape endpoint; S3 product; S3 API calls; S3 exporter; S3 CLI logs; Metrics: storage size; Metrics: API calls per group; Metrics: outbound traffic
68. Billing product vision
From VictoriaMetrics!!!
• Take some data from log/endpoint
• Transform to consumption metrics
• Expose to Prometheus
90% of code!
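The three steps above can be illustrated with a small log-based example. The `key=value` log format, the field names `rn` and `bytes`, and the metric name `outbound_bytes_total` are assumptions chosen for the sketch; a real product would have its own log layout.

```python
from collections import defaultdict


def mediate(log_lines: list[str]) -> str:
    """Turn key=value log lines into per-RN consumption counters in PEF."""
    totals: dict[str, int] = defaultdict(int)
    for line in log_lines:
        # Step 1: take data from the log (malformed tokens are ignored).
        fields = dict(part.split("=", 1) for part in line.split() if "=" in part)
        # Step 2: transform to a consumption metric, summed per resource name.
        if "rn" in fields and fields.get("bytes", "").isdigit():
            totals[fields["rn"]] += int(fields["bytes"])
    # Step 3: render text that a Prometheus-compatible scraper can read.
    return "\n".join(
        f'outbound_bytes_total{{rn="{rn}"}} {value}'
        for rn, value in sorted(totals.items())
    )


print(mediate(["rn=a bytes=5", "rn=b bytes=7", "rn=a bytes=1", "garbage"]))
```

A real exporter would serve this text over HTTP and keep the totals across scrapes so that the counters only ever grow.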
81. Low-Code billing architecture
Diagram elements: Any product; Timeseries database; Billing; XML with plans; Prometheus Exposition Format; Metrics: inbound traffic, outbound traffic; Retention – 1 month; Timeseries database; Retention – 2 years; History and disputes investigation; Self-service portal; Event log; New/Remove user/product events; Portal connector; Billing importer; accounts and subscriptions; Aggregated usage metrics; usage metrics; Any API calls; Any logs; Ingress; Metrics: any product metric; Metrics: any logs metric; mediation; TOML with pipeline; New/Remove product events; VMAgent; YAML with scrape endpoint
82. Final Low-Code billing architecture
Diagram elements: Any product; Timeseries database; Billing; XML with plans; Prometheus Exposition Format; Metrics: inbound traffic, outbound traffic; Retention – 1 month; Timeseries database; Retention – 2 years; History and disputes investigation; Self-service portal; Event log; New/Remove user/product events; Portal connector; Billing importer; accounts and subscriptions; Aggregated usage metrics; usage metrics; Any API calls; Any logs; Ingress; Metrics: any product metric; Metrics: any logs metric; mediation; TOML with pipeline; New/Remove product events; VMAgent; YAML with scrape endpoint; YAML Quotas Definition Language; Quotas enforcer
84. Final Low-Code billing architecture
Diagram elements: Any product; Timeseries database; Billing; XML with plans; Prometheus Exposition Format; Metrics: inbound traffic, outbound traffic; Retention – 1 month; Timeseries database; Retention – 2 years; History and disputes investigation; Self-service portal; Event log; New/Remove user/product events; Portal connector; Billing importer; accounts and subscriptions; Aggregated usage metrics; usage metrics; Any API calls; Any logs; Ingress; Metrics: any product metric; Metrics: any logs metric; mediation; TOML with pipeline; New/Remove product events; VMAgent; YAML with scrape endpoint; YAML Quotas Definition Language; Quotas enforcer
Responsibilities
Billing team – code!
Product teams – Low-Code
85. Pains
• Generic connector is K8S-tailored
• Analytic reports are too slow without the KillBill analytics plugin
• KillBill is slow for balance requests – 2-3 second delay
• KillBill is Java-based
• Metrics history has gaps – hard for disputes
• VRL
86. Pros
• Basic billing metrics – no effort from dev teams
• All SREs are familiar with the software used
• 80% of the solution – YAML + XML + TOML
• KillBill community is alive