Rudraksh Kulshreshtha from IndiQus, presenting how to architect lean CloudStack deployments for edge use cases.
------------------------------------------
CloudStack European User Group Virtual happened on May 27th. The first CSEUG Virtual proved to be a huge success, gathering participants from 23 countries – Germany, the United Kingdom, Switzerland, India, Bulgaria, Greece, Poland, Serbia, Brazil, Chile, Russia, USA, Canada, Japan, France, Uruguay, Korea …
We also had a record number of registrations and attendees for a CloudStack User Group event. Physical distance was no obstacle for our speakers, who joined the event from 6 different countries.
------------------------------------------
About CloudStack: https://cloudstack.apache.org/
Designing Lean CloudStack Environments for the Edge - IndiQus - CloudStack European User Group Virtual, May 2021
1. Designing lean CloudStack
environments for the edge
Our experiences on the edge
Rudraksh MK
IndiQus Technologies
https://indiqus.com
CSEUG Virtual Meetup - May 27 2021
2. $ whoami
- Infrastructure and data engineering
- Currently heading our DevOps and R&D team
- Also an amateur historian
- Hit me up to shoot the breeze on the state of infrastructure!
3. $ whoarewe
- An 8-year-old startup specialising in edge cloud monetisation.
- Our platform currently enables service providers in 7 countries across 4 continents.
- We ship with a deep understanding of cloud-native infrastructure, with a team consisting of cloud
and managed IT veterans.
4. When people say “edge”..
- Bringing the cloud closer to end-users.
- Usually means things like
- Self-driving cars
- Industrial automation
- IoT, especially when it comes to smart homes
- CDNs!
- Running specialised AI workloads for specialised scenarios on the ground.
5. Our take...
- The idea of taking compute, storage and networking to end users still remains.
- But - we also look at regional cloud platforms as edge clouds, because they cater to regional
requirements, and are meant to be as resilient as hyperscalers like AWS or Google.
- Constraints around costs or resources can be similar to typical edge scenarios, if not the same.
- They allow for scenarios where compliance dictates keeping workloads/data in-country.
6. ...and this is where we come in.
- Enable cloud service providers to quickly deploy edge clouds for their
customers in various geographies.
- Make life easier by working towards making Day-2 operations more
efficient.
- Prove that an open-source infrastructure stack is as mission-ready as
expensive enterprise stacks like VMware.
7. Now the thing is..
- When people say “edge”, ACS is unfortunately the last thing they think
of.
- OpenStack
- StarlingX
- RHOSP
- We want to change that. ACS is actually terrific for the edge.
8. Okay. How?
- Marry edge ACS deployments with a combination of in-house and open source tech.
- Enable the management of multiple edge ACS deployments from a single control plane for
service providers.
- Deliver a stack focusing on optimising
- Service management
- Billing
- Telemetry
- Deployments
11. Common architectural considerations
1. Costs. A regional service provider doesn’t have the kind of cash an AWS or a Google has.
2. Deployments must be as rapid as possible. The same goes for upgrades.
3. Day 2 operations. Service providers need to be able to manage their edge clouds from the get-go,
with minimal intervention from us.
14. The ACS edge management zone
- Typically deployed across a two-server Xen cluster, for operational ease.
- Runs the ACS management and usage components, as well as various other open source and
in-house workloads
- Zabbix
- Syslog
- OTRS
- apiculus
16. Compute & Storage
- Maximise. More for less, because it’s hyperconverged.
- That’s why no Xen for production hosts - we only use KVM or VMware for compute.
- Multiple options for block storage:
- Ceph RBD or StorPool, giving ACS access to high-performance, resilient block storage.
- For VMware-backed zones, hook in with vSAN.
- A separate 10G switch for storage networks
- Minimise failure and the cost of backup storage by deploying Open-E’s ZFS storage provider on a single
node.
- Everything runs on commodity hardware.
- The overriding motto here: “more for less”.
17. Networking
- Separate 1/10G switches for compute and storage.
- Separate 1G switch for out-of-band IPMI access to physical hosts.
- Again - maximise networking hardware output where possible!
- Network security is one of the main outliers here when it comes to using open source/commodity
tech:
- pfSense is powerful - but the bare metal cost of deploying and running a pfSense appliance is actually
higher.
- This is where we use Fortigate for physical load balancing, firewalls as well as IDS.
- However - it can easily be swapped out for pfSense, in order to maintain “ideological purity”.
18. Stephen Hawking & Pink Floyd, Keep Talking (The Division Bell, 1994)
“..all we need to do, is keep talking.”
20. Deploying AND configuring an ACS edge cloud
- A lot of work has been done on automating ACS deployments. Yay for Ansible!
- But what about the hardware it’s supposed to run on or manage?
- And configuring? All those clicks!
- So what we do:
- Automate management and production zone infrastructure with Terraform
- Use Ansible to deploy stable ACS builds
- Use custom Python scripts to automate configuration:
- Hosts, zones, clusters
- Offerings
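The custom Python configuration scripts above drive the CloudStack API. As a minimal sketch of what such a script does under the hood, here is the standard CloudStack request-signing scheme (HMAC-SHA1 over the sorted, lower-cased query string); the command parameters shown are illustrative placeholders, not our actual zone configuration:

```python
import base64
import hashlib
import hmac
import urllib.parse


def sign_request(params: dict, api_key: str, secret_key: str) -> str:
    """Build a signed CloudStack API query string: sort the parameters,
    HMAC-SHA1 the lower-cased query with the secret key, then append the
    base64 signature (per the ACS API signing scheme)."""
    params = {**params, "apikey": api_key, "response": "json"}
    query = "&".join(
        f"{k}={urllib.parse.quote(str(v), safe='')}"
        for k, v in sorted(params.items())
    )
    digest = hmac.new(secret_key.encode(), query.lower().encode(), hashlib.sha1).digest()
    signature = urllib.parse.quote(base64.b64encode(digest).decode(), safe="")
    return f"{query}&signature={signature}"


# Hypothetical example: the signed query string for creating a zone.
# Endpoint, keys and zone parameters are all placeholders.
qs = sign_request(
    {"command": "createZone", "name": "edge-zone-1",
     "dns1": "8.8.8.8", "internaldns1": "10.0.0.2", "networktype": "Advanced"},
    api_key="APIKEY", secret_key="SECRET")
# GET http://<management-server>:8080/client/api?<qs>
```

Wrapping this in a few idempotent functions (create zone, add cluster, register offerings) is essentially what turns "all those clicks" into a repeatable script.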
22. Monitoring & Logging
- Operating a cloud is no less than launching and tracking a rocket.
- If you can’t measure it, you can’t control it.
23. Zabbix at scale
- With Zabbix, we monitor:
- Management infrastructure
- Production infrastructure
- But well, Zabbix is..clunky.
- Not easy to maintain a unified Zabbix deployment for tracking multiple ACS edge zones.
- Configuring Zabbix can be painful.
- It could do with a nicer UI
24. So..Prometheus & Grafana?
- A single Grafana plane connected to multiple Prometheus+TimescaleDB deployments for each
ACS edge zone.
- Track physical infrastructure, CloudStack management servers, databases - everything - because
there’s usually an exporter for everything!
- If there is no exporter..well, a bit of Golang and you’re sorted :)
- Far easier to set up for new edge zones, or for existing zones with resource upgrades. Another point
for Ansible.
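When there is no exporter, one is quick to write. A stdlib-only Python sketch of the idea - a real exporter would pull its numbers from the ACS APIs (e.g. listCapacity/listHosts) instead of the hard-coded placeholders, and the metric names and port here are made up:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(samples: dict, prefix: str = "cloudstack") -> str:
    """Render a dict of gauge samples in the Prometheus text exposition
    format: one `# TYPE` line plus one sample line per metric."""
    lines = []
    for name, value in sorted(samples.items()):
        metric = f"{prefix}_{name}"
        lines.append(f"# TYPE {metric} gauge")
        lines.append(f"{metric} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Placeholder numbers; a real exporter would query the
        # CloudStack management server here.
        body = render_metrics({"vms_running": 42, "hosts_up": 3}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.end_headers()
        self.wfile.write(body)


def main(port: int = 9119):
    """Serve /metrics for Prometheus to scrape."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()

# main()  # uncomment to run the exporter
```

Point a Prometheus scrape job at the port and the new zone shows up in Grafana like everything else.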
25. Logging is..thorny.
- Disparate logging sources.
- Syslog is great..till it isn’t.
- ELK seems like a terrific solution to our woes, but:
- It suffers from the same deployment/maintainability issues as Zabbix
- Also, their new licensing seems a little odd.
- Loki to the rescue!
- Easy enough to hook up existing syslog servers with Loki.
- Exploring logs with Grafana is a breeze!
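Hooking an existing syslog server into Loki typically goes through Promtail's syslog receiver. A sketch of the relevant scrape config, assuming RFC5424 forwarding - the addresses and labels are placeholders:

```yaml
scrape_configs:
  - job_name: syslog
    syslog:
      # Existing syslog servers forward RFC5424 messages here
      listen_address: 0.0.0.0:1514
      labels:
        job: syslog
    relabel_configs:
      # Keep the originating host as a queryable label in Grafana
      - source_labels: [__syslog_message_hostname]
        target_label: host
```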
27. Extending ACS
- To be fair - Java isn’t our strong suit.
- Our method for adding to ACS:
- Building Python/Golang microservices that use ACS APIs extensively
- Optional to deploy, since they aren’t included with the ACS codebase/builds
28. Our extensions
- Converge
- An autoscaling layer for CloudStack.
- Responsible for horizontally scaling virtual machines across multiple hypervisors
- Eliminates the need for investing in Netscaler.
- LeanKube
- Our own home-grown K3s-clusters-as-a-Service for ACS
- Fast, CNCF-compliant Kubernetes clusters on KVM/ESXi.
- Multiple storage options - Ceph/StorPool/Longhorn
- Integrations with the Rancher stack for easy cluster management and application catalogues.
- Hermes
- An attempt at bringing AWS-style load balancers to ACS
- Uses HAProxy
- Deploy L3/L7 load balancers for ACS VMs as well as LeanKube K3s clusters
- Visual configuration, as opposed to writing HAProxy configuration syntax
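For illustration, "visual configuration instead of HAProxy syntax" ultimately means rendering a config file from a declarative spec. A hypothetical Python sketch of such a renderer (not Hermes itself - the naming scheme and defaults are invented for the example):

```python
def render_haproxy(name: str, frontend_port: int, backends: list) -> str:
    """Render a minimal L4 (TCP) HAProxy config from a declarative spec:
    a listener name, a frontend port, and a list of (ip, port) backends."""
    lines = [
        f"frontend {name}_front",
        f"    bind *:{frontend_port}",
        "    mode tcp",
        f"    default_backend {name}_back",
        "",
        f"backend {name}_back",
        "    mode tcp",
        "    balance roundrobin",
    ]
    for i, (ip, port) in enumerate(backends, 1):
        # `check` enables HAProxy health checks on each backend server
        lines.append(f"    server {name}{i} {ip}:{port} check")
    return "\n".join(lines) + "\n"


# Hypothetical usage: two ACS VMs behind one TCP listener
cfg = render_haproxy("web", 80, [("10.0.0.11", 8080), ("10.0.0.12", 8080)])
```

The UI edits the spec; the service rewrites the config and reloads HAProxy - the user never touches the syntax.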
29. Day 2? CHECK!
...and Day X?
What’s up next?
1. Extending Prometheus exporters for
ACS, to provide greater visibility
2. Using Packer to provision CloudStack
templates.
3. Building forecasting on top of our
telemetry stack, to reduce reactionary
responses and increase preventive
responses to anomalous events.
4. Securing operator access to ACS
management environments more
efficiently.
5. The Holy Grail - ACS-Kubernetes
appliances, and running them off
edge devices like the Intel NUC board.