April 2023
Tor Bendiksen and Luuk Stolk, ING ICHP Tiger Team
Stateful workloads on a container platform
Data Services hosted on ICHP
Agenda
1. Introduction
2. Containers and (Non-)Persistence
3. Data Services hosted on ICHP
4. ICHP v1 vs ICHP v2
5. Storage Layer Options
Introduction
Our Market Leaders
We serve 37 million customers in more than 40 countries
Our Challengers & Growth markets
• Netherlands*
• Belgium
• Luxembourg
• Australia
• France **
• Germany
• Italy
• Spain
Wholesale Banking
international network and global franchises
Challengers Markets Growth Markets
Market Leaders
Map highlights countries where ING has an office
(**) In 2022, ING discontinued its retail activities in these markets
• Poland
• Romania
• Turkey
• Philippines **
• Stakes in Asia
(*) ING’s corporate head office is located in Amsterdam, The Netherlands
Get to know us
Tor Bendiksen
Luuk Stolk
Our purpose
Empowering people to stay a step ahead in life and in business
Our priorities
Sustainability at the heart
Superior customer experience
ING at a glance
What is ICHP
ING Container Hosting Platform
Standardized OpenShift/Kubernetes hosting platform
Namespace as a Service for container workloads on ING Private Cloud
Dedicated clusters for hosting Data Services
Self Service through ING Cloud Portal
Integration with Azure DevOps for application workload deployment
Based on Bare Metal for all consumer workloads
With over 2000 namespaces on non-production and 500 on production
Containers and (non-)persistence
Containers and Non-Persistence
Worker Node 1
• Pods and containers are ephemeral / volatile
• Apps in containers should be stateless processes
• No persistency inside the container for (12 factor) apps
Worker Node 2
* https://12factor.net/
Worker Node 2
Containers and Persistence
Worker Node 1
• Use external data services for persistence
• Generic services for common persistent data store solutions
DBaaS
Bind
config
S3 ELK
Data Services hosted on ICHP
Data Services hosted on ICHP
Workload – Worker Nodes
Ingress / Routing
PostgreSQL
Elastic
Cluster
• Ready to deploy platform
• Compliancy
• Cost reduction
• Leverage platform capabilities for
  ✓ Scalability
  ✓ Resilience
  ✓ Automation
Keep your data SAFE – ELKaaS use case
Portworx
➢ Scalability of platform
➢ Availability
➢ Fast
➢ Elasticity
+ 2 cores
+ 100 GB
Keep your data SAFE – HA/DR
Cluster 1
Portworx
Cluster 2
Portworx
Data Center 1 Data Center 2
Service Model for Data Services
[Diagram, two panels. Consumer side: IPC Consumer, IPC Portal (Request), Azure DevOps (Deploy), Use; ICHP NaaS namespaces on the Kubernetes Platform. Data service side: Data Service Owner, Data Service, Data Service Instance; ICHP NaaS namespaces on the Kubernetes Platform.]
Responsibility
Portworx
• Data replication
• Resilient to, for instance, node or volume outages
• Backup / restore
• Zone aware deployment
• Local storage capability with persistent volume interface
• (Technical) capabilities to guarantee local storage SLA
• Namespace as a Service
• Compliant platform
Double Replication, to do or not to do
Portworx
S3 Snapshot
Replication
Replication
Preventing nodes or disks from becoming a single point of failure for data availability
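The cost side of the double-replication question can be sketched with simple arithmetic: when the data service keeps its own copies and the storage layer replicates each volume as well, the two factors multiply. The function name and the numbers below are illustrative assumptions, not ING figures or Portworx settings.

```python
def raw_storage_gb(logical_gb: float, app_copies: int, storage_replicas: int) -> float:
    """Raw disk consumed when both layers keep their own copies of the data."""
    if app_copies < 1 or storage_replicas < 1:
        raise ValueError("each layer keeps at least one copy")
    return logical_gb * app_copies * storage_replicas


# Illustrative numbers only: 100 GB of logical data under three setups.
for app_copies, storage_replicas in [(1, 2), (2, 1), (2, 2)]:
    raw = raw_storage_gb(100, app_copies, storage_replicas)
    print(f"app copies={app_copies}, storage replicas={storage_replicas}: {raw:.0f} GB raw")
```

Replicating at both layers doubles the raw footprint compared with replicating at either layer alone, which is the trade-off behind "to do or not to do".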
Projects on Data Services v2
ELKaaS
• Elastic 8
• Custom ELKaaS Operator
MDPL
• RTK2 primary
• IAT
• Many additional components
Cloud Pak for Data
• Data lake workloads
• IBM backed
• Portworx based
See also: ING Booth (S75) MDPL session: Thursday 16:30 – 17:30
Past, Current and Potential Data Services candidates
Messaging / Eventing
Pulsar
Kafka
• AKS
SAS Viya 4
Undecided
PSQL
CloudNative PG operator
Undecided
Scale / sizing of implementations ICHP v2
                    Cores  Memory    Pods   Storage         Namespaces  Bandwidth          Nodes
Stateless Non Prod  2304   27648 GB  24000  n/a             2600        2x 25 GB per node  36
Stateless Prod      1536   18432 GB  5000   n/a             475         2x 25 GB per node  24
MDPL                1536   18432 GB  5500   153 / 122 TB    20          4x 25 GB per node  24
ELKaaS              1280   15360 GB  2800   2150 / 1720 TB  170         4x 25 GB per node  20
CP4D                640    7860 GB   n/a    14 / 12 TB      n/a         4x 25 GB per node  10

* ICHPv1 Stateless
✓ Namespaces: Prod 612, Non-Prod 3500
✓ Pods: Prod 18500, Non-Prod 25000
✓ Nodes: Prod 80, Non-Prod 300
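Dividing the totals in the sizing table by the node counts gives a rough per-node picture. The sketch below only reuses the cores, memory and nodes columns as printed on the slide; the dictionary and function names are made up for illustration.

```python
# Rows copied from the sizing table: (cores, memory in GB, nodes).
SIZING = {
    "Stateless Non Prod": (2304, 27648, 36),
    "Stateless Prod": (1536, 18432, 24),
    "MDPL": (1536, 18432, 24),
    "ELKaaS": (1280, 15360, 20),
    "CP4D": (640, 7860, 10),
}


def per_node(cores: int, memory_gb: int, nodes: int) -> tuple[float, float]:
    """Return (cores per node, memory in GB per node)."""
    return cores / nodes, memory_gb / nodes


PER_NODE = {name: per_node(*row) for name, row in SIZING.items()}

for name, (cores, mem) in PER_NODE.items():
    print(f"{name:<20} {cores:>5.0f} cores/node {mem:>7.1f} GB/node")
```

Every row works out to 64 cores per node. Memory comes to 768 GB per node for all rows except CP4D, where the 7860 GB on the slide gives 786 GB per node.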
Risk & security ICHP v2
Security Event Monitoring / Anomaly Detection: Falco, Kubernetes audit logs
State Compliancy: OpenShift Compliance Operator, NIST based
Policy Management: Kyverno
State enforcement / configuration drift detection: GitOps - Argo CD
Image scanning (shift left): Prisma Cloud
Immutability:
• No high privileged access to clusters and nodes
• No terminal and ssh access to containers in Acceptance and Production
• Read-only access to namespaces, only deployments through Azure DevOps pipelines
• No privileged containers
• No local persistency except for Data Services
Multi-tenancy:
• Network policies
• Resource quotas
See also: April 19th 11:00 - Thijs Ebbers & Diana Iordan: Zero Privilege Architectures
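To make the "No privileged containers" rule concrete, here is a minimal sketch of the check such a policy performs on a Pod manifest. The deck names Kyverno for policy management; this is not a Kyverno policy, just a plain Python illustration of the same idea, and the example pod is hypothetical. Only the `securityContext.privileged` field of the Kubernetes container spec is assumed.

```python
def privileged_containers(pod: dict) -> list[str]:
    """Return the names of containers in a Pod manifest that request privileged mode."""
    spec = pod.get("spec", {})
    offenders = []
    for key in ("initContainers", "containers", "ephemeralContainers"):
        for container in spec.get(key) or []:
            security = container.get("securityContext") or {}
            if security.get("privileged") is True:
                offenders.append(container.get("name", "<unnamed>"))
    return offenders


# Hypothetical manifest: one ordinary container, one privileged one.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "demo"},
    "spec": {
        "containers": [
            {"name": "app", "image": "registry.example/app:1.0"},
            {
                "name": "debug",
                "image": "registry.example/debug:1.0",
                "securityContext": {"privileged": True},
            },
        ]
    },
}

print(privileged_containers(pod))
```

An admission-time policy would reject the manifest whenever the returned list is non-empty.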
ICHPv1 vs ICHPv2
What has changed
• OpenShift 4
• Installer Provisioned Infrastructure
• Hands off installation
• Hands off running
• GitOps
Drift Detection and Reconciliation
[Diagram labels: Target environment; Git Repo; Pull Request; Pull code]
Automatically…
• Detect deviations in your target environment from the desired state (in Git)
• Enforce the desired state
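The two steps above can be sketched as a comparison of two dictionaries: the desired state as stored in Git and the state observed in the target environment. This is only an illustration of the idea, not the Argo CD or OpenShift GitOps API; the keys, values and function names are hypothetical.

```python
ABSENT = "<absent>"  # marker for a key that exists on one side only


def detect_drift(desired: dict, actual: dict) -> dict:
    """Map every deviating key to a (desired value, actual value) pair."""
    drift = {}
    for key in sorted(desired.keys() | actual.keys()):
        want = desired.get(key, ABSENT)
        have = actual.get(key, ABSENT)
        if want != have:
            drift[key] = (want, have)
    return drift


def reconcile(desired: dict, actual: dict) -> dict:
    """Enforce the desired state: the result matches Git, extras are dropped."""
    return dict(desired)


# Hypothetical states: someone scaled up by hand and added a debug flag.
desired = {"replicas": 3, "image": "registry.example/app:1.4", "logLevel": "info"}
actual = {"replicas": 5, "image": "registry.example/app:1.4", "debug": True}

drift = detect_drift(desired, actual)
print(drift)

actual = reconcile(desired, actual)
print(detect_drift(desired, actual))
```

A GitOps controller runs this loop continuously, so manual changes in the target environment are reported and then reverted.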
GitOps@ICHP – What do we use it for?
Deploy and manage ICHP clusters
Server / OpenShift configuration
Feature deployment and configuration
E.g. APIs, Logging and Monitoring, Risk and Security tooling (TSCM, SEM, etc.)
[Diagram labels: ADO Pipeline; Bare OpenShift cluster incl. GitOps; OpenShift GitOps; Configure Server / OpenShift; Install features]
See also: ING Booth (S75) GitOps session: Wednesday 16:30 – 17:30
Storage Layer Options
Portworx
Pros
• Class leading
• The only viable early choice
• Unbeatable speed
• Enterprise ready
• Good support
• Rapid development
Cons
• Documentation
• Aggressive caching
• Rapid development
Rook
• Rook used for orchestration
• Ceph for storage backend
• File, Block and Object storage
• Replication across nodes/zones
• Feature parity in primary use
Components
• Ceph-mon
• Ceph-osd
• Ceph-mds
• Ceph-rgw
• Ceph-mgr
In a Nutshell
ICHP Data Service Hosting Characteristics
• Not for direct use by application containers
  ➢ Application containers should continue to use data services for persistency
• Dynamic volume provisioning (no pre-allocation required) with Portworx
• Dedicated storage clusters for
  • Serving storage volumes
  • Running data service related container workload
• Use of local disks (SATA SSD – RAID 5)
• Namespaced volumes (not accessible from other namespaces)
• Support for fully automated provisioning
• Infrastructure platform risk controls covered
• No overcommit on storage allocation
• Availability zone awareness
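Dynamic provisioning and namespaced volumes both show up in how a data service asks for storage: it submits a PersistentVolumeClaim in its own namespace and the storage class creates the volume on demand. Below is a minimal sketch that builds such a claim as a Python dictionary. The claim name, namespace and storage class name are hypothetical; only standard Kubernetes PersistentVolumeClaim fields are used.

```python
import json


def pvc_manifest(name: str, namespace: str, size_gi: int, storage_class: str) -> dict:
    """Build a PersistentVolumeClaim; the volume itself is created on demand."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        # A claim is a namespaced object, so it is bound within its own namespace.
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,  # hypothetical class name below
            "resources": {"requests": {"storage": f"{size_gi}Gi"}},
        },
    }


claim = pvc_manifest("elastic-data-0", "elkaas-demo", 100, "px-replicated")
print(json.dumps(claim, indent=2))
```

The printed JSON is a valid manifest format for `kubectl apply -f`, although on ICHP deployments go through pipelines rather than manual commands.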
Questions?
Thank you and come see us at KubeCon!
April 19th 11:00 (Talk) Thijs Ebbers & Diana Iordan: Zero Privilege Architectures
April 19th 16:30 (Talk) Adnan Hodzic: K8s, Resistance is Futile
April 20th 10:30 (Booth S75) Tor Bendiksen & Luuk Stolk: Meet the Speakers
April 20th 11:30 (Booth S75) Mark de Jong & Rob de Boer: ICHP Workload deployment Quality
April 20th 12:30 (Booth S75) Robbin Siepman: ICHP Namespace as a Service
April 18th 16:30 (Booth S75) Arijan Luiken & Salvatore Vitale: Banking Observability at Scale
(Booth S75) Jan-Willem Bijma & Kamil Nocon: GitOps@ING
SlideShare.net/ING
@ING_News LinkedIn.com/company/ING
YouTube.com/ING Flickr.com/INGGroup
Facebook.com/ING
ing.com Medium.com/ing-blog
Follow us
Backup Slides
Availability - Red / Blue zone awareness
Portworx
App X App X
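Zone awareness can be illustrated with a small placement routine that spreads the copies of a volume over zones before reusing a zone. This is a sketch of the general idea only, not how Portworx schedules replicas; the zone and node names are hypothetical.

```python
from itertools import cycle


def place_replicas(nodes_by_zone: dict[str, list[str]], replicas: int) -> list[tuple[str, str]]:
    """Pick (zone, node) pairs round-robin over zones so copies spread out."""
    pools = {zone: list(nodes) for zone, nodes in nodes_by_zone.items() if nodes}
    if replicas > sum(len(nodes) for nodes in pools.values()):
        raise ValueError("not enough nodes for the requested number of replicas")
    placement = []
    for zone in cycle(sorted(pools)):
        if len(placement) == replicas:
            break
        if pools[zone]:
            placement.append((zone, pools[zone].pop(0)))
    return placement


# Hypothetical layout: two nodes in the red zone, one in the blue zone.
print(place_replicas({"red": ["r1", "r2"], "blue": ["b1"]}, 2))
```

With two replicas and two zones, each zone holds one copy, so losing a whole zone still leaves the data available.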
Why Data Services?
• Cost effective
• Keep applications as ‘disposable’ components
• Dealing with persistent data is complex
• Very specific requirements that can (potentially) break compliancy
• Therefore: single stakeholder, solution pattern and concern
Keep your data SAFE
Scalable
• Add nodes
• Add disks
Available
• Replication
• Zone aware
Fast
• Local SSD
• Short I/O path
Elastic
• Fast pod (auto)scaling
• Resize on demand
How?
• Run data services on a Kubernetes compatible container based storage provider
• Portworx

ING Data Services hosted on ICHP DoK Amsterdam 2023
