1
© Worldpay 2016. All rights reserved.
Delivering Multi-Tenancy Applications on Hadoop
David M Walker
Enterprise Data Platform Programme Director & Technical Architect
5th April 2017
© Worldpay 2017. All rights reserved.
Worldpay In (Big) Numbers
[Infographic; the numeric values did not survive extraction. Measures shown: transactions daily and the average per second; merchants; payment methods & currencies; countries; the percentage of all UK non-cash transactions processed. Channels: In Store, Online, Mobile]
Who are our customers?
• You probably interact with Worldpay several times a day without realising it:
• And we also process the payments for over:
– 16,000 hairdressers, 24,000 restaurants, 9,000 pubs, etc.
• After today you will probably notice Worldpay everywhere
Worldpay & Big Data
• In April 2015 we made the strategic decision to commit to a new enterprise-wide
data platform to:
– Provide deep analytics and data-driven decisions as well as traditional reporting
– Source information from across all our platforms and bring it to one place
– Make this information available to our colleagues, our customers and our partners
– Exploit disruptive open-source technologies
• The project has full commitment from the CEO, CIO and the Head of Data who initiated it
• But with 13.1 billion transactions with a total value of £402bn in 2015 alone,
and with a significant proportion of both your card and my card transaction
history in the system, it had to be SECURE
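As a rough sense of scale, the 2015 figures quoted above imply the averages computed below. This is back-of-the-envelope arithmetic only: it assumes a 365-day year and an even spread of transactions, which real payment traffic does not follow.

```python
# Back-of-the-envelope averages from the 2015 figures quoted on the slide.
TRANSACTIONS_2015 = 13.1e9              # 13.1 billion transactions
TOTAL_VALUE_GBP = 402e9                 # £402bn total value
SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # assumption: 365-day year

avg_per_second = TRANSACTIONS_2015 / SECONDS_PER_YEAR
avg_value_gbp = TOTAL_VALUE_GBP / TRANSACTIONS_2015

print(f"average transactions per second: {avg_per_second:.0f}")
print(f"average transaction value: £{avg_value_gbp:.2f}")
```

That works out at roughly 415 transactions per second and about £30.69 per transaction, averaged over the year.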
Some Stats About Our Environment
• Two Production (PRD) Clusters (96 nodes), two PPE Clusters (16 nodes) and one DTE Cluster (8 nodes)
– All environments are built using the same templates and build instructions
– The average Data Node has 12 x 4 TB disks, 256 GB of memory and 20 cores
– Our clusters are on-premise and we have the capability to burst to cloud infrastructure with secured
(tokenised) data
– We have plans to expand rapidly over the next 24 months
• Security is key
– Because we hold so much PCI & PII data we must both be secure and comply with regulators
• We’ve upgraded from HDP 2.3 to HDP 2.4 to HDP 2.5 in 18 months, including many point releases
– Security and ease of management have improved with each release
• We’ve loaded 80+ billion card transactions from two of Worldpay’s systems
– We are busy bringing all the other systems on board as both batch and real-time streams
• We’re in the process of delivering to users and systems
– Users have secure access to the Transaction History through a range of desktop and web tools
– We are in the process of deploying machine-learning-derived algorithms back into payment platforms
What Is A Multi-Tenancy Cluster?
Large Multi-Tenancy Developments
• A building like The Shard in London has many tenant types
and many tenants. These will include:
• Offices
• Retail Arcade
• Restaurants & Bars
• Hotel
• etc.
• They will also have many components and services
provided in the building including:
• Water
• Gas
• Electricity
• Air Conditioning
• Internet
• Security
• Building Management
The analogy with the Enterprise Data Platform
• We have many tenant types
• Data Warehousing
• Decision Services
• Data APIs
• Technical Insights
• Search
• etc.
• Each tenancy type has components and services
it needs to operate
• Data Sources
• Batch, Stream/CDC data, Log Files
• Data Lake and Derived Data Sets
• Data Ingest and Manipulation Tools
• Reporting and Analytic Tooling
• Engines for running models
• Governance (Building Management)
• Security
Our Tenancy Types: Data Warehousing
• It will surprise many but despite the
innovations of big data there is still a
requirement inside the business for
reports and dashboards
• We don’t have a single ‘Enterprise Data Model’
but we do have a number of ‘Narrative
Models’ – third normal form data models that
describe aspects of the business. They are used
to populate data marts in Hive, which are reported
on with tools such as Tableau
Our Tenancy Types: Decision Services
• Our data scientists can use the historical data
we have available to examine factors
including:
• What affects whether a transaction
successfully completes?
• How smooth was the transaction from a
customer perspective (did a 3D Secure
challenge appear, etc.)?
• Is it fraud?
• Using this information we can generate
Predictive Models that can be seeded back
into the transaction path and used to optimise
the way in which to process a transaction
Our Tenancy Types: Data APIs
• Our data API tenants share data from EDP with
other systems
• This may include either de-tokenising our data
into clear text (e.g. for sharing with fraud agencies)
or double encrypting the data (e.g. when
sharing it with another company so we can
trace lineage)
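The two sharing modes described above can be sketched as follows. This is a minimal illustration, not Worldpay's implementation: the `TokenVault` class, the key values and the use of HMAC-SHA256 as the second, partner-specific layer are all assumptions made for the example (the slide says only "double encrypting"), and a production system would rely on a hardened tokenisation service with managed keys.

```python
import hashlib
import hmac
import secrets


class TokenVault:
    """Hypothetical in-memory vault mapping tokens to clear values."""

    def __init__(self):
        self._clear_by_token = {}
        self._token_by_clear = {}

    def tokenise(self, clear_value):
        # Reuse the existing token so one clear value always maps to one token.
        if clear_value not in self._token_by_clear:
            token = secrets.token_hex(8)
            self._token_by_clear[clear_value] = token
            self._clear_by_token[token] = clear_value
        return self._token_by_clear[clear_value]

    def detokenise(self, token):
        # Only a trusted consumer (e.g. a fraud agency feed) would call this.
        return self._clear_by_token[token]


def partner_token(token, partner_key):
    """Apply a second, partner-specific keyed layer on top of the token.

    Each partner sees a different value for the same record, so a value
    found elsewhere can be traced back to the partner it was shared with.
    """
    return hmac.new(partner_key, token.encode(), hashlib.sha256).hexdigest()


vault = TokenVault()
token = vault.tokenise("4111111111111111")  # widely used test card number
for_partner_a = partner_token(token, b"key-for-partner-a")  # hypothetical keys
for_partner_b = partner_token(token, b"key-for-partner-b")
```

Because the second layer is keyed per partner, `for_partner_a` and `for_partner_b` differ even though both derive from the same token, which is the property that makes lineage traceable in this sketch.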
• Data is shared with both internal and external
organisations
Our Tenancy Types: Technical Insights
• Our Technical Insights tenancy type stores system
and security monitoring data and logs
• These are gathered from various platforms and
made available for analysis and reporting
• The data can be used for both simple and complex
analysis
• We start with simple examples about which
systems need patching, how many support calls
were opened against a specific system, uptime of
servers, etc.
• But we are looking for the complex relationships –
given a pattern of events in routers and servers, do we
need to add more capacity or carry out preventative
maintenance? Answering that allows us to offer better
outcomes to our merchants
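The simple end of that range can be illustrated with a few lines of counting. The sketch below is hypothetical: the record layout and field names (`system`, `type`) are invented for the example and are not the platform's event schema.

```python
from collections import Counter

# Hypothetical event records; the field names are illustrative only.
events = [
    {"system": "web-01", "type": "support_call"},
    {"system": "web-01", "type": "patch_missing"},
    {"system": "db-01", "type": "support_call"},
    {"system": "web-01", "type": "support_call"},
]


def count_by_system(records, event_type):
    """Count events of one type per system."""
    return Counter(r["system"] for r in records if r["type"] == event_type)


# How many support calls were opened against each system?
support_calls = count_by_system(events, "support_call")

# Which systems need patching?
needs_patching = sorted(
    {r["system"] for r in events if r["type"] == "patch_missing"}
)
```

The complex relationships the slide refers to would need sequence or correlation analysis over the same records rather than simple counts.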
Our Tenancy Types: Search
• Search is a growing requirement
• Our business is moving to use a concept of
‘Operational Events’ where all systems
generate an event when something changes
• These will be stored in our Operational Event
Store within the cluster
• We need to make this data quickly and easily
available via search tools
• The data will help satisfy queries from our
colleagues, merchants, consumers and
business partners
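The idea of an event store that is searchable can be sketched with a toy keyword index. This is an assumption-laden illustration only: the `EventStore` class and its methods are hypothetical, and a real deployment would use a dedicated search engine rather than an in-memory index.

```python
from collections import defaultdict


class EventStore:
    """Toy operational event store with a keyword index (illustrative only)."""

    def __init__(self):
        self._events = []
        self._index = defaultdict(set)  # term -> positions of matching events

    def add(self, source, description):
        """Store one event and index every word of its source and description."""
        position = len(self._events)
        self._events.append({"source": source, "description": description})
        for term in f"{source} {description}".lower().split():
            self._index[term].add(position)
        return position

    def search(self, query):
        """Return the events that contain every term in the query."""
        terms = query.lower().split()
        if not terms:
            return []
        matches = set.intersection(*(self._index.get(t, set()) for t in terms))
        return [self._events[i] for i in sorted(matches)]


store = EventStore()
store.add("gateway", "merchant address changed")
store.add("billing", "merchant invoice issued")
store.add("gateway", "terminal activated")
results = store.search("merchant gateway")
```

Here `results` holds only the first event, because it is the only one that mentions both "merchant" and "gateway".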
Why Tenancy Types and not just Tenants?
• Many ‘Big Data’ environments talk about and have multiple tenants
• Each use case is developed using the best tool for the job by an agile team
• But behind this is a wave of hidden costs relating to management and upgrades
• The long-term operability of the cluster will depend on being able to easily identify:
• Which product components are being used and can they be upgraded?
• Which data sets are required and when are they available?
• How to manage the SLAs and isolate different components with different SLAs?
• A tenancy type is defined by a common collection of tools and data sets used for functionally
similar purposes – it provides a design pattern for engineering teams to work towards
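One way to picture this definition is as a small data model in which tenants inherit their tools and data sets from a tenancy type, so the upgrade question can be answered by lookup. The sketch below is an assumption, not the platform's actual metadata model: the class names, the component lists and the "Fraud prediction" tenant are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenancyType:
    name: str
    components: frozenset  # tools shared by every tenant of this type
    data_sets: frozenset   # data sets every tenant of this type relies on


@dataclass(frozen=True)
class Tenant:
    name: str
    tenancy_type: TenancyType


def tenants_using(component, tenants):
    """List the tenants affected if `component` is upgraded."""
    return [t.name for t in tenants if component in t.tenancy_type.components]


data_warehousing = TenancyType(
    "Data Warehousing",
    frozenset({"Hive", "Tableau"}),
    frozenset({"narrative models"}),
)
decision_services = TenancyType(
    "Decision Services",
    frozenset({"Kafka", "Hive", "Spark"}),
    frozenset({"transaction history"}),
)
tenants = [
    Tenant("Shopper Insight", data_warehousing),
    Tenant("Financial Reporting", data_warehousing),
    Tenant("Fraud prediction", decision_services),  # hypothetical tenant
]
```

With this structure, `tenants_using("Spark", tenants)` names only the decision service tenant, while `tenants_using("Hive", tenants)` names all three, which is the kind of impact question the slide says must be easy to answer.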
How our Multi-Tenancy model changes with Hortonworks 3
• Our model of tenancy types is geared towards a move to Hortonworks 3
• We want to define a tenancy type that uses a collection of containers
• As opposed to the current concept of multiple components
• This allows us to better manage versions of the components
• We want to blueprint the set of containers for a tenancy type
• Then we can rapidly roll out tenants of that tenancy type
• And then (as we are on-premise) we can burst our workloads to the cloud
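The blueprint idea can be sketched as a fixed set of versioned containers that is stamped out once per tenant. Everything named here is hypothetical: the `Blueprint` and `ContainerSpec` classes, the image names and the version numbers are placeholders for illustration, not a description of any Hortonworks feature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContainerSpec:
    image: str
    version: str


@dataclass(frozen=True)
class Blueprint:
    tenancy_type: str
    containers: tuple  # the ContainerSpec set shared by every tenant


def roll_out(blueprint, tenant_name):
    """Instantiate one tenant: a named copy of every container in the blueprint."""
    return {
        f"{tenant_name}-{spec.image}": f"{spec.image}:{spec.version}"
        for spec in blueprint.containers
    }


# Hypothetical blueprint; image names and versions are placeholders.
decision_service = Blueprint(
    "Decision Services",
    (ContainerSpec("spark", "2.1"), ContainerSpec("scoring-api", "1.0")),
)
fraud = roll_out(decision_service, "fraud")
recycling = roll_out(decision_service, "recycling")
```

Because every tenant is created from the same blueprint, all tenants of a tenancy type run identical component versions, which is what makes version management and rapid rollout tractable in this model.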
Optimizing The Platform Support
• We operate a ‘metallic’ service level: Gold, Silver, Bronze
• Good architecture means being able to isolate components for maintenance
• The optimal solution is to keep each component at the lowest possible support level
• Allows us to take components down to perform upgrades, etc.
• Reduces in-house and external support costs
• We have a Hortonworks First Policy for the Hadoop Platform
• If the functionality exists in the Hortonworks Data Platform we will strive to use it over
purchasing/using another product
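The "lowest possible support level" rule can be expressed as a small calculation: a component needs only the highest level demanded by the tenants that actually depend on it. The sketch below is illustrative; the tenant names, the component names and the ordering Bronze < Silver < Gold are assumptions.

```python
from enum import IntEnum


class ServiceLevel(IntEnum):
    BRONZE = 1
    SILVER = 2
    GOLD = 3


def required_level(component, tenant_needs):
    """Lowest level a component can run at.

    That is the highest level required by any tenant that depends on it,
    or Bronze if nothing depends on it.
    """
    levels = [
        level
        for level, components in tenant_needs.values()
        if component in components
    ]
    return max(levels, default=ServiceLevel.BRONZE)


# Hypothetical tenants: (required service level, components they depend on).
tenant_needs = {
    "decision-service": (ServiceLevel.GOLD, {"kafka", "scoring-api"}),
    "reporting": (ServiceLevel.BRONZE, {"hive", "kafka"}),
}
```

In this example `kafka` must be supported at Gold because a Gold tenant depends on it, while `hive` can stay at Bronze. Isolating components so that fewer of them sit under Gold tenants is what keeps support costs down.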
An Example: The Decision Service Tenancy Type
Decision Service is a capability that is created using components from the Enterprise Data Platform (EDP) and
encompasses:
• The ability to ingest data in real time from operational systems (Attunity/Kafka/Flume/Data Capture Job)
• The ability to analyse data either on the stream as it arrives (Kafka) or historically in a database (Hive)
• The tools to allow data scientists to do machine learning (Spark, using Python and Scala ML libraries)
• The ability to publish and run machine learning models to offer a service (PMML and OpenScoring.io)
• The ability to allow other systems to access the decision service via a RESTful API
• The ability to support decision services in production - the required DR, Integration Testing, Performance
Testing, Service Transition and Governance
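To make the RESTful access point concrete, the sketch below builds (but does not send) a scoring request for one transaction. The URL layout and JSON body are modelled on the Openscoring REST convention as an assumption and should be checked against the deployed version; the host, model id and field names are hypothetical.

```python
import json
from urllib.request import Request


def build_scoring_request(base_url, model_id, record_id, arguments):
    """Build (but do not send) a POST request asking a model to score one record."""
    body = json.dumps({"id": record_id, "arguments": arguments}).encode("utf-8")
    return Request(
        url=f"{base_url}/model/{model_id}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_scoring_request(
    "http://localhost:8080/openscoring",  # assumed local test address
    "predict-fraud",                      # hypothetical model id
    "txn-0001",                           # hypothetical record id
    {"amount": 42.50, "currency": "GBP", "card_present": False},
)
```

A calling system would pass the request to `urllib.request.urlopen` and read the model's decision from the JSON response; keeping the request construction separate makes it easy to test without a running service.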
Many decision mechanisms will be individually deployed to form a complete service
[Diagram: Decision Service. Decisions shown: Intelligent Account Verification, Predict Fraud, Dynamic 3DS, Payment Recycling, Other Similar Decisions. Other labels: PMML, REST API, Operational Platform, Other Platforms, Data Ingest (Batch, Stream), Data Lake, Customer Core Data, Modelling Data, Scoring Data, Data Science, Data Profiling, Feature Engineering, Provisioning, Tools, Algorithms, Scoring Libraries, Model Management, Lifecycle Dashboard, A/B Testing, Model Health Scoring/Validation, Data Refresh, Deployment, Workflow Management, Version Control, Event Calendar]
Our vision is to optimise every single transaction balanced across Cost,
Acceptance & Risk weighted to meet customer preferences
[Diagram: Outcome Priority balanced across Acceptance, Cost and Risk. Decision levers shown: Fraud, CV2, AVS, 3DS, Retry, Route]
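The balancing idea in this vision can be sketched as a weighted score over candidate ways of processing a transaction. This is a deliberately simple illustration, not the models described in the talk: the option names, the scores and the weights are invented, and all values are assumed to be normalised to the range 0 to 1.

```python
def best_option(options, weights):
    """Pick the processing option with the highest weighted score.

    Acceptance counts as a benefit; cost and risk count as penalties.
    """
    def score(option):
        return (
            weights["acceptance"] * option["acceptance"]
            - weights["cost"] * option["cost"]
            - weights["risk"] * option["risk"]
        )

    return max(options, key=score)


# Hypothetical candidate treatments for one transaction.
options = [
    {"name": "route A, no 3DS", "acceptance": 0.90, "cost": 0.30, "risk": 0.40},
    {"name": "route A, 3DS", "acceptance": 0.80, "cost": 0.35, "risk": 0.10},
    {"name": "route B, no 3DS", "acceptance": 0.85, "cost": 0.20, "risk": 0.45},
]

# Two hypothetical customer preference profiles.
risk_averse = {"acceptance": 1.0, "cost": 0.5, "risk": 2.0}
acceptance_first = {"acceptance": 3.0, "cost": 0.5, "risk": 0.5}
```

With these numbers the risk-averse profile selects "route A, 3DS" while the acceptance-first profile selects "route A, no 3DS", showing how the same transaction can be treated differently depending on customer preferences.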
We have begun to analyze the potential customer outcomes
[Chart; values not recoverable. Series compared: Existing client solution, Hybrid Model, Pure Machine Learning. Annotation: ML model performance only, current data]
Disclaimer: These numbers are the results for only one merchant
The Technical Insights Tenancy Type
[Diagram: Operation & Security Infrastructure feeding the Enterprise Data Platform. Sources: Windows Servers; Web & File Servers; Virtualisation Servers; Linux Servers including syslog; Database Servers (Oracle, MSSQL); Firewalls & Anti-DDOS; SNMP & Other Event Traps; Physical Access Logs; CMDB & Service Now; Anti-Virus Logs; Vulnerability Scans. Platform labels: Event Capture, Event Store, Security & IT Ops Analytics Workbench. Workbench capabilities, from Beginners to Advanced: Reports, Dashboards, Investigations, Advanced Analytics, Machine Learning, Search. Consumers: Data Science, Security Single Pane Of Glass, IT Single Pane Of Glass, Third Party Security Products]
One of our live dashboards – Sensitive data obscured!
Technical Insights:
Eat Your Own Dogfood
Using our own data load
metrics to look for
technical debt and
necessary remedial work
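A minimal sketch of this kind of check is to flag load jobs whose recent runs are markedly slower than their earlier ones. The function, the job names, the durations and the 1.5x threshold below are assumptions for illustration, not the metrics or rules used on the platform.

```python
from statistics import mean


def slowing_jobs(durations_by_job, threshold=1.5):
    """Flag jobs whose recent runs are much slower than their earlier runs.

    `durations_by_job` maps a job name to run durations in minutes, oldest
    first. A job is flagged when the mean of the second half of its runs
    exceeds `threshold` times the mean of the first half.
    """
    flagged = []
    for job, durations in durations_by_job.items():
        half = len(durations) // 2
        if half == 0:
            continue  # not enough history to compare
        if mean(durations[half:]) > threshold * mean(durations[:half]):
            flagged.append(job)
    return sorted(flagged)


# Hypothetical load metrics: minutes per daily load, oldest first.
load_metrics = {
    "transactions_daily": [30, 31, 29, 55, 58, 61],
    "merchants_daily": [10, 11, 10, 11],
}
candidates = slowing_jobs(load_metrics)
```

Here only `transactions_daily` is flagged, since its recent loads average 58 minutes against 30 earlier, making it a candidate for remedial work.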
So where are we now and where do we expect to be in two years?
• Data Warehousing
• Two Live Tenants – one for Shopper
Insight and one for Financial Reporting
• We would expect around ten
narrative models and three reporting
tools to be deployed
• Decision Services
• Multiple decision services being
developed now
• We expect tens of decision
services to be deployed
• Search
• PoC Starting
• Data API
• 1 API live
• 3 more planned for the coming months
• As many as required on-going
• Technical Insights
• 15 dashboards delivered from two source
systems
• Deploying now to access hundreds of
sources and devices
• Other Tenancy Types
• More to come – we just don’t know what
they are yet
ENTERPRISE DATA PLATFORM
Who are our technology partners?
Leaders in Modern Money
Innovating In Secure Modern Data Analytics
Thank You
David M Walker (david.walker@worldpay.com)
Enterprise Data Platform Programme Director

Worldpay - Delivering Multi-Tenancy Applications in A Secure Operational Platform


Editor's Notes

  • #18 Core EDP capability that we utilise to drive out decisions. The capability is independent of any platform
  • #19 Capability to drive deployment of services. Decision no. X. Point about model deployment – planning for production grade deployment process for models ...a/b testing etc. Re-use across bus/ platforms….alexander. Built generically, not for a specific customer – e.g. SME ecom = insurance product. Smaller merchants? Risk product