ATM Refresh
Embedding QA,
Improving Quality & Reducing Costs
EuroSTAR 2009
December 2009
Or
How to reinvent your ATM lifecycle and
save £millions
2
The ATM Refresh Programme - Some Background
• The current system
• Bespoke, host-based application providing ATM acquiring, debit card authorisation and
connectivity to national debit switch (LINK)
• Bespoke, proprietary ATM application for transaction processing and alerting running on
Windows/NT
• SNA Communications
• No direct connections to VISA or Mastercard
• IBM crypto-processors
• The new system
• BASE24-atm, Release 6 Version 9 on HP NonStop Blades
• Wincor Nixdorf ProCash/NDC ATM Application for transaction processing running on
Windows/XP
• Wincor Nixdorf ProView for ATM monitoring
• Wincor Nixdorf Platform Security Agent for ATM lockdown
• IP Communications throughout
• Thales HSMs
• VISA EU and BankNet connectivity
• New PC Cores inside ATMs, along with some EPP upgrades
• New debit card host application
• Wincor Nixdorf ProView Analysis for ATM Channel Business Intelligence
3
The Public Face Of ATM Failure
Trade Press Reports
26 March 2008 - 14:17
Customers cash in on 'double your money' ATM
Hundreds of people flocked to a faulty Payzone ATM in the UK city of Hull last week after it
started dispensing twice the amount of cash keyed in for withdrawals.
29 February 2008 - 14:55
Nationwide admits ATM blunder
Nationwide Building Society has been forced to apologise to thousands of customers after a
technical glitch led to accounts not being debited when cash was withdrawn from some
ATMs in Northern Ireland.
10 April 2008 - 10:14
Danske glitch wipes out Northern Bank, National Irish and Sampo ATMs
Danske Bank says a glitch with its IBM-managed payments network resulted in customers of
its subsidiaries in three countries being unable to use their cards to withdraw money from
cash machines.
The financial impacts of ATM failure?
06 November 2008 - 13:09
RBS ATM dishes out 'free' cash
Residents of the English market town of Chorley flocked to a Royal Bank of Scotland (RBS)
ATM last week after word spread that the unit was dispensing double the amount of cash
requested.
5
So Where Does The Money Go?
• Some common assertions:
– ATM systems are difficult and expensive to change and
maintain
– The underlying technology is what drives expense.
– Manage the cost of change by not changing
• My view:
– The cost of software is small compared to the cost of the
lifecycle
– The technology is largely irrelevant in the cost of change
– The best way to manage the cost of change is to plan for
constant change.
• An explanation:
– Software pricing is elastic
– Human behaviour is the biggest driver in the cost of systems
maintenance
– Everything gets better with practice!
6
What was the scale of the problem at Barclays?
Sc = Tt * Ct * I * Rc
Sc = 5 * 8 * 5 * 20
Sc = 4000
Sc -> Test Scenarios
Tt -> Transaction Types
Ct -> Card Types
I -> Issuer End Points
Rc -> Response Codes
Maths Disclaimer:
Not all card types support all transaction types and all response codes
But the order of magnitude is relevant: 000s not 00s
Previous Manual Testing Regime
800 test scripts
23 man-days to execute... AT LEAST
... none of which address Windows/XP
environment failures from the earlier slide
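The scenario count on this slide is a Cartesian product, and the "maths disclaimer" amounts to filtering that product. A minimal sketch follows, assuming placeholder names for every dimension (the slide gives only the counts 5, 8, 5 and 20) and one invented exclusion rule purely for illustration:

```python
from itertools import product

# Placeholder labels only: the slide states the counts, not the actual names.
transaction_types = [f"txn_{i}" for i in range(5)]
card_types = [f"card_{i}" for i in range(8)]
issuer_endpoints = [f"issuer_{i}" for i in range(5)]
response_codes = [f"rc_{i:02d}" for i in range(20)]


def enumerate_scenarios(supported=lambda txn, card, issuer, rc: True):
    """Return every (txn, card, issuer, rc) combination accepted by the filter."""
    combos = product(transaction_types, card_types, issuer_endpoints, response_codes)
    return [combo for combo in combos if supported(*combo)]


# Sc = Tt * Ct * I * Rc
all_scenarios = enumerate_scenarios()
print(len(all_scenarios))  # 4000

# Hypothetical rule (not from the slides): card_7 does not support txn_4.
pruned = enumerate_scenarios(
    lambda txn, card, issuer, rc: not (card == "card_7" and txn == "txn_4")
)
print(len(pruned))  # 3900, still thousands rather than hundreds
```

Removing one unsupported card and transaction pairing drops only 100 scenarios, which is the point of the disclaimer: exclusions trim the total but do not change its order of magnitude.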
7
What was our test philosophy?
• Accept the scale of the problem!
• Embed QA throughout the life cycle – not just at the end
• Simulate AND Automate the testing
• Focus specialist resources on defect resolution and
change management instead of on repetitive test execution
• Accelerate components into integration testing
8
Current State of Play
• Streamlined and semi-automated software management
model implemented
• “Change Anything – Test Everything” Philosophy
• 4500 Transaction Test Scenarios in regression suite so far,
which are executed in a 12 hour time window
• “Smoke test” of ~450 transactions executed overnight,
every night
• Same test artefacts used by the Business in UAT
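As a rough cross-check of these figures, the sketch below converts the regression window into a throughput and compares it with the manual rate of ~5 straightforward transactions per person per hour quoted in the speaker notes. The comparison is illustrative arithmetic only and assumes that manual rate applies to every scenario.

```python
SCENARIOS = 4500            # regression suite size from the slide
WINDOW_HOURS = 12           # execution window from the slide
SMOKE_TRANSACTIONS = 450    # nightly smoke test from the slide
MANUAL_RATE = 5             # transactions per person per hour, from the speaker notes

automated_rate = SCENARIOS / WINDOW_HOURS          # scenarios per hour
equivalent_testers = automated_rate / MANUAL_RATE  # people needed to match that rate
smoke_share = SMOKE_TRANSACTIONS / SCENARIOS       # fraction of the suite run nightly

print(automated_rate)      # 375.0
print(equivalent_testers)  # 75.0
print(f"{smoke_share:.0%}")  # 10%
```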
9
What does it look like?
• Simulated ATM hardware on desktop
• Drives REAL ATM software in virtual ATM
• Transactions processed by BASE24
• Alerts processed by ProView
• Issuer Systems simulated by VersaTest
• All simulators programmatically compare
results and record outcomes in HP Quality
Centre
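The "programmatically compare results and record outcomes" step can be sketched generically. The code below is not the BRIDGE:Test, VersaTest or Quality Centre API; every name in it is hypothetical, the system under test is a stand-in function with an invented approval rule, and the response codes are illustrative values.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    name: str
    request: dict
    expected: dict


@dataclass
class Outcome:
    scenario: str
    passed: bool
    mismatches: dict


def compare(expected, actual):
    """Return {field: (expected, actual)} for every expected field that differs."""
    return {k: (v, actual.get(k)) for k, v in expected.items() if actual.get(k) != v}


def run_suite(scenarios, execute, record):
    """Execute each scenario, compare the result and hand the outcome to a recorder."""
    outcomes = []
    for scenario in scenarios:
        actual = execute(scenario.request)
        mismatches = compare(scenario.expected, actual)
        outcome = Outcome(scenario.name, not mismatches, mismatches)
        record(outcome)  # in a real harness this would write to the test management tool
        outcomes.append(outcome)
    return outcomes


def fake_authoriser(request):
    """Stand-in for the system under test: approves amounts up to 250 (invented rule)."""
    approved = request["amount"] <= 250
    return {
        "response_code": "00" if approved else "51",
        "dispensed": request["amount"] if approved else 0,
    }


results_log = []
suite = [
    Scenario("withdraw_20", {"amount": 20}, {"response_code": "00", "dispensed": 20}),
    Scenario("withdraw_500", {"amount": 500}, {"response_code": "51", "dispensed": 0}),
    Scenario("deliberate_mismatch", {"amount": 20}, {"response_code": "00", "dispensed": 40}),
]
outcomes = run_suite(suite, fake_authoriser, results_log.append)
for outcome in outcomes:
    print(outcome.scenario, "PASS" if outcome.passed else f"FAIL {outcome.mismatches}")
```

Keeping the executor and the recorder as injected functions mirrors the idea on the slide that each simulator verifies its own results while a single shared store collects the outcomes.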
10
System Diagram (component labels recovered from the figure)
• BRIDGE:Test Environment (Windows Blade Server)
• BRIDGE Sim1, BRIDGE Sim2 ... BRIDGE Sim16
• VATM1, VATM2 ... VATM16
• VersaTest Environment (Windows Blade Server)
• VersaTest Automation Server
• VISA DFS, BNET DFS, LIS5 DFS, BICI DFS, HISOI DFS, HISOA DFS
• BASE24 (HP NonStop Blade)
• Cards Host (IBM zSeries)
• (Via SNA proxy on HP)
• ATM Alerting SubSystem
• HP Quality Centre – Enterprise Test Management
11
How can we exploit this investment?
• Faster Time To Market for new changes, while reducing risk
• Demonstrate the multi-vendor capability of any ATM
application
• Extended beyond the UK ATM network to support testing
across Barclays Group.
• The ATM test tool is also being used for developing training
material for branch staff
12
Any Questions?
Enquiries To
james.tomaney@barclays.com
Barclays ATMR
Testing Partners
• VersaTest Issuer Simulator, and test automation experts
• BRIDGE:Test ATM Test Tool, and test automation experts
• ATM Domain Testing Skills

James Tomaney - Automated Testing for the ATM Channel


Editor's Notes

  • #2 In 2008 Barclays Bank radically altered its approach to testing the ATM service, in conjunction with a major programme to refresh the ATM infrastructure: moving away from bespoke software on NT in the ATM and bespoke mainframe software, and towards ProCash/NDC on Windows and BASE24-atm on HP NonStop.
    The reason for this change was a review of the previous approach against the programme’s stated objectives: to deliver a flexible, maintainable system that would deliver a rapid Time To Market for new business services. From the review, it was clear that the previous test approach would have been too slow and costly, making the system less maintainable and undermining the flexibility of the product-based solution selected. Rapid deployment of new services could therefore only have been achieved by accepting greater risk, and although lowering risk wasn’t stated as a programme objective, we took it as read!
    The underlying premise of the new approach is that QA is a behaviour and not a project phase. The approach asserts that continual, rapid, automatic re-execution of QA activities will deliver more reliable results than a one-off testing phase. This is because:
    – Each re-execution tests relatively little change, accelerating defect identification
    – Automatic execution increases the time available to testers to develop test coverage and to investigate defects
    – Automatic execution encourages testers to exercise all scenarios rather than subjectively determined sample subsets
    – Manual test phases, at the end of a project, inevitably get squeezed as delivery dates loom, leading to a reduction in test coverage
  • #4 Before undertaking anything, it’s always good to ask “why are we doing this?” Inadequate testing of any service leads directly to service interruptions. The ATM channel is probably unique for the speed with which it advertises a defect or interruption directly to our customers and to the customers of our competitors.
    <CLICK>
    All real live ATMs. Last two – took them myself.
    – HSBC at Embankment Tube – the advantage of HSBC’s global branding was that all those tourists knew exactly who HSBC is
    – Nationwide at Wembley Stadium – all that brand-building money sponsoring England Football gets NWBS the right to the only ATMs inside Wembley, and this is what they do with it.
  • #6 It is the application lifecycle that matters. Most money is spent on synchronising the activities of multiple workstreams and then repeating work in integration when system components change.
  • #7 Start with the reason for exhaustive testing.
    – ATMR supports 26 combinations of card type, transaction type and authoriser
    – Each of the 5 authorisers can generate over 40 different response codes
    – Manual testing can conduct ~5 straightforward transactions per person per hour (including verification and recording)
    – Executing the 4000 test scenarios manually will therefore take 115 man-days at a RAD cost of £28,750 (using RAD’s blended man-day rates)
    But this only covers successful transactions (from the system’s perspective) – i.e. a response is always received, even if the customer doesn’t like it! We call this “Happy Path” testing.
    Some interesting statistics on the “Unhappy Path” testing:
    – The Wincor Nixdorf hardware can generate up to 1500 event types, at least 1000 of which result from specific, low-level hardware failures that can’t be created on demand at a real ATM. These events impact help desk response and MI calculations for availability
    – Amounts from £10 to £250 dispensed using £10 and £20 notes present over 90 note mix combinations
    – There are >4000 combinations of base HTML screen, active overlay and foreign language overlay in the ATMR system
    – There are thousands (~5k) of test scenarios, many of which can lead to ATM service failures!
    Manual testing of these complex scenarios (including verification and recording) is much slower, as it can involve hardware manipulation before and after each test, and some scenarios cannot be created manually at all without damaging equipment. A single test could easily take an hour or more in some cases.
    5000 complex transactions equate to >12 man-months which, at RAD’s target blended man-day rates, costs more than £60,000. More crucially, it has a finite minimum elapsed time and requires 12 people, permanently, to keep pace with the rate of change in Windows ATMs, so that £60k is really £720k pa to regression test the ATM network manually.
    How much of this testing has to be repeated all of the time, I hear you ask? Honestly, not all of it every month, but which bits to repeat each month is a difficult question. Getting the answer to that question wrong is what delivers the results seen on the previous 2 slides.
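The happy-path effort figure in this note can be reproduced with a few lines of arithmetic. This is a sketch under two assumptions that the notes do not state outright: a 7-hour working day (chosen because it yields the quoted 115 man-days) and a £250 day rate (implied by £28,750 divided by 115).

```python
import math

HOURS_PER_DAY = 7  # assumption: not stated in the notes
DAY_RATE_GBP = 250  # implied by 28,750 / 115, not quoted directly


def manual_effort(scenarios, per_person_hour, hours_per_day, day_rate):
    """Return (man_days, cost) for executing the scenarios by hand."""
    hours = scenarios / per_person_hour
    man_days = math.ceil(hours / hours_per_day)
    return man_days, man_days * day_rate


man_days, cost = manual_effort(4000, 5, HOURS_PER_DAY, DAY_RATE_GBP)
print(man_days, cost)  # 115 28750

# Unhappy-path regression: one >£60k cycle repeated monthly by a standing team.
annual_cost = 60_000 * 12
print(annual_cost)  # 720000
```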
  • #8 You can’t wish away the complexity of systems like this, and simply reducing the scope of testing while identifying the risk in a RAIDs log somewhere is an invitation to disaster.
    QA is a mind-set – a philosophy even – not a project phase.
    Whatever we think, we don’t test the hardware – we don’t have the facilities, and we expect and accept that the vendor tests it. We test the behaviour of the software in relation to hardware events, and we test the behaviour of multiple software components integrated in a single operating environment.
    Card schemes provide standard simulation tools for use in completing their certification processes (part of attestation) – but these are focussed on demonstrating compliance and not on the exhaustive testing we need to do to prove service resilience.
    Our automation tools can work 24 hours per day, submitting transactions without breaks, automatically verifying outcomes and capturing results into the GTS dashboard in seconds. Our domain-expert test resources are now focused on identifying possible failure scenarios, building test cases, expanding coverage and resolving defects, instead of standing in front of ATMs pressing keypads and taking digital photos of the screens.
    We implemented standard test scenarios immediately with an end-to-end test environment based on the standard product software deliverables from ACI and Wincor. The harness executes daily. Then, as we determine configuration settings to meet business needs, or take delivery of customisations and CRs from the vendors, we drop the changes into the system and add new test scenarios to the harness to exercise the changes. Each new change is automatically and instantly regression tested to ensure it hasn’t broken anything.
    The Card Schemes all provide or recommend some sort of PC-based simulator to facilitate the certification process. These are not automated test tools and are not integrated with other testing facilities (such as Quality Centre). They are designed to demonstrate compliance with scheme regulations, not to exhaustively test the behaviour of the Bank’s acquiring systems.
    The simulation of each external component allows the more rapid integration of components as they become available. This reduces the risk associated with delaying integration testing until all required components are available – that is, any integration failures between components are detected earlier in the project cycle while a) there is more time to remedy them and b) the staff required are still available to the project. A long hiatus between the completion of systems testing for a component and subsequent integration testing increases the chances that the relevant staff have been redeployed before issues are discovered.
    This feature has allowed us to absorb a multi-month delay in the availability of one component, keeping the project on time and avoiding ~£3m of project cost that could have arisen from funding other teams while waiting for the delayed component.
    The early start to integration testing and the frequent, rapid re-execution of the whole cycle provide valuable MI for project managers. Trends in defect detection and resolution rates are more observable and statistically more accurate because more data points are available.
  • #9 Between April and November, working with our partners Ascert, Level Four, ACI, Wincor Nixdorf and NMQA, we have captured a range of different test packs, progressively growing coverage. Crucially, we are able to use these packs to test and re-test BASE24 and ProCash/NDC, but also to exercise our Host Systems (at the Bank and at Barclaycard), our ATM help desk service (outsourced), our Windows monitoring environment and our Operations bridge monitoring for the ATM service. We can initiate these test cycles quickly and automatically to suit the test schedules of the various interested groups, meaning that we don’t have to try to coordinate the test phases of these diverse teams and groups.
  • #10 The demo now covers ATM acquired transactions using the BRIDGE:Test tool with VERSATEST acting as the LINK Switch to authorise, and LINK acquired transactions, using the VERSATEST tool to simulate LINK and to simulate the Cards Systems host.
  • #11 Our Windows environments for the simulation tools allow access to the test harness from anywhere in the world on the Bank’s network. This supports the use of shared test facilities across the Bank and potentially extends our “testing day” by allowing groups offshore to execute tests, the results of which are then analysed by staff back on shore during their working day.
  • #12 Recent example from the ATMR project:
    A new delivery of a security agent resulted in a regression test of the whole build (300+ transactions), executed in 90 minutes. This regression test automatically compares results, including pixel-level screen images, and identified that the base font in use on the screens had subtly changed even though the application version was unmodified: the new security component didn’t recognise the Barclays-specific font, so the HTML had defaulted to the next nearest. This saved in the region of a man-week’s effort of manual acceptance testing. Indeed, the problem may not have been spotted by a human tester. While the choice of font isn’t fatal, the rapid automatic detection of a fault introduced by an unrelated change illustrates the point well.
    The group’s ATMs provide essentially the same services worldwide, yet incur the same costs in multiple locations around the world, solving and re-solving the same problems. Radical approaches to life cycle management, such as this testing model, will enable the Bank to capitalise on the investment made and reap the rewards from economy of scale.
    Efficient life cycles allow a small number of people to support many different ATM environments – VocaLink supports a BASE24 system driving ~30,000 ATMs in the UK for around 15 IADs and small Banks/Building Societies with around the same size team we use to support BOSSS to drive ~4,000 machines.
    Deployment of appropriate automation like this allows the Bank to build a streamlined application lifecycle that would allow a single team to support the entire worldwide ATM estate in a single shared environment.
    We’re able to use MFIX as it was intended, to maintain BASE24 for fixes and CSMS (applying all fixes automatically), because we can exhaustively retest the system every time.
    The project is using this facility to improve training material for branch staff.
    Good company?
    This model was deployed at Bank Of America. They had an existing NCR software problem related to Touch Screens that was costing huge amounts of money in down time and lost interchange fees, and causing a significant level of customer dissatisfaction. After 6 months NCR couldn’t recreate or fix it in the lab. The Bank implemented a soak test with this toolset that ran thousands of transactions through the software in a Virtual ATM, simulating the field usage over a period of weeks (which is the timescale over which the problem would surface in the field). Execution in the lab took 48 hours, and when the error occurred a detailed log of events was produced showing exactly what the error was. The fix took a few hours, but the Bank had carried the cost of the error for 6 months.
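The pixel-level screen comparison mentioned in the security-agent example can be illustrated with a toy version. This is a sketch only: it assumes screens are captured as equally sized grids of pixel values, and it does not represent how the actual test tool stores or compares images.

```python
def diff_pixels(baseline, candidate):
    """Return the (row, col) positions where two equally sized screens differ."""
    same_shape = len(baseline) == len(candidate) and all(
        len(a) == len(b) for a, b in zip(baseline, candidate)
    )
    if not same_shape:
        raise ValueError("screen dimensions differ")
    return [
        (r, c)
        for r, (row_a, row_b) in enumerate(zip(baseline, candidate))
        for c, (a, b) in enumerate(zip(row_a, row_b))
        if a != b
    ]


# Tiny 3x4 "screens": 1 marks a lit pixel of a glyph, 0 is background.
baseline = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
candidate = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],  # glyph rendered slightly differently, as with a substituted font
    [0, 1, 1, 1],
]

changed = diff_pixels(baseline, candidate)
print(changed)  # [(1, 2), (2, 3)]
print("PASS" if not changed else f"FAIL: {len(changed)} pixels differ")
```

An exact comparison like this flags any rendering change, however small, which is why a substituted font shows up even when a human tester would see an apparently identical screen.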