Serverless Architectures in Banking: OpenWhisk on IBM Bluemix at Santander

Presentation at IBM InterConnect on March 21, 2017.

Santander is one of the largest companies in the world, yet size is no guarantee of future survival given several challenges in the retail banking industry, primarily from disruptive new startups and a changing regulatory landscape. Success requires cutting-edge cloud computing solutions that achieve better resource utilization through automatic application scaling to match demand, together with a finer-grained cost model that helps distribute compute load at a lower cost. Learn how IBM and Santander partnered to create next-generation solutions for retail banking with the OpenWhisk open source project hosted on IBM Bluemix, which enables serverless architectures for event-driven programming.

Published in: Software

  1. 1. Serverless Architectures in Banking: Apache OpenWhisk on IBM Bluemix at Santander (IBM InterConnect 2017, March 21, 2017)
  2. 2. About the speakers
       • Daniel Krook, Software Architect/Engineer & Developer Advocate at IBM (krook@us.ibm.com)
       • Luis Enriquez, Head of Platform Engineering & Architecture at Santander Group (luis.enriquez@gruposantander.com)
  3. 3. Agenda
       1. Serverless architectures
       2. Apache OpenWhisk on IBM Bluemix
       3. Check processing overview and solution
       4. Results, conclusions, future directions
  4. 4. Santander is one of the world’s largest banks
  5. 5. Goals and results of the OpenWhisk Proof of Concept
       Goals & Principles
       • Hybrid solution
       • Greater deployment choices
       • Avoid vendor lock-in
       • Scalability and elasticity
       • Respond to workload peaks
       • Asynchronous and event-driven
       • Developer-friendly solution
       • Efficiency
       Results
       • Automated process, reducing processing time and avoiding errors
       • Elasticity, bursting into the cloud
       • Simple and easy to maintain technical solution
       • Significant cost saving potential
  6. 6. Agenda
       1. Serverless architectures
       2. Apache OpenWhisk on IBM Bluemix
       3. Check processing overview and solution
       4. Results, conclusions, future directions
  7. 7. With a serverless platform, developers focus more on code, less on infrastructure
       Bare metal → Virtual machines → Containers → Functions
       Decreasing concern (and control) over stack implementation; increasing focus on business logic
  8. 8. Serverless platforms address 12 Factors for developers
       • I. Codebase: handled by developer (manage versioning of functions themselves)
       • II. Dependencies: handled by developer, facilitated by serverless platform (runtimes and packages)
       • III. Config: handled by platform (environment variables or injected event parameters)
       • IV. Backing services: handled by platform (connection information injected as event parameters)
       • V. Build, release, run: handled by platform (deployed resources immutable and internally versioned)
       • VI. Processes: handled by platform (single stateless containers used)
       • VII. Port binding: handled by platform (actions or functions automatically discovered)
       • VIII. Concurrency: handled by platform (process model hidden and scales in response to demand)
       • IX. Disposability: handled by platform (lifecycle hidden from user, fast startup and elastic scale prioritized)
       • X. Dev/prod parity: handled by developer (developer is deployer; scope of what differs is narrower)
       • XI. Logs: handled by platform (developer writes to console.log, platform streams logs)
       • XII. Admin processes: handled by developer (no distinction between one-off processes and long-running ones)
  9. 9. Emerging workloads are a good fit for event-driven programming
       • Execute app logic in response to database change
       • Perform edge analytics in response to sensor input
       • Provide cognitive computing via a conversational bot
       • Schedule tasks according to a specific timetable
       • Invoke autoscaled mobile backend services
  10. 10. New cost models more accurately charge for compute time
       While many applications must still be deployed in an always-on model, serverless architectures provide an alternative that can result in substantial cost savings for a variety of event-driven workloads.
       Applications are billed by compute time (per millisecond) rather than reserved memory (GB/hour). This means a tighter link between cloud resources used and business operations executed.
  11. 11. Technological and business factors make serverless compelling
       Serverless architectures are gaining traction:
       • Cost models getting more granular and efficient
       • Growth of event-driven workloads that need automated scale
       • Platforms to facilitate cloud native design for developers
  12. 12. Agenda
       1. Serverless architectures
       2. Apache OpenWhisk on IBM Bluemix
       3. Check processing overview and solution
       4. Results, conclusions, future directions
  13. 13. OpenWhisk enables these serverless, event-driven workloads
       Apache OpenWhisk is a cloud platform that executes code in response to events.
       • Serverless deployment and operations model
       • Optimized utilization, fine-grained metering at any scale
       • Flexible, extensible, polyglot programming model
       • Open source and open ecosystem (Apache Incubator)
       • Ability to run in public, private, and hybrid models
  14. 14. Developers work with triggers, actions, rules, and packages
       • Data sources define events they emit as Triggers.
       • Developers map Actions to Triggers via Rules.
       • Packages provide integration with external services.
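
As a rough illustration of the programming model on this slide, an OpenWhisk Python action is a module that exposes a `main` function taking the event parameters as a dictionary and returning a dictionary. The parameter name `check_id` and the returned fields are hypothetical and not taken from the deck; creating the trigger and the rule that binds it to the action happens outside the code and is not shown.

```python
def main(args):
    """Entry point that OpenWhisk calls for a Python action.

    `args` is the dictionary of parameters carried by the triggering event.
    The `check_id` key is a hypothetical parameter used only for illustration.
    """
    check_id = args.get("check_id")
    if check_id is None:
        # Returning a dictionary with an "error" key reports the activation as failed.
        return {"error": "missing check_id"}
    return {"check_id": check_id, "status": "received"}


if __name__ == "__main__":
    # Local smoke test outside OpenWhisk.
    print(main({"check_id": "demo-001"}))
```

Because the action is a plain function over dictionaries, it can be exercised locally before it is deployed.
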
  15. 15. OpenWhisk: comparison to traditional PaaS or IaaS models
       Traditional model
       • Continuous polling often used
       • Charged even when idling
       • No auto-scaling in response to load
       Serverless model
       • Introduces event-driven programming model
       • Charges only for what is used
       • Auto-scales in response to current load
       [Diagram. Traditional: Request → Polling → Application (CF, Container, VM), with idle compute resources. Serverless: Trigger → OpenWhisk Engine → running Actions drawn from a pool of Actions (JS, Swift, Docker); deploy action within milliseconds, run it, free up resources.]
  16. 16. Agenda
       1. Serverless architectures
       2. Apache OpenWhisk on IBM Bluemix
       3. Check processing overview and solution
       4. Results, conclusions, future directions
  17. 17. Business Drivers at Santander for a Serverless Architecture (1/2)
       What value do microservices and serverless architectures provide?
       • Billing model: Compared to a PaaS offering, FaaS charges the customer based on the actual time used by the service itself. Server uptime is not billed (serverless).
       • Low complexity: Independent scalability, integration and delivery pipelines, testability, and development flows make it more streamlined and automated, resulting in less maintenance effort and savings on operations and development costs.
       • Integration capability: Provides a quick, reliable, and low-cost way to connect or relay private, public, or hybrid SOA or Cloud APIs.
  18. 18. Business Drivers at Santander for a Serverless Architecture (2/2)
       What value do microservices and serverless architectures provide?
       • However, the outcome depends on each scenario.
       • Not everything can or should rely on FaaS. For example, very active back ends and complex front-end applications would simply underperform.
       • OpenWhisk in particular is well suited to designing a web of microservices whose purpose is to relay or orchestrate other services (e.g. IoT, reactive post-processing applied to other Cloud feeds).
       • Microservices are another tool for architects to support the general IT Cloud transition, and as such should be used in conjunction with other solutions.
  19. 19. Proof of Concept: “OpenChecks” check processing
       OpenWhisk by example: service enablement and orchestration
       Scenario: This PoC shows how OpenWhisk could improve the following business process: bank clerks' manual entry of routing and account numbers when cashing Santander Bank customers’ checks.
       The purpose of this proof of concept is to show how OpenWhisk can be used for an event-driven, serverless architecture that processes the deposit of checks to a bank account using optical character recognition (OCR), replacing manual input and avoiding the associated human errors.
  20. 20. Check data parsing with OCR overview
       OCR will be used to parse the data at the bottom of the check representing:
       • The routing number
       • The account number (the deposit-from account)
       If this information is not readable or does not follow the presented format, the check will be considered invalid.
       The handwritten amount is not currently parsable, and the deposit-to account is not printed on the check itself. This data needs to be passed as metadata (that is, encoded in the file name as supplied by the bank clerk).
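
The validity rule described on this slide can be sketched as follows. This is a minimal sketch under stated assumptions, not the PoC's code: it assumes the OCR output is a string in which the routing number and the account number appear as the first two runs of digits, and it uses the standard ABA routing-number checksum (nine digits, weighted 3-7-1) as the format check, which the slide does not specify. The function names are hypothetical.

```python
import re


def routing_checksum_ok(routing: str) -> bool:
    """ABA check: 3*(d1+d4+d7) + 7*(d2+d5+d8) + (d3+d6+d9) must be divisible by 10."""
    if not re.fullmatch(r"\d{9}", routing):
        return False
    d = [int(c) for c in routing]
    total = 3 * (d[0] + d[3] + d[6]) + 7 * (d[1] + d[4] + d[7]) + (d[2] + d[5] + d[8])
    return total % 10 == 0


def parse_micr(ocr_text: str) -> dict:
    """Extract routing and account numbers from OCR text (assumed digit-run layout)."""
    groups = re.findall(r"\d+", ocr_text)
    if len(groups) < 2:
        return {"valid": False, "reason": "routing or account number not readable"}
    routing, account = groups[0], groups[1]
    if not routing_checksum_ok(routing):
        return {"valid": False, "reason": "routing number does not follow the expected format"}
    return {"valid": True, "routing": routing, "account": account}


if __name__ == "__main__":
    # Sample input; the letters stand in for MICR delimiter symbols.
    print(parse_micr("A021000021A 123456789C"))
```

A check whose digits cannot be read, or whose routing number fails the checksum, is reported as invalid rather than raising an exception, so a calling action can store the outcome either way.
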
  21. 21. Deployment model approaches evaluated
       This proof of concept evaluated three deployment models, each with its own advantages and disadvantages.
       • Cloud: deployment of the computing engine on Cloud (serverless computing)
       • Local: deployment of the computing engine on premises (sensitive data, avoid vendor lock-in)
       • Cloud bursting: deployment of the computing model on both Cloud and on-premises (total cost of ownership)
  22. 22. Logical architecture: architectural diagram as deployed on the cloud (Apache OpenWhisk on IBM Bluemix)
  23. 23. Workload split between public and private OpenWhisk and Cloudant instances
  24. 24. Workload split between public and private OpenWhisk and Cloudant instances (with hybrid scheduling)
  25. 25. “OpenChecks” OCR in a hybrid environment
       Hybrid Deployment with Cloud Bursting: Workflow Highlights
       • Checks are scanned and uploaded by the front-office clerks.
       • OpenWhisk on Bluemix resizes the scans to smaller sizes and stores them, along with the originals, in remote databases.
       • Databases are replicated to on-prem servers.
       • On-prem OpenWhisk starts the OCR, parses the checks, and stores the result in a local database.
       • Statistics, such as the total amount processed and the total number of checks that could not be parsed, are calculated alternately by the local or the remote OpenWhisk system, based on an arbitrary dispatching method. These statistics are stored in a remote database, which is continuously replicated to a local instance.
       • Clerks use a local front end to consult these statistics from the local database.
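
The slide describes the dispatching method for statistics jobs only as arbitrary. One simple possibility, shown here purely as an assumption, is a round-robin dispatcher that alternates between the local and the remote OpenWhisk system. The target labels and the `make_dispatcher` helper are hypothetical, and the sketch only decides where a job would go; it does not invoke anything.

```python
from itertools import cycle


def make_dispatcher(targets=("local", "remote")):
    """Return a function that assigns each job to the next target in rotation."""
    rotation = cycle(targets)

    def dispatch(job_id):
        # Pick the next target in the rotation for this statistics job.
        return {"job_id": job_id, "target": next(rotation)}

    return dispatch


if __name__ == "__main__":
    dispatch = make_dispatcher()
    for job in range(4):
        print(dispatch(job))
```

Any other policy, for example one based on current load, could be substituted by replacing the rotation with a different selection rule.
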
  26. 26. Proof of Concept: OpenChecks OCR
       Hybrid Deployment with Cloud Bursting: Demo Front-end Statistics Screenshot
  27. 27. Proof of Concept: OpenChecks OCR
       Hybrid Deployment with Cloud Bursting: Architectural Highlights
       • No data resides only in the cloud; there is always a local replica. If necessary (for regulatory reasons), only on-prem storage can be used.
       • Tasks are split in a hybrid way: part of the flow runs on-prem, the rest in the cloud.
       • This simulation highlights the versatile nature of OpenWhisk: it both orchestrates (e.g. database change feed handlers, statistics computation dispatcher) and processes (e.g. image resize, statistics computation). It is both the foreman and the laborer.
       • General deployment and DevOps integration are quick and should not disrupt other services.
       • The document-based CouchDB (or Cloudant) database is used in anticipation of large data volumes.
       • Communication relies entirely on HTTPS REST APIs.
  28. 28. Agenda
       1. Serverless architectures
       2. Apache OpenWhisk on IBM Bluemix
       3. Check processing overview and solution
       4. Results, conclusions, future directions
  29. 29. Cost savings estimation from a check processing use case
       Estimating that:
       • Number of USA check transactions in 2016: 60 million (source: https://www.federalreserve.gov/paymentsystems/check_govcheckprocannual.htm)
       • Average time of execution: 7 seconds
       • Allocated memory per execution: 0.256 GB
       • Cost per GB-second of execution: 0.000017 USD
       Yearly cost = number of executions × average time (in seconds) × allocated memory per execution (in GB) × cost per GB-second
       With these estimates, the total yearly cost to process every paper check in 2016 would be approximately $1,830 USD if based on OpenWhisk.
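
The estimate on this slide can be reproduced with a few lines of arithmetic. The function name `yearly_cost_usd` is illustrative; the four inputs are the figures given on the slide.

```python
def yearly_cost_usd(executions, avg_seconds, memory_gb, usd_per_gb_second):
    """Yearly cost = executions x average seconds x GB per execution x USD per GB-second."""
    return executions * avg_seconds * memory_gb * usd_per_gb_second


if __name__ == "__main__":
    cost = yearly_cost_usd(
        executions=60_000_000,       # check transactions in 2016, per the slide
        avg_seconds=7,               # average execution time
        memory_gb=0.256,             # allocated memory per execution
        usd_per_gb_second=0.000017,  # price per GB-second
    )
    print(f"{cost:,.2f}")  # prints 1,827.84
```

The result, about 1,827.84 USD, matches the slide's rounded figure of approximately $1,830.
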
  30. 30. Proof of Concept: OpenChecks OCR
       Hybrid Deployment with Cloud Bursting: Challenges and Potential Improvements
       • As of today, the OCR service can efficiently handle only the account and routing numbers on US bank checks. In the future, other technologies should be surveyed in order to handle the amounts on the checks.
       • The front end currently shows the check statistics for demonstration purposes. It should be enhanced to allow clerks to correct and validate data.
       • Beyond use by clerks at the bank, the same logic could support mobile check deposit (with the deposit-to account information inferred from the user, and the amount entered manually).
       • Another OpenWhisk function could be created in a similar fashion to integrate with Santander Bank's internal systems of record.
  31. 31. Proof of Concept: OpenChecks OCR
       Hybrid Deployment with Cloud Bursting: Challenges and Potential Improvements
       • The full cost of an on-premises cluster of virtual machines or containers to run OpenWhisk (and CouchDB) in a highly available configuration should be weighed against the lower cost of using a hosted instance in Bluemix. There is a trade-off between cost and risk, but the key point is that OpenWhisk gives you the flexibility to decide.
       • Use the new release of the Watson text analysis service rather than packaging Tesseract with MICR training data in a Docker container (or using Tesseract.js).
       • With serverless, cost is tightly bound to value gained, so code optimizations are very important at scale.
       • A lot of work has been done to make OpenWhisk actions runnable and testable locally, outside the OpenWhisk environment. This is key to an end-to-end workflow that requires versioning of functions.
       • OpenWhisk native sequences and triggers/feeds should be preferred over manual programmatic action chaining in order to support composability.
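
To make the last two points concrete, the sketch below models actions as plain functions from dictionary to dictionary and runs them in order, passing each result to the next action, which is how an OpenWhisk sequence pipes data between its actions. It is a local stand-in for testing, not the platform's implementation: the action names `parse_check` and `record_result`, the `run_sequence` helper, and the rule of stopping at the first result containing an `error` key are all assumptions made for this example.

```python
def parse_check(args):
    """Hypothetical first action: pretend to parse a scanned check."""
    name = args.get("filename", "")
    if not name:
        return {"error": "no filename"}
    return {"filename": name, "parsed": True}


def record_result(args):
    """Hypothetical second action: pretend to store the parsing outcome."""
    return {"stored": args["filename"], "parsed": args["parsed"]}


def run_sequence(actions, params):
    """Run actions in order, feeding each output to the next one."""
    result = params
    for action in actions:
        result = action(result)
        if "error" in result:
            # Stop at the first failing action (local convention for this sketch).
            break
    return result


if __name__ == "__main__":
    print(run_sequence([parse_check, record_result], {"filename": "check-001.png"}))
```

Keeping each action free of direct calls to the next one is what lets the same functions be composed by a native sequence on the platform and exercised by a loop like this one on a developer machine.
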
  32. 32. Why use OpenWhisk on IBM Bluemix?
       • Provides a rich ecosystem of building blocks from various domains.
       • Supports an open ecosystem that allows sharing microservices via packages.
       • Takes care of low-level details such as scaling, load balancing, logging, and fault tolerance.
       • Hides infrastructural complexity, allowing developers to focus on business logic.
       • Allows developers to compose solutions using modern abstractions and chaining.
       • Charges only for code that runs.
       • Is open and designed to support an open community.
       • Supports multiple runtimes and arbitrary binary programs encapsulated in Docker containers.
  33. 33. Try Apache OpenWhisk on IBM Bluemix: bit.ly/ibm-ow
  34. 34. Thank you Our purpose is to help people and businesses prosper. Our culture is based on the belief that everything we do should be