
Flink Forward San Francisco 2018: Andrew Gao & Jeff Sharpe - "Finding Bad Acorns"

Within fintech, catching fraudsters is one of the primary opportunities for us to use streaming applications to apply ML models in real time. This talk reviews our journey to bring fraud decisioning to our tellers at Capital One using Kafka, Flink, and AWS Lambda. We will share what we learned about common problems such as custom windowing, breaking down a monolithic app into small queryable-state apps, feature engineering with Jython, dealing with back pressure from combining two disparate streams, model and feature validation in a regulatory environment, and running Flink jobs on Kubernetes.

Published in: Technology

  1. 1. FINDING BAD ACORNS ANDREW GAO & JEFF SHARPE FLINK FORWARD 2018
  2. 2. ANDREW GAO JEFF SHARPE
  3. 3. Developing a Fraud Defense Platform Fraud Defense at the Teller Using Flink Our journey to build a Fraud Decisioning Platform and use Flink to build out the use cases
  4. 4. DEVELOPING A FRAUD DEFENSE PLATFORM
  5. 5. OUR USERS Fraud Operator Customer Data Scientist Data Analyst Engineer Product Owner
  6. 6. OUR USERS Fraud Operator Customer Data Scientist Data Analyst Engineer Product Owner
  7. 7. ARCHITECTURE DATA ACTIONS MAGIC!
  8. 8. RUNNING ON
  9. 9. RUNNING ON
  10. 10. PROS • Community support for Docker/Kube • Resilient • Easy to tear down and bring back • Maximizing resource efficiency CONS • Maintaining your own Kubernetes solution • Containing blast radius • Edge cases when combining a number of technology solutions. Developing on Kubernetes has been challenging but very rewarding
  11. 11. FRAUD DEFENSE AT THE TELLER
  12. 12. A FLINK MONOLITH • Problem: Develop a stream processing workflow for two legacy batch data sources • First Attempt: Do everything in Flink and take advantage of Flink Connected Streams
  13. 13. Using Flink operators to build our application workflow [diagram with steps 1 to 4]
  14. 14. PROS • Cheap • Not a lot of Code/Config • Scalability / Availability • Deployments are a breeze CONS • Not truly stateless • Start-up time AWS Lambda is a good fit for our use case and works well with our underlying technologies
  15. 15. Using Flink operators to build our application workflow [diagram with steps 1 to 4]
  16. 16. 90 Day Storage Window CUSTOM WINDOWS FOR OPTIMIZATION AND PORTABILITY 30 Day Virtual View 90 Day Filtered View
  17. 17. CUSTOM WINDOWS FOR OPTIMIZATION AND PORTABILITY Most-Recent-Beyond-24-Hours Window 24 Hour Offset Dynamic Window
  18. 18. Using Flink operators to build our application workflow [diagram with steps 1 to 4]
  19. 19. USING JYTHON TO BRIDGE THE GAP TO DATA SCIENTISTS [Diagram labels: Flink Jython Adapter, Windows, Data, .py scripts, Features]
  20. 20. GITFLOW AND JYTHON IMPROVE TRACEABILITY [Diagram labels: Feature JAR v1.0.42, JUnit Tests, Pull Request, Merge, Build, Develop, Denied, Failed, Maven Import, JUnit Tests, Build, Flink Job JAR, Commit]
  21. 21. Using Flink operators to build our application workflow [diagram with steps 1 to 4]
  22. 22. FEATURES EXIST TO FEED MODELS [Diagram labels: Feature, Feature, Model, Model, Score] H2O, TensorFlow, Seldon (whatever)
  23. 23. BREAKING UP THE MONOLITH • Problem: Back Pressure leading to Delayed Transactions • Solution: Break up the monolith Flink App into small Queryable State Apps
  24. 24. CHIPMUNKS
  25. 25. FEATURES USED • Connected Streams • Flink Keyed State • Checkpointing/Savepointing • Queryable State ISSUES • Flink Versioning (FLINK-7783, FLINK-8487) • Keyed Source Function • Kafka Offsets. We had a lot of fun and success using Flink, but not without a few hiccups
  26. 26. Developing a Fraud Defense Platform Fraud Defense at the Teller Using Flink Our journey to build a Fraud Decisioning Platform and use Flink to build out the use cases QUESTIONS?
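
Slides 12 and 25 mention Flink Connected Streams and keyed state as the way two data sources were combined in one job. The deck does not show the operator itself, so the following is only a plain-Python sketch of the general idea: two inputs share per-key state, and a transaction is held back until reference data for the same key has arrived. The class, field names, and event layout (`ConnectedEnricher`, `profile`, `pending`, the `(source, key, payload)` tuples) are hypothetical, and none of this is the Flink API.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, Iterator, List, Tuple


@dataclass
class KeyState:
    """Per-key state shared by both inputs (hypothetical layout)."""
    profile: Dict[str, Any] = field(default_factory=dict)
    pending: List[Dict[str, Any]] = field(default_factory=list)


class ConnectedEnricher:
    """Toy stand-in for a two-input operator over keyed state."""

    def __init__(self) -> None:
        self._state: Dict[str, KeyState] = defaultdict(KeyState)

    def on_profile(self, key: str, profile: Dict[str, Any]) -> List[Dict[str, Any]]:
        # Store the reference data, then release any transactions that were waiting.
        state = self._state[key]
        state.profile = profile
        flushed = [{**txn, **profile} for txn in state.pending]
        state.pending.clear()
        return flushed

    def on_transaction(self, key: str, txn: Dict[str, Any]) -> List[Dict[str, Any]]:
        # Without reference data for this key, buffer instead of emitting.
        state = self._state[key]
        if not state.profile:
            state.pending.append(txn)
            return []
        return [{**txn, **state.profile}]


def run(events: Iterable[Tuple[str, str, Dict[str, Any]]]) -> Iterator[Dict[str, Any]]:
    """Route interleaved (source, key, payload) events to the matching handler."""
    op = ConnectedEnricher()
    for source, key, payload in events:
        handler = op.on_profile if source == "profile" else op.on_transaction
        yield from handler(key, payload)
```

In this sketch the `pending` buffer is unbounded. A real job would need to cap or expire it, since a slow second stream makes the buffer grow while transactions wait.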

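Slides 16 and 17 name several custom windows (a 90 day storage window with 30 day and filtered views, and a "Most-Recent-Beyond-24-Hours" window with a 24 hour offset) without defining them. One possible reading, sketched below in plain Python rather than with Flink's windowing API, is that events are stored once for 90 days and narrower views are derived by filtering at read time. The class `StorageWindow`, its method names, and the interpretation of "most recent beyond 24 hours" as "the latest event at least 24 hours old" are all assumptions.

```python
from bisect import bisect_right
from datetime import datetime, timedelta
from typing import Callable, List, Optional, Tuple

Event = Tuple[datetime, dict]  # (event time, payload), kept sorted by time


class StorageWindow:
    """Holds events for a retention period and derives views on demand."""

    def __init__(self, retention: timedelta = timedelta(days=90)) -> None:
        self.retention = retention
        self._events: List[Event] = []

    def add(self, ts: datetime, payload: dict, now: datetime) -> None:
        # Insert in time order, then evict anything older than the retention period.
        times = [t for t, _ in self._events]
        self._events.insert(bisect_right(times, ts), (ts, payload))
        cutoff = now - self.retention
        self._events = [e for e in self._events if e[0] >= cutoff]

    def view(
        self,
        now: datetime,
        span: timedelta,
        predicate: Callable[[dict], bool] = lambda payload: True,
    ) -> List[Event]:
        # A "virtual view": a shorter span and/or a filter over the same stored events.
        start = now - span
        return [e for e in self._events if start <= e[0] <= now and predicate(e[1])]

    def most_recent_beyond(
        self, now: datetime, offset: timedelta = timedelta(hours=24)
    ) -> Optional[Event]:
        # Latest event that is at least `offset` old, or None if there is none.
        limit = now - offset
        older = [e for e in self._events if e[0] <= limit]
        return older[-1] if older else None
```

Storing once and filtering per view avoids keeping a separate copy of the data for each window length, which is one way to read the "optimization and portability" in the slide title.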
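
Slides 19 and 20 describe features written as `.py` files and run through a Jython adapter, but the adapter and the feature scripts are not shown. As a rough illustration only, a feature could be a small Python function that takes one window of transactions and returns a value. The registry, the `feature` decorator, the feature names, and the `amount` field are hypothetical. The code avoids Python 3 only syntax because Jython implements Python 2.7.

```python
FEATURES = {}


def feature(name):
    """Register a function as a named feature (hypothetical decorator)."""
    def register(func):
        FEATURES[name] = func
        return func
    return register


@feature("txn_count")
def txn_count(window):
    # Number of transactions in the window.
    return len(window)


@feature("total_amount")
def total_amount(window):
    # Sum of transaction amounts in the window.
    return sum(txn["amount"] for txn in window)


@feature("max_amount")
def max_amount(window):
    # Largest single amount, or 0 for an empty window.
    return max([txn["amount"] for txn in window] or [0])


def compute_features(window):
    """Evaluate every registered feature against one window of transactions."""
    return dict((name, func(window)) for name, func in sorted(FEATURES.items()))
```

Keeping each feature a pure function of its window makes it straightforward to unit test in isolation, which fits the test and build steps listed on slide 20. The resulting dictionary is the kind of input a scoring model (slide 22) would consume.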