October 2019
Predicting Banking Customer Needs with
an Agile Approach to Analytics in the Cloud
Moneta has repeatedly been recognized as the most innovative bank on the Czech market. This is due in large part to their strategy of completely shifting to the cloud and using data and advanced analytics to innovate the customer experience with use cases ranging from real-time recommendations to fraud detection.

In this talk, we’ll share how we migrated to the cloud to create an agile environment for analytics and AI. From rapid prototyping machine learning use cases to moving models into production, core to this approach was building a unified platform for data and analytics on Apache Spark, Databricks and AWS. Discussion topics include:

Moneta’s strategy and roadmap for moving to the cloud and creation of the data squad
Overview of use cases including ATM/branch location optimization using geo-data, digital channel attribution, identity fraud detection, etc.
Deep dive into the use of digital behavioural data (web, mobile app, internet banking) and offline transactions to understand and predict customer needs in near-real time using Spark MLlib
Approach to building the agile analytics platform and the specific challenges of using the cloud in a financial institution

Predicting Banking Customer Needs with an Agile Approach to Analytics in the Cloud

  1. 1. October 2019 Predicting Banking Customer Needs with an Agile Approach to Analytics in the Cloud
  2. 2. Who is presenting today …
     Milan Berka
     - Machine learning engineer at DataSentics, working for Moneta’s DataSquad
     - Spark-certified developer
     - Roles:
       - Building the analytical platform
       - Productionalizing the use cases
       - Evangelizing Spark across the company
     - milan.berka@datasentics.com, www.linkedin.com/in/milan-berka/
     Jakub Mašek
     - Leader of DataSquad at MONETA
     - Experienced data science manager
     - Roles:
       - Partnering with the different departments across the bank
       - Helping them find ML opportunities
       - Managing the process
     - jakub.masek@moneta.cz, www.linkedin.com/in/jakub-mašek-19631155
  3. 3. Agenda
     Background:
     • Who is MONETA Money Bank and what is the role of DataSentics
     • Moneta’s journey into the cloud
     • Creation of Data Squad
     Building the analytical platform:
     • Setting up an analytical environment in the cloud fully utilizing AWS and Databricks
     • Hurdles along the way
     Use-cases:
     • Utilizing online data in digital marketing and customer value management
     • Optimization of branches/ATM
     Next steps, Q&A
  4. 4. Moneta Money Bank - Czech bank for Czech people
     § Major Czech banking institution
     § 4th in size, 1st in innovation
     § 1 million clients; 181 branches; 650 ATMs
     § 3,000 employees
     § Undergoing digital transformation
     § Collecting innovation awards
     § Smart Banka (mobile app)
     § Digital products
     § Migration to the cloud
  5. 5. … almost forgot to mention „Tom“ - our advertising star
  6. 6. DataSentics - European Data Science Center of Excellence based in Prague
     Make data science and machine learning have a real impact on organizations across the world - demystify the hype and black magic surrounding AI/ML and bring to life transparent production-level data science solutions and products delivering tangible impact and innovation.
     - Machine learning and cloud data engineering boutique
     - Helping customers build end-to-end data solutions in the cloud
     - Incubator of ML-based products
     - 50 specialists (data science, data/software engineering)
     - Partner of Databricks & Microsoft
  7. 7. Moneta and its journey to the cloud 2018 2019 2020 2021 10% cloud-based 30+% cloud-based 50+% cloud-based Optimal cloud hosting Growing Platform as a Service • Primary Datacenter migration • Cloud design & initiation • First set of applications migrated to Amazon Cloud • PaaS, SaaS and Containers • Automation embedded into the key processes • Second Datacenter migration • AS400 refresh/hosting • Software and Infrastructure harmonization • Platform as a Service, implemented for the selected capabilities • Use the most optimal hosting strategy for each application • Further infrastructure and application optimization • Hosted fixed telephony • Software as a Service implemented for the selected capabilities
  8. 8. Birth of Datasquad as a new analytical DNA supporting the cloud journey and making „digital“ real
     Old analytical world
     - Tools:
       - On-premise Oracle data warehouse with limited computational power
       - On-premise SAS for modelling
     - Data: mainly offline (transactions, …)
     New analytical world
     - Tools:
       - Cloud-based, elastic and scalable - unlimited resources
       - Data in Datalake
       - Spark, Python, R
     - Data:
       - offline (internal data)
       - online (web-browsing data, digital marketing data, …)
  9. 9. Datasquad is pioneering the new analytical world DATALAKE PLATFORM DATA TEAM EVANGELIZATION & SERVICE DATA SCIENCE SOLUTIONS - POC; MVP - Products - Frameworks - onboarding - Evangelize Spark and new technologies
  10. 10. Building the analytical platform
      Main goal: utilize cloud services as much as possible
      Technology:
      § Storage: AWS S3 with auto-encryption
      § ETL: AWS Glue
      § Access management: AWS IAM + ADFS
      § Analytical service: Databricks
      § Security measures: AWS S3 auto-encryption, AWS EBS auto-encryption, Databricks SSO, Databricks without access to the internet, hashing of all sensitive data
  11. 11. Analytical platform in the cloud
  12. 12. Datalake structure
      Data:
      § Adform data (terabytes)
      § Web data (terabytes)
      § Geo-data (gigabytes)
      § Branches/ATM data (gigabytes)
      § Onboarding/fraud data (gigabytes)
      § Transactions (terabytes)
  13. 13. Use-cases
      “Online” data: web analytics data (AdobeAnalytics/GA), campaign data (Adform), real estate market data
      “Offline” data: branch/ATM performance, sales data, onboarding data, CVM data
      Feature Store
      Stories: CVM story, digital story, risk story, branch/ATM story, fraud/AML story
  14. 14. Use-cases
      “Online” data: web analytics data (AdobeAnalytics/GA), campaign data (Adform), real estate market data
      “Offline” data: branch/ATM performance, sales data, onboarding data, CVM data
      Feature Store
      Stories: CVM story, digital story, risk story, branch/ATM story, fraud/AML story
  15. 15. If we look at a typical customer journey for a consumer loan, we see a relevant touchpoint gap, an opportunity for us to address … and we already have a plan in motion to address this opportunity
      Digital Story:
      1. Digital marketing cost analysis
      2. Moneta Ad Quality
      3. Ad targeting users in „think“ phase
      4. „Think“ phase predictors in CVM campaigns
  16. 16. If we look at a typical customer journey for a consumer loan, we see a relevant touchpoint gap, an opportunity for us to address … and we already have a plan in motion to address this opportunity
      Digital Story:
      1. Digital marketing cost analysis
      2. Moneta Ad Quality
      3. Ad targeting users in „think“ phase
      4. „Think“ phase predictors in CVM campaigns
  17. 17. USE CASE 1: Digital marketing cost analysis
      → WE HAVE PROVEN THAT DISPLAY ADS DRIVE SALES INDIRECTLY; THERE IS OBVIOUS POTENTIAL IN THE „THINK“ PHASE
      DATA WE USED
      • Advertising data (which user saw or interacted with our ads, on which specific website/page/context, for how long, and at what cost)
      • Moneta website behavior
      • Marketing costs
      WHAT WE DID
      • We implemented an attribution model to prove how online ad impressions (not clicks!) drive sales. An attribution model shows how each marketing channel drives conversions. Here we wanted to see what contribution each channel makes to closing consumer loans.
      NEXT STEPS
      • Incrementally reallocate more budget to Online Ads (upper funnel, think phase) and evaluate the impact on efficiency
      BUSINESS CASE
      • Increase digital sales for the same media spending through a better split between the Online Ads and Search Marketing channels
      Channel                      Costs (units)   Cost efficiency
      Performance - Adform         1               11.3
      Brand - Adform               17              6.6
      Performance - remarketing    23              2.4
      Performance - display        26              1.2
      Performance - search (1)     115             1
      Performance - social         0.75            0.5
      Brand - youtube              0.4             0.18
      (1) Performance - search was chosen as the reference, with a cost efficiency ratio of 1
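The attribution idea on slide 17 can be illustrated with a minimal sketch. It assumes the simplest possible rule, linear (equal-credit) attribution, which is not necessarily the model the team used. The journeys, channel names and costs are made-up toy inputs, not Moneta data, and the function names are hypothetical. Efficiency is normalized against a reference channel, as the slide does with search.

```python
from collections import defaultdict


def linear_attribution(journeys):
    """Split each conversion equally among the channels seen before it."""
    credit = defaultdict(float)
    for touchpoints, converted in journeys:
        if not converted or not touchpoints:
            continue  # only converting journeys hand out credit
        share = 1.0 / len(touchpoints)
        for channel in touchpoints:
            credit[channel] += share
    return dict(credit)


def relative_efficiency(credit, costs, reference):
    """Attributed conversions per cost unit, scaled so the reference equals 1."""
    per_cost = {ch: credit.get(ch, 0.0) / costs[ch] for ch in costs}
    base = per_cost[reference]
    return {ch: value / base for ch, value in per_cost.items()}


# Hypothetical toy data: (ordered touchpoints, did the user close a loan?)
journeys = [
    (["display", "search"], True),
    (["search"], True),
    (["display"], False),
    (["display", "display", "search"], True),
]
costs = {"display": 1.0, "search": 4.0}  # hypothetical cost units

credit = linear_attribution(journeys)
efficiency = relative_efficiency(credit, costs, reference="search")
```

In this toy setup display impressions receive credit even when the last touch was search, which is the kind of indirect effect the slide describes.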
  18. 18. USE CASE 2: Moneta Ad Quality
      → DIFFERENT COST PER VISIBLE MINUTE ACROSS DIFFERENT WEBSITES; WE CAN INCREASE AD VISIBILITY TO USERS IN THE THINK PHASE
      DATA WE USED
      • Advertising data (which user saw or interacted with our ads, on which specific website/page/context, for how long, and at what cost)
      WHAT WE DID
      • We see an ENORMOUS difference in the visible time of online ads. The cost per visible minute online ranges from 15 to 35 CZK in …
      NEXT STEPS
      • Create an engine to optimize the buying of online ads (buy more visible ads)
      BUSINESS CASE
      • We should be able to buy at least 20% more media time for the same budget
      Analytical output - Cost per visible minute
      → ADJUSTING ADFORM BY DISADVANTAGING DOMAINS WITH EXPENSIVE VISIBLE MINUTES
      API / Quality model
      Adform implementation - multipliers:
      Domain            Multiplier
      autoweb.cz        0.75
      autozine.cz       0.8
      autozive.cz       0.9
      avizo.cz          0.85
      babinet.cz        0.95
      babyweb.cz        0.65
      banger.cz         0.85
      banky.cz          0.85
      bazarbox.cz       0.7
      behani.cz         0.85
      bejvavalo.cz      0.85
      bezrealitky.cz    0.65
      biatlonmag.cz     0.8
      biginzerce.cz     0.7
      bike-mania.cz     0.85
      ...               ...
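The cost-per-visible-minute metric from slide 18 can be sketched as follows. The slide does not say how multipliers are derived from that metric, so the `bid_multiplier` rule (ratio to the cheapest domain, clipped at a floor) is purely an assumption for illustration. The domains and totals are hypothetical, chosen only so that the costs span the 15 to 35 CZK range quoted on the slide.

```python
def cost_per_visible_minute(spend_czk, visible_seconds):
    """Media cost divided by the time the ads were actually visible."""
    return spend_czk / (visible_seconds / 60.0)


def bid_multiplier(cost, cheapest, floor=0.65):
    """Hypothetical rule: scale bids by cheapest/cost, never below `floor`."""
    return max(floor, min(1.0, cheapest / cost))


# Hypothetical per-domain totals: (spend in CZK, visible seconds)
stats = {
    "example-a.cz": (1500.0, 6000.0),
    "example-b.cz": (3500.0, 6000.0),
    "example-c.cz": (2000.0, 6000.0),
}

cost_per_min = {
    domain: cost_per_visible_minute(spend, seconds)
    for domain, (spend, seconds) in stats.items()
}
cheapest = min(cost_per_min.values())
multipliers = {
    domain: round(bid_multiplier(cost, cheapest), 2)
    for domain, cost in cost_per_min.items()
}
```

Domains with expensive visible minutes end up with a multiplier below 1, which is the direction of adjustment the slide describes; the exact mapping used in production is not given in the deck.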
  19. 19. Branch Story (181 branches)
      Moneta needs to independently evaluate every single locality or branch network across the country …
      • Locality (L) attractiveness is given by the surrounding points of interest (within 200 meters of L)
      • To measure attractiveness, the weights of the individual points of interest need to be set
      • MONETA wants to compare localities in terms of a business KPI - possible bank performance
      • Total attractiveness of the measured point is given by the sum of the partial weights
      • Two possible scenarios for setting the weights:
        1. By expert (e.g. Bank 50; Bus station 15 …), giving a dimensionless index
        2. Data science approach (machine learning): using internal data to set the KPI and obtaining interpretable results
      v Assumption v Target variable v Approach
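The expert-weight scenario from slide 19 (attractiveness as the sum of the weights of points of interest within 200 m) can be sketched as below. Only the two weights, Bank 50 and Bus station 15, come from the slide. The coordinates, the POI list and the function names are hypothetical, and the haversine distance is one reasonable choice rather than the team's documented method.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def attractiveness(locality, pois, weights, radius_m=200.0):
    """Sum the weights of all points of interest within radius_m of the locality."""
    lat, lon = locality
    return sum(
        weights.get(kind, 0.0)
        for kind, poi_lat, poi_lon in pois
        if haversine_m(lat, lon, poi_lat, poi_lon) <= radius_m
    )


weights = {"bank": 50.0, "bus_station": 15.0}  # expert weights from the slide

# Hypothetical points of interest: (kind, latitude, longitude)
pois = [
    ("bank", 50.0810, 14.4280),         # a few tens of meters away
    ("bus_station", 50.0812, 14.4285),  # a few tens of meters away
    ("bank", 50.0900, 14.4500),         # well outside the 200 m radius
]
locality = (50.0811, 14.4282)

score = attractiveness(locality, pois, weights)
```

In the data science scenario the weights would be learned from internal performance data instead of being set by hand; the summation step stays the same.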
  20. 20. Branch Story use case
      → PRAGUE - EXPOSED AREAS BY PREDICTED PERFORMANCE INDEX; WE CAN PREDICT PERFORMANCE IN ANY LOCALITY IN CZ
      DATA WE USED
      • Geospatial data - points of interest
      • Population statistics
      • Internal data - performance of our existing branches; costs; # FTEs; ATM performance
      WHAT WE DID
      • We wanted to evaluate every single location in CZ in terms of footfall. The closest equivalent to footfall is the visitors' rate, which is measured for only 15% of our network. But the visitors' rate is strongly correlated with a business KPI, the performance rate, which was finally used as a proxy variable for our model. We are now able to predict the possible banking performance of any observed location.
      MODEL VARIABLES
      • # of transportation in 200m
      • # of food in 200m
      • # of competitors and highly exposed areas
      • City population
  21. 21. Use-case deep dive: DSID = Enabler for the Digital attribution model
      Problem: we have many identifiers of a person/client (internal id, phone, website cookie, Adform cookie), which show up at different times in different places. How do we connect all of these into a single ID?
      Example identifiers: I1 I2 I3 (internal), W1 W2 W3 W4 W5 (website), A1 A2 A3 (Adform)
  22. 22. Use-case deep dive: DSID = Enabler for the Digital attribution model Answer: GraphFrames!
  23. 23. Use-case deep dive: DSID = Enabler for the Digital attribution model
      df1:
      WebsiteID   InternalID
      W1          I1
      W2          I1
      W3          NULL
      df2:
      WebsiteID   AdformID
      W1          A1
      W2          A2
      W3          A3
      df3:
      InternalID  Phone
      I1          999999
      I2          999999
      I3          019645

      from pyspark.sql.functions import col

      # not_fake is the slide's own helper (its definition is not shown)
      df3 = df3.filter(not_fake(col('Phone')))

      # Turn each mapping table into (src, dst) edges
      e1 = df1.select(col('WebsiteID').alias('src'), col('InternalID').alias('dst'))
      e2 = df2.select(col('WebsiteID').alias('src'), col('AdformID').alias('dst'))
      e3 = df3.select(col('InternalID').alias('src'), col('Phone').alias('dst'))
      df = e1.union(e2).union(e3).distinct()
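  24. 24. Use-case deep dive: DSID = Enabler for the Digital attribution model
      src   dst
      W1    I1
      W2    I1
      W3    NULL
      W1    A1
      W2    A2
      W3    A3
      I3    019645

      from graphframes import GraphFrame

      # Assumption: rows with a NULL dst (W3 above) carry no link and are dropped,
      # consistent with the result slide, which has no NULL vertex
      edges = df.filter(col('dst').isNotNull())
      vertices = (edges.selectExpr('src AS id')
                  .union(edges.selectExpr('dst AS id'))
                  .distinct())
      g = GraphFrame(vertices, edges)
      # connectedComponents() needs a checkpoint directory (sparkContext.setCheckpointDir)
      df_connected = g.connectedComponents()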
  25. 25. Use-case deep dive: DSID = Enabler for the Digital attribution model
      id       Component
      W1       1
      W2       1
      W3       2
      I3       3
      I1       1
      A1       1
      A2       1
      A3       2
      019645   3
      plus further adjustments:
      • filter out business clients
      • split the groups that contain two or more internal ids
      • …
      = DSID
      Statistics:
      - Number of vertices (ids): 14 969 170
      - Number of edges: 30 029 363
      - Running time: ~20 min
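To check the grouping from slides 24 and 25 on a small sample without a Spark cluster, a single-machine union-find can stand in for GraphFrames. This is a swapped-in technique for illustration, not what the deck runs at scale, and the function name is hypothetical. The edges are the ones shown on slide 24; skipping rows with a missing endpoint is an assumption.

```python
def connected_components(edges):
    """Group identifiers that are linked, directly or transitively, by edges."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for src, dst in edges:
        if src is None or dst is None:
            continue  # assumption: a missing endpoint carries no link
        root_src, root_dst = find(src), find(dst)
        if root_src != root_dst:
            parent[root_dst] = root_src

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())


# Edges from slide 24 (None stands for NULL)
edges = [
    ("W1", "I1"), ("W2", "I1"), ("W3", None),
    ("W1", "A1"), ("W2", "A2"), ("W3", "A3"),
    ("I3", "019645"),
]
components = connected_components(edges)
```

Each resulting set corresponds to one component on slide 25, before the further adjustments that turn components into DSIDs.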
  26. 26. Next steps
      - Major goal: continue democratizing the platform; the ultimate goal is to have a self-serving data platform
      - Continue with the use-cases and move them to production
      - Implement a company-wide feature store
      - Employ new technologies (in particular Spark Structured Streaming)
  27. 27. Question How many members does Data Squad have?
  28. 28. 5.5 (3 from Moneta, 2.5 from DataSentics)
  29. 29. Wrap up
      Even with a small team you can do big things …
      To achieve this, you need a supportive environment, and you need to be disruptive to drive change and show the added value, to prove that … „data is really the new oil for your company“
      Safety always first
      Data science is about data AND science. Doing science always involves dead ends, so be patient and keep going!
  30. 30. Thank you for your attention