CONTINUOUS EMBEDDING SPACES
for BANK TRANSACTION DATA
Ali Batuhan Dayıoğlugil (1) and Yusuf Sinan Akgül (2)
(1) Yeditepe University, Cybersoft R&D Center, İstanbul, Turkey
(2) Gebze Technical University, Kocaeli, Turkey
ISMIS 2017, WARSAW
28/07/2017 Dayioglugil A. B., Akgul Y. S. Page 2 of 18
OUTLINE
● Introduction & Motivations
● Modern NLP Techniques
● Purpose of the Work
● An NLP Approach for Bank Data
● Experiments
● Future Work
Introduction
● Growing data: today's treasure
● Health, e-commerce, banking, insurance, games...
● Biggest challenge is extracting valuable information
● Behavior recognition
● Forecasting
● Detecting abnormal activities
Introduction
In the case of banks:
● Millions of transactions every day
● Processing of data:
● Rule-based systems
● Domain experts
● Complicated relations and patterns between transactions of the same and of different customers are missed (limited capacity)
● As a result:
● Less successful product offers
● Economic loss due to undetected fraud
● Customer dissatisfaction
Motivations
● New era in banking: Smart systems
– Learning models
– Automatically discovering hidden patterns
– Faster than human decisions
– Less prone to human errors
● Essential tasks for smart systems:
– Fraud Detection
– New product offers
– Customer behavior analysis
Modern NLP Techniques
● Representation of words by their hidden features:
● Continuous embedding spaces
– Fast to process, highly promising, domain independent, fully unsupervised...
● Application fields:
– Language modeling, machine translation
Nature of Bank Data
● Specifications:
– Structured and ordered (Row-Column)
– Large variety of attributes:
● Demographic, Transactional, Logs, Engineered Features
– Varied customer behavior
– Fast, Cumulative, High Volume
– Lack of labeled data
● Role of Domain Experts
– Segmentation
– Campaign Planning
– Customer Propensity & New product design
– Outlier Identification
Nature of Bank Data
● Popular techniques for financial data extraction:
– Rule-based approaches
● Difficult to keep up to date
● Hard to add/remove rules regularly
– Supervised machine learning methods
● Incapable of detecting deep relations in data
● Require labeled data
– Domain Experts
● Subjective decisions
● Prone to human errors
● Not enough resources to process all data
Purpose of the Work
“Predicting customer behavior using transaction and demographic data without any manual labeling”
● Explaining deep relations in data
● Objective approach
● Up-to-date models
● Decisions less dependent on domain experts
NLP Design over Bank Data
● Similarities:
– Time series transactions ~ Sentences in a context
– Attributes of transactions ~ Words in a sentence
● Continuous embedding spaces have produced successful results with libraries such as Word2Vec, GloVe and FastText
● Machine learning models need numerical inputs
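The analogy on this slide (a customer's time-ordered transactions as sentences in a context, and the attributes of one transaction as the words of a sentence) could be sketched as follows. This is only an illustration under assumptions: the column names, the values and the `attribute=value` token format are hypothetical and do not come from the bank data used in the work.

```python
from collections import defaultdict

# Hypothetical transaction rows; column names and values are illustrative only.
transactions = [
    {"customer": "c1", "time": 2, "channel": "ATM", "type": "withdrawal", "segment": "retail"},
    {"customer": "c1", "time": 1, "channel": "branch", "type": "deposit", "segment": "retail"},
    {"customer": "c2", "time": 1, "channel": "web", "type": "transfer", "segment": "SME"},
]

# Attributes that become "words" (an assumed subset for the example).
ATTRIBUTES = ["channel", "type", "segment"]


def to_sentence(row):
    """Turn one transaction into a list of 'words', one per attribute."""
    # Prefixing with the attribute name keeps equal values of different columns distinct.
    return [f"{attr}={row[attr]}" for attr in ATTRIBUTES]


def to_corpus(rows):
    """Group the sentences per customer, ordered by time."""
    by_customer = defaultdict(list)
    for row in sorted(rows, key=lambda r: (r["customer"], r["time"])):
        by_customer[row["customer"]].append(to_sentence(row))
    return dict(by_customer)


corpus = to_corpus(transactions)
```

Each customer then has an ordered list of token lists, which is the input shape that word embedding tools generally expect.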
NLP Design over Bank Data
● Transaction Data:
– TRX = {T_i}, i = 1..n
● Transaction Elements
– T_i = {t_{i,j}}, j = 1..m
● Mostly categorical attributes
● Numerical elements are clustered and irrelevant elements are ignored
● Transaction elements are converted to vectors and compared by cosine similarity
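The slide says that numerical elements are clustered so that they can be treated like categorical ones, but it does not name the clustering method. As a stand-in, the sketch below uses equal-frequency (quantile) binning; the function names, the number of bins and the amounts are assumptions made for illustration.

```python
def quantile_edges(values, n_bins):
    """Return n_bins - 1 cut points that split the sorted values into equal-sized groups."""
    ordered = sorted(values)
    return [ordered[(len(ordered) * k) // n_bins] for k in range(1, n_bins)]


def to_token(name, value, edges):
    """Map a number to a categorical token such as 'amount_bin1'."""
    # The bin index is the number of cut points the value has reached.
    bin_index = sum(value >= edge for edge in edges)
    return f"{name}_bin{bin_index}"


# Made-up transaction amounts, not taken from the bank data.
amounts = [5, 12, 20, 45, 80, 150, 300, 900, 2500]
edges = quantile_edges(amounts, n_bins=3)
tokens = [to_token("amount", a, edges) for a in amounts]
```

After this step every attribute value is a discrete token, so numerical and categorical elements can share one dictionary.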
Experiment: Space Transformation
● Data:
– 1.8 million transactions (4 weeks) from a medium-sized Turkish bank
– Enriched with customer demographic data
– 8 categorical and 2 numerical attributes (clustered)
– Dictionary size is 137
● Vector Transformation
– Word2Vec library
● Skip-gram, 20-dimensional vectors (one for each element value)
● Comparison of vectors in 2D (PCA)
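To make the skip-gram setting concrete, the sketch below builds the (center, context) training pairs that a skip-gram model learns from. It shows pair construction only, not the training itself; the window size and the example tokens are assumptions, since the slides do not state them.

```python
def skipgram_pairs(sentence, window):
    """Return (center, context) pairs for every word and its neighbours within `window`."""
    pairs = []
    for i, center in enumerate(sentence):
        lo = max(0, i - window)
        hi = min(len(sentence), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, sentence[j]))
    return pairs


# One transaction written as a sentence of element values (hypothetical tokens).
sentence = ["age=30-40", "segment=retail", "channel=ATM"]
pairs = skipgram_pairs(sentence, window=1)
```

In practice a library performs this step internally. With gensim 4, for example, skip-gram and 20-dimensional vectors would correspond to `sg=1` and `vector_size=20`, although the slides do not say which Word2Vec implementation was used.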
Experiment: Space Transformation
● Nearest neighbours of three transaction element values (age, business segment and profession, respectively) with respect to cosine similarity
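A nearest-neighbour lookup by cosine similarity, as described on this slide, might look like the sketch below. The embedding values are invented 3-dimensional toy vectors (the experiment uses 20 dimensions), and the token names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbours(query, embeddings, k):
    """Return the k element values most similar to `query`, best first."""
    scored = [
        (cosine_similarity(embeddings[query], vec), name)
        for name, vec in embeddings.items()
        if name != query
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


# Toy embeddings, invented for the example.
embeddings = {
    "age=20-30": [1.0, 0.1, 0.0],
    "age=30-40": [0.9, 0.2, 0.0],
    "profession=student": [0.8, 0.0, 0.3],
    "segment=corporate": [-0.5, 1.0, 0.2],
}
neighbours = nearest_neighbours("age=20-30", embeddings, k=2)
```

Cosine similarity ignores vector length, so two element values count as close when their vectors point in a similar direction.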
Experiment: Space Transformation
● PCA of business segment element value embedding vectors
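The projection of embedding vectors to 2D with PCA can be illustrated with a small dependency-free sketch. It finds the two leading principal components by power iteration with deflation, which is one of several ways to compute PCA and not necessarily what the authors used; the input vectors are toy data.

```python
import math
import random


def center(rows):
    """Subtract the per-dimension mean from every row."""
    dims = len(rows[0])
    means = [sum(r[d] for r in rows) / len(rows) for d in range(dims)]
    return [[r[d] - means[d] for d in range(dims)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, dims = len(centered), len(centered[0])
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(dims)]
            for a in range(dims)]


def mat_vec(matrix, vec):
    """Multiply a square matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def dominant_eigenvector(matrix, iterations=500, seed=0):
    """Power iteration: repeatedly multiply and renormalize a random start vector."""
    rng = random.Random(seed)
    vec = [rng.random() + 0.1 for _ in matrix]
    for _ in range(iterations):
        vec = mat_vec(matrix, vec)
        norm = math.sqrt(sum(x * x for x in vec))
        vec = [x / norm for x in vec]
    return vec


def pca_2d(rows):
    """Project rows onto their two leading principal components."""
    centered = center(rows)
    cov = covariance(centered)
    first = dominant_eigenvector(cov)
    eigenvalue = sum(f * c for f, c in zip(first, mat_vec(cov, first)))
    # Deflation removes the first component so power iteration finds the second.
    deflated = [[cov[a][b] - eigenvalue * first[a] * first[b]
                 for b in range(len(first))] for a in range(len(first))]
    second = dominant_eigenvector(deflated, seed=1)
    return [[sum(x * c for x, c in zip(row, comp)) for comp in (first, second)]
            for row in centered]


# Toy 3-dimensional vectors whose variance is largest along the first axis.
vectors = [[2, 0, 0], [-2, 0, 0], [0, 1, 0], [0, -1, 0], [0, 0, 0.1], [0, 0, -0.1]]
points = pca_2d(vectors)
```

Each row of `points` is an (x, y) pair that can be plotted. The sketch assumes the data varies in at least two directions; a numerical library would normally replace it for real data.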
Experiment: Space Transformation
● Embedding vectors of the same element values with an artificially divided ‘High-income’ value
Experiment: Classification
● Predicting business segment vectors B_i (20 dimensions) from transaction vectors (180 dimensions, business segment vector excluded)
● Aim: the model should produce vectors positioned close to the business segment element value vectors
● ANN parameters:
– Cosine similarity (loss function)
– Stochastic gradient descent as optimizer
– tanh as activation function
– Single hidden layer, 60 nodes
– Learning rate: 0.018
– Batch size: 100
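The network shape given above (180 inputs, one hidden layer of 60 tanh nodes, 20 outputs) can be sketched as a plain forward pass with a cosine-based loss. Several details are assumptions because the slides do not state them: the output layer is taken to be linear, the loss is taken to be 1 minus the cosine similarity, and the weight initialization is arbitrary. The input size of 180 is consistent with nine 20-dimensional attribute vectors (ten attributes minus the business segment). Training with stochastic gradient descent is left out.

```python
import math
import random

INPUT_DIM, HIDDEN_DIM, OUTPUT_DIM = 180, 60, 20  # layer sizes from the slide


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one dense layer (assumed init)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(weights, biases, inputs):
    """Affine map: one weighted sum plus bias per output node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def forward(params, inputs):
    """Single hidden layer with tanh, followed by a linear output (assumption)."""
    (w1, b1), (w2, b2) = params
    hidden = [math.tanh(z) for z in dense(w1, b1, inputs)]
    return dense(w2, b2, hidden)


def cosine_loss(predicted, target):
    """Assumed loss form: 1 - cosine similarity, 0 when the directions match."""
    dot = sum(p * t for p, t in zip(predicted, target))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_t = math.sqrt(sum(t * t for t in target))
    return 1.0 - dot / (norm_p * norm_t)


rng = random.Random(0)
params = [init_layer(INPUT_DIM, HIDDEN_DIM, rng),
          init_layer(HIDDEN_DIM, OUTPUT_DIM, rng)]

# Nine random 20-dimensional attribute vectors concatenated into one input.
attribute_vectors = [[rng.uniform(-1, 1) for _ in range(20)] for _ in range(9)]
x = [value for vec in attribute_vectors for value in vec]
target = [rng.uniform(-1, 1) for _ in range(OUTPUT_DIM)]  # stand-in for B_i

prediction = forward(params, x)
loss = cosine_loss(prediction, target)
```

A cosine-based loss rewards the predicted vector for pointing in the same direction as the true business segment vector, which matches the later nearest-vector comparison.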
Experiment: Classification
● 4-fold cross-validation with the designed ANN model
● Comparison of predicted business segment vectors with the real vectors B_i, selecting the nearest (top) 5
● Proposed model accuracies for the business segment attribute (shown on the original slide)
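The evaluation procedure on this slide (k-fold splitting, then checking whether the true business segment is among the nearest vectors to the prediction) could be sketched as below. The label vectors, the predictions and the fold assignment scheme are invented for illustration and are not the authors' results; the slide uses k = 4 folds and the top 5 neighbours, while the toy data here only has three labels.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def k_fold_indices(n_items, k):
    """Return k (train, test) index pairs; item i goes to test fold i % k."""
    folds = [list(range(i, n_items, k)) for i in range(k)]
    return [
        ([j for fold in folds[:i] + folds[i + 1:] for j in fold], folds[i])
        for i in range(k)
    ]


def top_k_accuracy(predictions, true_labels, label_vectors, k):
    """Share of predictions whose true label is among the k nearest label vectors."""
    hits = 0
    for predicted, truth in zip(predictions, true_labels):
        ranked = sorted(
            label_vectors,
            key=lambda name: cosine_similarity(predicted, label_vectors[name]),
            reverse=True,
        )
        hits += truth in ranked[:k]
    return hits / len(true_labels)


# Toy 2-dimensional label vectors and model outputs, invented for the example.
label_vectors = {"retail": [1.0, 0.0], "SME": [0.0, 1.0], "corporate": [-1.0, 0.0]}
predictions = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.5]]
true_labels = ["retail", "SME", "SME"]

top1 = top_k_accuracy(predictions, true_labels, label_vectors, k=1)
top2 = top_k_accuracy(predictions, true_labels, label_vectors, k=2)
splits = k_fold_indices(8, 4)
```

Reporting accuracy at several values of k shows how often the correct segment is close to the prediction even when it is not the single nearest vector.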
Future Work
● Extracting deep and hidden patterns in customer
transactions may be used in:
– Creating relevant products
– Detecting fraud attempts by examining customer behavior
– Defining abnormal customer activities
Thank you for listening.
Any questions?
