CS598 Data Mining Capstone, Summer 2022
Paper Review by Sathish Rama (sbrama2)
Paper :
Review of “Intention-aware Heterogeneous Graph Attention Networks for Fraud Transactions Detection”
From KDD ’21, August 14–18, 2021, Virtual Event, Singapore.
Link to Paper : https://dl.acm.org/doi/10.1145/3447548.3467142
Background & Motivation
o Fraud transactions – major threat to e-commerce platforms
o Increase in organized fraud
o Complex scenarios
Current/Related Methods
o Traditional methods rely on statistical features and do not
capture user behavior
o Various deep learning models based on user behavioral data have
been proposed. Sequence-based & tree-based methods have been
extensively studied.
o Existing methods, despite remarkable success, treat each
transaction as an independent data instance, without considering
transaction-level interactions or the intent behind a set of
transactions, and thus ignore rich information
Motivation for the proposed method
Leveraging rich interactions among transactions & behavior sequence
for fraud detection
Paper Method - IHGAT (Intention-aware Heterogeneous Graph Attention Networks)
o A transaction-intention network is devised using cross-interaction information over transactions and
intentions.
o On top of this network, a graph neural network method is proposed, named IHGAT (Intention-aware
Heterogeneous Graph Attention Networks)
o The IHGAT model is used to detect whether a transaction is fraudulent or not.
o Experiments on the real-world Alibaba platform show results for both the offline & online model
Problem
o Detect fraud transactions using the proposed method, labeling each transaction as 0 or 1, where 1 denotes
a fraud transaction and 0 otherwise.
Concepts Used in the Proposed Method
o Behavior sequence : Chronologically ordered behaviors. A behavior
sequence of a user is shown in figure (a) below.
o Behavior tree : A tree-like data structure consisting of behavior nodes.
A behavior node uniquely identifies a behavior using a name and an
id. The figure below shows a behavior tree with intentions.
o User intentions : Every branch in a behavior tree denotes a user
intention. For example, in figure (b) below, four different user intentions
are marked with different colors. The first, "Intention1", is represented as
{Home, Search, Product List}, which corresponds to the leftmost branch of
the behavior tree.
o Heterogeneous transaction-intention network (HTIN): An HTIN is
denoted as G = {V, E}, where V and E are the nodes and edges,
respectively. The node set V consists of transaction nodes and user
intention nodes. The edge set E contains two types of edges:
transaction-transaction edges and transaction-intention edges.
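The concepts above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the function names `extract_intentions` and `build_htin`, the dictionary-based tree encoding, and the choice to merge identical intentions into one shared node are all assumptions made for the example.

```python
def extract_intentions(tree, root):
    """Return every root-to-leaf branch of a behavior tree, left to right.

    `tree` maps a behavior name to the list of its child behaviors
    (a hypothetical encoding; behavior names are assumed unique).
    """
    paths = []
    stack = [(root, (root,))]
    while stack:
        node, path = stack.pop()
        children = tree.get(node, [])
        if not children:
            # A leaf closes one branch, i.e. one user intention.
            paths.append(path)
        # Push children in reverse so the leftmost child is visited first.
        for child in reversed(children):
            stack.append((child, path + (child,)))
    return paths


def build_htin(transaction_trees, tt_pairs):
    """Build G = {V, E} with two node types and two edge types.

    `transaction_trees` maps a transaction id to (tree, root).
    `tt_pairs` is a list of already-chosen transaction pairs.
    Assumption: identical intentions share a single intention node.
    """
    nodes = {"transaction": set(), "intention": set()}
    edges = {"transaction-intention": set(), "transaction-transaction": set()}
    for tx_id, (tree, root) in transaction_trees.items():
        nodes["transaction"].add(tx_id)
        for intention in extract_intentions(tree, root):
            nodes["intention"].add(intention)
            edges["transaction-intention"].add((tx_id, intention))
    for a, b in tt_pairs:
        edges["transaction-transaction"].add((a, b))
    return nodes, edges
```

With a toy tree such as `{"Home": ["Search", "Cart"], "Search": ["Product List"], "Cart": ["Checkout"]}`, the first extracted intention is the leftmost branch, (Home, Search, Product List), matching the example on the slide.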
Architecture of proposed IHGAT method
The overall process has two stages. First, the intention
neighbors are aggregated by a sequence-based model
with attention mechanism. Then a multi-head graph
attention layer is applied to aggregate transaction
neighbors.
1. User intention is modeled by an embedding layer and sequence
encoding.
2. Intention neighbors of a transaction node are aggregated by
an LSTM with an attention mechanism. An LSTM is a long
short-term memory network, used in deep learning especially
for sequence prediction problems.
3. A multi-head graph attention layer is used to aggregate
interactions among transactions.
4. After aggregating the intention and transaction neighbors,
the obtained representation is fed into multiple fully
connected layers and a regression layer with a
sigmoid unit, which yields the predicted fraud probability (𝑝) of
the transaction.
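To make the aggregation and output steps concrete, here is a minimal pure-Python sketch of attention-weighted neighbor aggregation followed by a sigmoid output. It is a simplification under stated assumptions: a plain dot product stands in for the learned attention scoring, there is a single head rather than multiple heads, no LSTM encoder is included, and the names `attention_aggregate` and `fraud_probability` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_aggregate(query, neighbors):
    """Weight each neighbor vector by softmax(dot(query, neighbor)) and sum.

    Assumption: dot-product scoring replaces the learned scoring
    function a trained model would use.
    """
    weights = softmax([dot(query, n) for n in neighbors])
    dim = len(query)
    aggregated = [
        sum(w * n[i] for w, n in zip(weights, neighbors)) for i in range(dim)
    ]
    return aggregated, weights


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fraud_probability(representation, output_weights, bias):
    """Final regression layer with a sigmoid unit, giving p in (0, 1)."""
    return sigmoid(dot(representation, output_weights) + bias)
```

The returned weights are what an interpretability plot would display: neighbors with larger weights contributed more to the aggregated representation.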
Experiments
• Extensive experiments are conducted on a large-scale, real-world industrial dataset
• First, performance on the fraud transaction detection task is verified, and ablation tests demonstrate the
effectiveness of every component in the model.
• Then the major hyper-parameters are analyzed closely.
• Results are visualized to demonstrate the interpretability of the method
Dataset
• A large-scale industrial dataset from Alibaba Group (online e-commerce
platform) is used.
• 1.27 million transactions (from 2020/05/01 to 2020/05/31) are randomly
sampled for training and 0.31 million transactions (from
2020/06/01 to 2020/06/07) for testing, as shown in Table 1.
• For each transaction, the last 24 hours of user behavior are traced back,
the behavior sequence and behavior tree are generated, and then the HTIN is
constructed
• 1.76 million transaction and intention nodes
• 21.93 million transaction-intention and transaction-transaction edges, as shown in Table 2.
Experiments
Baselines : To demonstrate the effectiveness of the proposed method, sequence-based models, tree-based models, graph-based models, and
variants of the proposed model are compared as baselines.
• Sequence-based Methods: LSTM, BiLSTM, GRU, CNN and Transformer
• Tree-based Methods: CS Tree-LSTM and LIC Tree-LSTM
• Graph-based Methods: GraphSAGE and GAT
• Ablation Test : Multiple variants of IHGAT are derived from the proposed method to analyze the contribution of each component:
o One variant without edges among transactions
o One variant without the transaction attention mechanism
o One variant without the intention attention mechanism
o One variant without considering the order information of intentions
Evaluation Metrics : Two widely used metrics, AUC and R@P𝑁, measure the performance of fraud transaction detection.
• AUC is defined as the area under the ROC curve
• R@P𝑁 indicates the Recall rate when the Precision rate equals 𝑁 (a high precision rate is needed for fraud detection problems)
ROC Curve : The ROC curve plots the true positive rate against the false positive rate as the decision threshold varies; it is used to
measure the performance of a model.
Higher AUC and R@P𝑁 indicate better performance.
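Both metrics can be computed directly from labels and predicted scores. The sketch below is an illustration only: the function names are hypothetical, the AUC uses the pairwise-ranking definition (quadratic time, fine for small examples), and R@P𝑁 is implemented as the highest recall among thresholds whose precision is at least 𝑁, which is an assumption about how "precision equals 𝑁" is operationalized.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random fraud transaction is scored
    above a random normal one (ties count as 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def recall_at_precision(labels, scores, min_precision):
    """Highest recall over score thresholds with precision >= min_precision."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("no positive labels")
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    best = 0.0
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Consume every item tied at the current score before evaluating.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        i = j
        if tp / (tp + fp) >= min_precision:
            best = max(best, tp / total_pos)
    return best
```

For example, with labels [1, 1, 0, 1, 0, 0] and scores [0.9, 0.8, 0.7, 0.6, 0.3, 0.2], the AUC is 8/9 and R@P0.9 is 2/3: only the top two predictions can be flagged before precision drops below 0.9.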
Results
Results Comparison across different
methods
• The proposed method, IHGAT, is significantly
better than all the baselines
• The proposed method, when compared
1. With sequence-based methods: AUC is
at least 3.79% higher & R@P0.9 is
64.21% higher
2. With tree-based methods: AUC is higher
by 1.82% and R@P0.9 by 23.16%
3. With graph-based methods: AUC is
higher by 1.05% and R@P0.9 by 8.93%.
• Among the variants of the proposed method, the one without transaction-transaction interactions (IHGAT𝑇−𝑇) performs worst,
with AUC decreased by 2.62% and R@P0.9 decreased by 25.77%
• The results of IHGAT𝐼𝐴𝑡𝑡 and IHGAT𝐼𝐿𝑆𝑇𝑀 show that the attention mechanism on user intentions can capture the key user intention and that
the order information among user intentions is useful for fraud transaction detection.
The main reason IHGAT scores better is that it captures both transaction-intention and transaction-transaction interactions.
Results
Effects of Behavior Sequence Length
• The testing set is divided into 5 groups to analyze the effects of different behavior sequence lengths, as shown below.
• Overall, both tree-based and graph-based models are better than the sequence-based approaches at all sequence lengths.
• Graph-based models, namely GraphSAGE and GAT, achieve better performance than LIC Tree-LSTM when the sequence length is less than 120,
but perform poorly when it is greater than 120, with the exception of IHGAT.
• One observation is that elaborate user intention modeling seems to play an important role in the longer sequence groups.
• The performance of most models improves markedly at first as the behavior sequence length increases, and then flattens to some extent.
The proposed model, benefiting from the construction of user intentions and the heterogeneous transaction-intention network:
• Obtains the best results across sequence lengths
• Achieves a significant improvement on longer sequences.
Results
Other Major Hyper-parameters
The paper investigated the effects of two major hyper-parameters.
• Sliding window (𝑙) :
o An important component in building transaction-transaction interactions
o For both AUC and R@P0.9, performance improves as the sliding window size increases, and 𝑙 = 3 gives the best
performance
o The reason is that too small a window cannot build complete transaction-transaction edges, while too large a window may introduce
interfering edges between transactions that are not closely related.
• Embedding dimensions:
o Lower dimensions may not be able to fully represent user behavior, while higher dimensions do not improve classification
performance and may cost more training time.
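The role of the sliding window can be illustrated with a small sketch. The slides do not spell out the exact edge-construction rule, so the following is an assumption: transactions that share a key (for example, the same remark) are sorted by time, and each one is linked to at most 𝑙 immediately preceding transactions in its group. The function name `build_tt_edges` and the tuple layout are hypothetical.

```python
from collections import defaultdict


def build_tt_edges(transactions, window):
    """Link each transaction to up to `window` earlier ones sharing its key.

    `transactions` is a list of (tx_id, timestamp, key) tuples.
    Returns a list of (earlier_tx, later_tx) edges.
    """
    by_key = defaultdict(list)
    for tx_id, timestamp, key in transactions:
        by_key[key].append((timestamp, tx_id))

    edges = []
    for group in by_key.values():
        group.sort()  # chronological order within the shared key
        for i, (_, tx_id) in enumerate(group):
            # Only the `window` closest predecessors are connected.
            for _, previous in group[max(0, i - window):i]:
                edges.append((previous, tx_id))
    return edges
```

On four transactions sharing one key, a window of 1 yields 3 edges (a simple chain), while a window of 3 yields 6 edges (every pair connected). This shows the trade-off described above: a small window leaves related transactions unconnected, and a large one adds edges between transactions that are further apart in time.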
Results Visualization
• The paper visualized the attention weights of a fraud transaction (𝑇0), as shown below. The behavior sequence of 𝑇0 is segmented
into five intentions, from 𝐼1 to 𝐼5, shown in Figure (a).
• Figure (b) shows that I2 and I4 receive higher weights. I4 is an intuitive pattern of potential fraudsters, as they tend to
switch accounts frequently to avoid the identification rules of platforms.
• Figure (b) shows transaction neighbors. It is observed that T0 has the highest weight & T2 the second highest. The edge
between them is established through a shared transaction remark, and it is observed that fraudsters sometimes use such common
remarks (or secret codes) to communicate with their accomplices.
Conclusion
• The paper investigated the detection of fraud transactions by elaborately modeling user intentions and leveraging
transaction-level interactions
• It devised a heterogeneous transaction-intention network and a graph-based neural model (IHGAT) to detect fraud transactions.
• Experiments conducted on a real-world dataset show that the proposed model is effective in fraud transaction detection
and provides good interpretability of results.
• I found this method very interesting, and it clearly shows better results compared to sequence-based methods.
• I'm curious how real-world data challenges may impact the performance of this method, such as
o Some transaction data in a sequence may be missing due to network/system failure; how would the method perform then?
o Sometimes there may be benign transaction patterns with similar remarks, for frequent shopping patterns such as buying
gifts for family members or fund transfers with friends
o The amount of compute needed to detect fraud with low-latency response times, since building a large IHGAT network
with several embeddings and a large sliding window could be very compute intensive.
Other comments
Thank you

  • 1. CS598 Data Mining Capstone, Summer 2022 Paper Review by Sathish Rama (sbrama2) Paper : Review of “Intention-aware Heterogeneous Graph Attention Networks for Fraud Transactions Detection” From KDD ’21, August 14–18, 2021, Virtual Event, Singapore. Link to Paper : https://dl.acm.org/doi/10.1145/3447548.3467142
  • 2. Background & Motivation o Fraud transactions – major threat to e-commerce platforms o Increase in organized fraud o Complex scenarios Current/Related Methods o Traditional methods are based on statistical features (doesn’t capture user behavior ) o Various deep learning models based on user behavioral data are proposed. Sequence-based & tree based methods have been extensively studied. o Existing methods despite remarkable success, treat each transaction as independent data instance without considering transaction level interactions or intent of a set of transactions thus ignoring rich information Motivation for the proposed method Leveraging rich interactions among transactions & behavior sequence for fraud detection
  • 3. Paper Method - IHGAT ( Intention-aware Heterogenous Graph Attention networks) o Transaction intention network is devised using cross interaction information over transactions and intentions. o Using above, a graph neural network method is coined IHGAT ( Intention-aware Heterogenous Graph Attention networks) o The IHGAT model is used to detect if a transaction is a fraud or not. o Experimented on real world Alibaba platform to show results for both offline & online model Problem o Detect fraud transaction using proposed method and label each transaction as 0 or 1 where 1 denotes fraud transaction and 0 otherwise.
  • 4. Concepts Used in the Proposed Method o Behavior sequence : Chronologically ordered behaviors. A behavior sequence of a user is shown in figure(a) below. o Behavior tree : A tree-like data structure consisting of behavior nodes. A behavior node is unique identification of the behavior using a name and id. Below is figure showing behavior tree with intentions. o User intentions : Every branch in a behavior tree denotes a user intention. For example in figure (b) below four different user intentions are marked with different colors. The first ‘Intention1” is presented as {Home, Search, Product List}, which corresponds to the leftmost branch of the behavior tree. o Heterogeneous transaction-intention network (HTIN): A HTIN is denoted as G = {V, E}, where V and E are the nodes and edges, respectively. The node set V consists of transaction nodes and user intention nodes. The edge set E contains two types of edges, transaction- transaction edges and transaction-intention
  • 5. Architecture of proposed IHGAT method The overall process has two stages. First, the intention neighbors are aggregated by a sequence-based model with attention mechanism. Then a multi-head graph attention layer is applied to aggregate transaction neighbors. 1. User intention is modeled by embedding layer and sequence encoding. 2. Intention neighbors of a transaction node are aggregated by LTSM attention mechanism. LTSM model is a long short- term memory network, used in deep learning esp. in sequence prediction problems. 3. Multi head graph attention layer is used to aggregate interactions among transactions. 4. After aggregating the intention and transaction neighbors, the obtained representation is fed into multiple fully connected neural networks and a regression layer with a sigmoid unit and then predicted fraud probability(𝑝) of transaction is derived.
  • 6. Experiments • Extensive experiments on a large-scale real-world industrial dataset are conducted • First, verify the performance on the task of fraud transactions detection and perform ablation tests to demonstrate the effectiveness of every component in the model. • Then major hyper-parameters were observed analyzed and looked closely. • Results visualized to demonstrate the interpretability of the method Dataset • Large-scale industrial dataset from Alibaba Group (online e-commerce platform ) is used. • Randomly sampled 1.27 million transactions (ranging from 2020/05/01 to 2020/05/31) for training and 0.31 million transactions (ranging from 2020/06/01 to 2020/06/7) for testing as shown in Table 1. • For each transaction, last 24 hours of user behavior is back tracked and behavior sequence and behavior tree is generated and then HTIN is constructed • 1.76 million transaction and intention nodes • 21.93 million transaction-intention and transaction-transaction edges as shown in Table 2.
  • 7. Experiments
    Baselines: To demonstrate the effectiveness of the proposed method, it is compared against sequence-based models, tree-based models, graph-based models, and variants of the proposed model.
      • Sequence-based methods: LSTM, BiLSTM, GRU, CNN and Transformer
      • Tree-based methods: CS Tree-LSTM and LIC Tree-LSTM
      • Graph-based methods: GraphSAGE and GAT
      • Ablation test: Multiple variants of IHGAT are derived and compared with the full model:
          o One variant without edges among transactions
          o One variant without the transaction attention mechanism
          o One variant without the intention attention mechanism
          o One variant without the order information of intentions
    Evaluation Metrics: Two widely used metrics, AUC and R@P𝑁, measure the performance of fraud transaction detection.
      • AUC is defined as the area under the ROC curve.
      • R@P𝑁 is the recall rate when the precision rate equals 𝑁 (a high precision rate is needed for fraud detection problems).
    ROC Curve: A curve used to measure the performance of a model. It plots the true positive rate against the false positive rate.
    Higher AUC and R@P𝑁 indicate better performance.
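Both metrics can be computed in a few lines of plain Python. This is a sketch under two assumptions that the slides do not state: AUC is computed through its pairwise-ranking interpretation (the fraction of fraud/non-fraud pairs in which the fraud transaction scores higher), and R@P𝑁 is taken as the highest recall reachable at any cutoff whose precision is at least 𝑁. The labels and scores are invented, and the cutoff loop assumes distinct scores.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def recall_at_precision(labels, scores, n):
    """Highest recall over all top-k cutoffs whose precision is >= n."""
    ranked = sorted(zip(scores, labels), reverse=True)
    total_pos = sum(labels)
    tp, best = 0, 0.0
    for k, (_, y) in enumerate(ranked, start=1):
        tp += y
        if tp / k >= n:
            best = max(best, tp / total_pos)
    return best


# Toy example: 1 = fraud, 0 = legitimate.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

print(auc(labels, scores))                       # 8 of 9 pairs ranked correctly
print(recall_at_precision(labels, scores, 0.9))  # 2 of 3 frauds caught
```

In the toy data only the top two predictions keep precision at or above 0.9, so R@P0.9 is 2/3 even though the AUC is fairly high. This is why R@P𝑁 is the stricter metric for fraud detection.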
  • 8. Results
    Results Comparison across different methods
      • The proposed method, IHGAT, is significantly better than all the baselines.
      • The proposed method compared:
          1. With sequence-based methods: AUC is at least 3.79% higher and R@P0.9 is 64.21% higher.
          2. With tree-based methods: AUC is higher by 1.82% and R@P0.9 by 23.16%.
          3. With graph-based methods: AUC is higher by 1.05% and R@P0.9 by 8.93%.
      • Among the variants of the proposed method, the one without transaction-transaction interactions (IHGAT𝑇−𝑇) performs worst, with a 2.62% decrease in AUC and a 25.77% decrease in R@P0.9.
      • The results of IHGAT𝐼𝐴𝑡𝑡 and IHGAT𝐼𝐿𝑆𝑇𝑀 show that the attention mechanism on user intentions can capture the key user intention, and that the order information among user intentions is useful for fraud transaction detection.
    The main reason IHGAT scores better is that it captures both transaction-intention and transaction-transaction interactions.
  • 9. Results
    Effects of Behavior Sequence Length
      • The testing set is divided into 5 groups to analyze the effects of different behavior sequence lengths, as shown below.
      • Overall, both tree-based and graph-based models are better than the sequence-based approaches at all sequence lengths.
      • The graph-based models GraphSAGE and GAT outperform LIC Tree-LSTM when the sequence length is less than 120, but perform poorly when it is greater than 120. IHGAT is the exception.
      • One observation is that elaborate user intention modeling seems to play an important role in the longer-sequence groups.
      • As the behavior sequence length increases, the performance of most models improves markedly at first and then flattens to some extent.
    The proposed model benefits from the construction of user intentions and the heterogeneous transaction-intention network. It
      • Obtains the best results at all sequence lengths
      • Achieves a significant improvement on longer sequences.
  • 10. Results
    Other Major Hyper-parameters
    The paper investigated the effects of two major parameters.
      • Sliding window (𝑙):
          o An important component in building transaction-transaction interactions.
          o For both AUC and R@P0.9, performance improves as the sliding window size increases, with 𝑙 = 3 giving the best performance.
          o The reason is that a window that is too small cannot build complete transaction-transaction edges, while a window that is too large may introduce interference edges between transactions that are not closely related.
      • Embedding dimensions:
          o Lower dimensions may not be able to fully represent user behavior, while higher dimensions do not improve classification performance and may cost more training time.
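The slides do not spell out the paper's exact edge-construction rule, so the following Python sketch shows only one plausible reading: transactions that share some attribute (a hypothetical `key`, such as an identical remark) are ordered by time, and each one is linked to at most the next 𝑙 transactions with the same key. The function name, the tuple layout, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict


def build_tt_edges(transactions, window):
    """Link each transaction to the next `window` transactions sharing its key.

    `transactions` is a list of (txn_id, timestamp, key) tuples, where `key`
    is a hypothetical shared attribute (for example, the transaction remark).
    """
    by_key = defaultdict(list)
    for txn_id, _, key in sorted(transactions, key=lambda t: t[1]):
        by_key[key].append(txn_id)

    edges = []
    for ids in by_key.values():
        for i, src in enumerate(ids):
            for dst in ids[i + 1 : i + 1 + window]:
                edges.append((src, dst))
    return edges


# Invented transactions: four share the remark "codeA", one does not.
txns = [
    ("T0", 1, "codeA"),
    ("T1", 2, "codeB"),
    ("T2", 3, "codeA"),
    ("T3", 4, "codeA"),
    ("T4", 5, "codeA"),
]

print(len(build_tt_edges(txns, window=1)))  # 3 edges: a chain
print(len(build_tt_edges(txns, window=3)))  # 6 edges: every codeA pair linked
```

With a window of 1 the related transactions form only a chain, while a larger window links more distant pairs. This mirrors the trade-off on the slide between incomplete edges and interference edges.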
  • 11. Results
    Visualization
      • The paper visualized the attention weights of a fraud transaction (𝑇0), as shown below. The behavior sequence of 𝑇0 is segmented into five intentions, 𝐼1 to 𝐼5, shown in Figure (a).
      • Figure (b) shows that 𝐼2 and 𝐼4 get higher attention weights. 𝐼4 is an intuitive pattern of potential fraudsters, who tend to switch accounts frequently to avoid the identification rules of platforms.
      • Figure (b) shows the transaction neighbors. 𝑇0 has the highest weight and 𝑇2 the second highest. The edge between these two is established because the transactions carry the same remark, and it is observed that fraudsters sometimes use such common remarks (or secret codes) to communicate with their accomplices.
  • 12. Conclusion
      • The paper investigated the detection of fraud transactions by elaborately modeling user intentions and leveraging transaction-level interactions.
      • It devised a heterogeneous transaction-intention network and a graph-based neural model (IHGAT) to detect fraud transactions.
      • Experiments conducted on a real-world dataset show that the proposed model is effective in fraud transaction detection and provides good interpretability of results.
    Other comments
      • I found this method very interesting, and it clearly shows better results than sequence-based methods.
      • I'm curious how real-world data challenges might affect the performance of this method, such as:
          o Some transaction data in a sequence may be missing due to network or system failures. How would the method perform in that case?
          o Benign transactions sometimes share similar remarks in frequent shopping patterns, such as buying gifts for family members or transferring funds between friends.
          o Detecting fraud online with low-latency response times may require substantial compute, since building a large IHGAT network with several embeddings and a large sliding window could be very compute intensive.