Summary : Analysing Digital Banking Reviews Using Text Mining IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2020 Li Chen Cheng.
+ Paper implimentation code
Roadmap to Membership of RICS - Pathways and Routes
TextMining.pptx
1. Analysing Digital Banking
Reviews Using
Text Mining
IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
2020
Li Chen Cheng and Legaspi Rhea Shannayne
Department of Information and Finance Management
National Taipei University of Technology Taipei, Taiwan
2. OUTLINE
PROBLEM
Prioritize features
Summary
METHODOLOGY
How to find Hidden
patterns in huge data
INTRODUCTION
Digital Banking
Why text Mining ?
01 02
04
Latent Allocation DLA
Association mining rules
SOLUTION
03
CONCLUSION
06
2
solution implementation
DEMO CODE
05
5. WhyTextMiNING?
01.
The feedback provided by the users usually reflect their
satisfaction on the application , and Satisfaction has a direct
impact on the user's loyalty (Wang et al, 2018).
Critical for companies to understand their market
7. Unstructured nature
Huge number
of user reviews
Some may not even provide
meaningful insights.
7
Researcher stated that the
emotional sentiment of the
user is only weakly correlated
to the app rating(2017).
9. Reveal which areas the digital
banking application can further optimize for
customer satisfaction and retention.
Maingoal
Consumer reviews
?
10. This Paper
10
Goal
Identify which features need to be prioritized and improved upon, so
they will be able to focus their efforts and better target their
strategies to meet customers' expectations.
What are users most concerned about?
Which features are viewed positively and which are viewed
negatively?
Research Questions
Not only about the satisfaction but also what topic should the Digital bank improve !
12. Topic Modeling
12
How can we label such a huge
amount of data if not manually?
Problem
Analyze a huge amount of data,
cluster them into similar groups,
and label each group.
Clustering documents
into topics
consists of a cluster
of words that often
occur together
It is based on the assumption that:
● A document is composed of several topics
● A topic is composed of a collection of words
13. Topic Modeling : Latent Dirichlet Allocation (LDA)
13
● Latent Dirichlet Allocation is a probabilistic method for Topic Modelling
● We have to choose the number of topics k that we want to ‘discover’ in our corpus
● The results then find these hidden topics, and give us the words that make up each
topic, in the form of a probability distribution over the vocabulary for each topic.
14. 14
Comment 1: "I am very happy with the bank's mobile
app, it is very convenient and easy to use."
Comment 2: "I am very disappointed with the bank's
customer service, they are not helpful at all."
Comment 3: "I find the bank's interest rates to be
very competitive and fair."
.
.
.
Topic 3: Interest rates
["mobile app", "convenient",
"user-friendly", "easy to use",
etc]
Topic 1: Mobile app
Topic 2: Customer service
Topic 1
% Topic 1
% Topic 2
15. Paper extracted Topics
15
["customer", "service", "poor","waste", "phone", "email" ]
● reveals that the usual contact methods of the
users are through phone and email.
● reveals that the application's customer service is
currently
unsatisfactory.
16. Association Analysis
16
Technique that is used to uncover the underlying relationships and
patterns in a dataset. It is used to identify the items or attributes that
are frequently co-occurring or associated in some way.
18. SUMMARY
18
1-First, this research applied LDA to pre-processed text
reviews to obtain the relevant topics .
2- Second, keywords are extracted from the text based on their
count frequency and labelled into specific digital banking
features.
3- The reviews are then divided into two clusters based
on the current review rating ("positive" and "negative") and
association rule mining is applied to determine the
relationship between the above-mentioned features and the
given rating.
LDA +Association mining
uncover the underlying
topics present in a corpus
find patterns and relationships within
the topics identified by LDA.
presented at the 2020 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM) , and it is a conference that focuses on research related to social networks analysis and mining.
igital banks are banks that operate primarily online, offering banking services to customers through digital channels such as mobile apps, online portals
They also often have more flexible account opening processes, such as the ability to open an account entirely online
With the advent of the internet and social media,
information has become more widespread and easier to access.
Now what are the problem with customer reviews ?
Generate base lear
which proves that the app
rating is not an accurate representation of the customers'
sentiment.
obtain valuable knowledge
that was initially unrevealed in online reviews.
based on
so it’s not only about sentiment analysis , it’s take the reviews and based on them they extract what topic are users concerned about (is the online accounting , the mony ..) and are they satisfied with it or not.
okay they will understand the sentiment (the satisfaction ) but what topic should the digal bank improve ?
obtain valuable knowledge
that was initially unrevealed in online reviews.
https://moj-analytical-services.github.io/NLP-guidance/LDA.html
https://aiinarabic.com/latent-dirichlet-allocation-algorithm-using-python/
topic can be described as a collection of words, like [‘ball’, ‘cat’, ‘house’] and [‘airplane’, ‘clouds’], but in practice, what an algorithm does is assign each word in our vocabulary a ‘participation’ value in a given topic. The words with the highest values can be considered as the true participants of a topic.
https://moj-analytical-services.github.io/NLP-guidance/LDA.html
https://aiinarabic.com/latent-dirichlet-allocation-algorithm-using-python/
words that present each topic
https://www.linkedin.com/pulse/topic-modeling-using-latent-dirichlet-allocation-saket-khare-frm
Each topic would be represented by a set of words that are likely to be semantically related. represented by words such as "mobile app", "convenient", "user-friendly", "easy to use", etc.
The model assigns a probability to each word within a topic and also assigns a probability to each topic in the document.
LDA to, and then use association rule mining to find patterns and relationships within the topics identified by LDA.