Association Rule Mining in Social Network Data

PRESENTED BY: HOSSEIN MOBASHER
COURSE: DATA MINING

Contents
• Introduction
• Related Works
• The proposed Framework
• Experimental Evaluation
• Conclusion

Introduction
• Over the last decade, the use of social networks has changed the way of life of
online communities.
• Social data is used in:
• Academic applications
• E-commerce
• Discovering the habits and interests of online communities in different geographical areas
• Sentiment analysis of users
• Purpose: support analysts in decision-making and optimal resource
management in businesses, as well as in web maintenance.

Introduction (continued)
• Social data is a powerful source of data:
• To gain knowledge about social communities
• To investigate the behavior and other aspects of online communities
• User-generated content (UGC) helps online organizations enhance
their services based on user perspectives.
• Data mining techniques are effectively exploited to discover hidden,
interesting, and meaningful knowledge from social data.

Related Works
• TwitterEcho
• Collects data with a distributed architecture (Portuguese Twittosphere)
• Use of micro-blogging as a means to predict political sentiment.
• TWICALL
• Discovers important events, then categorizes and classifies them
• NIF-T
• Explores data published on micro-blogging websites (e.g. Twitter)

The proposed Framework
• An environment for association rule mining that discovers hidden patterns in
tweets.

Collecting and preprocessing of tweets
• Access tweets using the Twitter API.
• Raw tweets are unsuitable for the subsequent steps.
• They include information that is not required for the problem under consideration
• Remove unnecessary information and transform the tweets into items and related
contextual features.
• Pipeline: Access data using Twitter API → Remove unnecessary information → Transform into suitable format → Map into a transactional database

Collecting and preprocessing of tweets (continued)
• Transformed tweets are then mapped into a transactional database.
• Each transaction is composed of a set of stems
• e.g. “Imagination is more important than knowledge” may be mapped into {imagination,
important, knowledge}
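
The mapping from a tweet to a transaction can be sketched in a few lines of Python. This is a minimal illustration, not the framework's actual preprocessing: the stop-word list, the regular expressions, and the name tweet_to_transaction are assumptions, and no real stemmer is applied (a full pipeline would typically add one).

```python
import re

# Hypothetical, deliberately tiny stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "and", "is", "more", "of", "than", "the", "to"}

def tweet_to_transaction(text):
    """Map a raw tweet to a set of lowercase keyword items."""
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    tokens = re.findall(r"[a-z]+", text.lower())
    return {t for t in tokens if t not in STOP_WORDS}

# Made-up sample tweets; the first is the example from the slide.
tweets = [
    "Imagination is more important than knowledge",
    "Knowledge of the world http://example.com @someone",
]
database = [tweet_to_transaction(t) for t in tweets]
print(database)
```

Each element of database is one transaction, which is the input format a frequent itemset miner expects.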

Discovery of Correlations
• Uses the Apriori algorithm to mine frequent itemsets.
• An association rule is usually represented as: If Body then Head
• If Body occurs, Head is more likely to occur as well
• The rule expresses the relationship between Body and Head
• The strength of a rule is measured by its support and confidence
• The stronger the rule, the stronger the association between its terms.
• Example: imagination ⇒ knowledge
• Support = 40%
• Confidence = 70%
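
Support and confidence, and the level-wise Apriori search, can be sketched as follows. This is a small illustrative version under stated assumptions: the five transactions are made up (so the numbers differ from the 40% / 70% example above), and the function names are hypothetical rather than taken from the presented framework.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(body, head, transactions):
    """support(Body and Head together) divided by support(Body)."""
    body, head = frozenset(body), frozenset(head)
    return support(body | head, transactions) / support(body, transactions)

def apriori(transactions, min_support):
    """Level-wise search: a k-itemset is kept as a candidate only if
    all of its (k-1)-subsets were found frequent at the previous level."""
    items = {item for t in transactions for item in t}
    level = {frozenset([i]) for i in items
             if support([i], transactions) >= min_support}
    frequent = {}
    k = 1
    while level:
        for s in level:
            frequent[s] = support(s, transactions)
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        level = {c for c in candidates
                 if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
                 and support(c, transactions) >= min_support}
    return frequent

# Made-up toy database of five transactions.
transactions = [
    {"imagination", "important", "knowledge"},
    {"imagination", "knowledge"},
    {"imagination", "important"},
    {"knowledge", "science"},
    {"science", "important"},
]

frequent = apriori(transactions, min_support=0.4)
rule_support = support({"imagination", "knowledge"}, transactions)
rule_confidence = confidence({"imagination"}, {"knowledge"}, transactions)
print(rule_support, rule_confidence)
```

In this toy database the rule imagination ⇒ knowledge holds in 2 of 5 transactions, and 2 of the 3 transactions containing imagination also contain knowledge.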

Taxonomy Generation
• Automatically generates a taxonomy based on tweet attributes (i.e. the frequent
keywords generated in the previous phase).
• More general, higher-level concepts and correlations can then be extracted.
• The taxonomy nodes represent distinct terms extracted from tweet contents
• Step 1: Graph extraction
• Step 2: Graph partitioning and pruning

Taxonomy Generation (Graph extraction)
• Strong correlations are detected using the results of the previous phase.
• The generated correlations are represented as a graph
• Edges: the implications expressed by the rules
• Vertices: items from tweet contents
• country ⇒ World
• society, people ⇒ country
• peace ⇒ World
• society ⇒ World
• society ⇒ country
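
The graph extraction step can be sketched with a plain adjacency dictionary. The rules are the five listed above; how a multi-item body such as "society, people" is turned into edges is not specified in the slides, so drawing one edge from each body item to the head is an assumption, as is the function name rules_to_graph.

```python
from collections import defaultdict

# (body items, head item) pairs taken from the rules on this slide.
rules = [
    (("country",), "World"),
    (("society", "people"), "country"),
    (("peace",), "World"),
    (("society",), "World"),
    (("society",), "country"),
]

def rules_to_graph(rules):
    """Build a directed graph: vertices are items, edges are implications."""
    graph = defaultdict(set)
    for body, head in rules:
        for item in body:
            graph[item].add(head)      # assumed: one edge per body item
        graph.setdefault(head, set())  # make sure the head is a vertex too
    return dict(graph)

graph = rules_to_graph(rules)
for vertex, heads in sorted(graph.items()):
    print(vertex, "->", sorted(heads))
```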

Taxonomy Generation (Graph partitioning and pruning)
• Makes the graph compact
• Prunes edges that do not represent a strong, relevant relationship by performing
vertex labeling (the label represents the level in the taxonomy).
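
The slides do not spell out the labeling or pruning criterion, so the following is only one plausible reading, offered as a sketch: each vertex is labeled with the length of its longest path to a vertex with no outgoing edges (level 0), and an edge is kept only when it connects adjacent levels. The graph literal repeats the rule graph from the graph extraction slide, the function names are hypothetical, and the code assumes the graph has no cycles.

```python
from functools import lru_cache

# Rule graph from the graph extraction slide (body item -> head items).
graph = {
    "country": {"World"},
    "society": {"World", "country"},
    "people": {"country"},
    "peace": {"World"},
    "World": set(),
}

def label_levels(graph):
    """Label each vertex with its longest distance to a vertex without
    outgoing edges (assumed to be the taxonomy root, level 0)."""
    @lru_cache(maxsize=None)
    def level(v):
        return 0 if not graph[v] else 1 + max(level(h) for h in graph[v])
    return {v: level(v) for v in graph}

def prune(graph, levels):
    """Keep only edges that go up exactly one taxonomy level."""
    return {v: {h for h in heads if levels[v] - levels[h] == 1}
            for v, heads in graph.items()}

levels = label_levels(graph)
pruned = prune(graph, levels)
print(levels)
print(pruned)
```

Under this reading, the edge society ⇒ World is dropped because the same relationship is already carried by society ⇒ country ⇒ World.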

Analyzing Correlations
• Selects and ranks the significant correlations
• The selection is driven by either:
• A rule schema: <Keyword, *> ⇒ <Place, *>
• Given interesting rule items: <Keyword, School> ⇒ <Place, London>
• The results are ranked by their support and confidence quality indexes.
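
Selection by schema or by specific items, followed by ranking, can be sketched as below. The rule list is made-up illustration data, the function names are hypothetical, and ordering by confidence first and support second is an assumption (the slides only say both indexes are used).

```python
# Each rule: (body, head, support, confidence), where body and head are
# (attribute, value) pairs. All numbers are made-up illustration data.
rules = [
    (("Keyword", "School"), ("Place", "London"), 0.10, 0.80),
    (("Keyword", "School"), ("Place", "Paris"), 0.05, 0.60),
    (("Keyword", "Peace"), ("Place", "London"), 0.20, 0.50),
    (("Place", "London"), ("Keyword", "School"), 0.10, 0.40),
]

def matches(pattern, item):
    """An item matches a pattern when the attributes agree and the
    pattern value is either '*' or equal to the item value."""
    return pattern[0] == item[0] and pattern[1] in ("*", item[1])

def select_and_rank(rules, body_pattern, head_pattern):
    """Filter rules by body/head patterns, then rank them
    (assumed order: confidence first, support as tie-breaker)."""
    selected = [r for r in rules
                if matches(body_pattern, r[0]) and matches(head_pattern, r[1])]
    return sorted(selected, key=lambda r: (r[3], r[2]), reverse=True)

by_schema = select_and_rank(rules, ("Keyword", "*"), ("Place", "*"))
by_items = select_and_rank(rules, ("Keyword", "School"), ("Place", "London"))
print(by_schema)
print(by_items)
```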

Experimental Evaluation
• The proposed framework highlights well-known topical subjects (e.g. the European
Union)
• The results include 58 transactions with 209 distinct items (i.e. keywords).
• First, effectiveness is demonstrated in two scenarios:
• User behavior analysis
• Topic trend analysis
• Second, effectiveness is demonstrated through the quality of the generated taxonomies.

User Behavior Analysis
• Extracted correlations allow experts to highlight hidden and potentially
interesting user behaviors.
• peace ⇒ World, society ⇒ country, country ⇒ World
• The proposed framework automatically generates the taxonomy from the mined rules.
• The taxonomy clearly highlights the behavior of people towards peace.

Topic Trend Analysis
• Discovery and analysis of current matters of contention on Twitter.
• A domain expert wants to discover subjects of topical interest for Twitter users.
• The taxonomy suggests that society in general, and people in particular, are
concerned with peace in the world.

Quality of generated taxonomies
• Taxonomy generation is evaluated with three measures:
• Global quality (using the geometric mean)
• Local quality (degree of correlation between non-leaf and leaf nodes)
• Spread (number of nodes traversed in the taxonomy to move from a node to its root node in the graph)
• The results are compared with the approach of
• “Evolutionary Taxonomy Construction from Dynamic Tag Space”, 2010

Quality of generated taxonomies (continued)
• Global quality remained the same in both approaches.
• The proposed approach produced well-balanced local quality and spread indexes.
• The proposed approach takes slightly less time than the compared approach.

Conclusion
• Presented a mechanism for extracting hidden correlations between contents.
• The generated correlations help to understand the hidden associations
among the textual and contextual features of the UGC.
• The proposed approach automatically generates a taxonomy.
• The experimental results validate the efficiency and effectiveness of the
proposed framework.

Thanks for your attention
Questions?
