Association Rule Mining
in Social Network Data
PRESENTED BY: HOSSEIN MOBASHER
COURSE: DATA MINING
19/2
Contents
• Introduction
• Related Works
• The proposed Framework
• Experimental Evaluation
• Conclusion
19/3
Introduction
• The use of social networks has altered the way of life of online communities over
the last decade.
• Social data is used in:
  • Academic applications
  • E-commerce
  • Discovering the habits and interests of users in geographically distinct online communities
  • Sentiment analysis of users
• Purpose: Support analysts in decision-making and optimal resource
management in businesses as well as web maintenance.
19/4
Introduction (continued)
• Social data is a powerful source of data:
  • To gain knowledge about social communities
  • To investigate the behavior and other aspects of online communities
• User-generated content (UGC) is used to help online organizations enhance
their services based on user perspectives.
• Data mining techniques are used to discover hidden, interesting, and
meaningful knowledge in social data.
19/5
Related Works
• TwitterEcho
  • Collects data through a distributed architecture (Portuguese Twittosphere)
  • Uses micro-blogging as a means to predict political sentiment.
• TWICALL
  • Discovers important events, then categorizes and classifies them
• NIF-T
  • Explores data published on micro-blogging websites (e.g. Twitter)
19/6
The proposed Framework
• An environment for association rule mining that discovers hidden patterns in
tweets.
19/7
Collecting and preprocessing of tweets
• Access tweets using the Twitter API.
• Received tweets are unsuitable for the subsequent processes.
  • They include information that is not required for the problem under consideration
• Remove unnecessary information and transform the tweets into items and related
contextual features.
[Figure: pipeline. Access data using Twitter API → Remove unnecessary information → Transform into suitable format → Map into a transactional database]
19/8
Collecting and preprocessing of tweets
• Transformed tweets are then mapped into a transactional database.
  • Each transaction is composed of a set of stems
  • e.g. “Imagination is more important than knowledge” may be mapped into {imagination,
important, knowledge}
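
The mapping from a tweet to a transaction can be sketched as follows. This is a minimal illustration, not the framework's actual preprocessing: the `STOP_WORDS` list and the function name `tweet_to_transaction` are hypothetical, and no stemmer is applied (terms are only lowercased and filtered).

```python
import re

# Hypothetical stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"is", "are", "the", "a", "an", "more", "than", "of", "to", "in", "and"}

def tweet_to_transaction(text: str) -> frozenset:
    """Map raw tweet text to a set of lowercase terms (one transaction)."""
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    tokens = re.findall(r"[a-z]+", text.lower())
    return frozenset(t for t in tokens if t not in STOP_WORDS)

transaction = tweet_to_transaction("Imagination is more important than knowledge")
print(sorted(transaction))  # ['imagination', 'important', 'knowledge']
```

A transactional database is then simply a list of such sets, one per tweet.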
19/9
Discovery of Correlations
• Use the Apriori algorithm to mine frequent itemsets.
• An association rule is usually represented as: If Body then Head
  • If Body occurs, Head is more likely to occur as well
  • The rule expresses the relationship between Body and Head
• The strength of a rule depends on its support (share of transactions that contain both Body and Head) and its confidence (share of transactions containing Body that also contain Head)
• The stronger the rule, the stronger the association between its terms.
• Example: 𝑖𝑚𝑎𝑔𝑖𝑛𝑎𝑡𝑖𝑜𝑛 ⇒ 𝑘𝑛𝑜𝑤𝑙𝑒𝑑𝑔𝑒
  • Support = 40%
  • Confidence = 70%
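
To make support, confidence, and the level-wise Apriori search concrete, here is a small sketch. The five transactions are invented toy data, so the numbers it produces are not the 40% / 70% quoted on the slide; the function names are likewise illustrative.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(body, head, transactions):
    """support(body ∪ head) / support(body)."""
    return support(body | head, transactions) / support(body, transactions)

def apriori(transactions, min_support):
    """Return {itemset: support} for all frequent itemsets (level-wise search)."""
    items = {item for t in transactions for item in t}
    current = [frozenset([i]) for i in items]
    frequent = {}
    k = 1
    while current:
        level = {c: support(c, transactions) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unite frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent

# Toy transactional database (hypothetical data).
transactions = [
    frozenset({"imagination", "important", "knowledge"}),
    frozenset({"imagination", "knowledge"}),
    frozenset({"imagination", "important"}),
    frozenset({"knowledge", "peace"}),
    frozenset({"peace", "world"}),
]

frequent = apriori(transactions, min_support=0.4)
body, head = frozenset({"imagination"}), frozenset({"knowledge"})
print(len(frequent), "frequent itemsets")
print("support:", support(body | head, transactions))
print("confidence:", round(confidence(body, head, transactions), 3))
```

On this toy data the rule imagination ⇒ knowledge holds in 2 of 5 transactions, and in 2 of the 3 transactions that contain "imagination".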
19/10
Taxonomy Generation
• Automatically generates a taxonomy from tweet attributes (i.e. the frequent
keywords generated in the previous phase).
• More generalized, high-level concepts and correlations can then be extracted.
• The taxonomy nodes represent distinct terms extracted from tweet contents
• Steps:
  • Graph extraction
  • Graph partitioning and pruning
19/11
Taxonomy Generation (Graph extraction)
• Strong correlations are detected from the results of the previous phase.
• The generated correlations are represented as a graph
  • Edges: the implications present in the rules
  • Vertices: items of tweet contents
• Example rules:
  • 𝑐𝑜𝑢𝑛𝑡𝑟𝑦 ⇒ 𝑊𝑜𝑟𝑙𝑑
  • 𝑠𝑜𝑐𝑖𝑒𝑡𝑦, 𝑝𝑒𝑜𝑝𝑙𝑒 ⇒ 𝑐𝑜𝑢𝑛𝑡𝑟𝑦
  • 𝑝𝑒𝑎𝑐𝑒 ⇒ 𝑊𝑜𝑟𝑙𝑑
  • 𝑠𝑜𝑐𝑖𝑒𝑡𝑦 ⇒ 𝑊𝑜𝑟𝑙𝑑
  • 𝑠𝑜𝑐𝑖𝑒𝑡𝑦 ⇒ 𝑐𝑜𝑢𝑛𝑡𝑟𝑦
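
One way to turn the five rules on this slide into a graph is sketched below. It assumes that a rule with several body items contributes one directed edge per body item; the slide does not say how multi-item bodies are handled, so treat that choice, and the name `rules_to_graph`, as illustrative.

```python
from collections import defaultdict

# The five rules from the slide, written as (body items, head item).
rules = [
    ({"country"}, "World"),
    ({"society", "people"}, "country"),
    ({"peace"}, "World"),
    ({"society"}, "World"),
    ({"society"}, "country"),
]

def rules_to_graph(rules):
    """Build adjacency sets with one directed edge body_item -> head."""
    graph = defaultdict(set)
    for body, head in rules:
        for item in body:  # assumption: one edge per body item
            graph[item].add(head)
        graph.setdefault(head, set())  # make sure sink vertices appear too
    return dict(graph)

graph = rules_to_graph(rules)
for vertex in sorted(graph):
    print(vertex, "->", sorted(graph[vertex]))
```

The vertices are the distinct terms and each edge records an implication, which matches the vertex/edge description above.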
19/12
Taxonomy Generation (Graph partitioning and pruning)
• Makes the graph compact
• Prunes edges that do not represent a strong, relevant relationship by performing
vertex labeling (the label represents the level in the taxonomy).
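
The slide does not spell out the labeling rule, so the following is only one plausible reading, stated as an assumption: label each vertex with its longest distance to a sink (the root gets level 0) and keep only the edges that climb exactly one level. The graph literal is the one implied by the example rules of the previous slide, and the code assumes it is acyclic.

```python
from functools import lru_cache

# Rule graph from the example rules (vertex -> set of heads it implies).
graph = {
    "World": set(),
    "country": {"World"},
    "peace": {"World"},
    "people": {"country"},
    "society": {"World", "country"},
}

def label_levels(graph):
    """Label each vertex with its longest distance to a sink (root = level 0)."""
    @lru_cache(maxsize=None)
    def level(v):
        return 0 if not graph[v] else 1 + max(level(h) for h in graph[v])
    return {v: level(v) for v in graph}

def prune(graph, levels):
    """Keep only edges that go exactly one level up (assumed pruning rule)."""
    return {v: {h for h in heads if levels[v] - levels[h] == 1}
            for v, heads in graph.items()}

levels = label_levels(graph)
taxonomy = prune(graph, levels)
print(levels)
print(taxonomy)
```

Under this rule the edge society ⇒ World is dropped because society ⇒ country ⇒ World already connects the two, leaving a compact tree rooted at World.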
19/13
Analyzing Correlations
• Selection and ranking of the significant correlations
• The selection is made using
  • A rule schema, e.g. < 𝐾𝑒𝑦𝑤𝑜𝑟𝑑,∗ > ⇒ < 𝑃𝑙𝑎𝑐𝑒,∗ >
  • Given rule items of interest, e.g. < 𝐾𝑒𝑦𝑤𝑜𝑟𝑑, 𝑆𝑐ℎ𝑜𝑜𝑙 > ⇒ < 𝑃𝑙𝑎𝑐𝑒, 𝐿𝑜𝑛𝑑𝑜𝑛 >
• The results are ranked by their support and confidence quality indexes.
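
Selection by schema and ranking can be sketched as below. The `Rule` class, the example rules with their support and confidence values, and the tie-breaking order (confidence first, then support) are all assumptions made for illustration; the slide only states that both indexes are used for ranking.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    body: frozenset   # set of (attribute, value) items
    head: frozenset
    support: float
    confidence: float

def matches(items, pattern):
    """True if each (attribute, value) in pattern is present; '*' matches any value."""
    return all(any(a == pa and (pv == "*" or v == pv) for a, v in items)
               for pa, pv in pattern)

def select_and_rank(rules, body_pattern, head_pattern):
    """Filter rules by schema, then sort by confidence, then support (descending)."""
    chosen = [r for r in rules
              if matches(r.body, body_pattern) and matches(r.head, head_pattern)]
    return sorted(chosen, key=lambda r: (r.confidence, r.support), reverse=True)

# Hypothetical mined rules with made-up quality indexes.
rules = [
    Rule(frozenset({("Keyword", "Peace")}), frozenset({("Place", "Paris")}), 0.20, 0.60),
    Rule(frozenset({("Keyword", "School")}), frozenset({("Place", "London")}), 0.10, 0.80),
    Rule(frozenset({("Keyword", "School")}), frozenset({("Keyword", "Teacher")}), 0.30, 0.90),
]

by_schema = select_and_rank(rules, [("Keyword", "*")], [("Place", "*")])
by_items = select_and_rank(rules, [("Keyword", "School")], [("Place", "London")])
print([sorted(r.head) for r in by_schema])
print(len(by_items))
```

The schema query keeps both Keyword ⇒ Place rules and ranks the London rule first; the item query keeps only the School ⇒ London rule.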
19/14
Experimental Evaluation
• The proposed framework highlights well-known topical subjects (e.g. the European
Union)
• The results include 58 transactions with 209 distinct items (i.e. keywords).
• First, effectiveness is shown in two scenarios:
  • User behavior analysis
  • Topic trend analysis
• Second, effectiveness is shown through the quality of the generated taxonomies.
19/15
User Behavior Analysis
• Extracted correlations allow experts to highlight hidden and potentially
interesting user behaviors.
• 𝑝𝑒𝑎𝑐𝑒 ⇒ 𝑊𝑜𝑟𝑙𝑑, 𝑠𝑜𝑐𝑖𝑒𝑡𝑦 ⇒ 𝑐𝑜𝑢𝑛𝑡𝑟𝑦, 𝑐𝑜𝑢𝑛𝑡𝑟𝑦 ⇒ 𝑊𝑜𝑟𝑙𝑑
• The proposed framework automatically generates the taxonomy from the mined rules.
• The taxonomy clearly highlights people's behavior towards peace.
19/16
Topic Trend Analysis
• Discovery and analysis of current matters of contention on Twitter.
• A domain expert wants to discover subjects of topical interest to Twitter users.
• The taxonomy suggests that society in general, and people in particular, are
concerned with peace in the World.
19/17
Quality of generated taxonomies
• The quality of the generated taxonomy is measured with
  • Global quality (using the geometric mean)
  • Local quality (degree of correlation between non-leaf and leaf nodes)
  • Spread (number of nodes traversed in the taxonomy to move from a node to its root node in the graph)
• The results are compared with the approach of
  • “Evolutionary Taxonomy Construction from Dynamic Tag Space”, 2010
19/18
Quality of generated taxonomies (continued)
• Global quality was the same for both approaches.
• The proposed approach produced fairly balanced local quality and spread indexes.
• The proposed approach takes slightly less time than the approach it is
compared with.
19/19
Conclusion
• Presented a mechanism for extracting hidden correlations between contents.
• The generated correlations help to understand the hidden associations
among the textual and contextual features of the UGC.
• The proposed approach generates the taxonomy automatically.
• The experimental results validate the efficiency and effectiveness of the
proposed framework.
Thanks for your attention
Questions?
