Data Mining: Graph mining and social network analysis
Graph Mining, Social Network Analysis, and Multirelational Data Mining
Why and What is Graph Mining? Graphs have become increasingly important in modeling complicated structures such as circuits, images, biological networks, social networks, the Web, and XML documents. Many graph search algorithms have been developed in chemical informatics, computer vision, video indexing, and text retrieval. With the increasing demand for analysis of large amounts of structured data, graph mining has become an active and important theme in data mining.
Methods for Mining Frequent Subgraphs: Apriori-based Approach. Apriori-based algorithms for frequent substructure mining include AGM, FSG, and a path-join method. AGM shares similar characteristics with Apriori-based itemset mining and grows candidate substructures one vertex at a time. FSG and the path-join method explore edges and connections in an Apriori-based fashion: FSG grows candidates one edge at a time, and the path-join method builds them from edge-disjoint paths.
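As a rough illustration of the first level of an edge-based Apriori search, the sketch below counts how many graphs in a small database contain each labeled edge and keeps those that meet a minimum support. It is not the AGM or FSG implementation: the graph encoding (a dictionary of vertex labels plus a list of labeled edges), the function name `frequent_edges`, and the toy molecule-like data are all assumptions made for this example. Larger candidates would be generated by joining the frequent patterns found here, which requires subgraph isomorphism tests that are omitted.

```python
from collections import Counter

def frequent_edges(graphs, min_support):
    """Return labeled edges that occur in at least min_support graphs.

    Each graph is a pair (vertex_labels, edges), where vertex_labels maps a
    vertex id to its label and edges is a list of (u, v, edge_label) tuples.
    This encoding is hypothetical and chosen only for the example.
    """
    counts = Counter()
    for vertex_labels, edges in graphs:
        seen = set()
        for u, v, edge_label in edges:
            # Canonical form: sort endpoint labels so C-O and O-C match.
            a, b = sorted((vertex_labels[u], vertex_labels[v]))
            seen.add((a, edge_label, b))
        # A graph contributes at most once to the support of each pattern.
        counts.update(seen)
    return {pattern: n for pattern, n in counts.items() if n >= min_support}

# Toy graph database (made-up data).
graphs = [
    ({0: "C", 1: "C", 2: "O"}, [(0, 1, "single"), (1, 2, "double")]),
    ({0: "C", 1: "O", 2: "N"}, [(0, 1, "double"), (1, 2, "single")]),
    ({0: "C", 1: "C", 2: "N"}, [(0, 1, "single"), (1, 2, "single")]),
]

result = frequent_edges(graphs, min_support=2)
print(result)
```

Only the C-C single edge and the C-O double edge appear in two or more graphs, so only those two patterns would be passed on to the next candidate-generation round.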
Other Approaches for Mining Frequent Subgraphs: Pattern-Growth Approach. Pattern-growth graph approach: simplistic pattern growth-based frequent substructure mining. gSpan: a pattern-growth algorithm for frequent substructure mining. (For the detailed algorithms, refer to the wiki.)
Characteristics of Social Networks: densification power law; shrinking diameter; heavy-tailed out-degree and in-degree distributions.
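The densification power law says that the number of edges e(t) grows as a power of the number of nodes n(t), that is e(t) ∝ n(t)^a, with the exponent a typically between 1 and 2. One way to estimate a from a series of network snapshots is a least-squares fit on a log-log scale. The sketch below does this in plain Python; the function name `densification_exponent` and the snapshot numbers are assumptions for illustration (the data is synthetic, generated to follow a = 1.5), not measurements from a real network.

```python
import math

def densification_exponent(snapshots):
    """Estimate a in e(t) ~ n(t)^a from (nodes, edges) snapshots.

    Fits a straight line to (log n, log e) by ordinary least squares and
    returns its slope.
    """
    xs = [math.log(n) for n, _ in snapshots]
    ys = [math.log(e) for _, e in snapshots]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Synthetic snapshots (nodes, edges), made up so that edges ~ nodes ** 1.5.
snapshots = [(100, 1000), (200, 2828), (400, 8000), (800, 22627)]

a = densification_exponent(snapshots)
print(round(a, 2))
```

An exponent above 1 means the average degree rises as the network grows, which is what "densification" refers to.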
Link Mining: Traditional methods of machine learning and data mining, which take as input a random sample of homogeneous objects from a single relation, may not be appropriate for social networks. The data comprising social networks tend to be heterogeneous, multirelational, and semi-structured. As a result, a new field of research has emerged called link mining.
Tasks Involved in Link Mining: link-based object classification; object type prediction; link type prediction; predicting link existence; link cardinality estimation; object reconciliation; group detection; subgraph detection; metadata mining.
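Of the tasks listed above, predicting link existence is perhaps the easiest to illustrate. A common baseline scores each unconnected pair of nodes by the Jaccard similarity of their neighbor sets and treats high-scoring pairs as likely future links. The sketch below is one such baseline, not a method prescribed by the slides; the helper names `build_adjacency` and `jaccard_score` and the small friendship graph are hypothetical.

```python
from itertools import combinations

def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def jaccard_score(adj, u, v):
    """Jaccard similarity of the neighbor sets of u and v."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0

# Toy friendship network (made-up data).
edges = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"),
         ("bob", "dan"), ("cat", "dan"), ("dan", "eve")]
adj = build_adjacency(edges)

# Score every pair of nodes that is not already linked.
candidates = [(u, v) for u, v in combinations(sorted(adj), 2)
              if v not in adj[u]]
ranked = sorted(((pair, jaccard_score(adj, *pair)) for pair in candidates),
                key=lambda item: -item[1])

for pair, score in ranked:
    print(pair, round(score, 3))
```

Here ann and dan share two of their three combined neighbors, so that pair ranks first as the most plausible missing link.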
Challenges Faced by Link Mining: logical versus statistical dependencies; feature construction; instances versus classes; collective classification and collective consolidation; effective use of labeled and unlabeled data; link prediction; closed versus open world assumption; community mining from multirelational networks.
What is Multirelational Data Mining? Multirelational data mining (MRDM) methods search for patterns that involve multiple tables (relations) from a relational database.
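To make "patterns that involve multiple tables" concrete, the sketch below uses Python's built-in sqlite3 module to find combinations of customer segment and purchase category that are supported by at least two distinct customers. The pattern only exists across the join of the two relations. The schema (`customer`, `purchase`), the rows, and the support threshold are all hypothetical, and a real MRDM method would search the space of such joins automatically rather than rely on one hand-written query.

```python
import sqlite3

# In-memory database with two related tables (made-up schema and data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, segment TEXT);
    CREATE TABLE purchase (customer_id INTEGER, category TEXT);
    INSERT INTO customer VALUES (1, 'student'), (2, 'student'), (3, 'retired');
    INSERT INTO purchase VALUES (1, 'books'), (2, 'books'),
                                (2, 'games'), (3, 'garden');
""")

# Support of a (segment, category) pattern = number of distinct customers
# in that segment who bought from that category.
rows = conn.execute("""
    SELECT c.segment, p.category, COUNT(DISTINCT c.id) AS support
    FROM customer AS c
    JOIN purchase AS p ON p.customer_id = c.id
    GROUP BY c.segment, p.category
    HAVING COUNT(DISTINCT c.id) >= 2
""").fetchall()

print(rows)
conn.close()
```

Neither table alone contains the pattern "students buy books"; it appears only when the segment attribute in one relation is linked to the category attribute in the other.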
Multirelational Clustering with User Guidance: Multirelational clustering is the process of partitioning data objects into a set of clusters based on their similarity, utilizing information in multiple relations.