This document provides a tutorial on relationship mining in online social networks. It begins with introductions to basic concepts like defining the relationship mining task and relationship concepts from sociology. It then discusses how text mining can help with relationship mining by extracting features from text data. It outlines several sub-fields for relationship mining, including data acquisition/storage, different relationship mining approaches, and associating user attributes with relationships. The document concludes by discussing specific relationship mining systems.
Tutorial on Relationship Mining In Online Social Networks
1. Tutorial on Relationship Mining In Online Social Networks
Peifeng Jing NetID: pjing2
Department of Computer Science
University of Illinois Urbana-Champaign
Prepared as an assignment for CS410: Text Information Systems in Spring 2016
2. Agenda
• Introduction
• Basic Concepts
• Text Mining in Social Network
• Sub-fields of Online Social Networks
• Relationship Mining Systems
• Summary
3. Agenda
• Introduction
• Basic Concepts
• Text Mining in Social Network
• Sub-fields of Online Social Networks
• Relationship Mining Systems
• Summary
4. Social Networks Play An Important Role In Our Daily Lives
• People communicate and share information through social relations with others, such as friends, family, colleagues, collaborators, and business partners.
http://socialnetworking.lovetoknow.com/image/37420~SocialNetworkingAnalysisSoftware.jpg
5. Online Large-Scale Social Networks Are Growing
• For example, Facebook, Twitter, Google, etc.
• Over 1.59 billion users on Facebook
• Over 555 million users on Twitter
http://www.statista.com/statistics/272014/global-social-networks-ranked-by-number-of-users/
http://libeltyseo.com/wp-content/uploads/2013/03/social-networking.png
http://www.bramblingdesign.com/being-visible-tips-on-how-to-best-use-social-network/
6. There Are Many Cool Applications In Social Network and Relationship Mining
• Business intelligence and market analysis
• Discovery of community structures
• Inferring social relationships
• Personal profiling
• E-mail filtering
• Mining hidden advisor-advisee relationships in academic networks
• Positive/negative relationship mining, e.g. trust networks and voting networks
• Biological networks, e.g. predicting food webs, protein-protein interactions, metabolic networks, etc.
7. But We Have Problems Finding Relations In Large-Scale Networks
• We used to depend on users to label relations
• Some personal information is hidden in the online social network
• People generally use unstructured or semi-structured language for communication
8. Can We Automatically Find Relations?
• More digital data is available online
• More people share personal information through social media
http://www.slideshare.net/julia594/social-networking-sites-selling-information-to-third-parties
9. Agenda
• Introduction
• Basic Concepts
– Problem Definition
– Relationship Concepts in Sociology
• Text Mining in Social Network
• Sub-fields of Online Social Networks
• Relationship Mining Systems
• Summary
10. What Is The Task of Relationship Mining?
• Introduction
• Basic Concepts
– Problem Definition
– Relationship Concepts in Sociology
• Text Mining in Social Network
• Sub-fields of Online Social Networks
• Relationship Mining Systems
• Summary
11. What Is Relationship Mining?
• User information and user relationships are incomplete in online social networks
• Relationship mining aims to predict or discover the hidden relationships in social networks
12. Formal Definition of Relationship Mining
• Given a network G(V, E) and a list of attributes A = {A1, …, Am}:
– V = {v0, …, vn} is the set of users.
– E = {eij} is the set of connections between users.
– The attributes A represent the users' profiles.
– The labels L = {lij} are the types of the edges {eij}.
• For a user vi, some information might be missing:
– attributes aij are unknown
– edges eij are missing
– labels lij are unknown
• The task is to discover or predict the missing edges eij and the unknown labels lij
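The definition above can be sketched as a small data structure. This is only an illustrative sketch under stated assumptions, not the representation used in the cited papers: the class name `SocialNetwork`, the method names, the use of `None` for unknown values, and the choice to store undirected edges as ordered pairs are all hypothetical.

```python
from dataclasses import dataclass, field
from itertools import combinations
from typing import Dict, List, Optional, Set, Tuple

Edge = Tuple[int, int]  # assumption: undirected edge stored as (i, j) with i < j


@dataclass
class SocialNetwork:
    """Hypothetical container for G(V, E), the attributes A, and the labels L."""
    users: Set[int]                                  # V
    edges: Set[Edge]                                 # the observed part of E
    attributes: Dict[int, Dict[str, Optional[str]]]  # A; None marks an unknown a_ij
    labels: Dict[Edge, Optional[str]] = field(default_factory=dict)  # L; None marks an unknown l_ij

    def unlabeled_edges(self) -> List[Edge]:
        """Observed edges whose type l_ij still has to be predicted."""
        return sorted(e for e in self.edges if self.labels.get(e) is None)

    def candidate_edges(self) -> List[Edge]:
        """User pairs with no observed edge, i.e. where a missing e_ij could be."""
        return [p for p in combinations(sorted(self.users), 2) if p not in self.edges]


# Toy network: four users, two observed edges, one of them labeled.
net = SocialNetwork(
    users={0, 1, 2, 3},
    edges={(0, 1), (1, 2)},
    attributes={0: {"employer": "UIUC"}, 1: {"employer": None}, 2: {}, 3: {}},
    labels={(0, 1): "colleague"},
)
print(net.unlabeled_edges())   # edges that need a label
print(net.candidate_edges())   # pairs that might hide a missing edge
```

In this framing, a relationship mining method would score the pairs returned by `candidate_edges()` and assign types to the edges returned by `unlabeled_edges()`.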
[1] Wenbin Tang, Honglei Zhuang, and Jie Tang, Learning to Infer Social Ties in Large Networks, ECML/PKDD’11, 2011
[2] Rui Li, Chi Wang, Kevin Chen-Chuan Chang, User Profiling in an Ego Network: Co-profiling Attributes and Relationships, WWW’14, April 7-11, 2014, Seoul, Korea, ACM 978-1-4503-2744-2/14/04
13. How Do We Measure The Edges (Relations) In A Social Network?
• Binary Measurement
– 0: absent
– 1: present
• Strength of Social Tie
– A concept from Sociology
[1] Mark Granovetter, The Strength of Weak Ties: A Network Theory Revisited, Sociological Theory, Vol. 1 (1983), pp. 201-233
[2] Mark S. Granovetter, The Strength of Weak Ties, American Journal of Sociology, Vol. 78, No. 6 (1973), pp. 1360-1380
[3] David Easley, Jon Kleinberg, Networks, Crowds, and Markets: Reasoning about a Highly Connected World, Cambridge University Press 2010, pp. 47-83
14. What Does Sociology Tell Us?
• Introduction
• Basic Concepts
– Problem Definition
– Relationship Concepts in Sociology
• Text Mining in Social Network
• Sub-fields of Online Social Networks
• Relationship Mining Systems
• Summary
15. Relationships Have Been Studied For A Long Time In Sociology (Since 1954)
• Sociologists consider relationships as “social ties”
• In sociology, a social tie is measured by its “strength”
David Easley, Jon Kleinberg, Networks, Crowds, and Markets: Reasoning about a Highly Connected World, Cambridge University Press 2010, pp. 47-83
16. What Is A Social Tie?
• A social tie (also called an interpersonal tie) is an information-carrying connection between people
• Types of ties
– Strong tie: a connection to someone you really trust, whose social circle tightly overlaps with your own.
– Weak tie: a connection to someone who is merely an acquaintance. Weak ties often provide access to novel information that does not circulate in the closely knit network of strong ties.
• How do we distinguish strong and weak ties?
– By the strength of the social tie
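The idea that the social circles around a strong tie "tightly overlap" can be made concrete with a simple overlap ratio. The sketch below is one possible proxy, not a method prescribed by the slides: the function name `neighborhood_overlap`, the toy friend lists, and the choice of a Jaccard-style ratio are assumptions for illustration.

```python
from typing import Dict, Set


def neighborhood_overlap(friends: Dict[str, Set[str]], a: str, b: str) -> float:
    """Share of the two users' other contacts that they have in common.

    Values near 1 mean the two social circles overlap tightly (suggesting a
    strong tie); values near 0 mean they barely overlap (suggesting a weak tie).
    """
    circle_a = friends[a] - {b}  # a's contacts, not counting b
    circle_b = friends[b] - {a}  # b's contacts, not counting a
    union = circle_a | circle_b
    if not union:
        return 0.0  # neither user has any other contact
    return len(circle_a & circle_b) / len(union)


# Toy friend lists (hypothetical data).
friends = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann", "cat", "dan"},
    "cat": {"ann", "bob"},
    "dan": {"ann", "bob", "eve"},
    "eve": {"dan"},
}
print(neighborhood_overlap(friends, "ann", "bob"))  # circles coincide
print(neighborhood_overlap(friends, "dan", "eve"))  # circles do not overlap
```

Where to draw the line between "strong" and "weak" on this scale is left open here; any threshold would be a further assumption.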
Eric Gilbert and Karrie Karahalios, Predicting Tie Strength With Social Media, CHI ’09 Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (2009), pp. 211-220
17. Property of Social Tie: Strength
• Strength of ties: the strength of a tie is a (probably linear) combination of the amount of time, the emotional intensity, the intimacy (mutual confiding), and the reciprocal services which characterize the tie
• Dimensions of Tie Strength
– Amount of time, intimacy, intensity, reciprocal services, network topology and informal social circles, emotional support, socioeconomic status, education level, political affiliation, race, gender, etc.
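The "(probably linear) combination" in the definition above can be written as a weighted sum. The sketch below is a minimal illustration under stated assumptions: the equal weights, the 0-to-1 feature scale, and the names `WEIGHTS` and `tie_strength` are hypothetical and are not taken from the cited studies, which fit such weights from data.

```python
from typing import Dict

# Assumption: equal weights over the four components named in the definition.
WEIGHTS: Dict[str, float] = {
    "time": 0.25,
    "emotional_intensity": 0.25,
    "intimacy": 0.25,
    "reciprocal_services": 0.25,
}


def tie_strength(features: Dict[str, float], weights: Dict[str, float] = WEIGHTS) -> float:
    """Linear combination of the tie's features; each feature is assumed to lie in [0, 1]."""
    return sum(weights[name] * features[name] for name in weights)


# Two hypothetical ties scored on the same scale.
close_friend = {"time": 1.0, "emotional_intensity": 0.5, "intimacy": 1.0, "reciprocal_services": 0.5}
acquaintance = {"time": 0.25, "emotional_intensity": 0.0, "intimacy": 0.0, "reciprocal_services": 0.25}

print(tie_strength(close_friend))
print(tie_strength(acquaintance))
```

The other dimensions listed above (network topology, socioeconomic status, and so on) could enter the same sum as extra weighted terms.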
[1] Mohammad Karim Sohrabi, Soodeh Akbar, A comprehensive study on the effects of using data mining techniques to predict tie strength, Computers in Human Behavior, 60 (2016), pp. 534-541
[2] Eric Gilbert and Karrie Karahalios, Predicting Tie Strength With Social Media, CHI ’09 Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (2009), pp. 211-220
18. History of Tie Strength Research
• The concept of tie strength was first introduced by Granovetter (1973), who split ties into “strong” and “weak”
• The dimensions of tie strength were refined by Lin, Ensel, and Vaughn (1981) and Wellman and Wortley (1990)
• Marsden and Campbell (1984) were the first to study how to predict tie strength
• Krackhardt and Stern (1988) showed that strong relationships between employees of different organizational sub-units can help an organization withstand a crisis
• Strong partners were shown to create crisis and pressure for institutional change in the organization (Krackhardt 1992)
• Granovetter (1995) demonstrated that weak ties are more beneficial for job seekers
19. History of Tie Strength Research (Cont.)
• Burt (2009) argues that structural factors, such as network topology
and informal social circles, are effective in shaping tie strength.
• Wilson et al., Viswanath et al., and Kahanda and Neville (2009) studied
social graphs and different interaction patterns. They used
characteristic features, topological features, transactional
characteristics and network-transactional features, and concluded
that the most prominent features in predicting tie strength are
network-transactional characteristics
• Gilbert and Karahalios (2009) used more than 70 variables and
achieved about 85% accuracy in predicting tie strength
• In 2011, a study was conducted using regression analysis based
on the principle that tie strength is a combination of the variables,
such as friendships.
20. History of Tie Strength Research (Cont.)
• In 2013, Servia-Rodriguez, Diaz-Redondo, Fernandez-Vilas and
Pazos-Arias extracted information through the Facebook API (with
users’ permission)
• In 2014, a performance evaluation was carried out by means
of BFF (Fogues, Such, Espinosa and Garcia-Fornes)
• Lee, Lee and Hwang (2014) used perceived business ties to
investigate trust transfer
• Lin and Utz (2015) studied the role of tie strength in predicting
the emotional outcomes of reading a post on Facebook
• Chen, Liu, and Zou (2016) proposed a Social Tie Factor Graph
(STFG) model to estimate the home locations of users in the
Twitter network based on user-centric data and tie strength
21. Another Approach To Categorize Relationships: Positive/Negative Ties
• Positive and Negative Social Ties
• Also called “Signed Networks”
• Positive Relationship: links to indicate friendship, support or
approval
• Negative Relationship: links to indicate disapproval,
disagreement or distrust of opinions
• It has cool applications in predicting voting and elections
Jure Leskovec, Daniel Huttenlocher, Jon Kleinberg, Predicting Positive and Negative Links in Online Social Networks, WWW 2010, April 26-30, 2010, Raleigh, North Carolina, USA, ACM 978-1-60558-799-8/10/04
22. Why Do We Care About Social Ties For Relationship Mining?
• We can use the strength of a social tie to
determine whether there is a relationship
between users
• According to sociological theory, different
types of social ties represent different
properties of the social network
23. How To Do Relationship Mining?
• Introduction
• Basic Concepts
• Text Mining in Relationship Mining
• Sub-fields of Online Social Networks
• Relationship Mining Systems
• Summary
24. Text Mining Is An Important Technology In Relationship Mining
• The most popular social networking websites are
Facebook, LinkedIn, and MySpace where text is
the dominant way of communication
• People in online social networks generally use
unstructured or semi-structured languages for
communication
https://dcurt.is/facebooks-predicament http://www.forbes.com/forbes/welcome/
25. What Can Text Mining Do?
• Pre-processing:
– Feature Extraction
– Feature Selection
– Document Representation
Rizwana Irfan, et al., A Survey on Text Mining in Social Networks, The Knowledge Engineering Review, 30(2) (2015), pp. 157-170
26. What Can Text Mining Do? (cont.)
• Classification
– Ontology Based
– Machine Learning Based
Rizwana Irfan, et al., A Survey on Text Mining in Social Networks, The Knowledge Engineering Review, 30(2) (2015), pp. 157-170
27. What Can Text Mining Do? (cont.)
• Clustering
– Hierarchical Clustering
– Partitional Clustering
– Semantic-based Clustering
Rizwana Irfan, et al., A Survey on Text Mining in Social Networks, The Knowledge Engineering Review, 30(2) (2015), pp. 157-170
28. Agenda
• Introduction
• Basic Concepts
• Text Mining in Social Network
• Sub-fields of Online Social Networks
• Relationship Mining Systems
• Summary
29. Three Sub-fields of Online Social Networks
• Data
– Data acquisition, storage and visualization
– Scalability for large-scale network
• Approaches on Relationship Mining
– Strength of social ties
– Positive/Negative Social Ties Prediction
– Relationship classification
– Relationship mining
• Association between users’ attributes and relationships
30. Data Acquisition, Storage and Visualization
• Acquisition: HTML Web page; FOAF profiles from
the Semantic Web (using RDF crawler); Collection
of emails (from POP3 or IMAP store); Bibliographic
data; Publication data and research profile from
Web; Telecommunication data; and so on
• Storage: Sesame server (Flink); RNKB (researcher
network knowledge base, ArnetMiner); Hadoop
(BC-PDM)
• Visualization: Model-View-Controller, Java Server
Pages and Java Standard Tag Library (Flink)
[1] Yutaka Matsuo, et al., POLYPHONET: An advanced social network extraction system from the Web, Web Semantics: Science, Services and Agents on the World Wide Web, 2007
[2] Peter Mika, Flink: Semantic Web technology for the extraction and analysis of social networks, J. Web Semantics 3 (2) (2005) 211–223
[3] Jie Tang, et al., ArnetMiner: Extraction and Mining of Academic Social Networks, KDD’08, August 24-27, 2008, Las Vegas, Nevada, USA.
[4] Le Yu, et al. BC-PDM: Data Mining, Social Network Analysis and Text Mining System Based on Cloud Computing, KDD’12, Beijing China (2012) 1496-1499
31. Scalability for Large-Scale Network
• Filtering out pairs of persons that seem to have no
relation (POLYPHONET, TPFG model)
• Parallel and cloud computing: Sesame Server
(Flink), Hadoop (BC-PDM)
https://clinked.com/2016/02/23/cloud-computing-benefits-drawbacks/
http://www.ahay.org/wiki/Parallel_Computing
32. Strength Mining of Social Ties
• Model strength as a linear combination of the
predictive variables and network structures
• Predictive variables: friendship, the intensity of
feelings, intimacy, mutual trust and mutual services,
and so on
• Gilbert & Karahalios (2009) achieved an accuracy
of about 85% with more than 70 predictive variables
[1] Mohammad Karim Sohrabi, Soodeh Akbar, A comprehensive study on the effects of using data mining techniques to predict tie strength, Computers in
Human Behavior, 60 (2016), pp. 534-541
[2] Eric Gilbert and Karrie Karahalios, Predicting Tie Strength With Social Media, CHI’09 Proceedings of the SIGCHI Conference on Human Factors in
Computing Systems (2009) pp. 211-220
33. Example (Gilbert & Karahalios 2009)
• Predictive variables from online social media
– Intensity: wall words exchanged, participant-initiated wall posts, friend-initiated wall posts,
inbox messages exchanged, inbox thread depth, participant’s status updates, friends status
updates
– Intimacy: participant’s number of friends, friend’s number of friends, days since last
communication, wall intimacy words, inbox intimacy words, appearances together in photo,
participant’s appearance in photo, distance between hometowns (mi), friend’s relationship
status
– Duration: days since first communication
– Reciprocal Services: links exchanged by wall post, applications in common
– Structural: number of mutual friends, groups in common, Norm. TF-IDF of interests and
about
– Emotional Support: wall & inbox positive emotion words, wall & inbox negative emotion
words
– Social Distance: age difference (days), number of occupations difference, educational
difference (degrees), overlapping words in religion, political difference (scale)
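$$s_i = \alpha + \beta R_i + \gamma D_i + N(i) + \varepsilon_i$$
$$N(i) = \lambda_0 \mu_M + \lambda_1 \operatorname{med}_M + \sum_{k=2}^{4} \lambda_k \sum_{s \in M} (s - \mu_M)^k + \lambda_5 \min_M + \lambda_6 \max_M$$
$$M = \{ s_j : j \text{ and } i \text{ are mutual friends} \}$$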
Eric Gilbert and Karrie Karahalios, Predicting Tie Strength With Social Media, CHI’09 Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (2009) pp. 211-220
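The tie strength model on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the list layout of the coefficients, and the choice to return 0 when a friend has no mutual friends are all hypothetical, and the error term ε_i is left out. R and D are simply the per-friend inputs that the equation multiplies by β and γ.

```python
from statistics import mean, median


def network_structure_term(mutual_strengths, lambdas):
    """N(i): summary statistics over M, the tie strengths of mutual friends.

    `lambdas` is a list of seven coefficients [l0, ..., l6] (assumed layout).
    """
    if not mutual_strengths:
        return 0.0  # assumption: no mutual friends contributes nothing
    mu = mean(mutual_strengths)
    term = lambdas[0] * mu + lambdas[1] * median(mutual_strengths)
    # Second to fourth central moments (unnormalized sums, as in the formula).
    for k in (2, 3, 4):
        term += lambdas[k] * sum((s - mu) ** k for s in mutual_strengths)
    term += lambdas[5] * min(mutual_strengths)
    term += lambdas[6] * max(mutual_strengths)
    return term


def tie_strength(alpha, beta, R, gamma, D, mutual_strengths, lambdas):
    """s_i = alpha + beta.R_i + gamma.D_i + N(i), with the noise term omitted."""
    linear = alpha
    linear += sum(b * r for b, r in zip(beta, R))
    linear += sum(g * d for g, d in zip(gamma, D))
    return linear + network_structure_term(mutual_strengths, lambdas)
```

In practice the coefficients would be fitted by regression on labeled tie strengths; here they are passed in by hand only to show how the terms combine.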
34. Positive/Negative Prediction
• Edge Sign Prediction Problem
Given a social network with signs on all its edges, but the
sign on the edge from node u to node v, denoted s(u, v), has
been “hidden.” How reliably can we infer this sign s(u, v)
using the information provided by the rest of the network?
• Related Work
Leskovec et al. (2010) proposed a method to predict the
signs of links (positive or negative), yet the prediction of
both the existence of a link and its sign has not been well
studied. Recent development of social balance theory may
provide useful hints
35. Example (Leskovec et al. 2010)
• Features: 23 features in the machine learning classification
– Use $d_{in}^{+}(v)$ and $d_{in}^{-}(v)$ to denote the number of incoming positive and
negative edges to v, respectively. Similarly, use $d_{out}^{+}(u)$ and $d_{out}^{-}(u)$ to
denote the number of outgoing positive and negative edges from u,
respectively. C(u, v) denotes the total number of common
neighbors of u and v in an undirected sense
– The second class of features covers each triad involving the edge (u, v),
consisting of a node w such that w has an edge either to or from u
and also an edge either to or from v; this leads to 16 possibilities
• Logistic Regression Model
$$P(+ \mid x) = \frac{1}{1 + \exp\left[-\left(b_0 + \sum_{i=1}^{n} b_i x_i\right)\right]}$$
Jure Leskovec, Daniel Huttenlocher, Jon Kleinberg, Predicting Positive and Negative Links in Online Social Networks, WWW 2010, April 26-30, 2010, Raleigh, North Carolina, USA, ACM 978-1-60558-799-8/10/04
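As a rough illustration of the degree features and the logistic model on this slide, the sketch below computes the four signed degree counts and the common-neighbor count for a hidden edge (u, v), then applies the logistic function. The dictionary layout for the signed network and the function names are assumptions made for this example; the 16 triad features are not included, and the coefficients would normally be learned from data.

```python
import math


def degree_features(edges, u, v):
    """Degree-style features for the hidden edge (u, v).

    `edges` maps (source, target) -> +1 or -1 (assumed layout).
    The edge (u, v) itself is ignored, since its sign is hidden.
    """
    known = {e: s for e, s in edges.items() if e != (u, v)}
    d_in_pos = sum(1 for (_, b), s in known.items() if b == v and s > 0)
    d_in_neg = sum(1 for (_, b), s in known.items() if b == v and s < 0)
    d_out_pos = sum(1 for (a, _), s in known.items() if a == u and s > 0)
    d_out_neg = sum(1 for (a, _), s in known.items() if a == u and s < 0)

    def neighbors(node):
        # Neighbors in the undirected sense: edge direction is ignored.
        return {b if a == node else a for (a, b) in known if node in (a, b)}

    common = len((neighbors(u) & neighbors(v)) - {u, v})
    return [d_in_pos, d_in_neg, d_out_pos, d_out_neg, common]


def probability_positive(x, b0, b):
    """P(+ | x) = 1 / (1 + exp(-(b0 + sum_i b_i * x_i)))."""
    z = b0 + sum(bi * xi for bi, xi in zip(b, x))
    return 1.0 / (1.0 + math.exp(-z))
```

A feature vector from `degree_features` can be passed directly to `probability_positive` together with a bias `b0` and one coefficient per feature.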
36. Relationship Classification
• Real world domains are richly structured; entities of
multiple types are related to each other
• Many relationships (links) are hidden in online social
networks
• Classes of relationships: friends, colleagues, families,
teammates, etc.
• The relationship classification task: build a model for
effectively and efficiently mining relationship types
37. Example (Tang, et al. 2011)
• Features: user-specific information, link-specific information
and global constraints
• Relationship semantics: a triple (eij, rij, pij), where eij ∈ E is
a social relationship, rij ∈ Y is a label associated with the
relationship, and pij is the probability (confidence) obtained
by an algorithm for inferring relationship type
• Factor functions: attribute factor, correlation factor
(correlation between the relationships), and constraint factor
(constraints between relationships)
$$p(Y \mid G) = \frac{1}{Z} \exp\left\{ \theta^{T} \sum_{i} s(y_i) \right\}$$
where θ is the parameter configuration and s denotes the factor functions
Wenbin Tang, Honglei Zhuang, and Jie Tang, Learning to Infer Social Ties in Large Networks, ECML/PKDD’11, 2011
38. Relationship Mining and Prediction
• Many links are hidden in online networks. Generally, we do
not know which links are missing, so we need to predict/mine
hidden relationships
• Two standard metrics are used to quantify the accuracy of
prediction algorithms
– Area under the receiver operating characteristic curve (AUC):
given the ranking of all non-observed links, the AUC value can be
interpreted as the probability that a randomly chosen missing link (i.e.,
a link in E) is given a higher score than a randomly chosen
nonexistent link
$$\text{AUC} = \frac{n' + 0.5\, n''}{n}$$
where, out of n comparisons, n' is the number of times the missing link has
the higher score and n'' is the number of times the two scores are the same
– Precision: given the ranking of the non-observed links, the precision
is defined as the ratio of relevant items selected to the number of
items selected
Linyuan Lü, Tao Zhou, Link prediction in complex networks: A survey, Physica A: Statistical Mechanics and its Applications, 390 (6), 2011, pp. 1150-1170
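The two metrics on this slide can be computed as in the sketch below. It is an illustration under assumptions: instead of sampling n random comparisons, it compares every missing link with every nonexistent link, which gives the value the sampled estimate converges to; the function names and input layout are hypothetical.

```python
def auc(missing_scores, nonexistent_scores):
    """AUC = (n' + 0.5 * n'') / n over all missing/nonexistent pairs."""
    higher = same = 0
    for m in missing_scores:
        for x in nonexistent_scores:
            if m > x:
                higher += 1   # missing link ranked above nonexistent link
            elif m == x:
                same += 1     # tie counts as half a win
    total = len(missing_scores) * len(nonexistent_scores)
    return (higher + 0.5 * same) / total


def precision_at(ranked_links, missing_links, top_l):
    """Fraction of the top-L predicted links that are truly missing links."""
    top = ranked_links[:top_l]
    hits = sum(1 for link in top if link in missing_links)
    return hits / top_l
```

An AUC of 0.5 corresponds to random scoring, so the amount by which the value exceeds 0.5 indicates how much better than chance the predictor ranks links.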
39. Example (Wang et al. 2010)
• Mining advisor-advisee relationships from research
publication networks
• Challenges
– Latent relation
– Time-dependent
– Scalability
• Joint probability
$$P(\{y_i, st_i, ed_i\}_{a_i \in V_a}) = \frac{1}{Z} \prod_{a_i \in V_a} g(y_i, st_i, ed_i)$$
• Rank score
$$r_{ij} = \max P(y_1, \ldots, y_{n_a} \mid y_i = j)$$
Chi Wang, Jiawei Han, Yuntao Jia, Jie Tang, Duo Zhang, Yintao Yu, Jingyi Guo, Mining Advisor-Advisee Relationships from Research Publication Networks,
KDD’10 July 25-28, 2010, Washington, DC, USA
40. Association of Users’ Attributes and Relationships
• The attributes of users and their relationships are
not independent from each other.
• Two Tasks at The Same Time
– User attribute profiling
– Relationship type profiling
• Advantages
– Can achieve higher accuracy (e.g. 70%-90% precision in
Li et al, 2014)
41. Example (Staiano et al, 2012)
• Inferring personality traits from social networks
• Features: centrality measures, small world and
efficiency measures, transitivity measures, triadic
measures
• Personality trait scores are quantized into two classes
(Low/High). Classification was performed by means
of a Random Forest
Jacopo Staiano, et al. Friends don’t Lie – Inferring Personality Traits from Social Network Structure, UbiComp’12 Pittsburgh, USA (2012) 321-330
42. Example (Li et al, 2014)
• User profiling in an Ego network
• Social connections are discriminatively correlated with attributes via a
hidden factor: the relationship type
• Features
– The circle: a set of friends who have the same type of connections with the
ego
– The attribute-circle dependency: the friends in a circle share the same value
with the ego user for certain attributes.
– The circle-connection dependency: friends across circles are loosely
connected
• Cost function: the linear combination of the three features
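$$\text{cost} = \lambda_1 \sum_{t=1}^{K} \Big\{ \sum_{e_{ij} \in E',\, v_i, v_j \in C_i} \big(w_t \cdot (f_i - f_j)\big)^2 + \sum_{v_i \in C_i} \big(w_t \cdot (f_0 - f_i)\big)^2 \Big\} + \lambda_2 \sum_{t=1}^{K} \sum_{v_i \in L \cap C_i} (w_t \cdot f_i - 1)^2 + \lambda_3 \sum_{e_{ij} \in E',\, x_i \neq x_j} 1$$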
Rui Li, Chi Wang, Kevin Chen-Chuan Chang, User Profiling in an Ego Network: Co-profiling Attributes and Relationships, WWW’14, April 7-11, 2014, Seoul, Korea, ACM 978-1-4503-2744-2/14/04
43. Agenda
• Introduction
• Basic Concepts
• Text Mining in Social Network
• Sub-fields of Online Social Networks
• Relationship Mining Systems
• Summary
44. There Are Many Well-Developed Relationship Mining Systems
• Flink (Peter Mika, 2005)
• POLYPHONET (Matsuo et al., 2007)
• BC-PDM (Yu et al., 2012)
• etc.
45. Flink
Semantic Web technology for the extraction and
analysis of social networks
First Layer: metadata acquisition
Second Layer: Storage and Inference
Third Layer: visualization
Peter Mika, Flink: Semantic Web technology for the extraction and analysis of social networks, J. Web Semantics 3 (2) (2005) 211–223
46. POLYPHONET
• Social network extraction system that extracts relations of persons,
detects groups of persons and obtains keywords for a person
• Algorithms for social network extraction
– Basic algorithm: co-occurrence, matching coefficient, Jaccard coefficient,
overlap coefficient
– Advanced algorithm: classifying relations
– Scalability: GoogleCooc, GoogleCoocTop
[Figure: Overview of module dependency]
[Figure: Relate-identify process of iterative social network mining]
Yutaka Matsuo, et al., POLYPHONET: An advanced social network extraction system from the Web, Web Semantics: Science, Services and Agents on the World Wide Web, 2007
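The basic co-occurrence measures named on this slide can be written down from three counts: how often each name appears and how often both appear together. The sketch below uses the standard definitions of these coefficients; the function names and the idea of passing in precomputed hit counts (for example, from a search engine) are assumptions for illustration, not POLYPHONET's actual interface.

```python
def matching_coefficient(n_x, n_y, n_xy):
    """Raw co-occurrence count of the two names."""
    return n_xy


def jaccard_coefficient(n_x, n_y, n_xy):
    """Co-occurrences divided by documents mentioning either name."""
    union = n_x + n_y - n_xy
    return n_xy / union if union else 0.0


def overlap_coefficient(n_x, n_y, n_xy):
    """Co-occurrences divided by the count of the rarer name."""
    smaller = min(n_x, n_y)
    return n_xy / smaller if smaller else 0.0
```

The overlap coefficient is less sensitive than Jaccard to one person being far more prominent than the other, because it normalizes by the smaller count only.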
47. BC-PDM
• Data mining, social network analysis and text mining
system based on cloud computing
[Figure: The result of community detection]
[Figure: The architecture of BC-PDM]
Le Yu, et al. BC-PDM: Data Mining, Social Network Analysis and Text Mining System Based on Cloud Computing, KDD’12, Beijing China (2012) 1496-1499
48. Summary
• Online social networks play an increasingly important
role in our lives, and digital data allows us to
automatically find relationships in our social
networks
• Text Mining plays an important role in social
network applications
– Pre-processing: feature extraction, feature
selection, document representation
– Classification: Ontology Based, Machine Learning
Based
– Clustering: Hierarchical Clustering, Partitional
Clustering, Semantic-based Clustering
49. Summary (cont.)
Research in Relationship Mining
• Data
– Data acquisition, storage and visualization
– Scalability for large-scale network
• Approaches on Relationship Mining
– Strength of social ties
– Positive/Negative Social Ties Prediction
– Relationship classification
– Relationship mining
• Association between users’ attributes and relationships
There are many systems for relationship extraction and mining
• Flink, POLYPHONET, BC-PDM, etc.