This document proposes ANIM, a new influence maximization approach that builds on existing work. It uses a Yelp dataset to build a social network and to calculate edge weights. ANIM is a greedy algorithm that iteratively selects the node whose addition yields the largest gain in influence spread over the current seed set. Experiments show ANIM achieves better influence spread and runtime than algorithms such as DegreeDiscount and NewGreedyIC. The goal is to identify influential customers for local businesses to target through online review sites.
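The greedy selection step that ANIM shares with other influence maximization methods can be sketched as follows. This is a generic hill-climbing sketch, not ANIM itself: the summary does not give ANIM's Yelp-derived edge weights or spread estimator, so the Monte Carlo Independent Cascade estimator, the probabilities, and all names below are illustrative assumptions.

```python
import random

def simulate_ic(graph, seeds, trials=200, rng=None):
    """Estimate the expected influence spread of `seeds` under the
    Independent Cascade model by Monte Carlo simulation.
    `graph` maps node -> list of (neighbor, activation_probability)."""
    rng = rng or random.Random(0)
    total = 0
    for _ in range(trials):
        active = set(seeds)
        frontier = list(seeds)
        while frontier:
            nxt = []
            for u in frontier:
                for v, p in graph.get(u, []):
                    if v not in active and rng.random() < p:
                        active.add(v)
                        nxt.append(v)
            frontier = nxt
        total += len(active)
    return total / trials

def greedy_im(graph, k):
    """Greedy hill-climbing: repeatedly add the node with the largest
    marginal gain in estimated spread over the current seed set."""
    seeds = set()
    for _ in range(k):
        base = simulate_ic(graph, seeds)
        best_node, best_gain = None, float("-inf")
        for v in graph:
            if v in seeds:
                continue
            gain = simulate_ic(graph, seeds | {v}) - base
            if gain > best_gain:
                best_node, best_gain = v, gain
        seeds.add(best_node)
    return seeds
```

On a toy chain a -> b -> c with certain activation, the first greedy pick is the chain's head, since it reaches all three nodes.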
Revenue Maximization in Incentivized Social Advertising (Cigdem Aslay)
The document summarizes research on maximizing revenue from incentivized social advertising. It introduces models for topic-aware independent cascade influence propagation and cost-per-engagement pricing. Algorithms are presented for selecting initial advertisers (seeds) to maximize total revenue subject to budgets, including cost-agnostic and cost-sensitive greedy approaches. The problem is shown to be NP-hard, and approximation guarantees are derived. Scalable two-phase iterative algorithms are developed based on reverse influence sampling. Experiments demonstrate the algorithms on real-world networks under different seed incentive models. Open problems concern building on more recent influence maximization techniques and improving scalability.
In this talk we detail the steps to creating a Visual Search engine for 1M Amazon products using MXNet Gluon and the K-Nearest Neighbor search library HNSW.
For implementation details, check this repository: https://github.com/ThomasDelteil/VisualSearch_MXNet
Video available here:
https://www.youtube.com/watch?v=9a8MAtfFVwI
Demo website available here:
https://thomasdelteil.github.io/VisualSearch_MXNet/
Two Stage Reversible Data Hiding Based On Image Interpolation and Histogram ... (IJMER)
In this paper a two-stage reversible data hiding technique is proposed. In the first stage, an interpolation technique is used to generate a cover image from the input image. The difference values between the input image and the cover image are used as the carrier to embed data. In the second stage, histogram modification is applied to the difference image to embed data. The extraction process also works in two stages. The proposed algorithm is expected to increase the embedding capacity since two techniques are combined, while the interpolation technique helps keep the distortion low. Experimental results show that the new method has higher embedding capacity than other existing methods.
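As an illustration of the histogram-modification stage, here is a minimal histogram-shifting embed/extract sketch on a 1-D integer sequence. The paper's actual two-stage scheme and its interpolation step are not specified above, so this is a generic textbook variant that assumes the first empty bin lies above the histogram peak.

```python
def hs_embed(pixels, bits):
    """Histogram-shifting embed: values equal to the histogram peak carry
    one bit each; values between the peak and the first empty bin are
    shifted by one to make room (assumes peak < empty bin, for brevity)."""
    hist = {}
    for p in pixels:
        hist[p] = hist.get(p, 0) + 1
    peak = max(hist, key=hist.get)
    zero = peak + 1
    while hist.get(zero, 0) > 0:       # find the first empty bin above the peak
        zero += 1
    out, it = [], iter(bits)
    for p in pixels:
        if peak < p < zero:
            out.append(p + 1)          # shift to free the peak+1 bin
        elif p == peak:
            b = next(it, None)
            out.append(p + 1 if b == 1 else p)
        else:
            out.append(p)
    return out, peak, zero

def hs_extract(marked, peak, zero):
    """Recover the bits and the original pixels exactly (reversibility)."""
    bits, restored = [], []
    for p in marked:
        if p == peak:
            bits.append(0); restored.append(p)
        elif p == peak + 1:
            bits.append(1); restored.append(peak)
        elif peak + 1 < p <= zero:
            restored.append(p - 1)     # undo the shift
        else:
            restored.append(p)
    return bits, restored
```

The embedding capacity equals the peak-bin count, which is why combining two carriers (interpolation differences plus histogram shifting) raises capacity.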
From Competition to Complementarity: Comparative Influence Diffusion and Maxi... (Wei Lu)
VLDB'16 Research Paper.
Influence maximization is a well-studied problem that asks for a
small set of influential users from a social network, such that by targeting them as early adopters, the expected total adoption through influence cascades over the network is maximized. However, almost all prior work focuses on cascades of a single propagating entity or purely-competitive entities. In this work, we propose the Comparative Independent Cascade (Com-IC) model that covers the full spectrum of entity interactions from competition to complementarity. In Com-IC, users’ adoption decisions depend not only on edge-level information propagation, but also on a node-level automaton whose behavior is governed by a set of model parameters, enabling our model to capture not only competition, but also complementarity, to any possible degree. We study two natural optimization problems, Self Influence Maximization and Complementary Influence Maximization, in a novel setting with complementary
entities. Both problems are NP-hard, and we devise efficient
and effective approximation algorithms via non-trivial techniques
based on reverse-reachable sets and a novel “sandwich approximation” strategy. The applicability of both techniques extends beyond our model and problems. Our experiments show that the proposed algorithms consistently outperform intuitive baselines on four real-world social networks, often by a significant margin. In addition, we learn model parameters from real user action logs.
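The reverse-reachable (RR) set machinery the paper builds on can be illustrated for the plain Independent Cascade model; Com-IC's RR sets and the sandwich approximation are considerably more involved, so the sketch below is a simplified assumption-laden stand-in, with all names and parameters chosen for illustration.

```python
import random

def sample_rr_set(nodes, in_edges, rng):
    """One reverse-reachable (RR) set under the plain IC model: pick a
    random root, then walk live *incoming* edges backwards from it.
    `in_edges` maps node -> list of (in_neighbor, probability)."""
    root = rng.choice(nodes)
    rr, frontier = {root}, [root]
    while frontier:
        nxt = []
        for v in frontier:
            for u, p in in_edges.get(v, []):
                if u not in rr and rng.random() < p:
                    rr.add(u)
                    nxt.append(u)
        frontier = nxt
    return rr

def ris_seeds(nodes, in_edges, k, num_samples=2000, seed=0):
    """Greedy max-coverage over RR sets: a node that appears in many RR
    sets has proportionally high expected influence."""
    rng = random.Random(seed)
    rr_sets = [sample_rr_set(nodes, in_edges, rng) for _ in range(num_samples)]
    seeds = set()
    for _ in range(k):
        counts = {}
        for s in rr_sets:
            if s.isdisjoint(seeds):        # only still-uncovered RR sets
                for u in s:
                    counts[u] = counts.get(u, 0) + 1
        if not counts:
            break
        seeds.add(max(counts, key=counts.get))
    return seeds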
Scalable and Parallelizable Processing of Influence Maximization for Large-S... (Jinha Kim)
This document discusses an algorithm called IPA (Influence Path Algorithm) for efficiently solving the influence maximization problem in large social networks. IPA works by extracting meaningful influence paths between node pairs from the graph and evaluating influence in parallel by approximating the influence spread of individual nodes. An empirical evaluation on real datasets shows IPA can process influence maximization much faster than other algorithms while achieving comparable influence spread results.
This thesis constitutes one of the first investigations at the intersection of social influence propagation, viral marketing, and social advertising. Its objective is to take the algorithmic aspects of viral marketing out of the lab and extend them to account for real-world social advertisement models, drawing on the viral marketing literature to study social-influence-aware ad allocation for social advertising. To this end, we first take a step towards enabling online social influence analytics in support of viral marketing decision making, and propose an efficient influence indexing framework that can accurately answer topic-aware viral marketing queries with millisecond response times. We then initiate an investigation of social advertising through the viral marketing lens, aligned with real-world social advertisement models, and introduce two fundamental optimization problems concerning the allocation of ads to social network users under social influence. We devise greedy approximation algorithms with provable approximation guarantees for these novel problems. We also develop scalable versions of our approximation algorithms by leveraging reverse reachability sampling on social graphs, and experimentally confirm that our algorithms are scalable and deliver high-quality solutions.
The document discusses different search techniques including binary search, bisection, and ternary search. It provides descriptions of binary search and gives iterative and recursive pseudocode implementations. It also discusses the logarithmic worst case performance of binary search. Ternary search is introduced as a technique for finding the minimum or maximum of a unimodal function. Bisection is described as dividing elements into two parts and sorting one part to enable binary search.
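A minimal Python rendering of the two techniques described (the document itself gives pseudocode): iterative binary search on a sorted array, and ternary search for the minimum of a unimodal function.

```python
def binary_search(arr, target):
    """Iterative binary search: O(log n) comparisons on a sorted array.
    Returns the index of `target`, or -1 if absent."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        if arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

def ternary_min(f, lo, hi, eps=1e-9):
    """Ternary search for the minimum of a unimodal function on [lo, hi]:
    each step discards a third of the interval."""
    while hi - lo > eps:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2
```

For maximizing a unimodal function, flip the comparison in `ternary_min` (or pass `lambda x: -f(x)`).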
IMAX uses large-format 70mm film and a proprietary projection system to provide cinema experiences with increased image clarity and resolution compared to standard formats. Key aspects of the IMAX system include large 65mm film cameras, dual projectors that combine two images, and specially designed theaters with massive screens close to audiences for an immersive effect. IMAX digital projection now uses two 2K or 4K resolution projectors from Christie to replicate the IMAX experience digitally.
This document describes an algorithm to automatically classify social media contacts into circles or lists. It first identifies leaders in an ego network by calculating metrics like betweenness centrality, clustering coefficient, and degree density for each node. Leaders are used as seeds to form circles by including their friend connections. The algorithm uses conductance scoring to iteratively add or remove nodes at the circle boundaries. The approach is evaluated on a Facebook dataset and achieves results comparable to previous methods, with balanced error rates and F1 scores used as metrics.
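The conductance score used at the circle boundaries can be sketched as follows. The exact definition the algorithm uses is not given above, so this is the standard cut-over-volume form, on an adjacency list that stores each undirected edge in both directions.

```python
def conductance(adj, circle):
    """Conductance of a node set: boundary edges / min(volume inside,
    volume outside). Lower is better; the circle algorithm can grow or
    shrink a circle at its boundary to reduce this score.
    `adj` lists each undirected edge in both directions."""
    circle = set(circle)
    # Each cut edge appears twice in the ordered-pair count, so halve it.
    cut = sum(1 for u in adj for v in adj[u]
              if (u in circle) != (v in circle)) // 2
    vol_in = sum(len(adj[u]) for u in circle)
    vol_out = sum(len(adj[u]) for u in adj if u not in circle)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0
```

On two triangles joined by a single bridge edge, one triangle forms a circle with conductance 1/7: one cut edge over a volume of seven edge endpoints.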
The document discusses big data challenges and potential solutions. It begins by outlining how big data is generated from various sources and used in applications like search engines. The main challenges are determining which subset of big data to analyze and how to clean noisy data. Two potential solutions discussed are:
1) Intelligent sampling to determine a representative subset of data to analyze instead of the entire dataset, in order to improve running time. Adaptive sampling techniques like IDASA are proposed.
2) Filtering techniques like ensemble filtering use multiple models to identify and remove mislabeled instances from training data, in order to improve predictive accuracy by cleaning the data. Bayesian analysis can interpret filtering as a form of model averaging.
The document provides an overview of cluster analysis techniques. It discusses the need for segmentation to group large populations into meaningful subsets. Common clustering algorithms like k-means are introduced, which assign data points to clusters based on similarity. The document also covers calculating distances between observations, defining the distance between clusters, and interpreting the results of clustering analysis. Real-world applications of segmentation and clustering are mentioned such as market research, credit risk analysis, and operations management.
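A minimal k-means sketch matching the description above: assign each point to its nearest centroid, then move each centroid to the mean of its cluster. The random initialization and squared Euclidean distance are assumptions; the overview does not fix these choices.

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """Plain k-means on 2-D points: alternate assignment and update
    steps until the centroids stop moving."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assignment step: nearest centroid by squared distance.
            d = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[d.index(min(d))].append(p)
        new = []
        for i, cl in enumerate(clusters):
            if cl:
                # Update step: centroid moves to the cluster mean.
                new.append((sum(p[0] for p in cl) / len(cl),
                            sum(p[1] for p in cl) / len(cl)))
            else:
                new.append(centroids[i])   # keep an empty cluster in place
        if new == centroids:
            break
        centroids = new
    return centroids, clusters
```

Two well-separated pairs of points converge to their pair means regardless of which points are sampled as initial centroids.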
High-performance graph analysis is unlocking knowledge in computer security, bioinformatics, social networks, and many other data integration areas. Graphs provide a convenient abstraction for many data problems beyond linear algebra. Some problems map directly to linear algebra. Others, like community detection, look eerily similar to sparse linear algebra techniques. And then there are algorithms that strongly resist attempts at making them look like linear algebra. This talk will cover recent results with an emphasis on streaming graph problems, where the graph changes and results need to be updated with minimal latency. We’ll also touch on issues of sensitivity and reliability, where graph analysis needs to learn from numerical analysis and linear algebra.
Ripple Algorithm to Evaluate the Importance of Network Nodes (rahulmonikasharma)
In this paper, a ripple algorithm to evaluate the importance of network nodes is proposed. Its principle is based on the direct influence of adjacent nodes, with farther nodes affected indirectly through closer ones, just like ripples on water. Two criteria, the discrimination of node importance and the accuracy of key-node selection, are then defined to verify its efficiency: greater discrimination and higher accuracy mean a more effective algorithm. Finally, experiments on the ARPA network compare the efficiency of different methods, including closeness centrality, node deletion, node contraction, the algorithm proposed by Zhou Xuan et al., and the ripple method. Results show that the ripple algorithm outperforms the other measures in both the discrimination of node importance and the accuracy of key-node selection.
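A toy version of the ripple idea, with influence attenuated by hop distance: neighbors contribute directly and farther nodes indirectly through them. The geometric decay factor and the scoring form are assumptions; the paper's exact formula is not reproduced in the summary.

```python
from collections import deque

def ripple_score(adj, node, decay=0.5):
    """Toy ripple-style importance: BFS from `node`, with each other
    node contributing decay**distance (the decay value is an assumed
    stand-in for the paper's attenuation rule)."""
    dist = {node: 0}
    q = deque([node])
    while q:
        u = q.popleft()
        for v in adj.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return sum(decay ** d for n, d in dist.items() if n != node)
```

On a star graph the hub scores higher than any leaf, matching the intuition that it ripples outward more strongly.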
Introduction to machine learning and model building using linear regression (Girish Gore)
A basic introduction to machine learning and a kick-start to the model-building process using linear regression. It covers the fundamentals of the machine learning field of data science, focusing on the supervised learning method of linear regression. Importantly, it does so using the R language and shows how to interpret the results of a linear regression model. Interpretation of results, tuning, and accuracy metrics such as RMSE (root mean squared error) are covered.
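The slides use R, but the same fit and accuracy metric can be sketched in a few lines of Python: closed-form ordinary least squares for one predictor, plus the RMSE the slides discuss.

```python
def fit_simple_ols(xs, ys):
    """Ordinary least squares for y = a + b*x, in closed form:
    b = cov(x, y) / var(x), a = mean(y) - b * mean(x)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

def rmse(ys, preds):
    """Root mean squared error between observed and predicted values."""
    return (sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)) ** 0.5
```

On exactly linear data (y = 1 + 2x) the fit recovers the coefficients and the RMSE is zero; real data leaves a positive RMSE that the tuning discussion in the slides aims to reduce.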
A network pruning based approach for subset specific influential detection (Arun Kalyanasundaram)
This document presents an iterative pruning approach for subset-specific influential detection in networks. The approach identifies a set of "influenced" nodes based on a threshold parameter and prunes paths that only lead to these nodes. This pruning significantly improves efficiency compared to existing greedy algorithms with only a small reduction in influence spread. Experiments on two real-world networks show the approach achieves up to 96% better efficiency than subset-adapted greedy algorithms while maintaining most of the influence spread.
This document analyzes the structure and evolution of call graphs from mobile telecom data over time. It finds that the call graphs exhibit small-world properties and follow a bow-tie or "treasure hunt" model topology. Additionally, the size of the strongly connected component increases rapidly over time as preferential attachment pulls more nodes into the dense core of the graph.
This document discusses various software quality metrics including lines of code count, defect rates based on lines of code, cyclomatic complexity, fan-in and fan-out, and structural and data complexity metrics. It explains that while lines of code is commonly used, it does not fully capture complexity. Other metrics like cyclomatic complexity, fan-in/fan-out, and data/structural complexity provide additional insight into a program's quality and maintainability. The optimal size of a program may depend on factors like language, project, and environment.
This document discusses various software quality metrics including lines of code count, defect density as it relates to size, cyclomatic complexity, fan-in/fan-out, and other structural and data complexity metrics. It provides empirical data on the relationship between size and defects, defines key metrics like cyclomatic complexity, and discusses how these metrics can help evaluate software quality and estimate testing effort.
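Two of the graph-based metrics mentioned above reduce to short formulas, sketched here: McCabe's cyclomatic complexity of a control-flow graph, and fan-in/fan-out counts from a call relation (the Henry-Kafura length factor is omitted).

```python
def cyclomatic_complexity(edges, nodes, components=1):
    """McCabe's cyclomatic complexity V(G) = E - N + 2P for a
    control-flow graph with E edges, N nodes, P connected components."""
    return len(edges) - len(nodes) + 2 * components

def fan_in_out(calls):
    """Fan-out: distinct modules each module calls.
    Fan-in: distinct modules that call it.
    `calls` maps a module to the list of modules it calls."""
    fan_out = {m: len(set(cs)) for m, cs in calls.items()}
    fan_in = {m: 0 for m in calls}
    for m, cs in calls.items():
        for c in set(cs):
            fan_in[c] = fan_in.get(c, 0) + 1
    return fan_in, fan_out
```

A single if/else (one decision point) gives V(G) = 2, i.e. two independent paths to cover in testing, which is how these metrics feed testing-effort estimates.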
(141205) Masters_Thesis_Defense_Sundong_Kim (Sundong Kim)
Master's thesis defense presentation slides
Topic : Maximizing Influence over a Target user through Friend Recommendation
Presenter : Sundong Kim @ KAIST IsysE department
Keywords : Social network, Friend recommendation, Incremental Algorithm, Maximizing influence
A basic explanation of graph mining for social network analysis (SNA). It describes some metrics and benefits of SNA, focusing on the telecommunications field. A basic Spark with GraphX script to analyse the graph is also included in the slides.
- The document describes a project to predict customer churn for a telecom company using classification algorithms. It analyzes a dataset of 3333 customers to identify variables that contribute to churn and builds models using KNN and C4.5.
- The C4.5 model achieved higher accuracy (94.9%) than KNN (87.1%) on the test data. Key variables for predicting churn were found to be day minutes, customer service calls, and international plan.
- The model can help the telecom company prevent churn by focusing retention efforts on at-risk customers identified through these important variables.
This document discusses using clustering algorithms to construct ontologies from text documents. It begins with an introduction to semantic search, ontologies in the semantic web, and clustering. It then describes the ROCK clustering algorithm in detail. The main tasks to perform are preprocessing text documents, normalizing term weights, applying latent semantic indexing via singular value decomposition, and using the ROCK clustering algorithm. The goal is to group similar documents into clusters to help construct an ontology from the unstructured text data.
RESOLVING MULTI-PARTY PRIVACY CONFLICTS IN SOCIAL MEDIA (Nexgen Technology)
TO GET THIS PROJECT COMPLETE SOURCE ON SUPPORT WITH EXECUTION PLEASE CALL BELOW CONTACT DETAILS
MOBILE: 9791938249, 0413-2211159, WEB: WWW.NEXGENPROJECT.COM,WWW.FINALYEAR-IEEEPROJECTS.COM, EMAIL:Praveen@nexgenproject.com
NEXGEN TECHNOLOGY provides total software solutions to its customers. Apsys works closely with customers to identify their business processes for computerization and helps them implement state-of-the-art solutions, identifying and enhancing those processes through information technology. In this way, NEXGEN TECHNOLOGY helps its customers use their resources optimally.
This document provides an overview of relational machine learning models and applications. It discusses how networks and graphs like social networks, biological networks, financial networks, and knowledge graphs can be modeled using relational machine learning. Specific models discussed include recommendation engines that use matrix factorization, the RESCAL model for multi-relational data, bilinear diagonal models for scalability, and TransE which models relationships as translations in the embedding space. The document also covers generating negative samples and different loss functions used for training these models.
This is the presentation for the paper "Fractional Step Discriminant Pruning: A Filter Pruning Framework for Deep Convolutional Neural Networks", delivered by N. Gkalelis and V. Mezaris at the 7th IEEE Int. Workshop on Mobile Multimedia Computing (MMC2020) that was held as part of the IEEE Int. Conf. on Multimedia and Expo (ICME), in July 2020.
Creating Community at WeWork through Graph Embeddings with node2vec - Karry Lu (Rising Media Ltd.)
This document discusses how WeWork is using graph embeddings and the node2vec algorithm to power member recommendations. It first describes WeWork's member knowledge graph that contains data on members' profiles, interactions, interests and skills. It then explains how node2vec can learn vector representations of each member node that capture similarities, which can be used for recommendations. WeWork runs node2vec on the social graph of each location to map members to vectors and identify the most similar members to power recommendations like onboarding suggestions and introductions between members.
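The walk-generation half of node2vec can be sketched as follows. With the return and in-out parameters p = q = 1, node2vec's biased second-order walk degenerates to a uniform random walk (DeepWalk), which is the simplification used here; the walks would then be fed to a skip-gram model such as word2vec to produce one vector per member node.

```python
import random

def generate_walks(adj, walks_per_node=10, walk_len=5, seed=0):
    """Random-walk corpus generation over a social graph: from every
    node, take `walks_per_node` uniform walks of up to `walk_len` steps
    (the p = q = 1 special case of node2vec's biased walk)."""
    rng = random.Random(seed)
    walks = []
    for start in adj:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_len:
                nbrs = adj.get(walk[-1], [])
                if not nbrs:
                    break           # dead end: stop the walk early
                walk.append(rng.choice(nbrs))
            walks.append(walk)
    return walks
```

Members who co-occur in many walks end up with nearby embedding vectors, which is what makes cosine similarity over these vectors usable for the recommendations described above.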
Similar to A Novel Target Marketing Approach based on Influence Maximization (20)
Creating Community at WeWork through Graph Embeddings with node2vec - Karry Lu
A Novel Target Marketing Approach based on Influence Maximization
1. A Novel Target Marketing
Approach based on
Influence Maximization
2. Motivation
• “Businesses on Facebook and Twitter are reaching only 2% of their fans, and only 0.07% of followers actually interact with their posts.” – Forrester Study, Nov. 17, 2014
• Local business owners need to target people nearby in order to increase footfall.
• Traditional marketing methods, such as leafleting, are inefficient.
• “82% of people check reviews online before spending money on a product or service.” – Nielsen Study, July 1, 2013
• Local businesses can use online review websites like Yelp and Zomato to target customers effectively.
3. Problem Statement
• “To develop a novel approach for identification of influential customers for target marketing through influence maximization.”
[Fig. 1: Objectives]
4. Influence Maximization
• Influence maximization is the problem of finding K vertices in a graph such that, under a given diffusion model, the expected number of vertices influenced by those K vertices (referred to as the influence spread) is as large as possible.
• The Independent Cascade (IC) model is the simplest diffusion model: if j is a neighbor of i, then when i becomes active it gets one chance to activate j, succeeding with probability pij associated with the edge weight wij (Eq. 5).
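The activation rule above can be illustrated with a small Monte-Carlo estimator of influence spread under the IC model (a minimal sketch; the adjacency-list format and trial count are choices of this illustration, not the paper's implementation):

```python
import random

def simulate_ic(graph, seeds, trials=1000):
    """Estimate the IC influence spread of `seeds` by simulation.

    `graph` maps each node i to a list of (j, p_ij) pairs, where p_ij is
    the propagation probability on edge i -> j.
    """
    total = 0
    for _ in range(trials):
        active = set(seeds)
        frontier = list(seeds)
        while frontier:
            newly = []
            for i in frontier:
                for j, p in graph.get(i, []):
                    # A newly active node i gets one chance to activate
                    # each still-inactive neighbor j, with probability p_ij.
                    if j not in active and random.random() < p:
                        active.add(j)
                        newly.append(j)
            frontier = newly
        total += len(active)
    return total / trials

# Toy network with deterministic edges: A activates B and C, B activates D.
g = {"A": [("B", 1.0), ("C", 1.0)], "B": [("D", 1.0)]}
print(simulate_ic(g, ["A"], trials=10))  # 4.0
```

With probabilistic edge weights the estimate converges to the expected spread as `trials` grows; this is the quantity the algorithms below try to maximize.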
5. Existing Work
• Kempe et al. were the first to study influence maximization as an optimization problem.
• They proved it is an NP-hard problem and gave a time-inefficient greedy algorithm.
• GeneralGreedy runs k rounds: in the i-th round, it selects the node v that provides the largest increase in influence spread.
• In each round, the influence spread is estimated by Monte-Carlo simulations.
6. Cont’d
• Chen et al. developed NewGreedyIC, an improved greedy algorithm.
• NewGreedyIC also relies on Monte-Carlo simulation, but in each iteration it generates a random graph G′ by randomly removing edges from the original graph G. The graph used in that iteration is therefore smaller, which makes NewGreedyIC faster than GeneralGreedy.
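The snapshot idea can be sketched as follows (a simplified illustration of the principle, not Chen et al.'s actual implementation): decide each edge's coin flip once to obtain the subgraph G′, after which the spread of any seed set is plain reachability in G′.

```python
import random
from collections import deque

def snapshot(graph):
    """Build one random subgraph G' by keeping each edge (i, j, p_ij)
    with probability p_ij, as in NewGreedyIC's per-iteration graph."""
    return {i: [j for j, p in nbrs if random.random() < p]
            for i, nbrs in graph.items()}

def reach(g_prime, seeds):
    """Count the nodes reachable from `seeds` in the snapshot (BFS)."""
    seen, queue = set(seeds), deque(seeds)
    while queue:
        u = queue.popleft()
        for v in g_prime.get(u, []):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen)

g = {"A": [("B", 1.0)], "B": [("C", 1.0)], "C": []}
g_prime = snapshot(g)        # every edge survives here since p = 1.0
print(reach(g_prime, ["A"])) # 3
```

Averaging `reach` over many snapshots gives the same estimate as re-simulating the cascade, but each snapshot can be reused for every candidate seed in that iteration.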
7. Cont’d
• Chen et al. also proposed the more efficient DegreeDiscount method.
• DegreeDiscount runs no Monte-Carlo simulations; it uses a degree-discount heuristic that assumes the spread increases with the degree of a node.
• It discounts a node's degree by one for each of its neighbors that has already been selected into the set of active nodes.
• It is about six times faster than NewGreedyIC, with an influence spread only slightly lower than NewGreedyIC's.
[Figure: after selecting node A, the degrees 3, 5, 6 of its neighbors are discounted to 2, 4, 5]
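The discount rule as described on this slide can be sketched as below (a minimal version that subtracts one per already-selected neighbor; Chen et al.'s full degreeDiscountIC uses a more refined discount formula involving the propagation probability):

```python
def degree_discount(adj, k):
    """Pick k seeds by repeatedly taking the node with the highest
    effective degree, discounting a node's degree by one for each
    neighbor already selected. `adj` maps node -> set of neighbors."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    seeds = []
    for _ in range(k):
        u = max((v for v in degree if v not in seeds), key=degree.get)
        seeds.append(u)
        for w in adj[u]:
            if w not in seeds:
                degree[w] -= 1  # one of w's neighbors is now a seed
    return seeds

# A is a hub; once it is selected, its neighbors B, C, E are discounted.
adj = {"A": {"B", "C", "E"}, "B": {"A", "C", "D"},
       "C": {"A", "B", "D"}, "D": {"B", "C"}, "E": {"A"}}
print(degree_discount(adj, 2))  # ['A', 'B']
```

Because no cascades are simulated, each round costs only a scan of the degree table plus a pass over the new seed's neighbors, which is the source of the speedup over NewGreedyIC.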
8. Inspiration from existing work
• The DegreeDiscount method eliminates the need for Monte-Carlo simulations by using a degree heuristic.
• This reduces running time manifold compared to NewGreedyIC.
9. Research Gap
• DegreeDiscount doesn’t take into account the overlap between the spreads of two influential nodes.
• Because of this overlap, their total influence spread is less than the sum of their individual influence spreads.
• Our novel algorithm adds as the k-th node the one that maximizes the difference between the spread of the already selected k-1 nodes and the spread of the k nodes after the addition.
• In the figure, C-A gives a larger difference in spread than B-A.
[Figure: spreads of nodes A, B, and C, with B’s spread overlapping A’s]
13. Data and Preprocessing
• The semi-structured data obtained from Yelp is stored in a document-oriented database.
• Preprocessing is done to clean the data.
• Social network is formed from users who have reviewed similar nearby businesses.
• Users are represented as nodes in the network, and two nodes are joined by an edge only if
they are friends.
14. Edge weight calculation in network
• The weight of the edge between two users X and Y is calculated by combining two components, w1 and w2.
• w1 is the normalized count of mutual friends of X and Y, where nX and nY are the sets of friends of user X and user Y.
• w2 signifies the similarity in opinion between user X and user Y, based on Xpos, the set of businesses that X rated positively, and Xneg, the set that X rated negatively (and likewise for Y).
• We have considered a rating of 3 or below a negative review, and 4 or above a positive review.
[Eqs. 7, 8, 9 appear as images in the original slides]
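Since Eqs. 7-9 appear only as images in the slides, the exact normalizations are not recoverable; the sketch below is one plausible reading in which w1 is a Jaccard-style normalization of mutual friends and w2 is the fraction of co-reviewed businesses on which X and Y agree. Both definitions are assumptions of this illustration, not the paper's formulas.

```python
def edge_weights(friends, pos, neg, x, y):
    """Compute the two edge-weight components for users x and y.

    friends[u]: set of u's friends; pos[u]/neg[u]: businesses u rated
    positively (>= 4 stars) / negatively (<= 3 stars).
    NOTE: both normalizations below are assumptions, not Eqs. 7-9.
    """
    nx, ny = friends[x], friends[y]
    # w1: mutual friends, normalized by the union of the friend sets.
    w1 = len(nx & ny) / len(nx | ny) if nx | ny else 0.0

    # w2: share of commonly reviewed businesses where opinions agree
    # (both positive or both negative).
    common = (pos[x] | neg[x]) & (pos[y] | neg[y])
    agree = (pos[x] & pos[y]) | (neg[x] & neg[y])
    w2 = len(agree) / len(common) if common else 0.0
    return w1, w2

friends = {"X": {"A", "B", "C"}, "Y": {"B", "C", "D"}}
pos = {"X": {"cafe", "gym"}, "Y": {"cafe"}}   # ratings >= 4
neg = {"X": {"bar"}, "Y": {"bar", "gym"}}     # ratings <= 3
print(edge_weights(friends, pos, neg, "X", "Y"))  # (0.5, 0.666...)
```

Here X and Y share 2 of 4 distinct friends (w1 = 0.5) and agree on "cafe" and "bar" but disagree on "gym" (w2 = 2/3).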
15. Propagation probability calculation in network
• The propagation probability of an edge going from u to v is calculated from the strength of that edge.
• The strength of an edge between u and v is the average of the influence of u and v, where influence combines popularity with clustering value.
• For popularity we used two attributes of the user, reviewCount and averageStars.
• The clustering value is defined as the closeness of a node to a cluster of highly interconnected nodes; C(v) denotes the clustering value of a node.
• Quartiles were used for normalization.
[Eqs. 10, 11, 15, 16, 17 appear as images in the original slides]
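Eqs. 10-17 are likewise images in the slides, so the sketch below only illustrates one way the pieces could fit together: C(v) is read as the local clustering coefficient, and reviewCount and averageStars are rescaled to [0, 1] (the slides' quartile normalization is omitted). All weights and normalizations here are assumptions of this illustration.

```python
def clustering_value(adj, v):
    """Local clustering coefficient of v: the fraction of pairs of v's
    neighbors that are themselves connected (one plausible reading of C(v))."""
    nbrs = list(adj[v])
    d = len(nbrs)
    if d < 2:
        return 0.0
    links = sum(1 for i in range(d) for j in range(i + 1, d)
                if nbrs[j] in adj[nbrs[i]])
    return 2.0 * links / (d * (d - 1))

def influence(user, adj, max_reviews=1000):
    """Combine popularity (reviewCount, averageStars) with clustering value.
    The equal weights and the max_reviews cap are assumptions."""
    popularity = 0.5 * min(user["reviewCount"] / max_reviews, 1.0) \
               + 0.5 * user["averageStars"] / 5.0
    return 0.5 * popularity + 0.5 * clustering_value(adj, user["id"])

def propagation_probability(u, v, adj):
    """Strength of edge (u, v): the average of the endpoint influences."""
    return (influence(u, adj) + influence(v, adj)) / 2.0

# On a triangle, every node's clustering value is 1.0.
adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
user = {"id": "a", "reviewCount": 200, "averageStars": 4.0}
print(clustering_value(adj, "a"))                # 1.0
print(propagation_probability(user, user, adj))  # 0.75
```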
17. Our novel approach: spreadHeuristicIC Algorithm
• The proposed algorithm is a greedy algorithm.
• It iteratively finds a node and adds it to the set S of the top-K influential nodes.
• When adding the k-th node to S, it finds the node that maximizes the difference between the spread of the already selected k-1 nodes and the spread of set S after adding that k-th node.
[Figure: candidate nodes A, B, and C]
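The selection rule can be sketched as follows (reconstructed from the slide text; the paper's implementation details may differ). On a toy graph where B's spread duplicates A's while C's is disjoint, the sketch picks A first and then prefers C over B:

```python
import random

def spread(graph, seeds, trials=300):
    """Monte-Carlo estimate of IC influence spread (as on slide 4)."""
    total = 0
    for _ in range(trials):
        active, frontier = set(seeds), list(seeds)
        while frontier:
            newly = []
            for u in frontier:
                for v, p in graph.get(u, []):
                    if v not in active and random.random() < p:
                        active.add(v)
                        newly.append(v)
            frontier = newly
        total += len(active)
    return total / trials

def spread_heuristic_ic(graph, k, trials=300):
    """Choose the k-th seed as the node maximizing the difference between
    the spread of S with that node added and the spread of the already
    selected k-1 nodes, which penalizes overlapping spreads."""
    nodes = set(graph) | {v for nbrs in graph.values() for v, _ in nbrs}
    seeds = []
    for _ in range(k):
        base = spread(graph, seeds, trials)
        best = max(sorted(v for v in nodes if v not in seeds),
                   key=lambda v: spread(graph, seeds + [v], trials) - base)
        seeds.append(best)
    return seeds

# B duplicates A's reach (X1, X2); C reaches fresh nodes (Y1, Y2).
g = {"A": [("X1", 1.0), ("X2", 1.0)],
     "B": [("X1", 1.0), ("X2", 1.0)],
     "C": [("Y1", 1.0), ("Y2", 1.0)]}
print(spread_heuristic_ic(g, 2, trials=5))  # ['A', 'C']
```

Adding B to {A} gains only one node (B itself), while adding C gains three, so the difference criterion steers the selection away from overlapping spreads.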
19. Complexity Analysis
• Line 3 of the algorithm takes O(V) steps, and line 4 takes O(T) time, where T is the time to compute the coverage of a node in graph G; this is O(I·E), where I is the number of simulations of the Independent Cascade model and E is the number of edges in G.
• In lines 7-9, each line costs O(V lg V) when sorting is used for the union operation.
• The overall complexity of the algorithm is therefore O(K(V·I·E + V lg V)).
20. Experiments and Results
• We have conducted experiments on Yelp’s network with our algorithm and various others (the DegreeDiscount algorithm, the SingleDiscount algorithm, the NewGreedyIC algorithm, the GeneralGreedy algorithm, etc.).
• We find that the spread-heuristic-based algorithm achieves a larger influence spread than the other algorithms. The ranking by influence spread comes out to be:
spreadHeuristicIC > newGreedyIC > degreeDiscountIC > random
21. Cont’d
[Fig. 7 / Fig. 9: Influence spread vs. K (0 to 80) for G with n=1617, E=2058 and for G with n=4292, E=8147. Algorithms compared: degreeDiscountIC, degreeDiscountIC2, degreeDiscountStar, degreeHeuristic, degreeHeuristic2, singleDiscount, highestDegree, newGreedyIC, randomHeuristic, spreadHeuristic.]
22. Cont’d
[Fig. 8 / Fig. 10: Running time (sec) vs. K (0 to 80) for G with n=1617, E=2058 and for G with n=4292, E=8147, comparing the same ten algorithms.]
23. Conclusion
• With respect to the initial aims and objectives of this project, the final outcome is fairly successful.
• After a series of experiments, we concluded that our algorithm outperforms existing influence maximization algorithms.
• We developed a dashboard for businesses to visualize the influential users and their spread among the people nearby.
24. References
[1] M. E. J. Newman and M. Girvan. Finding and evaluating community structure in networks. Physical Review E 69(2) (2004): 026113.
[2] V. D. Blondel et al. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment 2008(10): P10008.
[3] J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and N. S. Glance. Cost-effective outbreak detection in networks. In Proc. 13th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, pages 420-429, 2007.
[4] “Yelp Dataset,” https://www.yelp.com/dataset challenge/dataset.
[5] D. Kempe, J. Kleinberg, and É. Tardos. Maximizing the spread of influence through a social network. In Proc. 9th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2003.
[6] M. Richardson and P. Domingos. Mining knowledge-sharing sites for viral marketing. In Proc. 8th Intl. Conf. on Knowledge Discovery and Data Mining, 2002.
[7] J. Goldenberg, B. Libai, and E. Muller. Talk of the network: A complex systems look at the underlying process of word-of-mouth. Marketing Letters 12(3) (2001): 211-223.
[8] M. Granovetter. Threshold models of collective behavior. The American Journal of Sociology 83(6): 1420-1443, May 1978.
[9] W. Chen, Y. Wang, and S. Yang. Efficient influence maximization in social networks. In Proc. 15th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2009.
[10] D. Kempe, J. Kleinberg, and É. Tardos. Maximizing the spread of influence through a social network. In Proc. 9th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2003.
[11] Y. Wang et al. Community-based greedy algorithm for mining top-k influential nodes in mobile social networks. In Proc. 16th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2010.
[12] K. Saito, R. Nakano, and M. Kimura. Prediction of information diffusion probabilities for independent cascade model. In Knowledge-Based Intelligent Information and Engineering Systems, Springer Berlin Heidelberg, 2008.
[13] M. E. J. Newman. Analysis of weighted networks. Physical Review E 70(5) (2004): 056131.