# Pro max icdm2012-slides

Published in: Technology, Business

1. 1. Profit Maximization over Social Networks. Wei Lu, Laks V.S. Lakshmanan. ICDM’12, to appear.
2. 2. Overview: Background (Social Influence Propagation & Maximization); Motivation for Profit Maximization; Proposed Model & Its Properties; Profit Maximization Algorithms; Experimental Results; Conclusions & Discussions.
3. 3. Background
4. 4. Influence in Social Networks: We live in communities and interact with friends, families, and even strangers. This forms social networks. In social interactions, people may influence each other.
5. 5. Influence Diffusion & Viral Marketing: Word-of-mouth effect: “iPhone 5 is great” spreads from person to person. Source: Wei Chen’s KDD’10 slides.
6. 6. Social Network as Directed Graph: Nodes: individuals in the network. Edges: links/relationships between individuals. Edge weight on $(i, j)$: influence weight $w_{i,j}$. [Figure: example directed graph with influence weights on its edges.]
7. 7. Linear Threshold Model – Definition: Each node $v$ chooses an activation threshold $\theta_v$ uniformly at random from $[0,1]$. Time unfolds in discrete steps $0, 1, 2, \dots$ At step 0, a set $S$ of seeds is activated. At any step $t$, activate node $v$ if $\sum_{\text{active in-neighbors } u} w_{u,v} \ge \theta_v$. The diffusion stops when no more nodes can be activated. Influence spread of $S$: the expected number of active nodes by the end of the diffusion, when targeting $S$ initially.
8. 8. Linear Threshold Model – Example: [Figure: step-by-step diffusion from seed $v$ until no further node can be activated; legend: inactive node, active node, threshold, total influence weights.] Influence spread of $\{v\}$ is 4. Source: David Kempe’s slides.
9. 9. Influence Maximization: Problem: select $k$ individuals such that by activating them, influence spread is maximized. Input: a directed graph representing a social network, with influence weights on edges. Output: the $k$ selected individuals. The problem is NP-hard, and it is #P-hard to compute the exact influence spread.
10. 10. Motivation for Profit Maximization
11. 11. Influence vs. Product Adoption: Classical models do not fully capture monetary aspects of product adoptions. Being influenced != being willing to purchase. HP TouchPad: significant price drop in 2011 (\$499 → \$99). Worldwide market share of cellphones (as of July 2011): 1. Nokia, 2. Samsung (boo…), 3. LG (…), 4. Apple iPhone. iPhone: more expensive in hardware and monthly rate plans (ask Rogers, Telus, or Bell…).
12. 12. Product Adoption: Product adoption is a two-stage process (Kalish 85). 1st stage, awareness: get exposed to the product and become familiar with its features. 2nd stage, actual adoption: only if valuation outweighs price. Awareness is modeled as being propagated through word-of-mouth, which is captured by classical models. On the other hand, the 2nd stage is not captured.
13. 13. Valuations for Products: One’s valuation for a product = the maximum amount of money one is willing to pay. People do not want to reveal valuations for trust and privacy reasons (Kleinberg & Leighton, FOCS’03). IPV (Independent Private Value) assumption: each person’s valuation is drawn independently at random from a certain distribution (Shoham & Leyton-Brown 09). Price-taker assumption: users respond myopically to price, comparing it only with their own valuation.
14. 14. Our Contribution: Incorporate monetary aspects into the modeling of the diffusion process of product adoption: price & user valuations, seeding costs. LT → LT with user valuations (LT-V). Profit maximization (ProMax) under LT-V. Price-Aware GrEedy algorithm (PAGE).
15. 15. Proposed Model & Problem Definition
16. 16. Linear Threshold Model with Valuations (LT-V): Three node states: Inactive, Influenced, and Adopting. Inactive → Influenced: same as in LT. Influenced → Adopting: only if the valuation is at least the quoted price. Only adopting nodes will propagate influence to inactive neighbors. Model is progressive (see figure).
17. 17. More about LT-V: Our LT-V model captures the two-stage product adoption process in (Kalish 1985). Only adopting nodes propagate influence: actual adopters can access experience-based features of the product. Usability, e.g., is it easy to shoot night scenes using a Nikon D600? Durability, e.g., how long can the iPhone 5’s battery last on LTE? Still quite abstract, with room for extensions and refinements (more to come later).
18. 18. ProMax: Notations: $\mathbf{p} = (p_1, p_2, \dots, p_{|V|})$: the vector of quoted prices, one per node. $S$: the seed set. $\pi: 2^V \times [0,1]^{|V|} \to \mathbb{R}$: the profit function. $\pi(S, \mathbf{p})$: the expected profit earned by targeting $S$ and setting prices $\mathbf{p}$.
19. 19. ProMax Problem Definition: Problem: select a set $S$ of seeds and determine a vector $\mathbf{p}$ of quoted prices such that $\pi(S, \mathbf{p})$ is maximized under the LT-V model. Input: a directed graph representing a social network, with influence weights on edges. Output: the seed set $S$ and the price vector $\mathbf{p}$.
20. 20. ProMax vs. InfMax: Differences from InfMax under LT: The propagation models are different and have distinct properties. InfMax only requires a “binary decision” on nodes, while ProMax also requires setting prices.
21. 21. A Restricted Special Case: Simplifying assumptions: Valuation distributions degenerate to a single point: $v_i = p, \ \forall u_i \in V$. Seeds get the item for free (price = 0). Choosing an optimal price vector is no longer at issue. Restricted ProMax: find an optimal seed set $S$ to maximize $\pi(S) = p \cdot (h_L(S) - |S|) - c_a \cdot |S|$. $h_L(S)$: expected number of adopting nodes under LT-V. $c_a$: acquisition cost (seeding expenses).
22. 22. A Restricted Special Case: $\pi(S)$ is non-monotone, but submodular in $S$. No need to preset the number of seeds to pick (the number $k$ in InfMax). Simple greedy cannot be applied to get approximation guarantees. Theorem: the restricted ProMax problem is NP-hard (reduction from the Minimum Vertex Cover problem). Aside on maximizing non-monotone submodular functions: local search approximation algorithms (Feige, Mirrokni, & Vondrák, FOCS’07) are nice, but their time complexity is too high, and they assume an oracle for evaluating the function.
23. 23. Unbudgeted Greedy (U-Greedy): Simply grow the seed set $S$ by selecting the node with the largest marginal increase in profit, and stop when no node can provide positive marginal gain. Theorem (quality guarantee of U-Greedy): $\pi(S_g) \ge \left(1 - \frac{1}{e}\right) \pi(S^*) - \Theta(\max\{|S^*|, |S_g|\})$, where $S_g$ is the seed set chosen by U-Greedy and $S^*$ is the optimal seed set. Proof: some algebra… omitted…
24. 24. Properties of LT-V (in general): For an arbitrary vector of valuation samples $\mathbf{v} = (v_1, v_2, \dots, v_{|V|})$, given an instance of the LT-V model, for any fixed vector $\mathbf{p}$ of prices, the profit function $\pi(S, \mathbf{p})$ is submodular in $S$. It is #P-hard to compute the exact value of $\pi(S, \mathbf{p})$, given any $S$ and $\mathbf{p}$.
25. 25. Algorithms for Profit Maximization
26. 26. ProMax Algorithm: All-OMP: Given the distribution function (CDF) $F_i$ of $v_i$, the Optimal Myopic Price (OMP) is [equation not preserved in the transcript]. First baseline, All-OMP: offer OMP to all nodes, and select seeds using U-Greedy. Ensures maximum profit earned solely from a single influenced node. Ignores network structure and the “profit potential” (from influence) of seeds.
27. 27. ProMax Algorithm: Free-For-Seeds (FFS): Seeds receive the product for free. Non-seeds are charged OMP. Ensures all seeds will adopt and propagate influence. Trade-off: immediate profit from seeds vs. profit potential of seeds (through influence). All-OMP favors the former, good for low-influence networks. FFS favors the latter, good for high-influence networks. Can we achieve a more balanced heuristic?
28. 28. Price-Aware GrEedy (PAGE) Algorithm: The key question: given a partial seed set, how to determine the price $p_i$ for the next seed candidate $u_i$? Consider the marginal profit $u_i$ brings: [equation not preserved in the transcript]. This is a function of $p_i$. So, let’s find the $p_i$ that maximizes it.
29. 29. PAGE details: Offering $p_i$ to $u_i$ leads to 2 possible worlds: $u_i$ accepts, with probability $1 - F_i(p_i)$; $u_i$ does not accept, with probability $F_i(p_i)$. Re-write the marginal profit as follows: [equation not preserved in the transcript]. $Y_1$: expected profit earned from other nodes, if $u_i$ accepts. $Y_0$: expected profit earned from other nodes, otherwise. Finding the optimal $p_i$ depends on the specific form of $F_i$. We study two cases: normal distribution and uniform distribution.
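30. 30. Normal Distribution: CDF: $F_i(x) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x - \mu}{\sigma\sqrt{2}}\right)\right]$, where $\operatorname{erf}(x)$ is the error function. No analytical solution can be found, since $\operatorname{erf}(x)$ has no closed-form expression, and thus neither does the marginal profit $g_i(p_i)$. Numerical method: the golden section search algorithm.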
31. 31. Aside: Golden Section Search: Finds the extremum of a function by iteratively narrowing the interval inside which the extremum is known to exist. The function must be unimodal and continuous over the initial interval. Terminates when the length of the interval is smaller than a pre-defined number (say, $10^{-6}$). No need to take derivatives (which Newton’s method would need). Performs only one new function evaluation in each step. Has a constant reduction factor for the size of the interval.
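32. 32. Uniform Distribution: CDF: [equation not preserved in the transcript]. So the marginal profit is [equation not preserved in the transcript], and we can easily solve for the optimal $p_i$: [equation not preserved in the transcript]. If the solution is < 0 or > 1, clamp it to 0 or 1, respectively. N.B.: this solution framework can be applied to any valuation distribution. The actual solution may be analytical or numerical, depending on the distribution itself.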
33. 33. Experiments: Datasets & Results
34. 34. Network Datasets: Epinions: a who-trusts-whom network from the customer reviews site Epinions.com. Flixster: a friendship network from the social movie site Flixster.com. NetHEPT: a co-authorship network from the arxiv.org High Energy Physics Theory section.
35. 35. Network Datasets: Statistics of the datasets: [table not preserved in the transcript].
36. 36. Influence Weights in Datasets: Weighted Distribution (WD): $w_{u,v} = A_{u,v} / N_v$. $A_{u,v}$: number of actions $u$ and $v$ have both performed (if the data is time-stamped, then $u$ should have performed them earlier). $N_v$: total number of actions $v$ has performed. Trivalency (TV): $w_{u,v}$ is chosen uniformly at random from $\{0.001, 0.01, 0.1\}$. All weights are normalized to ensure $\forall v, \ \sum_u w_{u,v} \le 1$.
37. 37. Influence Weights in Datasets
38. 38. Estimating Valuation Distributions: Real valuation distributions are hard to obtain. Common practice: estimate them from historical sales data.
39. 39. Estimating Valuation Distributions: Besides ratings (1 to 5 stars), users may optionally provide the price they paid, at the end of the same review: [screenshot not preserved in the transcript].
40. 40. Estimating Valuation Distributions: We obtain all reviews for Canon EOS 300D, 350D, and 400D DSLR cameras. Sequential releases in 3 years → approximately the same monetary values. Remove reviews without price: 276 samples remain. View rating as utility: utility = valuation – price paid. Thus, our estimation is [equation not preserved in the transcript].
41. 41. Estimating Valuation Distributions: The fitted normal distribution: $N(0.53, 0.14^2)$. Figure 4(b): Kolmogorov-Smirnov (K-S) statistics.
42. 42. Experimental Results: Expected Profit Achieved
43. 43. Experimental Results: Price Assignment for Seeds
44. 44. Experimental Results: Running Time: PAGE is more efficient, leveraging lazy-forwarding more effectively. The extra overhead for computing prices is small (the golden section search algorithm converges in fewer than 40 iterations).
45. 45. Conclusions, Discussions, Related Work
46. 46. Conclusions: Extended the LT model to incorporate price and valuations and to distinguish product adoption from social influence. Studied the properties of the extended model. Proposed the profit maximization (ProMax) problem and an effective algorithm to solve it.
47. 47. Discussions & Future Work: Make similar extensions to other influence propagation models: IC, LT-C, or even the general threshold model. Develop fast heuristics to give more efficient and scalable algorithms. Consider incorporating other elements into the modeling of product adoption: people’s spontaneous interest in the product (natural early adopters); valuations may change over time for some people; valuations may exhibit externalities.
48. 48. Related Work: Influence maximization (too many…). Revenue/profit maximization in social networks: Hartline et al., WWW’08 (Influence-and-Exploit); Arthur et al., WINE’09; Chen et al., WINE’11; Bloch & Quérou, working paper, 2011. Influence vs. product adoption: Bhagat, Goyal, & Lakshmanan, WSDM’12 (LT model with Colors).
49. 49. Thanks! Acknowledgements: Allan Borodin (UofT), Wei Chen (MSR Asia), Pei Lee, Kevin Leyton-Brown, Min Xie, Ruben H. Zamar.
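
The Linear Threshold definition on slide 7 can be illustrated with a small Monte Carlo simulation. This is a minimal sketch, not the authors' code: the function names `simulate_lt` and `estimate_spread` and the `in_weights` dictionary layout are assumptions made for illustration, and thresholds are drawn from (0, 1] so that a node with no active in-neighbors is never activated.

```python
import random


def simulate_lt(in_weights, seeds, rng):
    """Run one Linear Threshold diffusion and return the set of active nodes.

    in_weights[v] maps each in-neighbor u of v to the influence weight w_{u,v}
    (hypothetical layout; incoming weights of a node should sum to at most 1).
    """
    nodes = set(in_weights) | set(seeds)
    for neighbors in in_weights.values():
        nodes.update(neighbors)
    # Thresholds drawn uniformly from (0, 1], avoiding a zero threshold.
    theta = {v: 1.0 - rng.random() for v in nodes}
    active = set(seeds)
    while True:
        newly_active = set()
        for v in nodes - active:
            total = sum(w for u, w in in_weights.get(v, {}).items() if u in active)
            if total >= theta[v]:
                newly_active.add(v)
        if not newly_active:  # diffusion stops: nobody new can be activated
            return active
        active |= newly_active


def estimate_spread(in_weights, seeds, runs=1000, seed=0):
    """Estimate the influence spread of `seeds` by averaging over `runs` simulations."""
    rng = random.Random(seed)
    return sum(len(simulate_lt(in_weights, seeds, rng)) for _ in range(runs)) / runs
```

Averaging over many runs is only an estimate; as slide 9 notes, computing the exact spread is #P-hard.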
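
The three node states of LT-V on slide 16 can be sketched in the same style. The code below is an illustrative assumption, not the authors' implementation: `simulate_ltv` and `profit` are hypothetical names, valuations are passed in as already-sampled numbers, seeds start as influenced and adopt only if their valuation is at least their quoted price, and profit is taken to be the sum of prices paid by adopters minus a per-seed acquisition cost.

```python
import random


def simulate_ltv(in_weights, seeds, prices, valuations, rng):
    """Run one LT-V diffusion; return (influenced, adopting) node sets.

    in_weights[v] maps in-neighbor u to w_{u,v}; prices and valuations are
    dictionaries keyed by node (hypothetical layout).
    """
    nodes = set(prices)
    theta = {v: 1.0 - rng.random() for v in nodes}  # thresholds in (0, 1]
    influenced = set(seeds)
    adopting = {v for v in influenced if valuations[v] >= prices[v]}
    while True:
        newly_influenced = set()
        for v in nodes - influenced:
            # Only adopting in-neighbors propagate influence.
            total = sum(w for u, w in in_weights.get(v, {}).items() if u in adopting)
            if total >= theta[v]:
                newly_influenced.add(v)
        if not newly_influenced:
            return influenced, adopting
        influenced |= newly_influenced
        # Influenced -> Adopting only if the valuation is at least the quoted price.
        adopting |= {v for v in newly_influenced if valuations[v] >= prices[v]}


def profit(adopting, seeds, prices, seed_cost):
    """Revenue from adopters minus acquisition cost of seeds (assumed profit form)."""
    return sum(prices[v] for v in adopting) - seed_cost * len(seeds)
```

In a chain a → b → c with weight 1 on each edge, a low valuation at b blocks the cascade: b becomes influenced but does not adopt, so c is never reached.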
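
The U-Greedy procedure on slide 23 can be sketched against an abstract profit oracle. This is a minimal sketch under stated assumptions: `u_greedy` and the `profit` callable are hypothetical names, and in practice the oracle would itself be an estimate (for example, from simulation), since exact profit is #P-hard to compute.

```python
def u_greedy(nodes, profit):
    """Grow a seed set greedily until no node gives a positive marginal profit.

    nodes:  list of candidate nodes.
    profit: callable taking a frozenset of seeds and returning a number
            (an assumed oracle for pi(S)).
    """
    seeds = []
    current = profit(frozenset())
    while True:
        best_node, best_gain = None, 0.0
        for v in nodes:
            if v in seeds:
                continue
            gain = profit(frozenset(seeds) | {v}) - current
            if gain > best_gain:  # strictly positive and best so far
                best_node, best_gain = v, gain
        if best_node is None:  # no positive marginal gain remains
            return seeds
        seeds.append(best_node)
        current += best_gain
```

Unlike budgeted greedy for InfMax, no seed count is fixed in advance; the stopping rule is the sign of the marginal gain.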
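
The golden section search described on slide 31 can be sketched as follows. The function name `golden_section_max` is hypothetical. The example objective `expected_revenue`, which is the price times the probability that a normally distributed valuation is at least that price, is an assumption chosen for illustration; it uses the fitted mean 0.53 and standard deviation 0.14 from slide 41 as default parameters and is not the marginal profit function from the slides.

```python
import math

INV_PHI = (math.sqrt(5) - 1) / 2  # constant interval reduction factor, about 0.618


def golden_section_max(f, lo, hi, tol=1e-6):
    """Locate the maximizer of a unimodal, continuous f on [lo, hi]."""
    x1 = hi - INV_PHI * (hi - lo)
    x2 = lo + INV_PHI * (hi - lo)
    f1, f2 = f(x1), f(x2)
    while hi - lo > tol:
        if f1 < f2:
            # Maximum lies in [x1, hi]; old x2 becomes the new x1.
            lo, x1, f1 = x1, x2, f2
            x2 = lo + INV_PHI * (hi - lo)
            f2 = f(x2)  # the only new evaluation in this step
        else:
            # Maximum lies in [lo, x2]; old x1 becomes the new x2.
            hi, x2, f2 = x2, x1, f1
            x1 = hi - INV_PHI * (hi - lo)
            f1 = f(x1)  # the only new evaluation in this step
    return (lo + hi) / 2


def normal_cdf(x, mu, sigma):
    """Normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def expected_revenue(p, mu=0.53, sigma=0.14):
    """Assumed illustrative objective: price times probability of acceptance."""
    return p * (1.0 - normal_cdf(p, mu, sigma))
```

Calling `golden_section_max(expected_revenue, 0.0, 1.0)` returns a price in the unit interval without requiring any derivative of the objective.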
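
For the uniform case on slide 32, the closed-form price can be worked out under explicit assumptions. The derivation below is a sketch, not the slide's own equations: it assumes valuations are uniform on $[0,1]$, so that $F_i(p_i) = p_i$, and it assumes the marginal profit combines the two possible worlds of slide 29 as shown, with any cost terms that do not depend on $p_i$ dropped because they do not affect the maximizer.

$$
\begin{aligned}
g_i(p_i) &= \bigl(1 - F_i(p_i)\bigr)\,(p_i + Y_1) + F_i(p_i)\,Y_0 \\
         &= (1 - p_i)(p_i + Y_1) + p_i\,Y_0, \\
\frac{dg_i}{dp_i} &= 1 - 2p_i - Y_1 + Y_0 = 0
\quad\Longrightarrow\quad
p_i^{*} = \frac{1 + Y_0 - Y_1}{2}.
\end{aligned}
$$

Since $d^2 g_i / dp_i^2 = -2 < 0$, this stationary point is a maximum; if it falls outside $[0,1]$, it is clamped to the nearest endpoint, as the slide describes.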
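
The Weighted Distribution (WD) weights on slide 36 can be computed from an action log along these lines. This is a minimal sketch: the function name `wd_weights` and the log layout of `(user, action, timestamp)` triples are assumptions, and the final normalization step mentioned on the slide is not included.

```python
from collections import defaultdict


def wd_weights(edges, log):
    """Compute w_{u,v} = A_{u,v} / N_v for each directed edge (u, v).

    edges: iterable of (u, v) pairs.
    log:   iterable of (user, action, timestamp) triples (hypothetical layout).
    """
    # Earliest time at which each user performed each action.
    first_time = defaultdict(dict)
    for user, action, t in log:
        if action not in first_time[user] or t < first_time[user][action]:
            first_time[user][action] = t

    weights = {}
    for u, v in edges:
        n_v = len(first_time[v])  # N_v: total number of actions v performed
        if n_v == 0:
            continue
        # A_{u,v}: actions both performed, with u performing them earlier.
        a_uv = sum(
            1
            for action, t_v in first_time[v].items()
            if action in first_time[u] and first_time[u][action] < t_v
        )
        weights[(u, v)] = a_uv / n_v
    return weights
```

With several in-neighbors, the incoming weights of a node computed this way can sum to more than 1, which is why the slide calls for a normalization pass afterwards.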